fix(sdk): align SDK max-tokens floor with CLI default (4096 → 8192)

The SDK last-resort MaxTokens floor is applied in kit.New() when
Options.MaxTokens, KIT_MAX_TOKENS, .kit.yml, and per-model defaults
are all unset. It was 4096 (inherited from the old setSDKDefaults
viper default) while the CLI --max-tokens cobra default is 8192.

Bump the floor to 8192 so SDK and CLI callers start from the same
base value before rightSizeMaxTokens runs, then update README,
skills/kit-sdk/SKILL.md, and www/pages/{configuration,sdk/options}.md
to match.
Ed Zynda
2026-04-17 11:59:49 +03:00
parent ecf95b52e1
commit e1c94cb362
6 changed files with 21 additions and 17 deletions
+1 -1
@@ -547,7 +547,7 @@ host, err := kit.New(ctx, &kit.Options{
Quiet: true,
// Generation parameters (override env/config/per-model defaults)
-MaxTokens: 16384, // 0 = auto-resolve (env → config → per-model → 4096 floor)
+MaxTokens: 16384, // 0 = auto-resolve (env → config → per-model → 8192 floor)
ThinkingLevel: "medium", // "off", "low", "medium", "high"
Temperature: ptr(float32(0.2)), // pointer so 0.0 != unset; nil = provider default
TopP: nil, // nil = leave provider/per-model default
+4 -2
@@ -40,10 +40,12 @@ Guidelines:
// sdkDefaultMaxTokens is the last-resort ceiling applied when the SDK caller
// has not configured max-tokens via Options, env, config, or a per-model
-// default. It is intentionally applied on the *models.ProviderConfig struct
+// default. It matches the CLI's --max-tokens cobra default so SDK and CLI
+// callers see the same base value before per-model right-sizing runs.
+// It is intentionally applied on the *models.ProviderConfig struct
// (not via viper) so that viper.IsSet("max-tokens") remains false and the
// right-sizing + per-model-default paths continue to work.
-const sdkDefaultMaxTokens = 4096
+const sdkDefaultMaxTokens = 8192
// setSDKDefaults registers viper defaults that match the CLI's cobra flag
// defaults for keys where SetDefault does not interfere with downstream
+9 -8
@@ -825,20 +825,21 @@ type Options struct {
// .kit.yml / KIT_* environment variables. Leaving a field at its
// zero/nil value means "use the configured default", which in turn
// falls back to per-model defaults (modelSettings / customModels) and
-// finally to a last-resort SDK floor of 4096 for MaxTokens (sampling
-// params fall through to provider-level defaults).
+// finally to a last-resort SDK floor of 8192 for MaxTokens (matching
+// the CLI --max-tokens default; sampling params fall through to
+// provider-level defaults).
//
// Pointer types are used for sampling parameters so the SDK can
// distinguish "explicitly set to 0" from "leave alone".
// MaxTokens overrides the maximum output tokens per LLM response.
// 0 = let the precedence chain resolve a value (env → config →
-// per-model → 4096 SDK floor). Setting a non-zero value here
-// suppresses automatic right-sizing, matching the CLI's
-// --max-tokens flag semantics. Bump this when generating long
-// outputs (HTML artifacts, large refactors, etc.) to avoid silent
-// truncation mid-tool-call. The cap also applies after model
-// switches via [Kit.SetModel].
+// per-model → 8192 SDK floor, matching the CLI default). Setting a
+// non-zero value here suppresses automatic right-sizing, matching
+// the CLI's --max-tokens flag semantics. Bump this when generating
+// long outputs (HTML artifacts, large refactors, etc.) to avoid
+// silent truncation mid-tool-call. The cap also applies after
+// model switches via [Kit.SetModel].
MaxTokens int
// ThinkingLevel sets the reasoning effort for models that support
+2 -2
@@ -83,7 +83,7 @@ host, err := kit.New(ctx, &kit.Options{
// Generation parameters — override env/config/per-model defaults.
// Leaving a field at its zero/nil value lets the precedence chain
// resolve a value (KIT_* env → .kit.yml → modelSettings/customModels →
-// 4096 floor for MaxTokens, provider defaults for samplers).
+// 8192 floor for MaxTokens, provider defaults for samplers).
MaxTokens: 16384, // 0 = auto-resolve; non-zero suppresses right-sizing
ThinkingLevel: "medium", // "off", "low", "medium", "high" ("" = default)
Temperature: ptrFloat32(0.2), // pointer so explicit 0.0 != unset
@@ -148,7 +148,7 @@ func ptrFloat32(v float32) *float32 { return &v }
| Field | Type | Empty/nil means | Notes |
|-------|------|-----------------|-------|
-| `MaxTokens` | `int` | Auto-resolve (env → config → per-model → 4096 floor) | Non-zero suppresses `rightSizeMaxTokens` |
+| `MaxTokens` | `int` | Auto-resolve (env → config → per-model → 8192 floor) | Non-zero suppresses `rightSizeMaxTokens` |
| `ThinkingLevel` | `string` | Auto-resolve (→ `"off"`) | Valid: `"off"`, `"low"`, `"medium"`, `"high"` (and `"minimal"` for some providers) |
| `Temperature` | `*float32` | Leave provider/per-model default | Pointer so explicit `0.0` ≠ unset |
| `TopP` | `*float32` | Leave provider/per-model default | |
+1 -1
@@ -189,7 +189,7 @@ For the generation and provider parameters documented above, the resolved value
4. `.kit.yml` / `.kit.yaml` / `.kit.json` (project-local, then global)
5. Per-model defaults (`modelSettings[provider/model]` / `customModels[...].params`)
6. Provider-level defaults (e.g. Anthropic's own temperature default)
-7. SDK last-resort floor — currently a 4096 output-token ceiling when nothing else is configured
+7. SDK last-resort floor — currently an 8192 output-token ceiling matching the CLI `--max-tokens` default, auto-raised per-model up to 32768 when the model's catalog ceiling is higher
See the [SDK options reference](/sdk/options) for the full list of `kit.Options` fields that map to these keys.
+4 -3
@@ -96,8 +96,9 @@ host, err := kit.New(ctx, &kit.Options{
These fields override the corresponding values from `.kit.yml` / `KIT_*`
environment variables. Leaving a field at its zero/nil value lets the
precedence chain resolve a value (`KIT_*` env → config file → per-model
-defaults from `modelSettings`/`customModels` → a 4096 SDK floor for
-`MaxTokens` and provider-level defaults for samplers).
+defaults from `modelSettings`/`customModels` → an 8192 SDK floor for
+`MaxTokens` (matching the CLI `--max-tokens` default) and provider-level
+defaults for samplers).
| Field | Type | Default | Description |
|-------|------|---------|-------------|
@@ -174,7 +175,7 @@ in this order (highest priority first):
3. `.kit.yml` (project-local then `~/.kit.yml`)
4. Per-model defaults (`modelSettings[provider/model]` or `customModels[...].params`)
5. Provider-level defaults (e.g. Anthropic's own temperature default)
-6. SDK last-resort floor (currently: `MaxTokens = 4096`)
+6. SDK last-resort floor (currently: `MaxTokens = 8192`, matching the CLI `--max-tokens` default)
Sampling params that remain `nil` after the SDK resolution step are left out
of the provider call entirely, so the LLM library applies its own default.
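
The nil-versus-zero distinction for sampling parameters described in the docs above can be illustrated with a small Go sketch. `describeSampler` is a hypothetical helper written for this example and is not part of the kit SDK; `ptrFloat32` mirrors the helper shown in the SDK skill doc.

```go
package main

import "fmt"

// ptrFloat32 mirrors the helper shown in the docs diff above.
func ptrFloat32(v float32) *float32 { return &v }

// describeSampler is a hypothetical helper that reports how a pointer-typed
// sampling parameter would be interpreted: nil means the field is left out
// of the provider call, while any non-nil value (including 0.0) is treated
// as an explicit setting.
func describeSampler(name string, v *float32) string {
	if v == nil {
		return name + ": unset, provider default applies"
	}
	return fmt.Sprintf("%s: explicitly set to %g", name, *v)
}

func main() {
	fmt.Println(describeSampler("Temperature", ptrFloat32(0)))
	fmt.Println(describeSampler("TopP", nil))
}
```

A plain `float32` field could not make this distinction, because its zero value is indistinguishable from an explicit `0.0`.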