Track whether a model sends proper ReasoningDeltaEvent events. If so,
skip parsing <thinking> tags from text to avoid sending reasoning content
twice (once as proper reasoning, once parsed from text).
Also reset the tracking state at the start of each new prompt turn.
Parse <thinking>...</thinking> tags from models (Qwen, DeepSeek) that
wrap reasoning content in XML-style tags instead of using proper
reasoning events.
When text chunks contain thinking tags:
- Extract content between tags and send as reasoning/thought updates
- Send content outside tags as regular message text
- Track state across chunks to handle streaming properly
This mirrors the TUI's thinking tag parsing behavior.
- Implement proper handling for all ACP content block types:
- ContentBlockText: extracts text content
- ContentBlockImage: decodes base64 to LLMFilePart
- ContentBlockAudio: decodes base64 to LLMFilePart
- ContentBlockResource: handles text and binary embedded resources
- ContentBlockResourceLink: reads files from disk
- Text files are now included inline in the message (not as FilePart)
to avoid OpenAI API errors. Only binary files (images, audio, PDFs)
are sent as FilePart attachments.
- Add fallback MIME types when not provided by client
- Add default prompt text when user attaches files without text
- Add comprehensive debug logging for content extraction
- Enable debug logging in ACP command when --debug flag is used
Previously, when streaming text grew taller than the allocated view
height, the top (older) lines were silently discarded by viewContent().
This meant users could not scroll up to see them.
Now, overflow lines are emitted directly via tea.Println so they land
in the terminal's real scrollback buffer — matching the diagram where
completed text lives in the red scrollback region and the green viewable
area always shows the most recent streaming lines + input/footer.
Key changes:
- StreamComponent: add scrollbackFlushedLines counter and ConsumeOverflow()
method that returns newly overflowed lines and advances the pointer
- StreamComponent.Reset(): zero the counter between steps
- StreamComponent.GetRenderedContent(): skip already-flushed lines so
the end-of-step flush doesn't re-emit content already in scrollback
- AppModel.Update(): call ConsumeOverflow() each cycle and emit overflow
directly via tea.Println (not appendScrollback, to avoid triggering
drainScrollback's auto-flush guard while streaming is active)
- streamComponentIface: add ConsumeOverflow() to interface
- model_test.go: add stub ConsumeOverflow() to test double
- children_test.go: add 7 unit tests covering ConsumeOverflow and the
updated GetRenderedContent skip-flushed-lines behaviour
When users run `/theme <name>`, the alert colors (Tip, Note, Warning, etc.)
now update correctly. Previously, MessageRenderer and StreamComponent cached
herald.Typography instances that weren't refreshed after theme changes.
Changes:
- Added UpdateTheme() method to Renderer interface
- Implemented UpdateTheme() for MessageRenderer to recreate herald typography
- Added no-op UpdateTheme() stub for CompactRenderer (fetches colors fresh)
- Implemented UpdateTheme() for StreamComponent reasoning block renderer
- Modified handleThemeCommand() to notify all renderers of theme changes
This ensures newly rendered messages use the current theme's alert colors.
- Display kitBanner() before PrintStartupInfo() when running Kit normally
- The ASCII art banner with KITT scanner lights now appears at the top
of the screen, before Model, Context, Skills information
- Maintains consistent styling with the existing usage/help screen
Models like Qwen and DeepSeek wrap reasoning content in ... XML-like
tags within the regular content field. This was causing the reasoning
text to appear twice - once as a reasoning block and once as regular text.
Changes:
1. Provider hooks (providers.go):
- Extract reasoning from tags and emit proper reasoning events
- Use openai provider directly with custom ExtraContentFunc and
StreamExtraFunc hooks to parse thinking content
2. Stream filtering (stream.go):
- Filter out all text content between and tags at the
streaming level to prevent duplicate rendering
- Track state with inThinkTag flag across stream chunks
3. Message conversion (content.go):
- Strip any remaining tags from text content when converting
from fantasy messages
The regex patterns use string concatenation to avoid XML tag corruption:
regexp.MustCompile( + + + + + + + )
Fixes duplicate reasoning text when using custom provider with models
that wrap thinking in tags.
Two related fixes for --provider-url handling:
1. Don't restore custom/* models from preferences without --provider-url
- When user runs with --provider-url, model defaults to custom/custom
- If they switch models, custom/custom gets persisted to preferences
- On next run without --provider-url, restoring custom/custom fails
- Now we skip restoring custom/* models when no --provider-url is provided
2. Auto-prefix bare model names with custom/ when --provider-url is set
- Users often provide just the model name (e.g., qwen3.5-35b-a3b)
- This failed with 'invalid model format' error
- Now auto-prefixed with custom/ for OpenAI-compatible endpoints
- Replace glamour-based markdown rendering with herald/herald-md
- Update go.mod and go.sum with new dependencies
- Refactor styles.go to use Typography cache instead of TermRenderer
- Update enhanced_styles.go for compatibility
- Update btca.config.jsonc configuration
- Update github.com/indaco/herald v0.9.0 -> v0.10.0
- Update charm.land/bubbles/v2 v2.0.0 -> v2.1.0
- Update AWS SDK v2 packages
- Update google.golang.org/genai v1.51.0 -> v1.52.0
- Update various other dependencies
refactor(ui): use herald.CodeBlock for Read tool output
- Replace manual renderCodeBlock() with herald.CodeBlock()
- Add WithCodeLineNumberOffset() support for correct line numbers
- Extract language hint from file extension for syntax highlighting
- Preserve existing syntax highlighting via WithCodeFormatter()
- Remove unused codeLine struct and renderCodeBlock function
Add two test files that auto-discover and validate every single-file
extension in examples/extensions/:
- all_extensions_load_test.go: Verifies all 32 extensions load into the
Yaegi interpreter without errors (syntax, imports, Init signature).
- all_extensions_sanity_test.go: Six generalized sanity checks:
- Lifecycle: SessionStart → SessionShutdown round-trip
- CommandSanity: non-empty names/descriptions, no spaces/leading slash,
non-nil Execute, no duplicates
- ToolSanity: non-empty names/descriptions, at least one executor,
valid JSON parameters, no duplicates
- ZeroValueEvents: all 22 event types fired as zero-value structs
- WidgetSanity: non-empty IDs, consistent keys, valid placements
- IdempotentLifecycle: repeated SessionStart/SessionShutdown
Shared extensionFiles() helper auto-discovers extensions so new files
are automatically covered.
performFork() called ClearMessages() after Branch(targetID), but
ClearMessages() calls TreeSession.ResetLeaf() which sets leafID back
to empty — immediately undoing the branch. The in-memory message store
was also never reloaded from the tree session after branching, so the
LLM had zero context.
Add ReloadMessagesFromTree() which clears the store and reloads it
from the tree session's current branch without resetting the leaf
pointer. Use it in performFork() instead of ClearMessages().
The issue was that cache control persisted across turns in conversation
history, causing accumulation beyond Anthropic's 4-block limit.
Changes:
- Count existing cache blocks in message history before adding new ones
- Only add new cache blocks up to the 4-block limit
- Remove tool caching (was adding 1 block per turn)
- Skip messages that already have cache control set
Tested with 5 sequential messages - no errors, proper cache metrics.
Crush's proven 4-block strategy:
1. Last system message (if present)
2. Last 2 conversation messages
3. Last tool definition
This stays exactly at Anthropic's 4-block limit without exceeding it.
Previous implementation could exceed the limit in certain edge cases.
Now matches Crush's battle-tested approach.
Anthropic API enforces a maximum of 4 blocks with cache_control per request.
The previous implementation could exceed this limit when combining:
- System message caching
- Recent message caching
- Tool definition caching
Changes:
- Add explicit cache block counting (max 4)
- Remove tool cache control to stay under limit
- Prioritize: system message first, then recent messages
- Work backwards from end to cache most recent context first
Fixes: bad request error 'A maximum of 4 blocks with cache_control may be provided'
- Fix gofmt formatting issues in 7 files
- Replace atomic.AddUint64 with atomic.Uint64 type (modernize)
- Replace for i := 0; i < count; i++ with for i := range count (modernize)
- Replace strings.Split with strings.SplitSeq (modernize)
- Replace deprecated GetFantasyProviders with GetLLMProviders
- Replace deprecated GetFantasyMessages with GetLLMMessages
- Replace deprecated ConvertFromFantasyMessage with ConvertFromLLMMessage
- Replace deprecated FromFantasyMessage with FromLLMMessage
- Replace deprecated ToFantasyMessages with ToLLMMessages
- Remove 2 unused formatToolArgs functions
- Upgrade golangci-lint to v2.11.4
- Fix errcheck warnings for os.Setenv/os.Unsetenv in tests
- Use maps.Copy instead of manual loop (modernize lint)
- Add maps import for maps.Copy
Remove internal monologue comments that don't add value for readers:
- Remove lengthy explanations of type conflicts that are now resolved
- Remove 'NOTE:' and 'TODO:' comments documenting implementation history
- Remove obvious test comments that just restate what the code does
- Keep only meaningful comments that explain design intent
The code is now cleaner and easier to read without the self-referential
commentary that was useful during development but not for maintenance.
Implements automatic prompt caching to reduce API costs by 60-90% for
repeated prompts with the same context.
Architecture:
- Provider-level caching for OpenAI (PromptCacheKey)
- Message-level caching for Anthropic (avoids type conflicts)
- Model family detection enables caching regardless of provider
Key Changes:
- Add ModelInfo.Family with SupportsCaching() and CacheType() methods
- Add ProviderConfig.DisableCaching for opt-out
- Implement message-level cache control in agent (like Crush)
- Last system message gets cache control
- Last 2 messages get cache control
- Last tool gets cache control
- Auto-disable caching when thinking is enabled (type conflict avoidance)
- Add KIT_DISABLE_CACHE environment variable for global opt-out
Tested with opencode/claude-sonnet-4-6 showing cacheRead/cacheWrite
tokens in debug output, confirming 60-90% cost savings.
Closes cost optimization for multi-turn conversations.
Rename public SDK symbols to use generic LLM terminology instead of
exposing the internal dependency name (charm.land/fantasy):
Public API renames (with deprecated wrappers for backward compat):
- ConvertToFantasyMessages() → ConvertToLLMMessages()
- ConvertFromFantasyMessage() → ConvertFromLLMMessage()
- GetFantasyProviders() → GetLLMProviders()
New type alias:
- LLMFilePart = fantasy.FilePart (eliminates need for direct fantasy import)
- PromptResultWithFiles() signature now uses LLMFilePart
Internal renames (with deprecated wrappers):
- ModelsRegistry.GetFantasyProviders() → GetLLMProviders()
- TreeManager.GetFantasyMessages() → GetLLMMessages()
- TreeManager.AppendFantasyMessage() → AppendLLMMessage()
- TreeManager.AddFantasyMessages() → AddLLMMessages()
- Message.ToFantasyMessages() → ToLLMMessages()
- FromFantasyMessage() → FromLLMMessage()
- npmToFantasyProvider → npmToLLMProvider
- isProviderFantasySupported() → isProviderLLMSupported()
All internal callers migrated to new names. ~30 comments updated
to remove Fantasy references across pkg/kit/, internal/agent/,
internal/models/, internal/message/, internal/session/.
Documentation updates:
- AGENTS.md: added Public SDK rules section (no dependency leakage,
naming conventions, deprecation pattern)
- README.md: removed Fantasy references
- pkg/kit/README.md: full rewrite with current API surface
- skills/kit-sdk/SKILL.md: updated examples and type references
- www/pages/providers.md, www/pages/cli/commands.md: updated
- Add StepUsageEvent and SteerConsumedEvent to event types table
- Add new Extension API section documenting kit.Extensions() sub-API
- Add extension_api.go to Key Files reference list
- Fix Close() error handling in README SDK example
- Create ExtensionAPI interface with all extension-related methods
- Add extensionAPI type that wraps *Kit and implements the interface
- Add Kit.Extensions() method to access the ExtensionAPI
- Remove ~30 Extension* methods from Kit (breaking SDK change)
- Update all internal callers (cmd/, internal/acpserver/) to use Extensions().Method()
- Extensions themselves unaffected (use kit/ext API via Yaegi)
This cleans up the Kit API surface while maintaining full extension functionality.
kit.go
- Extract iterBranchMessages helper to eliminate ~15 lines of duplicated
branch-fetch/type-assert boilerplate between GetSessionMessages and
GetStructuredMessages
- Move skillCache from package-level global to per-Kit field; avoids
cross-contamination when multiple Kit instances exist in same process
skills.go
- Remove globalSkillCache var and skillCache type definition
- Update DiscoverSkillsForExtension and ClearSkillCache to use m.skillCache
- Remove unused sync import
sessions.go
- Use m.treeSession.EntryID instead of local getEntryID duplicate
- Remove local getEntryID function (was missing LabelEntry, SessionInfoEntry,
CompactionEntry types that internal/session.TreeManager.EntryID handles)
internal/session/tree_manager.go
- Export entryID -> EntryID so pkg/kit can use it directly
- Update all internal callers to use EntryID
config.go
- Add sync comment for defaultSystemPrompt noting it should be kept in sync
with CLI default in cmd/root.go
hooks_test.go
- Add newEmptyHookedTool helper for tests that need hookedTool with empty
hook registries
- Update TestHookedTool_Passthrough and TestHookedTool_InfoDelegates to use
helper (saves ~6 lines of boilerplate each)
- Merge TestHookRegistry_HasHooks into TestHookRegistry_Unregister (was
testing same behavior, now just one initial state assertion added)
All changes tested with opencode/kimi-k2.5 exploring the repo in tmux.
6 files changed, 69 insertions(+), 98 deletions(-)
Replace var function aliases with proper func wrappers (types.go)
- ParseModelString, CreateProvider, GetGlobalRegistry, LoadSystemPrompt
were package-level vars, making them reassignable and rendering oddly
in go doc. Now plain func wrappers with matching signatures.
Fix Subagent() double-error return convention
- Was returning both (*SubagentResult{Error: err}, err) simultaneously.
Now returns (nil, err) on failure, consistent with Go conventions.
- Removed SubagentResult.Error field; errors come from the error return.
- Updated all call sites in cmd/root.go, internal/acpserver, and kit.go.
Fix NavigateTo/SummarizeBranch/CollapseBranch string-encoded errors
- All three returned "" or an error string instead of error values,
making it impossible to distinguish success from failure in SummarizeBranch
(empty string meant both "no content" and "LLM failed").
- NavigateTo: string -> error
- SummarizeBranch: string -> (string, error)
- CollapseBranch: string -> error
- Updated cmd/root.go bridge closures to use err != nil and err.Error().
Remove duplicate GetSessionFilePath (use GetSessionPath)
- GetSessionPath (sessions.go) and GetSessionFilePath (kit.go) were
identical. Removed GetSessionFilePath; updated cmd/root.go and
internal/acpserver to call GetSessionPath directly.
events.go
- Delete subagentListenerSet (verbatim duplicate of eventBus); reuse
*eventBus in SubscribeSubagent and getSubagentListenerSet
hooks.go
- Add early-exit in run() when hooks slice is empty, making all
hasHooks() guard call sites in kit.go and compaction.go redundant
kit.go
- Remove four if m.X.hasHooks() { m.X.run(...) } outer guards
(beforeTurn, contextPrepare, afterTurn x2); run() now short-circuits
- Replace goto drained with an idiomatic return inside default: branch
- Replace stdlib log.Printf with charmlog.Debug (charmbracelet/log),
consistent with the rest of the codebase; remove "log" import
config.go
- Collapse single-element configNames := []string{".kit"} loop into a
direct viper.SetConfigName call (removes slice, for, break, flag)
auth.go
- Fix GetOpenAIAPIKey: it documented OPENAI_API_KEY env var fallback but
never called os.Getenv; now it does
compaction.go
- Extract persistAndEmitCompaction helper; eliminates duplicated
AppendCompaction + events.emit block in compactInternal and
applyCustomCompaction
- Replace fmt.Errorf("%s", reason) with errors.New(reason)
- Name the 16384 magic number as const defaultReserveTokens
skills.go
- Fix broken double-checked lock in DiscoverSkillsForExtension: the
read-unlock -> write-lock gap had a TOCTOU race; replaced with a
single write-lock covering the check and load
- Remove dead nil guard in convertSkills (convertSkill never returns nil)
- Rename convertSkills parameter skills->skillList to avoid shadowing
the skills package import
extensions_bridge.go
- Delete taskMutex struct (sync.Mutex wrapper with map passed as param);
replace with inline var taskMu sync.Mutex at the use site
- Simplify AgentEnd double-if into a single combined := declaration
template_bridge.go
- Fix RenderTemplate: use varRegex.ReplaceAllStringFunc instead of
two-pass strings.ReplaceAll; handles arbitrary whitespace in {{var}}
- Remove dead isFlag function and simplify ParseArguments guard
(the outer !HasPrefix guard made isFlag always return false)
- Cache matchModelPattern compiled regexps in a sync.Map to avoid
repeated regexp.Compile on hot streaming paths
pkg/extensions/test/mock.go
- Remove dead local StatusBarEntry type (duplicate of extensions type,
never referenced)
- Change make([]T, 0) to nil for nine slice fields in NewMockContext
pkg/extensions/test/harness.go
- Remove MustLoad (no callers outside the package)
- Remove extPath field (assigned but never read)
- Remove redundant os.Stat in LoadFile (os.ReadFile already errors)
events_test.go
- Add five missing event types to TestEventTypes table
(Compaction, ReasoningDelta, ToolOutput, StepUsage, SteerConsumed)
- Expand TestEventOrdering from 11 to 16 events with the same types
- Add a got < 0 assertion to TestEventBusConcurrentSubscribeEmit so the
test can actually fail rather than only logging
## Dead code removal
- Delete slash_command_input.go (352 lines, never instantiated)
- Remove FormatCompactLine, StyleCompactSymbol/Label/Content from
enhanced_styles.go (zero call sites)
- Remove getTheme() alias in messages.go; standardize on GetTheme()
across compact_renderer.go (8 sites) and tool_renderers.go (14 sites)
## BubbleTea correctness
- Fix child model discards: all m.stream.Update() and m.input.Update()
calls now store the returned model via type-assertion (13 sites)
- Fix Init(): remove vestigial nil guards; StreamComponent.Init() always
returns nil so only m.input.Init() is needed
- Fix /clear divergence: remove silent InputComponent /clear handler so
parent AppModel handles it with the proper system message (one path)
## Architecture / maintainability
- Unify slash-command dispatch from two-pass (exact + prefix) to single
parse: strings.Cut once, GetCommandByName on name, pass args to
handleSlashCommand(sc, args); eliminates 3 separate dispatch sites
- Add noopCmd package-level var replacing three inline func()tea.Msg{nil}
sentinel returns
- Remove stale TAS-15/16/17 comments from interface declarations
- Deduplicate headerProviderForUI / footerProviderForUI in cmd/root.go
into a shared headerFooterProviderForUI helper (removes ~28 duplicated lines)
## Performance
- Cache glamour.TermRenderer keyed by width in styles.go; invalidate on
theme change — eliminates full goldmark parser re-init every flush tick
- Add styleMarginBottom1 package-level var replacing 9 per-frame
lipgloss.NewStyle().MarginBottom(1) allocations
- Add layoutDirty flag: replace 9 distributeHeight() calls in Update()
with m.layoutDirty=true; flush once in View() — guarantees exactly one
layout measurement per frame instead of N (reduces double-render)
- Add WidgetUpdateEvent coalescing in app.NotifyWidgetUpdate() via
atomic.Bool + 16ms debounce; prevents fast extension tickers from
flooding BubbleTea's message queue with redundant re-render triggers
## Concurrency safety
- Convert all NotifyWidgetUpdate() call sites in cmd/root.go to
go appInstance.NotifyWidgetUpdate() (16 sites) — eliminates deadlock
risk when called synchronously from inside BubbleTea's Update() handler
- Extract isShellTool() helper in tool_renderers.go to eliminate
duplicated shell tool matching logic
- Replace bannedCommands slice with compiled regex in bash.go for
cleaner security validation
- Extract pathSet helper type in loader.go for reusable path
deduplication
- Consolidate ac()/acOr() helpers in themes.go for better organization
- Total reduction: ~34 lines across 4 files
All tests pass (go test -race ./...) and build succeeds.
These methods have been deprecated since the narrow-accessor and event-
subscriber APIs were introduced. No callers exist in this repository.
- pkg/kit/kit.go: remove GetExtRunner(), GetBufferedLogger(), GetAgent(),
and PromptWithCallbacks(); update Subscribe() doc comment which still
mentioned PromptWithCallbacks; tighten section header comment
- pkg/kit/README.md: replace PromptWithCallbacks example with the
OnToolCall/OnToolResult/OnStreaming subscriber pattern; remove method
from the quick-reference list
- README.md: same example migration in the SDK section
- www/pages/sdk/callbacks.md: remove the PromptWithCallbacks section
entirely; the event-based monitoring section that followed it is now
the lead content
- www/pages/sdk/overview.md: remove PromptWithCallbacks row from the
prompt-variant table
- skills/kit-sdk/SKILL.md: remove the deprecated legacy callback snippet
- agent: remove unused currentToolName variable and its compiler-suppressor
'_ = currentToolName'; currentToolArgs is the field actually used by
OnToolResult callbacks
- tools/connection_pool: collapse double-nested identical if guard into a
single check (copy-paste artifact)
- tools/mcp_test: replace hand-rolled contains() helper with strings.Contains;
add 'strings' import and delete the redundant function
- config: fix tilde path expansion (filepath.Join result was discarded)
- config: remove dead comment '// base := GetConfigPath()'
- auth: extract oauthTokenExpired/oauthTokenNeedsRefresh helpers to
eliminate copy-paste duplication across AnthropicCredentials and
OpenAICredentials
- ui/messages: remove dead RenderToolCallMessage on MessageRenderer
(not part of Renderer interface, never called)
- ui/compact_renderer: remove dead RenderToolCallMessage on CompactRenderer
(symmetric duplicate, never called)
- ui/enhanced_styles: remove dead CreateGradientText wrapper
(one-liner over ApplyGradient, never called)
- ui/fuzzy: fix fuzzyCharacterMatch to use rune iteration instead of
byte indexing (was silently wrong for multi-byte Unicode input)
- ui/file_suggestions: remove duplicate fuzzyCharMatch; call the now-
correct shared fuzzyCharacterMatch instead; drop unused utf8 import
- app: replace TODO comment with descriptive note (batch file attachment
limitation is intentional, not a pending action item)
This commit fixes several issues with token usage tracking:
1. Fix InputTokens-only validation bug - now checks any token field > 0
to handle OpenAI-compatible providers where cached prompts result in
InputTokens=0 while OutputTokens>0
2. Remove per-step context token updates from recordStepUsage() - context
fill is now set once at turn completion via updateUsageFromTurnResult
using FinalUsage.InputTokens, preventing display jumps during multi-step
tool calls
3. Track maximum context seen in SetContextTokens() - prevents the status
bar from showing decreasing token counts when FinalUsage.InputTokens
reflects only the last step's input
4. Add comprehensive debug logging for token tracking at key points:
- StepUsageEvent emission
- recordStepUsage processing
- updateUsageFromTurnResult processing
5. Update tests to reflect new behavior:
- TestRecordStepUsage_updatesTracker: no longer expects context updates
- TestUpdateUsageFromTurnResult_contextTokensUsesInputOnly: verifies
InputTokens-only tracking
All tests pass. Token tracking now correctly accumulates costs and shows
monotonically increasing context size.
- /new command now properly resets usageTracker stats when starting fresh session
- Remove EstimateAndUpdateUsage fallback in updateUsageFromTurnResult()
- Remove EstimateAndUpdateUsage fallback in UpdateUsageFromResponse()
- Only use actual API-reported tokens for cost tracking (following opencode pattern)
- Estimation is inaccurate and should never be used for billing
Fixes issues with kimi-k2.5 and opencode token tracking where:
1. /new didn't reset token count/cost
2. Tokens never updated correctly due to estimation fallback
Change thinking content from H6 to Italic for more subdued,
secondary visual appearance. Makes reasoning text less prominent
than main assistant responses.