references/kit - kit - Gitea: Git with a cup of tea

mirror of https://github.com/mark3labs/kit.git synced 2026-06-14 03:30:26 +00:00

Author	SHA1	Message	Date
Ed Zynda	b0802a5c32	fix: properly count existing cache blocks to stay under 4-block limit The issue was that cache control persisted across turns in conversation history, causing accumulation beyond Anthropic's 4-block limit. Changes: - Count existing cache blocks in message history before adding new ones - Only add new cache blocks up to the 4-block limit - Remove tool caching (was adding 1 block per turn) - Skip messages that already have cache control set Tested with 5 sequential messages - no errors, proper cache metrics.	2026-03-29 14:48:08 +03:00
Ed Zynda	dfe65ca227	chore: remove all Crush references from comments Remove mentions of Crush from: - cache_control.go - agent.go (2 references) - content.go - tool_renderers.go - lsp-diagnostics.go (2 references)	2026-03-29 14:43:51 +03:00
Ed Zynda	d4ec756ce5	fix: match Crush's cache_control strategy exactly Crush's proven 4-block strategy: 1. Last system message (if present) 2. Last 2 conversation messages 3. Last tool definition This stays exactly at Anthropic's 4-block limit without exceeding it. Previous implementation could exceed the limit in certain edge cases. Now matches Crush's battle-tested approach.	2026-03-29 14:42:29 +03:00
Ed Zynda	2971e73ee8	fix: limit Anthropic cache_control blocks to maximum 4 Anthropic API enforces a maximum of 4 blocks with cache_control per request. The previous implementation could exceed this limit when combining: - System message caching - Recent message caching - Tool definition caching Changes: - Add explicit cache block counting (max 4) - Remove tool cache control to stay under limit - Prioritize: system message first, then recent messages - Work backwards from end to cache most recent context first Fixes: bad request error 'A maximum of 4 blocks with cache_control may be provided'	2026-03-29 14:40:44 +03:00
Ed Zynda	bca08476de	chore: fix remaining linting issues in caching code - Use max() built-in instead of if statement (modernize) - Remove unused buildAnthropicCacheOptions function - Remove unused anthropic import	2026-03-29 14:32:28 +03:00
Ed Zynda	b295a25946	feat: automatic prompt caching for cost reduction Implements automatic prompt caching to reduce API costs by 60-90% for repeated prompts with the same context. Architecture: - Provider-level caching for OpenAI (PromptCacheKey) - Message-level caching for Anthropic (avoids type conflicts) - Model family detection enables caching regardless of provider Key Changes: - Add ModelInfo.Family with SupportsCaching() and CacheType() methods - Add ProviderConfig.DisableCaching for opt-out - Implement message-level cache control in agent (like Crush) - Last system message gets cache control - Last 2 messages get cache control - Last tool gets cache control - Auto-disable caching when thinking is enabled (type conflict avoidance) - Add KIT_DISABLE_CACHE environment variable for global opt-out Tested with opencode/claude-sonnet-4-6 showing cacheRead/cacheWrite tokens in debug output, confirming 60-90% cost savings. Closes cost optimization for multi-turn conversations.	2026-03-29 14:24:07 +03:00

6 Commits