mirror of
https://github.com/lobehub/lobe-chat.git
synced 2026-06-13 19:20:04 +00:00
fdb529d598
* 🐛 fix(agent): deliver sub-agent resume bridge via QStash webhook in queue mode

The callSubAgent completion bridge was a handler-only hook, which lives in process memory. In queue mode (AGENT_RUNTIME_MODE=queue) HookDispatcher only delivers webhook-configured hooks, so the bridge never fired and the parent op stayed parked in waiting_for_async_tool forever after all sub-agents finished.

- Give the bridge hook a webhook config (delivery: qstash) targeting the new /api/agent/webhooks/subagent-callback endpoint; local mode keeps the in-process handler. Both paths converge on AgentRuntimeService.completeSubAgentBridge (backfill + barrier/CAS resume).
- Park-time self-check: after the parked state and operation row are persisted, re-run the resume barrier once to recover children that completed before the parent finished parking.
- One-shot verify watchdog: when a completion finds the parent not yet resumable, schedule a delayed verifyAsyncToolBarrier re-check (no step lock, CAS-idempotent, never re-arms).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 📝 docs(agent): correct verify-watchdog rationale comment

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 📝 docs(agent): clarify eventFields trimming rationale

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* ♻️ refactor(agent): align subagent-callback with workspace-scoped step worker

Post-rebase adaptation to canary's runtime restructure (#15609):

- Route the webhook bridge through AiAgentService (like the /run step worker) so the runtime's models stay workspace-scoped. A bare AgentRuntimeService would be personal-scoped, and the tool-message backfill / resume barrier could miss workspace-scoped rows.
- Extract SubAgentBridgeParams into agentRuntime/types and add the completeSubAgentBridge passthrough next to executeStep.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 🐛 fix(agent): fail sub-agent callback loudly on backfill or delivery failure

Address two review findings on the resume bridge:

- completeSubAgentBridge now checks updateToolMessage's { success } result (it swallows transaction errors instead of throwing) and propagates all infrastructure failures. The webhook endpoint then returns non-2xx so QStash redelivers the whole bridge. Previously a failed backfill was acked with 200 and the parent stayed parked forever, since the verify recheck only re-reads the barrier and cannot retry the backfill.
- New AgentHookWebhook.fallback: 'none' opts a qstash-delivered hook out of the unsigned plain-fetch fallback, which can never authenticate against a QStash-signed endpoint and only masked publish failures as silently dropped 401s. The bridge hook uses it; dispatch escalates such delivery failures to console.error instead of the debug namespace.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
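
The backfill-then-resume flow and the fail-loudly behaviour described above can be illustrated with a minimal, synchronous in-memory sketch. It is not the code from this change: BridgeStore, completeBridge, handleCallback, tryResume and failNextBackfill are hypothetical names invented for illustration, and only the { success } result shape, the waiting_for_async_tool state and the non-2xx-triggers-redelivery behaviour are taken from the message. A real implementation would be asynchronous and backed by a database.

```typescript
// Hypothetical stand-ins for illustration only; not the lobe-chat implementation.
type OpStatus = 'waiting_for_async_tool' | 'running';

interface ParentOp {
  status: OpStatus;
  // Sub-agent ids whose tool messages have not been backfilled yet.
  pendingChildren: Set<string>;
}

class BridgeStore {
  private ops = new Map<string, ParentOp>();
  // Test hook (assumption): makes the next backfill report a swallowed failure.
  failNextBackfill = false;

  park(opId: string, children: string[]): void {
    this.ops.set(opId, {
      status: 'waiting_for_async_tool',
      pendingChildren: new Set(children),
    });
  }

  // Reports failure through { success } instead of throwing,
  // matching the result shape the commit message describes.
  updateToolMessage(opId: string, childId: string): { success: boolean } {
    if (this.failNextBackfill) {
      this.failNextBackfill = false;
      return { success: false };
    }
    const op = this.ops.get(opId);
    if (!op) return { success: false };
    op.pendingChildren.delete(childId);
    return { success: true };
  }

  // Compare-and-set: flips to 'running' only if the op is still parked and
  // every child has been backfilled, so repeated calls are harmless.
  tryResume(opId: string): boolean {
    const op = this.ops.get(opId);
    if (!op) return false;
    if (op.status !== 'waiting_for_async_tool') return false;
    if (op.pendingChildren.size > 0) return false;
    op.status = 'running';
    return true;
  }
}

// Backfill first, then attempt the barrier/CAS resume.
// A failed backfill is thrown rather than silently ignored.
function completeBridge(
  store: BridgeStore,
  opId: string,
  childId: string,
): { resumed: boolean } {
  const { success } = store.updateToolMessage(opId, childId);
  if (!success) {
    throw new Error(`backfill failed for sub-agent ${childId}`);
  }
  return { resumed: store.tryResume(opId) };
}

// Webhook-style wrapper: any failure maps to a non-2xx status so that a
// retrying queue would redeliver the whole bridge call.
function handleCallback(store: BridgeStore, opId: string, childId: string): number {
  try {
    completeBridge(store, opId, childId);
    return 200;
  } catch (error) {
    console.error('sub-agent bridge failed', error);
    return 500;
  }
}
```

In this sketch a redelivered callback simply repeats the backfill and the compare-and-set, so a duplicate delivery after the parent has resumed is a no-op. That is the property that makes it safe to answer with an error status and rely on retries.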