131 Commits

Author SHA1 Message Date
YuTengjing a2fd98a2d1 🐛 fix: restore file URLs in context prompts (#15549) 2026-06-08 19:26:16 +08:00
LiJian ee65cf2a0f feat(connector): custom OAuth MCP connectors — onboarding, runtime execution & connector-first (LOBE-9983) (#15546)
*  feat(connector): wire custom MCP OAuth — Pre-registration & DCR (LOBE-9983)

Connect the two OIDC schemes designed in LOBE-9736 (oidcConfig) end-to-end so
users can add a custom OAuth MCP server from /settings/skill. Until now the DB
schema, models, and tool-permission UI existed, but nothing ran the OAuth
authorization flow — syncTools only worked when a token already existed.

Flow (shared pipeline, branches only on where client_id comes from):
- Add modal (client_id present → Pre-registration; absent → DCR/RFC 7591)
- startOAuth: probe MCP URL → RFC 9728 protected-resource metadata → RFC 8414
  AS metadata; DCR-register the client when no client_id; persist resolved
  oidcConfig; build PKCE authorize URL, stash verifier in Redis keyed by state
- /oauth/connector/callback: consume state → exchange code → store encrypted
  tokens (KeyVaultsGateKeeper) + tokenExpiresAt + status=connected → postMessage
- syncTools lazily refreshes the access token before connecting

Built on @modelcontextprotocol/sdk OAuth helpers (discover/register/start/
exchange/refresh) — no hand-rolled protocol code.

Security:
- Wire KeyVaultsGateKeeper into ConnectorModel so OAuth tokens are encrypted at
  rest (previously the router passed no gatekeeper → plaintext)
- Strip decrypted credentials and oidcConfig.clientSecret from the list response

UI:
- "+" button in /settings/skill Connectors tab opens the Add modal
- SkillList surfaces custom connectors from the connector store
- Modal wires the client secret field, infers the scheme, and shows the
  redirect URI to register

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): request server-advertised scopes in OAuth flow

The authorize request sent an empty scope list, so providers that require a
scope (e.g. Linear MCP advertises scopes_supported ["read","write"]) issued a
useless token or rejected the flow. Default to the authorization server's
advertised scopes_supported when the user did not specify any, and use them for
both DCR registration and the authorize request.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): let OAuth callback bypass SPA rewrite and auth gate

/oauth/connector/callback is a backend route handler reached via a cross-site
redirect from the OAuth provider, so the proxy middleware broke it two ways:

1. It was not in the backend passthrough list, so it got rewritten to the SPA /
   locale shell instead of running the route handler (307 → blank).
2. It was not in isPublicRoute, so BetterAuth treated it as protected; the
   cross-site top-level navigation doesn't reliably carry the SameSite session
   cookie, so it redirected to sign-in (307).

Add /oauth/connector to backendApiEndpoints and /oauth/connector/callback to
isPublicRoute (the handler validates its own single-use state, so it must not be
session-gated). Scoped so /oauth/callback/success|error SPA pages are unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

*  feat(connector): execute connector tools server-side + agent-runtime wiring

Make custom OAuth MCP connectors actually callable, and sync their tools as
soon as authorization completes.

- callback: after token exchange, sync the tool list server-side via a shared
  syncConnectorToolsById — the connector is usable without a client round-trip
- sync.ts: extract buildConnectorMcpParams (http+auth / stdio), shared by
  syncTools and the new callTool
- connector router: add `callTool` (resolve connector, hard-block disabled
  tools, refresh token, call the remote MCP with decrypted credentials)
- aiAgent runtime: pass a KeyVaultsGateKeeper when resolving connectors so OAuth
  tokens decrypt (otherwise tool calls 401); surface connectors in the
  agent-management availablePlugins as a new 'connector' type
- AgentManagementContextInjector: render a <connector_plugins> section

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

*  feat(connector): wire connectors into the classic client chat path

The front-end chat orchestrates tools client-side (via /webapi/chat proxy),
separate from the server agent runtime. Connectors were invisible and
unexecutable there. Wire them in, connector-first.

- toolEngineering: build connector manifests from the store and inject them into
  createToolsEngine; drop plugins sharing a connector identifier (connector wins)
- buildClientConnectorManifests: store rows → type 'mcp' manifests (no token; the
  client has none) with permission → humanIntervention mapping
- mcpService.invokeMcpToolCall: route connector tool calls to connector.callTool
  before the plugin path (only connectors with a real MCP endpoint, so
  Lobehub/Klavis skills keep their executor)
- DeferredStoreInitialization: fetch connectors post-login so chat sees them
- AddConnectorModal: refresh after OAuth regardless of popup outcome
- chat-input skills picker: surface custom connectors in the auto group

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): open OAuth popup synchronously + escape callback HTML (codex P1)

- AddConnectorModal: open the OAuth popup synchronously inside the click handler
  (before any await), then navigate it to the authorize URL. Browsers block
  window.open once an async boundary is crossed, which left popup=null and the
  poll loop never resolving — the Add modal hung. Null popup now fails fast with
  a "allow popups" message.
- callback route: escape the postMessage payload for `<script>` context
  (`<`, `>`, `&`, U+2028/U+2029 → \uXXXX). A malicious OAuth server could put
  `</script>...` in the error param and execute script on the app origin.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): tighten execution boundary + surface OAuth failures + tests

Address review: enforce the same constraints at the call site that the manifest
layer enforces, and stop swallowing OAuth failures.

- isEnabled on BOTH sides: invokeMcpToolCall only routes enabled connectors
  (a disabled connector no longer steals a same-name plugin's call), and the
  server rejects calls to a disabled connector. Matches buildClientConnectorManifests
  which only exposes enabled connectors.
- callTool requires the toolName to exist in the synced user_connector_tools
  list — unsynced / hand-crafted tool names are rejected instead of being
  forwarded blindly to the remote MCP.
- extract callConnectorToolById (typed ConnectorToolCallError → tRPC codes) so
  the gates are unit-testable.
- AddConnectorModal: distinguish success / provider-error (show the reason) /
  user-dismissed instead of collapsing every failure into a silent close.
- tests: exec gates (not-found / disabled connector / unknown tool / disabled
  tool / success / token-refresh) + buildClientConnectorManifests mapping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): align redirect URI, connector-override & partial-failure UX

Second review round.

- redirect URI: the modal showed a client-origin URI while the server sent an
  APP_URL one — register-vs-use mismatch broke the callback. Add a
  `connector.getRedirectUri` query (server source of truth) and show exactly
  that in the modal.
- execAgent: derive the plugin-override set from the connectors that ACTUALLY
  produce a manifest (enabled + with tools), not the raw endpoint-having set —
  a disabled / not-yet-synced same-named connector no longer evicts the plugin
  and leaves the runtime with no tools. Matches the client-chat behaviour.
- partial failure: when code exchange succeeds but the tool sync fails, the
  callback now reports `synced: false`; the modal shows "authorized but tools
  could not be synced" instead of a false "connected".

Tests: execAgent overlap regression (disabled / 0-tool keeps the plugin; real
tools replace it) + callback partial-failure (synced:false on sync error).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* ♻️ refactor(connector): name the availablePlugins source 'custom' not 'connector'

The agent-management availablePlugins types describe a tool's SOURCE
(builtin / klavis / lobehub-skill); 'connector' named the storage system
instead. Once plugins migrate to the connector table everything is a connector,
so the source-based label is what matters. Rename to 'custom' to align with
ConnectorSourceType.custom (single source of truth); section is <custom_plugins>.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(connector): enforce connector permissions for community MCP plugins

Community MCPs execute via the plugin path (not connector.callTool), so the
per-tool permissions a user sets in the new Connectors UI weren't surfaced:
needs_approval didn't trigger the approval prompt on either runtime. (disabled
was already hard-blocked at execution by ToolExecutionService and the mcp
router.)

- extract patchManifestWithPermissions into a pure, client-safe module
  (patchManifestPermissions.ts); connectorPermissionCheck.ts re-exports it.
- execAgent: also patch community-plugin manifests (pluginsWithoutConnectors)
  with their connector permissions, alongside lobehub/klavis.
- client createToolsEngine: patch community-plugin manifests with connector
  permissions from the store so needs_approval surfaces as humanIntervention
  in the classic chat path too.
- unit tests for the shared patch function.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

*  fix(connector): tolerate uninitialized connectors slice in selectors

createToolsEngine now reads connectorSelectors.{customConnectors,connectorList};
toolEngineering/index.test.ts mocks getToolStoreState without `connectors`, so
the selectors hit `undefined.filter`. Guard with `?? []` (the real store always
seeds connectors:[] via initialState) and add connectors:[] to the test mock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

*  fix(connector): guard every connector selector against an uninitialized slice

mcp.test.ts mocks the tool store without `connectors`, and invokeMcpToolCall
calls connectorByIdentifier → `s.connectors.find` threw. The previous fix only
guarded connectorList/customConnectors; harden all of them (find/filter) so any
partial-store mock is safe. The real store always seeds connectors:[].

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 18:01:41 +08:00
Arvin Xu dbf743cc12 feat(verify): Agent Run delivery checker system (#15489)
* 🗃️ feat(database): add verify system tables for agent run delivery checker

Implement the database layer for the Agent Run delivery checker (Verify System).

Reuse / definition layer:
- verify_criteria: a single reusable pass/fail standard (atomic unit), carrying
  its verifier config + onFail default and bound to a document for judging
  guidance (iteration history reuses document_history; no version columns)
- verify_rubrics: a named group that aggregates criteria — the reusable unit
- verify_rubric_criteria: junction, which criteria a rubric aggregates
  (criteria are reusable across rubrics)

Mounted onto an agent via the existing agency config jsonb:
- agencyConfig.verifyRubricId: a reusable rubric (criteria template)
- agencyConfig.verifyCriteriaIds: ad-hoc one-off criteria
A run's plan instantiates the union of both. No dedicated bindings table.

Snapshot + result layer:
- agent_operations.verify_plan (jsonb) + verify_plan_confirmed_at: the per-run
  immutable check-item snapshot lives ON the operation (1:1 — auto-repair spawns
  a new operation), instead of a separate plans table
- agent_operations.verify_status: denormalized rollup for list-page badges
- verify_check_results: per-criterion result with the Toulmin model
  (verdict/confidence as columns, narrative in a typed toulmin jsonb), N:1
  verifier_tracing_id for batch judging, FP/FN flags for the data flywheel;
  relates to the plan via operation_id + stable check_item_id

Ref: LOBE-10019

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

*  feat(verify): add Agent Run delivery checker backend + frontend module

Implements the verify system on top of the schema (PR #15480):
- models: verifyCriterion / verifyRubric (+junction) / verifyCheckResult;
  agentOperation verify plan/status methods
- services/verify: AI plan generation (auto-create criteria), executor with
  LLM Toulmin judge (per-criterion + batch), program placeholder, agent &
  auto-repair spawner seams, rollup chokepoint, feedback fp/fn, completion
  lifecycle bridge
- lambda verify router (criteria/rubric CRUD, plan, results, feedback)
- frontend feature module: service, SWR hooks, CheckerDock state machine,
  RunArtifact, verify i18n namespace
- tracing scenarios: VerifyPlanGen / VerifyJudge

Live UI mount (dock/artifact into chat) pending server operationId source.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(verify): persist delivery-checker verdicts via async tracing backfill

The LLM judge produced valid verdicts but they were never persisted, leaving
every run stuck at `verifying`. Two root causes:

1. FK ordering: `writeVerdict` stamped `verifier_tracing_id` synchronously, but
   the `llm_generation_tracing` row is written asynchronously (best-effort,
   after the response) — so the hard FK was violated every time and the verdict
   write was rolled back. Now the verdict is written with a null link, and the
   tracing id is backfilled by an `onPersisted` callback that fires only after
   the tracing row commits (still non-blocking). If tracing is disabled the link
   simply stays null.

2. Verdict parse: the judge JSON schema is non-strict, so the provider returns
   optional Toulmin fields as explicit `null`. The Zod validator used
   `.optional()` (accepts undefined, not null), so any null failed the whole
   `safeParse` and discarded the batch. Switched to `.nullish()`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(cli): add `verify` command for the delivery checker

Adds `lh verify` covering the full delivery-checker chain — criteria & rubric
CRUD, per-run plan (generate/state/confirm/skip), execute (LLM judge), results,
and feedback — calling the `verify` lambda router. Enables end-to-end backend
testing of the verify system.

Also adds the missing `tool-runtime` / `prompts` / `const` workspace entries to
the CLI's `pnpm-workspace.yaml` so the standalone package installs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 💄 feat(verify): add verify message role + delivery-checker card UI

Make the delivery-checker renderable in chat:

- Fix the `features/Verify` components so they compile: flatten the `verify`
  locale to the repo's flat-dotted-key convention (keySeparator: false), import
  `Flexbox`/`TextArea` from `@lobehub/ui` (react-layout-kit is no longer a dep),
  and the token cast.
- Add a `verify` UI message role + a `VerifyMessage` card that renders the
  Run Artifact + checker dock from `metadata.verifyOperationId`, wired into the
  message renderer switch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): add lobe-agent `generateVerifyPlan` tool (server runtime)

Lets an agent set up the delivery checker for its run: the agent calls
`generateVerifyPlan` early (per the new `<delivery_checker>` system-role
guidance), which instantiates the rubric / ad-hoc criteria into a frozen plan on
the current `agent_operations` row. Executed server-side only — the executor is
dispatched via `runtime[apiName]` with `operationId` threaded through the tool
execution context; the client `BaseExecutor` gracefully no-ops it.

Also registers the metadata fields (`verifyOperationId`/`verifyRound`) on the
message metadata zod schema so the role='verify' card can carry its operation id.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): surface role=verify card on run completion (LOBE-10051)

Connect the delivery checker to the conversation: when an Agent Run with a
verify plan completes, `CompletionLifecycle` inserts a persisted `role='verify'`
message (parented to the assistant, carrying `metadata.verifyOperationId`) that
renders the checker card. Self-guarded — no plan → no card, failures never
affect the run.

`role='verify'` behaves like a `user` leaf message everywhere it flows
(persistence + conversation-flow pass it through unchanged); only the
context-engine treats it specially: a new `VerifyMessageProcessor` drops it from
the model context (UI-only card, not a valid model role). Adds `verify` to
`CreateMessageRoleType`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 💄 feat(verify): merge run-artifact + checker into one card

The role=verify message rendered two stacked cards (Run Artifact summary +
Delivery Checker) that duplicated the check-item list. Merge into a single card:
the `Run Artifact · Round N` header, then the checker results + actions, then the
snapshot note. RunArtifact/CheckerDock gain an `embedded` prop (header-only /
body-only, no card chrome) and VerifyMessage composes them under one border.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): derive generateVerifyPlan rubric from agencyConfig

A real agent calls `generateVerifyPlan` with just a `goal` and doesn't know
rubric ids. When `rubricId`/`criteriaIds` params are absent, derive the mounted
rubric + ad-hoc criteria from the executing agent's
`agencyConfig.verifyRubricId / verifyCriteriaIds`. Params still win when given.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(cli): surface agent gateway WebSocket close code + reason

The `onclose` handler logged `String(event)` → the useless "[object
CloseEvent]". Surface `event.code` (+ `event.reason` when present) so a gateway
disconnect before completion is actually diagnosable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 💄 fix(verify): rename "Run Artifact" → "Verification", drop failed red border

- The kicker said "Run Artifact" — it's automated verification, not an artifact.
  Renamed to "Verification · Round N".
- Removed the red error border on a failed check — a normal card reads better.
- Fixes a render crash (`useVerifyState is not defined`): the border removal left
  a dangling reference after the import was dropped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(cli): poll run status when the agent stream drops

When the live stream (gateway WebSocket / SSE) closes before the run finishes,
the run is still executing server-side — so instead of hard-exiting, fall back to
polling `aiAgent.getOperationStatus` every 10s until the run reaches a terminal
state (or is no longer tracked). Pairs with surfacing the WS close code/reason.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 💄 feat(verify): add Render for generateVerifyPlan tool call

The generateVerifyPlan tool call rendered as the default param/result dump. Add a
Render that lists the generated delivery checks (title + gate/auto-fill tag), and
surface the items on the tool state so the Render can read them.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): auto-confirm generated plan so checks run on completion

The agent generated a plan but it stayed `planned`/unconfirmed, so the completion
hook (which gates on a confirmed plan) never ran the checks — the card was stuck
at "awaiting confirmation" with no pass/fail. In the headless agent flow there's
no one to click Confirm, so `generateVerifyPlan` now auto-confirms the plan it
generates; the checks then run automatically on completion. (An interactive
"review before run" gate is a future enhancement.)

Also: the verify card header disappeared in the draft/planned phase
(`phaseToArtifact.draft` was null). Give it a header so the card always shows its
"Verification · Round N" heading.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(agent-tracing): only count opaque/presentational attrs as structural noise

The first structuralNoiseRatio charged ALL markup (every <...> tag) as noise,
which over-penalized legitimately structured results 3x. Grounding against real
web-search output (`<item title="…" url="…">snippet</item>`) showed the tags and
the title=/url= attributes ARE the signal the model reads.

Now only opaque/presentational attribute names (id, class, style, data-*, aria-*,
role, on*) count as noise; semantic element tags and content-bearing attributes
(title, url, href, name…) are kept. On a 57-op user-interrupted sample this drops
web-search noise 42%→0% and overall estimated waste 16%→5%, leaving large-payload
(readDocument) and high error-rate tools as the real signal.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): model-authored criteria with name/description/instruction-in-document + agent verifier

Restructure the generateVerifyPlan tool to a createDocument-style full-create flow
and wire up the agent verifier path:

- criteria now = title + description (required one-liner) + instruction (required
  detailed rubric); instruction lives in a linked document (verify_criteria.documentId),
  description is a new verify_criteria column (migration 0111). verifierConfig no
  longer holds description/instruction.
- generateVerifyPlan creates verify_criteria + a rubric, snapshots the plan onto
  the operation and confirms it; judge resolves the instruction from the document.
- agent-type checks run as verifier sub-agents (execAgent + isolated thread) whose
  onComplete hook parses a VERDICT and writes it back to verify_check_results
  (renamed AgentVerifierSpawner → VerifierAgentRunner).
- UI: custom Inspector for the tool header; check list shows per-verifier-type icons
  (llm/agent/program) + description + required/optional tag; i18n en/zh.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ️ perf(verify): run program/llm/agent checks concurrently on completion

The three verifier kinds are independent; previously the agent spawn waited for
the batched LLM judge to finish. Run them via Promise.all so agent sub-agents
start immediately alongside the LLM batch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): dedicated builtin verify-agent + writeback tool, role=verify message, portal check editor

- Add `@lobechat/builtin-tool-verify` (submitVerifyResult) + builtin `verify-agent`;
  agent-type checks now run as the dedicated verify agent (not the user's agent),
  which investigates and writes its verdict back via the tool during its run.
- Verifier inherits the parent run's model/provider (builtin default may be
  unconfigured locally).
- role=verify completion message no longer requires an assistantMessageId, so the
  delivery-checker card always surfaces when a plan exists.
- Portal editor for verify checks (title/description/instruction/verifier/onFail).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(verify): restrict verify-agent to its writeback tool; fix running loader icon

Root cause of stuck `running` agent checks: the verify-agent ran in agent mode and
inherited all default tools (web-browsing, cloud-sandbox, skills, activator), so it
went off web-searching/crawling to "investigate" and never called submitVerifyResult.

- Run the verify-agent in chat mode (enableAgentMode: false, searchMode: off) — the
  strict whitelist — and whitelist `lobe-verify` for chat mode so the verifier gets
  ONLY its writeback tool.
- Sharpen the verify systemRole: judge from the provided deliverable/instruction
  (no external tools), always reach a verdict, and always call submitVerifyResult.
- CheckerDock: running check now uses the standard RingLoadingIcon (warning ring),
  matching the app's loader instead of a blue spinner.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): auto-repair loop — re-run the agent with failure feedback on failed checks

When required checks fail with onFail=auto_repair, automatically run a second
iteration instead of ending at `failed`:

- createRepairRunner: re-runs the SAME agent in the same topic with the failure
  feedback as the prompt, re-snapshots the plan onto the repair operation and
  confirms it so it re-verifies on completion (the next round). Capped at
  MAX_REPAIR_ROUNDS via parent-chain depth to prevent runaway loops.
- maybeAutoRepair: fires only once every required check has a terminal result, so
  it works for inline LLM checks (triggered from lifecycle) and async agent checks
  (triggered from the verify tool's writeback path).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): open check result detail in portal & rename artifact→result

- add a VerifyResult portal view: clicking any check row opens that result's
  detail (verdict, confidence, Toulmin sections, suggestion) on the right; agent
  checks expose their execution trace from inside the panel
- CheckerDock rows are all clickable now (chevron affordance), status shown by
  icon only; verify card uses colorBgElevated
- rename the run-result surface from "artifact" to "result" everywhere: RunArtifact
  → RunResult, phaseToArtifact → phaseToResult, and all `artifact.*` i18n keys →
  `result.*`
- ship verify namespace zh-CN / en-US locales

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): enrich check result portal — criterion stepper, richer detail view

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): rubric run-policy config + repair feedback on the verify card

Auto-repair feedback now lives on the failed round's role=verify message
(content), and the VerifyMessageProcessor surfaces it into the repair run's
context as a tagged user turn — so the repair op runs off history via a new
execAgent `suppressUserMessage` path instead of injecting a synthetic user
message. createVerifyMessage is awaited before verification to avoid a race.

maxRepairRounds becomes a rubric-level config: new `verify_rubrics.config`
jsonb column, read live at repair time via the plan's sourceRubricId. Adds a
RubricConfig portal panel (reachable from the plan card's settings affordance)
to view/edit it, wired through the verify store + TRPC.

Verify domain types/vocab/config are extracted from the DB schema into
@lobechat/types as the single source of truth; schema and consumers import
from there.

Tests: VerifyMessageProcessor dual behavior; VerifyRubricModel config
round-trip; MessageModel.findVerifyMessageByOperationId.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🗃️ refactor(verify): squash the 3 verify migrations into one

Collapse 0110 (tables) + 0111 (criteria.description) + 0112 (rubrics.config)
into a single regenerated 0110_add_verify_tables so the PR ships one clean,
idempotent migration. No schema change vs the three combined.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(cli): verify rubric run-policy config commands + shrink judging-rule editor font

CLI: `verify rubric create --max-repair-rounds`, `verify rubric view`, and
`verify rubric update` exercise the rubric config endpoints end-to-end; adds a
mocked command test. UI: judging-rule editor font 16px → 14px.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(verify): editable rubric name in the config panel + default 3 repair rounds

Add a name (title) field to the RubricConfig portal, persisted via a new
updateRubricTitle store action + service (optimistic + debounced, alongside
the config write-back). Bump DEFAULT_MAX_REPAIR_ROUNDS 2 → 3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(verify): extract generateVerifyPlan into installable lobe-delivery-checker tool

Move the delivery-checker plan-creation flow out of the always-on lobe-agent
tool into a new standalone, installable builtin tool `lobe-delivery-checker`
(Skill Store, opt-in per agent — not loaded by default). lobe-agent no longer
ships generateVerifyPlan.

- new packages/builtin-tool-lobe-delivery-checker (manifest/types/systemRole +
  client Render/Inspector/Portal moved wholesale from lobe-agent)
- new serverRuntimes/lobeDeliveryChecker.ts (generateVerifyPlan moved out of
  lobeAgent.ts), registered alongside verifyResult
- registered installable in builtin-tools (no hidden/discoverable:false, not in
  defaultToolIds/alwaysOnToolIds/runtimeManagedToolIds); renders/inspectors/
  portals/identifiers wired; lobe-agent portal entries removed
- i18n keys moved builtins.lobe-agent.verifyPlan.* → builtins.lobe-delivery-checker.*

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

*  feat(agent): add `custom` tool mode; verify agent uses it instead of chat-mode

Chat mode's contract is to strip ALL user/agent plugins (strict KB/memory/web
allow-list) — so the verify sub-agent couldn't get its writeback tool without a
leaky blanket rule. Introduce a third tool mode `custom` where the toolset is
EXACTLY the agent's declared plugins (no always-on, no defaults, no activator),
for focused builtin sub-agents.

- chatConfig.toolMode: 'agent' | 'chat' | 'custom' (overrides enableAgentMode)
- AgentToolsEngine: custom branch (defaultToolIds = plugins, rules = plugins-on,
  allowExplicitActivation only in agent mode); chatModeRules restored to strict
- verify agent → toolMode: 'custom'; lobe-verify dropped from chatModeAllowedToolIds
- test: custom mode enables exactly the declared plugin, no always-on / defaults

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 09:16:35 +08:00
YuTengjing bab3ff4a7a 🐛 fix: reduce agent document context latency (#15436) 2026-06-04 16:23:51 +08:00
Arvin Xu 475f391d97 ♻️ refactor(message): prefer dedicated usage column over metadata.usage (#15457)
* ♻️ refactor(message): prefer dedicated usage column over metadata.usage

Token usage was promoted out of metadata.usage into a dedicated messages.usage
column, but nothing populated it and all reads still went through metadata.usage.

- Centralize write-side promotion in the DB model (update / updateMetadata /
  create), so all executor callers populate the usage column from a top-level
  usage payload, falling back to metadata.usage. metadata.usage stays dual-written
  for backward-compatible reads.
- Reads prefer the usage column and fall back to metadata.usage: message queries,
  getTokenHeatmaps, recomputeTopicUsage, the usage record service, and context
  token accounting.
- Add top-level usage to UpdateMessageParams + DBMessageItem types.
- Mark metadata.usage and the legacy flat token fields as @deprecated, pointing
  to the top-level usage field.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(message): dual-write metadata.usage for top-level usage updates

When a caller passed the new top-level `usage` param without also sending
`metadata.usage`, the update wrote only `messages.usage` and left
`metadata.usage` stale/absent — legacy readers and rollback paths still consume
it during the dual-write transition. Fold the resolved usage into the metadata
patch so `metadata.usage` stays in sync regardless of how usage was passed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-04 14:14:11 +08:00
YuTengjing 2657b667be feat: export agent profiles as Markdown (#15312) 2026-05-29 02:45:25 +08:00
Arvin Xu ddb5794826 chore: clean up LOBE-XXX code annotations (#15135)
* chore: clean up LOBE-XXX annotations from codebase comments

- Remove 【LOBE-XXX】 bracket markers
- Remove LOBE-XXXX references from inline comments
- Clean up test descriptions containing LOBE identifiers
- Preserve linear.app URLs and code-level regex patterns
- Generated: 2026-05-23 02:30:09

* 🐛 fix(tests): restore () in arrow callbacks broken by annotation cleanup

The LOBE-XXX annotation cleanup script over-matched `(LOBE-XXXX', () =>`
and stripped the callback `()`, leaving invalid syntax like
`describe(..., => {` and `it(..., async => {` across 24 test files.

This caused parse failures in Test Packages, Test Desktop App, Test
Database lint, and Test App shard runs. Restoring `()` / `async ()`
unblocks the suites while keeping the ticket-text cleanup intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(hintFormat-test): restore label + ellipsis in stripMarkdownLinks fixture

The annotation cleanup stripped `LOBE-8516` from a markdown-link's
*label* (`[LOBE-8516](/task/T-1)` → `[](/task/T-1)`), which then survived
`stripMarkdownLinks` because the pattern requires non-empty link text —
the test expected the link to disappear and asserted equality on a
LOBE-free output. The same line also lost a `.` from the trailing
`...` indicator in both input and expected strings.

Substitute a neutral Chinese label (`发布计划`) so the link continues
to exercise the multi-link substitution path, and restore the full
`...` ellipsis.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Arvin Xu <arvinxx@lobehub.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 17:18:18 +08:00
Arvin Xu a0fac0b700 feat(skills): recognize project-level skills in the homogeneous agent runtime (#15110) 2026-05-22 19:22:41 +08:00
Innei cec72199bb 🐛 fix(onboarding): prevent agent identity from using user name (#15112) 2026-05-22 17:09:01 +08:00
Arvin Xu 97111fc99d 🐛 fix(context-engine): guard placeholder log preview against undefined content (#15097)
A tool error result (e.g. budget-exceeded) can arrive with
`content: undefined`. The processor's logging step called
`JSON.stringify(undefined).slice(...)`, which throws because
`JSON.stringify(undefined)` returns `undefined`, not a string — crashing
the whole processor before any message was processed.

Coerce the preview to a string before slicing.

Fixes LOBE-9408
2026-05-22 12:46:52 +08:00
Arvin Xu 83b8aa5a04 🐛 fix(agent-document): propagate sourceType and dedupe web crawls (#15088) 2026-05-22 08:40:26 +08:00
Innei 2a66071210 ♻️ refactor(onboarding): streamline discovery to a single profession question (#14987)
* ♻️ refactor(onboarding): streamline discovery to a single profession question

*  test(onboarding): update structured field fixtures
2026-05-19 22:46:46 +08:00
YuTengjing d3973a5cc0 feat: add chat cost estimate support (#14876) 2026-05-19 19:14:47 +08:00
Arvin Xu 73fa3b1689 feat: agent-documents index — hide web crawls + new table format (#14292)
*  feat: agent-documents index — hide web crawls + new table format

The default `<agent_documents_index>` was injecting every progressive
document — including hundreds of web-crawled snapshots (~73% of all
agent docs in production). The result was a low-signal list dominated
by duplicate page titles, plus zero metadata for the LLM to rank by.

This revamp:

- Hides `source_type=web` documents from the default index. Header
  surfaces the count and points the LLM at `listDocuments(sourceType=
  'web')` to enumerate them when needed.
- Renders the index as a fixed-width table with TITLE / ID / SIZE /
  UPDATED columns. Rows are sorted by recency (most-recent first).
  Empty docs render as `empty` to discourage retry reads.
- Adds `sourceType` and `updatedAt` to the `AgentContextDocument`
  contract; client mapping populates both from the DB row.
- Adds `sourceType: 'all' | 'file' | 'web'` parameter to the
  listDocuments tool/TRPC; service-layer filter applies before
  shaping the LLM response.
- Renames `target` → `scope` on listDocuments + createDocument
  (manifest, types, runtime, system role, TRPC, client service,
  call sites, tests). `target="currentTopic"` becomes
  `scope="currentTopic"` everywhere.

Coverage: inline snapshot tests in
`packages/context-engine/src/providers/__tests__/AgentDocumentInjector.test.ts`
pin the rendered output for the three load cases (mixed user docs,
web-hidden header, empty doc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(test): update listDocuments mock assertion for sourceType default

The agent-documents listDocuments runtime now forwards sourceType
(defaulting to 'all'), so the spy receives two positional args.

* 📝 docs(builtin-tool-local-system): bump documented runCommand max timeout to 800000ms

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 22:08:08 +08:00
Arvin Xu 43b0b5e854 🐛 fix(agent-runtime): honor per-tool timeout end-to-end for client tool dispatch (#14817)
* 🐛 fix(agent-runtime): honor per-tool timeout end-to-end for client tool dispatch (LOBE-8436)

Server BLPOP was hardcoded to 60s and ignored the LLM-supplied `timeout` in
`tool_call.arguments`, so long-running shell commands consistently failed
with a server-side timeout while the desktop runner was still happily
executing. Renderer also never raced its own deadline, leaving it free to
hang past the server budget.

Plumb a per-tool timeout through the full chain:

  - New `resolveToolTimeoutMs` (server) — priority: `args.timeout` >
    `manifest.api[apiName].defaultTimeoutMs` > 120s global default,
    clamped to [1s, 800s] (cloud function ceiling).
  - `dispatchClientTool` accepts `timeoutMs` in ctx; constants moved into
    `resolveToolTimeout.ts`. Default 60→120s, max 270→800s.
  - `RuntimeExecutors` calls the resolver at both client-dispatch sites
    (single + batch) using the LLM-parsed args and the effective manifest.
  - `LobeChatPluginApi` (types + context-engine) gains
    `defaultTimeoutMs?: number` so tool authors declare per-API budgets.
  - `LocalSystemManifest` sets per-API defaults: runCommand 120s,
    read/write/edit/list 30s, grep/glob/search/move 60s, killCommand 10s.
  - `local-file-shell/runner.ts` internal kill cap raised 600→800s to
    match the server ceiling.
  - Renderer `clientToolExecution.ts` rewritten to (1) race executor
    against `executionTimeoutMs - 500ms`, abort the operation's
    AbortController, and send `client_executor_timeout` on overrun;
    (2) read `gatewayConnections[operationId]` live on every send so
    reconnects between dispatch and result are picked up; (3) wrap in
    try/finally with an exactly-once `sent` guard so every `tool_execute`
    yields exactly one `tool_result` even on logic gaps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(test): drop unused @ts-expect-error and tighten timeout assertion

CI lint failed on tsgo: an `@ts-expect-error` directive in
`resolveToolTimeout.test.ts` was unused (the field's `unknown` value
type happily accepts a string at compile time), and the
`sendToolResult.mock.calls[0][0]` access in `clientToolExecution.test.ts`
tripped TS2493/TS2532 because vitest typed `calls` as an empty tuple.

Cast the test-only string value through `unknown` for the resolver
defense check; merge the budget assertion into the `toHaveBeenCalledWith`
matcher via `expect.stringContaining('2000ms')` so we never index into
`mock.calls` by hand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-17 19:23:15 +08:00
Innei a35c55c57b 🐛 fix(onboarding): remind discovery turn progress (#14833) 2026-05-15 20:28:33 +08:00
Arvin Xu 6e6970f1b2 🐛 fix(context-engine): account for tool_calls + reasoning + tool defs in compression budget (#14813)
🐛 fix(context-engine): account for tool_calls + reasoning + tool defs in compression budget

The pre-compression token check (`shouldCompress`) only counted `msg.content`,
which under-counted typical agent conversations by ~58% — tool_calls (~33%
of payload), reasoning traces (~17%), and top-level tool definitions (~2%)
were all silently ignored. As a result, conversations that the provider
tokenizer measured at ~656K passed the harness's 524K threshold without
firing compression, and were rejected upstream as ExceededContextWindow.

Verified empirically against 2 op snapshots in the same topic that hit
the failure mode (LOBE-8964): harness counted 267K, deepseek measured
649K — a 380K (58.8%) gap. ~92% of that gap is fixable by accounting
for the missing fields; the remaining ~8% is `tokenx` vs provider
tokenizer drift, compensated by a 1.25× multiplier on the trigger path.

Changes:

- New `@lobechat/context-engine/tokenAccounting` module exporting
  `countContextTokens({messages, tools, options})`. Returns structured
  per-source + per-message + per-tool breakdown — usable both by the
  compression trigger and by UI panels showing "context by type".
- `shouldCompress` in agent-runtime delegates to `countContextTokens`,
  applies the 1.25× drift multiplier on `adjustedTotal` for the trigger
  decision, exposes raw count via `currentTokenCount`. Signature now
  takes `UIChatMessage[]` directly.
- Removed deprecated `calculateMessageTokens` / `estimateTokens` /
  `TokenCountMessage` from agent-runtime — the new module supersedes
  them. `createAgentExecutors.ts` updated to call `countContextTokens`
  directly for post-compression telemetry.
- Added `raw-md` plugin to agent-runtime vitest config (needed once
  context-engine is imported transitively, since the import graph pulls
  in `@lobechat/agent-templates` `.md` files).

What's intentionally NOT counted (DB-only fields not sent to provider):
`plugin`, `pluginState`, `chunksList`, `extra`, `fileList`, etc.
Counting these would over-estimate and trigger compression too early.

Tests:

- 19 new unit tests for `countContextTokens` covering content / tool_calls
  / reasoning / tool_call_id / tool definitions / fast-path / aggregation
  / DB-only field exclusion.
- `tokenCounter.test.ts` updated for new drift semantics + UIChatMessage
  signature; one boundary case now triggers compression (intentional —
  the drift multiplier kicks in at the threshold).

Refs: LOBE-8964 (ECW edge boundary), LOBE-8972 (ECW umbrella),
LOBE-8973 (openrouter `:free` ctx), LOBE-8976 (compression diagnostics).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 13:22:19 +08:00
Arvin Xu 36d0994ec2 🐛 fix(context-engine): attach diagnostic context to PlaceholderVariablesProcessor errors (#14741)
* fix: attach diagnostic context to ProcessorError/PipelineError

* fix: include cause summary in PipelineError message

* fix: pass structured cause to ProcessorError

* fix: enhance PlaceholderVariablesProcessor with diagnostic context

* 🐛 fix: preserve placeholderVariablesProcessed count for no-op messages

processMessagePlaceholdersWithDiagnostics always returns a spread {...message},
so the identity check `processed !== message` was always true and the count
incremented even when content was unchanged (e.g. messages with no placeholders
or only unresolved `{{missing}}` tokens). Restore the JSON-equality comparison
used by the pre-PR `processMessagePlaceholders` path.

Add regression coverage for the no-op cases and for new error paths:
- only-unresolved string content, only-unresolved array text parts, mixed batch
- per-message isolation when a generator throws
- defensive validation when variableGenerators is undefined / null

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 11:26:19 +08:00
Arvin Xu e0d20e86fc feat: support chat mode and redesign chat input action bar (#14774)
* Refine chat parameter controls and working sidebar

* 💄 style: refine chat parameter controls

* 💄 style: refine chat input action affordances

* 💄 style: refine chat input control menus

* 💄 style: refine chat input skills menu

* 🐛 fix: replace skills policy dropdown with popover

* fix: base-ui dropdown

* fix: base-ui dropdown

* 💄 style: fix popover conflict and refine skills menu layout

- Extract PopoverLabel component with controlled open state to prevent
  conflict when skill policy menu opens
- Dispatch custom close event so detail popovers close before policy popover opens
- Add divider between pinned and auto skill groups
- Refine sticky search/footer padding via CSS attribute selectors
- Remove stray console.log from ActionDropdown

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 💄 style: refine skills policy menu and chat input UI

- Skills policy menu: change active icon color to blue, add divider +
  uninstall action for Klavis/MCP/agent-skill items, suppress detail
  popover when the "..." policy menu is open
- Minor refinements across ChatInput, Conversation Error/ContentLoading,
  and HeterogeneousAgent StatusGuide components

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: add custom MCP tag and configure action to skills menu

- Show orange "Custom" tag next to custom MCP plugin entries
- Add Configure action above Uninstall in the policy popover that
  opens the PluginDevModal drawer for editing the custom plugin

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: default agent mode to true and gate chat mode at the tools engine

- Move `enableAgentMode` from `LobeAgentConfig` to `LobeAgentChatConfig` so it
  persists via the existing `chat_config` jsonb column and is readable on the
  server (the top-level field was silently dropped by drizzle).
- Default to agent mode for all agents — selectors treat `undefined` as `true`;
  only an explicit `false` collapses to chat mode.
- Introduce `chatModeAllowedToolIds = [knowledge-base, memory, web-browsing]`.
  Both `createServerAgentToolsEngine` and the frontend `createAgentToolsEngine`
  now switch on this whitelist in chat mode: skip user plugins, skip
  `alwaysOnToolIds`, narrow `defaultToolIds`, and turn off
  `allowExplicitActivation` so the activator can't smuggle other tools in.
- `useToggleAgentMode` is the single mode-switch entry; `plugins[]` is left
  alone — chat mode is enforced at runtime, not by mutating saved config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: extend topic status with running/paused/failed

Widen `ChatTopicStatus` enum (DB schema, types, TRPC validation) to cover the
in-flight lifecycle that gateway and heterogeneous executor runs report. Add a
`updateTopicStatus` store action and have both runtime paths write `running`
on start and `active` on completion (or `failed` on terminal error). Sidebar
topic items render a spinner while `status === 'running'`.

Note: drizzle migration for the widened enum needs to be generated separately.

* 💄 style: polish skills menu — official tag, tooltip on settings button

Add a LobeHub "official" badge to builtin tools and agent skills surfaced in
the Skills menu. Wrap the menu's settings button in a Tooltip. Scope the
group-header padding reset to the skill-activation group only so the
Knowledge submenu keeps its native section padding.

*  feat: mark topic as paused while awaiting human tool approval

Extend the heterogeneous-agent topic status machine (c0170d032f) with a
paused state. The gateway event handler writes topic.status = 'paused' on
step_start { phase: 'human_approval' } — one hook covers both Gateway and
desktop heterogeneous paths since they share the same handler.

Resume back to 'running' is free: approve / reject_continue both spawn a
fresh op via the executor entries, which already persist 'running'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: gate skills and agent-document injectors at the context engine in chat mode

Thread `enableAgentMode` into `MessagesEngine`. When it is explicitly `false`,
the engine forces `enabled: false` on:
- SkillContextProvider — drops the <available_skills> block
- All AgentDocument injectors (BeforeSystem / SystemAppend / SystemReplace /
  Context / Message) — drops every agent-document position

The frontend (`src/services/chat/mecha/contextEngineering.ts`) and server
(`src/server/modules/AgentRuntime/RuntimeExecutors.ts` →
`serverMessagesEngine`) read `chatConfig.enableAgentMode` from agent config
and pass it through; no caller needs to know which injectors to skip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: also gate agent-management context in chat mode

`agentManagementContext` (the `<current_agent>` + `<available_agents>` block)
was leaking into chat-mode prompts whenever the agent was in auto-skill mode,
because its caller-side guard (`isInAutoSkillMode || isAgentManagementEnabled`)
is orthogonal to `enableAgentMode`. Fold the gate into the same `isAgentMode`
switch already covering skills + agent documents in `MessagesEngine` so the
injector goes off in chat mode regardless of how the caller populates the
context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix: drop orphan rebase marker in OperationTraceRecorder

Leftover `<<<<<<< HEAD` from an earlier rebase that was only half cleaned —
the HEAD-side content is the one we want; just delete the marker line so the
file type-checks again.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: cursor-style action bar on home input

Rework the home ChatInput footer to read like Cursor's composer while keeping
the model picker on the right:

- Replace the `agentMode` icon-only button with a pill trigger (icon + label
  + chevron) carrying a persistent fill, dropping a `bottomLeft` mode
  popover. Reuses the `RuntimeConfig/ModeSelector` design in place so any
  other action bar consumer picks it up automatically.
- Introduce a `modelLabel` action that shows the resolved model display name
  + chevron, opening `ModelSwitchPanel`. The original `model` icon stays
  untouched for callers that prefer the compact form.
- Wire the home input to use ['agentMode','plus'] on the left and
  ['modelLabel'] on the right; bump `SendArea` gap to 12 and add
  `paddingLeft={6}` to the action bar so the pill aligns with the input
  placeholder.
- Localize `chatMode.chat` to "对话" in zh-CN (default English stays "Chat").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: surface params panel toggle and hide it for heterogeneous agents

- Drop the developer-mode gate on the conversation header params toggle so it
  ships by default; popup routes remain excluded.
- Hide both the header toggle and the right sidebar `Params` tab for
  heterogeneous agents (Claude Code / Codex etc.), since their model params
  panel doesn't apply. The active-tab resolver also falls back away from
  `params` when it isn't available.
- Strengthen the Tools popover divider to `colorFill` so the header /
  footer separators stay visible against the elevated dark-mode surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🚑 fix: address type errors surfaced on the new-input branch

- Move the `border` from the removed `overlayInnerStyle` onto `styles.content`
  so the AgentMode / ModeSelector popovers compile against the base-ui
  `PopoverProps` shape.
- Pass `paddingLeft: 6` through `style` on `ChatInputActions` since the
  underlying Flexbox only accepts `padding` / `paddingBlock` / `paddingInline`.
- Tighten skill / market menu items: drop the unsupported `closeOnClick`
  from the group item, fallback the uninstall display name to
  `identifier`, swap the antd-style `type: 'warning'` confirm option for
  `okButtonProps.danger`, and assert the conditionally-spread market
  items as `ItemType` so the inferred union no longer contains
  `undefined`.
- Annotate `resolveMark` in `LevelSlider` so the fallback branch returns
  a `ReactNode` label, fixing the `MarkObj` mismatch on `LevelOption`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Innei <tukon479@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 00:07:47 +08:00
Arvin Xu ccddbaa25d ♻️ refactor(builtin-tool): move sub-agent dispatch from lobe-gtd to lobe-agent (#14715)
* ♻️ refactor(builtin-tool): move sub-agent dispatch from lobe-gtd to lobe-agent

Move the `execTask` / `execTasks` capability out of `packages/builtin-tool-gtd/`
and into `packages/builtin-tool-lobe-agent/`, renaming the public APIs to
`callSubAgent` / `callSubAgents`. The "subtask" naming inside GTD overlapped
with the new lobe-task tool's task model and conflated planning with
sub-agent dispatch.

- API names: `execTask` → `callSubAgent`, `execTasks` → `callSubAgents`
- TS types: `ExecTaskParams` → `CallSubAgentParams`, etc.; introduce
  `SubAgentTask` to replace `ExecTaskItem`
- Client UI (Inspector / Render / Streaming) ported under
  `packages/builtin-tool-lobe-agent/src/client/`
- Central registries (`packages/builtin-tools/src/{inspectors,renders,streamings}.ts`)
  updated to register lobe-agent
- GTD `meta.description` and system role no longer mention async tasks;
  they point to lobe-agent for sub-agent dispatch
- `isSubTask` filtering in `agentConfigResolver` now excludes `lobe-agent`
  (new owner of sub-agent dispatch) instead of `lobe-gtd`
- i18n: new `builtins.lobe-agent.apiName.callSubAgent*` and
  `workflow.toolDisplayName.callSubAgent*` keys in default/zh-CN/en-US

Kept the executor's emitted `state.type` values (`execTask` / `execTasks` /
`execClientTask` / `execClientTasks`) unchanged so the agent-runtime
instruction layer (`exec_task` / `exec_tasks` / `exec_client_task*`) and all
downstream tests / heterogeneous executors (`builtin-tool-agent-management`,
server `agentManagement` runtime) continue to work without modification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ♻️ refactor(chat): rename isSubTask flag to isSubAgent

After moving sub-agent dispatch from lobe-gtd to lobe-agent, the flag name
no longer matches what it controls. Rename `isSubTask` → `isSubAgent` across
the chat / agent runtime layer and update related comments and test labels.

- `agentConfigResolver` context field + filter helper
- `streamingExecutor.internal_createAgentState` + `executeClientAgent`
  signatures and call sites
- `createAgentExecutors` (exec_task / exec_client_task handlers) and
  `GroupOrchestrationExecutors` (batch_exec_async_tasks)
- `chatService.createAssistantMessageStream` `resolvedAgentConfig` docs
- Test descriptions and assertions in `agentConfigResolver.test.ts` and
  `streamingExecutor.test.ts`

No behavior change — the flag's filter target (`lobe-agent` identifier) is
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ♻️ refactor(agent-runtime): rename exec_task wire identifiers to exec_sub_agent

Bring the agent-runtime "wire" naming in line with the lobe-agent
callSubAgent / callSubAgents API rename. Three layers are renamed in lockstep
to keep the bridge between tool executors and the runtime consistent:

1. Tool-emitted state.type discriminators
   - 'execTask' → 'execSubAgent'
   - 'execTasks' → 'execSubAgents'
   - 'execClientTask' → 'execClientSubAgent'
   - 'execClientTasks' → 'execClientSubAgents'

2. AgentInstruction.type and matching TS interfaces
   - 'exec_task' / 'exec_tasks' / 'exec_client_task' / 'exec_client_tasks'
     → 'exec_sub_agent' / 'exec_sub_agents' / 'exec_client_sub_agent' /
       'exec_client_sub_agents'
   - AgentInstructionExecTask → AgentInstructionExecSubAgent (and the three
     siblings)
   - ExecTaskItem → SubAgentTask

3. AgentRuntimeContext.phase + matching payload types
   - 'task_result' → 'sub_agent_result'
   - 'tasks_batch_result' → 'sub_agents_batch_result'
   - TaskResultPayload → SubAgentResultPayload
   - TasksBatchResultPayload → SubAgentsBatchResultPayload

Also renames the operation-type discriminator 'execClientTask' /
'execClientTasks' to 'execClientSubAgent' / 'execClientSubAgents' and updates
its locale string in default / zh-CN / en-US.

Tests / fixtures / mocks updated in lockstep:
- packages/agent-runtime/src/agents/{GeneralChatAgent.ts,__tests__/...}
- packages/builtin-tool-{lobe-agent,agent-management}/src/...
- src/server/services/toolExecution/serverRuntimes/agentManagement.ts
- packages/agent-mock/src/cases/builtins/todo-write-stress.ts (helper renamed
  to callSubAgent)
- src/store/chat/agents/createAgentExecutors.ts + exec-task / exec-tasks tests
  + fixtures/mockInstructions.ts (createExecSubAgent[s]Instruction)
- src/store/chat/slices/aiChat/actions/streamingExecutor.ts (phase check)
- packages/conversation-flow/src/__tests__/fixtures/**/*.json (8 fixtures
  retargeted from lobe-gtd/execTask[s] to lobe-agent/callSubAgent[s] with the
  new state.type wire values)

No behavior change — the agent runtime, executors and tests all go through
the same code paths; only the strings on the wire change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ♻️ refactor(builtin-tool): absorb GTD tool (plan + todo) into lobe-agent

Delete `packages/builtin-tool-gtd/` and fold its full surface — plan, todo,
ExecutionRuntime, all client UI (Inspector / Render / Streaming /
Intervention / SortableTodoList) and the system role — into
`packages/builtin-tool-lobe-agent/`. Single `lobe-agent` identifier now
owns: plan + todo management, sub-agent dispatch, and visual media analysis.

Also restructures the lobe-agent package so the executor lives under
`./client/` alongside the UI it ships with, and drops the dedicated
`./executor` export — consumers go through `./client` for everything
client-side.

Package-level changes:
- DELETE `packages/builtin-tool-gtd/` entirely.
- `packages/builtin-tool-lobe-agent/`
  - Move `src/executor/` → `src/client/executor/`. Drop `./executor` from
    `package.json` exports; expose `lobeAgentExecutor` via `./client` only.
  - Rename `GTDExecutionRuntime` → `PlanExecutionRuntime` and place under
    `src/client/executor/PlanRuntime/`. Re-export from package root so the
    server runtime can consume it without pulling in client UI deps.
  - Extend `LobeAgentExecutor` with `createPlan` / `updatePlan` /
    `createTodos` / `updateTodos` / `clearTodos`, all delegated to the
    shared runtime.
  - Add Plan + Todo API entries to the manifest (with their original
    descriptions, humanIntervention, renderDisplayControl).
  - Move all GTD client UI verbatim:
    `Inspector/{ClearTodos,CreatePlan,CreateTodos,UpdatePlan,UpdateTodos}`,
    `Render/{CreatePlan,TodoList}`, `Streaming/CreatePlan`,
    `Intervention/{AddTodo,ClearTodos,CreatePlan}`,
    `components/SortableTodoList`. Register them in
    `LobeAgentInspectors / Renders / Streamings`, add new
    `LobeAgentInterventions`.
  - Merge GTD system role into lobe-agent's (`<plan_and_todos>` plus the
    existing `<sub_agents>` and `<run_in_client>` sections).
  - `package.json`: pick up `@lobechat/prompts` dep and `@lobehub/editor` +
    `antd` + `lucide-react` peer-deps inherited from GTD.

Central registries (`packages/builtin-tools/src/*`) and consumers:
- Remove every `GTDManifest / Inspectors / Renders / Streamings /
  Interventions` import + registration; existing `LobeAgent*` registrations
  now cover them.
- Replace `[GTDManifest.identifier]: GTDInterventions` with
  `[LobeAgentManifest.identifier]: LobeAgentInterventions`.
- Drop `@lobechat/builtin-tool-gtd` workspace dep from
  `packages/builtin-tools/package.json`, `packages/builtin-agents/package.json`
  and root `package.json`.
- Remove `gtdExecutor` from `src/store/tool/slices/builtin/executors/index.ts`;
  switch `lobeAgentExecutor` import to `/client`.
- Replace `serverRuntimes/gtd.ts` with a service factory
  `serverRuntimes/lobeAgentPlan.ts` (`createServerPlanRuntimeService`).
  `serverRuntimes/lobeAgent.ts` instantiates `PlanExecutionRuntime` with
  that service so the registry exposes one runtime per `lobe-agent`
  identifier covering both visual analysis and plan/todo.
- `services/chat/mecha/contextEngineering.ts`: gate plan/todo injection on
  `LobeAgentIdentifier` instead of `GTDIdentifier`.
- `agentConfigResolver.test.ts`: switch fixture plugin IDs to
  `LobeAgentIdentifier`.
- `packages/const/src/recommendedSkill.ts`: drop the standalone `lobe-gtd`
  recommendation — `lobe-agent` already covers it via `defaultToolIds`.

i18n migration (default + zh-CN + en-US; other locales regenerate on
`pnpm i18n`):
- `builtins.lobe-gtd.*` → `builtins.lobe-agent.*` in `plugin.ts/json`.
- `lobe-gtd.*` (tool namespace) → `lobe-agent.*` in `tool.ts/json`.
- Remove `tools.builtins.lobe-gtd.{description,readme,title}` from
  `setting.ts/json` (lobe-agent has its own meta now).
- Update all client component `t(...)` keys to the new namespace.

Mocks / fixtures / tests:
- `packages/agent-mock/src/cases/builtins/todo-write-stress.ts`: all
  `identifier: 'lobe-gtd'` → `'lobe-agent'`; helper comments updated.
- `packages/types/src/stepContext.ts`: comment refers to
  `builtin-tool-lobe-agent` (the only consumer of `StepContextTodoItem`).
- `packages/model-runtime/src/core/streams/google/google-ai.test.ts`:
  function-call names from `lobe-gtd____createPlan` etc. → `lobe-agent____*`.
- `src/store/chat/slices/message/selectors/dbMessage.test.ts`: same.
- `src/features/DevPanel/RenderGallery/fixtures/lobe-gtd.ts` deleted; its
  plan/todo fixtures are folded into `fixtures/lobe-agent.ts` alongside the
  existing `callSubAgent[s]` ones.
- Replace `console.log` → `console.info` in moved client components to
  satisfy lobe-agent's stricter ESLint rules (GTD package allowed
  `console.log`; lobe-agent inherits the repo-wide `no-console` rule).

No behavior change for end users: `lobe-agent` now owns all the APIs,
identifiers, and UI that previously lived in `lobe-gtd`, but as a single
consolidated package under a single tool identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ♻️ refactor(context-engine): drop residual GTD naming, rename to PlanInjector / TodoInjector

Follow-up to 9ca5c9d (which absorbed the GTD tool package into lobe-agent).
That commit moved the package surface but left the GTD vocabulary embedded
in context-engine providers, types, metadata fields, XML tags, and a pile
of comments. This change finishes the sweep so the only remaining GTD
references are user-facing docs and the legitimate Productivity & GTD Coach
methodology suggestion.

context-engine
- `GTDPlanInjector` → `PlanInjector`; types `GTDPlan`/`GTDPlanInjectorConfig`
  → `Plan`/`PlanInjectorConfig`; metadata `gtdPlanId`/`gtdPlanInjected` →
  `planId`/`planInjected`; XML tag `<gtd_plan>` → `<plan>`; debug channel
  `provider:GTDPlanInjector` → `provider:PlanInjector`.
- `GTDTodoInjector` → `TodoInjector`; types `GTDTodoItem`/`GTDTodoList`/
  `GTDTodoStatus`/`GTDTodoInjectorConfig` → `TodoItem`/`TodoList`/
  `TodoStatus`/`TodoInjectorConfig`; metadata `gtdTodo*` → `todo*`;
  XML tag `<gtd_todos>` → `<todos>`, wrapper `gtd_todo_context` →
  `todo_context`; debug channel renamed similarly.
- `MessagesEngineParams.gtd?: GTDConfig` → `planTodo?: PlanTodoConfig`;
  internal vars `isGTDPlanEnabled`/`isGTDTodoEnabled` →
  `isPlanEnabled`/`isTodoEnabled`. Re-exports updated in `providers/index.ts`
  and `engine/messages/{index,types}.ts`.

prompts
- `packages/prompts/src/prompts/gtd/` → `planTodo/` (only export was
  `formatTodoStateSummary`, which kept its name). Updated `prompts/index.ts`
  re-export.

src/services
- `contextEngineering.ts`: `GTDConfig` import → `PlanTodoConfig`;
  `isGTDEnabled`/`gtdConfig` → `isPlanTodoEnabled`/`planTodoConfig`; payload
  field `gtd` → `planTodo`; log message wording.

Tests
- `dbMessage.test.ts`: helper `createGTDToolMessage` →
  `createLobeAgentToolMessage`; `gtdMessage` → `lobeAgentMessage`; all `it`
  descriptions reworded to "lobe-agent" instead of "GTD".
- `agentConfigResolver.test.ts`: test descriptions reworded.

Comments / docs (no behavior change)
- agent-runtime (`instruction.ts`, `runtime.ts`, `generalAgent.ts`,
  `messageSelectors.ts`), `types/{stepContext,tool/builtin}.ts`,
  `builtin-agents/group-supervisor`, `builtin-tool-claude-code/types.ts`,
  `builtin-tool-lobe-agent/Render/TodoList`, `createAgentExecutors.ts:1426`,
  `AssistantGroup/{constants,Fallback.test}`, `agent-mock/todo-write-stress`,
  `.agents/skills/builtin-tool/references/architecture.md`.

Intentionally left alone
- `docs/usage/agent/gtd.{mdx,zh-CN.mdx}` and other docs — user-facing
  product brand "GTD Tools".
- `src/locales/default/suggestQuestions.ts` "Productivity & GTD Coach" —
  references the methodology, not the tool.
- `ToolSystemRoleProvider.test.ts` `'gtd-tool'` fixture — generic test
  identifier, unrelated.
- Translated locale files still carrying `lobe-gtd.*` keys — regenerated by
  `pnpm i18n` from the updated default namespace.

Verified: `bun run type-check` passes; touched test files
(dbMessage, agentConfigResolver) and full context-engine + prompts test
suites pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(builtin-tool-lobe-agent): reset TodoList auto-save status to idle

`performSave` (the debounced auto-save path) was leaving `saveStatus` stuck
on 'saved' forever — `saveNow` had the 1.5s setTimeout-to-idle but the
auto-save twin didn't, so the inline indicator never eased back to idle
after a settle. Add the same idle-reset to performSave so both paths
behave the same.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 01:13:04 +08:00
Innei b794eb1fb9 ♻️ refactor(web-onboarding): merge agent-marketplace identifier into onboarding tool (#14672)
* ♻️ refactor(web-onboarding): merge agent-marketplace identifier into onboarding tool

Drop the standalone `lobe-agent-marketplace` builtin tool and fold its
`showAgentMarketplace` / `submitAgentPick` APIs into `lobe-web-onboarding`
so onboarding exposes a single tool identifier.

- Move marketplace API entries (with humanIntervention/renderDisplayControl)
  into WebOnboardingManifest; extend WebOnboardingApiName.
- Compose AgentMarketplaceExecutionRuntime inside WebOnboardingExecutionRuntime;
  the client WebOnboardingExecutor now owns showAgentMarketplace/submitAgentPick
  with telemetry hooks. Drop the separate client/server executor + runtime files.
- Merge marketplace Inspector / Intervention / Render maps under the
  web-onboarding identifier. Remove AgentMarketplace* entries from
  builtin-tools registries and from the builtin web-onboarding agent's
  plugins list.
- Switch customInteractionHandlers to route by (identifier, apiName) so
  the marketplace picker handler fires only on `showAgentMarketplace`.
- Drop the `lobe-agent-marketplace` fallback string in
  OnboardingActionHintInjector; match by apiName only.
- Rename plugin/setting locale keys under `lobe-web-onboarding.*`.

* 🐛 fix(onboarding): reserve scroll headroom for agent marketplace overlay

- Add a footerSlot spacer in ChatList matching the marketplace panel height so the latest message can be scrolled into view above the absolute overlay.
- Nudge the marketplace overlay inset by 2px to hide subpixel border seams.
- Document turn output order in the onboarding system role to avoid trailing filler text after tool calls.
2026-05-11 21:29:41 +08:00
Innei 831c2585f1 🐛 fix(onboarding): skip marketplace on early exit, drop CJK in prompts (#14598)
* 🐛 fix(onboarding): skip marketplace on early exit, drop CJK examples in prompts

Honor the user's wish to leave: when the onboarding agent detects a true
early-exit signal in any phase, persist what is known, send a brief
farewell, and call finishOnboarding directly. The marketplace handoff is
mandatory only on normal Phase 4 / Summary completion. Previously the
spec forced the agent to invent categoryHints from environment cues
when discovery was thin, producing noisy recommendations for users who
explicitly asked to stop.

- Replace systemRole §Early Exit with a 4-step flow (no marketplace, no
  summary), and remove the trailing "respect their time" rationale that
  contradicted the new policy.
- Update toolSystemRole turn-protocol exception accordingly; mark
  persistence as best-effort (do not retry on failure) since the
  Pre-Finish Checklist is overridden on early exit.
- Update OnboardingActionHintInjector L101/L127 hints to match the new
  flow, and append an EXCEPTION clause to the Summary not-opened hint
  so a true exit signal in Summary skips the marketplace too.
- Strip CJK example phrases from prompt text; rely on the LLM's
  multilingual recognition with "equivalents in any language" hints.

* 🔨 refactor(FollowUpChips): remove unused consume function and reset editor state on chip click
🔨 style(InterventionBar): remove overflow hidden from container style

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(ci): align FollowUpChips test with removed consume and increase timeout for PGlite cold-start

---------

Signed-off-by: Innei <tukon479@gmail.com>
2026-05-11 15:45:54 +08:00
YuTengjing 58318e97df 🐛 fix: store onboarding interests as keys (#14624) 2026-05-10 16:44:22 +08:00
Arvin Xu 0d39dff2d5 🐛 fix(agent-runtime): recover malformed tool_call names instead of finishing silently (#14577)
* 🐛 fix(agent-runtime): recover malformed tool_call names instead of finishing silently

When an LLM emits tool_call names without the `____` separator (e.g. `activateTools`
instead of `lobe-activator____activateTools`), the resolver dropped them silently and
the harness finished with "completed without tool calls" — empty assistant bubble,
no error in dashboards.

Three layers of defense:

- Resolver fallback: when the bare name uniquely matches an API across known
  manifests, recover the identifier; ambiguous matches still drop to avoid
  false binding.
- StreamingHandler logs unresolved tool_call names so the silent-drop path is
  observable in debug output.
- GeneralChatAgent surfaces the unresolvable count and names in reasonDetail
  so dashboards can distinguish this from a genuine no-tool completion.

Fixes LOBE-8696

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(agent-runtime): restrict bare-name fallback to tools offered this turn

Address review feedback on the LOBE-8696 resolver fallback. The
manifests map passed to ToolNameResolver.resolve is broader than the
tools actually sent to the LLM (the client builds it from every
installed plugin and every builtin; the server can preserve manifests
even after a step deactivates a tool). Without a turn-scope
restriction:

- A model returning a malformed bare name could resolve to a tool that
  was not enabled for this turn.
- A disabled duplicate API name could shadow the enabled call and make
  it look ambiguous, dropping a valid call.

Pipe an `offeredToolNames` list (the names actually sent in this LLM
payload) into resolve(): when set, the missing-prefix fallback only
considers manifests whose generated tool name appears in the list.

- ToolNameResolver.resolve gains an optional `offeredToolNames` param.
- internal_transformToolCalls forwards the list through.
- createAgentExecutors builds resolvedAgentConfig before the
  StreamingHandler so the closure can bind the offered names — same
  list that gets sent to the model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 20:47:21 +08:00
Innei 4ebd8f7f7c ♻️ refactor(onboarding): extract language and privacy as shared prefix steps (#14538)
* ♻️ refactor(onboarding): extract language and privacy as shared prefix steps

Move the language-selection and privacy/telemetry consent out of the classic
flow into a shared prefix that runs at /onboarding before branching into either
the agent or classic experience. Welcome decoration is merged with language
selection on a single screen, dropping the total step count by one.

Shared-prefix completion is derived from raw stored settings
(s.settings.general.responseLanguage and telemetry), so no new schema fields
are introduced and existing consumers that rely on the merged-default
telemetry value are unaffected.

Branch routing remains automatic (feature flag + isDesktop check) and is now
encapsulated in deriveOnboardingBranchPath. Both branch routes guard against
entering before the shared prefix is complete.

MAX_ONBOARDING_STEPS drops from 5 to 3 (FullName, Interests, ProSettings).

* ♻️ refactor(onboarding): use original Telemetry + ResponseLanguage as shared steps

Revert the merged welcome+language design. The shared prefix now reuses the
original two classic steps as-is:
- Step 1: TelemetryStep (welcome decoration + privacy/telemetry consent)
- Step 2: ResponseLanguageStep (language selection)

Also suppress the mode-switch + skip footer on the bare /onboarding path so
it only appears once the user has entered the agent or classic branch.

* 🐛 fix(onboarding): persist shared-prefix step in URL to survive locale-triggered remounts

Use react-router's useSearchParams to keep the active shared step in the URL
(?step=2). Local useState was lost when switching language for the first time
because i18next's first-time resource load triggers a remount up the tree;
the URL param survives any remount.

* 🐛 fix(onboarding): unblock branch redirect when user accepts default telemetry

Derive commonStepsCompleted from responseLanguage alone. setSettings strips
fields whose value matches DEFAULT_COMMON_SETTINGS, so accepting the default
telemetry: true left s.settings.general.telemetry undefined and the derive
selector never flipped to true — the redirect to the branch never fired.

Step 2 (language) implies step 1 was completed because the flow is sequential,
so checking responseLanguage alone is sufficient and robust against the
default-strip behavior.

* 🐛 fix(onboarding): redirect after step 2 by deriving completion from responseLanguage only

setSettings strips fields that match defaultSettings, so writing
telemetry=true (the default) never persists to s.settings.general.
That made commonStepsCompleted permanently false even after the user
finished both steps, blocking the redirect to the branch flow.

Drop telemetry from the derive check. Step 1 completion is already
tracked via the URL ?step=2 marker; step 2 completion is the only
event that needs to flip commonStepsCompleted, signalled by writing
responseLanguage (which always differs from the default since
DEFAULT_COMMON_SETTINGS has no responseLanguage entry).

* 🔨 chore(scripts): add reset-onboarding script for redoing the flow

Takes an email, clears users.onboarding, agent_onboarding, full_name,
interests and removes responseLanguage + telemetry from
user_settings.general so the user re-enters the shared-prefix
onboarding from step 1.

Usage:
  pnpm workflow:reset-onboarding <email>
  bunx tsx scripts/resetOnboarding/index.ts <email>

* 🐛 fix(signup): add refs for email and password inputs to improve focus handling

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(onboarding): skip responseLanguage auto-fill while onboarding is in progress

useInitUserState's onSuccess callback auto-fills general.responseLanguage
from navigator.language whenever the field is missing. For new users
this fired immediately after signup, which made commonStepsCompleted
(which derives from responseLanguage being set) flip to true on first
load, and CommonOnboardingPage's early-redirect skipped past the shared
prefix straight into /onboarding/agent.

Gate the auto-fill on onboarding.finishedAt or agentOnboarding.finishedAt
being set, so legacy users who finished onboarding without
responseLanguage still get the safety-net detection, but in-progress
users keep the field undefined until they explicitly choose it on the
language step.

* 🐛 fix(onboarding): refresh welcome message locale until conversation starts

ensureWelcomeMessage previously only created the welcome on first call
and skipped on subsequent ones, leaving stale welcomes locked to the
locale that was active when the topic was first created. After the
shared-prefix refactor users pick their language earlier than they
used to, so the welcome that was generated during the auto-detect
phase never gets re-translated.

Now the welcome content is rewritten in-place to match the current
responseLanguage as long as no user reply has been recorded yet
(message count <= 1). Once the conversation has started, the welcome
is left as part of the chat history.

* 🐛 fix(onboarding): update welcome message handling to render client-side and avoid persisting during onboarding

Signed-off-by: Innei <tukon479@gmail.com>

* Refactor onboarding user profile handling: remove responseLanguage field

- Removed responseLanguage from SaveUserQuestionInput and related schemas.
- Updated onboarding logic to no longer save or request responseLanguage.
- Adjusted related components and services to reflect the removal of responseLanguage.
- Enhanced user info handling to include displayName and fullName from OAuth.
- Updated tests to align with the new onboarding structure.

Signed-off-by: Innei <tukon479@gmail.com>

* refactor(onboarding): update locale handling to use i18n's resolved language

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(onboarding): remap legacy 5-step classic currentStep on shared-prefix mount

Mid-flow legacy users with persisted currentStep authored under the old
5-step classic flow (Telemetry, FullName, Interests, Language, ProSettings)
would silently skip required profile steps after the renumbering: old
step 2 (FullName) rendered Interests, old step 3 (Interests) rendered
ProSettings. Apply a one-time remap (2->1, 3->2, >=4->MAX) when Common
mounts, gated by isUserStateInit and onboarding.finishedAt absence so it
fires only for in-flight legacy users. Idempotent for new-schema values.

* refactor(onboarding): implement AGENT_ONBOARDING_ENABLED master switch for onboarding flow

Signed-off-by: Innei <tukon479@gmail.com>

* refactor(onboarding): standardize AGENT_ONBOARDING_ENABLED naming in tests

Signed-off-by: Innei <tukon479@gmail.com>

---------

Signed-off-by: Innei <tukon479@gmail.com>
2026-05-09 14:31:50 +08:00
YuTengjing 38c92fa04a 🐛 fix: sanitize provider tool names (#14510) 2026-05-08 11:47:07 +08:00
YuTengjing b5d7696dbd feat: add visual understanding tool (#14378) 2026-05-02 22:18:50 +08:00
AmAzing- babdc6ade5 Fix task drawer agent metadata hydration (#14315) 2026-04-29 22:55:55 +08:00
Innei 990942fb45 feat(agent-marketplace): fetch onboarding templates from market API (#14286)
*  feat(agent-marketplace): implement onboarding agent marketplace picker

Adds a new builtin tool `@lobechat/builtin-tool-agent-marketplace` that
opens a categorized agent picker UI during web onboarding. The picker
fetches the live curated catalog from the marketplace API
(`/api/v1/agents/onboarding-full`) via a TRPC procedure that injects the
trust-token, and lets the user select template agents to install.

Highlights:

- Self-contained marketplace package with manifest, system role, executor,
  and ExecutionRuntime
- React intervention component with category sidebar, skeleton loading
  state, and avatar/empty/error UI; all user-visible strings i18n-driven
- Dependency-inverted fetcher: package exports `setAgentTemplatesFetcher`,
  app registers a TRPC-backed implementation in AgentOnboardingPage
- New TRPC `market.agent.getOnboardingFull` proxies the upstream API with
  trust-token authentication; client never sees secrets
- Splits the existing `saveUserQuestion` intervention into agent identity
  and user profile cards for clearer onboarding approval UX
- Wires marketplace into `builtin-tools` registry, executor map, and
  onboarding metrics; web-onboarding agent system prompt updated to
  reference the picker

Closes LOBE-7801

*  feat(onboarding): enhance early exit handling and marketplace integration in onboarding flow

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(agent-marketplace): register server runtime, scope picks per-topic, and harden onboarding handoff prompts

The summary phase silently skipped the marketplace handoff because the
server toolExecution registry had no runtime for `lobe-agent-marketplace`,
so every `showAgentMarketplace` call returned "not implemented" and the
agent fell through to `finishOnboarding`. The runtime-injected phase
guidance and action hints also instructed the agent to call
finishOnboarding directly after the summary, contradicting the new
system role.

- Register `agentMarketplaceRuntime` in
  `src/server/services/toolExecution/serverRuntimes` so the executor
  can actually run.
- Scope the in-memory `picks` map by `topicId` and reject a second
  `showAgentMarketplace` call in the same conversation with a clear
  "already opened, finish on next turn" message.
- Tighten the success content to instruct the model to STOP the current
  turn after opening the picker and run closing + finishOnboarding on
  the FOLLOWING user turn.
- Update `OnboardingActionHintInjector`, `PHASE_GUIDANCE.summary`,
  `toolSystemRole` and `web-onboarding/systemRole` so all four prompt
  layers agree: open the picker exactly once during summary, do not
  call finishOnboarding in the same turn, and do not call the
  submit/skip/cancel APIs ourselves.
- Stop treating short affirmations like "好的" / "行" / "ok" as
  early-exit signals; they are confirmation of the summary and should
  let the picker handoff proceed normally.

Verified end-to-end with `bun run agent-evals run onboarding/web-onboarding-v3
--case-id fe-intj-crud-v1 --model deepseek-v4-pro`: hard assertions all
pass, judge moves from 7/10 (premature finishOnboarding in same turn)
to 8/10 with picker opened once and finishOnboarding deferred to the
next turn.

* fix(ci): attempt 1 for PR #14286

Auto-generated by pr-dispatcher (task: 01KQBY8GAC1MNQCJ6T6X5DEP2F, attempt: 1).

Co-Authored-By: Claude <noreply@anthropic.com>

* 🐛 fix(agent-marketplace): wire picker submit + fix marketplace-already-opened detection

The marketplace picker confirm flow was sending the user's selection back as a
synthetic user message, and the action hint kept telling the model to open the
marketplace again — leading to a death loop where the agent re-opened the
picker instead of summarizing + persisting + finishing onboarding.

Two issues:

1. Pick confirm forwarded the selection as a user message instead of forking
   the agents and resuming from the tool result. Wire `prepareCustomInteractionSubmit`
   into the intervention's submit branch so it runs `installMarketplaceAgents`
   client-side and returns a descriptive `toolResultContent`. Plumb a
   `createUserMessage: false` + `toolResultContent` option through
   `submitToolInteraction` (slice + chat store): when set, skip the synthetic
   user message, override the tool message content, and resume runtime from the
   tool message (`parentMessageType: 'tool'`) so the LLM sees the install
   result and continues from there.

2. `OnboardingActionHintInjector.marketplaceAlreadyOpened` read `msg.tool_calls`,
   but this provider runs in pipeline phase 4.5 (virtual tail guidance) BEFORE
   `ToolCallProcessor` (phase 5) converts DB-shape `tools` → OpenAI-shape
   `tool_calls`. Detection always returned false → the hint kept saying
   "call showAgentMarketplace" → death loop. Fix: match on `tools[].apiName`
   (with `tool_calls` kept as a fallback). Also rewrote the Summary-phase hints
   to reflect the new flow (picker resolves directly via tool result, no
   synthetic user reply needed).

Includes intervention bar portal-target plumbing for approval actions.

*  feat(onboarding): wire marketplace picker analytics on agent onboarding page

Mount AnalyticsBridge under AgentOnboardingPage to inject useAnalytics() into
setOnboardingAnalyticsClient, so onboarding_marketplace_shown/picked events
emit through PostHog instead of being silently dropped. Adds spm fields to
align with onboardingFeedback's telemetry shape.

* ♻️ refactor: move DEFAULT_ONBOARDING_MODEL to business-const

Made-with: Cursor

*  test(customInteractionHandlers): add tests for persisting marketplace picks and resolutions

Signed-off-by: Innei <tukon479@gmail.com>

*  feat(onboarding): enhance agent marketplace integration with metadata persistence

Signed-off-by: Innei <tukon479@gmail.com>

*  feat(agent): add web onboarding agent selectors and integrate into Actions and Usage components

Signed-off-by: Innei <tukon479@gmail.com>

---------

Signed-off-by: Innei <tukon479@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
2026-04-29 21:25:16 +08:00
Innei 7b6978271a feat(chat): support local file mention snapshots (#14278)
*  support local file mention snapshots

*  feat(local-file-mention): implement useLocalFileMention hook for local file search functionality

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix desktop project file index fallback

---------

Signed-off-by: Innei <tukon479@gmail.com>
2026-04-29 19:39:31 +08:00
Arvin Xu fbe8ab3891 ♻️ refactor(context-engine): drop ____builtin suffix from tool names (#14289)
♻️ refactor(context-engine): drop ____builtin suffix from tool names

Builtin tools now generate two-segment names like documents____upsertDocumentByFilename instead of documents____upsertDocumentByFilename____builtin. The "default" plugin type was already suffix-less, and "default" is no longer in active use, so collapsing builtin into the same shape removes redundant LLM-facing tokens. resolve() falls back to type 'builtin' for two-segment names and still parses legacy three-segment ____builtin names from message history.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 11:25:24 +08:00
Arvin Xu 998c22890d 🐛 fix(context-engine): normalize tool parameters required to [] (#14178)
Object-typed JSON Schemas without `required` could be reserialized as
`required: null` by strict OpenAI-compatible upstreams (bailian / glm /
zhipu), which then reject the request with `at '/required': got null,
want array`. Default missing/non-array `required` to `[]` at the tool
generation boundary so the wire format stays consistent.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 23:43:04 +08:00
Innei 043d2a81fb feat(agent): add floating chat panel and workspace improvements (#13887)
*  feat(FloatingChatPanel): add single-instance mount guard

*  feat(FloatingChatPanel): add inner ChatBody layout

*  feat(FloatingChatPanel): add reusable floating conversation panel

*  test(FloatingChatPanel): add props wiring smoke tests

* Refactor agent topic and page routes

* Restore topic page routing for floating chat panel

*  feat(FloatingChatPanel): enhance ChatBody and TopicItem for improved routing and styling

- Updated ChatBody to maintain scroll ownership while hiding overflow.
- Refactored TopicItem to correctly highlight active topics based on routing context.
- Added tests for TopicItem to ensure correct active state behavior.
- Introduced static styles for FloatingChatPanel to manage layout overflow.

Signed-off-by: Innei <tukon479@gmail.com>

* chore: help to merge & rebase

* chore: align merge with canary — drop pkg.pr.new ui, adopt canary useMenu, remove NotebookButton

*  feat: add ViewSwitcher component and update localization for chat views

- Introduced a new ViewSwitcher component to toggle between chat, page, and task views in the conversation header.
- Updated English and Chinese localization files to include new labels for the view switcher options.
- Refactored the conversation header to integrate the ViewSwitcher, enhancing the user interface for better navigation.

Signed-off-by: Innei <tukon479@gmail.com>

* fix: update @lobehub/ui to version 5.9.1 and refactor FloatingChatPanel to use FloatingSheet component

- Updated the @lobehub/ui dependency in package.json to version 5.9.1.
- Refactored FloatingChatPanel to utilize the new FloatingSheet component, enhancing its layout and state management.
- Introduced a new ChatLayout component for better organization of chat-related UI elements.
- Adjusted routing configuration to incorporate the new ChatLayout for agent chat pages.

Signed-off-by: Innei <tukon479@gmail.com>

* feat: add TopicCanvas and TitleSection components for topic management

- Introduced TopicCanvas component to serve as a document canvas for topics, integrating an editor and title section.
- Added TitleSection component for managing topic titles and emojis, enhancing user interaction with a dedicated UI.
- Updated FloatingChatPanel to accommodate the new TopicCanvas, ensuring a cohesive layout in the topic page.
- Enhanced tests to verify the integration of TopicCanvas within the topic page route.

Signed-off-by: Innei <tukon479@gmail.com>

*  feat(agent-page): bind documentId to URL and introduce HeaderSlot

- Add nested /agent/:aid/:topicId/page/:docId route with PageRedirect for bare /page
- Introduce useAutoCreateTopicDocument with module-level inflight de-dup
- Lift Portal + WorkingSidebar to (chat) layout; keep ChatHeader in left column
- Sidebar document clicks on page route navigate to /page/:docId instead of opening Portal
- Add HeaderSlot (context + createPortal) as a reusable header injection point
- Mount AutoSaveHint via HeaderSlot; register Files hotkey scope in TopicCanvas so Cmd+S triggers manual save
- Sync desktopRouter.config.tsx and desktopRouter.config.desktop.tsx
- Extend RecentlyViewed plugin to round-trip optional docId segment

* Use topic titles for auto-created page documents

* Add page-agent init gating and runtime diagnostics

* Support current-topic agent documents

* Implement Active Topic Document and Disabled Tool Call Filtering

- Introduced ActiveTopicDocumentContextInjector to inject context for active topic documents into user messages.
- Added DisabledToolCallFilter to remove historical tool calls for disabled tools in the current runtime scope.
- Updated MessagesEngine to utilize the new context injectors and filters.
- Enhanced tests to verify the correct injection of active topic document context and filtering of disabled tool calls.

This update improves the handling of document editing contexts and tool management in the conversation flow.

Signed-off-by: Innei <tukon479@gmail.com>

* feat: enhance agent document management with LiteXML operations

- Updated API names for clarity, changing 'patchDocument' to 'modifyNodes'.
- Introduced LiteXML operation schema for document modifications.
- Implemented new mutation for modifying document nodes via LiteXML.
- Enhanced document retrieval methods to support format options (XML, Markdown, Both).
- Added support for editor data snapshots and normalization of diff nodes.
- Improved document history management to handle editor data with diff nodes.
- Created tests for new features and ensured existing functionality remains intact.

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix: apply agent document xml edits directly

* Refine document cache invalidation and editor hydration

* 🐛 fix: stabilize agent topic hydration

* fix: update @lobehub/editor dependency version and clean up test mocks

Signed-off-by: Innei <tukon479@gmail.com>

* Potential fix for pull request finding 'Useless assignment to local variable'

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

* 🐛 fix(document): preserve pending diff nodes through save path

Skip normalizeEditorDataDiffNodes on every autosave so diff nodes awaiting
user review survive persistence. Normalization now runs only on explicit
Accept/Reject via DiffAllToolbar. Also flip headless litexml ops to delay:true
to match the new review flow.

* 🐛 fix(agent): detect agent sub-route from URL params not cached topic

isInAgentSubRoute used routeTopicId (with activeTopicId fallback) as its
base path. On /agent/:aid/profile with a cached activeTopicId, the base
became /agent/:aid/:cachedTopicId which pathname cannot startsWith, so
sub-route detection returned false and sidebar topic clicks only called
switchTopic without routing back to chat — users stayed stuck on profile.

Derive the sub-route base from params.topicId directly so stale store
state cannot mask the check. routeTopicId export keeps the fallback for
sidebar highlighting.

* 🐛 fix(page): repair topic page document recovery

* 🐛 fix(page-agent): block tool calls when page editor is not mounted

scope is topic-bound not route-bound, so navigating from /agent/.../Page
to /agent/... keeps scope==='page' and PageAgentIdentifier stayed in the
injected plugin list. The LLM could still call initPage / modifyNodes /
etc. against a stale editor reference, returning misleading success
(e.g. nodeCount=0).

Two layers of guard:
- PageAgentExecutor wraps `invoke` and returns a structured
  PAGE_EDITOR_NOT_MOUNTED / kind: 'replan' result when the runtime
  editor is not mounted, pointing the LLM at lobe-agent-documents.
- streamingExecutor drops PageAgentIdentifier from the tool set via
  the new `composeEnabledTools` pipeline when scope==='page' and
  the page-agent runtime is not ready.

Also extract the tool-set composition (inject merge + runtime drops)
out of the ~320-line internal_createAgentState into
`mecha/toolSetComposer`, with unit tests.

* 🐛 fix(chat): unify message stream for /agent/:topicId and /page/:docId

Before this change a page-scoped conversation (FloatingChatPanel with
scope='page' in the /Page route) partitioned the client message store by
scope, so /agent/:topicId and /agent/:topicId/page/:docId each built their
own messagesMap slot and SWR cache — but the TRPC getMessages endpoint
ignores scope and returned the same messages for both, producing duplicate
fetches and a visible message-history split between the two surfaces.

Fixes by keeping scope='page' as a capability/surfacing marker only:
- messageMapKey: collapse 'page' to the default scope early in
  toMessageMapContext, so threadId/groupId still win and only the
  main/page pair actually unifies.
- useFetchMessages: build the SWR key from identity fields
  (agentId, groupId, threadId, topicId) instead of the full
  ConversationContext, so scope no longer partitions the cache.

agentConfigResolver/streamingExecutor/composeEnabledTools still read
scope='page' from operation.context for PageAgent injection and
initialContext.pageEditor wiring — the capability layer is unchanged.

Also fix two pre-existing test regressions surfaced by re-running the
impacted suites:
- streamingExecutor page-editor initialContext test now mocks
  pageAgentRuntime.isReady() (required since the PageAgent editor-ready
  guard landed).
- FloatingChatPanel default shell props test updated to match the
  [180,320,520,800] snap points introduced in 62dc91e444.

* ♻️ refactor(FloatingChatPanel): read main slot without changing scope

Revert the global messageMapKey/SWR-key changes from b650cdc9d7 — the
global collapse over-reached and coupled message routing to scope in
ways other surfaces don't want. Instead, specialize only the place that
actually has the dual-role problem.

`scope` should be a capability marker (PageAgent tool + pageEditor
initialContext injection), not a message-list partition. Floating panel
on /agent/:topicId/page is the only caller that sets scope='page', and
its message list should mirror /agent/:topicId — the surfaces share a
topic.

Local collapse in FloatingChatPanel: compute chatKey with
`scope === 'page' ? 'main' : scope`, so messagesMap is read from the
main slot. The downstream ConversationContext keeps scope='page' for
the capability layer; only the slot lookup is specialized.

Kept from b650cdc9d7 (unrelated to the revert):
- streamingExecutor test mocks pageAgentRuntime.isReady() — required
  by the PageAgent editor-ready guard in 01ef7bc142.
- FloatingChatPanel snap-points test matches [180,320,520,800] from
  62dc91e444.

* 🐛 fix(FloatingChatPanel): simplify chat key computation for message retrieval

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(index.desktop.test): update LocationProbe to reflect route changes and improve test accuracy

Signed-off-by: Innei <tukon479@gmail.com>

* Constrain agent header title under centered switcher

* 🐛 Fix conversation header view switcher layout

* 🐛 Fix agent topic path links and cmdk context

* 🐛 fix(test): align document history fixtures and layout ui mock

* 🐛 fix(e2e): support dialog-based topic rename

* ♻️ refactor(debug): use scoped debuggers for PR logging

---------

Signed-off-by: Innei <tukon479@gmail.com>
Co-authored-by: Neko Ayaka <neko@ayaka.moe>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
2026-04-24 23:56:25 +08:00
Tsuki 70e7e441b2 🔨 chore: premerge Task detail page UI (#13653)
*  feat: add AgentTaskList component on agent welcome page (LOBE-6597)

- AgentTaskList with TaskListHeader, TaskItem, and styles
- Embedded in AgentWelcome below ToolAuthAlert
- Each task rendered as independent rounded card with status badge
- Status: green filled circle (Done), blue circle (In progress)
- Card width matches chat input (960px)
- i18n keys for taskList.title and taskList.viewAll
- Fix updateReview type to use TRPC-inferred type

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add Tasks page at /agent/:aid/tasks with route, breadcrumb, and view toggle (LOBE-6597)

- Register tasks route in both desktopRouter.config.tsx and .desktop.tsx
- Thin route page at src/routes/(main)/agent/tasks/index.tsx
- Feature components in src/features/AgentTasks/: page, breadcrumb, header with list/kanban toggle, full task list
- Wire up "View All Tasks" navigation from AgentTaskList welcome card
- Add i18n keys (taskList.activeTasks, taskList.breadcrumb.task) and generate translations via pnpm i18n

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add Task detail page at /agent/:aid/tasks/:taskId (LOBE-6597)

- Register :taskId child route in both desktopRouter configs
- TaskDetailPage with auto-save hint, breadcrumb, and scrollable content
- TaskDetailHeader: editable title (borderless Input), Run/Pause button, status/priority tags, delete
- TaskInstruction: click-to-edit Markdown with debounced auto-save
- TaskSubtasks: sub-issues list with status badges
- TaskActivities: timeline with topic/brief/comment icons
- TaskItem now navigates to detail page instead of just setting activeTaskId
- Add taskDetail.* i18n keys with generated translations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add TaskModelConfig, TaskScheduleConfig, and refine Task detail UI (LOBE-6597)

Add model/provider selector and periodic execution config to Task detail page.
Refine TaskDetailHeader, TaskInstruction with auto-save and i18n support.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: refine Task detail UI with Linear-style design (LOBE-6597)

- Redesign SubTasks with collapsible header, progress circle, hover + click navigation
- Redesign Activities with agent avatar, comment input box, and Linear-style layout
- Add TaskParentBar showing parent task relationship with sibling navigation popover
- Add delete confirmation modal using App.useApp().modal.confirm
- Move ModelSelect to separate row below action bar
- Fix zustand selector recreation in ActivityItem
- Replace hardcoded colors with cssVar tokens

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add Properties panel, parent link hover, activity icon, and lifecycle save status (LOBE-6597)

- Add TaskProperties sidebar with collapsible status/priority dropdowns
- Parent bar: clickable parent link with hover, sibling navigation popover on progress
- Activity title: add BotMessageSquare icon
- Fix lifecycle actions not updating taskSaveStatus (saving/saved indicator)
- Filter status dropdown to only user-selectable states (backlog/completed/canceled)
- Add test task creation script for dev

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add recursive tree view for subtasks with Linear-style connecting lines (LOBE-6597)

- Add buildTaskTree utility to convert flat getTaskTree API response into nested tree
- Implement SubtaskTreeItem recursive component with CSS connecting lines (├─ and └─)
- Fetch full task tree via taskService.getTaskTree for nested subtask display
- Show loading spinner during tree fetch, fallback to flat list on error
- Remove padding-inline from AgentTaskList container

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: address PR review — delete redirect, debounce cleanup, schedule resync (LOBE-6597)

- Redirect to task list after successful delete (P1)
- Clean up instruction debounce timer on unmount/task switch to prevent stale writes (P1)
- Resync TaskScheduleConfig local state when active task changes (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: use backend nested subtasks directly, remove buildTaskTree (LOBE-6597)

Backend now returns nested subtasks in task.detail (LOBE-6814).
Remove buildTaskTree utility, getTaskTree API call, and loading state.
Use TaskDetailSubtask from @lobechat/types instead of local interface.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  perf: add optimistic update and save status for model config change (LOBE-6597)

updateTaskModelConfig now immediately reflects new model/provider in UI
via optimistic store dispatch, and tracks taskSaveStatus (saving/saved).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  perf: skip redundant refreshTaskDetail on successful model config update (LOBE-6597)

Optimistic update is trusted on success — no need for full detail re-fetch.
Aligns with updateTask pattern. Refresh kept only in error path for revert.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: use backend author info for activities, fix AgentTaskList after AgentHome refactor (LOBE-6597)

- Activity: use act.author (TaskDetailActivityAuthor) from backend instead of agentMap lookup (LOBE-7013)
- AgentTaskList: fix agentId from useParams instead of useAgentStore.activeAgentId (was undefined)
- AgentHome: integrate AgentTaskList into new AgentHome layout (replaces old AgentWelcome)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: show participant avatars on task cards, use backend author for activities (LOBE-6597)

- TaskItem: display up to 3 participant avatars next to task title (LOBE-6805)
- Activity: use act.author from backend instead of agentMap lookup (LOBE-7013)
- AgentHome: integrate AgentTaskList into new AgentHome layout
- Revert AgentTaskList/TaskItem agentId back to useAgentStore (works correctly when mounted)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: fix type safety, memoize participants filter, extract avatar styles (LOBE-6597)

- Use TaskParticipant type instead of `any` in filter/map
- Compute displayParticipants once with useMemo (was filtering twice per render)
- Move avatar overlap styles to CSS classes (was inline objects per render)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🔇 chore: hide kanban view toggle until implemented (LOBE-6597)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: export TaskStatus/TaskPriority/TaskActivityType from @lobechat/types (LOBE-6597)

Replace hardcoded string/number types with shared type aliases:
- TaskStatus: 'backlog' | 'canceled' | 'completed' | 'failed' | 'paused' | 'running'
- TaskPriority: 0 | 1 | 2 | 3 | 4
- TaskActivityType: 'brief' | 'comment' | 'topic'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: update

* style: update

* style: update

* style: update

* style: update

* style: update

* style: update

* style: update

* style: update

* style: update

*  feat: add Daily Brief module to homepage (#13851)

*  feat: add Daily Brief module to homepage

Add a Daily Brief section below the chat input on the homepage that
displays unresolved briefs from the Agent Tasks system. Users can
resolve, comment, and provide feedback directly from the brief cards.

- Service: BriefService with listUnresolved, resolve, markRead, addComment
- Store: Independent Zustand store (src/store/brief/) with SWR data fetching
- Components: BriefCard, BriefCardActions (dynamic action buttons),
  BriefCardSummary (Markdown with expand/collapse), CommentInput (@lobehub/editor)
- Three action types: resolve (closes brief), comment (resolve with text),
  link (safe URL navigation with protocol validation)
- Fixed feedback button: adds task comment without resolving the brief
- Inline success state ("Feedback sent") with 1.5s auto-restore
- i18n: zh-CN + en-US translations
- Tests: 21 tests across service, store selectors, and components
- CLI: Register task and brief commands for local development

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add agent avatars to Daily Brief cards

Display stacked agent avatars next to brief card titles using the
new `agents` data from Arvin's enriched listUnresolved API (#13489).

- Add AgentAvatarInfo type and agents field to BriefItem
- Render overlapping circular avatars (20px, -6px overlap)
- Use cssVar.colorBgContainer for border (dark mode compatible)
- Extract avatar style to function to avoid inline object creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: clean up Daily Brief components

- Extract duplicate success state JSX into reusable SuccessTag component
- Remove redundant comments that describe what code does
- Use DEFAULT_AVATAR from @lobechat/const instead of hardcoded emoji

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: address PR review feedback for Daily Brief

- Use cssVar.colorBgBase instead of hardcoded #fff for primary button
  text color (dark mode contrast fix)
- Add submitting state to CommentInput to prevent duplicate submissions
  (disable buttons + show loading during async submit)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🌐 chore: generate i18n translations for Daily Brief

Run pnpm i18n to generate translations for all 18 locales.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: use shared BriefType from @lobechat/types

Export BriefType union from packages/types and use it in
BRIEF_TYPE_COLOR and BRIEF_TYPE_ICON records for compile-time
key validation. Adding a new brief type now requires updating
the shared type, and TypeScript will flag missing mappings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: update

* style: update

* style: update

---------

Co-authored-by: Tsuki <976499226@qq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: update

* style: update

* style: update

* style: update

* fix: stopPropagation

* fix: i18n

* 🐛 fix: wire comment inputs to editor instance so Send actually submits

CommentInput in AgentTasks and DailyBrief used antd TextArea inside
@lobehub/editor's ChatInput while reading content via
editor.getDocument('markdown'). The TextArea was never connected to the
editor instance, so getDocument always returned empty and handleSubmit
short-circuited silently — Send appeared to do nothing (no network
request fired).

Replace the TextArea with <Editor editor={editor} type="text"
variant="chat" /> so useEditor() actually drives the editable surface.
Keep plain-text behavior via markdownOption={false} +
enablePasteMarkdown={false}, and bind Cmd/Ctrl+Enter submit via
onPressEnter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: use participant.title after TaskParticipant schema rename (#13877)

PR #13877 renamed TaskParticipant.name → .title and added
.backgroundColor. Our branch's UI code (AgentAvatars, listViewOptions,
TaskList group header, Breadcrumb) was already written against the new
schema, but TaskProperties still read firstParticipant?.name — update
the last remaining call site so the type matches post-rebase.

backgroundColor is already plumbed through everywhere it applies within
#13877's scope; TaskActivities' TaskDetailActivityAuthor is a separate
type untouched by the PR and kept as-is.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: resolve type-check errors exposed after canary rebase

canary upgraded react-i18next to a version with typed i18n keys and
tightened @lobehub/editor's SendButton + IEditor APIs. Rebase pulled
these in, surfacing latent type errors in LOBE-6597 code.

- CommentInput: use editor.cleanDocument() (IEditor's actual API;
  clearContent never existed).
- TaskActivities / TaskLatestActivity / TaskTriggerTag: type t as
  TFunction<'chat'> so typed i18n accepts the known-literal keys used
  inside module-level helpers.
- TaskPriorityTag / TaskStatusTag / listViewOptions: add
  defaultValue: '' to dynamic-key t() calls (template literals and
  Record lookups) to match the broad-key i18n overload.
- BriefCardActions: swap unusable <SendButton> (no children, no
  iconPlacement) for <Button>; add defaultValue to the dynamic
  brief-action key lookup; drop stale @ts-ignore.
- DailyBrief/CommentInput: drop unsupported children on SendButton;
  keep label via title attribute.
- Recents/Item: type TYPE_ICON_MAP as Partial<Record<...>> so 'task'
  (rendered via TaskStatusIcon elsewhere) is a safe absent key.
- brief/slices/list/action: cast briefService.listUnresolved() result
  back to BriefItem[] (TRPC serialization widens BriefType to string).
- AgentTasks/TasksHeader: delete dead file — no importers and its
  ./style module was removed by an earlier refactor.

Also ran pnpm install to materialize the newly-extracted
@lobechat/agent-gateway-client workspace package (canary #13866),
clearing ~7 "cannot find module" errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor(builtin-tool-task): polish task tool paths (#13869)

*  feat: navigate to task detail when clicking brief card header

Clicking the header row of a Daily Brief card (icon + title + time +
agent avatars) now jumps straight to the associated task, using the
brief's task-tree agent (with activeAgent / inbox as fallback).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: show parent task ids as clickable breadcrumb trail

Walk the cached parent chain from taskDetailMap and insert each ancestor's
identifier as a link between the "任务" entry and the current task name in
the task detail breadcrumb.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add cross-agent /tasks page with View All Tasks on Daily Brief

- Register `/tasks` route in desktop (web + Electron) and mobile router configs
- `useFetchTaskList` supports `allAgents` mode via options object API to fetch
  tasks without agent filter; backend already supports optional assigneeAgentId
- `Breadcrumb` accepts optional `agentId`, renders "All tasks" crumb when absent
- `AgentTaskItem` navigation uses `task.assigneeAgentId` so clicks work from
  the cross-agent page (falls back to `activeAgentId` for unassigned tasks)
- Extract `useScenarioEnabledTools` hook to share layout effect between
  `/tasks/_layout` and `/agent/:aid/tasks/_layout`

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* ♻️ refactor: use assigneeAgentId for task avatar instead of participants array

Replace AgentAvatars (took participants[]) with AssigneeAvatar (takes agentId,
resolves meta from agent store). This correctly represents that a task is
assigned to a single agent via assigneeAgentId/detail.agentId.

- New AssigneeAvatar component reads agent meta from agent store by ID
- TaskProperties reads activeTaskAgentId from task detail store
- listViewOptions uses task.assigneeAgentId directly for groupBy/sort
- Extract shared isInboxAgentId helper to eliminate 4x inline duplication
- Group headers resolve agent title at render time via AssigneeLabel component

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: enable vertical scrolling on cross-agent tasks page

Add overflowY and flex to WideScreenContainer wrapper so the task list
can scroll when content exceeds viewport height.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add re-assign task agent with popover selector

- Add AssigneeAgentSelector component with Popover agent list
- Extract useAgentDisplayMeta hook for consistent agent name/avatar resolution
- Fix optimistic update mapping assigneeAgentId → agentId in task store
- Disable reassignment for running tasks with tooltip hint
- Integrate selector into task list and task detail property panel

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: reuse BriefCard in task detail activities & fix raw-id navigation

Render brief-type activities as full BriefCard (same as homepage) instead of
plain tree rows. Decouple BriefCardActions from useBriefStore for actions
lookup so it can be reused across pages. Fix infinite loading when navigating
to task detail via raw DB id (task_xxx) by storing detail under both the
identifier and the raw id key in taskDetailMap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add TopicCard component for task detail activities

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: allow re-running completed tasks with dedicated button

Completed tasks now show a "Re-run" button (with rotate icon) instead of
hiding the action. The backend already supported this — only the frontend
selector gate needed updating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: add create task modal with markdown editor

Add a "+" button on the tasks list page that opens a Linear-style modal
for manually creating tasks. The modal features a title input, a markdown
editor (EditorCanvas), and a bottom toolbar with priority and assignee
selectors. Existing tag components (TaskStatusTag, TaskPriorityTag,
AssigneeAgentSelector) are extended with an `onChange` controlled mode
so they can be used in creation context where no task exists yet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 🐛 fix: suppress spurious updateTask on Task Detail page load

EditorDataMode was missing the contentChangeLockRef pattern that
DocumentIdMode already uses, causing Lexical's registerUpdateListener
to treat programmatic content hydration as a user edit and fire
onContentChange → updateTask on every page visit.

- Add contentChangeLockRef + lockIdRef staleness guard
- Extract loadContentWithLock to deduplicate lock/load/unlock logic
- Pass contentChangeLockRef to InternalEditor
- Remove unreachable dead code in loadEditorContent

Closes LOBE-7362

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

*  feat: task detail comment CRUD and various UX improvements

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix: move canceled status group to the end of task list

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: polish task detail layout, title, and run button

- Title switched to auto-sizing TextArea so long names wrap (like Linear)
- Reduce title font-size from 32px to 24px and tighten paddings
- Make "运行任务" button small-sized to match the denser header
- Add 120px bottom padding for end-of-content scroll breathing room
- Default EditorCanvas paddingBottom trimmed from 64 to 32

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: refine task assignee, priority, and comment input

- Assignee block uses filled variant in dark mode for better contrast
- Urgent priority (level 1) renders in orange for quick scanning
- Comment input keeps SendButton slot reserved to prevent layout shift

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: task detail — inline subtasks, automation mode, chronological activity

- Inline subtask creation under a task via CreateTaskInlineEntry
  (parentTaskId/autoFocus/onCollapse/placeholder), refreshes parent on create
- Track agent-created tasks via createdByAgentId through service, router,
  types, and the builtin task executor
- Replace scheduler Segmented-only UI with an Enable switch + heartbeat/
  schedule mode; persist via automationMode on the task
- Sort detail activities oldest → newest for a natural timeline reading
- Reducer patches nested subtask entries on updateTaskDetail so in-place
  edits reflect in the parent's subtask tree

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: render activate-tool chips as rounded pills

Switch inspector tool chips from monospace code tags to filled rounded
pills with ellipsis overflow, making multi-tool rows scan better in tight
headers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix: keep finished tool call out of loading state while siblings run

The message-level isAssistantMessageBusy flag stays true while sibling
tool calls are still running. Without guarding on this tool's own
result, a finished tool would flip back to "loading". Now a tool that
has a real result or error is never shown as calling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: use small Segmented in schedule config popover

Keeps the automation mode switcher visually aligned with the denser
popover controls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: agent profile hover card on task activity author

- Extract shared AgentProfileCard + unified AgentProfilePopup (click / hover)
  with lazy agent fetch; move out of group sidebar path.
- Wire activity author avatar + name to a hover card; brighten title on hover;
  keep a small "agent" tag on the author row.
- Show inline skeletons (description + footer stats) while loading.
- Enrich subtask payload with assignee agent info for cleaner UI.

*  feat: open task topic chat in side drawer

Click a topic row in the task detail activities to open a right-side drawer
showing the topic's full chat history. Messages stream in live via the existing
agent gateway pipeline (gateway events land in chatStore.dbMessagesMap keyed by
the topic context), so a running topic refreshes its drawer in real time without
a dedicated subscription.

Reuses the Conversation feature (ConversationProvider + ChatList) with an
isolated context (agentId + topicId + isolatedTopic), so the drawer never
touches the global active topic and multiple panels coexist cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style: outline activate-tool chip with subtle border

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat: show topic handoff summary on activity card

Pull `handoff.summary` through the task service into TaskDetailActivity and
render it under the title in TopicCard so completed topics surface what was
accomplished without opening the drawer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🎸 chore: gate agent task feature behind agent_task flag

Hide every client-side entry point to the Agent Task feature when the
`agent_task` flag (default `isDev`, off in prod) is disabled:

- Sidebar: task tab in the agent sidebar nav
- Routes: `/agent/:aid/tasks/*` and `/tasks/*` layouts redirect to `/` when
  the flag is off (mobile router reuses the same layout)
- Home Recents: filter out `type='task'` items in both the list and the
  "all recents" drawer
- Daily Brief: skip fetch + hide the entire panel (all briefs link to tasks)

Backend TRPC / lifecycle stays on — the feature is already live for CLI
usage. Flag name mirrors `agent_onboarding` for consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix: prioritize includeTriggers in topic queries

* 🐛 fix: normalize task detail activity payloads

*  feat: add Kanban board view for task list with drag-and-drop

LOBE-7493

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* 💄 style: shorten schedule tag labels & fix time width in task cards

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* update i18n

* 💄 style: hide task tool from user selectors

* 💄 style: hide task skill from user selectors

---------

Co-authored-by: canisminor1990 <i@canisminor.cc>
Co-authored-by: YuTengjing <ytj2713151713@gmail.com>
Co-authored-by: Arvin Xu <arvinx@foxmail.com>
2026-04-23 02:10:45 +08:00
Arvin Xu 6d339d6a64 🐛 fix(agent-runtime): sanitize invalid tool_call arguments to unbreak strict providers (#14033)
* 🐛 fix(agent-runtime): sanitize invalid tool_call arguments to prevent history poisoning

When a model emits malformed JSON as tool_calls[].arguments (e.g. Qwen
producing `{, "description": ...}`), the raw string was persisted to
`messages.tools[].arguments` and replayed verbatim on every subsequent
turn. Strict providers (NVIDIA NIM) validate the full history and 400
the whole request, terminating the op and wasting all accumulated tokens.

Add a shared `sanitizeToolCallArguments` helper in @lobechat/utils and
wire it in at three layers so both new captures and already-poisoned DB
history are safe:

- Server entry (RuntimeExecutors onToolsCalling) — mirrors the frontend's
  `internal_transformToolCalls` pattern; prevents new poisoning.
- Outbound context build (ToolCallProcessor) — last line of defense for
  historical messages that were persisted before this fix.
- Agent-runtime core (call_tools_batch normalization) — covers the
  old-format ToolsCalling[] path.

Behavior: valid JSON passes through unchanged (prompt cache stable);
partial-json recovers truncated streams; unrecoverable payloads fall
back to "{}" so the tool_call structure survives and the model can
replan on the next turn.

Fixes LOBE-7761
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(agent-runtime): preserve INVALID_JSON_ARGUMENTS feedback when sanitizing

Sanitizing `tool_calls[].arguments` at capture (onToolsCalling) was too
early — the normalized "{}" reached `BuiltinToolsExecutor.execute` and
bypassed the `INVALID_JSON_ARGUMENTS` branch, so the model got a generic
"missing required field" error instead of the precise "your JSON syntax
was broken, fix it" feedback. That regressed the self-reflection signal.

Move sanitization to the persist boundaries only:
- DB write via `messageModel.update({tools: ...})`
- `state.messages` push for the assistant message's `tool_calls`

The execution path keeps the raw `arguments` string so the executor can
still emit its `INVALID_JSON_ARGUMENTS` tool-result with the original
malformed payload echoed back — exactly the frontend-symmetric self-
reflection flow.

Add a regression test pinning the LOBE-7761 Qwen shape so future changes
can't silently drop the feedback again.

Fixes LOBE-7761
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(agent-runtime): drop sanitize from runtime normalization to avoid undeclared @lobechat/utils dep

Review flagged that `runtime.ts` imported `sanitizeToolCallArguments` from
`@lobechat/utils` while `agent-runtime/package.json` doesn't list utils as
a runtime dependency — in strict/hermetic installs this resolves to
MODULE_NOT_FOUND before the runtime can start.

Rather than add a new dep just for a belt-and-suspenders path, drop the
sanitize on the old-format `call_tools_batch` normalization. The actual
LOBE-7761 bug is server-side history poisoning; that's fully covered by:

- RuntimeExecutors persist-boundary sanitize (DB write + state.messages)
- context-engine ToolCallProcessor outbound sanitize (handles any DB
  history that was persisted before this fix)

Old-format agents in agent-runtime don't persist or replay to providers
on their own — sanitization is the consuming application's
responsibility and can live closer to its persistence layer.

Drops the dep-cycle-free path.
Related LOBE-7761
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(model-runtime): log tool_call parse errors in Anthropic adapter

The assistant→Anthropic conversion was swallowing `JSON.parse` errors
silently and falling back to empty `input: {}`. Combined with the
LOBE-7761 fix, bad arguments should always be sanitized upstream in
context-engine, so hitting this catch means something bypassed the
defense and we're about to send a tool_use with empty input to Claude.
That's worth knowing about.

Match the `console.error('parse tool call arguments error:', ...)`
pattern already used in openaiCompatibleFactory so logs are greppable.

Related LOBE-7761
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:09:26 +08:00
Arvin Xu 4af6fddd7a 🐛 fix(context-engine): downgrade image_url parts when target model lacks vision (#14029)
* 🐛 fix(context-engine): downgrade image_url parts when target model lacks vision

Historical messages persisted as multimodal parts (content is an array
with `image_url` entries, or assistant messages with `metadata.isMultimodal`)
bypassed the legacy `imageList` vision check and got forwarded verbatim to
the provider. DeepSeek rejects the `image_url` variant outright, so any
topic containing an image broke the moment the user switched to a
non-vision model.

Replace image parts with a textual placeholder so the conversation still
carries the signal that an image was sent, without including content
non-vision providers reject. Applies uniformly across user array content,
assistant multimodal content, and legacy `imageList` paths.

Fixes LOBE-7214.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  test: update vision-disabled expectations after downgrade placeholder

Two tests in the app suite asserted the silent-drop behavior the
MessageContentProcessor used to exhibit for `imageList` + vision-off:

- src/services/chat/chat.test.ts
- src/services/chat/mecha/contextEngineering.test.ts

After this PR the processor appends the downgrade placeholder instead of
silently dropping the image, so the expected content grows by one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style(context-engine): place vision downgrade placeholder before SYSTEM CONTEXT

The placeholder stands in for an image the user actually sent, so it
should sit adjacent to the user text rather than trailing after the
SYSTEM CONTEXT metadata block. Reorder so the payload reads:

  <user text>

  [image omitted: not supported by this model]

  <!-- SYSTEM CONTEXT ... -->

Keeps the conversational flow intact and matches the semantic position
the image occupied in the original message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 13:07:42 +08:00
Innei a59a9c4943 feat(onboarding): structured hunk ops for updateDocument (#13989)
*  feat(onboarding): structured hunk ops for updateDocument

Extend `updateDocument` (and the underlying `@lobechat/markdown-patch`) with
explicit hunk modes so agents can unambiguously express deletes and inserts
instead of encoding them as clever search/replace pairs.

Modes: `replace` (default, backward-compatible), `delete`, `deleteLines`,
`insertAt`, `replaceLines`. Line-based modes use 1-based inclusive ranges
and are applied after content-based hunks, sorted by anchor line descending
so earlier lines stay stable. New error codes: `LINE_OUT_OF_RANGE`,
`INVALID_LINE_RANGE`, `LINE_OVERLAP`.

Onboarding document injection now prefixes each line with its 1-based number
(cat -n style) so the agent can cite line numbers when issuing line-based
hunks. Tool description, system role, and per-phase action hints updated to
teach the new shape.

* 🐛 fix(onboarding): align patchOnboardingDocument zod schema with structured hunks

The tRPC input schema still accepted only the legacy `{search, replace}` shape,
so agent calls using the new `insertAt`/`delete`/`deleteLines`/`replaceLines`
hunk modes were rejected before reaching `applyMarkdownPatch`. Switch to a
z.union matching MarkdownPatchHunk.

* 🐛 fix(markdown-patch): validate line ranges before overlap detection

Previously the overlap loop ran before per-hunk range validation, so an
invalid range (e.g. startLine=0 or endLine<startLine) combined with another
line hunk would be misreported as LINE_OVERLAP instead of the real
LINE_OUT_OF_RANGE / INVALID_LINE_RANGE. Validate each line hunk against the
baseline line count first, then run overlap detection on valid ranges only.
2026-04-20 21:17:28 +08:00
Innei 568389d43f ♻️ refactor(web-onboarding): rename doc tools and drive incremental persona writes (#13933)
* ♻️ refactor(web-onboarding): rename doc tools and drive incremental persona writes

- Rename writeDocument (full rewrite) and updateDocument (SEARCH/REPLACE patch) so tool
  names match model intuition; the old updateDocument (full) is now writeDocument and the
  old patchDocument (patch) is now updateDocument.
- Rework systemRole, toolSystemRole, and OnboardingActionHintInjector to require per-turn
  persistence: seed persona on user_identity, patch on every discovery turn where a new
  fact is learned, and stop the one-shot full-write pattern.
- Add a Pre-Finish Checklist so agents verify soul/persona reflect the session before
  calling finishOnboarding.

Eval (deepseek-chat, web-onboarding-v3):
- fe-intj-crud-v1: write=2, updateDocument=6/6 success
- extreme-minimal-response-v1: write=2, updateDocument=4/4 success
- Previously 0 patch usage; now patch dominates incremental edits.

* 🐛 fix(web-onboarding): decouple fullName persistence from role discovery

Persona seeding and saveUserQuestion(fullName) were gated on learning both
name AND role in the same turn, which regressed the prior behavior of saving
the name the moment it was provided. If the user shared only a name (or left
early before role was clarified), the agent could skip the save and end
onboarding with missing identity data.

Split the hint:
1. saveUserQuestion(fullName) fires as soon as the name is known, regardless
   of role.
2. Persona seeding fires on ANY useful fact (name alone, role alone, or both).

Thanks to codex review for catching this.
2026-04-18 20:02:39 +08:00
Innei 03d2068a5d feat(onboarding): add feature flags and footer promotion pipeline (#13853)
*  feat(onboarding): enhance agent onboarding experience and add feature flags

- Added new promotional messages for agent onboarding in both Chinese and default locales.
- Updated HighlightNotification component to support action handling and target attributes.
- Introduced feature flags for agent onboarding in the configuration schema and tests.
- Implemented logic to conditionally display onboarding options based on feature flags and user state.
- Added tests for the onboarding flow and promotional notifications in the footer.

This update aims to improve the user experience during the onboarding process and ensure proper feature management through flags.

Signed-off-by: Innei <tukon479@gmail.com>

*  feat(home): add footer promotion pipeline with feature-flag gating

Extract resolveFooterPromotionState for agent onboarding vs Product Hunt promos.
Normalize isMobile boolean, refine HighlightNotification CTA layout, extend tests.

Made-with: Cursor

*  feat(locales): add agent onboarding promotional messages in multiple languages

Added new promotional messages for agent onboarding across various locales, enhancing the user experience with localized action labels, descriptions, and titles. This update supports a more engaging onboarding process for users globally.

Signed-off-by: Innei <tukon479@gmail.com>

* 💄 chore: refresh quick wizard onboarding promo

* 🐛 fix(chat): keep long mixed assistant content outside workflow fold

*  feat(onboarding): add agent onboarding feedback panel and service

LOBE-7210

Made-with: Cursor

*  feat(markdown-patch): add shared markdown patch tool with SEARCH/REPLACE hunks

Introduce @lobechat/markdown-patch util and expose patchDocument API on the
web-onboarding and agent-documents builtin tools so agents can apply
byte-exact SEARCH/REPLACE hunks instead of resending full document content.

*  feat(onboarding): prefer patchDocument for non-empty documents

Teach the onboarding agent (systemRole) and context engine
(OnboardingActionHintInjector) to prefer patchDocument over updateDocument
when SOUL.md or User Persona already has content, keeping updateDocument
reserved for the initial seed write or full rewrites.

* 🐛 fix(conversation): add rightActions to ChatInput component

Updated the AgentOnboardingConversation component to include rightActions in the ChatInput, enhancing the functionality of the onboarding conversation interface.

Signed-off-by: Innei <tukon479@gmail.com>

* Add specialized onboarding approval UI

* 🐛 fix(serverConfig): handle fetch errors in server config actions

Updated the server configuration action to include error handling for fetch failures, ensuring that the server config is marked as initialized when an error occurs. Additionally, modified the SWR mock to simulate error scenarios in tests.

Signed-off-by: Innei <tukon479@gmail.com>

* 🐛 fix(tests): update Group component tests with new data-testid attributes

Added data-testid attributes for workflow and answer segments in the Group component tests to improve test targeting. Adjusted the isFirstBlock property for consistency and ensured the component renders correctly with the provided props.

Signed-off-by: Innei <tukon479@gmail.com>

---------

Signed-off-by: Innei <tukon479@gmail.com>
2026-04-17 21:14:27 +08:00
LiJian 15fcce97c9 ♻️ refactor: add more tools in lobe-agent-manangerment(modify、update、delete) (#13842)
* feat: add more tools in lobe-agent-manangerment

* feat: add the ensureAgentLoaded to modify it

* feat: add the update prompt tools
2026-04-15 17:57:05 +08:00
LiJian 116495bd1e 🐛 fix: slove the execAgents tools exec types not correct (#13807)
* fix: slove the execAgents tools exec types not correct

* fix: should inject source:discovery when tools type is lost

* fix: delete the source inject test
2026-04-14 17:51:08 +08:00
Arvin Xu e569c8dee0 ♻️ refactor: introduce ToolExecutor field orthogonal to ToolSource (#13760)
Add ToolExecutor ('client' | 'server') as a new orthogonal dimension
alongside ToolSource to describe where a tool invocation is dispatched.
Thread executorMap through OperationToolSet / ResolvedToolSet / AgentState
and attach executor to the ChatToolPayload emitted in onToolsCalling.

Defaults remain empty (all server-side), so behavior is unchanged. This
is pure scaffolding to unblock subsequent work on client-side dispatch.

Also remove the unused 'plugin' value from ToolSource (no downstream
consumers branched on it; installed plugins now labeled 'mcp').

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 00:28:30 +08:00
Arvin Xu 0486be4773 🐛 fix: guard non-string content in context-engine to prevent e.trim errors (#13753)
🐛 fix: guard non-string content in context-engine to prevent `e.trim is not a function`

Two unguarded `.trim()` / string-concatenation paths in the context-engine
could throw or produce garbage text when a message's `content` is not a
plain string (multimodal parts array, null tool turns). Both are reached
in normal chat and trigger `e.trim is not a function` in production.

- `resolveTopicReferences`: filter out non-string content in the fallback
  `lookupMessages` path before calling `.trim()`. Without this guard, the
  outer try/catch swallows the TypeError and drops the whole fallback.
- `MessageContent` processor: normalize `message.content` (string or
  parts array) before concatenating file context, instead of relying on
  implicit `toString()` coercion which emitted `[object Object]` into
  the LLM prompt.

Adds regression tests for both paths.
2026-04-12 19:27:52 +08:00
Rdmclin2 ac1abbaf8b 🐛 fix: bot error lobe 6925 (#13724)
* chore: remove unused variables

* fix: add  catch error

* chore: use url for anthropic image

* feat: add bot  process warnings to context

* feat: add thread context

* fix: rename thread name when already has one

* chore: update test cases

* fix: warning sanitize

* fix: threadName safe review
2026-04-11 02:11:33 +08:00
Arvin Xu b95720d210 🐛 fix: add typeof guard before .trim() calls in context engine (#13715)
Add `typeof !== 'string'` checks before `.trim()` calls in BaseSystemRoleProvider,
SystemRoleInjector, and BaseProcessor to prevent TypeError when a non-string truthy
value (e.g. object, array, number) is passed at runtime.
2026-04-10 14:21:18 +08:00
LiJian 06ac87dc45 🐛 fix: should inject current agnets information when actived the lobehub_skill (#13661)
* fix: should inject current agnets information when actived the lobehub skill

* fix: not inject the agent systemRole in lobehub skill inject

* fix: should use the isLobeHubSkillActive hook to judge

* fix: change the tools inject to vars replace function

* fix: add the lost topic id & agent title

* fix: later the PlaceholderVariablesProcessor

* fix: update the description
2026-04-09 16:11:18 +08:00
LiJian 26d1d6bbfb 🐛 fix: slove the agents_documents will coverd the systemRole (#13667)
fix: slove the agents_documents will coverd the systemRole
2026-04-08 20:54:20 +08:00
LiJian 8d8b60e4f9 🐛 fix: should filiter the current agents in avaiable agents list (#13644)
* fix: should inject the current agents & remove current agent from avaiable agents list

* fix: delete the current agents blocks
2026-04-08 11:24:53 +08:00
LiJian 33f729cd1a 🐛 fix: add the availableAgents into the prompt inject (#13621)
* fix: add the availableAgents into the prompt inject

* fix: should auto inject the avaiable agents into context when use the auto model

* fix: update the prompt

* fix: test fixed
2026-04-07 19:45:29 +08:00
Innei 8b3c871d08 ♻️ refactor(onboarding): add OnboardingContextInjector and wire context engine (#13518)
* ♻️ refactor(onboarding): add OnboardingContextInjector and wire context engine

Made-with: Cursor

* 🔧 refactor(onboarding): update tool call references to use `lobe-user-interaction________builtin`

Modified onboarding documentation and utility functions to standardize the use of the `lobe-user-interaction________builtin` tool call for structured input collection, enhancing clarity and consistency across the codebase.

Signed-off-by: Innei <tukon479@gmail.com>

* 🔧 refactor(onboarding): standardize tool call references to `lobe-user-interaction____askUserQuestion____builtin`

Updated documentation and utility functions to replace instances of the `lobe-user-interaction________builtin` tool call with `lobe-user-interaction____askUserQuestion____builtin`, ensuring consistency in structured input collection across the onboarding process.

Signed-off-by: Innei <tukon479@gmail.com>

* ♻️ refactor(onboarding): move onboarding context before first user

* ♻️ refactor(context-engine): add virtual last user provider

* update v3

* 🐛 fix(onboarding): add early exit escape hatch for boundary cases

The `<next_actions>` directive only prompted finishOnboarding in the
summary phase, but phase transition required all fields + 5 discovery
exchanges — a condition extreme cases rarely meet. This left the model
stuck in discovery, never calling finishOnboarding.

- Add EARLY EXIT hint in discovery phase next_actions
- Add universal completion-signal REMINDER across all phases
- Add minimum-viable discovery fallback in systemRole
- Add explicit completion signal list in Early Exit section
- Add off-topic redirect limit in Boundaries
- Add CRITICAL persistence rule in toolSystemRole

*  test(context-engine): fix OnboardingContextInjector tests to match BaseFirstUserContentProvider

Remove brittle MessagesEngine onboarding test that hardcoded XML content.

---------

Signed-off-by: Innei <tukon479@gmail.com>
2026-04-07 19:25:16 +08:00