Compare commits

92 Commits

Author SHA1 Message Date
renovate[bot] b3f4507d32 Update dependency code-inspector-plugin to v1.6.1 2026-06-15 02:03:04 +00:00
Arvin Xu d6ca168199 ♻️ refactor(swr): converge remaining store-layer SWR keys into swrKeys registry (#15850)
♻️ refactor(swr): converge remaining store-layer keys into swrKeys registry

Migrate all ad-hoc SWR keys still living in the store/service layer onto the
central registry (src/libs/swr/keys.ts), under the uniform `domain:resource`
naming. New domains: discover, eval (agent eval), ragEval, knowledgeBase,
device (incl. git), userMemory, agentKnowledge, agentBot, file, chatTool.

- Pure key convergence: no tiering/caching change. The new prefixes are kept
  deliberately OUT of CACHE_TIERS, so every migrated key stays memory-only
  exactly as before (agentKnowledge:/agentBot: avoid the cached `agent:` tier).
- Behavior preserved: key array shapes, mutate matchers (key[0] === *.root),
  and personal-vs-workspace match semantics are unchanged; string-join keys
  (discover assistant/social) become arrays with equivalent identity.
- UI-embedded SWR keys (features/routes/components/packages) intentionally left
  for a later pass.
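
The `domain:resource` convention and the `key[0] === *.root` matcher described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `src/libs/swr/keys.ts`: `discoverKeys`, `detail`, and `matchRoot` are hypothetical names chosen for the example.

```typescript
// Hypothetical registry sketch; not the actual contents of src/libs/swr/keys.ts.
type SWRKey = readonly [root: string, ...args: unknown[]];

const discoverKeys = {
  root: 'discover:assistant',
  // Array key: identity comes from the tuple contents, not a joined string.
  detail: (identifier: string, locale: string): SWRKey => [
    'discover:assistant',
    identifier,
    locale,
  ],
};

// Matcher in the `key[0] === *.root` form, usable as a mutate filter.
const matchRoot =
  (root: string) =>
  (key: unknown): boolean =>
    Array.isArray(key) && key[0] === root;
```

A call site would then build keys only through the factory, and invalidation would match on the first tuple element, so the argument list can change without touching the matchers.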

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 09:55:24 +08:00
Arvin Xu ae88d7535f ♻️ refactor(swr): centralize session/thread/recent/group keys into swrKeys registry (#15848)
* ♻️ refactor(swr): migrate session/thread/recent/group-list keys into swrKeys registry

Batch 1 of the SWR key centralization: add session/thread/recent keys and
group:list to the registry under the domain:resource convention, migrate call
sites + mutate matchers, update the localStorage tier patterns (recent:list,
group:list), and update tests. Removes the ALL_RECENTS_DRAWER_SWR_PREFIX export
in favor of recentKeys.allDrawer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(swr): version+unify message key, drop isLogin from keys, migrate agent/aiModel/image/video/serverConfig

- message: drop `listLegacy`; both stores use the accurate `message:list` key,
  now carrying MESSAGE_CACHE_VERSION; fix the chat store `refreshMessages` to
  invalidate the real key via a context matcher (was a dead key, never matched).
- keys: remove the redundant `isLogin` arg from all list factories (the app is
  always authenticated); drop the now-unused isLogin param from useFetchSessions.
- migrate agent config/available/search, aiModel, image+video generation, and
  serverConfig keys into the registry; update call sites, mutate matchers, tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(swr): restore isLogin arg in list keys

Re-introduce the isLogin argument across the session/agent/group/recent/brief
list key factories and their call sites (incl. useFetchSessions). The key must
vary with auth state so login/logout transitions invalidate the cached list
instead of serving another user's snapshot.
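
Why the auth flag belongs in the key can be shown with a small stand-in cache. This is a sketch under assumptions: `sessionListKey`, `listCache`, `writeList`, and `readList` are hypothetical, and a plain `Map` keyed by `JSON.stringify` stands in for SWR's own key hashing.

```typescript
// Hypothetical factory: the auth flag is one element of the key tuple.
const sessionListKey = (isLogin: boolean) => ['session:list', isLogin] as const;

// Stand-in for a keyed cache; SWR hashes array keys internally in its own way.
const listCache = new Map<string, string[]>();
const writeList = (key: readonly unknown[], value: string[]): void => {
  listCache.set(JSON.stringify(key), value);
};
const readList = (key: readonly unknown[]): string[] | undefined =>
  listCache.get(JSON.stringify(key));

// A list cached while logged in...
writeList(sessionListKey(true), ['private-session']);
// ...is not visible under the logged-out key, so that state starts from a miss.
const afterLogout = readList(sessionListKey(false));
```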

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(swr): harden tiered cache flush + scope re-hydration

- localStorageProvider: flush both tiers on visibilitychange→hidden (and
  pagehide) instead of beforeunload. IndexedDB writes are async and can't be
  awaited on teardown; flushing while the page is still alive (hidden) gives
  them time to land before unload.
- Query: reset the new scope's hydration readiness before reloadScope() (in a
  layout effect), so the boot gate keeps blocking through the async IDB re-load
  instead of rendering stale data from a previously-visited scope.
- CacheHydrationGate: render the brand logo while gating instead of returning
  null, keeping the hand-off from the static loading screen seamless.
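
The flush timing in the first bullet can be sketched as below. The structural types `Listenable` and `DocLike` and the `registerFlush` helper are assumptions made so the example runs outside a browser; `flush` stands for whatever persists both tiers.

```typescript
// Minimal structural types so the sketch does not depend on the DOM lib.
interface Listenable {
  addEventListener(type: string, listener: () => void): void;
}
interface DocLike extends Listenable {
  visibilityState: string;
}

// `flush` is a hypothetical callback that persists both cache tiers.
const registerFlush = (doc: DocLike, win: Listenable, flush: () => void): void => {
  // Flush while the page is hidden but still alive, so async writes can land.
  doc.addEventListener('visibilitychange', () => {
    if (doc.visibilityState === 'hidden') flush();
  });
  // `pagehide` covers teardown paths where no visibility change is observed.
  win.addEventListener('pagehide', flush);
};
```

In a browser the two targets would presumably be `document` and `window`; nothing is registered on `beforeunload`, which is the point of the change.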

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 03:04:29 +08:00
Arvin Xu 66370675ab test(chat): characterize parked states + post-persist title wiring (#15847)
Fills the three refactor-critical holes left in the characterization net
(LOBE-10377) — exactly the invariants LOBE-10378/10379/10382 will rewrite.

- client (streamingExecutor): waiting_for_async_tool leaves the op UNcompleted
  (no switch case) and emits an undefined complete-signal status (normalize
  falls through); waiting_for_human completes-for-UI but does NOT drain queue
  or mark unread (parked != terminal).
- gateway (gatewayEventHandler): waiting_for_async_tool park is currently
  treated as a completed + unread terminal (no pause short-circuit), and shares
  the `interrupted` reconciliation branch (preserve streamed content vs DB
  refetch, uiMessages SoT takes precedence).
- lifecycle (conversationLifecycle): post-persist summaryTopicTitle fires on the
  CLIENT path (new-topic OR empty-title gate) and is NOT invoked on the GATEWAY
  path (early return; title handled server-side).

Tests-only; characterization (locks current behavior, incl. suspected gaps with
comments). 135 tests pass across the 3 files.

Part of LOBE-10376
2026-06-15 02:33:57 +08:00
Arvin Xu 457b4638c1 🐛 fix(home): hide agent-mode notice while config is loading (#15846)
Home InputArea computed isAgentConfigLoading but never passed it to
DesktopChatInput, so AgentModeNotice flashed the "model unsupported"
warning during hydration. Forward isConfigLoading like every other
call site so the notice only appears after config loads.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 02:16:24 +08:00
Arvin Xu edf058e325 feat(swr): unified tiered cache provider (localStorage + IndexedDB) with scope isolation (#15844)
* feat(swr): unified tiered cache provider (localStorage + IndexedDB) with scope isolation

Route SWR persistence to a tier chosen centrally by key — IndexedDB for large
business entities (messages, topics, tasks, documents, agents), localStorage for
small list shells (recents) — instead of stuffing everything into one ~5MB
localStorage blob. Partition every tier by identity scope (`${userId}:${workspaceId}`)
so users/workspaces sharing an origin never collide, and add a boot hydration gate
so local-first data is present before the routed app mounts.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(swr): centralize IndexedDB-tier keys into swrKeys registry with domain:resource naming

Introduce src/libs/swr/keys.ts as the single source of truth for SWR cache keys,
named uniformly as `<domain>:<resource>` (e.g. message:list, topic:list,
task:detail). Migrate the IndexedDB-tier domains (message, topic, agent, group,
task, document/page/notebook, brief) off scattered local consts/inline literals
onto registry factories, updating call sites, mutate matchers, and tests. The
tiered cache provider now routes by `domain:` prefix instead of ad-hoc
substrings, and matchDomain() enables refreshing a whole domain at once.
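
Routing by `domain:` prefix and partitioning by identity scope might look like the sketch below. The table contents and the names `TIER_BY_PREFIX`, `pickTier`, `scopeOf`, and `storageKey` are assumptions for illustration; the real `CACHE_TIERS` may be shaped differently.

```typescript
type Tier = 'indexedDB' | 'localStorage' | 'memory';

// Hypothetical routing table keyed by `domain:` prefix.
const TIER_BY_PREFIX: ReadonlyArray<readonly [string, Tier]> = [
  ['message:', 'indexedDB'],
  ['topic:', 'indexedDB'],
  ['agent:', 'indexedDB'],
  ['recent:', 'localStorage'],
];

// Keys with no matching prefix are not persisted and stay memory-only.
const pickTier = (key: string): Tier =>
  TIER_BY_PREFIX.find(([prefix]) => key.startsWith(prefix))?.[1] ?? 'memory';

// Partition persisted entries by identity scope so users and workspaces never collide.
const scopeOf = (userId: string, workspaceId: string): string => `${userId}:${workspaceId}`;
const storageKey = (scope: string, key: string): string => `${scope}/${key}`;
```

Because the prefix includes the trailing colon, a domain such as `agentBot:` does not fall into the `agent:` tier by accident.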

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 02:02:09 +08:00
Innei c740b13021 🐛 fix(provider): correct delete confirm z-index by switching to base-ui modal (#15845)
* 🐛 fix(provider): use base-ui modal so delete confirm stacks above the config dialog

Closes #15836

* 💄 style(provider): split delete confirm into short title and description

* 🌐 chore(i18n): sync delete confirm title/description across all locales
2026-06-15 01:53:27 +08:00
Arvin Xu f9e7ca5b68 test(chat): characterization net for agent runtime run-lifecycle (#15843)
* test(chat): characterization net for agent runtime run-lifecycle

Lock the CURRENT client / gateway / heterogeneous run-completion behavior
across all terminal branches BEFORE the unified run-lifecycle refactor
(LOBE-10376), so any behavioral drift is caught by tests.

- client (streamingExecutor): afterCompletion fires on error terminal;
  complete-signal status=failed on error; queue-drain + markUnread skipped
  on error (negative); desktop-notification gating (content && !tools)
- gateway (gatewayEventHandler): error event completes op WITHOUT markUnread
  (asymmetry vs agent_runtime_end); completeOperation double-call idempotency
- hetero (heterogeneousAgentExecutor): notification + dock badge on success;
  updateTopicMetadata-rejection behavior; queue-drain gating
  (success / !aborted / !error); error & abort paths fire no notification/drain
- entry points: regenerate-hetero (imageList + parentOperationId +
  onRegenerateComplete), continue-hetero early-return, rejectAndContinue
  client dual-op, submitHeteroIntervention IPC submit + GC fallback

Tests-only; no implementation changes. 255 tests pass across the affected files.

Part of LOBE-10376
Closes LOBE-10377
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(chat): surface executor rejections in hetero completion helpers

The clean-completion `runToComplete` helper (and its sibling `runToError`)
awaited the executor with `.catch(() => {})`, swallowing any rejection. Both
paths resolve today, so this only masked future regressions: a happy/error
run that starts rejecting after some side effects would still pass — the
isDesktop=false "no notification" negative assertion is especially vulnerable
since an early rejection before the notification step trivially satisfies it.

Await the executor promise directly so a rejection fails the characterization
test instead of passing silently. 70/70 still green (both paths resolve today).
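
The difference between the two helper shapes can be reproduced in isolation. `brokenExecutor`, `runSwallowing`, and `runDirect` are hypothetical stand-ins, not the actual test helpers.

```typescript
// Hypothetical stand-in for an executor that starts rejecting after a regression.
const brokenExecutor = async (): Promise<void> => {
  throw new Error('executor rejected');
};

// Old helper shape: the rejection is swallowed, so the helper always resolves.
const runSwallowing = async (run: () => Promise<void>): Promise<void> => {
  await run().catch(() => {});
};

// New helper shape: the rejection propagates and fails the calling test.
const runDirect = async (run: () => Promise<void>): Promise<void> => {
  await run();
};
```

With the first shape, a negative assertion such as "no notification was sent" passes trivially when the executor dies early; with the second, the same failure surfaces as a rejected promise.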

Part of LOBE-10376

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 01:49:48 +08:00
Arvin Xu 542197d8ab 🐛 fix(hetero): correct cold-replica main-turn idempotency and mark topic failed on terminal errors (#15838)
* 🐛 fix(server): dedupe replayed main-turn newStep on a cold replica

The main-agent coordinator cuts a turn purely on the adapter's `newStep` signal and minted a fresh random assistant id each time, with no DB-homed idempotency key for the turn (unlike the subagent path after #15808). On a cold serverless replica the in-memory `processedKeys` dedupe is empty, so a BatchIngester retry reprocesses the `newStep` and `openTurn` forks a second assistant, orphaning the first as a usage-only bubble with no content (the remote-CC "empty shell" bubble).

Mirror #15808 onto the main chain: the adapter emits the turn's CC `message.id` on `stream_start{newStep}`; the reducer records it as `currentMainMessageId` and treats a same-id `newStep` as a replay (no-op); the server stamps it on `metadata.mainMessageId` and recovers it on a cold replica. Backward-compatible: a `newStep` without a message id opens a turn as before.

Regression: HeterogeneousPersistenceHandler.mainTurnRehydration.test.ts (cold-replica retry: 2 assistants + empty shell -> 1) plus 4 mainAgentCoordinator reducer unit tests.
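
The replay check can be sketched as a small reducer. This is a simplified illustration: the `MainState` and `NewStepEvent` shapes and the `turnsOpened` counter are assumptions, and only `currentMainMessageId` is a name taken from the commit text.

```typescript
interface MainState {
  // Recovered from `metadata.mainMessageId` on a cold replica.
  currentMainMessageId: string | undefined;
  turnsOpened: number;
}

interface NewStepEvent {
  // The turn's CC `message.id`; absent on older adapters.
  messageId: string | undefined;
}

// Hypothetical reducer: a `newStep` carrying the id already recorded is a replay.
const reduceNewStep = (state: MainState, event: NewStepEvent): MainState => {
  const isReplay =
    event.messageId !== undefined && event.messageId === state.currentMainMessageId;
  if (isReplay) return state; // no-op: do not fork a second assistant

  // A new id, or no id at all (backward-compatible path), opens a turn.
  return {
    currentMainMessageId: event.messageId,
    turnsOpened: state.turnsOpened + 1,
  };
};
```

Because the id lives in persisted state rather than in process memory, a replica that starts cold can still recognise the retry.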

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(cli): mark topic failed when remote CC relays a terminal error on a clean exit

Claude Code relays API/rate-limit errors as an in-stream terminal `error`
event but still exits 0. The CLI derived the heteroFinish result from the
process exit code alone, so such runs reported `result: 'success'` →
`reason: 'done'` and the topic/task was wrongly marked completed instead of
failed (the error was only persisted on the message).

Track whether a terminal `error` event was pushed to the ingester and force
`result: 'error'` even on a clean exit, mirroring the desktop executor where
the stream error drives both the message error and the topic status. Also
surface the terminal error message as the finish error detail (CC relays these
on stdout, so stderr is empty in this case).
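
The decision described above reduces to a small pure function. `FinishInput` and `deriveFinish` are hypothetical names; the sketch only mirrors the rule that a terminal in-stream error overrides a clean exit code.

```typescript
type FinishResult = 'success' | 'error';

interface FinishInput {
  exitCode: number;
  // Message of a terminal `error` event pushed to the ingester, if any.
  terminalError?: string;
  stderr: string;
}

const deriveFinish = ({ exitCode, stderr, terminalError }: FinishInput) => {
  // A terminal error forces failure even when the process exits 0.
  const failed = exitCode !== 0 || terminalError !== undefined;
  const result: FinishResult = failed ? 'error' : 'success';
  // Prefer the in-stream message: it arrives on stdout, so stderr may be empty.
  const errorDetail = failed ? (terminalError ?? stderr) : undefined;
  return { errorDetail, result };
};
```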

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 01:00:19 +08:00
Arvin Xu 507f251ac5 🔨 chore(agent-runtime): enable S3 tracing by default in production (#15841)
feat(agent-runtime): enable S3 tracing by default in production
2026-06-15 00:56:20 +08:00
Rdmclin2 d3cc667c97 fix: workspace prefix (#15837)
chore: remove workspace prefix
2026-06-14 22:17:19 +08:00
LiJian 346d5be27c feat(connectors): add edit/uninstall buttons for connectors in SkillDetail (#15829)
* 🐛 fix: clear credentials on URL change; gate Edit button to http connectors

P1 (AddConnectorModal): when handleEdit detects a URL change, pass
credentials: null so the server drops the old OAuth token — a stale token
from the previous server must not be sent to the new one. The server-side
update mutation now also clears tokenExpiresAt in the same round-trip
whenever credentials are set to null.

P2 (ConnectorDetail): narrow the Edit button (and the modal mount) from
isMcpConnector to isMcpConnector && connector.mcpConnectionType === 'http'.
stdio connectors have no mcpServerUrl, so the URL-edit dialog would open
with an empty field and mislead the user.
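
Both rules (P1 and P2) can be expressed as two small helpers. This is a sketch only: `ConnectorLike`, `canEditUrl`, and `buildEditPayload` are hypothetical, and the payload shape is simplified to the two fields the commit mentions.

```typescript
interface ConnectorLike {
  isMcpConnector: boolean;
  mcpConnectionType?: 'http' | 'stdio';
}

// P2: only http connectors have an mcpServerUrl to edit.
const canEditUrl = (connector: ConnectorLike): boolean =>
  connector.isMcpConnector && connector.mcpConnectionType === 'http';

// P1: `credentials: null` asks the server to drop the token issued for the old URL.
const buildEditPayload = (previousUrl: string, nextUrl: string) =>
  previousUrl === nextUrl
    ? { mcpServerUrl: nextUrl }
    : { credentials: null, mcpServerUrl: nextUrl };
```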

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(connectors): add edit/uninstall buttons for SkillDetail connectors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: re-enable OAuth in edit mode + pre-fill bearer/header credentials

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: resolve TypeScript errors in CustomConnectorModal edit mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: add clientId/clientSecret to mcp.auth type to resolve TS error

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* 🐛 fix: correct description field location in editValue

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-14 21:55:49 +08:00
Arvin Xu 43c91caf6a 📝 docs: add capability-gated feature checklist to ux skill (#15832)
* 📝 docs: add capability-gated feature checklist to ux skill

Guide designers to fulfil the reminder obligation when a selected model
or its still-loading config can't deliver a feature's required capability
(e.g. agentic tool calling): surface a soft, reactive, load-gated warning
with the remedy, rather than failing silently.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 📝 docs: broaden ux skill trigger to any UI work

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 📝 docs: simplify ux skill description

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:38:31 +08:00
Arvin Xu 602e768419 🐛 fix(page-editor): isolate page copilot context from global agent/document state (#15826)
* 🐛 fix(page-editor): isolate page copilot context from global agent/document state

Two independent bugs both rooted in the page conversation context leaning on
process-global singletons that can't express multiple tabs/documents:

- Heterogeneous agents (Claude Code / Codex) leaked into the page copilot:
  `selectedAgentId` only excluded empty and chat-group ids, so navigating from
  a heterogeneous agent tab made the page right panel run that external agent.
  Also fall back to the page agent when the active agent is heterogeneous.

- `documentId` was lost in multi-tab scenarios because the conversation context
  carried no documentId and relied on the `pageAgentRuntime` singleton, which
  represents only one open document and is cleared on tab switch — causing
  "PageAgent server runtime received a tool call without documentId". Inject the
  editor's `pageId` straight into `context.documentId` so the send-time guard
  uses a deterministic value instead of the singleton.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* 🐛 fix(page-editor): include documentId in the page conversation key

The previous fix injected `documentId` into the conversation context, but all
state isolation (messages, operations, input-loading/runtime selectors,
replaceMessages) is keyed through `messageMapKey(context)`, which dropped
`documentId` entirely for page scope. Two documents sharing the page agent thus
collapsed into one `page_<agent>_new` bucket — document B could inherit A's
copilot history or be queued behind A's running operation while tool calls now
target B.

Carry `documentId` into the page-scoped key (as subTopicId) so each open
document gets its own isolated bucket; topicless page keys avoid emitting a
literal `null` segment, and the no-document case still falls back to
`page_<agent>_new` without colliding with document-specific keys.
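
A key builder with these properties could look like the sketch below. The exact segment layout is an assumption; only the `page_<agent>_new` fallback comes from the commit text, and `PageContext` and `pageMessageKey` are hypothetical names.

```typescript
interface PageContext {
  agentId: string;
  documentId?: string;
  topicId?: string;
}

// Hypothetical layout: page_<agent>_<topic|new>[_<documentId>].
const pageMessageKey = ({ agentId, documentId, topicId }: PageContext): string => {
  // A missing topic becomes `new`, never a literal `null` segment.
  const base = `page_${agentId}_${topicId ?? 'new'}`;
  // Each open document gets its own bucket; no document falls back to the base key.
  return documentId ? `${base}_${documentId}` : base;
};
```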

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 20:29:42 +08:00
Arvin Xu 87966afec8 🐛 fix(chat): warn when agent mode is on but the model lacks tool calling (#15828)
feat(chat): warn when agent mode is on but the model lacks tool calling

Show a warning above the desktop chat input when Agent mode is enabled
but the selected model does not support function/tool calling, suggesting
switching to a model with agent capability.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 20:17:58 +08:00
Arvin Xu 8b59a71f29 feat(topic): add queryTopics query with server-side status filter (#15822)
* feat(topic): add queryTopics query with server-side status filter

Adds `topicModel.queryTopics({ statuses?, pageSize? })`, a lambda `queryTopics`
TRPC procedure, and `topicService.queryTopics` — filtering topics by status
server-side (e.g. to list actively-running topics across all agents without
pulling the full topic set to the client).

Removes the now-unused `getAllTopics` procedures (lambda + mobile),
`topicModel.queryAll`, and the `getAllTopics` service method.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(topic): ownership isolation tests for queryTopics; authed mobile getTopics

- queryTopics: assert it only returns the model user's topics (a status filter
  must not leak another user's data) and that personal vs workspace scopes stay
  isolated.
- mobile getTopics: switch from publicProcedure to the authed topicProcedure
  (drops the manual userId guard + ad-hoc TopicModel construction).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 19:43:32 +08:00
Arvin Xu 097987a262 📝 docs(skills): add ux design-values & execution-checklist skill (#15823)
Define LobeHub's four product design values — 自然 Natural / 意义感 Meaningful /
确定性 Certainty / 生长性 Growth (adapted from Ant Design's values) — in a
dedicated reference file (references/design-values.md), and keep the skill index
focused on per-aspect execution checklists, each tagged with the value it serves:

- Flow & momentum: push the user forward; success state = primary "go to result".
- States: empty / loading / error all designed; empty is a purpose-built page.
- Buttons & focus: exactly one primary button per surface.
- Lists at scale: design for 1 → 10k rows (virtual scroll / pagination / batch).
- Option visibility: pickers list all valid targets (e.g. the virtual inbox).
- Loading visuals: no antd Spin; use NeuralNetworkLoading / project loaders.
- Discoverability & growth: progressive disclosure; surface next capability in context.
- Entity lifecycle completeness: no display-only features — design full CRUD +
  lifecycle, with the operation set scoped to the entity's source (official =
  read-only, community = install/uninstall, custom = full CRUD).

Also: react skill points to ux for loading components, and AGENTS.md references
the ux skill for designing/reviewing user-facing flows.
2026-06-14 19:37:17 +08:00
Arvin Xu 455c25ed1b feat(topic): add bulk move topics to another assistant UI (#15809)
* feat(topic): add bulk move topics to another assistant UI

Surface the batch-move feature in the per-agent Topics manager:

- `MoveToAgentButton`: a bulk action that opens an assistant picker
  (excludes the source agent) and moves the selected topics over.
- Wire it into `BulkActionBar` next to favorite/archive/delete.
- `batchMoveTopicsToAgent` store action: calls `topicService.batchMoveTopics`,
  optimistically drops moved topics from the current list, refreshes, and
  switches away if the active topic was moved.
- i18n keys (en-US source + zh-CN) for the move action, picker, and toast.

Depends on the server `topic.batchMoveTopics` mutation (already on canary).
Part of LOBE-10330

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(topic): add per-topic move menu + confirm/progress move modal

Address review feedback on the move-topics UI:

- Add a "move to another assistant" item to the per-topic dropdown menu in
  the left sidebar topic list (single-topic move).
- Introduce a shared MoveTopicsModal (base-ui) with a pick → confirm →
  moving → done state machine: a confirmation step before the move, an
  in-progress "Moving…" view that locks dismissal, and a "moved" completion
  view. Both the bulk action and the per-topic menu open this modal.
- BulkActionBar's move button now opens the modal instead of a popover +
  toast, so multi-select moves get the confirm + progress + done flow.
- i18n: add management.moveModal.* + actions.moveToAgent (en-US + zh-CN);
  drop the now-unused management.bulk.moveSuccess toast keys.
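
The pick → confirm → moving → done flow can be modelled as a transition table. The event names and the `TRANSITIONS`, `nextStep`, and `canDismiss` helpers are illustrative assumptions, not the modal's actual implementation.

```typescript
type MoveStep = 'pick' | 'confirm' | 'moving' | 'done';
type MoveEvent = 'targetPicked' | 'confirmed' | 'moveFinished';

// Hypothetical transition table; event names are illustrative.
const TRANSITIONS: Record<MoveStep, Partial<Record<MoveEvent, MoveStep>>> = {
  pick: { targetPicked: 'confirm' },
  confirm: { confirmed: 'moving' },
  moving: { moveFinished: 'done' },
  done: {},
};

// Events that are not valid for the current step leave the step unchanged.
const nextStep = (step: MoveStep, event: MoveEvent): MoveStep =>
  TRANSITIONS[step][event] ?? step;

// Dismissal is locked only while the move is in flight.
const canDismiss = (step: MoveStep): boolean => step !== 'moving';
```

Keeping the steps in one table makes it harder for the bulk action and the per-topic menu to drift apart, since both would drive the same machine.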

Part of LOBE-10330

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(topic): allow moving topics to the inbox (LobeAI) assistant

The move picker sourced agents from the sidebar list, which excludes the
virtual inbox agent — so the default "LobeAI" assistant could never be
chosen as a move target (picker showed "no other assistants"). Prepend the
inbox agent to the target list (unless it is the source), mirroring
AssigneeAgentSelector. The DB-layer ownership check already accepts the
inbox agent, so moving into it is valid.

Part of LOBE-10330

* 💄 style(topic): use NeuralNetworkLoading for the move-in-progress state

Replace the antd Spin in the move modal's "moving" step with the project's
NeuralNetworkLoading, matching the product loading visual. Also document the
rule in the react skill: antd Spin is forbidden — use NeuralNetworkLoading
(or the other src/components loaders) instead.

Part of LOBE-10330

* 💄 style(topic): add "go to target assistant" action on move success

On the move modal's done step, make "Done" a secondary (weak) button and add
a primary "Go to <target>" button that navigates to the assistant the topics
were moved into, so the user can jump straight to the relocated topics.

Part of LOBE-10330

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 18:41:52 +08:00
Arvin Xu 6c8bcf0c8a 🐛 fix(conversation): stop tool workflow collapse showing 'in progress' once content renders below it (#15815)
🐛 fix(conversation): stop tool workflow collapse showing "working" once content renders below it

When an assistant group is still generating, a workflow segment can have a real
answer segment rendered below it — most notably an errored tool block, which
splits into a folded workflow (the tools) plus a trailing answer segment (the
error text). The group-level `workflowChromeComplete` only accounts for the
promoted-final-answer path (`postToolTailPromoted`), so in these cases the
collapse kept rendering its streaming "working" header even though the model had
already moved past it and content was visible below.

Derive completeness from segment ordering: a workflow segment that has any
rendered content after it is no longer the active step. Add
`hasRenderedContentAfter` and OR it into the per-segment `workflowChromeComplete`.

Guard the shortcut with `hasPendingIntervention`: `areWorkflowToolsComplete`
ignores pending-intervention tools and the "awaiting confirmation" UI only shows
while streaming, so a segment still awaiting user confirmation must keep its
streaming chrome even with content below it.
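
The ordering rule and its guard can be sketched as follows. The `Segment` shape is a simplification and the function signatures are assumptions; only the names `hasRenderedContentAfter`, `workflowChromeComplete`, and `hasPendingIntervention` come from the commit text.

```typescript
interface Segment {
  hasPendingIntervention: boolean;
  hasRenderedContent: boolean;
}

// True when any later segment has something visible on screen.
const hasRenderedContentAfter = (segments: Segment[], index: number): boolean =>
  segments.slice(index + 1).some((segment) => segment.hasRenderedContent);

// Per-segment completeness: the group-level flag OR the ordering shortcut,
// where the shortcut is disabled while the segment awaits user confirmation.
const workflowChromeComplete = (
  segments: Segment[],
  index: number,
  groupLevelComplete: boolean,
): boolean => {
  const pending = segments[index]?.hasPendingIntervention === true;
  return groupLevelComplete || (!pending && hasRenderedContentAfter(segments, index));
};
```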

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 18:31:16 +08:00
Arvin Xu 32cf754ae3 🐛 fix(chat): derive operation token usage from messages, not a parallel accumulation (#15819)
The operation status tray maintained its OWN running token total by summing
every `turn_metadata` event's usage (`addUsageToOperationMetrics`), separate
from the per-message usage written via `recordUsage`. The two diverged badly:
in an agentic Claude Code loop the tray showed ~8M while the per-message bubbles
summed to ~2.2M.

Root cause is two computations for one number:
- `recordUsage` OVERWRITES each assistant message's usage (last turn wins when
  multiple turns map to one message).
- the tray ADDED every turn's usage — and each turn's `totalTokens` includes
  `cache_read_input_tokens`, so a re-read context got counted once per turn.

Make the per-message usage the single source of truth: `OpStatusTray` always
derives the total via `calculateOperationUsageMetrics(messages)` (previously
only a fallback), and the parallel `addUsageToOperationMetrics` accumulation is
removed from both the heterogeneous-agent executor and the gateway handler. The
tray now equals the sum of the bubbles and refreshes as messages do.
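
The divergence is easy to reproduce with toy numbers. `TurnUsage`, `usageByMessage`, `trayTotal`, and `accumulatedTotal` are hypothetical names; the sketch contrasts the overwrite-then-sum derivation with the removed running total.

```typescript
interface TurnUsage {
  messageId: string;
  totalTokens: number;
}

// Per-message usage: a later turn overwrites an earlier one for the same message.
const usageByMessage = (turns: TurnUsage[]): Map<string, number> => {
  const usage = new Map<string, number>();
  for (const turn of turns) usage.set(turn.messageId, turn.totalTokens);
  return usage;
};

// Single source of truth: the tray total is the sum of the per-message values.
const trayTotal = (turns: TurnUsage[]): number =>
  Array.from(usageByMessage(turns).values()).reduce((sum, tokens) => sum + tokens, 0);

// The removed approach: add every turn, so re-read context is counted once per turn.
const accumulatedTotal = (turns: TurnUsage[]): number =>
  turns.reduce((sum, turn) => sum + turn.totalTokens, 0);
```

When two turns map to one message, the accumulated figure exceeds the bubble sum by the overwritten turn's tokens, which is the kind of gap the commit describes.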

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 18:22:09 +08:00
Arvin Xu 8ee5f1c806 🐛 fix(chat): drop subagent-tagged events from the main gateway stream handler (#15814)
On a live gateway / remote-CC stream, a subagent (Claude Code `Agent`/`Task`)
inner-tool event is tagged with `data.subagent` and belongs to an isolation
Thread, not the main bubble. The gateway path fed raw events straight into
`createGatewayEventHandler` (main-agent-only), so a subagent `tools_calling`
chunk appended the inner tool onto the MAIN assistant's `tools[]` — the tools
"leaked" into the parent bubble DURING streaming, then snapped back when the
terminal `fetchAndReplaceMessages` pulled correct DB state (where they live
under the Thread). Classic "leaks out during streaming, looks fine once it ends".

The local desktop executor already drops `data.subagent` events before
forwarding (`heterogeneousAgentExecutor`); the gateway path didn't. Drop them at
the top of the handler — one place that covers every gateway caller, and a
no-op for the local executor (which already pre-drops). DB persistence is
unaffected: the server writes subagent rows under the Thread regardless, so they
still appear — correctly under their Thread — after the terminal fetch.

Regression: a subagent-tagged `tools_calling` chunk no longer dispatches onto
the main assistant (verified red without the drop); a non-subagent chunk still
dispatches.
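
Dropping tagged events at the top of the handler can be sketched like this. `StreamEvent` and `createMainHandler` are simplified, hypothetical stand-ins for the real event type and `createGatewayEventHandler`.

```typescript
interface StreamEvent {
  data: { subagent?: unknown };
  type: string;
}

// Hypothetical wrapper: `dispatch` stands for the main-agent-only handling.
const createMainHandler =
  (dispatch: (event: StreamEvent) => void) =>
  (event: StreamEvent): void => {
    // Subagent-tagged events belong to an isolation Thread, not the main bubble.
    if (event.data.subagent) return;
    dispatch(event);
  };
```

Filtering in the handler itself, rather than at each call site, means every gateway caller gets the guard, and a caller that already pre-drops such events simply never hits it.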

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 17:02:50 +08:00
Innei 536290973b feat(desktop): tray double-click opens main window (#15816)
Single click on the tray still starts the Quick Composer capture
session, but is now debounced 250ms so a follow-up double-click can
pre-empt it. Double-click surfaces the main window via
browserManager.showMainWindow(). macOS / Windows only; Linux trays
under AppIndicator do not emit click events and remain unaffected.
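
The debounce-and-pre-empt behaviour can be sketched independently of the tray API. `createTrayClickHandlers` and its callbacks are hypothetical; wiring them to the actual tray events is left out.

```typescript
type Timer = ReturnType<typeof setTimeout>;

const createTrayClickHandlers = (
  onSingle: () => void,
  onDouble: () => void,
  delayMs = 250,
) => {
  let pending: Timer | undefined;

  return {
    // Defer the single-click action so a following double-click can cancel it.
    click: (): void => {
      if (pending !== undefined) clearTimeout(pending);
      pending = setTimeout(() => {
        pending = undefined;
        onSingle();
      }, delayMs);
    },
    // Cancel any deferred single-click, then run the double-click action.
    doubleClick: (): void => {
      if (pending !== undefined) {
        clearTimeout(pending);
        pending = undefined;
      }
      onDouble();
    },
  };
};
```

The cost of this design is that a plain single click now fires after the delay rather than immediately.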
2026-06-14 16:44:28 +08:00
Innei 46b379f446 💄 style(chat): tighten revert confirm and toast copy (#15813)
* 💄 style(chat): tighten revert confirm and toast copy

Trim the file-revert Popconfirm description from a two-sentence warning
to a single line ("This can't be undone."), and switch the success toast
from full {{filePath}} to just {{fileName}} so it doesn't span the screen
for deep paths. Updated across all 18 locales.

* ♻️ refactor(chat): migrate file revert from Popconfirm to base-ui confirmModal

Per @lobehub/ui/base-ui-first convention. Drops the local confirmOpen/reverting
state and the data-force-visible CSS pin (no longer anchored to the trigger),
and lets confirmModal handle the OK button's in-flight loading.
2026-06-14 16:29:13 +08:00
Arvin Xu 97708c3fbb 🐛 fix(conversation): render mixed assistant blocks in natural order (#15810)
* 🐛 fix(conversation): render mixed assistant blocks in natural order

Drop the `shouldPromoteMixedBlockContent` heuristic that relocated a
tool-bearing block's prose below its tool when the text scored as
"final-answer-like". Within one assistant message the model's text always
precedes its tool_use (tool_use ends the turn; post-tool prose lands in a
separate, tool-less block), so a mixed block's content is always a preamble
and must stay above its tool. This fixes Claude Code turns (e.g.
askUserQuestion) that rendered the tool card above its own explanatory text.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(conversation): keep mixed multi-tool preamble outside the workflow fold

A mixed block's prose is a preamble, so in a multi-tool turn lift the full
text into a visible answer segment above the workflow and leave only the
tool(s) in the fold. Previously `leadingSentenceSplit` kept only the first
sentence visible and pushed the remaining prose into the WorkflowCollapse
body, which defaults to collapsed once complete — hiding most of the
explanation until the user expanded it.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 16:26:11 +08:00
YuTengjing 5872468c17 🔨 chore: update testing skill rules (#15807) 2026-06-14 15:33:23 +08:00
Arvin Xu bc9a7cfab8 feat(gateway): move gateway mode to chat config (#15714)
* feat(gateway): move gateway mode to chat config

* feat(gateway): add agent gateway env flag
2026-06-14 15:18:40 +08:00
lobehubbot d62843b90b Merge remote-tracking branch 'origin/main' into canary 2026-06-14 07:05:23 +00:00
Arvin Xu 21f78d3a6d 🚀 release: 20260614 (#15806)
# 🚀 LobeHub Release (20260614)

**Release Date:** June 14, 2026\
**Since v2.2.3:** 99 commits · 99 merged PRs · 11 contributors

> This cycle deepens cross-device collaboration — browser pairing, a
shared desktop/CLI device gateway, and edit locks that keep multiple
agents and people aligned on the same Context.

---

## Highlights

- **Browser device pairing** — Pair a browser as a device and route
agent tools to it, with rename/delete actions on the branch switcher.
(#15678, #15774)
- **Shared device gateway** — Desktop and CLI now share one
remote-device gateway RPC, so device-bound runs behave the same
everywhere. (#15780)
- **Operation status tray** — A live op-status tray sits above the chat
input, tracking operation usage and staying compact on narrow screens.
(#14737, #15736, #15735)
- **Inline file previews** — HTML files render inline and remote
read-only local files preview directly in the portal. (#15671, #15673)
- **New providers** — Added AntGroup (蚂蚁百灵), Longcat with live
model-list fetch, and new SenseNova models. (#13713, #15134, #15306)
- **Desktop tab management** — Drag-to-reorder desktop tabs, plus
restored cloud desktop builds. (#15787, #15666)

---

## 🏗️ Core Agent & Runtime

- **Heterogeneous chaining** — Stabilized main-message chaining and
unified the client hetero executor on a shared `mainAgentReducer`.
(#15783, #15762)
- **Sub-agent resilience** — Block recursive server sub-agents, keep
async sub-agent streams alive, and rehydrate sub-agent runs from DB on
cold replicas. (#15731, #15646, #15788)
- **Reasoning persistence** — Always persist assistant reasoning to the
DB so it survives reloads. (#15687, #15690)
- **Device routing** — Resolve device routing and device-tool injection
through a single execution plan. (#15669, #15683)
- **Image attachments** — Persist and deliver image attachments for
device/sandbox hetero runs. (#15685)
- **Virtual sub-agents** — Split the virtual sub-agent entry and
clarified its naming. (#15733, #15737)

---

## 🖥️ Chat & User Experience

- **Topic management** — Topic sidebar status indicators, selector topic
actions, and a `batchMoveTopics` mutation for bulk moves. (#15739,
#15744, #15793)
- **Local file portals** — Scope local file tabs by working directory
and auto-close empty local previews. (#15732, #15760)
- **Editing** — Coalesce document autosave history into 10-minute
windows and fold connector OAuth into the custom MCP form. (#15716,
#15661)
- **Skills** — Delete/remove actions on settings skill items. (#15708)
- **Polish** — Preserve message order after tool results and stop
ContentLoading from leaking raw operation i18n keys. (#15657, #15752)

---

## 🤖 Models & Providers

- **Model bank metadata** — `knowledgeCutoff` batch 2 with a metadata
skill and an always-visible tab bar, plus backfilled family/generation
data. (#15663, #15642, #15640)
- **Provider quality** — Improved DeepSeek structured output, Kimi code
thinking mode, and a model guard kept in provider grouping. (#15680,
#15725, #15681)
- **Discoverability** — Surface model-list fetch failures instead of
failing silently. (#15753)

---

## 🔒 Reliability & Security

- **Error classification** — Classify "Agent state not found" as
`StateStoreReadError`, classify untyped `Error` throws via message
patterns, and surface missing tool calls as errors. (#15778, #15767,
#15691)
- **Codex** — Parse retry time in the stated timezone and detect the
bundled Codex CLI from Codex.app on macOS. (#15758, #15759)
- **Mobile** — Stop the `pushToken.unregister` 401 storm while
preserving authenticated legacy cleanup, and gate inbox unread count by
login state. (#15719, #15723, #15724)
- **Performance** — Derive topic activity from messages and drop sitemap
generation to cut static export time. (#15726, #15702)
- **Security:** Bumped `@opentelemetry/auto-instrumentations-node`,
`@opentelemetry/sdk-node`, and `vitest`. (#14686, #14687, #15698)

---

## 🔧 Tooling & Docs

- **Agent testing** — Merged local-testing and cli-backend-testing into
a single `agent-testing` skill, with local dev env bootstrap and
post-run iteration. (#15699, #15757, #15700, #15750)
- **Docs** — Replaced Claude-specific references with generic agent
wording across skills. (#15785)

---

## 👥 Contributors

Huge thanks to **11 contributors** who shipped **99 merged PRs** this
cycle.

@hezhijie0327 · @cokeSEE1 · @R3pl4c3r · @arvinxx · @tjx666 · @Innei ·
@Rdmclin2 · @LiJian · @sudongyuer · @Neko · @cy948

Plus @lobehubbot and renovate[bot] for maintenance.

---

**Full Changelog**:
https://github.com/lobehub/lobehub/compare/v2.2.3...release/weekly-20260614
2026-06-14 15:03:54 +08:00
Arvin Xu 9f1ab92242 🐛 fix(chat): normalize reconnect startTime to epoch ms (#15811)
* 🐛 fix(chat): normalize reconnect startTime to epoch ms

After a DB rehydrate (quit + relaunch), an assistant message's `createdAt`
can arrive as an ISO string / Date rather than epoch ms (the message service
casts rows `as unknown` without converting). The gateway reconnect path
anchored a running operation's `startTime` to that value verbatim, so the
running-elapsed-time label computed `Date.now() - startTime` as NaN and
rendered "NaN:NaN" in the topic list.

Normalize `createdAt` to epoch ms and only set `startTime` when the result is
finite; otherwise fall back to `startOperation`'s default `Date.now()`.
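
A minimal sketch of the normalization described above. The helper names `toEpochMs` and `buildStartOptions` are hypothetical and are not taken from the codebase; only the behavior (convert to epoch ms, omit `startTime` when the result is not finite) follows the commit text.

```typescript
// Hypothetical helpers; names and signatures are assumptions for illustration.
type TimestampLike = number | string | Date | null | undefined;

// Convert an ISO string, Date, or number into epoch milliseconds.
// Returns undefined when the value cannot be turned into a finite number.
function toEpochMs(value: TimestampLike): number | undefined {
  if (value === null || value === undefined) return undefined;
  const ms =
    value instanceof Date
      ? value.getTime()
      : typeof value === 'number'
        ? value
        : Date.parse(value);
  return Number.isFinite(ms) ? ms : undefined;
}

// Only include startTime when normalization succeeded, so the consumer
// can apply its own default (for example Date.now()).
function buildStartOptions(createdAt: TimestampLike): { startTime?: number } {
  const startTime = toEpochMs(createdAt);
  return startTime === undefined ? {} : { startTime };
}
```

With this shape, `Date.now() - startTime` can never see a string or an invalid date, which is the condition that produced the "NaN:NaN" label.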

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(chat): assert reconnect omits startTime via matcher

Avoid indexing mock.calls (TS2532/TS2493 on the untyped spy tuple); use
toHaveBeenCalledWith + expect.not.objectContaining instead.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 15:01:58 +08:00
Arvin Xu 73b58d5bba feat(chat): show token usage cache rate (#15812) 2026-06-14 14:44:24 +08:00
renovate[bot] 729393ca1b Update dependency @vitest/coverage-v8 to v3.2.6 (#15802)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2026-06-14 14:33:01 +08:00
Arvin Xu 3335072bdb 🐛 fix(topic): scope per-agent topic search by agentId (#15798)
* 🐛 fix(topic): scope per-agent topic search by agentId

The per-agent Topics search resolved agentId→sessionId and filtered only
by the container (sessionId/groupId). Topics created by the new agent
system carry `agentId` directly with a null sessionId, so they were never
matched — the search showed "No topics match these filters" even though
the topics list (filtered by agentId) and global search displayed them.

`queryByKeyword` now accepts an agentId-aware scope mirroring `query`'s
precedence (groupId > agentId > containerId), matching `topics.agentId`
directly while still matching the resolved sessionId for legacy
un-migrated rows. The lambda searchTopics router passes the agentId
through.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(topic): align keyword search scope with the topics list

Address review on #15798:
- Drop the resolved-sessionId fallback in the agent branch. The topics list
  (`query`) scopes by agentId only, so the fallback (a) surfaced un-migrated
  rows the list hides and (b) leaked topics owned by another agent that shares
  the same session mapping. `matchKeywordScope` now mirrors `query` exactly:
  groupId > agentId > containerId (the last only for legacy/mobile string args).
- Topic inbox no longer exists, so no isInbox handling is threaded through.
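
The precedence rule (groupId > agentId > containerId) could be sketched roughly as below. The `KeywordScope` shape, the function name, and the mapping of `containerId` to a `sessionId` column are assumptions made for illustration; the real `matchKeywordScope` builds a database condition and may differ.

```typescript
// Hypothetical scope shape; field names follow the commit text.
interface KeywordScope {
  agentId?: string | null;
  containerId?: string | null;
  groupId?: string | null;
}

type ResolvedScope =
  | { column: 'agentId' | 'groupId' | 'sessionId'; value: string }
  | undefined;

// Pick exactly one filter column, in the stated order of precedence.
function resolveKeywordScope(scope: KeywordScope): ResolvedScope {
  if (scope.groupId) return { column: 'groupId', value: scope.groupId };
  if (scope.agentId) return { column: 'agentId', value: scope.agentId };
  // Assumption: a bare container id refers to a legacy session.
  if (scope.containerId) return { column: 'sessionId', value: scope.containerId };
  return undefined;
}
```

Because only one branch ever applies, an agent-scoped search cannot also match rows through a shared session mapping, which is the leak the review pointed out.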

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 14:32:33 +08:00
Arvin Xu 01278efdde 🐛 fix(server): persist subagent turn id so cold replicas don't fragment a turn (#15808)
On a cold serverless replica the subagent run is rebuilt from DB, but the run's
turn identity — CC's per-turn `message.id` (`currentSubagentMessageId`) — was the
one field with no DB home, so rehydration hard-set it to ''. The subagent reducer
detects in-thread turn boundaries by comparing that id, so the first event of
every cold batch satisfied `'' !== realId` → a SPURIOUS turn boundary. One CC
subagent turn then fragmented across multiple in-thread assistant rows (text on
one, tools on another), spawned empty-shell assistants (only usage, no
content/tools), and mis-anchored siblings under the same old tool.

Give the turn id a DB home: stamp it on the in-thread assistant's
`metadata.subagentMessageId` at creation (`CreateMessageIntent.subagentMessageId`
→ server interpreter), and recover it in `buildSubagentSnapshot` →
`SubagentRunSnapshot.currentSubagentMessageId` → `rehydrateSubagentRunsState`. A
continuation is then recognized as the SAME turn — no spurious boundary, no
fragmentation, no empty shells. `MessageModel.update` deep-merges metadata, so
later usage/content writes don't clobber the stored id.

Follow-up to #15788 (subagent thread rehydration): that fixed the thread-
duplication half of cold-replica recovery; this fixes the turn-boundary half.

Regression: a CC turn continued on a fresh replica now yields exactly one
in-thread assistant (verified red without the recovery).
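
To illustrate why an empty rehydrated id produces a spurious boundary, here is a reduced sketch. `isTurnBoundary` and `rehydrateRun` are hypothetical stand-ins for the reducer and rehydration logic; only the field names `currentSubagentMessageId` and `metadata.subagentMessageId` come from the description above.

```typescript
interface SubagentRunState {
  currentSubagentMessageId: string;
}

// The reducer treats any change of the per-turn message id as a new turn.
function isTurnBoundary(state: SubagentRunState, incomingMessageId: string): boolean {
  return state.currentSubagentMessageId !== incomingMessageId;
}

// Rebuild run state from the stored assistant metadata. Without a stored id
// the field falls back to '', which never equals a real message id.
function rehydrateRun(metadata: { subagentMessageId?: string } | null): SubagentRunState {
  return { currentSubagentMessageId: metadata?.subagentMessageId ?? '' };
}
```

A run rehydrated from metadata that carries the id compares equal to the continuing event, so the same turn keeps writing to the same assistant row.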

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 14:21:40 +08:00
Arvin Xu 60fa4e31cd 🐛 fix(chat): inject device-bound project skills into the slash menu (#15797)
* 🐛 fix(chat): inject device-bound project skills into the slash menu

The `/` slash menu loaded project skills via `localFileService.listProjectSkills`
(local Electron IPC) and gated on `isDesktop` alone, so a device-bound (remote)
run scanned the controlling machine instead of the device — and the device's
`.claude/skills` / `.agents/skills` never appeared.

Route through the device-aware `projectSkillService` with the resolved
`remoteDeviceId` and gate on `(isDesktop || !!remoteDeviceId)`, mirroring the
WorkingSidebar's `SkillsGroup`. The SWR key shape matches `useProjectSkills` so
the two share one fetch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(chat): extract shared useFetchProjectSkills hook

Both the `/` slash menu and the SkillsList UI hook duplicated the same
project-skills SWR call (key, fetcher, options). Pull it into a single
`useFetchProjectSkills(workingDirectory, deviceId)` hook so the transport choice
and SWR key live in one place and the two callers dedupe one fetch.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(chat): revalidate remote project skills on focus

Remote skills live on a device this client can't watch for filesystem changes,
so refetch them on window focus to pick up edits made on the device. The local
IPC path keeps revalidateOnFocus off — the desktop already sees its own
filesystem.
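
The gating and focus-revalidation rules from the commits above can be summarized in a small pure function. `planProjectSkillsFetch` and its return shape are hypothetical; the real hook passes equivalent values to SWR, whose `revalidateOnFocus` option is the only library detail assumed here.

```typescript
interface ProjectSkillsFetchPlan {
  // Whether the fetch should run at all.
  enabled: boolean;
  // Whether to refetch when the window regains focus.
  revalidateOnFocus: boolean;
}

function planProjectSkillsFetch(
  isDesktop: boolean,
  remoteDeviceId?: string,
): ProjectSkillsFetchPlan {
  const isRemote = !!remoteDeviceId;
  return {
    // Fetch on desktop (local IPC) or whenever a remote device is bound.
    enabled: isDesktop || isRemote,
    // Only remote skills need a focus refetch; the desktop sees its own files.
    revalidateOnFocus: isRemote,
  };
}
```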

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(chat): resolve effective execution target before picking device id

The slash menu read the raw stored `executionTarget`, so a hetero agent saved as
desktop "This device" (`local` + boundDeviceId) opened on web — where
`resolveExecutionTarget` coerces it to `device` — kept `remoteDeviceId`
undefined and left the menu without project skills, even though the
WorkingSidebar (which resolves the effective target) lists them for the same
agent. Resolve the effective target the same way and treat it as remote only
when it lands on `device` with a bound device.
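
A rough sketch of the "resolve first, then pick the device id" order. `resolveEffectiveTarget` below is a simplified stand-in for `resolveExecutionTarget`, assuming only the one coercion the commit mentions (a stored `local` target with a bound device becomes `device` when not running on desktop); the real function likely handles more cases.

```typescript
type ExecutionTarget = 'device' | 'local';

interface AgentExecutionConfig {
  boundDeviceId?: string;
  executionTarget: ExecutionTarget;
}

// Simplified assumption: off desktop, "This device" is reached remotely.
function resolveEffectiveTarget(
  config: AgentExecutionConfig,
  isDesktop: boolean,
): ExecutionTarget {
  if (!isDesktop && config.executionTarget === 'local' && config.boundDeviceId) {
    return 'device';
  }
  return config.executionTarget;
}

// Treat the run as remote only when the effective target is a bound device.
function pickRemoteDeviceId(
  config: AgentExecutionConfig,
  isDesktop: boolean,
): string | undefined {
  return resolveEffectiveTarget(config, isDesktop) === 'device'
    ? config.boundDeviceId
    : undefined;
}
```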

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 13:11:17 +08:00
Arvin Xu 21a1b45fa1 feat(server): add batchMoveTopics TRPC mutation (#15793)
Expose TopicModel.batchMoveToAgent through a new topic.batchMoveTopics
lambda mutation (topic:update scoped permission, input { topicIds,
targetAgentId }) and add the matching topicService.batchMoveTopics client
wrapper.

Depends on the database layer (TopicModel.batchMoveToAgent).
Part of LOBE-10330

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 13:00:49 +08:00
Arvin Xu 30fc25fad7 chore(database): add batchMoveToAgent to TopicModel (#15792)
feat(database): add batchMoveToAgent to TopicModel

Add a transactional TopicModel.batchMoveToAgent(topicIds, targetAgentId)
that reassigns topics to another agent purely via the agentId foreign key.
Both topics.agentId and messages.agentId are updated together (topic lists
query by topics.agentId and message queries filter by messages.agentId),
and sessionId is cleared on both tables so rows fully detach from the
source agent's legacy session. Scoped by ownership to prevent cross-user
moves.
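
The row-level effect of the move can be shown with an in-memory sketch. This is not the `TopicModel` implementation: the row shapes and the `Db` container are assumptions, and a plain loop stands in for the database transaction.

```typescript
interface TopicRow {
  agentId: string | null;
  id: string;
  sessionId: string | null;
  userId: string;
}

interface MessageRow {
  agentId: string | null;
  id: string;
  sessionId: string | null;
  topicId: string;
  userId: string;
}

interface Db {
  messages: MessageRow[];
  topics: TopicRow[];
}

// Reassign owned topics and their messages to targetAgentId and clear
// sessionId on both, mirroring the behavior described above.
function batchMoveToAgent(
  db: Db,
  userId: string,
  topicIds: string[],
  targetAgentId: string,
): number {
  const requested = new Set(topicIds);
  const moved = new Set<string>();
  for (const topic of db.topics) {
    // Ownership scope: skip rows that belong to another user.
    if (topic.userId !== userId || !requested.has(topic.id)) continue;
    topic.agentId = targetAgentId;
    topic.sessionId = null;
    moved.add(topic.id);
  }
  for (const message of db.messages) {
    if (message.userId !== userId || !moved.has(message.topicId)) continue;
    message.agentId = targetAgentId;
    message.sessionId = null;
  }
  return moved.size;
}
```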

Part of LOBE-10330

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 12:36:29 +08:00
YuTengjing 2f1a746756 📝 docs: replace Claude-specific references with generic agent wording in skills (#15785) 2026-06-14 11:47:20 +08:00
Arvin Xu f47e65d215 🐛 fix(server): rehydrate subagent runs from DB on cold replica (#15788)
* 🐛 fix(server): rehydrate subagent runs from DB on cold replica

Server-side hetero persistence kept per-operation state in a module-level
map. On a cold serverless replica (or any cross-replica batch), the main
agent state is rebuilt from DB but `MainAgentRunState.subagents` was seeded
empty. A continuing subagent event then hit the `!existing` branch of
`ensureRun` and forked a brand-new isolation thread for a parentToolCallId
that already had one — producing piles of generic "Subagent" threads that
were never attached to the right thread. Desktop never hit this (one
long-lived run-state closure).

Rebuild `state.main.subagents` from DB the same way the main half is
rehydrated: add `rehydrateSubagentRunsState` to @lobechat/heterogeneous-agents
and call a new `refreshSubagentRunsFromDb` each ingest. Only runs MISSING
from memory are rehydrated (warm accumulators win); finalized (Active)
threads are excluded so completed spawns are never resurrected.
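
The "warm accumulators win" rule amounts to a merge that only fills gaps. A minimal sketch, with `mergeRehydratedRuns` as a hypothetical name and runs keyed by an arbitrary string such as the parent tool call id:

```typescript
// Merge runs rebuilt from the DB into the in-memory map without
// overwriting anything that is already tracked in memory.
function mergeRehydratedRuns<T>(
  memory: Map<string, T>,
  fromDb: Map<string, T>,
): Map<string, T> {
  const merged = new Map<string, T>(memory);
  fromDb.forEach((run, key) => {
    if (!merged.has(key)) merged.set(key, run);
  });
  return merged;
}
```

On a warm replica the DB copy is ignored for every known run; on a cold replica the map starts empty, so every unfinished run is restored.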

Sibling of #15783 (main message chaining) — same root cause, subagent half.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(server): scope subagent rehydration to operation + de-dupe inner tools

Two follow-up fixes on the cold-replica subagent rehydration:

- P1: de-dupe inner tool creation against the run-lifetime tool set, not just
  the per-turn `persistedIds`. Per-turn state is reset on every turn boundary
  and starts empty after a rehydration, so a replayed / continued tools_calling
  on a cold replica minted a SECOND tool message for an id the run already
  wrote. `lifetimeToolCallIds` survives boundaries and is restored from DB, so
  it is the durable de-dupe key. Mirrors the main-agent retry protection.

- P2: scope `refreshSubagentRunsFromDb` to the current operation. Topics are
  reused across turns; a prior crashed/cancelled run can leave a subagent
  thread stuck `Processing`. Rehydrating purely by topic+status would merge
  that unrelated thread into the new operation's reducer state and finalize it
  on the new run's terminal drain. Stamp `operationId` on the subagent thread
  metadata at creation and filter rehydration by it.
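
Both follow-ups can be sketched together. The state shape, the status strings, and the function names are assumptions for illustration; only `persistedIds`, `lifetimeToolCallIds`, and `operationId` are taken from the text above.

```typescript
interface RunToolState {
  // Survives turn boundaries and is restored from the DB.
  lifetimeToolCallIds: Set<string>;
  // Reset on every turn boundary.
  persistedIds: Set<string>;
}

// Returns true only the first time a tool call id is seen in the whole run.
function shouldCreateToolMessage(state: RunToolState, toolCallId: string): boolean {
  if (state.lifetimeToolCallIds.has(toolCallId)) return false;
  state.lifetimeToolCallIds.add(toolCallId);
  state.persistedIds.add(toolCallId);
  return true;
}

interface ThreadRow {
  id: string;
  metadata: { operationId?: string };
  status: 'active' | 'processing';
}

// Rehydrate only unfinished threads that belong to the current operation.
function selectRehydratableThreads(threads: ThreadRow[], operationId: string): ThreadRow[] {
  return threads.filter(
    (thread) => thread.status === 'processing' && thread.metadata.operationId === operationId,
  );
}
```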

Adds regression cases for both (each verified to fail without its fix).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 03:13:35 +08:00
Arvin Xu 6dcbd387f7 feat: support drag-to-reorder for desktop tabs (#15787)
* feat: support drag-to-reorder for desktop tabs

Make the Electron titlebar tabs draggable horizontally to reorder them,
like Chrome tab dragging. Wires the existing `reorderTabs` store action
to a @dnd-kit sortable context.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix: preserve scroll position when reordering background tabs

The active-tab auto-scroll effect depends on `tabs`, so reordering
retriggered it and jumped the viewport back to the active tab. Guard it
with a ref so it only scrolls when the active tab id actually changes.
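
The guard can be expressed outside React as a tiny tracker. In the component the remembered id would presumably live in a ref; the class below is a hypothetical, framework-free illustration of the same check.

```typescript
// Remembers the last active tab id and reports whether a scroll is needed.
class ActiveTabScrollGuard {
  private lastActiveId: string | undefined = undefined;

  shouldScroll(activeId: string | undefined): boolean {
    // No active tab, or the same tab as before (for example after a reorder).
    if (activeId === undefined || activeId === this.lastActiveId) return false;
    this.lastActiveId = activeId;
    return true;
  }
}
```

Reordering changes the `tabs` array but not the active id, so the effect can re-run without moving the viewport.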

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 02:57:21 +08:00
Arvin Xu fa58fd12a0 🔨 chore(testing): automate local auth setup (#15790)
🧪 test(agent): automate local auth setup
2026-06-14 02:00:49 +08:00
Rdmclin2 913ee4210d feat: page/agent/agentGroup/task edit lock (#15786)
* feat: support page editor lock

Squashed page-lock feature work:
- support page editor lock
- support agent group / agent / task edit
- add edit lock to agent/agentgroup/task
- refactor page lock
- fix workspaceId for edit objects
- align with agent/group/task

* fix: collaborative edit lock

* chore: update i18n

* fix: redis acquire

* fix: release lock

* fix: test case

* chore: complement page lock test cases
2026-06-14 01:40:36 +08:00
Arvin Xu 99411041b9 feat(device): share remote-device gateway RPC between desktop and CLI (#15780)
* feat(device): share remote-device gateway RPC between desktop and CLI

Extract the desktop's remote-device gateway RPC surface into a shared
`@lobechat/device-control` package and wire it into the CLI so `lh connect`
serves the same git / workspace / file device RPCs as the desktop app.

- local-file-shell: relocate all git operations (branches, working-tree
  patches, branch diff, checkout/rename/delete/pull/push/revert) from the
  desktop GitCtr into the shared package as pure functions
- device-control (new): the `executeDeviceRpc` dispatch + workspace scan +
  portable file-preview / file-index defaults, with platform hooks injected
- desktop: GitCtr / WorkspaceCtr / GatewayConnectionCtr become thin wrappers
  delegating to the shared package (local IPC path unchanged)
- cli: handle `rpc_request` over the gateway via the shared dispatcher

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(device): cover git branch ops and device-control portable defaults

- local-file-shell: real-git integration tests for branch checkout / rename /
  delete (+ validation), working-tree files & patches, revert, branch-diff with
  no remote, and push / pull / ahead-behind against a bare origin
- device-control: defaultGetLocalFilePreview (text / image / accept filter /
  workspace containment / missing file) and defaultGetProjectFileIndex (git
  ls-files path + glob fallback)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(device): preserve directory entries in the glob project-file index

The CLI `getProjectFileIndex` glob fallback used `globLocalFiles`, which returns
only non-hidden file paths and no directory entries — so the Files tree builder
flattened nested files to the root and dropped dot-directories.

Walk with fast-glob (`dot: true`) and synthesize directory entries via the same
`collectProjectDirectories` path the git branch uses, so nesting and dot-dirs
(e.g. `.agents`) render correctly. Extracted a shared `buildEntries` helper.
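
Synthesizing directory entries from a flat list of file paths might look like the sketch below. `collectDirectories` is a hypothetical helper, not the actual `collectProjectDirectories`; it assumes POSIX-style relative paths.

```typescript
// Derive every ancestor directory for a list of relative file paths.
function collectDirectories(filePaths: string[]): string[] {
  const dirs = new Set<string>();
  for (const file of filePaths) {
    const segments = file.split('/');
    // Every proper prefix of the path is a directory.
    for (let i = 1; i < segments.length; i++) {
      dirs.add(segments.slice(0, i).join('/'));
    }
  }
  return Array.from(dirs).sort();
}
```

Given `.agents/skills/a.md`, the result includes both `.agents` and `.agents/skills`, so a tree builder has the parent nodes it needs instead of flattening files to the root.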

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 00:56:53 +08:00
YuTengjing 39bce329fd 🐛 fix: surface model list fetch failures (#15753) 2026-06-13 23:05:44 +08:00
Arvin Xu 55a969a3c1 🐛 fix(server): stabilize heterogeneous main message chaining (#15783)
* ♻️ refactor(server): reduce main heterogeneous persistence

* 🐛 fix(server): anchor hetero turns to latest tool row
2026-06-13 22:13:45 +08:00
Arvin Xu f51dd06a36 🐛 fix(model-runtime): classify "Agent state not found" as StateStoreReadError (#15778)
`coordinator.loadAgentState(operationId)` returning null throws a raw
`Error("Agent state not found for operation …")`, which (after the refine fix)
otherwise lands as a bare 500. It is a state-store READ failure, so route it to
StateStoreReadError alongside the caller-gone abort.

Because losing an operation's state is a genuine system fault (not benign
client abandonment), promote StateStoreReadError to countAsFailure: true /
severity: error. `ERR caller gone` now counts too — accepted trade-off, both
are system-side read failures worth tracking.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 21:11:33 +08:00
Arvin Xu 24e34c7545 Revert "🐛 fix(agent-document): support image LiteXML in headless editor (#15764)"
This reverts commit 3f3f12dbd2.
2026-06-13 20:29:35 +08:00
Arvin Xu 81d40b90d4 ♻️ refactor(chat): unify client hetero executor on a shared mainAgentReducer (#15762)
* feat(hetero): add shared mainAgentCoordinator reducer

Pure, transactional main-agent run reducer mirroring subagentCoordinator.
Owns the asst→tool→asst chain rule (lastToolMsgIdEver) as the single source
of truth so client and server can converge on one processing flow. Not yet
wired into either interpreter.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ♻️ refactor(chat): drive client hetero executor via shared mainAgentReducer

Replace the renderer's hand-written main-agent event state machine with the
shared reduceMainAgent + an applyIntent interpreter (main + delegated subagent
intents). The executor keeps its shell (persistQueue/IPC ordering, optimistic
intervention UI, op usage-metrics tray, notifications, resume fallback) and
still forwards raw events to the gateway handler for live UI; durable DB writes
now flow through the reducer's intents, so the asst→tool→asst parent chain
(incl. the lastToolMsgIdEver toolless-step rescue) is a single shared source of
truth with the server.

Tool/assistant message ids are now pre-allocated by the reducer (matching the
subagent path); updated the executor tests to honor caller-provided ids and
assert against captured ids instead of mock-minted ones.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 📝 docs(chat): clarify why main-scope streamContent intent is a no-op

It's intentional, not dead code: main live token UI is driven by the raw
stream_chunk forward to the gateway handler; the intent only drives the
subagent thread bucket (whose events are dropped before that forward).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(chat): close two hetero executor races from reducer refactor

Two review-found bugs introduced by moving main-agent state into the queued
reduceAndApplyMain:

1. retryWithoutResume's hasStreamedState() read mainState, which is now only
   updated inside the queued reduce — so a recoverable resume error landing
   after partial output was queued (but before the queue drained) could start a
   second run and duplicate/interleave messages. Restore the old synchronous
   guarantee with a `sawStreamedEvent` flag set the moment a stream_chunk /
   tool_result arrives, before queueing.

2. A transient createMessage failure on a step-boundary assistant was
   best-effort (logged, not rethrown), so reduceAndApplyMain still committed
   currentAssistantId to a row that was never created — every later
   content/tool/result write then targeted a missing assistant and was lost.
   Rethrow so the commit is skipped and currentAssistantId stays valid, mirroring
   the subagent createMessage path.

Both guarded by regression tests that fail without the fix.
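
The first race comes down to setting a flag synchronously before work is queued. A reduced sketch, with `StreamGuard` as a hypothetical class and a promise chain standing in for the executor's persist queue:

```typescript
interface StreamEvent {
  type: string;
}

class StreamGuard {
  private queue: Promise<void> = Promise.resolve();
  private sawStreamedEvent = false;

  // Record that output was seen immediately, then queue the durable write.
  enqueue(event: StreamEvent, apply: () => Promise<void>): Promise<void> {
    if (event.type === 'stream_chunk' || event.type === 'tool_result') {
      this.sawStreamedEvent = true;
    }
    this.queue = this.queue.then(apply);
    return this.queue;
  }

  // Safe to call before the queue drains.
  hasStreamedState(): boolean {
    return this.sawStreamedEvent;
  }
}
```

A retry decision that reads `hasStreamedState()` right after `enqueue` sees the partial output even though the queued reduce has not run yet.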

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 20:10:51 +08:00
Arvin Xu 9cde29fb14 💄 style(workflow): inset partial warning badge (#15773)
* 💄 style(workflow): inset partial warning badge

* feat(portal): support preview for local markdown images

* 🐛 fix(portal): narrow markdown image src
2026-06-13 20:10:08 +08:00
Arvin Xu ebe8411e7e 💄 style: compact device guard alert (#15776) 2026-06-13 20:09:16 +08:00
Arvin Xu 381e87474c feat(device): add rename & delete actions to branch switcher (#15774)
Hover a branch row in the branch switcher to rename or delete it. Wires
new renameGitBranch / deleteGitBranch operations through both transports
(Electron IPC for the local machine, device.* TRPC RPCs for remote/web),
mirroring the existing checkoutGitBranch / revertGitFile stack.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 20:07:45 +08:00
Arvin Xu 09fd6f3411 💄 style(chat): carousel the OpStatusTray generating phrase every 4s (#15775)
The generating status phrase was picked once per operation and stayed
frozen for the whole run. Rotate it like a carousel — advancing to the
next phrase every 4s with a subtle fade — so a long-running task feels
alive instead of stuck on one line.

- add pickRotatingStatusPhrase: seed keeps the starting phrase stable
  per operation, step advances the carousel; reuses the existing 1s
  elapsed ticker so no extra timer is needed
- fade/slide the phrase on each switch via a keyed wrapper span (keeps
  the shiny-text shimmer animation intact)
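
The seed-plus-step idea can be sketched as follows. `rotatePhrase` is a hypothetical function and its signature is an assumption; the real `pickRotatingStatusPhrase` may take different arguments. Only the 4-second interval and the "stable start, advancing step" behavior come from the description.

```typescript
const ROTATE_INTERVAL_SECONDS = 4;

// seed fixes the starting phrase for an operation; elapsed time advances it.
function rotatePhrase(phrases: string[], seed: number, elapsedSeconds: number): string {
  if (phrases.length === 0) return '';
  const start = Math.abs(Math.trunc(seed)) % phrases.length;
  const step = Math.floor(Math.max(0, elapsedSeconds) / ROTATE_INTERVAL_SECONDS);
  return phrases[(start + step) % phrases.length] ?? '';
}
```

Because the step is derived from elapsed seconds, an existing 1s ticker is enough to drive the rotation.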

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 20:03:07 +08:00
Arvin Xu d9d9f44cb2 🐛 fix(model-runtime): classify untyped Error throws via message patterns (#15767)
* 🐛 fix(model-runtime): classify untyped Error throws via message patterns

`refineErrorCode` only re-derived a specific code when the incoming errorType
was `ProviderBizError`, so raw `Error` throws — which `formatErrorForState`
wraps as `InternalServerError` (HTTP 500) — never reached `matchErrorPattern`.
Persistence-layer (`Failed query: …`) and state-store drops therefore landed
as bare, un-classified 500s instead of `DatabasePersistError` etc.

Add the two un-typed fallback wrappers (`InternalServerError`, `AgentRuntimeError`)
to `REFINABLE_CODES` so their message runs through the pattern registry before
falling back. The existing `Failed query:` pattern already classifies these;
this just lets it run again.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(model-runtime): classify Upstash readonly-upgrade & dropped-caller drops

Add `READONLY Writes are temporarily rejected` and `ERR caller gone` to the
StateStorePersistError pattern block — both are Redis/Upstash state-store
failures that otherwise fall through to a bare 500. They describe the
connection/server condition rather than a specific command, so there is no
read-vs-write signal to split on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(model-runtime): split caller-gone state-store reads into StateStoreReadError

`ERR caller gone` is an Upstash reply when an in-flight blocking READ
(XREAD on the agent event stream, BLPOP on a tool result) is aborted because
the originating caller disconnected — a benign client abandonment tied to the
request lifecycle, not a write/persist fault. Bucketing it under
StateStorePersistError mislabelled it as a harness failure (attribution:
harness, countAsFailure: true).

Add a dedicated StateStoreReadError (E7007, attribution: system, severity:
warning, countAsFailure: false) and route `ERR caller gone` to it. The
write-side rejection `READONLY Writes are temporarily rejected` stays under
StateStorePersistError.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(model-runtime): scope HTTP-status fallback to provider catch-alls

Opening the un-typed wrappers (InternalServerError / AgentRuntimeError) to the
full refine path also let them hit the leadingStatusFromMessage /
codeFromHttpStatus fallback. A harness/DB/Redis throw like `Error('429 …')` or
`Error('500 …')` with no registered pattern would then be recast as
RateLimitExceeded / ProviderServiceUnavailable — provider retry/failure
semantics on a harness error.

Split the sets: PATTERN_REFINABLE_CODES (message matching) stays open to the
wrappers; STATUS_REFINABLE_CODES (the coarse HTTP-status bucket) is limited to
ProviderBizError, where a leading status is a real upstream signal.
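
Taken together, the commits in this PR describe a two-stage refinement. The sketch below is illustrative only: the set names and error codes follow the commit text, but the pattern list, the status mapping, and the function bodies are assumptions, not the actual `refineErrorCode`.

```typescript
const PATTERN_REFINABLE_CODES = new Set<string>([
  'ProviderBizError',
  'InternalServerError',
  'AgentRuntimeError',
]);
const STATUS_REFINABLE_CODES = new Set<string>(['ProviderBizError']);

// Assumed subset of the message pattern registry.
const MESSAGE_PATTERNS: Array<{ code: string; pattern: RegExp }> = [
  { code: 'DatabasePersistError', pattern: /Failed query:/ },
  { code: 'StateStoreReadError', pattern: /ERR caller gone/ },
  { code: 'StateStorePersistError', pattern: /READONLY Writes are temporarily rejected/ },
];

// Assumed coarse mapping from a leading HTTP status in the message.
function codeFromLeadingStatus(message: string): string | undefined {
  const match = /^(\d{3})\b/.exec(message);
  if (!match) return undefined;
  const status = Number(match[1]);
  if (status === 429) return 'RateLimitExceeded';
  if (status >= 500) return 'ProviderServiceUnavailable';
  return undefined;
}

function refineErrorCodeSketch(errorType: string, message: string): string {
  // Stage 1: message patterns, open to the untyped wrappers.
  if (PATTERN_REFINABLE_CODES.has(errorType)) {
    for (const { code, pattern } of MESSAGE_PATTERNS) {
      if (pattern.test(message)) return code;
    }
  }
  // Stage 2: HTTP-status fallback, limited to provider catch-alls.
  if (STATUS_REFINABLE_CODES.has(errorType)) {
    return codeFromLeadingStatus(message) ?? errorType;
  }
  return errorType;
}
```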

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 19:16:43 +08:00
Arvin Xu 1244a40950 🐛 fix(chat): stop ContentLoading from leaking raw operation i18n keys (#15752)
Internal/bookkeeping operation types (createToolMessage, executeToolCall,
pluginApi, builtinTool*, callLLM, searchWorkflow, ...) have no `operation.*`
locale key, so ContentLoading fell back to rendering the raw key
(e.g. `operation.toolCalling...`).

Extract OpStatusTray's operation→activity mapping into a shared
`resolveOperationActivity` helper and reuse it in ContentLoading: mappable
ops show the localized `opStatusTray.status.*` phase label, container ops
keep their dedicated copy, and unmappable ones fall back to the dot loader.

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 19:14:24 +08:00
Arvin Xu a48c2badd9 💄 style: improve shared Linear tool rendering (#15769) 2026-06-13 18:37:51 +08:00
Arvin Xu 3f3f12dbd2 🐛 fix(agent-document): support image LiteXML in headless editor (#15764) 2026-06-13 17:37:51 +08:00
Rylan Cai 99023811d8 📝 fix: clarify local system shell result wording (#15745)
* 🔥 remove local system listFiles exposure

* 📝 clarify local system shell result wording

* 📝 refine local system shell manifest copy

* 📝 simplify local system shell prompt semantics

* 🐛 fix command wait-window result wording

* 📝 limit transient device retry guidance

* show command output duration

* 🏷️ narrow command duration result type

* 🐛 propagate operation id for device tool calls

* 🐛 update project skill discovery hint

* 📝 clarify project skill file access

* 📝 align project skill discovery comment
2026-06-13 16:34:10 +08:00
Arvin Xu 480a2979e1 🐛 fix(codex): parse retry time in stated timezone (#15758)
* 🐛 fix(codex): parse retry time in stated timezone

* 🐛 fix: enable remote git review panel

* 🐛 fix(codex): preserve adjacent retry meridiem
2026-06-13 16:32:35 +08:00
Arvin Xu 531900cf70 🐛 fix(desktop): detect bundled Codex CLI from Codex.app on macOS (#15759)
* 🐛 fix(desktop): detect bundled Codex CLI from Codex.app on macOS

OpenAI's Codex desktop app bundles the real codex CLI inside Codex.app
(Contents/Resources/codex) but never symlinks it onto PATH. A user with
only the desktop app installed failed PATH-based detection, so codex was
never spawned and the chat silently produced no reply.

Add a well-known install-location fallback inside detectHeterogeneousCliCommand
(tried after the PATH lookup, so a user's own install still wins), covering
both /Applications and ~/Applications. The fallback runs at detection time,
not module load, so it touches no node:os named exports on import. Feed the
detector-resolved absolute path through to spawn so a bare `codex` doesn't
ENOENT under spawn's leaner env.
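
A sketch of the lookup order, with the filesystem and PATH checks injected so it stays self-contained. The function names are hypothetical; the bundle-relative path `Contents/Resources/codex` and the two install roots come from the description above.

```typescript
// Candidate locations of the CLI bundled inside Codex.app.
function codexAppCandidates(homeDir: string): string[] {
  const relative = 'Applications/Codex.app/Contents/Resources/codex';
  return [`/${relative}`, `${homeDir}/${relative}`];
}

// PATH lookup first so a user's own install wins, then the bundle fallback.
function detectCodexCommand(
  lookupOnPath: () => string | undefined,
  exists: (path: string) => boolean,
  homeDir: string,
): string | undefined {
  const onPath = lookupOnPath();
  if (onPath) return onPath;
  return codexAppCandidates(homeDir).find((candidate) => exists(candidate));
}
```

Returning the absolute path lets the caller pass it straight to spawn instead of a bare `codex`.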

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* 🐛 fix(desktop): carry login-shell PATH into CLI spawn env

When the detector resolved a bare command via the login-shell PATH, only
the absolute shim path was kept; the PATH used for resolution was dropped.
spawn() then built its env from the leaner Finder-inherited PATH, so an
absolute shim with `#!/usr/bin/env node` still failed with
`env: node: No such file or directory` even though preflight succeeded
(npm/Homebrew/mise installs launched from Finder on macOS).

Surface the resolved PATH through ToolStatus.resolvedPathEnv, stash it on
the session, and merge it into spawnEnv (session.env still wins). Only set
when resolution fell back to the login-shell PATH, so the common on-PATH
case is unchanged.
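
The merge order can be sketched as a plain object spread. `buildSpawnEnv` is a hypothetical helper; the precedence (inherited env, then the resolved PATH, then session overrides) follows the description.

```typescript
type Env = Record<string, string | undefined>;

function buildSpawnEnv(
  baseEnv: Env,
  resolvedPathEnv: string | undefined,
  sessionEnv: Env = {},
): Env {
  return {
    ...baseEnv,
    // Only override PATH when detection had to fall back to the login shell.
    ...(resolvedPathEnv ? { PATH: resolvedPathEnv } : {}),
    // Explicit session settings still win.
    ...sessionEnv,
  };
}
```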

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-13 16:32:27 +08:00
Arvin Xu c9325794e5 🐛 fix(portal): close empty local file preview (#15760) 2026-06-13 16:31:56 +08:00
Innei 4a11ed9887 ♻️ refactor(auth): migrate auth pages to a standalone lightweight SPA (#15689)
* feat(oidc): add interaction details endpoint

* feat(auth-spa): scaffold standalone auth SPA shell and build pipeline

* 🐛 fix(auth-spa): address review findings in AuthShell copies

*  feat(auth-spa): add spa-auth html route handler

* ♻️ refactor(auth-spa): migrate simple auth pages into auth SPA

* 🔒 fix(auth-spa): validate locale segment in spa-auth route

* ♻️ refactor(auth-spa): move verify-im route to main SPA

* 🔒 fix(auth-spa): sanitize callbackUrl, fix signup form wiring, add router error element

* ♻️ refactor(auth-spa): migrate oauth pages into auth SPA

* 🐛 fix(auth-spa): address oauth migration review findings

* ♻️ refactor(auth): route auth pages to standalone SPA and drop Next auth tree

* 🔒 fix(auth): validate locale before middleware rewrite

* 🔥 chore(auth-spa): drop unused messenger i18n namespace from auth shell

* perf(build): share one react vendor bundle across web/mobile/auth SPA builds

Build react core (react, react-dom, react-dom/client, react/jsx-runtime)
once as a self-contained ESM bundle under /_spa/vendor-shared, then mark
those specifiers external in every SPA build and map them via rolldown
output.paths to the same hashed URLs, so the auth page warms the main
app's react cache. react-router-dom stays per-build: apps use ~19K of it
after tree shaking while a shared bundle must export all 252K.

Also split auth i18n namespaces into per-locale chunks, keep locale
runtime helpers out of the default locale chunk, and group packages/const
into app-const so vendor-ai-runtime no longer captures it.

* ♻️ refactor(spa): extract shared SPA html serving helpers

Both the main SPA and auth SPA route handlers duplicated the Vite dev
asset rewriting, analytics config assembly and html template rendering.
Move them into src/server/spaHtml.ts; the desktop umami block becomes an
opt-in flag only the main SPA enables.

* 🐛 fix(auth-spa): bundle default locale resources and disable i18n suspense to fix signin mount loop

* feat(auth-spa): wrap auth shell with BusinessAuthProvider slot

* 👷 build(spa): support custom vite dev origin and mark SPA entries side-effectful

* 🔥 chore: drop dead /welcome entry from nextjsOnlyRoutes

* 🐛 fix(auth-spa): forward referral to signup and fix error boundary dark-mode contrast

* ♻️ refactor(spa): lift NextThemeProvider above RouterProvider so route error boundaries are theme-aware

* update
2026-06-13 16:15:04 +08:00
Arvin Xu be7b759820 🛠️ chore(agent-testing): add local dev env bootstrap (#15757) 2026-06-13 13:54:13 +08:00
Arvin Xu fa76928f62 🐛 fix: fix Codex resumed usage reporting for heterogeneous agents (#15751)
🐛 fix(heterogeneous-agent): normalize codex resumed usage
2026-06-13 13:34:41 +08:00
Arvin Xu f6db1361ee feat(agent): show topic sidebar status indicators (#15739) 2026-06-13 13:32:56 +08:00
Arvin Xu 5d6eaf53f3 📝 docs(agent-testing): require inline visual evidence (#15750) 2026-06-13 12:28:56 +08:00
YuTengjing c4e4469083 🐛 fix: improve fallback trace error UI (#15746) 2026-06-13 12:18:56 +08:00
Arvin Xu 800b534741 🐛 fix(chat): track operation usage in status tray (#15736) 2026-06-13 11:55:39 +08:00
Arvin Xu 03b9d07d0b feat(topic): add selector topic actions (#15744) 2026-06-13 11:53:21 +08:00
Arvin Xu f60d1fe8dd 🐛 fix(codex): reuse Linear inspector for MCP calls (#15738)
* 🐛 fix(codex): reuse Linear inspector for MCP calls

* 🐛 fix(codex): gate generic Linear MCP labels
2026-06-13 11:46:16 +08:00
YuTengjing e5a27dc97c 🐛 fix: handle Kimi code thinking mode (#15725) 2026-06-13 11:21:25 +08:00
Arvin Xu c7e0c83174 ♻️ refactor(agent-runtime): clarify virtual sub-agent naming (#15737) 2026-06-13 11:10:14 +08:00
Arvin Xu ab958a0b98 🐛 fix(chat): compact operation metrics on narrow inputs (#15735)
* 🐛 fix: compact operation metrics on narrow inputs

* 📝 docs: improve agent testing report template
2026-06-13 02:28:38 +08:00
Arvin Xu 5362be4078 ♻️ refactor(agent): split virtual sub-agent entry (#15733) 2026-06-13 02:10:47 +08:00
Arvin Xu 6887930428 🐛 fix: resolve local markdown image assets (#15729)
* 🐛 fix: resolve local markdown image assets

* 🐛 fix: preserve UNC markdown asset paths

* 🔒️ fix: restrict markdown image previews to images

* ♻️ refactor: pass markdown image preview accept directly
2026-06-13 01:55:00 +08:00
Arvin Xu da94942d9c 🐛 fix(portal): scope local file tabs by working directory (#15732) 2026-06-13 01:54:44 +08:00
Arvin Xu a9141c8ade 🐛 fix(page): stabilize agent editor sync (#15730) 2026-06-13 01:36:38 +08:00
R3pl4c3r 8ab5ec5364 🐛 chore(workflow): fix Upstream Sync workflow running error (#15706)
fix(workflow): fix Upstream Sync workflow running error
2026-06-13 01:29:44 +08:00
Arvin Xu 222534dbe1 🐛 fix(agent): block recursive server sub-agents (#15731) 2026-06-13 01:24:41 +08:00
Neko f31c94490d ⚡️ perf(app,database): derive topic activity from messages (#15726) 2026-06-13 00:57:45 +08:00
Rdmclin2 52eaf2702e 🐛 fix: workspace url sync (#15728)
* fix: workspace url sync

* chore: remove billing as personal
2026-06-13 00:15:48 +08:00
YuTengjing ce81ea44bf 🐛 fix: gate inbox unread count by login state (#15724) 2026-06-12 23:32:14 +08:00
Tsuki 29974d3ab9 🐛 fix(mobile): preserve authenticated legacy unregister cleanup (#15723)
Follow-up to #15719 addressing a Codex P2 review note.

After #15719, legacy v1.0.7 clients that only send `deviceId` were
silent-OKed unconditionally. But `publicProcedure` still receives
`ctx.userId` from `createLambdaContext` — and in the *active*
sign-out path (the user is still authenticated when logout fires)
that userId is valid. Skipping the delete in that case orphans the
existing `(userId, deviceId)` row, so `PushChannel.deliver` keeps
fanning notifications out to a signed-out device. Expo's
`DeviceNotRegistered` receipt only fires on uninstall, not on
logout, so the cron worker doesn't catch this either.

Fix: add a Path B fallback — when `ctx.userId` is available, run
the original `(userId, deviceId)` delete. Path A (expoToken pair)
still wins when present; Path C (silent OK) is now reserved for
the case the original PR was actually targeting: a v1.0.7 client
whose session is already gone, which is the source of the 401
storm.

Path matrix:
  expoToken present                  → Path A: precise delete by (expoToken, deviceId)
  no expoToken, ctx.userId present   → Path B: legacy (userId, deviceId) delete
  no expoToken, no session           → Path C: silent OK, cron cleans up

Tests added:
- legacy + valid session → falls back to (userId, deviceId)
- legacy + no session    → silent OK
- expoToken always takes precedence over userId fallback
2026-06-12 21:58:23 +08:00
Tsuki f4c431b028 🐛 fix(mobile): stop pushToken.unregister 401 storm (#15719)
Symptom: app.lobehub.com production logs show ~50+ TRPCError
UNAUTHORIZED traces per second on /trpc/mobile/pushToken.unregister,
starting from the v1.0.7 mobile release. Only `unregister` is hit
— `register` never appears in logs.

Root cause: the v1.0.7 client calls unregister *during* sign-out,
after the session is already invalid in practice (expired OIDC
token / cleared cookie). With authedProcedure gating, every logout
turns into a 401 that the client mistakes for an auth-expired
event and retries → a storm. Inside the client this also creates
a logout → 401 → authExpired.redirect → logout recursion.

Fix: change `unregister` to publicProcedure and authorize by the
(deviceId, expoToken) pair the client received at registration —
holding both is proof of ownership of that row, same trust model
as APNs/FCM unregister. Legacy v1.0.7 clients that only send
deviceId get a silent 200; the stale row is cleaned up by the
existing `process-push-receipts` worker via Expo's
DeviceNotRegistered receipts.

Returning 200 to those legacy calls also breaks the client-side
recursion at the source — the in-the-wild v1.0.7 fleet stops 401
flooding the moment this ships, before users update.

Tests:
- Router (mocked): expoToken path deletes by (expoToken, deviceId);
  no-expoToken path silently succeeds; unauthenticated caller
  succeeds; empty-string fields rejected.
- Model (integration): only the row matching both fields is
  removed; mismatched expoToken is preserved (defense against
  callers who only guess deviceId).

Fixes LOBE-10174
2026-06-12 21:47:19 +08:00
Innei 34fbd9ffd3 feat(document): coalesce autosave history versions into 10-minute windows (#15716)
*  feat(document): coalesce autosave history versions into 10-minute windows

*  feat(document): break autosave history window on new page load session
2026-06-12 20:55:28 +08:00
Arvin Xu 09b5e926bf feat(conversation): add op status tray above chat input (#14737)
*  feat(conversation): add op status tray above chat input

Show elapsed time, total tokens, and total cost while an AI-runtime
operation is running in the current conversation. Lives in the floating
overlay above the chat input alongside QueueTray and TodoProgress,
attaches flush to the input panel below.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(conversation): read top-level message.usage in op status tray

Token totals stayed at 0 during regular agent runs because the standard
agent path writes usage to `message.usage` (top-level) while the
heterogeneous executor writes `metadata.usage`. Read both. Also drop the
fragile createdAt window — assistant messages can be created before the
AI_RUNTIME op's startTime, which excluded otherwise-valid rows — and
aggregate across the whole conversation instead.

UI: a little more padding, a pulsing dot to mark the running state, a
tokens label, and a divider between tokens and cost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

*  feat(conversation): streaming phase, ping dot, and richer metrics in op status tray

- Left side now shows the current streaming phase (thinking / calling tools /
  searching / compressing / generating) derived from the most recent running
  sub-operation; server runtimes surface no sub-ops on the client and fall
  back to 'generating'.
- Pulse dot upgraded to an expanding ping ring animation.
- Zero-valued metrics are hidden entirely (no more '0 tokens / $0').
- Long-running tasks additionally surface turns and tool-call counts next to
  tokens and total cost.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 💄 style(conversation): polish op status tray display

* 💄 style(conversation): unify op status tray glyph to a single hue

The activity glyph mixed purple and cyan accents into the primary color;
all layers now derive from colorPrimary alone (opacity-only variation).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 💄 style(conversation): strip glyph halo fill and drop-shadow

The halo's tinted fill plus the drop-shadow rendered as a muddy disc
behind the glyph (worst in light theme). Reduce to a breathing core dot
plus a single rotating dashed orbit, primary hue only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 💄 style(conversation): drop dollar prefix and code font in op status tray

The dollar icon already conveys currency, and the code font made the
numbers feel out of place next to the body text.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

*  feat(conversation): show per-message cost next to the token chip

Renders usage.cost beside the token count in the assistant message
footer; hidden in credit mode (credits already express cost) and when
the value is zero/absent.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 💄 style(conversation): hide per-message cost below $0.20

Cheap messages don't need a cost callout — the chip only surfaces once
the cost is large enough to matter.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* 🐛 fix(conversation): anchor reconnected op timer to real run start, surface steps

- Page-refresh reconnect recreated the gateway operation with
  startTime=Date.now(), resetting the tray timer to 00:00 mid-run.
  Anchor it to the assistant message's createdAt instead.
- Mirror the server's authoritative stepIndex onto op.metadata.stepCount
  at every step_start event, so the steps metric shows for real
  server-side runs (and survives reconnects).
- Drop the tool-call count metric from the tray.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

*  test(conversation): stub updateOperationMetadata in gateway event handler mock store

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-12 18:10:29 +08:00
Innei d3e8e7cb65 🐛 fix(locale): support eager dayjs locale modules (#15711) 2026-06-12 16:57:42 +08:00
Rdmclin2 60bed5782f chore: update i18n (#15712)
chore: update i18n files
2026-06-12 16:21:34 +08:00
Rdmclin2 35b6bc55b8 🐛 fix: workspace error (#15701)
feat: support workspace (page author, copyTo/transferTo, notifications, i18n & fixes)

Squashed 13 commits from fix/workspace-error for clean rebase onto main's submodule base.
2026-06-12 16:08:31 +08:00
Innei 365dd1ff64 ⚡️ perf(build): remove sitemap generation to cut static export time (#15702)
* ⚡️ perf(build): remove sitemap generation to cut static export time

The sitemap accounted for 772 of 827 prerendered pages, each fetching
marketplace data at build time. Static generation drops from 28.2s to
0.3s and total next build from ~59s to ~32s.

* Redirect legacy sitemap URLs to the landing site

* Redirect sitemap index to landing sitemap
2026-06-12 15:17:52 +08:00
Innei 7633c0e83f 🐛 fix(share): always serve desktop bundle for share routes (#15710) 2026-06-12 14:54:18 +08:00
LiJian 87b1f39c0f feat(skill): add delete/remove actions to settings/skill items (#15708)
*  feat: add delete/uninstall actions to settings/skill items

- LobehubSkillItem: show compact `...` dropdown in list mode for connected items with Disconnect action (revokes OAuth)
- KlavisSkillItem: show compact `...` dropdown in list mode for connected/pending servers with Remove action (true delete via removeKlavisServer)
- ConnectorDetail: add Delete button for custom (mcp) connectors; calls deleteConnector + notifies parent via onDelete
- SkillDetail / Page: thread onDelete callback so selecting null after deletion triggers auto-select of next item
- Locales: add tools.klavis.remove / removeConfirm.title / removeConfirm.desc in en-US, zh-CN, and default source

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(skill): gate Klavis remove by canEdit and clear selected after removal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(skill): show dropdown for all Klavis/Lobehub items in list mode

Previously, the ... button was gated behind `server` (Klavis) and
`isConnected` (LobehubSkill), so disconnected/never-connected items
showed no actions. Remove those guards so the dropdown always renders
in list mode. handleRemove/handleDisconnect now skip the server call
when no server instance exists and instead clear the selected item.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(skill): move delete/uninstall actions from list dropdown to detail panel

- Remove heavy ... dropdown from KlavisSkillItem / LobehubSkillItem list items
- Add danger Uninstall button to builtin-skill detail header (matches ConnectorDetail style)
- Add slim action bar with Uninstall to agent-skill detail panel
- All actions respect canEdit / canCreate permissions with confirmModal gating

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-12 12:38:22 +08:00
LiJian ca91d2d756 refactor: replace Segmented tabs with SearchBar in ProfileEditor; gate local-system injection (#15593)
* 🐛 fix: activator tool discovery for cloud-sandbox and local-system

- P0: Explicitly inject LocalSystemManifest when device gateway is configured
  (discoverable: isDesktop is always false on server, so it never enters
  the discovery loop. The explicit injection mirrors the canUseDevice guard.)

- P1: Skip CloudSandboxManifest when runtimeMode is not 'cloud'
  (resolveRuntimeMode unifies executionTarget='sandbox' and legacy
  chatConfig.runtimeEnv.runtimeMode paths, so agents with sandbox
  disabled correctly exclude the cloud-sandbox tool.)

Both fixes operate at the manifest-map build stage, consistently affecting
all downstream consumers (activator discovery, availableTools, etc.)

* 🐛 fix: remove cloud-sandbox manifest when runtime is not sandbox

The initial manifest seed via getEnabledPluginManifests includes
defaultToolIds (which contains lobe-cloud-sandbox), so the manifest
was already in toolManifestMap before the allowedBuiltinTools loop's
continue guard. This made lobe-cloud-sandbox activatable even when
sandbox was disabled.

Add a delete right after resolveRuntimeMode to cover both the
manifestMap seed and the allowedBuiltinTools loop in one place.

Co-authored-by: chatgpt-codex-connector[bot]

* ♻️ refactor: replace Segmented tabs with SearchBar in ProfileEditor tool dropdown

- PopoverContent: replace Segmented with SearchBar + internal client-side filtering (same pattern as ChatInput ActionBar)
- AgentTool: remove ~270 lines of duplicated installedTabItems useMemo; pass unified items
- AgentTool: add auto-cleanup for stale plugin identifiers in agent config
2026-06-12 11:18:44 +08:00
lobehubbot 553d3d8fc7 🔖 chore(release): release version v2.2.3 [skip ci] 2026-06-10 11:38:19 +00:00
871 changed files with 30908 additions and 9840 deletions
+193 -29
@@ -19,9 +19,23 @@ also run as full cloud automation. Every test session follows the same
four-step contract:
```
Step 0: Env + Auth → Step 1: Pick surface → Step 2: Run → Step 3: Structured report
Step -1: Plan approval → Step 0: Env + Auth → Step 1: Pick surface → Step 2: Run → Step 3: Structured report
```
## Step -1 — Plan approval for non-trivial tests
Skip directly to Step 0 if: the test is a single re-run after a fix, the plan
was already agreed on, or the user gave exact commands.
Otherwise, propose a test plan (surface, cases, expected evidence, assumptions)
and use the runtime structured question tool (`request_user_input` /
ask-user-question equivalent) with two fixed choices:
1. `开始执行 (Recommended)` ("Start execution"): the test plan looks fine; start executing
2. `先讨论下` ("Discuss first"): the plan has problems; discuss it first
Wait for the user's choice before proceeding.
## Step 0 — Environment setup + auth check (mandatory)
Step 0 is about getting the environment ready: **dependencies are healthy**
@@ -29,6 +43,36 @@ and **auth is green**. A test run that dies halfway on a missing dependency or
a login wall wastes the whole session — clear both gates BEFORE writing a
single test step.
### 0.0 Resolve the current test environment
Before starting a dev server, checking auth, opening agent-browser, or writing
test steps, print and confirm the current local test environment:
```bash
./.agents/skills/agent-testing/scripts/test-env.sh
```
This command is the source of truth for local test ports. It reads the current
shell plus `.env` files using the same precedence as `scripts/runWithEnv.mts`,
then prints:
- `APP_URL`
- `PORT`
- `SERVER_URL`
- `AUTH_TRUSTED_ORIGINS`
- `SPA_PORT`
- `MOBILE_SPA_PORT`
- `DESKTOP_PORT`
For commands that need these values, export them from the same resolver:
```bash
eval "$(./.agents/skills/agent-testing/scripts/test-env.sh --exports)"
```
Do not rely on hard-coded port tables. If the printed values do not match the
running dev server, fix/export the env first, then continue.
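A minimal sketch of the mismatch check, assuming the resolved `APP_URL` is expected to carry the same port as `PORT`. That relationship is an assumption, and the helper names `port_of_url` and `env_ports_match` are hypothetical; neither is part of `test-env.sh`:

```bash
# Hypothetical helper: print the port of a URL such as http://localhost:3010.
# Assumes a plain host:port authority; returns non-zero when no port is present.
port_of_url() {
  hostport="${1#*://}"        # drop the scheme
  hostport="${hostport%%/*}"  # drop the path
  case "$hostport" in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *) return 1 ;;
  esac
}

# Hypothetical check: succeed only when the URL's port equals the given port.
env_ports_match() {
  url_port="$(port_of_url "$1")" || return 1
  [ "$url_port" = "$2" ]
}

# Usage after exporting the resolver output:
#   env_ports_match "$APP_URL" "$PORT" || echo "APP_URL and PORT disagree" >&2
```

The check only compares the exported values with each other; confirming that a dev server is actually listening on that port is still a separate step.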
### 0.1 Dependencies are installed — root AND standalone apps
The root pnpm workspace does **NOT** cover every app: `pnpm-workspace.yaml`
@@ -38,9 +82,9 @@ lists `packages/**`, `e2e`, `apps/server`, and only `apps/desktop/src/main` —
refresh them, so install in every app the test will touch:
```bash
pnpm install # root workspace
(cd apps/desktop && pnpm install) # Electron surface; the subshell keeps cwd at the repo root
(cd apps/cli && pnpm install) # CLI surface
```
Symptom of a stale standalone install: the build/launch fails to resolve a
@@ -55,27 +99,129 @@ directory — a script launched while `cwd` is `apps/desktop` fails with
`No such file or directory`. Verify `pwd` is the repo root before launching
long-running scripts.
### 0.3 Auth is green
### 0.3 Init local dev env without `.env`
**Auth is the gate for all automated testing.**
For Web smoke against local code, start a **normal local dev environment**.
First check the repo root for `.env`:
- If `.env` exists, use the existing local configuration and start the dev
server normally.
- If `.env` does not exist, use the agent-testing env bootstrap.
Do not start the standalone e2e server as the product under test.
Use `scripts/init-dev-env.sh`. It follows the e2e setup pattern — Postgres,
migrations, auth/key-vault/S3 test env, seed user — but it is owned by this
skill and starts the repo's dev server (`pnpm run dev:next` / `bun run dev`),
not `e2e/scripts/setup.ts --start`. The script hard-blocks when root `.env`
exists, so it cannot accidentally override a user's local config. When `.env`
exists, do not call any `init-dev-env.sh` subcommand.
Decision flow:
```bash
./.agents/skills/agent-testing/scripts/setup-auth.sh status
if [[ -f .env ]]; then
bun run dev
else
./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
fi
```
| Surface | Mechanism | One-key path | Standard check |
| -------- | ------------------------------------------------- | ------------------------------ | ------------------------------ |
| CLI | OIDC Device Code Flow (`apps/cli/.lobehub-dev`) | `setup-auth.sh cli` | `setup-auth.sh status` |
| Web | better-auth cookie injection into `agent-browser` | `pbpaste \| setup-auth.sh web` | `setup-auth.sh web-verify` |
| Electron | App's own persistent login state | Log in once in the app | `app-probe.sh auth` |
| Bot | Native apps already logged in | — | per-platform screenshot |
Bootstrap flow when no `.env` exists:
```bash
# From repo root. Managed DB flow requires Docker Desktop.
./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
```
If using an existing Postgres instead of the managed Docker DB, set
`DATABASE_URL` and skip `setup-db`:
```bash
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
```
For backend-only checks, `dev-next` is available, but Web smoke needs the
full-stack `dev` command so Next can proxy the SPA HTML from Vite:
```bash
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
```
Useful subcommands:
```bash
./.agents/skills/agent-testing/scripts/init-dev-env.sh env # print exports
./.agents/skills/agent-testing/scripts/init-dev-env.sh write # write .records/env/agent-testing-dev.env
./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate # migrations only
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user # seed user + CLI API key
./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash # local QStash for workflow paths
./.agents/skills/agent-testing/scripts/init-dev-env.sh clean-db # remove managed DB container
```
Default script env:
- `APP_URL=http://localhost:3010`
- `DATABASE_URL=postgresql://postgres:postgres@localhost:5433/postgres`
- `DATABASE_DRIVER=node`
- `FEATURE_FLAGS=-agent_self_iteration` so local smoke does not require QStash
- Local QStash defaults (`QSTASH_URL`, `QSTASH_TOKEN`, signing keys) are exported;
run `init-dev-env.sh qstash` in a separate terminal when the path under test
triggers QStash/Workflow.
- `KEY_VAULTS_SECRET`, `AUTH_SECRET`, auth verification off
- S3 mock vars
- Managed DB container: `lobehub-agent-testing-postgres`
`seed-user` creates `agent-testing@lobehub.com` / `TestPassword123!` with
onboarding already completed, plus a local API key in
`.records/env/agent-testing-cli.env` for CLI automation. When running Cucumber
against this dev server, pass the same script env into the test process too;
Cucumber has its own `BeforeAll` seed path and it must see `DATABASE_URL`
instead of silently skipping setup:
```bash
cd e2e
# Only in the no-.env branch.
eval "$(../.agents/skills/agent-testing/scripts/init-dev-env.sh env)"
BASE_URL=http://localhost:3010 HEADLESS=true bun run test:smoke
```
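Because a missing `DATABASE_URL` makes the seed path skip silently, a guard before the smoke run can turn that into a loud failure. A sketch, assuming a hypothetical helper named `require_env` that is not part of the repo scripts:

```bash
# Hypothetical helper: fail when any named variable is unset or empty.
require_env() {
  missing=""
  for name in "$@"; do
    # Indirect lookup that also works in plain POSIX sh.
    eval "value=\${$name-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "missing env:$missing" >&2
    return 1
  fi
}

# Usage, after the eval of `init-dev-env.sh env` shown above:
#   require_env DATABASE_URL && BASE_URL=http://localhost:3010 HEADLESS=true bun run test:smoke
```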
### 0.4 Auth is green for the selected surface
**Auth is the gate for automated testing, but the gate is surface-scoped.**
Pick the intended surface first when it is already clear from the task, then
check only that surface. Do not block a Web test on CLI device-code auth or an
Electron login state unless the test spans those surfaces.
```bash
./.agents/skills/agent-testing/scripts/setup-auth.sh status --surface web
```
Use `status` with no `--surface` only for cross-surface test plans.
| Surface | Mechanism | One-key path | Standard check |
| -------- | --------------------------------------------- | ------------------------ | ----------------------------------------- |
| CLI | Seeded API key, device-code fallback | `setup-auth.sh cli-seed` | `setup-auth.sh status --surface cli` |
| Web | Seeded better-auth login into `agent-browser` | `setup-auth.sh web-seed` | `setup-auth.sh status --surface web` |
| Electron | App's own persistent login state | Log in once in the app | `setup-auth.sh status --surface electron` |
| Bot | Native apps already logged in | — | per-platform screenshot |
Login-state checks are standardized — do NOT hand-roll `window.__LOBE_STORES`
eval snippets; use `scripts/app-probe.sh auth` (returns `{ isSignedIn, userId }`,
works for Electron CDP and web sessions via `AB_TARGET`).
If `status` is not all green, fix auth first (the steps that need a human must be
requested from the user explicitly). Full background and failure modes:
For Web tests, the test surface is always `agent-browser --session lobehub-dev`.
Use `setup-auth.sh web-seed` first in the seeded local env. The user's normal
Chrome is only a source for copying the Cookie header when seed auth is not
available or `status --surface web` still fails. If Chrome is already logged in,
do not open a login page; verify agent-browser first, then request the Network
`Cookie:` header only if that verification fails. Full background and failure modes:
[references/auth.md](./references/auth.md).
## Step 1 — Pick the surface by change scope
@@ -148,17 +294,19 @@ Surface guides above carry the detailed workflows. Shared infrastructure:
All under `.agents/skills/agent-testing/scripts/`:
| Script | Usage |
| ------------------------- | ------------------------------------------------------------------------------ |
| `setup-auth.sh` | One-stop auth setup & status check (`status` / `cli` / `web`) |
| `app-probe.sh` | LobeHub app probes: `auth` / `route` / `ops` / `goto <path>` / `errors` |
| `record-gif.sh` | Frame-sequence → GIF for time-based behavior (streaming, timers, animations) |
| `report-init.sh` | Scaffold a structured test report (Step 3) |
| `electron-dev.sh` | Manage Electron dev env (start/stop/status/restart, CDP 9222) |
| `capture-app-window.sh` | Screenshot a specific app window (general; used by bot tests) |
| `record-app-screen.sh` | Record app screen (video + periodic screenshots) |
| `record-electron-demo.sh` | Record Electron app demo with ffmpeg |
| `agent-gateway/` | Gateway probe / dump / analyze tools |
| Script | Usage |
| ------------------------- | ---------------------------------------------------------------------------- |
| `test-env.sh` | Print/export the resolved local test env and ports |
| `setup-auth.sh` | One-stop auth setup & status check (`status` / `cli` / `web`) |
| `init-dev-env.sh` | Self-contained local dev env (`setup-db` / `seed-user` / `dev-next` / `dev`) |
| `app-probe.sh` | LobeHub app probes: `auth` / `route` / `ops` / `goto <path>` / `errors` |
| `record-gif.sh` | Frame-sequence → GIF for time-based behavior (streaming, timers, animations) |
| `report-init.sh` | Scaffold a structured test report (Step 3) |
| `electron-dev.sh` | Manage Electron dev env (start/stop/status/restart, CDP 9222) |
| `capture-app-window.sh` | Screenshot a specific app window (general; used by bot tests) |
| `record-app-screen.sh` | Record app screen (video + periodic screenshots) |
| `record-electron-demo.sh` | Record Electron app demo with ffmpeg |
| `agent-gateway/` | Gateway probe / dump / analyze tools |
`app-probe.sh` is the LobeHub-specific fast path into app state — auth check,
current route, running operations, and `goto <path>` quick navigation
@@ -174,12 +322,13 @@ not a chat-only summary. Scaffold it up front and fill it as you test:
```bash
DIR=$(./.agents/skills/agent-testing/scripts/report-init.sh my-feature "Verify my feature")
# ... test, saving screenshots / CLI transcripts into $DIR/assets/ ...
# fill $DIR/report.md (case table, embedded evidence, verdict) and $DIR/result.json
# fill $DIR/report.md (scope, case table with inline evidence, verdict, score) and $DIR/result.json
```
Reports live in `.records/reports/<timestamp>-<slug>/` (gitignored): `report.md`
(human-readable, with embedded screenshots), `result.json` (machine-readable
pass/fail + score), `assets/` (evidence). Format spec and evidence rules:
(human-readable, with screenshots/GIFs embedded directly in the case table),
`result.json` (machine-readable pass/fail + score), `assets/` (evidence).
Format spec and evidence rules:
[references/report.md](./references/report.md).
Two hard rules worth front-loading:
@@ -187,6 +336,21 @@ Two hard rules worth front-loading:
- **Report language = the user's conversation language.** Write the ENTIRE
`report.md` (headings included) in the language the user is conversing in —
no mixed English. `result.json` keys/status values stay English.
- **The case table is the main reading surface.** Prefer the compact
`# | case | result | key observation | evidence` shape and embed the
screenshot/GIF in the evidence cell. Use separate evidence sections only for
long CLI transcripts, HAR summaries, or supplemental detail.
- **Visual evidence must render inline.** Screenshots and GIFs in `report.md`
must use Markdown image syntax like `![case 1](assets/case1.png)`. Do not
use bare file paths, Markdown links, or local file links as the primary
visual evidence; those make the report unreadable without opening each asset.
- **Final replies must include visual evidence links.** When a run includes UI
screenshots or GIFs, include the report directory and the most important
visual artifacts in the final chat response. Each item must include a stable
label, an evidence caption describing the observed UI outcome, and a
repo-relative path, for example:
`[Image #1 - error toast shows provider auth failure](<report-dir>/assets/foo.png)`.
Use repo-relative paths, not absolute paths.
- **Time-based behavior needs a GIF, not a screenshot.** If a case asserts
change over time (streaming output, a ticking timer, loading states,
animations), record it with `scripts/record-gif.sh` and embed the GIF —
+26 -16
@@ -13,17 +13,18 @@ flakiness.
## Prerequisites
| Requirement | Details |
| ------------ | --------------------------------------------------------------------------------- |
| Dev server | `localhost:3010` — see [../references/dev-server.md](../references/dev-server.md) |
| Requirement | Details |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| Dev server | `localhost:3010` — see [../references/dev-server.md](../references/dev-server.md) |
| CLI source | `apps/cli/` — runs from source, no rebuild; standalone `node_modules` — run `pnpm install` inside `apps/cli/` (root install does not cover it) |
| CLI dev mode | `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated credentials |
| Auth | Device Code Flow login — see [../references/auth.md](../references/auth.md) |
| CLI dev mode | `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated settings |
| Auth | Seeded API key first; Device Code Flow only as fallback — see [../references/auth.md](../references/auth.md) |
All CLI dev commands run from `apps/cli/`. Subsequent examples use `$CLI`:
```bash
CLI="LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts"
source ../../.records/env/agent-testing-cli.env
CLI="bun src/index.ts"
```
## Workflow
@@ -39,14 +40,23 @@ check, start, and restart commands. Server-side code changes require a restart.
./.agents/skills/agent-testing/scripts/setup-auth.sh status
```
If the CLI is not logged in, **the user must run the login themselves**
(interactive browser authorization):
If the CLI is not ready in the seeded local environment:
```bash
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
source .records/env/agent-testing-cli.env
./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
```
If the target environment is not seeded, use the interactive fallback:
```bash
cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
```
Credentials persist in `apps/cli/.lobehub-dev/`. Details:
Seeded API-key auth does not store credentials. It writes local settings under
`$HOME/.lobehub-dev` and requires the generated env file to be sourced before
CLI commands. Details:
[../references/auth.md](../references/auth.md).
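Since the seeded flow depends on the env file being sourced, a small wrapper can fail early when the file is absent. A sketch, assuming a hypothetical function `load_cli_env`; only the default path comes from this guide:

```bash
# Hypothetical helper: source the seeded CLI env file, or fail with a hint.
load_cli_env() {
  env_file="${1:-.records/env/agent-testing-cli.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing $env_file; run init-dev-env.sh seed-user first" >&2
    return 1
  fi
  # set -a exports every variable the file assigns, whether or not it uses `export`.
  set -a
  . "$env_file"
  set +a
}

# Usage from the repo root:
#   load_cli_env && (cd apps/cli && bun src/index.ts --help)
```

Exporting with `set -a` is a design choice for the sketch: it keeps the values visible to child processes such as the CLI even if the generated file uses plain assignments.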
### Step 3 — Test with CLI commands
@@ -133,10 +143,10 @@ $CLI provider test <provider-id>
## Troubleshooting
| Issue | Solution |
| --------------------------- | ----------------------------------------------- |
| `No authentication found` | Run `login --server http://localhost:3010` |
| `UNAUTHORIZED` on API calls | Token expired; re-run login |
| `ECONNREFUSED` | Dev server not running — see dev-server.md |
| CLI shows old data/behavior | Server needs restart to pick up code changes |
| Login opens wrong server | Must use `--server` flag (env var doesn't work) |
| Issue | Solution |
| --------------------------- | ------------------------------------------------------------------------------------------------------ |
| `No authentication found` | Source `.records/env/agent-testing-cli.env`, or run device-code `login --server http://localhost:3010` |
| `UNAUTHORIZED` on API calls | Re-run `init-dev-env.sh seed-user` and re-source the env file; for device-code fallback, re-run login |
| `ECONNREFUSED` | Dev server not running — see dev-server.md |
| CLI shows old data/behavior | Server needs restart to pick up code changes |
| Login opens wrong server | Must use `--server` flag (env var doesn't work) |
+92 -49
@@ -1,37 +1,72 @@
# Auth Setup for Local Agent Testing
**Auth is the gate for all automated testing.** Prepare and verify it before
writing any test step. The one-stop entry point is:
**Auth is the gate for all automated testing.** Complete
[Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) first so
`SERVER_URL` and ports are resolved, then verify auth before writing any test
step.
Initialize helpers first:
```bash
SCRIPT=".agents/skills/agent-testing/scripts/setup-auth.sh"
$SCRIPT status # check server + CLI + web auth readiness
$SCRIPT cli # interactive CLI device-code login (must be run by the user)
pbpaste | $SCRIPT web # inject a copied Cookie header into the agent-browser session
$SCRIPT web-verify # live-check that the agent-browser session is authenticated
SCRIPT="./.agents/skills/agent-testing/scripts/setup-auth.sh"
TEST_ENV="./.agents/skills/agent-testing/scripts/test-env.sh"
eval "$($TEST_ENV --exports)"
```
`SERVER_URL` defaults to `http://localhost:3010` (this repo's `dev:next` port).
Override it when testing against another server (e.g. `SERVER_URL=http://localhost:3011`
in the cloud repo).
Quick reference after initialization:
| Command | Purpose |
| ------------------------------ | -------------------------------------------------- |
| `$SCRIPT status` | Check all surfaces (server + CLI + web + Electron) |
| `$SCRIPT status --surface web` | Check only the Web surface gate |
| `$SCRIPT cli-seed` | Configure CLI API-key auth from the seeded key |
| `$SCRIPT cli` | Interactive CLI device-code login (user must run) |
| `$SCRIPT open-chrome` | Open Chrome at `SERVER_URL` with DevTools |
| `$SCRIPT web-seed` | Sign in the seeded user and inject cookies |
| `pbpaste \| $SCRIPT web` | Inject a copied Cookie header into agent-browser |
| `$SCRIPT web-verify` | Live-check agent-browser session auth |
Use `localhost` for Web auth; better-auth cookies are stored for `localhost`,
not `127.0.0.1`.
## Per-surface overview
| Surface | Mechanism | Persistence | Human interaction |
| -------- | ---------------------------------------- | ----------------------------------------------------------------- | ----------------------------------------------- |
| CLI | OIDC Device Code Flow | `apps/cli/.lobehub-dev/settings.json` | Yes — browser authorization, every token expiry |
| Web | better-auth cookie injection | `~/.lobehub-agent-testing/web-state.json` + agent-browser session | Copy the Cookie header once per token rotation |
| Electron | App's own login state | Electron user-data dir | Log in once manually in the app |
| Bot | Native apps (Discord/WeChat/…) logged in | Each app's own session | Once per app |
| Surface | Mechanism | Persistence | Human interaction |
| -------- | ---------------------------------------- | ----------------------------------------------------------------- | ---------------------------------------------- |
| CLI | Seeded API key or OIDC Device Code Flow | `.records/env/agent-testing-cli.env` + `$HOME/.lobehub-dev` | No for seed path; yes for device-code fallback |
| Web | Seeded better-auth login or cookie copy | `~/.lobehub-agent-testing/web-state.json` + agent-browser session | No for seed path; copy cookie only as fallback |
| Electron | App's own login state | Electron user-data dir | Log in once manually in the app |
| Bot | Native apps (Discord/WeChat/…) logged in | Each app's own session | Once per app |
## CLI — Device Code Flow
## CLI — Seeded API key
For the self-contained no-root-`.env` dev environment, seed the baseline user
and API key once:
```bash
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
source .records/env/agent-testing-cli.env
./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
```
The seed step writes `LOBE_API_KEY` for humans and maps it to the CLI's current
auth variable, `LOBEHUB_CLI_API_KEY`. It also sets `LOBEHUB_SERVER` so CLI
commands hit the local server without needing a stored device-code token.
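Before relying on a custom `AGENT_TESTING_API_KEY`, it can help to check it against the format the seed script enforces (`sk-lh-` followed by 16 lowercase letters or digits, per `validateApiKeyFormat` in `init-dev-env.sh`). The following is a minimal sketch; the helper name `is_valid_lobe_api_key` is hypothetical and not part of the repo:

```shell
# Hypothetical helper (not in the repo): mirrors the seed script's
# /^sk-lh-[\da-z]{16}$/ check using portable shell patterns.
is_valid_lobe_api_key() {
  local LC_ALL=C            # keep [0-9a-z] ranges byte-based
  local key="${1:-}" rest
  case "$key" in
    sk-lh-*) rest="${key#sk-lh-}" ;;
    *) return 1 ;;
  esac
  # Exactly 16 characters after the prefix.
  [ "${#rest}" -eq 16 ] || return 1
  # Reject anything outside lowercase letters and digits.
  case "$rest" in
    *[!0-9a-z]*) return 1 ;;
  esac
  return 0
}

# Usage: is_valid_lobe_api_key "${LOBE_API_KEY:-}" || echo "unexpected key format" >&2
```

The default key `sk-lh-agenttesting0001` passes this check; a key with uppercase characters or a different length does not.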
Use this for automated CLI verification:
```bash
cd apps/cli
source ../../.records/env/agent-testing-cli.env
bun src/index.ts <command>
```
## CLI — Device Code Flow fallback
Use device-code login only when testing against a non-seeded environment.
Credentials are isolated from the user's real CLI config via
`LOBEHUB_CLI_HOME=.lobehub-dev` (kept inside `apps/cli/`, gitignored).
Login requires interactive browser authorization, so **the user must run it
themselves** (e.g. via the `!` prefix in Claude Code):
`LOBEHUB_CLI_HOME=.lobehub-dev`, which the current CLI stores under
`$HOME/.lobehub-dev`.
```bash
cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
@@ -40,10 +75,30 @@ cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server htt
- The `--server` flag is required — an env var does NOT work and login will hit
the wrong server without it.
- Check state without logging in: `setup-auth.sh status` (verifies
`settings.json` exists and `serverUrl` matches).
`LOBEHUB_CLI_API_KEY` when present, otherwise checks the stored server URL).
- `UNAUTHORIZED` on API calls means the token expired — re-run login.
## Web — better-auth cookie injection (agent-browser)
## Web — seeded better-auth login
The Web test surface is `agent-browser --session lobehub-dev`. The user's
ordinary Chrome is only a cookie source; Chrome screenshots, Chrome Network
records, and Chrome logged-in state do not prove the agent-browser test session
is authenticated.
For the seeded local dev environment, use the automatic path:
```bash
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
```
`web-seed` posts the seeded email/password to
`/api/auth/sign-in/email`, stores the returned cookie jar under
`~/.lobehub-agent-testing/`, converts it to Playwright `storageState`, loads it
into the `agent-browser` session, and verifies the session does not land on
`/signin`.
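The conversion step can be pictured as follows. This is only a sketch of the general shape of a Playwright `storageState` document (a `cookies` array plus an `origins` array); the helper name `cookies_to_storage_state` is hypothetical, the attribute values (`httpOnly`, `secure`, `sameSite`, `expires`) are assumptions for a local HTTP dev server, and the real `setup-auth.sh` may build the file differently:

```shell
# Hypothetical helper (not in the repo): turn "name=value" lines on stdin
# into a Playwright storageState document scoped to localhost.
# Assumes cookie values contain no double quotes or backslashes.
cookies_to_storage_state() {
  local first=1 line name value
  printf '{"cookies":['
  while IFS= read -r line; do
    [ -n "$line" ] || continue
    name="${line%%=*}"   # text before the first "="
    value="${line#*=}"   # everything after the first "="
    [ "$first" -eq 1 ] || printf ','
    first=0
    printf '{"name":"%s","value":"%s","domain":"localhost","path":"/","expires":-1,"httpOnly":true,"secure":false,"sameSite":"Lax"}' \
      "$name" "$value"
  done
  printf '],"origins":[]}\n'
}

# Usage: cookies_to_storage_state < cookie-pairs.txt > web-state.json
```

The `domain` is written as the literal `localhost`, without a leading dot, which matches the domain requirement listed under the common failure modes.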
## Web — manual cookie injection fallback
`agent-browser --headed` on macOS often creates the Chromium window off-screen —
the user can't see or interact with it, so manual login inside the agent-browser
@@ -53,31 +108,19 @@ user's own logged-in Chrome and inject it as a Playwright-style state file.
Do **not** use this on production URLs — only local dev. Treat the cookie as a
secret: don't paste it into shared logs, PRs, or commit it anywhere.
### One-key path
### Web — decision flow
1. Ask the user to copy the Cookie header **from a Network request, NOT
`document.cookie`** (`document.cookie` cannot see HttpOnly cookies, which is
exactly where better-auth puts its session):
- Open the logged-in tab (`http://localhost:<port>/…`) in Chrome.
- `Cmd+Option+I` → **Network** tab → refresh → click any same-origin request.
- Under **Request Headers**, right-click the `Cookie:` line → **Copy value**.
2. Inject and verify in one shot:
```bash
pbpaste | ./.agents/skills/agent-testing/scripts/setup-auth.sh web
```
The script filters the header down to the better-auth cookies
(`better-auth.session_token`, `better-auth.state`), builds the Playwright
`storageState` JSON, loads it into the `agent-browser` session (default name
`lobehub-dev`), opens `SERVER_URL`, and asserts the URL is not `/signin`.
1. `$SCRIPT status --surface web` — green? Start testing. Do not ask for a Cookie header.
2. Not green and using the seeded local env → `$SCRIPT web-seed`.
3. Still not green or not using the seed env → `$SCRIPT open-chrome` opens Chrome at `SERVER_URL` with DevTools.
4. User copies the `Cookie:` header from Network tab → any same-origin request → Request Headers → right-click `Cookie:` → **Copy value**. Must be from Network, NOT `document.cookie` (HttpOnly cookies are invisible to `document.cookie`).
5. `pbpaste | $SCRIPT web` — filters to better-auth cookies (`session_token`, `session_data`, `state`), builds Playwright `storageState`, loads it into the `agent-browser` session (`lobehub-dev`), opens `SERVER_URL`, and asserts the URL is not `/signin`.
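The filtering in step 5 can be sketched roughly as below. The helper name `filter_better_auth_cookies` is hypothetical, and the `better-auth.` name prefix for all three cookies is an assumption based on the cookie names mentioned in this document; the actual parser in `setup-auth.sh` is not shown here and may differ:

```shell
# Hypothetical helper (not in the repo): keep only better-auth cookies from a
# raw Cookie header value read on stdin, one "name=value" pair per line.
# Exit status is grep's: non-zero when no better-auth cookie was found.
filter_better_auth_cookies() {
  tr ';' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -E '^better-auth\.(session_token|session_data|state)='
}

# Usage (macOS): pbpaste | filter_better_auth_cookies || echo "no better-auth cookies found" >&2
```

Splitting on `;` is safe for a raw header because cookie values cannot contain a semicolon. A URL-decoded or otherwise edited value may break this, which is why the raw `Cookie:` header should be kept as-is.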
### Using the authenticated session
```bash
agent-browser --session lobehub-dev open "http://localhost:3010/"
agent-browser --session lobehub-dev open "$SERVER_URL/"
agent-browser --session lobehub-dev snapshot -i | head -20
# Look for the user's avatar/name in the sidebar, or absence of the signin form.
```
### Notes
@@ -90,12 +133,12 @@ agent-browser --session lobehub-dev snapshot -i | head -20
### Common failure modes
| Symptom | Cause | Fix |
| --------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------- |
| Still redirects to `/signin` after injection | User pasted from `document.cookie` → missed HttpOnly session | Re-pull from Network request Headers, not console |
| Script reports `no better-auth cookies found` | Separator wrong, or user pasted URL-decoded value | Keep the raw `Cookie:` header as-is |
| Login works briefly then expires | `better-auth.session_token` rotated (user logged out / signed in again) | Re-copy and re-inject |
| Domain mismatch | Cookie domain must be `localhost` literally, no leading dot for local dev | — |
| Symptom | Cause | Fix |
| --------------------------------------------- | ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| Still redirects to `/signin` after injection | User pasted from `document.cookie` → missed HttpOnly session | Re-pull from Network request Headers, not console |
| Script reports `no better-auth cookies found` | User pasted the wrong value, or the cookie parser regressed | Keep the raw `Cookie:` header as-is; run `scripts/setup-auth.test.sh` if the input looks valid |
| Login works briefly then expires | `better-auth.session_token` rotated (user logged out / signed in again) | Re-copy and re-inject |
| Domain mismatch | Cookie domain must be `localhost` literally, no leading dot for local dev | — |
## Electron
@@ -3,33 +3,71 @@
Single source of truth for starting / restarting the backend that all test
surfaces (CLI, Electron, Web) hit.
## Resolve ports first
Run `test-env.sh` as described in
[SKILL.md Step 0.0](../SKILL.md#00-resolve-the-current-test-environment)
before starting or probing any local test surface.
## Ports & modes
| Command | What it runs | Port |
| ------------------- | --------------------------------------------------------- | --------------------------------- |
| `pnpm run dev:next` | Next.js backend (API + auth) | `3010` |
| `bun run dev` | Full-stack (Next.js + Vite SPA, via `devStartupSequence`) | `3010` (API) + SPA |
| `bun run dev:spa` | Vite SPA only, proxies API to `3010` | `9876` (prints a Debug Proxy URL) |
| Command | What it runs | Port source |
| ------------------- | --------------------------------------------------------- | ------------------- |
| `pnpm run dev:next` | Next.js backend (API + auth) | `PORT` |
| `bun run dev` | Full-stack (Next.js + Vite SPA, via `devStartupSequence`) | `PORT` + `SPA_PORT` |
| `bun run dev:spa` | Vite SPA only, proxies API to `PORT` | `SPA_PORT` |
In the **cloud repo** (where this repo is the `lobehub/` submodule) the dev
server conventionally runs on `3011` — set `SERVER_URL=http://localhost:3011`
for the scripts in this skill when testing there.
In the **cloud repo** (where this repo is the `lobehub/` submodule), local
worktree names map to fallback defaults only when `.env` and shell env do not
provide values:
| Workspace directory | Default `SERVER_URL` |
| ------------------- | -------------------------------- |
| `lobehub` | `http://localhost:3010` |
| `lobehub-cloud` | `http://localhost:3020` |
| `lobehub-cloud-1` | `http://localhost:3021` |
| `lobehub-cloud-N` | `http://localhost:$((3020 + N))` |
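The fallback mapping in this table can be expressed as a small function. This is a sketch under the assumption that the directory suffix is a plain decimal number; `default_server_url` is a hypothetical name, and the real logic in `test-env.sh` is not shown in this diff:

```shell
# Hypothetical helper (not in the repo): map a workspace directory name to
# its fallback SERVER_URL, following the table above.
default_server_url() {
  local name="${1:-}" suffix
  case "$name" in
    lobehub) echo "http://localhost:3010" ;;
    lobehub-cloud) echo "http://localhost:3020" ;;
    lobehub-cloud-*)
      suffix="${name#lobehub-cloud-}"
      # Accept only decimal digits without a leading zero (avoids octal parsing).
      case "$suffix" in
        ''|0?*|*[!0-9]*) return 1 ;;
      esac
      echo "http://localhost:$((3020 + suffix))"
      ;;
    *) return 1 ;;
  esac
}

# Usage: SERVER_URL="${SERVER_URL:-$(default_server_url "$(basename "$PWD")")}"
```

The usage line keeps an already exported `SERVER_URL` and only falls back to the directory-based default when none is set, matching the precedence described for these defaults.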
`test-env.sh` and `setup-auth.sh` both use the resolved env first and these
worktree defaults only as fallback. Treat the dev-server terminal output as the
final source of truth when testing a non-standard port, then export it for every
agent-testing command:
```bash
export SERVER_URL=http://localhost:<port-from-dev-output>
```
## Health check
```bash
curl -s -o /dev/null -w '%{http_code}' http://localhost:3010/
curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/"
```
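The probe prints only a status code, so a script needs to interpret it. A minimal sketch follows; `classify_health` is a hypothetical helper, and treating redirects as healthy (for example a redirect to `/signin`) is an assumption rather than documented behavior:

```shell
# Hypothetical helper (not in the repo): classify the status code printed by
# the curl probe. curl prints 000 when it received no HTTP response at all,
# for example when the connection is refused.
classify_health() {
  case "${1:-}" in
    000|'') echo "down" ;;
    2??|3??) echo "up" ;;
    *) echo "error" ;;
  esac
}

# Usage:
#   classify_health "$(curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/")"
```

A result of `down` corresponds to the `ECONNREFUSED` case in the troubleshooting table: the server is not running and needs to be started.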
## Start / restart
```bash
# Start (from repo root)
# Start backend only.
# With root .env: use the existing local config.
pnpm run dev:next
# Without root .env: use the self-contained agent-testing env.
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
# Full-stack SPA + backend. Required for Web smoke.
# With root .env:
bun run dev
# Without root .env:
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
# Local QStash. Run in a separate terminal only when testing workflow paths.
./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash
# Restart — required to pick up server-side code changes
lsof -ti:3010 | xargs kill
lsof -ti:"$PORT" | xargs kill
pnpm run dev:next
# or, when no root .env exists:
# ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
```
## When a server restart is needed
@@ -48,8 +86,13 @@ in doubt.
## Troubleshooting
| Issue | Solution |
| ------------------------- | ------------------------------------------------------- |
| `ECONNREFUSED` | Server not running — start it |
| `EADDRINUSE` on the port | Already running — `lsof -ti:<port> \| xargs kill` first |
| Stale data / old behavior | Server needs a restart to pick up code changes |
| Issue | Solution |
| ------------------------- | --------------------------------------------------------------------------------------------- |
| `ECONNREFUSED` | Server not running — start it |
| `EADDRINUSE` on the port | Already running — `lsof -ti:<port> \| xargs kill` first |
| Stale data / old behavior | Server needs a restart to pick up code changes |
| QStash workflow failures | Start `init-dev-env.sh qstash` and make sure the dev server inherited the script's `QSTASH_*` env |
Marketplace/community endpoints are not part of the local agent-testing auth
gate. Do not block local product-chain verification on marketplace API auth
unless the change explicitly targets marketplace behavior.
@@ -11,7 +11,7 @@ output):
```
.records/reports/<YYYYMMDD-HHMMSS>-<slug>/
├── report.md # human-readable report (embedded screenshots, case table, verdict)
├── report.md # human-readable report (case table with inline screenshots, verdict)
├── result.json # machine-readable results (pass/fail counts, score)
└── assets/ # evidence: screenshots, HAR files, CLI transcripts
```
@@ -25,13 +25,16 @@ output):
```
The script creates the directory, pre-fills branch / commit / date in both
files, and prints the directory path.
files, and prints the directory path. The scaffold uses the compact report
shape below; translate its headings and table labels to the user's language
before delivery if needed.
2. **Collect evidence as you test** — every asserted behavior gets one evidence
item in `$DIR/assets/`:
- UI (static state): `agent-browser screenshot` or `capture-app-window.sh`,
then **verify the screenshot with the Read tool before citing it** —
never cite an image you haven't looked at.
- UI (time-based behavior): **screenshot vs GIF is a judgment you must
make per case.** If the assertion is about change over time — streaming
output, a ticking timer, loading/progress states, animations,
@@ -48,33 +51,91 @@ output):
Embed it like an image: `![case 2](assets/case2-streaming.gif)`. Verify
at least the first/last frames visually (Read the GIF) before citing.
- CLI: exact command + trimmed output (`$CLI task list | tee "$DIR/assets/task-list.txt"`).
- Network: `agent-browser network requests` dumps or HAR files.
3. **Fill `report.md` as you go** — don't reconstruct from memory at the end.
The primary evidence belongs in the case table itself: each row should pair
the assertion with the screenshot/GIF or non-visual artifact that proves it,
so readers can scan the result without jumping between sections. UI evidence
must render inline with Markdown image syntax; a plain link or file path is
not acceptable as primary visual evidence.
4. **Set the verdict** in both `report.md` and `result.json`, then link the
report directory in your final answer to the user.
report directory in your final answer to the user. If UI evidence exists,
list the key screenshot/GIF links in the final chat response. Use Markdown
link text as the evidence caption, for example:
`[Image #1 - observed outcome](<report-dir>/assets/case1.png)`.
## Report language (hard rule)
**`report.md` MUST be written in the language the user is conversing in** —
the whole file, headings included. If the conversation is in Chinese, the
report is in Chinese; do not mix English prose into it. The scaffold's English
headings are placeholders — translate them when filling. Exceptions that stay
as-is: code/commands, identifiers, log excerpts, and `result.json` (its keys
and status values are machine-read and stay English; the `title` and case
`name` fields follow the user's language).
report is in Chinese; do not mix English prose into it. The scaffold headings
are placeholders — translate them when filling if the user is not conversing in
the scaffold language. Exceptions that stay as-is: code/commands, identifiers,
log excerpts, and `result.json` (its keys and status values are machine-read
and stay English; the `title` and case `name` fields follow the user's
language).
## report.md sections
| Section | Content |
| --------------- | ---------------------------------------------------------------------------------- |
| **Scope** | What changed / what is being verified; branch + commit |
| **Environment** | Server URL, surfaces used (cli / electron / web / bot), relevant versions |
| **Cases** | Table: `# \| case \| surface \| steps \| expected \| actual \| status \| evidence` |
| **Evidence** | Embedded screenshots/GIFs (`![case 1](assets/case1.png)`), fenced CLI transcripts |
| **Verdict** | Pass/fail/blocked counts, optional 0–100 score, open issues / follow-ups |
Default report shape:
| Section | Content |
| ---------------- | -------------------------------------------------------------------------------------------- |
| **Scope** | What changed / what is being verified; branch, commit, date, surface, entry URL/page, focus |
| **Cases** | Compact table: `# \| Case \| Result \| Key observation \| Evidence` |
| **Verdict** | Overall verdict first (`pass` / `partial` / `fail`), then the concise reasons and follow-ups |
| **Verification** | Commands or automated checks run in this session, with trimmed results |
| **Score** | Pass/fail/blocked counts, optional 0–100 score |
The case table is the main reading surface. Prefer one clear row per user
scenario or regression assertion, and put the screenshot/GIF directly in the
`Evidence` cell:
```markdown
| # | Case | Result | Key observation | Evidence |
| --- | ------------------------ | ------ | ----------------------------------------------------------------- | ------------------------------------------------ |
| 1 | Create a new page | pass | Title and body persisted after refresh | ![created page](assets/new-page-created.png) |
| 2 | Respect requested length | fail | Requested about 600 Chinese characters; final body was about 1286 | ![final article](assets/write-article-final.png) |
```
## Inline visual evidence
Screenshots and GIFs must be embedded so the report shows the image inline:
```markdown
![case 1 result](assets/case1-result.png)
![streaming response](assets/case2-streaming.gif)
```
Do **not** use these as the primary evidence for UI cases:
```markdown
[case 1 result](assets/case1-result.png)
assets/case1-result.png
file:///tmp/case1-result.png
```
Links are acceptable for non-visual artifacts such as CLI transcripts, HAR
files, or long logs. For videos, embed a representative screenshot/GIF inline in
the case row and link the full video as supplemental evidence.
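A quick way to catch rows that break the inline-evidence rule is to scan the case table for numbered rows without an image embed. This is a sketch only: `rows_missing_inline_image` is a hypothetical helper, it assumes case rows start with `| <number> |`, and it is only meaningful for reports whose cases are all UI cases, since non-visual cases may legitimately use plain links:

```shell
# Hypothetical check (not in the repo): read a report on stdin and print the
# numbered case-table rows that contain no inline image (![alt](path)).
# Exit status follows grep: 0 when offending rows were printed, 1 when none.
rows_missing_inline_image() {
  grep -E '^\| *[0-9]+ *\|' \
    | grep -v -E '!\[[^]]*]\([^)]+\)'
}

# Usage: rows_missing_inline_image < "$DIR/report.md" && echo "fix the rows above" >&2
```

A row whose evidence cell holds `[case 1 result](assets/case1-result.png)` is reported, while the same cell written as `![case 1 result](assets/case1-result.png)` is not.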
Avoid the old wide table with separate `steps`, `expected`, and `actual`
columns unless the test is purely non-visual and truly needs that breakdown.
For UI reports, those columns make screenshot-backed reading harder. Put
procedural detail in the row's key observation only when it changes the
interpretation of the result.
Use an extra evidence/detail section only when the inline table cannot carry
the material cleanly, such as long CLI transcripts, HAR summaries, or multiple
screenshots for one case. In that situation, keep the table evidence cell as an
inline visual proof for UI cases or a concise link for non-visual artifacts,
then put the longer material under `Verification` or a brief
`Additional Evidence` section.
Status values: `pass` / `fail` / `blocked` (couldn't run — e.g. auth or env
missing; a blocked case is not a pass).
@@ -115,7 +176,8 @@ word the user reads first: `pass`, `fail`, or `partial`.
## Rules
- **No evidence, no claim** — every `pass`/`fail` in the case table must link
at least one asset.
at least one asset. UI cases must inline-embed their primary screenshot/GIF;
non-visual CLI/network cases may link transcripts, HAR files, or logs.
- **Screenshots must be visually verified** with the Read tool before being
cited.
- **Report failures faithfully** — a failing case with clear evidence is a good
+407
@@ -0,0 +1,407 @@
#!/usr/bin/env bash
# init-dev-env.sh — self-contained local dev env for agent testing.
#
# This script initializes the env needed to run LobeHub's normal local dev
# server without depending on a root .env file. It follows the same shape as
# the e2e bootstrap (Postgres + migrations + auth/key-vault/S3 test env), but
# starts the repo's dev server, not the standalone e2e server.
#
# Guardrail: if repo-root .env exists, every non-help command exits immediately.
# Existing local config always wins.
#
# Usage:
# init-dev-env.sh env # print shell exports
# init-dev-env.sh write [file] # write a source-able env file
# init-dev-env.sh setup-db # start local Postgres and run migrations
# init-dev-env.sh migrate # run DB migrations against the configured DB
# init-dev-env.sh seed-user # seed the baseline test user + CLI API key
# init-dev-env.sh qstash # run local Upstash QStash dev server
# init-dev-env.sh dev-next # exec `pnpm run dev:next` with this env
# init-dev-env.sh dev # exec `bun run dev` with this env
# init-dev-env.sh clean-db # remove the managed Postgres container
#
# Overrides:
# SERVER_PORT=3010 DB_PORT=5433 DB_CONTAINER=lobehub-agent-testing-postgres QSTASH_DEV_PORT=8080
set -euo pipefail
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
ROOT_ENV_FILE="$REPO_ROOT/.env"
SERVER_PORT="${SERVER_PORT:-3010}"
DB_PORT="${DB_PORT:-5433}"
DB_CONTAINER="${DB_CONTAINER:-lobehub-agent-testing-postgres}"
DATABASE_URL="${DATABASE_URL:-postgresql://postgres:postgres@localhost:${DB_PORT}/postgres}"
ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-dev.env"
CLI_ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-cli.env"
AGENT_TESTING_API_KEY="${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}"
QSTASH_DEV_PORT="${QSTASH_DEV_PORT:-8080}"
QSTASH_LOCAL_TOKEN="${QSTASH_LOCAL_TOKEN:-eyJVc2VySUQiOiJkZWZhdWx0VXNlciIsIlBhc3N3b3JkIjoiZGVmYXVsdFBhc3N3b3JkIn0=}"
QSTASH_LOCAL_CURRENT_SIGNING_KEY="${QSTASH_LOCAL_CURRENT_SIGNING_KEY:-sig_7kYjw48mhY7kAjqNGcy6cr29RJ6r}"
QSTASH_LOCAL_NEXT_SIGNING_KEY="${QSTASH_LOCAL_NEXT_SIGNING_KEY:-sig_5ZB6DVzB1wjE8S6rZ7eenA8Pdnhs}"
ok() { printf ' \033[32m✔\033[0m %s\n' "$1"; }
bad() { printf ' \033[31m✘\033[0m %s\n' "$1"; }
note() { printf ' %s\n' "$1"; }
guard_no_root_env() {
if [[ -f "$ROOT_ENV_FILE" ]]; then
bad "root .env exists: $ROOT_ENV_FILE"
note "Use the existing local configuration instead of init-dev-env.sh."
note "Start normally from repo root, e.g. pnpm run dev:next or bun run dev."
exit 1
fi
}
apply_env() {
export APP_URL="${APP_URL:-http://localhost:${SERVER_PORT}}"
export AUTH_EMAIL_VERIFICATION="${AUTH_EMAIL_VERIFICATION:-0}"
export AUTH_SECRET="${AUTH_SECRET:-agent-testing-local-auth-secret-32chars}"
export DATABASE_DRIVER="${DATABASE_DRIVER:-node}"
export DATABASE_URL
export FEATURE_FLAGS="${FEATURE_FLAGS:--agent_self_iteration}"
export KEY_VAULTS_SECRET="${KEY_VAULTS_SECRET:-r2gbBPKyJ8ZRKCLKt+I3DImfcL+wGxaQyRC56xtm9Uk=}"
export NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION="${NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION:-0}"
export NODE_OPTIONS="${NODE_OPTIONS:---max-old-space-size=6144}"
export PORT="${PORT:-$SERVER_PORT}"
export QSTASH_CURRENT_SIGNING_KEY="${QSTASH_CURRENT_SIGNING_KEY:-$QSTASH_LOCAL_CURRENT_SIGNING_KEY}"
export QSTASH_DEV_PORT
export QSTASH_NEXT_SIGNING_KEY="${QSTASH_NEXT_SIGNING_KEY:-$QSTASH_LOCAL_NEXT_SIGNING_KEY}"
export QSTASH_TOKEN="${QSTASH_TOKEN:-$QSTASH_LOCAL_TOKEN}"
export QSTASH_URL="${QSTASH_URL:-http://127.0.0.1:${QSTASH_DEV_PORT}}"
export S3_ACCESS_KEY_ID="${S3_ACCESS_KEY_ID:-agent-testing-access-key}"
export S3_BUCKET="${S3_BUCKET:-agent-testing-bucket}"
export S3_ENDPOINT="${S3_ENDPOINT:-https://agent-testing-s3.localhost}"
export S3_SECRET_ACCESS_KEY="${S3_SECRET_ACCESS_KEY:-agent-testing-secret-key}"
}
env_keys() {
printf '%s\n' \
APP_URL \
AUTH_EMAIL_VERIFICATION \
AUTH_SECRET \
DATABASE_DRIVER \
DATABASE_URL \
FEATURE_FLAGS \
KEY_VAULTS_SECRET \
NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION \
NODE_OPTIONS \
PORT \
QSTASH_CURRENT_SIGNING_KEY \
QSTASH_DEV_PORT \
QSTASH_NEXT_SIGNING_KEY \
QSTASH_TOKEN \
QSTASH_URL \
S3_ACCESS_KEY_ID \
S3_BUCKET \
S3_ENDPOINT \
S3_SECRET_ACCESS_KEY
}
print_env() {
apply_env
while IFS= read -r key; do
printf 'export %s=%q\n' "$key" "${!key}"
done < <(env_keys)
}
write_env() {
local file="${1:-$ENV_FILE_DEFAULT}"
apply_env
mkdir -p "$(dirname "$file")"
{
printf '# Source this file before starting LobeHub local dev server.\n'
printf '# Generated by %s\n' "$0"
while IFS= read -r key; do
printf 'export %s=%q\n' "$key" "${!key}"
done < <(env_keys)
} > "$file"
ok "wrote env file: $file"
note "source it with: source $file"
}
require_docker() {
if ! command -v docker > /dev/null 2>&1; then
bad "docker CLI is not available"
note "Install/start Docker Desktop, or provide DATABASE_URL for an existing Postgres."
return 1
fi
}
wait_for_db() {
printf ' waiting for Postgres'
until docker exec "$DB_CONTAINER" pg_isready -U postgres > /dev/null 2>&1; do
printf '.'
sleep 2
done
printf '\n'
}
start_db() {
require_docker
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
ok "Postgres container already running: $DB_CONTAINER"
elif docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
docker start "$DB_CONTAINER" > /dev/null
ok "started existing Postgres container: $DB_CONTAINER"
else
docker run -d \
--name "$DB_CONTAINER" \
-e POSTGRES_PASSWORD=postgres \
-p "${DB_PORT}:5432" \
paradedb/paradedb:latest > /dev/null
ok "created Postgres container: $DB_CONTAINER"
fi
wait_for_db
}
migrate_db() {
apply_env
cd "$REPO_ROOT"
bun run db:migrate
}
seed_user() {
apply_env
export AGENT_TESTING_API_KEY
export AGENT_TESTING_CLI_ENV_FILE="${AGENT_TESTING_CLI_ENV_FILE:-$CLI_ENV_FILE_DEFAULT}"
cd "$REPO_ROOT"
node <<'NODE'
const bcrypt = require('bcryptjs');
const crypto = require('node:crypto');
const fs = require('node:fs');
const path = require('node:path');
const pg = require('pg');
const databaseUrl = process.env.DATABASE_URL;
if (!databaseUrl) {
throw new Error('DATABASE_URL is required to seed the baseline test user.');
}
const TEST_USER = {
email: 'agent-testing@lobehub.com',
fullName: 'Agent Testing User',
id: 'user_agent_testing_001',
password: 'TestPassword123!',
username: 'agent_testing_user',
};
const TEST_API_KEY = {
id: 'api_key_agent_testing_001',
key: process.env.AGENT_TESTING_API_KEY || 'sk-lh-agenttesting0001',
name: 'Agent Testing CLI API Key',
};
const validateApiKeyFormat = (apiKey) => /^sk-lh-[\da-z]{16}$/.test(apiKey);
const hashApiKey = (apiKey) => {
const secret = process.env.KEY_VAULTS_SECRET;
if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
return crypto.createHmac('sha256', secret).update(apiKey).digest('hex');
};
const encryptWithKeyVaultsSecret = (plaintext) => {
const secret = process.env.KEY_VAULTS_SECRET;
if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
const rawKey = Buffer.from(secret, 'base64');
if (![16, 24, 32].includes(rawKey.length)) {
throw new Error(
`KEY_VAULTS_SECRET must decode to 16, 24, or 32 bytes, got ${rawKey.length} bytes.`,
);
}
const iv = crypto.randomBytes(12);
const cipher = crypto.createCipheriv(`aes-${rawKey.length * 8}-gcm`, rawKey, iv);
const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
const authTag = cipher.getAuthTag();
return `${iv.toString('hex')}:${authTag.toString('hex')}:${encrypted.toString('hex')}`;
};
const writeCliEnvFile = () => {
const file = process.env.AGENT_TESTING_CLI_ENV_FILE || '.records/env/agent-testing-cli.env';
fs.mkdirSync(path.dirname(file), { recursive: true });
fs.writeFileSync(
file,
[
'# Source this file before running LobeHub CLI agent tests.',
'# Generated by init-dev-env.sh seed-user',
`export LOBE_API_KEY=${TEST_API_KEY.key}`,
`export LOBEHUB_CLI_API_KEY="${'${LOBE_API_KEY}'}"`,
`export LOBEHUB_SERVER=${process.env.APP_URL}`,
'export LOBEHUB_CLI_HOME=.lobehub-dev',
'',
].join('\n'),
);
return file;
};
const client = new pg.Client({ connectionString: databaseUrl });
(async () => {
if (!validateApiKeyFormat(TEST_API_KEY.key)) {
throw new Error(`Invalid AGENT_TESTING_API_KEY format: ${TEST_API_KEY.key}`);
}
await client.connect();
const now = new Date().toISOString();
const onboarding = JSON.stringify({ finishedAt: now, version: 1 });
const passwordHash = await bcrypt.hash(TEST_USER.password, 10);
const encryptedApiKey = encryptWithKeyVaultsSecret(TEST_API_KEY.key);
const apiKeyHash = hashApiKey(TEST_API_KEY.key);
await client.query(
`INSERT INTO users (id, email, normalized_email, username, full_name, email_verified, onboarding, created_at, updated_at, last_active_at)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $8, $8)
ON CONFLICT (id) DO UPDATE SET onboarding = $7, updated_at = $8`,
[
TEST_USER.id,
TEST_USER.email,
TEST_USER.email.toLowerCase(),
TEST_USER.username,
TEST_USER.fullName,
true,
onboarding,
now,
],
);
await client.query(
`INSERT INTO accounts (id, user_id, account_id, provider_id, password, created_at, updated_at)
VALUES ($1, $2, $3, $4, $5, $6, $6)
ON CONFLICT DO NOTHING`,
[
'agent_testing_account_001',
TEST_USER.id,
TEST_USER.email,
'credential',
passwordHash,
now,
],
);
await client.query(
`INSERT INTO api_keys (id, name, key, key_hash, enabled, expires_at, user_id, workspace_id, created_at, updated_at)
VALUES ($1, $2, $3, $4, $5, NULL, $6, NULL, $7, $7)
ON CONFLICT (id) DO UPDATE
SET name = EXCLUDED.name,
key = EXCLUDED.key,
key_hash = EXCLUDED.key_hash,
enabled = EXCLUDED.enabled,
expires_at = NULL,
updated_at = EXCLUDED.updated_at`,
[
TEST_API_KEY.id,
TEST_API_KEY.name,
encryptedApiKey,
apiKeyHash,
true,
TEST_USER.id,
now,
],
);
const cliEnvFile = writeCliEnvFile();
console.log('seeded baseline user:');
console.log(` email: ${TEST_USER.email}`);
console.log(` password: ${TEST_USER.password}`);
console.log('seeded baseline API key:');
console.log(` LOBE_API_KEY: ${TEST_API_KEY.key}`);
console.log(` CLI env: ${cliEnvFile}`);
})()
.finally(() => client.end())
.catch((error) => {
console.error(error);
process.exit(1);
});
NODE
}
cmd_status() {
apply_env
echo "agent-testing local dev env:"
note "APP_URL=$APP_URL"
note "DATABASE_URL=$DATABASE_URL"
note "PORT=$PORT"
note "QSTASH_URL=$QSTASH_URL"
if command -v docker > /dev/null 2>&1; then
ok "docker CLI available"
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
ok "managed Postgres running: $DB_CONTAINER"
else
note "managed Postgres is not running: $DB_CONTAINER"
fi
else
bad "docker CLI is not available"
fi
}
cmd_qstash() {
apply_env
cd "$REPO_ROOT"
note "starting local QStash dev server at $QSTASH_URL"
note "keep this process running while testing workflow paths"
exec pnpm run qstash -- -port "$QSTASH_DEV_PORT"
}
cmd_dev_next() {
apply_env
cd "$REPO_ROOT"
exec pnpm run dev:next
}
cmd_dev() {
apply_env
cd "$REPO_ROOT"
exec bun run dev
}
cmd_clean_db() {
require_docker
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
docker stop "$DB_CONTAINER" > /dev/null
fi
if docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
docker rm "$DB_CONTAINER" > /dev/null
ok "removed Postgres container: $DB_CONTAINER"
else
note "Postgres container not found: $DB_CONTAINER"
fi
}
usage() {
sed -n '3,24p' "$0" >&2
}
COMMAND="${1:-status}"
case "$COMMAND" in
help|-h|--help) usage; exit 0 ;;
*) guard_no_root_env ;;
esac
case "$COMMAND" in
env) print_env ;;
write) shift; write_env "${1:-}" ;;
setup-db)
start_db
migrate_db
;;
migrate) migrate_db ;;
seed-user) seed_user ;;
qstash) cmd_qstash ;;
dev-next) cmd_dev_next ;;
dev) cmd_dev ;;
clean-db) cmd_clean_db ;;
status) cmd_status ;;
*)
usage
exit 2
;;
esac
@@ -24,39 +24,53 @@ DATE_HUMAN=$(date '+%Y-%m-%d %H:%M')
DATE_ISO=$(date '+%Y-%m-%dT%H:%M:%S%z')
cat > "$DIR/report.md" << EOF
# Test Report: $TITLE
# Test Report: $TITLE
## Scope
## Scope
<!-- What changed / what is being verified -->
<!-- Test goals / scope of change / key risks -->
- Branch: \`$BRANCH\`
- Commit: \`$COMMIT\`
- Date: $DATE_HUMAN
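- Branch: \`$BRANCH\`
- Current commit: \`$COMMIT\`
- Date: $DATE_HUMAN
- Surface: <!-- CLI / Electron + CDP / Web / Bot:<platform> -->
- Test page / entry point: <!-- e.g. /settings or http://localhost:3010 -->
- Focus: <!-- the experience, feature, or regression point this round cares about most -->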
## Environment
## Cases
- Server: <!-- e.g. http://localhost:3010 -->
- Surfaces: <!-- cli / electron / web / bot:<platform> -->
| # | Case | Result | Key observations | Evidence |
| - | ---- | ------ | ---------------- | -------- |
| 1 | | Pending | | ![Case 1](assets/case1.png) |
## Cases
## Conclusion
| # | Case | Surface | Steps | Expected | Actual | Status | Evidence |
| - | ---- | ------- | ----- | -------- | ------ | ------ | -------- |
| 1 | | | | | | | |
Overall conclusion: \`pending\`.
## Evidence
<!-- Summarize in 1-2 paragraphs the results the user most needs to know; failures and blockers must clearly state their impact. -->
<!-- Embed screenshots: ![case 1](assets/case1.png) -->
<!-- CLI transcripts in fenced blocks, with the exact command -->
Still to handle / follow up:
## Verdict
- <!-- TODO -->
- Passed: 0 / 0
- Failed: 0
- Blocked: 0
- Score (optional): —
- Open issues / follow-ups:
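## Verification this round
<!-- If any automated or command-line verification was run, keep the trimmed command and its result; otherwise write "No additional automated verification was run". -->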
\`\`\`bash
# command
\`\`\`
Result:
- <!-- TODO -->
## Scoring
- Passed: 0
- Failed: 0
- Blocked: 0
- Score: — / 100
EOF
cat > "$DIR/result.json" << EOF
@@ -5,29 +5,114 @@
# test step. Background and failure modes: ../references/auth.md
#
# Usage:
# setup-auth.sh status # check server + CLI + web auth readiness
# setup-auth.sh status # check server + CLI + web + Electron readiness
# setup-auth.sh status --surface web # check only the Web surface gate
# setup-auth.sh cli-seed # configure CLI API-key auth from seeded local env
# setup-auth.sh cli # interactive CLI device-code login (run by a human)
# setup-auth.sh open-chrome # open SERVER_URL in Chrome and show DevTools
# setup-auth.sh web-seed # sign in seeded user and inject cookies automatically
# setup-auth.sh web # stdin = Cookie header -> inject into agent-browser session
# setup-auth.sh web-verify # live-check the agent-browser session is authenticated
#
# Env:
# SERVER_URL (default http://localhost:3010) dev server under test
# SERVER_URL (default from test-env.sh) dev server under test
# SESSION (default lobehub-dev) agent-browser session name
# AUTH_DIR (default ~/.lobehub-agent-testing) where web state is persisted
# SEED_EMAIL / SEED_PASSWORD seeded better-auth login
set -euo pipefail
SERVER_URL="${SERVER_URL:-http://localhost:3010}"
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
workspace_root_for_port() {
local root="$REPO_ROOT"
local name
name="$(basename "$root")"
if [[ "$name" == "lobehub" ]]; then
local parent
parent="$(cd "$root/.." && pwd)"
local parent_name
parent_name="$(basename "$parent")"
if [[ "$parent_name" == lobehub-cloud* ]]; then
root="$parent"
fi
fi
printf '%s\n' "$root"
}
default_server_url() {
local env_resolver resolved
env_resolver="$(dirname "${BASH_SOURCE[0]}")/test-env.sh"
if [[ -x "$env_resolver" ]]; then
resolved="$("$env_resolver" --value SERVER_URL 2> /dev/null || true)"
if [[ -n "$resolved" ]]; then
printf '%s\n' "$resolved"
return 0
fi
fi
local root name suffix port
root="$(workspace_root_for_port)"
name="$(basename "$root")"
case "$name" in
lobehub-cloud)
port=3020
;;
lobehub-cloud-*)
suffix="${name#lobehub-cloud-}"
if [[ "$suffix" =~ ^[0-9]+$ ]]; then
port=$((3020 + 10#$suffix))
else
port=3010
fi
;;
*)
port=3010
;;
esac
printf 'http://localhost:%s\n' "$port"
}
SERVER_URL="${SERVER_URL:-$(default_server_url)}"
SESSION="${SESSION:-lobehub-dev}"
AUTH_DIR="${AUTH_DIR:-$HOME/.lobehub-agent-testing}"
STATE_FILE="$AUTH_DIR/web-state.json"
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
CLI_HOME="$REPO_ROOT/apps/cli/.lobehub-dev"
CLI_HOME_NAME="${LOBEHUB_CLI_HOME:-.lobehub-dev}"
CLI_HOME="$HOME/${CLI_HOME_NAME#/}"
CLI_CREDENTIALS_FILE="$CLI_HOME/credentials.json"
SEED_EMAIL="${SEED_EMAIL:-agent-testing@lobehub.com}"
SEED_PASSWORD="${SEED_PASSWORD:-TestPassword123!}"
SEED_API_KEY="${SEED_API_KEY:-${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}}"
CLI_ENV_FILE="${CLI_ENV_FILE:-$REPO_ROOT/.records/env/agent-testing-cli.env}"
ok() { printf ' \033[32m✔\033[0m %s\n' "$1"; }
bad() { printf ' \033[31m✘\033[0m %s\n' "$1"; }
note() { printf ' %s\n' "$1"; }
usage() {
cat << EOF
Usage:
$0 status [--surface all|cli|web|electron]
$0 cli-seed
$0 cli
$0 open-chrome [--dry-run]
$0 web-seed
$0 web
$0 web-verify
Env:
SERVER_URL=$SERVER_URL
SESSION=$SESSION
AUTH_DIR=$AUTH_DIR
SEED_EMAIL=$SEED_EMAIL
CLI_HOME=$CLI_HOME
EOF
}
check_server() {
local code
code=$(curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/" 2> /dev/null || true)
@@ -41,11 +126,35 @@ check_server() {
}
check_cli() {
if [[ -f "$CLI_HOME/settings.json" ]] && grep -q "$SERVER_URL" "$CLI_HOME/settings.json"; then
ok "CLI logged in to $SERVER_URL (creds: apps/cli/.lobehub-dev)"
local api_key="${LOBEHUB_CLI_API_KEY:-${LOBE_API_KEY:-}}"
if [[ -n "$api_key" ]]; then
local body_file code
body_file="$(mktemp)"
code=$(curl -sS -o "$body_file" -w '%{http_code}' \
-H "Authorization: Bearer $api_key" \
"$SERVER_URL/api/v1/users/me?includeCount=0" 2> /dev/null || true)
if [[ "$code" =~ ^[23] ]]; then
rm -f "$body_file"
ok "CLI API-key auth valid for $SERVER_URL"
return 0
fi
bad "CLI API-key auth failed for $SERVER_URL (http_code='$code')"
note "seed the local API key first:"
note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
note "source $CLI_ENV_FILE"
rm -f "$body_file"
return 1
fi
if [[ -f "$CLI_HOME/settings.json" ]] && grep -q "$SERVER_URL" "$CLI_HOME/settings.json" && [[ -f "$CLI_CREDENTIALS_FILE" ]]; then
ok "CLI device-code credentials configured for $SERVER_URL (creds: $CLI_HOME)"
else
bad "CLI not logged in to $SERVER_URL"
note "ask the user to run:"
note "automated path:"
note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user && source $CLI_ENV_FILE && $0 cli-seed"
note "interactive fallback:"
note "cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server $SERVER_URL"
return 1
fi
@@ -54,13 +163,24 @@ check_cli() {
check_web() {
if [[ -f "$STATE_FILE" ]]; then
ok "web auth state saved ($STATE_FILE)"
note "live-verify: $0 web-verify"
else
bad "no web auth state for agent-browser"
note "copy the Cookie header from Chrome DevTools (Network tab), then:"
note "for the seeded local user, run: $0 web-seed"
note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
note "pbpaste | $0 web (see references/auth.md)"
return 1
fi
cmd_web_verify --skip-server-check
}
check_agent_browser() {
if command -v agent-browser > /dev/null 2>&1; then
ok "agent-browser available"
else
bad "agent-browser command not found"
note "install or expose agent-browser before Web/Electron UI testing"
return 1
fi
}
check_electron() {
@@ -84,16 +204,75 @@ check_electron() {
}
cmd_status() {
echo "agent-testing auth status (SERVER_URL=$SERVER_URL):"
local surface="all"
while [[ $# -gt 0 ]]; do
case "$1" in
--surface)
if [[ $# -lt 2 ]]; then
echo "--surface requires one of: all, cli, web, electron" >&2
return 2
fi
surface="${2:-}"
shift 2
;;
--surface=*)
surface="${1#*=}"
shift
;;
all|cli|web|electron)
surface="$1"
shift
;;
-h|--help)
usage
return 0
;;
*)
echo "unknown status option: $1" >&2
usage >&2
return 2
;;
esac
done
case "$surface" in
all|cli|web|electron) ;;
"")
echo "--surface requires one of: all, cli, web, electron" >&2
return 2
;;
*)
echo "unknown surface: $surface" >&2
usage >&2
return 2
;;
esac
echo "agent-testing auth status (surface=$surface, SERVER_URL=$SERVER_URL):"
local rc=0
check_server || rc=1
check_cli || rc=1
check_web || rc=1
check_electron || rc=1
case "$surface" in
all)
check_server || rc=1
check_cli || rc=1
check_web || rc=1
check_electron || rc=1
;;
cli)
check_server || rc=1
check_cli || rc=1
;;
web)
check_server || rc=1
check_web || rc=1
;;
electron)
check_electron || rc=1
;;
esac
if [[ $rc -eq 0 ]]; then
echo "all green — safe to start automated testing."
echo "$surface auth green — safe to start automated testing on this surface."
else
echo "auth NOT ready — fix the ✘ items before writing any test step."
echo "$surface auth NOT ready — fix the ✘ items before writing any test step."
fi
return $rc
}
@@ -105,23 +284,148 @@ cmd_cli() {
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server "$SERVER_URL"
}
write_cli_seed_env() {
mkdir -p "$(dirname "$CLI_ENV_FILE")"
cat > "$CLI_ENV_FILE" << EOF
# Source this file before running LobeHub CLI agent tests.
# Generated by setup-auth.sh cli-seed
export LOBE_API_KEY=$SEED_API_KEY
export LOBEHUB_CLI_API_KEY="\${LOBE_API_KEY}"
export LOBEHUB_SERVER=$SERVER_URL
export LOBEHUB_CLI_HOME=.lobehub-dev
EOF
}
write_cli_settings() {
mkdir -p "$CLI_HOME"
python3 - "$CLI_HOME/settings.json" "$SERVER_URL" << 'PY'
import json
import os
import sys
path, server_url = sys.argv[1], sys.argv[2]
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "w") as f:
json.dump({"serverUrl": server_url}, f, indent=2)
f.write("\n")
os.chmod(path, 0o600)
PY
}
cmd_cli_seed() {
check_server || return 1
write_cli_seed_env
write_cli_settings
ok "wrote CLI seed env: $CLI_ENV_FILE"
note "source it before CLI commands: source $CLI_ENV_FILE"
note "settings saved at: $CLI_HOME/settings.json"
LOBE_API_KEY="$SEED_API_KEY" LOBEHUB_CLI_API_KEY="$SEED_API_KEY" check_cli
}
cmd_open_chrome() {
local mode="${1:-}"
if [[ "$mode" != "" && "$mode" != "--dry-run" ]]; then
echo "unknown open-chrome option: $mode" >&2
usage >&2
return 2
fi
if [[ "$mode" == "--dry-run" ]]; then
echo "would open Google Chrome at $SERVER_URL/"
echo "would press Cmd+Option+I to open DevTools"
echo "would open DevTools command menu and run 'Show Network'"
return 0
fi
if [[ "$(uname -s)" != "Darwin" ]]; then
bad "open-chrome is macOS-only"
note "open $SERVER_URL/ in your browser and open DevTools manually"
return 1
fi
if ! command -v osascript > /dev/null 2>&1; then
bad "osascript not found"
note "open $SERVER_URL/ in Chrome and press Cmd+Option+I manually"
return 1
fi
SERVER_URL="$SERVER_URL" osascript << 'OSA'
set targetUrl to (system attribute "SERVER_URL") & "/"
tell application "Google Chrome"
activate
if (count of windows) = 0 then
make new window
end if
tell front window to make new tab with properties {URL:targetUrl}
end tell
delay 1
tell application "System Events"
tell process "Google Chrome"
set frontmost to true
keystroke "i" using {command down, option down}
delay 1
keystroke "p" using {command down, shift down}
delay 0.2
keystroke "Show Network"
key code 36
end tell
end tell
OSA
ok "opened Chrome at $SERVER_URL/ and requested DevTools Network panel"
}
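# Convert a curl cookie jar (Netscape format) into a single Cookie header line.
# curl stores HttpOnly cookies as "#HttpOnly_<domain>" rows, so those are
# unwrapped while ordinary comment lines are skipped.
cookie_header_from_jar() {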
local jar="$1"
awk '
BEGIN { first = 1 }
/^$/ { next }
/^#/ {
if ($0 !~ /^#HttpOnly_/) next
sub(/^#HttpOnly_/, "")
}
NF >= 7 {
if (!first) printf "; "
printf "%s=%s", $6, $7
first = 0
}
END {
if (!first) printf "\n"
}
' "$jar"
}
# Build a Playwright storageState file from a raw Cookie header on stdin,
# keeping only the better-auth cookies. See references/auth.md for why the
# header must come from a Network request (HttpOnly) and why httpOnly=false.
cmd_web() {
mkdir -p "$AUTH_DIR"
python3 - "$STATE_FILE" << 'PY'
import json, sys, time
local raw
raw="$(cat)"
COOKIE_INPUT="$raw" python3 - "$STATE_FILE" << 'PY'
import json, os, sys, time
raw = sys.stdin.read().strip()
if raw.lower().startswith("cookie:"):
raw = raw.split(":", 1)[1].strip()
raw = os.environ.get("COOKIE_INPUT", "").strip()
cookie_lines = []
for line in raw.splitlines():
stripped = line.strip()
if not stripped:
continue
if stripped.lower().startswith("cookie:"):
cookie_lines.append(stripped.split(":", 1)[1].strip())
else:
cookie_lines.append(stripped)
WANTED = {"better-auth.session_token", "better-auth.state"}
raw = "; ".join(cookie_lines)
WANTED = {"better-auth.session_token", "better-auth.session_data", "better-auth.state"}
exp = int(time.time()) + 30 * 24 * 3600 # 30 days
cookies = []
for pair in raw.split("; "):
for pair in raw.split(";"):
pair = pair.strip()
if "=" not in pair:
continue
name, _, value = pair.partition("=")
@@ -146,14 +450,79 @@ with open(sys.argv[1], "w") as f:
json.dump({"cookies": cookies, "origins": []}, f, indent=2)
print(f"wrote {len(cookies)} cookie(s) to {sys.argv[1]}")
PY
agent-browser --session "$SESSION" state load "$STATE_FILE"
cmd_web_verify
}
cmd_web_seed() {
check_server || return 1
mkdir -p "$AUTH_DIR"
local cookie_jar="$AUTH_DIR/web-seed-cookie.jar"
local response_body="$AUTH_DIR/web-seed-response.json"
local payload code
payload="$(
SEED_EMAIL="$SEED_EMAIL" SEED_PASSWORD="$SEED_PASSWORD" python3 - << 'PY'
import json
import os
print(json.dumps({
"callbackURL": "/",
"email": os.environ["SEED_EMAIL"],
"password": os.environ["SEED_PASSWORD"],
}))
PY
)"
code=$(curl -sS -o "$response_body" -w '%{http_code}' \
-c "$cookie_jar" \
-H 'Content-Type: application/json' \
-X POST "$SERVER_URL/api/auth/sign-in/email" \
--data "$payload" 2> /dev/null || true)
if [[ ! "$code" =~ ^[23] ]]; then
bad "seed user sign-in failed at $SERVER_URL/api/auth/sign-in/email (http_code='$code')"
note "make sure the seed user exists:"
note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
return 1
fi
local cookie_header
cookie_header="$(cookie_header_from_jar "$cookie_jar")"
if [[ -z "$cookie_header" ]]; then
bad "seed sign-in succeeded but no cookies were written to $cookie_jar"
return 1
fi
printf '%s\n' "$cookie_header" | cmd_web
}
cmd_web_verify() {
agent-browser --session "$SESSION" open "$SERVER_URL/" > /dev/null
local skip_server_check="${1:-}"
if [[ "$skip_server_check" != "--skip-server-check" ]]; then
check_server || return 1
fi
if [[ ! -f "$STATE_FILE" ]]; then
bad "no web auth state for agent-browser"
note "for the seeded local user, run: $0 web-seed"
note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
note "pbpaste | $0 web"
return 1
fi
check_agent_browser || return 1
if ! agent-browser --session "$SESSION" state load "$STATE_FILE" > /dev/null; then
bad "failed to load web auth state into agent-browser session '$SESSION'"
return 1
fi
if ! agent-browser --session "$SESSION" open "$SERVER_URL/" > /dev/null; then
bad "failed to open $SERVER_URL in agent-browser session '$SESSION'"
return 1
fi
local url
url=$(agent-browser --session "$SESSION" get url)
url=$(agent-browser --session "$SESSION" get url 2> /dev/null || true)
if [[ -z "$url" ]]; then
bad "agent-browser session '$SESSION' did not report a current URL"
return 1
fi
if [[ "$url" == *"/signin"* || "$url" == *"/login"* ]]; then
bad "agent-browser session '$SESSION' NOT authenticated (landed on $url)"
note "re-copy the Cookie header and re-run: pbpaste | $0 web"
@@ -163,12 +532,22 @@ cmd_web_verify() {
}
case "${1:-status}" in
status) cmd_status ;;
status)
shift || true
cmd_status "$@"
;;
cli-seed) cmd_cli_seed ;;
cli) cmd_cli ;;
open-chrome)
shift || true
cmd_open_chrome "$@"
;;
web-seed) cmd_web_seed ;;
web) cmd_web ;;
web-verify) cmd_web_verify ;;
-h|--help) usage ;;
*)
echo "Usage: $0 {status|cli|web|web-verify}" >&2
echo "Usage: $0 {status|cli-seed|cli|open-chrome|web-seed|web|web-verify}" >&2
exit 2
;;
esac
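To make the `web-seed` hand-off concrete, here is a minimal sketch of how a curl cookie jar becomes the `Cookie` header that `cmd_web` reads from stdin. The awk program mirrors `cookie_header_from_jar` from the diff above; the jar contents (`seed.token`, `theme=dark`) and the temp file are invented sample data, not real server output.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Same awk logic as cookie_header_from_jar in setup-auth.sh: unwrap
# "#HttpOnly_" rows, skip other comments and blank lines, then join the
# name=value pairs (fields 6 and 7) with "; ".
cookie_header_from_jar() {
  awk '
    BEGIN { first = 1 }
    /^$/ { next }
    /^#/ {
      if ($0 !~ /^#HttpOnly_/) next
      sub(/^#HttpOnly_/, "")
    }
    NF >= 7 {
      if (!first) printf "; "
      printf "%s=%s", $6, $7
      first = 0
    }
    END { if (!first) printf "\n" }
  ' "$1"
}

# Invented sample jar in the tab-separated format that `curl -c` writes.
jar="$(mktemp)"
printf '# Netscape HTTP Cookie File\n\n' > "$jar"
printf '#HttpOnly_localhost\tFALSE\t/\tFALSE\t0\tbetter-auth.session_token\tseed.token\n' >> "$jar"
printf 'localhost\tFALSE\t/\tFALSE\t0\ttheme\tdark\n' >> "$jar"

header="$(cookie_header_from_jar "$jar")"
rm -f "$jar"
printf '%s\n' "$header"
```

The header still carries `theme` at this point; `cmd_web` drops it afterwards because only the names in `WANTED` are kept. Note that awk splits on any whitespace here, so this sketch assumes cookie values contain no spaces.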
@@ -0,0 +1,197 @@
#!/usr/bin/env bash
# Smoke tests for setup-auth.sh. Uses a temporary agent-browser stub and local
# HTTP server, so it does not need real browser auth.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT="$SCRIPT_DIR/setup-auth.sh"
fail() {
echo "FAIL: $*" >&2
exit 1
}
assert_contains() {
local file="$1"
local text="$2"
grep -Fq "$text" "$file" || fail "expected '$text' in $file"
}
tmp_dir="$(mktemp -d)"
server_pid=""
cleanup() {
if [[ -n "$server_pid" ]]; then
kill "$server_pid" > /dev/null 2>&1 || true
wait "$server_pid" > /dev/null 2>&1 || true
fi
rm -rf "$tmp_dir"
}
trap cleanup EXIT
export HOME="$tmp_dir/home"
port="$(python3 - << 'PY'
import socket
sock = socket.socket()
sock.bind(("127.0.0.1", 0))
print(sock.getsockname()[1])
sock.close()
PY
)"
python3 - "$port" << 'PY' > "$tmp_dir/http.log" 2>&1 &
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import sys
class Handler(BaseHTTPRequestHandler):
def do_GET(self):
if self.path.startswith("/api/v1/users/me"):
if self.headers.get("authorization") != "Bearer sk-lh-agenttesting0001":
self.send_response(401)
self.end_headers()
self.wfile.write(b'{"success":false}')
return
self.send_response(200)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(b'{"success":true,"data":{"id":"user_agent_testing_001"}}')
return
self.send_response(200)
self.end_headers()
self.wfile.write(b"ok")
def do_POST(self):
length = int(self.headers.get("content-length") or "0")
if length:
self.rfile.read(length)
if self.path != "/api/auth/sign-in/email":
self.send_response(404)
self.end_headers()
return
self.send_response(200)
self.send_header(
"Set-Cookie",
"better-auth.session_token=seed.token; Path=/; HttpOnly; SameSite=Lax",
)
self.send_header(
"Set-Cookie",
"better-auth.session_data=seed.data; Path=/; HttpOnly; SameSite=Lax",
)
self.send_header("Content-Type", "application/json")
self.end_headers()
self.wfile.write(b'{"ok":true}')
def log_message(self, format, *args):
return
ThreadingHTTPServer(("localhost", int(sys.argv[1])), Handler).serve_forever()
PY
server_pid="$!"
server_url="http://localhost:$port"
for _ in {1..50}; do
if curl -s -o /dev/null "$server_url/"; then
break
fi
sleep 0.1
done
curl -s -o /dev/null "$server_url/" || fail "test HTTP server did not start"
mkdir -p "$tmp_dir/bin" "$tmp_dir/auth"
cat > "$tmp_dir/bin/agent-browser" << 'SH'
#!/usr/bin/env bash
set -euo pipefail
if [[ "${1:-}" == "--session" ]]; then
shift 2
fi
case "${1:-}" in
state)
[[ "${2:-}" == "load" ]] || exit 2
[[ -f "${3:-}" ]] || exit 1
;;
open)
printf '%s\n' "${2:-}" > "${AGENT_BROWSER_URL_FILE:?}"
;;
get)
[[ "${2:-}" == "url" ]] || exit 2
cat "${AGENT_BROWSER_URL_FILE:?}"
;;
*)
echo "unexpected agent-browser command: $*" >&2
exit 2
;;
esac
SH
chmod +x "$tmp_dir/bin/agent-browser"
export PATH="$tmp_dir/bin:$PATH"
export AUTH_DIR="$tmp_dir/auth"
export SESSION="setup-auth-test"
export SERVER_URL="$server_url"
export AGENT_BROWSER_URL_FILE="$tmp_dir/current-url"
cookie_header="Cookie: foo=bar; better-auth.session_token=test.token; better-auth.session_data=encoded%3D; theme=dark"
printf '%s\n' "$cookie_header" | "$SCRIPT" web > "$tmp_dir/web.out"
python3 - "$AUTH_DIR/web-state.json" << 'PY'
import json, sys
with open(sys.argv[1]) as f:
state = json.load(f)
names = {cookie["name"] for cookie in state["cookies"]}
expected = {"better-auth.session_token", "better-auth.session_data"}
if names != expected:
raise SystemExit(f"unexpected cookies: {sorted(names)}")
PY
"$SCRIPT" web-seed > "$tmp_dir/web-seed.out"
python3 - "$AUTH_DIR/web-state.json" << 'PY'
import json, sys
with open(sys.argv[1]) as f:
state = json.load(f)
values = {cookie["name"]: cookie["value"] for cookie in state["cookies"]}
expected = {
"better-auth.session_token": "seed.token",
"better-auth.session_data": "seed.data",
}
if values != expected:
raise SystemExit(f"unexpected seeded cookies: {values}")
PY
"$SCRIPT" status --surface web > "$tmp_dir/status.out"
assert_contains "$tmp_dir/status.out" "surface=web"
assert_contains "$tmp_dir/status.out" "web auth green"
"$SCRIPT" cli-seed > "$tmp_dir/cli-seed.out"
assert_contains "$tmp_dir/cli-seed.out" "CLI API-key auth valid"
assert_contains "$tmp_dir/cli-seed.out" "settings saved at: $HOME/.lobehub-dev/settings.json"
if "$SCRIPT" status --surface cli > "$tmp_dir/cli-no-env.out"; then
fail "cli status without API key unexpectedly passed"
fi
assert_contains "$tmp_dir/cli-no-env.out" "CLI not logged in"
LOBEHUB_CLI_API_KEY=sk-lh-agenttesting0001 "$SCRIPT" status --surface cli > "$tmp_dir/cli-status.out"
assert_contains "$tmp_dir/cli-status.out" "CLI API-key auth valid"
assert_contains "$tmp_dir/cli-status.out" "cli auth green"
if printf 'foo=bar\n' | "$SCRIPT" web > "$tmp_dir/invalid.out" 2> "$tmp_dir/invalid.err"; then
fail "invalid cookie unexpectedly passed"
fi
assert_contains "$tmp_dir/invalid.err" "no better-auth cookies found"
echo "setup-auth tests passed"
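The stub technique this smoke test relies on can be sketched in isolation: place a fake executable first on `PATH` so the script under test calls it instead of the real tool. The command name `fake-tool`, the `STUB_LOG` variable, and the log file are hypothetical and exist only for this illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

tmp_dir="$(mktemp -d)"
trap 'rm -rf "$tmp_dir"' EXIT

mkdir -p "$tmp_dir/bin"
# Hypothetical stub: records its arguments instead of doing real work.
cat > "$tmp_dir/bin/fake-tool" << 'SH'
#!/usr/bin/env bash
printf '%s\n' "$*" >> "${STUB_LOG:?}"
SH
chmod +x "$tmp_dir/bin/fake-tool"

export STUB_LOG="$tmp_dir/calls.log"
export PATH="$tmp_dir/bin:$PATH"

# Anything resolved through PATH now reaches the stub first.
fake-tool --session demo open "http://localhost:3010/"
recorded="$(cat "$STUB_LOG")"
printf 'stub saw: %s\n' "$recorded"
```

Recording the arguments gives the test something to assert on, in the same spirit as the `AGENT_BROWSER_URL_FILE` hand-off in the stub above.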
@@ -0,0 +1,377 @@
#!/usr/bin/env bash
# Print the resolved local test environment for agent-testing.
#
# This is intentionally read-only. It mirrors scripts/runWithEnv.mts precedence:
# .env -> .env.$NODE_ENV -> .env.local -> .env.$NODE_ENV.local, then shell env.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)"
NODE_ENV="${NODE_ENV:-development}"
VALUE_APP_URL=""
VALUE_PORT=""
VALUE_SERVER_URL=""
VALUE_AUTH_TRUSTED_ORIGINS=""
VALUE_SPA_PORT=""
VALUE_MOBILE_SPA_PORT=""
VALUE_DESKTOP_PORT=""
SOURCE_APP_URL=""
SOURCE_PORT=""
SOURCE_SERVER_URL=""
SOURCE_AUTH_TRUSTED_ORIGINS=""
SOURCE_SPA_PORT=""
SOURCE_MOBILE_SPA_PORT=""
SOURCE_DESKTOP_PORT=""
LOADED_ENV_FILES=""
keys() {
printf '%s\n' \
APP_URL \
PORT \
SERVER_URL \
AUTH_TRUSTED_ORIGINS \
SPA_PORT \
MOBILE_SPA_PORT \
DESKTOP_PORT
}
trim() {
local value="$1"
value="${value#"${value%%[![:space:]]*}"}"
value="${value%"${value##*[![:space:]]}"}"
printf '%s' "$value"
}
workspace_root() {
local root="$REPO_ROOT"
local name
name="$(basename "$root")"
if [[ "$name" == "lobehub" ]]; then
local parent parent_name
parent="$(cd "$root/.." && pwd)"
parent_name="$(basename "$parent")"
if [[ "$parent_name" == lobehub-cloud* ]]; then
root="$parent"
fi
fi
printf '%s\n' "$root"
}
workspace_offset() {
local name="$1"
case "$name" in
lobehub-cloud)
printf '0\n'
;;
lobehub-cloud-*)
local suffix="${name#lobehub-cloud-}"
if [[ "$suffix" =~ ^[0-9]+$ ]]; then
printf '%s\n' "$((10#$suffix))"
else
printf '\n'
fi
;;
*)
printf '\n'
;;
esac
}
default_port() {
local base="$1"
local fallback="$2"
local root name offset
root="$(workspace_root)"
name="$(basename "$root")"
offset="$(workspace_offset "$name")"
if [[ -n "$offset" ]]; then
printf '%s\n' "$((base + offset))"
else
printf '%s\n' "$fallback"
fi
}
# Print the explicit numeric port of a URL; return 1 when the URL has none.
url_port() {
local url="$1"
local hostport
hostport="${url#*://}"
hostport="${hostport%%/*}"
if [[ "$hostport" == *:* ]]; then
local port="${hostport##*:}"
if [[ "$port" =~ ^[0-9]+$ ]]; then
printf '%s\n' "$port"
return 0
fi
fi
return 1
}
url_origin() {
local url="$1"
local scheme rest hostport
if [[ "$url" == *"://"* ]]; then
scheme="${url%%://*}"
rest="${url#*://}"
hostport="${rest%%/*}"
printf '%s://%s\n' "$scheme" "$hostport"
else
printf '%s\n' "$url"
fi
}
set_value() {
local key="$1"
local value="$2"
local source="$3"
case "$key" in
APP_URL) VALUE_APP_URL="$value"; SOURCE_APP_URL="$source" ;;
PORT) VALUE_PORT="$value"; SOURCE_PORT="$source" ;;
SERVER_URL) VALUE_SERVER_URL="$value"; SOURCE_SERVER_URL="$source" ;;
AUTH_TRUSTED_ORIGINS) VALUE_AUTH_TRUSTED_ORIGINS="$value"; SOURCE_AUTH_TRUSTED_ORIGINS="$source" ;;
SPA_PORT) VALUE_SPA_PORT="$value"; SOURCE_SPA_PORT="$source" ;;
MOBILE_SPA_PORT) VALUE_MOBILE_SPA_PORT="$value"; SOURCE_MOBILE_SPA_PORT="$source" ;;
DESKTOP_PORT) VALUE_DESKTOP_PORT="$value"; SOURCE_DESKTOP_PORT="$source" ;;
esac
}
value_for() {
case "$1" in
APP_URL) printf '%s\n' "$VALUE_APP_URL" ;;
PORT) printf '%s\n' "$VALUE_PORT" ;;
SERVER_URL) printf '%s\n' "$VALUE_SERVER_URL" ;;
AUTH_TRUSTED_ORIGINS) printf '%s\n' "$VALUE_AUTH_TRUSTED_ORIGINS" ;;
SPA_PORT) printf '%s\n' "$VALUE_SPA_PORT" ;;
MOBILE_SPA_PORT) printf '%s\n' "$VALUE_MOBILE_SPA_PORT" ;;
DESKTOP_PORT) printf '%s\n' "$VALUE_DESKTOP_PORT" ;;
esac
}
source_for() {
case "$1" in
APP_URL) printf '%s\n' "$SOURCE_APP_URL" ;;
PORT) printf '%s\n' "$SOURCE_PORT" ;;
SERVER_URL) printf '%s\n' "$SOURCE_SERVER_URL" ;;
AUTH_TRUSTED_ORIGINS) printf '%s\n' "$SOURCE_AUTH_TRUSTED_ORIGINS" ;;
SPA_PORT) printf '%s\n' "$SOURCE_SPA_PORT" ;;
MOBILE_SPA_PORT) printf '%s\n' "$SOURCE_MOBILE_SPA_PORT" ;;
DESKTOP_PORT) printf '%s\n' "$SOURCE_DESKTOP_PORT" ;;
esac
}
is_tracked_key() {
case "$1" in
APP_URL|PORT|SERVER_URL|AUTH_TRUSTED_ORIGINS|SPA_PORT|MOBILE_SPA_PORT|DESKTOP_PORT) return 0 ;;
*) return 1 ;;
esac
}
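# Parse one dotenv file and record tracked keys together with the file that set
# them. Files parsed later override values from files parsed earlier.
parse_env_file() {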
local file="$1"
local root="$2"
local label="${file#$root/}"
local line key value
[[ -f "$file" ]] || return 0
if [[ -z "$LOADED_ENV_FILES" ]]; then
LOADED_ENV_FILES="$label"
else
LOADED_ENV_FILES="$LOADED_ENV_FILES, $label"
fi
while IFS= read -r line || [[ -n "$line" ]]; do
line="$(trim "$line")"
[[ -z "$line" || "$line" == \#* ]] && continue
if [[ "$line" == export[[:space:]]* ]]; then
line="$(trim "${line#export}")"
fi
[[ "$line" == *=* ]] || continue
key="$(trim "${line%%=*}")"
value="$(trim "${line#*=}")"
is_tracked_key "$key" || continue
if [[ "$value" == \"*\" && "$value" == *\" && ${#value} -ge 2 ]]; then
value="${value:1:${#value}-2}"
elif [[ "$value" == \'* && "$value" == *\' && ${#value} -ge 2 ]]; then
value="${value:1:${#value}-2}"
fi
set_value "$key" "$value" "$label"
done < "$file"
}
apply_env_files() {
local root="$1"
parse_env_file "$root/.env" "$root"
parse_env_file "$root/.env.$NODE_ENV" "$root"
parse_env_file "$root/.env.local" "$root"
parse_env_file "$root/.env.$NODE_ENV.local" "$root"
}
apply_shell_overrides() {
local key value
while IFS= read -r key; do
if [[ -n "${!key+x}" ]]; then
value="${!key}"
set_value "$key" "$value" "shell"
fi
done < <(keys)
}
resolve_defaults() {
local app_port spa_port mobile_spa_port desktop_port
app_port="$(default_port 3020 3010)"
spa_port="$(default_port 9800 9876)"
mobile_spa_port="$(default_port 3810 3012)"
desktop_port="$(default_port 3030 3015)"
if [[ -z "$VALUE_APP_URL" ]]; then
set_value APP_URL "http://localhost:$app_port" "inferred"
fi
if [[ -z "$VALUE_PORT" ]]; then
if app_port="$(url_port "$VALUE_APP_URL")"; then
set_value PORT "$app_port" "inferred from APP_URL"
else
set_value PORT "$(default_port 3020 3010)" "inferred"
fi
fi
if [[ -z "$VALUE_SERVER_URL" ]]; then
set_value SERVER_URL "$VALUE_APP_URL" "from APP_URL"
fi
if [[ -z "$VALUE_SPA_PORT" ]]; then
set_value SPA_PORT "$spa_port" "inferred"
fi
if [[ -z "$VALUE_MOBILE_SPA_PORT" ]]; then
set_value MOBILE_SPA_PORT "$mobile_spa_port" "inferred"
fi
if [[ -z "$VALUE_DESKTOP_PORT" ]]; then
set_value DESKTOP_PORT "$desktop_port" "inferred"
fi
if [[ -z "$VALUE_AUTH_TRUSTED_ORIGINS" ]]; then
set_value AUTH_TRUSTED_ORIGINS "$(url_origin "$VALUE_APP_URL"),http://localhost:$VALUE_SPA_PORT" "inferred"
fi
}
contains_origin() {
local list="$1"
local expected="$2"
local item items
IFS=',' read -r -a items <<< "$list"
for item in "${items[@]}"; do
item="$(trim "$item")"
[[ "$item" == "$expected" ]] && return 0
done
return 1
}
print_exports() {
local key value
while IFS= read -r key; do
value="$(value_for "$key")"
printf 'export %s=%q\n' "$key" "$value"
done < <(keys)
}
print_value() {
local key="$1"
if ! is_tracked_key "$key"; then
echo "unknown key: $key" >&2
exit 2
fi
value_for "$key"
}
print_human() {
local root="$1"
local key value source
echo "agent-testing test env:"
printf ' workspace: %s\n' "$root"
printf ' NODE_ENV: %s\n' "$NODE_ENV"
printf ' env files: %s\n' "${LOADED_ENV_FILES:-none}"
echo
echo "resolved values:"
while IFS= read -r key; do
value="$(value_for "$key")"
source="$(source_for "$key")"
printf ' %-22s %s (%s)\n' "$key=$value" "" "$source"
done < <(keys)
echo
echo "checks:"
local app_origin spa_origin app_port
app_origin="$(url_origin "$VALUE_APP_URL")"
spa_origin="http://localhost:$VALUE_SPA_PORT"
if app_port="$(url_port "$VALUE_APP_URL")" && [[ "$app_port" == "$VALUE_PORT" ]]; then
printf ' OK PORT matches APP_URL (%s)\n' "$VALUE_PORT"
else
printf ' WARN PORT (%s) does not match APP_URL (%s)\n' "$VALUE_PORT" "$VALUE_APP_URL"
fi
if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$app_origin"; then
printf ' OK AUTH_TRUSTED_ORIGINS includes %s\n' "$app_origin"
else
printf ' WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$app_origin"
fi
if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$spa_origin"; then
printf ' OK AUTH_TRUSTED_ORIGINS includes %s\n' "$spa_origin"
else
printf ' WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$spa_origin"
fi
}
usage() {
cat << EOF
Usage:
$0 # print resolved test environment
$0 --exports # print source-able export lines
$0 --value KEY # print one resolved value
Tracked keys:
APP_URL PORT SERVER_URL AUTH_TRUSTED_ORIGINS SPA_PORT MOBILE_SPA_PORT DESKTOP_PORT
EOF
}
ROOT="$(workspace_root)"
apply_env_files "$ROOT"
apply_shell_overrides
resolve_defaults
case "${1:-}" in
"")
print_human "$ROOT"
;;
--exports)
print_exports
;;
--value)
print_value "${2:-}"
;;
-h|--help)
usage
;;
*)
echo "unknown option: $1" >&2
usage >&2
exit 2
;;
esac
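The precedence described in the header comment (later env files override earlier ones, then the shell wins) can be sketched as a small last-writer-wins loop. This is a simplified illustration rather than the script's parser: it handles only bare `PORT=value` lines, and the temp directory, file contents, and the helper name `resolve_port` are invented for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

NODE_ENV="${NODE_ENV:-development}"
root="$(mktemp -d)"
trap 'rm -rf "$root"' EXIT

# Invented sample files; only two of the four candidates exist.
printf 'PORT=3010\n' > "$root/.env"
printf 'PORT=4123\n' > "$root/.env.local"

# Hypothetical helper: print the winning PORT and where it came from.
resolve_port() {
  local resolved="" source="none" file line
  for file in ".env" ".env.$NODE_ENV" ".env.local" ".env.$NODE_ENV.local"; do
    [[ -f "$root/$file" ]] || continue
    while IFS= read -r line || [[ -n "$line" ]]; do
      if [[ "$line" == PORT=* ]]; then
        resolved="${line#PORT=}"
        source="$file"
      fi
    done < "$root/$file"
  done
  # A value set in the shell environment overrides every file.
  if [[ -n "${PORT+x}" ]]; then
    resolved="$PORT"
    source="shell"
  fi
  printf '%s (%s)\n' "$resolved" "$source"
}

unset PORT
from_files="$(resolve_port)"
from_shell="$(PORT=5000 resolve_port)"
printf '%s\n%s\n' "$from_files" "$from_shell"
```

With the sample files, `.env.local` beats `.env`, and an exported `PORT` beats both, which is the same ordering `apply_env_files` followed by `apply_shell_overrides` produces.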
@@ -0,0 +1,57 @@
#!/usr/bin/env bash
# Smoke tests for test-env.sh.
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
fail() {
echo "FAIL: $*" >&2
exit 1
}
assert_eq() {
local actual="$1"
local expected="$2"
[[ "$actual" == "$expected" ]] || fail "expected '$expected', got '$actual'"
}
assert_contains() {
local file="$1"
local text="$2"
grep -Fq "$text" "$file" || fail "expected '$text' in $file"
}
tmp_dir="$(mktemp -d)"
trap 'rm -rf "$tmp_dir"' EXIT
mkdir -p "$tmp_dir/lobehub-cloud-1/.agents/skills" "$tmp_dir/lobehub/.agents/skills"
ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing"
ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub/.agents/skills/agent-testing"
cloud_script="$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing/scripts/test-env.sh"
oss_script="$tmp_dir/lobehub/.agents/skills/agent-testing/scripts/test-env.sh"
assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:3021"
assert_eq "$("$cloud_script" --value SPA_PORT)" "9801"
assert_eq "$("$cloud_script" --value MOBILE_SPA_PORT)" "3811"
assert_eq "$("$cloud_script" --value DESKTOP_PORT)" "3031"
assert_eq "$("$oss_script" --value SERVER_URL)" "http://localhost:3010"
cat > "$tmp_dir/lobehub-cloud-1/.env" << 'EOF'
APP_URL=http://localhost:4123
PORT=4123
AUTH_TRUSTED_ORIGINS=http://localhost:4123,http://localhost:9823
SPA_PORT=9823
MOBILE_SPA_PORT=3823
DESKTOP_PORT=3043
EOF
assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:4123"
assert_eq "$("$cloud_script" --value SPA_PORT)" "9823"
"$cloud_script" --exports > "$tmp_dir/exports.out"
assert_contains "$tmp_dir/exports.out" "export APP_URL=http://localhost:4123"
assert_contains "$tmp_dir/exports.out" "export SERVER_URL=http://localhost:4123"
assert_contains "$tmp_dir/exports.out" "export AUTH_TRUSTED_ORIGINS=http://localhost:4123\\,http://localhost:9823"
echo "test-env tests passed"
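The ports asserted above follow from base-plus-offset arithmetic on the workspace directory name. A minimal sketch of that rule, using the bases and fallbacks from `resolve_defaults`; the function name `port_for` is hypothetical and folds `workspace_offset` and `default_port` into one step, and it skips the parent-directory lookup that `workspace_root` performs.

```shell
#!/usr/bin/env bash
set -euo pipefail

# port_for NAME BASE FALLBACK (hypothetical helper)
#   lobehub-cloud      -> BASE
#   lobehub-cloud-<N>  -> BASE + N, with N read as decimal
#   anything else      -> FALLBACK
port_for() {
  local name="$1" base="$2" fallback="$3" suffix
  case "$name" in
    lobehub-cloud)
      printf '%s\n' "$base"
      ;;
    lobehub-cloud-*)
      suffix="${name#lobehub-cloud-}"
      if [[ "$suffix" =~ ^[0-9]+$ ]]; then
        # 10# forces base 10 so a suffix like "08" is not parsed as octal.
        printf '%s\n' "$((base + 10#$suffix))"
      else
        printf '%s\n' "$fallback"
      fi
      ;;
    *)
      printf '%s\n' "$fallback"
      ;;
  esac
}

port_for lobehub-cloud-1 3020 3010    # app port for workspace 1
port_for lobehub-cloud-1 9800 9876    # SPA port for workspace 1
port_for lobehub-cloud-08 3020 3010   # leading zero handled as decimal
port_for lobehub 3020 3010            # plain OSS checkout uses the fallback
```

Non-numeric suffixes such as `lobehub-cloud-dev` fall back as well, matching the empty offset that `workspace_offset` prints for them.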
@@ -10,23 +10,32 @@ backend-only changes prefer [../cli/index.md](../cli/index.md).
## Prerequisites
- Complete [Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) (resolve ports) and [Step -1](../SKILL.md#step--1--plan-approval-for-non-trivial-tests) (plan approval) first.
- Local dev server running — [../references/dev-server.md](../references/dev-server.md)
- Web auth injected into agent-browser — [../references/auth.md](../references/auth.md):
- Web auth verified in agent-browser — prefer `setup-auth.sh web-seed`, see [auth decision flow](../references/auth.md#web--decision-flow).
## Option A — agent-browser with seeded auth (recommended)
```bash
pbpaste | ./.agents/skills/agent-testing/scripts/setup-auth.sh web # after copying the Cookie header
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
```
## Option A — agent-browser with injected auth (recommended)
Then drive the verified session:
```bash
SESSION=lobehub-dev
agent-browser --session $SESSION open "http://localhost:3010/"
agent-browser --session $SESSION open "$SERVER_URL/"
agent-browser --session $SESSION snapshot -i
# interact via refs — full command reference: ../references/agent-browser.md
```
Use this session as the evidence source. Do not use ordinary Chrome screenshots
or Chrome Network records as proof for Web tests; ordinary Chrome is only a
fallback source for copying cookies into agent-browser when the seeded login is
not available.
### Watch the API while driving the UI
```bash
+7
@@ -53,6 +53,12 @@ For Modal specifically, see the dedicated **modal** skill — use the imperative
| Layout | Center, DraggablePanel, Flexbox, Grid, Header, MaskShadow |
| Navigation | Burger, Menu, SideNav, Tabs |
## Loading indicators
**Do NOT use antd `Spin` / `<Spin />`.** Use a project loader
(`NeuralNetworkLoading`, `DotsLoading`, …) — see the **ux** skill ("Loading
visuals") for the component table and when to use each.
## State
When a feature component manages more than 3 pieces of state (`useState`/`useReducer`/derived state), extract the logic into a custom hook (e.g. `useXxx`). Keep the component focused on rendering — the hook holds state and handlers, so logic can be unit-tested without rendering the component.
@@ -112,6 +118,7 @@ errorElement: <ErrorBoundary />;
| ------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| Using `next/link` in SPA | Use `react-router-dom` `Link` |
| Using antd directly | Use `@lobehub/ui/base-ui` first, then `@lobehub/ui` |
| antd `Spin` / `<Spin />` for loading | Use `NeuralNetworkLoading` / project loaders (see the **ux** skill) |
| `import { Select } from '@lobehub/ui'` | `import { Select } from '@lobehub/ui/base-ui'` |
| `import { Modal } from '@lobehub/ui'` + `<Modal open>` declarative | `createModal` / `confirmModal` from `@lobehub/ui/base-ui` (see modal skill) |
| `import { DropdownMenu/Popover/Switch } from '@lobehub/ui'` | Import same name from `@lobehub/ui/base-ui` instead |
+3
@@ -43,6 +43,9 @@ cd packages/database && TEST_SERVER_DB=1 bunx vitest run --silent='passed-only'
2. **Tests must pass type check** - Run `bun run type-check` after writing tests
3. **After 1-2 failed fix attempts, stop and ask for help**
4. **Test behavior, not implementation details**
5. **Regression tests for bug fixes** - After fixing a bug, add a regression test that fails before the fix and passes after, to prevent recurrence
6. **No new component tests** - Only update existing React component tests. Complex logic should be extracted into hooks and tested there instead
7. **All source changes before any test changes** - Complete all source file edits first, then update tests in a separate pass. Interleaving disrupts reasoning about the source changes, especially across many files
## Basic Test Structure
+176
@@ -0,0 +1,176 @@
---
name: ux
description: 'LobeHub product design values / principles / checklists. Load this skill whenever the work touches user-interface features or implementation — designing or building any user-facing flow — to get better UX results.'
user-invocable: false
---
# UX — Design Values & Execution Checklists
How LobeHub products should feel, and concrete rules to get there. Use this when
**building or reviewing** any user-facing flow. For component/styling choices see
**react**, for wording see **microcopy**, for imperative modal wiring see **modal**.
## Design values (设计价值观)
LobeHub follows four product design values — **自然 Natural・意义感 Meaningful・
确定性 Certainty・生长性 Growth**. Read them before designing:
**[references/design-values.md](references/design-values.md)** (definitions +
conflict priority).
> The checklists below are the execution layer. Each item is tagged with the
> value(s) it serves; for what those values mean, see the file above.
## 1. Flow & momentum (操作链路)・自然・意义感
Every action chain must **push the user forward**, never dead-end or block the flow.
- [ ] **Forward momentum** — after any operation, lead the user to the next step,
don't just stop. _(意义感)_
- [ ] **Success state = primary "go to result", secondary "dismiss"** — the strong
button is the forward action (take me to the result); "Done" is the weak/
secondary button. ✅ After moving topics: primary = "Go to «target»", secondary
\= "Done". _(意义感・自然)_
- [ ] **Bulk ⇄ single-item parity** — an action on a multi-select toolbar must also
be reachable on a single item (its context menu), and vice versa. _(确定性)_
- [ ] **Confirm → in-progress → done, in one surface** — bulk/irreversible/async
ops use a modal state machine: a confirm step stating exactly what happens →
an in-progress view with **dismissal locked** → a done (or error) view in the
same modal. Never fire-and-forget with only a toast; never leave a dead
spinner. _(确定性・意义感)_
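The confirm → in-progress → done flow above can be sketched as a small state machine. This is a minimal illustration, not the project's modal wiring: `ModalState`, `ModalEvent`, `transition`, and `canDismiss` are hypothetical names, and the real implementation lives behind the **modal** skill's `createModal`.

```ts
// Hypothetical states for a bulk / irreversible / async operation modal.
type ModalState =
  | { step: 'confirm' }
  | { step: 'inProgress' }
  | { step: 'done' }
  | { reason: string; step: 'error' };

// Hypothetical events that move the modal between states.
type ModalEvent =
  | { type: 'CONFIRM' }
  | { type: 'SUCCEED' }
  | { reason: string; type: 'FAIL' }
  | { type: 'RETRY' };

// Pure transition function: unknown events leave the state unchanged,
// so the modal can never jump to a state the flow does not define.
const transition = (state: ModalState, event: ModalEvent): ModalState => {
  switch (state.step) {
    case 'confirm': {
      return event.type === 'CONFIRM' ? { step: 'inProgress' } : state;
    }
    case 'inProgress': {
      if (event.type === 'SUCCEED') return { step: 'done' };
      if (event.type === 'FAIL') return { reason: event.reason, step: 'error' };
      return state;
    }
    case 'error': {
      // The error view offers a retry path instead of a dead end.
      return event.type === 'RETRY' ? { step: 'inProgress' } : state;
    }
    case 'done': {
      return state;
    }
  }
};

// Dismissal is locked only while the operation is running.
const canDismiss = (state: ModalState): boolean => state.step !== 'inProgress';
```

Keeping the transition pure means the flow can be unit-tested without rendering the modal, which matches the "extract logic into a hook" guidance in the **react** skill.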
## 2. States: empty / loading / error (状态设计)・意义感・确定性
Every data surface has **four** states — design all of them, not just "has data".
- [ ] **Empty state is a purpose-built page, not a blank screen.** It explains what
this is, why it's empty, and gives a clear next action (CTA + value props).
✅ Devices: an empty "Connect your first device" page with primary/secondary
connect paths and "what you can do once connected" cards — ❌ not a bare title
over skeleton rows or a blank body. _(意义感)_
- [ ] **Distinguish the empty variants** — "no data yet" (onboarding CTA) vs
"no match for filters" (clear-filters affordance) are different screens. _(确定性)_
- [ ] **Loading state** designed (skeleton / NeuralNetworkLoading), not a flash of
blank or layout shift. _(自然)_
- [ ] **Error state** designed — surface the reason and a retry/back path. _(意义感)_
## 3. Buttons & focus (按钮与焦点)・确定性
- [ ] **One primary button per surface.** The single primary CTA tells the user the
core action; everything else is secondary/tertiary. Never a pile of primary
buttons competing for attention. _(确定性)_
## 4. Lists at scale (列表与规模)・确定性・自然
A list/data page must be designed for its **whole range of sizes**, not just the
demo data.
- [ ] **Walk the scale: 1 / 2 / 5 / 20 / 100 / 1k-10k rows.** Pick the right
mechanism per range — plain render → load-more / pagination → virtual scroll;
add batch-select / bulk actions once counts get large. _(确定性)_
- [ ] **Co-design empty / loading / error with the data state** (see §2). A list
isn't done until all four render well. _(自然)_
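One way to make "walk the scale" concrete is to decide the rendering mechanism from the row count. The sketch below is illustrative only: `planList` and every threshold in it are assumptions, not values taken from the codebase, and should be tuned per surface.

```ts
type ListMechanism = 'plain' | 'paginated' | 'virtual';

interface ListPlan {
  bulkActions: boolean;
  mechanism: ListMechanism;
}

// Illustrative thresholds (assumptions, not project constants).
const PAGINATE_FROM = 20;
const VIRTUALIZE_FROM = 1000;
const BULK_ACTIONS_FROM = 20;

// Map a row count to a rendering mechanism and whether to offer bulk actions.
const planList = (rowCount: number): ListPlan => {
  let mechanism: ListMechanism = 'plain';
  if (rowCount >= VIRTUALIZE_FROM) mechanism = 'virtual';
  else if (rowCount >= PAGINATE_FROM) mechanism = 'paginated';

  return { bulkActions: rowCount >= BULK_ACTIONS_FROM, mechanism };
};
```

Walking the checklist sizes through `planList` (1, 2, 5, 20, 100, 1k, 10k) is a quick way to check that each range has a deliberate answer rather than whatever the demo data happened to exercise.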
## 5. Option visibility (选项可见性)・确定性・意义感
- [ ] **Pickers list every valid target.** Watch for options dropped by backend
list queries (pagination, `virtual` flags, scope filters) and add them back.
✅ The default "LobeAI" (inbox) agent is `virtual` and excluded from the
sidebar list, so the move picker re-adds it. An empty picker must mean
"genuinely none", never "we filtered out the only option". _(意义感)_
## 6. Loading visuals (Loading 视觉)・自然
**Never use antd `Spin`** — it doesn't match the product's loading visual. Use a
project loader:
| Need | Component |
| --------------------------- | ----------------------------------------------------------------------------- |
| Default loading (in-flight) | `NeuralNetworkLoading` from `@/components/NeuralNetworkLoading` (`size` prop) |
| Inline dots | `DotsLoading` / `BubblesLoading` from `@/components` |
| Branded full-page | `Loading` from `@/components/Loading/BrandTextLoading` |
| List / card placeholder | a skeleton (e.g. `SkeletonList`) |
When in doubt, reach for `NeuralNetworkLoading` — it's the default in-flight
indicator (e.g. modal "in progress" states).
## 7. Discoverability & growth (可发现性与生长)・生长性
The product should grow with the user — deeper power shows up as needs deepen.
- [ ] **Progressive disclosure** — keep the novice path clean; reveal advanced
capabilities as the user gets there, don't dump everything at once. _(生长性・自然)_
- [ ] **Surface related actions at the moment of need** — make the next capability
discoverable in context (e.g. after the first item exists, offer what to do
with it), not buried in a far-off menu. _(生长性・意义感)_
## 8. Entity lifecycle completeness (实体生命周期完整性)・意义感・确定性
The recurring trap: a feature ships only the **display** of a list, but edit /
delete / management are never built — so the user can add something and then be
stuck with it. For every entity a user can see, design its **full lifecycle**:
create / read / update / delete, plus state transitions (enable/disable,
connect/disconnect, install/uninstall). A read-only list the user can't manage
breaks the flow.
**The allowed operation set depends on the entity's source / ownership** — decide
it explicitly _before_ building. Worked example, the tools/connectors list:
| Entity class | Add | Edit | Remove |
| ----------------------------------- | ------- | --------- | ------------------ |
| Official / built-in (skills, tools) | — | — | ✗ not removable |
| Community (installed MCP) | install | configure | uninstall / remove |
| User-custom (custom connector) | create | edit | delete |
- [ ] **No display-only features.** For every listed entity, enumerate CRUD +
lifecycle ops and build the ones that apply. _(意义感)_
- [ ] **Operation set per source/ownership class** — built-in may be read-only;
anything the user _installed_ must be removable; anything the user _created_
must be editable **and** deletable. _(确定性)_
- [ ] **Each item exposes its allowed ops** (hover action / context menu / detail
page), and there's a clear entry point to add/create where applicable. _(自然)_
- [ ] **An intentionally-absent op is a documented decision, not an oversight**
(e.g. official tools can't be deleted — by design). _(确定性)_
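The tools/connectors table above can be written down as data, so the allowed operation set is an explicit decision rather than something scattered across UI conditionals. This is a sketch under assumptions: `EntitySource`, `EntityOp`, `ALLOWED_OPS`, and `canPerform` are hypothetical names, not existing project types.

```ts
// Ownership classes from the worked example (names are illustrative).
type EntitySource = 'builtin' | 'community' | 'custom';

type EntityOp = 'configure' | 'create' | 'delete' | 'edit' | 'install' | 'uninstall';

// One row per entity class, mirroring the Add / Edit / Remove table.
const ALLOWED_OPS: Record<EntitySource, readonly EntityOp[]> = {
  // Official / built-in: read-only by design, not an oversight.
  builtin: [],
  // Community (installed MCP): anything installed must be removable.
  community: ['install', 'configure', 'uninstall'],
  // User-custom: anything created must be editable and deletable.
  custom: ['create', 'edit', 'delete'],
};

// Single lookup used by hover actions, context menus, and detail pages.
const canPerform = (source: EntitySource, op: EntityOp): boolean =>
  ALLOWED_OPS[source].includes(op);
```

An empty array for `builtin` documents the intentionally absent operations in one place, which is what the last checklist item asks for.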
## 9. Capability-gated features・Certainty・Meaningful
A feature can be fully built and still produce a broken result when the selected
model — or its still-loading config — **can't deliver the capability the feature
depends on** (for example, an agentic run on a model without tool calling). This
is usually the user's configuration choice, not a defect; but if the product stays
silent the user reads it as the product being broken. When a feature's success
depends on a capability the current config may lack, the product owes a
**proactive, non-blocking reminder** — a guardrail, not a gate.
- [ ] **Surface the mismatch, don't fail silently.** When a feature needs a model
capability (tool calling, vision, reasoning, long context) the current model
lacks, show a soft inline warning at the point of action — never a hard block
or a modal that stops the user. _(Meaningful)_
- [ ] **Stay reactive.** The reminder clears the moment the user switches to a
capable model — derive it from live state, not a one-shot check. _(Natural)_
- [ ] **Don't warn while config is loading.** A capability that hasn't resolved yet
looks "unsupported"; warning then is a false alarm — exactly the glitch users
mistake for a product bug. Warn only on a _resolved_ unsupported state. _(Certainty)_
- [ ] **Scope to the mode that needs it.** Show only when the capability-dependent
mode is on; one reminder per root cause, never a pile of overlapping notices. _(Natural・Certainty)_
- [ ] **State the problem and the remedy.** The copy says what's wrong _and_ what
the user should do about it. _(Meaningful)_
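The rules above reduce to a small derivation over live state. The sketch below assumes a hypothetical `ModelConfig` shape with an optional `toolCalling` flag, where `undefined` config means "still loading"; the names `resolveToolCalling` and `shouldWarn` are illustrative and not part of the codebase.

```ts
type CapabilityStatus = 'loading' | 'supported' | 'unsupported';

// Hypothetical model config; `undefined` stands for "not resolved yet".
interface ModelConfig {
  toolCalling?: boolean;
}

// Keep "loading" distinct from "unsupported" so an unresolved config
// never produces a false alarm.
const resolveToolCalling = (config: ModelConfig | undefined): CapabilityStatus => {
  if (!config) return 'loading';
  return config.toolCalling ? 'supported' : 'unsupported';
};

// Warn only when the capability-dependent mode is on AND the capability
// has resolved to unsupported. Recomputing this from current state on each
// render makes the reminder clear as soon as the user switches models.
const shouldWarn = (modeEnabled: boolean, status: CapabilityStatus): boolean =>
  modeEnabled && status === 'unsupported';
```

Because `shouldWarn` is derived rather than stored, there is no one-shot check to go stale: switching to a capable model flips the status and the warning disappears on the next render.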
## Quick review checklist
- [ ] Action leads the user forward; success offers a primary "go to result".
- [ ] Bulk action has a single-item entry (and vice versa).
- [ ] Async/bulk/irreversible action: confirm → in-progress (locked) → done/error.
- [ ] Empty / loading / error states are all designed; empty is a real page with a CTA.
- [ ] Exactly one primary button per surface.
- [ ] List designed across 1 → 10k rows (virtual scroll / pagination / batch as needed).
- [ ] Pickers show all valid targets (default/inbox included); empty = truly none.
- [ ] No antd `Spin`; use `NeuralNetworkLoading` / project loaders.
- [ ] Advanced capability is progressively disclosed / discoverable at the moment of need.
- [ ] Listed entities have their full lifecycle (not display-only); ops match source (built-in / installed / custom).
- [ ] Capability-gated feature warns (soft, reactive, load-gated) when the model can't deliver it; copy gives the remedy.
## Related skills
- **modal** — imperative `createModal` state-machine wiring for confirm/progress/done.
- **microcopy** — wording for confirm / done / empty / error states.
- **react** — component priority, `Button` usage, styling.
@@ -0,0 +1,51 @@
# LobeHub Design Values (设计价值观)
The philosophy behind every LobeHub interface. Read this before designing or
reviewing a flow; the per-aspect execution rules live in the parent
[SKILL.md](../SKILL.md) and each checklist item is tagged with the value(s) it serves.
Adapted from Ant Design's design values
(<https://ant.design/docs/spec/values-cn>, <https://zhuanlan.zhihu.com/p/44809866>).
LobeHub adopts all four.
## 自然 (Natural)
Minimise cognitive load. Digital products keep getting more complex while human
attention stays scarce — so the interface should feel as effortless as the
physical world. The next step should be obvious without thinking; the product
proactively carries the user forward (sensible defaults, AI-assisted decisions,
smooth transitions) rather than making them stop and figure things out.
## 意义感 (Meaningful)
Every screen is rooted in the user's real goal, not an isolated feature. Make the
objective clear, give immediate feedback on the result of each action, and always
point at the next meaningful step. Calibrate difficulty — neither a patronising
over-simplification nor an overwhelming wall — so the user keeps a sense of
progress and accomplishment.
## 确定性 (Certainty)
Low-entropy, predictable interactions. Reuse the same patterns, components, and
wording so behaviour is never surprising. Keep a single clear focus per surface,
and design **every** state (empty / loading / error / success) so nothing is left
undefined. Restraint over cleverness: fewer, consistent rules beat many bespoke
ones.
## 生长性 (Growth)
The product grows together with the user. As needs deepen and roles evolve,
surface advanced capabilities progressively and make related features
discoverable at the moment they become relevant — without crowding the novice
path. Bridge product value to the user's changing scenarios and aim for
human-machine symbiosis (人机共生): the user and the agent co-evolve, each making
the other more capable over time.
## Priority when values conflict
For moment-to-moment interaction decisions: **意义感 ≳ 自然 > 确定性** — never
sacrifice the user's goal or forward momentum just to keep things uniform.
**生长性 (Growth)** is a longer-horizon lens: weigh it when shaping how a feature
is discovered and how it scales with the user, not when resolving a single-screen
layout trade-off.
@@ -49,4 +49,4 @@ Migration owner: @{pr-author}
The migration owner is responsible for rollout follow-up and incident handling for this schema change.
> **Note for Claude**: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'` or from commit metadata. Do not hardcode a username.
> \[!NOTE]: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'` or from commit metadata. Do not hardcode a username.
@@ -18,4 +18,4 @@
@{pr-author}
> **Note for Claude**: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'`. Do not hardcode a username.
> \[!NOTE]: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'`. Do not hardcode a username.
@@ -86,7 +86,7 @@ New AI model or provider support, typically contributed via community PRs.
- These PR title prefixes (`feat` / `style`) are in the auto-tag trigger list
- No special branch naming or manual release steps required — merging the PR triggers auto patch +1
### When Claude is involved
### When an agent is involved
If asked to add model support, just create a normal feature PR. The title prefix will trigger the release automatically.
+9
@@ -445,6 +445,15 @@ OPENAI_API_KEY=sk-xxxxxxxxx
# MESSAGE_GATEWAY_URL=https://message-gateway.lobehub.com
# MESSAGE_GATEWAY_SERVICE_TOKEN=your_service_token_here
# #######################################
# ######### Agent Gateway Mode ##########
# #######################################
# Enable Gateway Mode for self-hosted deployments. Requires AGENT_GATEWAY_URL.
# ENABLE_AGENT_GATEWAY=1
# AGENT_GATEWAY_URL=https://agent-gateway.example.com
# AGENT_GATEWAY_SERVICE_TOKEN=your_service_token_here
# #######################################
# ########### Messenger Bot #############
# #######################################
+1 -25
@@ -19,12 +19,6 @@ jobs:
steps:
- uses: actions/checkout@v6
- name: Clean issue notice
uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
with:
actions: 'close-issues'
labels: '🚨 Sync Fail'
- name: Sync upstream changes
id: sync
uses: aormsby/Fork-Sync-With-Upstream-action@v3.4
@@ -33,22 +27,4 @@ jobs:
upstream_sync_branch: main
target_sync_branch: main
target_repo_token: ${{ secrets.GITHUB_TOKEN }} # automatically generated, no need to set
test_mode: false
- name: Sync check
if: failure()
uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
with:
actions: 'create-issue'
title: '🚨 同步失败 | Sync Fail'
labels: '🚨 Sync Fail'
body: |
Due to a change in the workflow file of the [LobeChat][lobechat] upstream repository, GitHub has automatically suspended the scheduled automatic update. You need to manually sync your fork. Please refer to the detailed [Tutorial][tutorial-en-US] for instructions.
由于 [LobeChat][lobechat] 上游仓库的 workflow 文件变更,导致 GitHub 自动暂停了本次自动更新,你需要手动 Sync Fork 一次,请查看 [详细教程][tutorial-zh-CN]
![](https://github-production-user-asset-6210df.s3.amazonaws.com/17870709/273954625-df80c890-0822-4ac2-95e6-c990785cbed5.png)
[lobechat]: https://github.com/lobehub/lobe-chat
[tutorial-zh-CN]: https://lobehub.com/zh/docs/self-hosting/advanced/upstream-sync
[tutorial-en-US]: https://lobehub.com/docs/self-hosting/advanced/upstream-sync
test_mode: false
+2 -3
@@ -59,6 +59,7 @@ bun.lockb
# Build outputs
dist/
public/_spa/
public/_spa-auth/
public/spa/
es/
lib/
@@ -92,10 +93,8 @@ public/swe-worker*
# Generated files
src/app/spa/[variants]/[[...path]]/spaHtmlTemplates.ts
src/app/spa-auth/authHtmlTemplate.ts
public/*.js
public/sitemap.xml
public/sitemap-index.xml
sitemap*.xml
robots.txt
# Git hooks
+2
@@ -136,3 +136,5 @@ bun run type-check
### Code Review
Before reviewing a PR / diff / branch change, read the **review-checklist** skill (`.agents/skills/review-checklist/SKILL.md`) — it lists the recurring mistakes specific to this codebase.
When designing or reviewing user-facing flows (empty/loading/error states, confirmations, async feedback, button hierarchy, lists at scale, pickers), follow the **ux** skill (`.agents/skills/ux/SKILL.md`) — LobeHub's design values (自然 / 意义感 / 确定性 / 生长性) plus per-aspect execution checklists.
+1
@@ -29,6 +29,7 @@
},
"devDependencies": {
"@lobechat/agent-gateway-client": "workspace:*",
"@lobechat/device-control": "workspace:*",
"@lobechat/device-gateway-client": "workspace:*",
"@lobechat/device-identity": "workspace:*",
"@lobechat/heterogeneous-agents": "workspace:*",
+1
@@ -1,5 +1,6 @@
packages:
- '../../packages/agent-gateway-client'
- '../../packages/device-control'
- '../../packages/device-gateway-client'
- '../../packages/device-identity'
- '../../packages/heterogeneous-agents'
+41 -5
@@ -2,9 +2,16 @@ import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import {
defaultGetLocalFilePreview,
defaultGetProjectFileIndex,
type DeviceControlDeps,
executeDeviceRpc,
} from '@lobechat/device-control';
import type {
AgentRunRequestMessage,
DeviceSystemInfo,
RpcRequestMessage,
SystemInfoRequestMessage,
ToolCallRequestMessage,
} from '@lobechat/device-gateway-client';
@@ -262,19 +269,23 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
// Handle tool call requests
client.on('tool_call_request', async (request: ToolCallRequestMessage) => {
const { requestId, timeout, toolCall } = request;
const { operationId, requestId, timeout, toolCall } = request;
if (isDaemonChild) {
appendLog(`[TOOL] ${toolCall.apiName} (${requestId})`);
appendLog(
`[TOOL] ${toolCall.apiName}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
);
} else {
log.toolCall(toolCall.apiName, requestId, toolCall.arguments);
log.toolCall(toolCall.apiName, requestId, toolCall.arguments, operationId);
}
const result = await executeToolCall(toolCall.apiName, toolCall.arguments, timeout);
if (isDaemonChild) {
appendLog(`[RESULT] ${result.success ? 'OK' : 'FAIL'} (${requestId})`);
appendLog(
`[RESULT] ${result.success ? 'OK' : 'FAIL'}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
);
} else {
log.toolResult(requestId, result.success, result.content);
log.toolResult(requestId, result.success, result.content, operationId);
}
client.sendToolCallResponse({
@@ -288,6 +299,31 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
});
});
// Handle generic server-internal device RPCs (git / workspace / file ops).
// Shares the `@lobechat/device-control` dispatcher with the desktop app so the
// CLI exposes the same remote-device control surface. File preview / index use
// the package's portable defaults (no preview-protocol approval on the CLI).
const deviceControlDeps: DeviceControlDeps = {
getLocalFilePreview: defaultGetLocalFilePreview,
getProjectFileIndex: defaultGetProjectFileIndex,
};
client.on('rpc_request', async (request: RpcRequestMessage) => {
const { method, params, requestId } = request;
if (isDaemonChild) appendLog(`[RPC] ${method} (${requestId})`);
else info(`Received rpc_request: method=${method} (${requestId})`);
try {
const data = await executeDeviceRpc(method, params, deviceControlDeps);
client.sendRpcResponse({ requestId, result: { data, success: true } });
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
if (isDaemonChild) appendLog(`[RPC ERROR] ${method}: ${message} (${requestId})`);
else error(`rpc_request method=${method} failed: ${message}`);
client.sendRpcResponse({ requestId, result: { error: message, success: false } });
}
});
// Handle gateway-dispatched agent runs (heterogeneous agents, e.g. Claude
// Code). Mirrors the desktop app: spawn `lh hetero exec`, which owns the full
// execution + server-ingest pipeline. Ack with the spawn outcome — `accepted`
+47
@@ -649,6 +649,53 @@ describe('hetero exec command', () => {
]);
});
it('finishes with result "error" when a terminal error event is pushed despite a clean exit', async () => {
// CC relays an API/rate-limit error as an in-stream `error` event but still
// exits 0. The finish result must NOT be derived from the exit code alone,
// otherwise the topic/task is wrongly marked completed.
mockSpawnAgent.mockReturnValue(
createFakeHandle({
events: [
{
data: {
error: 'API Error: Server is temporarily limiting requests · Rate limited',
message: 'API Error: Server is temporarily limiting requests · Rate limited',
},
operationId: 'op-err',
stepIndex: 0,
timestamp: 1,
type: 'error',
},
],
exitCode: 0,
}),
);
await runCmd([
'hetero',
'exec',
'--type',
'claude-code',
'--prompt',
'hi',
'--topic',
'topic-1',
'--operation-id',
'op-err',
'--render',
'none',
]);
expect(mockHeteroFinishMutate).toHaveBeenCalledTimes(1);
expect(mockHeteroFinishMutate.mock.calls[0][0]).toMatchObject({
error: {
message: 'API Error: Server is temporarily limiting requests · Rate limited',
type: 'AgentRuntimeError',
},
result: 'error',
});
});
it('resets the per-message text accumulator at message boundaries (no cross-message duplication)', async () => {
// The `replace` snapshot accumulator must not span
// message boundaries. Two assistant messages separated by a
+35 -7
@@ -467,6 +467,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
* sessionId — CC session id from `system.init` (undefined on resume failure)
* ingestError — true when a batch could not be flushed after retries
* resumeNotFound — true when a resume-not-found error was intercepted
* sawTerminalError — true when a terminal `error` event was pushed to the
* ingester (CC can relay an API/rate-limit error this way
* and still exit 0, so the exit code alone is not enough)
* terminalErrorMessage — the message from that terminal `error` event, used
* as the task-level error detail in the finish payload
* stderrContent — accumulated stderr (only when interceptResumeErrors=true)
*/
const runOneAgent = async (
@@ -477,9 +482,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
code: number | null;
ingestError: boolean;
resumeNotFound: boolean;
sawTerminalError: boolean;
sessionId: string | undefined;
signal: NodeJS.Signals | null;
stderrContent: string;
terminalErrorMessage: string | undefined;
}> => {
// One raw-dump file pair per spawn attempt (the resume retry is a second
// attempt). The stdout tee runs inside `spawnAgent` before the adapter.
@@ -549,6 +556,8 @@ const exec = async (options: ExecOptions): Promise<void> => {
// into the ingester. When intercepting resume errors, a matching
// `error` event is withheld from the ingester and flags a retry instead.
let resumeNotFound = false;
let sawTerminalError = false;
let terminalErrorMessage: string | undefined;
const ingestError = false;
try {
for await (const event of handle.events) {
@@ -563,6 +572,16 @@ const exec = async (options: ExecOptions): Promise<void> => {
continue;
}
}
// A terminal `error` event (e.g. an API/rate-limit error relayed by CC)
// must mark the run as failed even when the child exits 0 — track it so
// the finish result is not derived from the exit code alone. Capture the
// message too, so the finish payload can surface it as the task-level
// error detail (CC relays these on stdout, not stderr).
if (event.type === 'error') {
sawTerminalError = true;
const data = event.data as Record<string, unknown> | undefined;
terminalErrorMessage = String(data?.message ?? data?.error ?? '') || undefined;
}
if (emitJsonl) process.stdout.write(`${JSON.stringify(event)}\n`);
serverIngester?.push(event);
}
@@ -608,9 +627,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
code,
ingestError,
resumeNotFound,
sawTerminalError,
sessionId: handle.sessionId,
signal,
stderrContent,
terminalErrorMessage,
};
};
@@ -675,16 +696,23 @@ const exec = async (options: ExecOptions): Promise<void> => {
result = { ...result, ingestError: true };
}
const exitedClean = !result.ingestError && (code === 0 || signal === 'SIGTERM');
// CC relays API/rate-limit errors as an in-stream terminal `error` event but
// still exits 0, so the exit code alone would report `success`. Treat any
// pushed terminal error as a failed run so the topic/task is marked failed.
const exitedClean =
!result.ingestError && !result.sawTerminalError && (code === 0 || signal === 'SIGTERM');
// When the run failed, pass stderr as the error detail so the server can
// surface a useful message instead of the generic "Agent execution failed"
// fallback. Trim to the last 1 KB — the tail is most informative and
// keeps the tRPC payload small.
// When the run failed, pass an error detail so the server surfaces a useful
// message instead of the generic "Agent execution failed" fallback. Prefer
// the in-stream terminal error (CC relays API/rate-limit errors here while
// exiting 0, so stderr is empty); otherwise fall back to the stderr tail.
// Trim to the last 1 KB — the tail is most informative and keeps the tRPC
// payload small.
const stderrTail = result.stderrContent.trim();
const errorDetail = result.terminalErrorMessage || stderrTail;
const finishError =
!exitedClean && stderrTail
? { message: stderrTail.slice(-1024), type: 'AgentRuntimeError' }
!exitedClean && errorDetail
? { message: errorDetail.slice(-1024), type: 'AgentRuntimeError' }
: undefined;
try {
+6 -5
@@ -1,4 +1,3 @@
/* eslint-disable no-console */
import pc from 'picocolors';
let verbose = false;
@@ -41,18 +40,20 @@ export const log = {
console.log(`${timestamp()} ${pc.bold('[STATUS]')} ${color(status)}`);
},
toolCall: (apiName: string, requestId: string, args?: string) => {
toolCall: (apiName: string, requestId: string, args?: string, operationId?: string) => {
console.log(
`${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)} ${pc.dim(`(${requestId})`)}`,
`${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
);
if (args && verbose) {
console.log(` ${pc.dim(args)}`);
}
},
toolResult: (requestId: string, success: boolean, content?: string) => {
toolResult: (requestId: string, success: boolean, content?: string, operationId?: string) => {
const icon = success ? pc.green('OK') : pc.red('FAIL');
console.log(`${timestamp()} ${pc.magenta('[RESULT]')} ${icon} ${pc.dim(`(${requestId})`)}`);
console.log(
`${timestamp()} ${pc.magenta('[RESULT]')} ${icon}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
);
if (content && verbose) {
const preview = content.length > 200 ? content.slice(0, 200) + '...' : content;
console.log(` ${pc.dim(preview)}`);
+1
@@ -56,6 +56,7 @@
"@electron-toolkit/utils": "^4.0.0",
"@lobechat/chat-adapter-imessage": "workspace:*",
"@lobechat/desktop-bridge": "workspace:*",
"@lobechat/device-control": "workspace:*",
"@lobechat/device-gateway-client": "workspace:*",
"@lobechat/device-identity": "workspace:*",
"@lobechat/electron-client-ipc": "workspace:*",
+1
@@ -8,6 +8,7 @@ packages:
- '../../packages/electron-client-ipc'
- '../../packages/file-loaders'
- '../../packages/desktop-bridge'
- '../../packages/device-control'
- '../../packages/device-gateway-client'
- '../../packages/device-identity'
- '../../packages/local-file-shell'
@@ -3,6 +3,7 @@ import fs from 'node:fs';
import os from 'node:os';
import path from 'node:path';
import { type DeviceControlDeps, executeDeviceRpc as runDeviceRpc } from '@lobechat/device-control';
import type {
AgentRunRequestMessage,
GatewayMcpStdioParams,
@@ -13,11 +14,8 @@ import type {
GetCommandOutputParams,
GlobFilesParams,
GrepContentParams,
InitWorkspaceParams,
KillCommandParams,
ListLocalFileParams,
ListProjectSkillsParams,
LocalFilePreviewUrlParams,
LocalReadFileParams,
LocalReadFilesParams,
LocalSearchFilesParams,
@@ -30,15 +28,16 @@ import { type ILocalSystemService, LocalSystemExecutionRuntime } from '@lobechat
import GatewayConnectionService from '@/services/gatewayConnectionSrv';
import ImessageBridgeService from '@/services/imessageBridgeSrv';
import { createLogger } from '@/utils/logger';
import GitCtr from './GitCtr';
import HeterogeneousAgentCtr from './HeterogeneousAgentCtr';
import { ControllerModule, IpcMethod } from './index';
import LocalFileCtr from './LocalFileCtr';
import McpCtr from './McpCtr';
import RemoteServerConfigCtr from './RemoteServerConfigCtr';
import ShellCommandCtr from './ShellCommandCtr';
import WorkspaceCtr from './WorkspaceCtr';
const logger = createLogger('controllers:GatewayConnectionCtr');
/**
* Inject the lh-notify protocol into the first turn of a new hetero-agent session.
@@ -167,14 +166,6 @@ export default class GatewayConnectionCtr extends ControllerModule {
return this.app.getController(LocalFileCtr);
}
private get workspaceCtr() {
return this.app.getController(WorkspaceCtr);
}
private get gitCtr() {
return this.app.getController(GitCtr);
}
private get shellCommandCtr() {
return this.app.getController(ShellCommandCtr);
}
@@ -353,91 +344,33 @@ export default class GatewayConnectionCtr extends ControllerModule {
return this.localSystemRuntime;
}
/**
* Platform-specific handlers the shared `@lobechat/device-control` dispatcher
* delegates to. Git + workspace-scan methods run inside device-control over
* `@lobechat/local-file-shell`; only file preview / index (and preview
* approval) are desktop-specific and routed back to the controllers here.
*/
private get deviceControlDeps(): DeviceControlDeps {
return {
approveProjectRoot: async (root) => {
try {
await this.app.localFileProtocolManager.approveIndexedProjectRoot(root);
} catch (error) {
logger.error(`Failed to approve project preview root ${root}:`, error);
}
},
getLocalFilePreview: (params) => this.localFileCtr.getLocalFilePreview(params),
getProjectFileIndex: (params) => this.localFileCtr.getProjectFileIndex(params),
};
}
/**
* Dispatch a generic server-internal device RPC (not an agent tool call) by
* method name. Currently only `initWorkspace` (scan the bound project root for
* skills + AGENTS.md); add new server-only device methods here.
* method name. The dispatch logic lives in `@lobechat/device-control` so the
* desktop main process and the CLI daemon share one device RPC surface.
*/
private async executeDeviceRpc(method: string, params: unknown): Promise<unknown> {
switch (method) {
case 'initWorkspace': {
return this.workspaceCtr.initWorkspace(params as InitWorkspaceParams);
}
case 'getGitBranch': {
return this.gitCtr.getGitBranch((params as { path: string }).path);
}
case 'getLinkedPullRequest': {
return this.gitCtr.getLinkedPullRequest(params as { branch: string; path: string });
}
case 'getGitWorkingTreeStatus': {
return this.gitCtr.getGitWorkingTreeStatus((params as { path: string }).path);
}
case 'getGitAheadBehind': {
return this.gitCtr.getGitAheadBehind((params as { path: string }).path);
}
case 'listGitBranches': {
return this.gitCtr.listGitBranches((params as { path: string }).path);
}
case 'checkoutGitBranch': {
return this.gitCtr.checkoutGitBranch(
params as { branch: string; create?: boolean; path: string },
);
}
case 'pullGitBranch': {
return this.gitCtr.pullGitBranch(params as { path: string });
}
case 'pushGitBranch': {
return this.gitCtr.pushGitBranch(params as { path: string });
}
case 'getGitWorkingTreePatches': {
return this.gitCtr.getGitWorkingTreePatches((params as { path: string }).path);
}
case 'getGitWorkingTreeFiles': {
return this.gitCtr.getGitWorkingTreeFiles((params as { path: string }).path);
}
case 'getProjectFileIndex': {
return this.localFileCtr.getProjectFileIndex(params as { scope?: string });
}
case 'getLocalFilePreview': {
return this.localFileCtr.getLocalFilePreview(params as LocalFilePreviewUrlParams);
}
case 'listProjectSkills': {
return this.workspaceCtr.listProjectSkills(params as ListProjectSkillsParams);
}
case 'getGitBranchDiff': {
return this.gitCtr.getGitBranchDiff(params as { baseRef?: string; path: string });
}
case 'listGitRemoteBranches': {
return this.gitCtr.listGitRemoteBranches((params as { path: string }).path);
}
case 'revertGitFile': {
return this.gitCtr.revertGitFile(params as { filePath: string; path: string });
}
case 'statPath': {
return this.workspaceCtr.statPath(params as { path: string });
}
default: {
throw new Error(`Unknown device RPC method: ${method}`);
}
}
return runDeviceRpc(method, params, this.deviceControlDeps);
}
private async executeToolCall(
File diff suppressed because it is too large
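Reviewer sketch for the `GatewayConnectionCtr` hunk above: the long `switch` is replaced by a shared dispatcher that receives only the host-specific handlers as injected deps. The snippet below illustrates that pattern under stated assumptions. `createDeviceRpc`, `HostDeps` and the `ping` method are hypothetical names invented for illustration, not the `@lobechat/device-control` API, and the handlers are synchronous for brevity whereas the real dispatch returns a Promise.

```typescript
// Hypothetical sketch of "shared dispatcher + injected host deps".
// None of these names come from @lobechat/device-control.
type Handler = (params: unknown) => unknown;

interface HostDeps {
  // Host-specific handler (desktop main process or CLI daemon supplies it).
  getProjectFileIndex: (params: { scope?: string }) => string[];
}

const createDeviceRpc = (deps: HostDeps) => {
  const handlers = new Map<string, Handler>([
    // Routed back to the host through the injected deps.
    ['getProjectFileIndex', (params) => deps.getProjectFileIndex(params as { scope?: string })],
    // Shared handler that behaves the same on every host.
    ['ping', () => 'pong'],
  ]);

  return (method: string, params: unknown): unknown => {
    const handler = handlers.get(method);
    // Same failure mode as the removed `default` branch.
    if (!handler) throw new Error(`Unknown device RPC method: ${method}`);
    return handler(params);
  };
};

// A desktop-like host wires in its own file index implementation.
const desktopRpc = createDeviceRpc({
  getProjectFileIndex: ({ scope }) => [`${scope ?? '.'}/README.md`],
});
```

A `Map` is used instead of a plain object so that method names such as `toString` cannot resolve to inherited properties.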
@@ -23,7 +23,7 @@ import type {
HeteroExecImageRef,
} from '@lobechat/heterogeneous-agents/protocol';
import { buildHeteroExecStdinPayload } from '@lobechat/heterogeneous-agents/protocol';
import type { AgentStreamEvent } from '@lobechat/heterogeneous-agents/spawn';
import type { AgentStreamEvent, UsageData } from '@lobechat/heterogeneous-agents/spawn';
import {
AgentStreamPipeline,
buildAgentInput,
@@ -188,6 +188,21 @@ interface AgentSession {
modelVerificationLastAttemptAt?: number;
modelVerificationLastAttemptSessionId?: string;
process?: ChildProcess;
/**
* Absolute CLI path resolved by spawn preflight detection. Used for spawn()
* when the configured command is bare: detection can find the CLI through
* the login-shell PATH or a well-known install location (e.g. the Codex.app
* bundled CLI) that plain spawn() with the inherited env can't resolve.
*/
resolvedCommandPath?: string;
/**
* PATH the preflight detector used to resolve `resolvedCommandPath`, set only
* when it fell back to the login-shell PATH. Merged into the child PATH at
* spawn so a `#!/usr/bin/env node` shim still finds its interpreter — the
* shim resolving in preflight doesn't guarantee `node` is on the leaner
* inherited PATH (Finder-launched Electron).
*/
resolvedCommandSearchPath?: string;
resumeSessionId?: string;
sessionId: string;
verifiedModel?: string;
@@ -470,11 +485,20 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
session.agentType === 'claude-code' ? 'claude-code' : 'codex',
command,
);
const cliMissingError = this.buildCliMissingError(session);
if (!status || status.available || !cliMissingError) return;
if (!status || status.available) {
// Spawn through the detector-resolved absolute path when the configured
// command is bare — detection may have located the CLI somewhere plain
// spawn() can't (login-shell PATH, Codex.app bundled CLI, …).
const useResolvedPath = Boolean(status?.path) && !command.includes(path.sep);
session.resolvedCommandPath = useResolvedPath ? status!.path : undefined;
// Carry the login-shell PATH the detector resolved through, so a
// `#!/usr/bin/env node` shim spawned by absolute path still finds `node`.
session.resolvedCommandSearchPath = useResolvedPath ? status!.resolvedPathEnv : undefined;
return;
}
return cliMissingError;
return this.buildCliMissingError(session);
}
private get shouldTraceCliOutput(): boolean {
@@ -911,6 +935,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
let spawnPlan;
let traceSession;
let cwd: string;
let initialCumulativeUsage: UsageData | undefined;
let spawnEnv: NodeJS.ProcessEnv;
try {
const driver = getHeterogeneousAgentDriver(session.agentType);
@@ -934,7 +959,12 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
// Forward the user's proxy settings to the CLI. The main-process undici
// dispatcher doesn't reach child processes — they need env vars.
const proxyEnv = buildProxyEnv(this.app.storeManager.get('networkProxy'));
spawnEnv = { ...buildInheritedSpawnEnv(), ...proxyEnv, ...session.env };
const inheritedEnv = buildInheritedSpawnEnv();
// When preflight resolved the CLI via the login-shell PATH, spawn with
// that PATH (a superset of the inherited one) so a `#!/usr/bin/env node`
// shim finds its interpreter. `session.env` still wins if it sets PATH.
if (session.resolvedCommandSearchPath) inheritedEnv.PATH = session.resolvedCommandSearchPath;
spawnEnv = { ...inheritedEnv, ...proxyEnv, ...session.env };
if (session.agentType === 'codex') {
const initialModel = await resolveCodexInitialModel({
@@ -945,6 +975,12 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
session.model = initialModel.model;
session.modelSource = initialModel.source;
}
if (session.agentSessionId) {
initialCumulativeUsage = (
await readCodexSessionModel(session.agentSessionId, { env: spawnEnv })
)?.cumulativeUsage;
}
}
traceSession = await this.createCliTraceSession({
@@ -966,7 +1002,10 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
}
const useStdin = spawnPlan.stdinPayload !== undefined;
const cliArgs = spawnPlan.args;
const resolvedCliSpawnPlan = await resolveCliSpawnPlan(session.command, cliArgs);
const resolvedCliSpawnPlan = await resolveCliSpawnPlan(
session.resolvedCommandPath ?? session.command,
cliArgs,
);
logger.info(
'Spawning agent:',
@@ -1001,6 +1040,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
reject,
resolve,
session,
initialCumulativeUsage,
spawnEnv,
traceSession,
useStdin,
@@ -1070,6 +1110,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
private handleSpawnedAgentProcess({
cwd,
initialCumulativeUsage,
intervention,
params,
proc,
@@ -1088,6 +1129,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
reject: (reason?: unknown) => void;
resolve: () => void;
session: AgentSession;
initialCumulativeUsage?: UsageData | undefined;
spawnEnv: NodeJS.ProcessEnv;
spawnPlan: HeterogeneousAgentBuildPlan;
traceSession: CliTraceSession | undefined;
@@ -1128,6 +1170,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
const pipeline = new AgentStreamPipeline({
agentType: session.agentType,
cwd,
initialCumulativeUsage,
initialModel: session.model,
operationId: params.operationId,
});
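Reviewer sketch for the `HeterogeneousAgentCtr` hunks above: the preflight result decides both which command is spawned and which PATH the child sees. The function below restates those two rules as a pure function so the precedence is easy to check. `planSpawn`, `DetectStatus` and `SpawnTarget` are hypothetical names, and the path separator is passed as a parameter (defaulting to `/`) instead of reading `path.sep`; this is an illustration of the logic in the diff, not the controller's code.

```typescript
// Hypothetical sketch of the spawn-target rules shown in the diff.
type Env = Record<string, string | undefined>;

interface DetectStatus {
  available: boolean;
  path?: string;
  resolvedPathEnv?: string;
}

interface SpawnTarget {
  command: string;
  env: Env;
}

const planSpawn = (
  command: string,
  status: DetectStatus | undefined,
  inheritedEnv: Env,
  proxyEnv: Env,
  sessionEnv: Env,
  sep = '/', // assumption: POSIX separator; the diff uses path.sep
): SpawnTarget => {
  // Only a bare command (no separator) is swapped for the detector's path.
  const useResolved = Boolean(status?.path) && !command.includes(sep);

  // Copy so the caller's inherited env is not mutated.
  const base: Env = { ...inheritedEnv };
  if (useResolved && status?.resolvedPathEnv) base.PATH = status.resolvedPathEnv;

  return {
    command: useResolved && status?.path ? status.path : command,
    // Later spreads win: session env overrides proxy env overrides base.
    env: { ...base, ...proxyEnv, ...sessionEnv },
  };
};
```

Because `sessionEnv` is spread last, a session that sets its own `PATH` still overrides the login-shell PATH, which matches the comment in the diff.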
@@ -437,11 +437,13 @@ export default class LocalFileCtr extends ControllerModule {
@IpcMethod()
async getLocalFilePreviewUrl({
accept,
path: filePath,
workingDirectory,
}: LocalFilePreviewUrlParams): Promise<LocalFilePreviewUrlResult> {
try {
const url = await this.app.localFileProtocolManager.createPreviewUrl({
accept,
filePath,
workspaceRoot: workingDirectory,
});
@@ -459,11 +461,13 @@ export default class LocalFileCtr extends ControllerModule {
@IpcMethod()
async getLocalFilePreview({
accept,
path: filePath,
workingDirectory,
}: LocalFilePreviewUrlParams): Promise<LocalFilePreviewResult> {
try {
const preview = await this.app.localFileProtocolManager.readPreviewFile({
accept,
filePath,
workspaceRoot: workingDirectory,
});
+16 -207
@@ -1,244 +1,53 @@
import { readdir, readFile, stat } from 'node:fs/promises';
import path from 'node:path';
import {
initWorkspace as runInitWorkspace,
listProjectSkills as runListProjectSkills,
statPath as runStatPath,
type WorkspaceScanDeps,
} from '@lobechat/device-control';
import {
type InitWorkspaceParams,
type InitWorkspaceResult,
type ListProjectSkillsParams,
type ListProjectSkillsResult,
type ProjectSkillItem,
} from '@lobechat/electron-client-ipc';
import { detectRepoType } from '@/utils/git';
import { createLogger } from '@/utils/logger';
import { ControllerModule, IpcMethod } from './index';
const logger = createLogger('controllers:WorkspaceCtr');
const SKILL_FRONTMATTER_RE = /^---\r?\n([\s\S]*?)\r?\n---/;
// Cap recursion to guard against pathological directory trees.
const MAX_SKILL_FILE_COUNT = 1000;
const toPosixRelativePath = (filePath: string) => filePath.split(path.sep).join('/');
const listSkillFilesRecursive = async (dir: string): Promise<string[]> => {
const results: string[] = [];
const stack: string[] = [dir];
while (stack.length > 0 && results.length < MAX_SKILL_FILE_COUNT) {
const current = stack.pop()!;
let entries;
try {
entries = await readdir(current, { withFileTypes: true });
} catch {
continue;
}
for (const entry of entries) {
if (entry.name.startsWith('.')) continue;
const full = path.join(current, entry.name);
if (entry.isDirectory()) {
stack.push(full);
} else if (entry.isFile()) {
results.push(toPosixRelativePath(path.relative(dir, full)));
if (results.length >= MAX_SKILL_FILE_COUNT) break;
}
}
}
return results.sort();
};
// Parse a minimal YAML frontmatter block for SKILL.md files.
// Only handles `key: value` lines; multi-line block scalars fall back to the first line.
const parseSkillFrontmatter = (raw: string): Record<string, string> => {
const match = raw.match(SKILL_FRONTMATTER_RE);
if (!match) return {};
const fields: Record<string, string> = {};
for (const line of match[1].split(/\r?\n/)) {
const colonIdx = line.indexOf(':');
if (colonIdx === -1) continue;
const key = line.slice(0, colonIdx).trim();
if (!key || key.startsWith('#')) continue;
let value = line.slice(colonIdx + 1).trim();
if (value.startsWith('|') || value.startsWith('>')) continue;
if (
(value.startsWith('"') && value.endsWith('"')) ||
(value.startsWith("'") && value.endsWith("'"))
) {
value = value.slice(1, -1);
}
fields[key] = value;
}
return fields;
};
/**
* WorkspaceCtr
*
* Owns "project workspace" scanning: discovering agent skills (`.agents/skills`
* / `.claude/skills`) and project-root instructions (`AGENTS.md` / `CLAUDE.md`)
* under a bound project directory. Split out of LocalFileCtr so the
* workspace/agent-config concern is distinct from generic local file ops.
* Thin IPC layer over `@lobechat/device-control`'s workspace-scan helpers
* (skills discovery under `.agents/skills` / `.claude/skills` + project-root
* instructions). The scan logic is shared with the device-control RPC dispatch
* so the local desktop IPC path, the remote device RPC, and the CLI all run
* identical scans; the desktop-only preview-protocol approval is injected here.
*/
export default class WorkspaceCtr extends ControllerModule {
static override readonly groupName = 'workspace';
/**
* Scan one skill source directory (e.g. `.agents/skills`) under `root` and
* return parsed frontmatter for each `SKILL.md`. Returns `[]` when the source
* directory is absent or unreadable. Unsorted — callers sort/merge.
*/
private async scanSkillsInSource(
root: string,
source: ProjectSkillItem['source'],
): Promise<ProjectSkillItem[]> {
const dir = path.join(root, source);
let entries;
try {
entries = await readdir(dir, { withFileTypes: true });
} catch {
// Directory does not exist or is not readable.
return [];
}
const skills = await Promise.all(
entries
.filter((entry) => entry.isDirectory() || entry.isSymbolicLink())
.map(async (entry) => {
const skillDir = path.join(dir, entry.name);
const skillFile = path.join(skillDir, 'SKILL.md');
try {
const raw = await readFile(skillFile, 'utf8');
const fields = parseSkillFrontmatter(raw);
const files = await listSkillFilesRecursive(skillDir);
return {
description: fields.description || undefined,
fileCount: files.length,
files,
name: fields.name || entry.name,
path: skillFile,
skillDir,
source,
};
} catch {
return null;
}
}),
);
return skills.filter((skill): skill is ProjectSkillItem => skill !== null);
private get scanDeps(): WorkspaceScanDeps {
return { approveProjectRoot: (root) => this.approveProjectRootForPreview(root) };
}
/**
* Scan agent skill directories under the project root and return parsed
* frontmatter for each SKILL.md. Used by the hetero agent's working sidebar
* to surface skills available in the current project. Returns the first
* source directory that yields any skills (`.agents/skills` wins).
*/
@IpcMethod()
async listProjectSkills(params: ListProjectSkillsParams): Promise<ListProjectSkillsResult> {
const root = params.scope;
const sources = ['.agents/skills', '.claude/skills'] as const;
for (const source of sources) {
const skills = (await this.scanSkillsInSource(root, source)).sort((a, b) =>
a.name.localeCompare(b.name),
);
if (skills.length > 0) {
await this.approveProjectRootForPreview(root);
return { root, skills, source };
}
}
return { root, skills: [], source: null };
return runListProjectSkills(params, this.scanDeps);
}
/**
* One-call "workspace init" scan of a bound project directory: merge the
* project skills from BOTH `.agents/skills` and `.claude/skills` (deduped by
* name, `.agents/skills` winning) and read the project-root agent
* instructions file (`AGENTS.md`, else `CLAUDE.md`). Driven server-side at run
* start via the generic device RPC (not an LLM-visible tool) and cached onto
* `devices.workingDirs[].workspace`.
*
* Approves the root for the `lobe-file://` preview protocol (same as
* `listProjectSkills`) so the user can later click through to the scanned
* skills / instructions in the UI.
*/
@IpcMethod()
async initWorkspace(params: InitWorkspaceParams): Promise<InitWorkspaceResult> {
const root = params.scope;
const sources = ['.agents/skills', '.claude/skills'] as const;
const seen = new Set<string>();
const skills: ProjectSkillItem[] = [];
for (const source of sources) {
for (const skill of await this.scanSkillsInSource(root, source)) {
if (seen.has(skill.name)) continue;
seen.add(skill.name);
skills.push(skill);
}
}
skills.sort((a, b) => a.name.localeCompare(b.name));
const instructions = await this.readWorkspaceInstructions(root);
// Approve regardless of what was found — the run is now bound to this root,
// so any later click-through to it should resolve through the preview
// protocol even if the project carries neither skills nor instructions.
await this.approveProjectRootForPreview(root);
return { instructions, root, skills };
return runInitWorkspace(params, this.scanDeps);
}
/**
* Check whether a path exists on this device and is a directory, plus its git
* repo type (`git` / `github` / none). Used to validate a manually-entered
* working directory from a web / remote client (which can't browse this
* device's filesystem) before binding it, and to render the right dir icon.
*/
@IpcMethod()
async statPath(params: {
path: string;
}): Promise<{ exists: boolean; isDirectory: boolean; repoType?: 'git' | 'github' }> {
try {
const stats = await stat(params.path);
if (!stats.isDirectory()) return { exists: true, isDirectory: false };
const repoType = await detectRepoType(params.path);
return { exists: true, isDirectory: true, repoType };
} catch {
return { exists: false, isDirectory: false };
}
}
/**
* Read the project-root agent instructions files. Collects every present
* candidate (`AGENTS.md`, then `CLAUDE.md`) rather than first-match, since both
* can coexist. Each body is capped so a pathologically large file can't bloat
* the cached `workingDirs` payload or the injected system role.
*/
private async readWorkspaceInstructions(
root: string,
): Promise<InitWorkspaceResult['instructions']> {
const MAX_INSTRUCTIONS_BYTES = 64 * 1024;
const candidates = ['AGENTS.md', 'CLAUDE.md'] as const;
const instructions: InitWorkspaceResult['instructions'] = [];
for (const source of candidates) {
try {
const raw = await readFile(path.join(root, source), 'utf8');
const content =
raw.length > MAX_INSTRUCTIONS_BYTES ? raw.slice(0, MAX_INSTRUCTIONS_BYTES) : raw;
instructions.push({ content, source });
} catch {
// File absent or unreadable; skip it.
}
}
return instructions;
return runStatPath(params);
}
private async approveProjectRootForPreview(root: string) {
@@ -480,6 +480,87 @@ describe('HeterogeneousAgentCtr', () => {
expect(spawnCalls).toHaveLength(0);
});
it('spawns through the detector-resolved absolute path when the bare command is off PATH', async () => {
// Codex desktop app case: `codex` is not on PATH, but the preflight
// detector finds the CLI bundled inside Codex.app. Spawning the bare
// command would ENOENT — spawn must use the resolved absolute path.
const resolvedPath = '/Applications/Codex.app/Contents/Resources/codex';
const detect = vi.fn().mockResolvedValue({ available: true, path: resolvedPath });
const { proc } = createFakeProc();
nextFakeProc = proc;
const ctr = new HeterogeneousAgentCtr({
appStoragePath,
storeManager: { get: vi.fn() },
toolDetectorManager: { detect },
} as any);
const { sessionId } = await ctr.startSession({
agentType: 'codex',
command: 'codex',
});
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
expect(spawnCalls[0].command).toBe(resolvedPath);
});
it('carries the detector login-shell PATH into the spawn env for `env node` shims', async () => {
// `codex` resolved via the login-shell PATH (mise/nvm). Spawning the
// absolute shim under the leaner inherited PATH would fail at its
// `#!/usr/bin/env node` shebang — the resolved PATH must reach the child.
const resolvedPath = '/Users/h/.local/share/mise/shims/codex';
const searchPath = '/Users/h/.local/share/mise/shims:/usr/bin:/bin';
const detect = vi
.fn()
.mockResolvedValue({ available: true, path: resolvedPath, resolvedPathEnv: searchPath });
const { proc } = createFakeProc();
nextFakeProc = proc;
const ctr = new HeterogeneousAgentCtr({
appStoragePath,
storeManager: { get: vi.fn() },
toolDetectorManager: { detect },
} as any);
const { sessionId } = await ctr.startSession({ agentType: 'codex', command: 'codex' });
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
expect(spawnCalls[0].command).toBe(resolvedPath);
expect(spawnCalls[0].options.env.PATH).toBe(searchPath);
});
it('keeps an explicit path-like command for spawn instead of the detector result', async () => {
// detectHeterogeneousCliCommand validates the custom path via --version.
execFileMock.mockImplementation(
(
_file: string,
_args: string[],
optionsOrCallback: unknown,
callback?: (error: Error | null, result: { stderr: string; stdout: string }) => void,
) => {
const resolvedCallback =
typeof optionsOrCallback === 'function' ? optionsOrCallback : callback;
(resolvedCallback as any)?.(null, { stderr: '', stdout: 'codex-cli 0.99.0' });
},
);
const detect = vi.fn();
const { proc } = createFakeProc();
nextFakeProc = proc;
const ctr = new HeterogeneousAgentCtr({
appStoragePath,
storeManager: { get: vi.fn() },
toolDetectorManager: { detect },
} as any);
const { sessionId } = await ctr.startSession({
agentType: 'codex',
command: '/custom/bin/codex',
});
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
expect(detect).not.toHaveBeenCalled();
expect(spawnCalls[0].command).toBe('/custom/bin/codex');
});
it('passes prompt via stdin to codex exec instead of argv', async () => {
const prompt = '--run a shell-like prompt safely';
const { cliArgs, command, writes } = await runSendPrompt(prompt);
@@ -225,6 +225,7 @@ describe('LocalFileCtr', () => {
});
expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
accept: undefined,
filePath: '/workspace/app.ts',
workspaceRoot: '/workspace',
});
@@ -247,6 +248,28 @@ describe('LocalFileCtr', () => {
success: false,
});
});
it('should forward image-only preview URL constraints', async () => {
mockLocalFileProtocolManager.createPreviewUrl.mockResolvedValue(
'localfile://file/workspace/image.png?token=abc',
);
const result = await localFileCtr.getLocalFilePreviewUrl({
accept: 'image',
path: '/workspace/image.png',
workingDirectory: '/workspace',
});
expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
accept: 'image',
filePath: '/workspace/image.png',
workspaceRoot: '/workspace',
});
expect(result).toEqual({
success: true,
url: 'localfile://file/workspace/image.png?token=abc',
});
});
});
describe('getLocalFilePreview', () => {
@@ -263,6 +286,7 @@ describe('LocalFileCtr', () => {
});
expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
accept: undefined,
filePath: '/workspace/app.ts',
workspaceRoot: '/workspace',
});
@@ -289,6 +313,34 @@ describe('LocalFileCtr', () => {
success: false,
});
});
it('should forward image-only preview read constraints', async () => {
mockLocalFileProtocolManager.readPreviewFile.mockResolvedValue({
buffer: Buffer.from('image-bytes'),
contentType: 'image/png',
realPath: '/workspace/image.png',
});
const result = await localFileCtr.getLocalFilePreview({
accept: 'image',
path: '/workspace/image.png',
workingDirectory: '/workspace',
});
expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
accept: 'image',
filePath: '/workspace/image.png',
workspaceRoot: '/workspace',
});
expect(result).toEqual({
preview: {
base64: Buffer.from('image-bytes').toString('base64'),
contentType: 'image/png',
type: 'image',
},
success: true,
});
});
});
describe('handleWriteFile', () => {
@@ -54,6 +54,21 @@ export interface PreviewFileReadResult {
realPath: string;
}
type PreviewFileAccept = 'image';
const normalizeContentType = (contentType: string): string =>
contentType.split(';')[0].trim().toLowerCase();
const isAcceptedPreviewContentType = (
contentType: string,
accept?: PreviewFileAccept,
): boolean => {
if (!accept) return true;
const normalizedContentType = normalizeContentType(contentType);
return accept === 'image' && normalizedContentType.startsWith('image/');
};
/**
* Custom `localfile://` protocol for project file previews.
*
@@ -213,16 +228,26 @@ export class LocalFileProtocolManager {
}
async createPreviewUrl({
accept,
filePath,
workspaceRoot,
}: {
accept?: PreviewFileAccept;
filePath: string;
workspaceRoot: string;
}): Promise<string | null> {
const normalizedFilePath = normalizeAbsolutePath(filePath);
if (!normalizedFilePath) return null;
const realFilePath = await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
const realFilePath = accept
? (
await this.readPreviewFile({
accept,
filePath,
workspaceRoot,
})
)?.realPath
: await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
if (!realFilePath) return null;
this.cleanupExpiredTokens();
@@ -237,9 +262,11 @@ export class LocalFileProtocolManager {
}
async readPreviewFile({
accept,
filePath,
workspaceRoot,
}: {
accept?: PreviewFileAccept;
filePath: string;
workspaceRoot: string;
}): Promise<PreviewFileReadResult | null> {
@@ -250,9 +277,12 @@ export class LocalFileProtocolManager {
if (!fileStat.isFile()) return null;
const buffer = await readFile(realFilePath);
const contentType = resolveLocalFileMimeType(realFilePath, buffer);
if (!isAcceptedPreviewContentType(contentType, accept)) return null;
return {
buffer,
contentType: resolveLocalFileMimeType(realFilePath, buffer),
contentType,
realPath: realFilePath,
};
}
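Reviewer sketch for the `LocalFileProtocolManager` hunks above: the `accept` filter is applied to the sniffed content type, after parameters such as `charset` are stripped. The two helpers are copied from the diff so the calls can run on their own; the `samples` array is illustrative input, not data from the repository.

```typescript
// Helpers reproduced from the diff; the sample inputs below are made up.
type PreviewFileAccept = 'image';

const normalizeContentType = (contentType: string): string =>
  contentType.split(';')[0].trim().toLowerCase();

const isAcceptedPreviewContentType = (
  contentType: string,
  accept?: PreviewFileAccept,
): boolean => {
  // No constraint requested: every content type passes.
  if (!accept) return true;

  const normalizedContentType = normalizeContentType(contentType);
  return accept === 'image' && normalizedContentType.startsWith('image/');
};

// Illustrative inputs: parameters and casing should not affect the decision.
const samples = ['image/png', 'Image/PNG; charset=binary', 'text/plain; charset=utf-8'];
const imageOnly = samples.filter((type) => isAcceptedPreviewContentType(type, 'image'));
const unconstrained = samples.filter((type) => isAcceptedPreviewContentType(type));
```

Since `createPreviewUrl` routes through `readPreviewFile` when `accept` is set, a rejected content type means no URL token is minted at all, at the cost of reading the file once up front.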
@@ -15,6 +15,15 @@ export interface ToolStatus {
error?: string;
lastChecked?: Date;
path?: string;
/**
* PATH value used to resolve/validate the command, surfaced only when it
* differs from the detector process's `process.env.PATH` (e.g. resolution
* fell back to the login-shell PATH). A caller that spawns the resolved
* `path` must carry this into the child's PATH, or a `#!/usr/bin/env node`
* shim that resolved here still fails with `env: node: No such file or
* directory` under the leaner inherited env.
*/
resolvedPathEnv?: string;
version?: string;
}
@@ -119,6 +119,21 @@ describe('LocalFileProtocolManager', () => {
expect(response.headers.get('Content-Type')).toBe('text/plain; charset=utf-8');
});
it('does not mint image-only preview URLs for text files', async () => {
const manager = new LocalFileProtocolManager();
await manager.approveWorkspaceRoot('/Users/alice/project');
mockReadFile.mockResolvedValue(Buffer.from('const value = 1;'));
const url = await manager.createPreviewUrl({
accept: 'image',
filePath: '/Users/alice/project/App.tsx',
workspaceRoot: '/Users/alice/project',
});
expect(url).toBeNull();
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
});
it('decodes percent-encoded characters in the path', async () => {
const manager = new LocalFileProtocolManager();
manager.registerHandler();
@@ -296,6 +311,21 @@ describe('LocalFileProtocolManager', () => {
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
});
it('does not return text payloads for image-only preview reads', async () => {
const manager = new LocalFileProtocolManager();
await manager.approveIndexedProjectRoot('/Users/alice/project');
mockReadFile.mockResolvedValue(Buffer.from('SECRET=value'));
const result = await manager.readPreviewFile({
accept: 'image',
filePath: '/Users/alice/project/.env',
workspaceRoot: '/Users/alice/project',
});
expect(result).toBeNull();
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/.env');
});
it('does not read preview payloads outside the approved workspace root', async () => {
const manager = new LocalFileProtocolManager();
await manager.approveIndexedProjectRoot('/Users/alice/project');
+45 -2
@@ -16,6 +16,12 @@ import type { App } from '../App';
// Create logger
const logger = createLogger('core:Tray');
// Debounce window for distinguishing a single-click from the leading edge of
// a double-click. Electron delivers two `click` events before `double-click`,
// so we defer the single-click action until this window passes — the
// `double-click` handler clears it if it arrives in time.
const CLICK_DEBOUNCE_MS = 250;
export interface TrayOptions {
/**
* Tray icon path (relative to resource directory)
@@ -54,6 +60,12 @@ export class Tray {
*/
private _contextMenu?: ElectronMenu;
/**
* Pending single-click timer. Cleared by the double-click handler so a
* double-click never accidentally fires startSession before showMainWindow.
*/
private _clickTimer?: NodeJS.Timeout;
/**
* Identifier
*/
@@ -118,10 +130,25 @@ export class Tray {
// Set default context menu
this.setContextMenu();
// Left-click: open Quick Composer.
// Left-click: deferred so a follow-up `double-click` can pre-empt it.
this._tray.on('click', () => {
logger.debug(`[${this.identifier}] Tray clicked`);
this.onClick();
if (this._clickTimer) clearTimeout(this._clickTimer);
this._clickTimer = setTimeout(() => {
this._clickTimer = undefined;
this.onClick();
}, CLICK_DEBOUNCE_MS);
});
// Double-click (macOS / Windows): cancel the pending single-click and
// surface the main window instead.
this._tray.on('double-click', () => {
logger.debug(`[${this.identifier}] Tray double-clicked`);
if (this._clickTimer) {
clearTimeout(this._clickTimer);
this._clickTimer = undefined;
}
this.onDoubleClick();
});
// Right-click: pop the stored context menu manually so left-click stays
@@ -189,6 +216,18 @@ export class Tray {
}
}
/**
* Handle tray double-click event — surfaces the main window.
*/
onDoubleClick() {
logger.debug(`[${this.identifier}] Tray double-click → showMainWindow`);
try {
this.app.browserManager.showMainWindow();
} catch (error) {
logger.error(`[${this.identifier}] Failed to show main window:`, error);
}
}
/**
* Replace the tray context menu with a pre-built Electron Menu instance.
* Stored in-house and popped up manually on right-click to preserve
@@ -259,6 +298,10 @@ export class Tray {
*/
destroy() {
logger.debug(`Destroying tray instance: ${this.identifier}`);
if (this._clickTimer) {
clearTimeout(this._clickTimer);
this._clickTimer = undefined;
}
if (this._tray) {
this._tray.destroy();
this._tray = undefined;
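Reviewer sketch for the `Tray` hunks above: the single-click action is deferred so that a following `double-click` can cancel it. The factory below isolates that timing logic with an injectable scheduler, which makes it checkable without real timers. `createClickDisambiguator`, `Schedule` and `timerSchedule` are hypothetical names for illustration; the diff itself keeps the timer in `_clickTimer` on the `Tray` class.

```typescript
// Hypothetical sketch of the click / double-click debounce from the diff.
type Cancel = () => void;
type Schedule = (fn: () => void, ms: number) => Cancel;

const createClickDisambiguator = (
  onSingle: () => void,
  onDouble: () => void,
  schedule: Schedule,
  windowMs = 250, // mirrors CLICK_DEBOUNCE_MS in the diff
) => {
  let cancelPending: Cancel | undefined;

  // Drop any deferred single-click that has not fired yet.
  const clear = () => {
    cancelPending?.();
    cancelPending = undefined;
  };

  return {
    // Each click restarts the window, so a burst yields one action.
    click: () => {
      clear();
      cancelPending = schedule(() => {
        cancelPending = undefined;
        onSingle();
      }, windowMs);
    },
    // A double-click pre-empts the pending single-click.
    doubleClick: () => {
      clear();
      onDouble();
    },
    // Equivalent of the cleanup added to destroy().
    dispose: clear,
  };
};

// Real-timer scheduler for production use.
const timerSchedule: Schedule = (fn, ms) => {
  const id = setTimeout(fn, ms);
  return () => clearTimeout(id);
};
```

Passing the scheduler in is a design choice of this sketch only; it trades a small amount of indirection for tests that do not depend on fake-timer support.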
@@ -189,7 +189,7 @@ describe('Tray', () => {
expect(mockElectronTray.setContextMenu).not.toHaveBeenCalled();
});
it('should register both click and right-click listeners', () => {
it('should register click, double-click and right-click listeners', () => {
tray = new Tray(
{
iconPath: 'tray.png',
@@ -200,6 +200,7 @@ describe('Tray', () => {
const events = mockElectronTray.on.mock.calls.map((c: any[]) => c[0]);
expect(events).toContain('click');
expect(events).toContain('double-click');
expect(events).toContain('right-click');
});
@@ -346,6 +347,96 @@ describe('Tray', () => {
});
});
describe('onDoubleClick', () => {
beforeEach(() => {
tray = new Tray(
{
iconPath: 'tray.png',
identifier: 'test-tray',
},
mockApp,
);
});
it('should show the main window', () => {
tray.onDoubleClick();
expect(mockApp.browserManager.showMainWindow).toHaveBeenCalled();
});
it('should not start the capture session', () => {
tray.onDoubleClick();
expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
});
it('should not throw when showMainWindow throws', () => {
vi.mocked(mockApp.browserManager.showMainWindow).mockImplementationOnce(() => {
throw new Error('window failed');
});
expect(() => tray.onDoubleClick()).not.toThrow();
});
});
describe('click vs double-click handling', () => {
let clickHandler: (() => void) | undefined;
let doubleClickHandler: (() => void) | undefined;
beforeEach(() => {
vi.useFakeTimers();
tray = new Tray(
{
iconPath: 'tray.png',
identifier: 'test-tray',
},
mockApp,
);
clickHandler = mockElectronTray.on.mock.calls.find((c: any[]) => c[0] === 'click')?.[1];
doubleClickHandler = mockElectronTray.on.mock.calls.find(
(c: any[]) => c[0] === 'double-click',
)?.[1];
});
afterEach(() => {
vi.useRealTimers();
});
it('should debounce single click before calling startSession', () => {
expect(clickHandler).toBeDefined();
clickHandler?.();
expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
vi.advanceTimersByTime(250);
expect(mockApp.screenCaptureManager.startSession).toHaveBeenCalledTimes(1);
});
it('should cancel the pending single click when double-click fires', () => {
expect(clickHandler).toBeDefined();
expect(doubleClickHandler).toBeDefined();
clickHandler?.();
clickHandler?.();
doubleClickHandler?.();
vi.advanceTimersByTime(1000);
expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
expect(mockApp.browserManager.showMainWindow).toHaveBeenCalledTimes(1);
});
it('should only fire startSession once per single-click burst', () => {
clickHandler?.();
clickHandler?.();
vi.advanceTimersByTime(250);
expect(mockApp.screenCaptureManager.startSession).toHaveBeenCalledTimes(1);
});
});
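Reviewer note: the click vs double-click tests above imply a pending single-click timer that a double-click cancels. The Tray implementation is not part of this hunk, so the following is only a minimal sketch of that pattern. `ClickDisambiguator`, `Schedule`, and `timerSchedule` are hypothetical names, the 250 ms delay is taken from the tests, and restarting the timer on a repeated click is an assumption (the tests would also pass if the repeated click were ignored).

```typescript
// Hypothetical sketch; not the Tray code from this PR.
type Cancel = () => void;
type Schedule = (fn: () => void, delayMs: number) => Cancel;

// Default scheduler backed by real timers; tests can inject a manual one.
const timerSchedule: Schedule = (fn, delayMs) => {
  const id = setTimeout(fn, delayMs);
  return () => clearTimeout(id);
};

class ClickDisambiguator {
  private cancelPending: Cancel | undefined = undefined;
  private readonly onSingle: () => void;
  private readonly onDouble: () => void;
  private readonly delayMs: number;
  private readonly schedule: Schedule;

  constructor(
    onSingle: () => void,
    onDouble: () => void,
    delayMs = 250,
    schedule: Schedule = timerSchedule,
  ) {
    this.onSingle = onSingle;
    this.onDouble = onDouble;
    this.delayMs = delayMs;
    this.schedule = schedule;
  }

  handleClick(): void {
    // Assumption: a repeated click restarts the wait, so a burst of clicks
    // yields at most one single-click action.
    this.cancelPending?.();
    this.cancelPending = this.schedule(() => {
      this.cancelPending = undefined;
      this.onSingle();
    }, this.delayMs);
  }

  handleDoubleClick(): void {
    // Drop the pending single-click action, then run the double-click action.
    this.cancelPending?.();
    this.cancelPending = undefined;
    this.onDouble();
  }
}
```

Injecting the scheduler keeps the sketch testable without fake timers; the real tests rely on `vi.useFakeTimers()` instead.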
describe('updateIcon', () => {
beforeEach(() => {
tray = new Tray(
@@ -1,5 +1,6 @@
import * as childProcess from 'node:child_process';
import * as os from 'node:os';
import path from 'node:path';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
@@ -180,6 +181,76 @@ describe('cliAgentDetectors', () => {
expect(status.path).toBe('/usr/local/bin/claude');
expect(execMock).not.toHaveBeenCalled();
expect(execFileMock).toHaveBeenCalledTimes(2);
// Resolved on the inherited PATH — nothing extra to carry into spawn.
expect(status.resolvedPathEnv).toBeUndefined();
});
it('falls back to the Codex.app bundled CLI when `codex` is not on any PATH', async () => {
const originalPath = process.env.PATH;
const originalShell = process.env.SHELL;
// Deterministic env: no SHELL → no login-shell lookup, merged PATH
// equals process.env.PATH → no second `which` attempt.
process.env.PATH = '/usr/bin:/bin';
delete process.env.SHELL;
try {
callExecFileError(new Error('not found')); // which codex
callExecFile('codex-cli 0.138.0'); // bundled CLI --version
const { codexDetector } = await import('../cliAgentDetectors');
const status = await codexDetector.detect();
expect(status.available).toBe(true);
expect(status.path).toBe('/Applications/Codex.app/Contents/Resources/codex');
expect(status.version).toBe('codex-cli 0.138.0');
expect(execFileMock).toHaveBeenCalledTimes(2);
expect(execFileMock.mock.calls[0]![0]).toBe('which');
expect(execFileMock.mock.calls[1]![0]).toBe(
'/Applications/Codex.app/Contents/Resources/codex',
);
} finally {
process.env.PATH = originalPath;
if (originalShell === undefined) delete process.env.SHELL;
else process.env.SHELL = originalShell;
}
});
it('stays unavailable when neither PATH nor the well-known locations have codex', async () => {
const originalPath = process.env.PATH;
const originalShell = process.env.SHELL;
process.env.PATH = '/usr/bin:/bin';
delete process.env.SHELL;
try {
callExecFileError(new Error('not found')); // which codex
callExecFileError(new Error('ENOENT')); // /Applications candidate
callExecFileError(new Error('ENOENT')); // ~/Applications candidate
const { codexDetector } = await import('../cliAgentDetectors');
const status = await codexDetector.detect();
expect(status.available).toBe(false);
expect(execFileMock).toHaveBeenCalledTimes(3);
expect(execFileMock.mock.calls[2]![0]).toBe(
path.join(os.homedir(), 'Applications', 'Codex.app', 'Contents', 'Resources', 'codex'),
);
} finally {
process.env.PATH = originalPath;
if (originalShell === undefined) delete process.env.SHELL;
else process.env.SHELL = originalShell;
}
});
it('does not probe well-known locations for an explicit path-like command', async () => {
callExecFileError(new Error('ENOENT')); // /custom/bin/codex --version
const { detectHeterogeneousCliCommand } = await import('../cliAgentDetectors');
const status = await detectHeterogeneousCliCommand('codex', '/custom/bin/codex');
expect(status.available).toBe(false);
// Only the explicit path's --version attempt — no fallback probing.
expect(execFileMock).toHaveBeenCalledTimes(1);
});
it('falls back to the login shell PATH for tools installed by shell setup', async () => {
@@ -200,6 +271,12 @@ describe('cliAgentDetectors', () => {
expect(status.available).toBe(true);
expect(status.path).toBe('/Users/Hanam/.local/share/mise/shims/gemini');
expect(status.version).toBe('gemini 0.2.0');
// The login-shell PATH that resolved the shim must be surfaced so the
// spawn site can carry it into the child env (mise/nvm `node` lives
// there, not on the leaner inherited PATH).
expect(status.resolvedPathEnv).toBe(
'/opt/homebrew/bin:/Users/Hanam/.local/share/mise/shims:/usr/bin:/bin',
);
expect(execFileMock).toHaveBeenCalledTimes(4);
expect(execFileMock.mock.calls[0]![0]).toBe('which');
@@ -1,5 +1,5 @@
import { exec, execFile } from 'node:child_process';
import { platform } from 'node:os';
import { homedir, platform } from 'node:os';
import path from 'node:path';
import { promisify } from 'node:util';
@@ -190,6 +190,11 @@ const detectValidatedCommand = async (
return {
available: true,
path: resolvedPath,
// `env` is set only when resolution fell back to the login-shell PATH.
// Surface that PATH so the spawn site can carry it into the child env —
// otherwise a `#!/usr/bin/env node` shim resolved here can't find `node`
// under the leaner inherited PATH (Finder-launched Electron).
resolvedPathEnv: env?.PATH,
version: output.split(/\r?\n/)[0],
};
} catch {
@@ -209,6 +214,27 @@ const HETEROGENEOUS_CLI_AGENT_OPTIONS = {
Pick<ValidatedDetectorOptions, 'validateKeywords'>
>;
// Well-known absolute install locations probed when a bare command isn't on
// PATH. The Codex desktop app bundles a fully functional CLI inside Codex.app
// (sharing ~/.codex auth/config) but never symlinks it into PATH, so
// `which codex` misses an otherwise working install.
const getWellKnownCommandPaths = (agentType: HeterogeneousCliAgentType): string[] => {
if (platform() !== 'darwin') return [];
switch (agentType) {
case 'codex': {
const bundledCli = path.join('Codex.app', 'Contents', 'Resources', 'codex');
return [
path.join('/Applications', bundledCli),
path.join(homedir(), 'Applications', bundledCli),
];
}
default: {
return [];
}
}
};
export const detectHeterogeneousCliCommand = async (
agentType: HeterogeneousCliAgentType,
command: string,
@@ -216,7 +242,20 @@ export const detectHeterogeneousCliCommand = async (
const validator = HETEROGENEOUS_CLI_AGENT_OPTIONS[agentType];
if (!validator) return { available: false };
return detectValidatedCommand(command, validator);
const status = await detectValidatedCommand(command, validator);
if (status.available) return status;
// A bare command missing from PATH may still live at a well-known install
// location (e.g. the Codex desktop app's bundled CLI). Don't second-guess
// an explicit user-configured path.
if (!command.trim().includes(path.sep)) {
for (const candidate of getWellKnownCommandPaths(agentType)) {
const fallbackStatus = await detectValidatedCommand(candidate, validator);
if (fallbackStatus.available) return fallbackStatus;
}
}
return status;
};
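Reviewer note: the fallback order in `detectHeterogeneousCliCommand` can be exercised in isolation. The sketch below mirrors the control flow from the hunk above but injects the probe, so `detectWithFallback`, `DetectStatus`, and `Probe` are hypothetical names standing in for `detectValidatedCommand` and `getWellKnownCommandPaths`; it is an illustration under those assumptions, not the module's code.

```typescript
import * as path from 'node:path';

// Hypothetical, trimmed-down status shape for illustration only.
interface DetectStatus {
  available: boolean;
  path?: string;
}

// Stand-in for detectValidatedCommand: resolves a command or absolute path.
type Probe = (command: string) => Promise<DetectStatus>;

const detectWithFallback = async (
  command: string,
  wellKnownPaths: string[],
  probe: Probe,
): Promise<DetectStatus> => {
  const status = await probe(command);
  if (status.available) return status;

  // Only a bare command (no path separator) is eligible for fallback probing;
  // an explicit user-configured path is reported as-is.
  const isBareCommand = !command.trim().includes(path.sep);
  if (isBareCommand) {
    for (const candidate of wellKnownPaths) {
      const fallbackStatus = await probe(candidate);
      if (fallbackStatus.available) return fallbackStatus;
    }
  }

  // Nothing matched: return the original failure, not the last candidate's.
  return status;
};
```

Returning the first status on total failure keeps the reported error tied to the command the user actually configured.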
/**
@@ -261,14 +300,17 @@ export const claudeCodeDetector: IToolDetector = createValidatedDetector({
/**
* OpenAI Codex CLI
* @see https://github.com/openai/codex
*
* Goes through `detectHeterogeneousCliCommand` so the Codex.app bundled-CLI
* fallback applies here too, keeping the manager path and the custom-command
* path in sync.
*/
export const codexDetector: IToolDetector = createValidatedDetector({
candidates: ['codex'],
export const codexDetector: IToolDetector = {
description: 'Codex - OpenAI agentic coding CLI',
detect: () => detectHeterogeneousCliCommand('codex', 'codex'),
name: 'codex',
priority: 2,
validateKeywords: ['codex'],
});
};
/**
* Google Gemini CLI
@@ -15,7 +15,15 @@ const mocks = vi.hoisted(() => ({
),
}));
const mockGlobalConfigDependencies = (enableBusinessFeatures: boolean) => {
interface MockGlobalConfigOptions {
agentGatewayUrl?: string;
enableAgentGateway?: boolean;
}
const mockGlobalConfigDependencies = (
enableBusinessFeatures: boolean,
options: MockGlobalConfigOptions = {},
) => {
vi.doMock('@lobechat/business-const', () => ({
ENABLE_BUSINESS_FEATURES: enableBusinessFeatures,
}));
@@ -29,7 +37,12 @@ const mockGlobalConfigDependencies = (enableBusinessFeatures: boolean) => {
}));
vi.doMock('@/envs/app', () => ({
appEnv: {},
appEnv: {
...(options.agentGatewayUrl ? { AGENT_GATEWAY_URL: options.agentGatewayUrl } : {}),
...(options.enableAgentGateway === undefined
? {}
: { ENABLE_AGENT_GATEWAY: options.enableAgentGateway }),
},
getAppConfig: vi.fn(() => ({
DEFAULT_AGENT_CONFIG: '',
})),
@@ -113,6 +126,18 @@ const loadCapturedProviderConfig = async (enableBusinessFeatures: boolean) => {
>;
};
const loadServerConfig = async (
enableBusinessFeatures: boolean,
options?: MockGlobalConfigOptions,
) => {
vi.resetModules();
mocks.genServerAiProvidersConfig.mockClear();
mockGlobalConfigDependencies(enableBusinessFeatures, options);
const { getServerGlobalConfig } = await import('./index');
return getServerGlobalConfig();
};
describe('getServerGlobalConfig', () => {
afterEach(() => {
vi.restoreAllMocks();
@@ -139,4 +164,36 @@ describe('getServerGlobalConfig', () => {
expect(providerConfig[ModelProvider.OpenAI]).toBeUndefined();
expect(providerConfig[ModelProvider.DeepSeek].enabled).toBe(true);
});
it('should enable gateway mode for business builds', async () => {
await expect(loadServerConfig(true)).resolves.toMatchObject({
enableGatewayMode: true,
});
});
it('should enable gateway mode for self-hosted builds only when explicitly enabled with a gateway url', async () => {
await expect(
loadServerConfig(false, {
agentGatewayUrl: 'https://gateway.test.com',
enableAgentGateway: true,
}),
).resolves.toMatchObject({
agentGatewayUrl: 'https://gateway.test.com',
enableGatewayMode: true,
});
await expect(
loadServerConfig(false, {
agentGatewayUrl: 'https://gateway.test.com',
enableAgentGateway: false,
}),
).resolves.toMatchObject({
agentGatewayUrl: 'https://gateway.test.com',
enableGatewayMode: false,
});
await expect(loadServerConfig(false, { enableAgentGateway: true })).resolves.toMatchObject({
enableGatewayMode: false,
});
});
});
@@ -104,6 +104,8 @@ export const getServerGlobalConfig = async () => {
disableEmailPassword: authEnv.AUTH_DISABLE_EMAIL_PASSWORD,
enableBusinessFeatures: ENABLE_BUSINESS_FEATURES,
enableEmailVerification: authEnv.AUTH_EMAIL_VERIFICATION,
enableGatewayMode:
ENABLE_BUSINESS_FEATURES || (!!appEnv.ENABLE_AGENT_GATEWAY && !!appEnv.AGENT_GATEWAY_URL),
enableKlavis: !!klavisEnv.KLAVIS_API_KEY,
enableLobehubSkill: !!(appEnv.MARKET_TRUSTED_CLIENT_SECRET && appEnv.MARKET_TRUSTED_CLIENT_ID),
enableMagicLink: authEnv.AUTH_ENABLE_MAGIC_LINK,
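Reviewer note: the `enableGatewayMode` expression above reduces to a small truth table, which the new tests cover case by case. A minimal sketch of that rule as a pure function, assuming a hypothetical `resolveGatewayMode` helper and `GatewayEnv` shape (neither exists in the diff):

```typescript
// Hypothetical env shape; only the two fields read by the expression.
interface GatewayEnv {
  AGENT_GATEWAY_URL?: string;
  ENABLE_AGENT_GATEWAY?: boolean;
}

// Business builds always enable gateway mode. Self-hosted builds need both
// the explicit flag and a non-empty gateway URL.
const resolveGatewayMode = (enableBusinessFeatures: boolean, env: GatewayEnv): boolean =>
  enableBusinessFeatures || (!!env.ENABLE_AGENT_GATEWAY && !!env.AGENT_GATEWAY_URL);

const url = 'https://gateway.test.com';
console.log(resolveGatewayMode(true, {}));
console.log(resolveGatewayMode(false, { AGENT_GATEWAY_URL: url, ENABLE_AGENT_GATEWAY: true }));
console.log(resolveGatewayMode(false, { AGENT_GATEWAY_URL: url, ENABLE_AGENT_GATEWAY: false }));
console.log(resolveGatewayMode(false, { ENABLE_AGENT_GATEWAY: true }));
```

Only the first two calls yield `true`; a flag without a URL, or a URL without the flag, leaves gateway mode off.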
@@ -61,6 +61,7 @@ import { chainCompressContext } from '@lobechat/prompts';
import {
type ChatToolPayload,
type ExecSubAgentParams,
type ExecVirtualSubAgentParams,
type MessageToolCall,
type UIChatMessage,
} from '@lobechat/types';
@@ -323,7 +324,7 @@ const buildPostProcessUrl = (
};
/**
* Build the per-tool-call server sub-agent runner injected into the tool
* Build the per-tool-call server virtual sub-agent runner injected into the tool
* execution context. Closes over the current tool payload + parent message so
* the `callSubAgent` server tool can fork a child op without re-deriving the
* message anchor (which it cannot do correctly from its own context).
@@ -331,17 +332,18 @@ const buildPostProcessUrl = (
* The runner creates the pending placeholder tool message that anchors the
* isolation thread (so the UI shows a loading state and the completion bridge
* has a message to backfill), then kicks off the child op asynchronously and
* returns immediately. Returns `undefined` when sub-agent execution is not
* available (no `execSubAgent` callback, or missing agent/topic context).
* returns immediately. Returns `undefined` when virtual sub-agent execution is
* not available (no `execVirtualSubAgent` callback, or missing agent/topic
* context).
*/
const buildServerSubAgentRunner = (
const buildServerVirtualSubAgentRunner = (
ctx: RuntimeExecutorContext,
state: AgentState,
chatToolPayload: ChatToolPayload,
parentMessageId: string,
): ServerSubAgentRunner | undefined => {
const execSubAgent = ctx.execSubAgent;
if (!execSubAgent) return undefined;
const execVirtualSubAgent = ctx.execVirtualSubAgent;
if (!execVirtualSubAgent) return undefined;
const agentId = state.metadata?.agentId;
const topicId = ctx.topicId ?? state.metadata?.topicId;
@@ -364,16 +366,15 @@ const buildServerSubAgentRunner = (
topicId,
});
// 2. Fork the child op anchored to the placeholder. `resumeParentOnComplete`
// tells execSubAgent to register the completion bridge that
// backfills this tool message and resumes the parent op.
const result = (await execSubAgent({
// 2. Fork the virtual child op anchored to the placeholder. The virtual
// entry marks the child as `isSubAgent` and registers the completion
// bridge that backfills this tool message and resumes the parent op.
const result = (await execVirtualSubAgent({
agentId: targetAgentId ?? agentId,
groupId: state.metadata?.groupId ?? undefined,
instruction,
parentMessageId: placeholder.id,
parentOperationId: ctx.operationId,
resumeParentOnComplete: true,
timeout,
title: description,
topicId,
@@ -387,7 +388,7 @@ const buildServerSubAgentRunner = (
await ctx.messageModel.deleteMessage(placeholder.id);
} catch (error) {
log(
'buildServerSubAgentRunner: failed to clean up placeholder %s: %O',
'buildServerVirtualSubAgentRunner: failed to clean up placeholder %s: %O',
placeholder.id,
error,
);
@@ -522,11 +523,17 @@ export interface RuntimeExecutorContext {
discordContext?: any;
evalContext?: EvalContext;
/**
* Callback to spawn a sub-agent task server-side.
* Callback to run a legacy agent invocation server-side.
* Injected by AiAgentService so exec_sub_agent / exec_sub_agents executors
* can dispatch callAgent-triggered tasks without a circular import.
* can dispatch callAgent-triggered runs without a circular import.
*/
execSubAgent?: (params: ExecSubAgentParams) => Promise<unknown>;
/**
* Callback to fork a `lobe-agent.callSubAgent` virtual child run. Unlike
* execSubAgent, this path installs the async completion bridge and marks the
* child operation as a sub-agent.
*/
execVirtualSubAgent?: (params: ExecVirtualSubAgentParams) => Promise<unknown>;
hookDispatcher?: HookDispatcher;
loadAgentState?: (operationId: string) => Promise<AgentState | null>;
messageModel: MessageModel;
@@ -2476,7 +2483,7 @@ export const createRuntimeExecutors = (
scope: state.metadata?.scope,
serverDB: ctx.serverDB,
skipResultTruncation: true,
subAgent: buildServerSubAgentRunner(
subAgent: buildServerVirtualSubAgentRunner(
ctx,
state,
chatToolPayload,
@@ -2718,14 +2725,15 @@ export const createRuntimeExecutors = (
log('[%s:%d] Tool execution completed', operationId, stepIndex);
// When the tool result carries an execSubAgent / execSubAgents state the
// GeneralChatAgent needs `stop: true` in the payload to detect it and
// emit the matching exec_sub_agent / exec_sub_agents instruction. Without
// this flag the agent falls through to the normal LLM-call path and the
// sub-agent is never spawned.
const execTaskStateType = executionResult.state?.type as string | undefined;
const isExecTaskState =
execTaskStateType === 'execSubAgent' || execTaskStateType === 'execSubAgents';
// When a legacy callAgent task result carries execSubAgent / execSubAgents
// state, the GeneralChatAgent needs `stop: true` in the payload to detect
// it and emit the matching exec_sub_agent / exec_sub_agents instruction.
// Without this flag the agent falls through to the normal LLM-call path
// and the background agent run is never spawned.
const legacyAgentInvocationStateType = executionResult.state?.type as string | undefined;
const isLegacyAgentInvocationState =
legacyAgentInvocationStateType === 'execSubAgent' ||
legacyAgentInvocationStateType === 'execSubAgents';
executeToolSpan.setAttributes(
buildExecuteToolResultAttributes({ attempts: execution.attempts, success: isSuccess }),
@@ -2741,7 +2749,7 @@ export const createRuntimeExecutors = (
isSuccess,
// Pass tool message ID as parentMessageId for the next LLM call
parentMessageId: toolMessageId,
...(isExecTaskState && { stop: true }),
...(isLegacyAgentInvocationState && { stop: true }),
toolCall: chatToolPayload,
toolCallId: chatToolPayload.id,
},
@@ -3048,7 +3056,7 @@ export const createRuntimeExecutors = (
scope: state.metadata?.scope,
serverDB: ctx.serverDB,
skipResultTruncation: true,
subAgent: buildServerSubAgentRunner(
subAgent: buildServerVirtualSubAgentRunner(
ctx,
state,
chatToolPayload,
@@ -132,6 +132,14 @@ describe('formatErrorForState', () => {
expect(result.countAsFailure).toBeUndefined();
expect(result.numericId).toBeUndefined();
});
it('classifies a raw Drizzle "Failed query" Error via its message instead of a bare 500', () => {
const result = formatErrorForState(new Error('Failed query: rollback\nparams: '));
expect(result.type).toBe(AgentRuntimeErrorType.DatabasePersistError);
expect(result.numericId).toBe(7004);
expect(result.attribution).toBe('harness');
});
});
describe('ProviderBizError refinement', () => {
@@ -9,10 +9,16 @@ import { KnowledgeBaseModel } from '@/database/models/knowledgeBase';
import { SessionModel } from '@/database/models/session';
import { UserModel } from '@/database/models/user';
import { AgentService } from '@/server/services/agent';
import { EditLockService } from '@/server/services/editLock';
import { publishResourceEvent } from '@/server/services/resourceEvents';
import { KnowledgeType } from '@/types/knowledgeBase';
import { agentRouter } from '../agent';
vi.mock('@/server/services/resourceEvents', () => ({ publishResourceEvent: vi.fn() }));
const publishResourceEventMock = vi.mocked(publishResourceEvent);
vi.mock('@/database/models/user', () => ({
UserModel: {
findById: vi.fn(),
@@ -329,4 +335,122 @@ describe('agentRouter', () => {
expect(agentModelMock.update).toHaveBeenCalledWith(mockInput.id, { pinned: false });
});
});
describe('edit lock', () => {
const wsCtx = () => ({ ...mockCtx, workspaceId: 'ws-1' });
describe('updateAgentConfig write guard', () => {
it('rejects the update when another member holds the lock', async () => {
agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
const caller = agentRouter.createCaller(wsCtx());
await expect(
caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } }),
).rejects.toMatchObject({ code: 'CONFLICT' });
expect(agentServiceMock.updateAgentConfig).not.toHaveBeenCalled();
});
it('allows the update when no other member holds the lock', async () => {
agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
const caller = agentRouter.createCaller(wsCtx());
await caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } });
expect(agentServiceMock.updateAgentConfig).toHaveBeenCalledWith('agent-1', {
systemRole: 'x',
});
});
it('does not check the lock for personal (non-workspace) agents', async () => {
agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
const caller = agentRouter.createCaller(mockCtx);
await caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } });
expect(guardSpy).not.toHaveBeenCalled();
expect(agentServiceMock.updateAgentConfig).toHaveBeenCalled();
});
});
describe('acquireAgentLock', () => {
it('returns unlocked without touching the lock service for personal agents', async () => {
const acquireSpy = vi.spyOn(EditLockService.prototype, 'acquire');
const caller = agentRouter.createCaller(mockCtx);
const result = await caller.acquireAgentLock({ agentId: 'agent-1' });
expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
expect(acquireSpy).not.toHaveBeenCalled();
});
it('broadcasts lock.changed on a holder edge (first claim)', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(undefined);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
const caller = agentRouter.createCaller(wsCtx());
await caller.acquireAgentLock({ agentId: 'agent-1' });
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'agent-1', type: 'agent' },
expect.objectContaining({ data: { holderId: userId }, type: 'lock.changed' }),
);
});
it('does NOT broadcast on a steady-state heartbeat (same holder)', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(userId);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
const caller = agentRouter.createCaller(wsCtx());
await caller.acquireAgentLock({ agentId: 'agent-1' });
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
});
describe('getAgentLock', () => {
it('reports another member as the holder', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue('other-user');
const caller = agentRouter.createCaller(wsCtx());
const result = await caller.getAgentLock({ agentId: 'agent-1' });
expect(result).toEqual({ expiresAt: null, holderId: 'other-user', lockedByOther: true });
});
});
describe('releaseAgentLock', () => {
it('broadcasts unlocked only when it actually freed the lock', async () => {
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
const caller = agentRouter.createCaller(wsCtx());
await caller.releaseAgentLock({ agentId: 'agent-1' });
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'agent-1', type: 'agent' },
expect.objectContaining({ data: { holderId: null }, type: 'lock.changed' }),
);
});
it('does NOT broadcast when the lease expired / was taken over', async () => {
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(false);
const caller = agentRouter.createCaller(wsCtx());
await caller.releaseAgentLock({ agentId: 'agent-1' });
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
});
});
});
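Reviewer note: the edit-lock tests above describe an edge-triggered broadcast: `lock.changed` is published when the holder changes, not on every heartbeat. The router code is not in this hunk, so the sketch below is one possible reading of that behavior. `lockChangeOnAcquire`, `lockChangeOnRelease`, and `LockChangedEvent` are hypothetical names, and the real event may carry more fields than shown (the tests only match with `expect.objectContaining`).

```typescript
// Hypothetical event shape limited to the fields the tests assert on.
interface LockChangedEvent {
  data: { holderId: string | null };
  type: 'lock.changed';
}

// Decide whether an acquire call crossed a holder edge.
// `previousHolder` is what getActiveHolder reported before the acquire.
const lockChangeOnAcquire = (
  previousHolder: string | null | undefined,
  acquired: { holderId: string | null },
): LockChangedEvent | undefined => {
  const before = previousHolder ?? null;
  // Same holder before and after: steady-state heartbeat, nothing to publish.
  if (before === acquired.holderId) return undefined;
  return { data: { holderId: acquired.holderId }, type: 'lock.changed' };
};

// A release only publishes when it actually freed the lock; an expired or
// taken-over lease (released === false) stays silent.
const lockChangeOnRelease = (released: boolean): LockChangedEvent | undefined =>
  released ? { data: { holderId: null }, type: 'lock.changed' } : undefined;
```

Keeping the decision in pure functions makes the heartbeat case cheap to test without mocking `publishResourceEvent`.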
@@ -7,9 +7,15 @@ import * as ChatGroupModelModule from '@/database/models/chatGroup';
import * as UserModelModule from '@/database/models/user';
import * as AgentGroupRepoModule from '@/database/repositories/agentGroup';
import * as ChatGroupServiceModule from '@/server/services/agentGroup';
import { EditLockService } from '@/server/services/editLock';
import { publishResourceEvent } from '@/server/services/resourceEvents';
import { agentGroupRouter } from '../agentGroup';
vi.mock('@/server/services/resourceEvents', () => ({ publishResourceEvent: vi.fn() }));
const publishResourceEventMock = vi.mocked(publishResourceEvent);
describe('agentGroupRouter', () => {
const userId = 'testUserId';
let mockCtx: any;
@@ -439,4 +445,126 @@ describe('agentGroupRouter', () => {
expect(result).toEqual(mockUpdatedGroup);
});
});
describe('edit lock', () => {
const wsCtx = () => ({ serverDB: {}, userId, workspaceId: 'ws-1' });
describe('updateGroup write guard', () => {
it('rejects the update when another member holds the lock', async () => {
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
const caller = agentGroupRouter.createCaller(wsCtx());
await expect(
caller.updateGroup({ id: 'group-1', value: { title: 'New' } }),
).rejects.toMatchObject({ code: 'CONFLICT' });
expect(chatGroupModelMock.update).not.toHaveBeenCalled();
});
it('allows the update when no other member holds the lock', async () => {
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
chatGroupModelMock.update.mockResolvedValue({ id: 'group-1' });
const caller = agentGroupRouter.createCaller(wsCtx());
await caller.updateGroup({ id: 'group-1', value: { title: 'New' } });
expect(chatGroupModelMock.update).toHaveBeenCalled();
});
it('does not check the lock for personal (non-workspace) groups', async () => {
const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
chatGroupModelMock.update.mockResolvedValue({ id: 'group-1' });
const caller = agentGroupRouter.createCaller(mockCtx);
await caller.updateGroup({ id: 'group-1', value: { title: 'New' } });
expect(guardSpy).not.toHaveBeenCalled();
expect(chatGroupModelMock.update).toHaveBeenCalled();
});
});
describe('acquireGroupLock', () => {
it('returns unlocked without touching the lock service for personal groups', async () => {
const acquireSpy = vi.spyOn(EditLockService.prototype, 'acquire');
const caller = agentGroupRouter.createCaller(mockCtx);
const result = await caller.acquireGroupLock({ id: 'group-1' });
expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
expect(acquireSpy).not.toHaveBeenCalled();
});
it('broadcasts lock.changed on a holder edge (first claim)', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(undefined);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
const caller = agentGroupRouter.createCaller(wsCtx());
await caller.acquireGroupLock({ id: 'group-1' });
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'group-1', type: 'chatGroup' },
expect.objectContaining({ data: { holderId: userId }, type: 'lock.changed' }),
);
});
it('does NOT broadcast on a steady-state heartbeat (same holder)', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(userId);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
const caller = agentGroupRouter.createCaller(wsCtx());
await caller.acquireGroupLock({ id: 'group-1' });
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
});
describe('getGroupLock', () => {
it('reports another member as the holder', async () => {
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue('other-user');
const caller = agentGroupRouter.createCaller(wsCtx());
const result = await caller.getGroupLock({ id: 'group-1' });
expect(result).toEqual({ expiresAt: null, holderId: 'other-user', lockedByOther: true });
});
it('returns unlocked for personal groups', async () => {
const caller = agentGroupRouter.createCaller(mockCtx);
const result = await caller.getGroupLock({ id: 'group-1' });
expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
});
});
describe('releaseGroupLock', () => {
it('broadcasts unlocked only when it actually freed the lock', async () => {
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
const caller = agentGroupRouter.createCaller(wsCtx());
await caller.releaseGroupLock({ id: 'group-1' });
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'group-1', type: 'chatGroup' },
expect.objectContaining({ data: { holderId: null }, type: 'lock.changed' }),
);
});
it('does NOT broadcast when the lease expired / was taken over', async () => {
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(false);
const caller = agentGroupRouter.createCaller(wsCtx());
await caller.releaseGroupLock({ id: 'group-1' });
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
});
});
});
@@ -119,7 +119,7 @@ describe('aiChatRouter', () => {
expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledTimes(1);
expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledWith(
expect.any(Object),
expect.objectContaining({ touchTopicUpdatedAt: false }),
expect.not.objectContaining({ touchTopicUpdatedAt: expect.anything() }),
);
expect(mockGet).toHaveBeenCalledWith(
@@ -161,7 +161,7 @@ describe('aiChatRouter', () => {
expect(mockCreateMessage).toHaveBeenCalled();
expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledWith(
expect.any(Object),
expect.objectContaining({ touchTopicUpdatedAt: true }),
expect.not.objectContaining({ touchTopicUpdatedAt: expect.anything() }),
);
expect(mockGet).toHaveBeenCalledWith(
expect.objectContaining({
@@ -6,7 +6,7 @@ import { eq } from 'drizzle-orm';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { topicRouter } from '../../topic';
import { cleanupTestUser, createTestContext, createTestUser } from './setup';
import { cleanupTestUser, createTestAgent, createTestContext, createTestUser } from './setup';
// We need to mock getServerDB to return our test database instance
let testDB: LobeChatDatabase;
@@ -332,31 +332,79 @@ describe('Topic Router Integration Tests', () => {
});
});
// BM25 search requires pg_search extension (ParadeDB), not available in integration test DB
// BM25 search requires pg_search extension (ParadeDB), not available in the
// default integration test DB (PGlite). Run with TEST_SERVER_DB=1 +
// DATABASE_TEST_URL pointing at a ParadeDB instance to exercise these.
describe.skip('searchTopics', () => {
it('should search topics using agentId', async () => {
const caller = topicRouter.createCaller(createTestContext(userId));
// Create test topics
await caller.createTopic({
title: 'TypeScript Discussion',
sessionId: testSessionId,
});
// Topics are agent-native: stored with agentId directly.
await serverDB.insert(topics).values([
{ agentId: testAgentId, title: 'TypeScript Discussion', userId },
{ agentId: testAgentId, title: 'JavaScript Basics', userId },
]);
await caller.createTopic({
title: 'JavaScript Basics',
sessionId: testSessionId,
});
// Search using agentId
const result = await caller.searchTopics({
keywords: 'TypeScript',
agentId: testAgentId,
keywords: 'TypeScript',
});
expect(result.length).toBeGreaterThan(0);
expect(result[0].title).toContain('TypeScript');
});
// Regression for the "No topics match these filters" bug: topics created by
// the new agent system carry `agentId` directly with a NULL `sessionId`.
// The old search resolved agentId -> sessionId and filtered by the
// container only, so these rows were never matched even though the topics
// list (which filters by agentId) showed them.
it('should find agentId-scoped topics that have no sessionId', async () => {
const caller = topicRouter.createCaller(createTestContext(userId));
// Insert a topic the way the agent runtime does: agentId set, sessionId null.
await serverDB.insert(topics).values({
agentId: testAgentId,
sessionId: null,
title: 'rinabrown84@gmail.com',
userId,
});
const result = await caller.searchTopics({
agentId: testAgentId,
keywords: 'rinabrown84@gmail.com',
});
expect(result.length).toBeGreaterThan(0);
expect(result[0].title).toBe('rinabrown84@gmail.com');
});
// The agent scope mirrors the topics list exactly (agentId only). A row that
// shares this agent's resolved session but is owned by a DIFFERENT agent
// must not leak in — the bug the constrained-session-fallback review flagged.
it('should not leak another agent topic that shares the session mapping', async () => {
const caller = topicRouter.createCaller(createTestContext(userId));
const otherAgentId = await createTestAgent(serverDB, userId);
await serverDB.insert(topics).values([
{ agentId: testAgentId, title: 'mine rinabrown84@gmail.com', userId },
// Same session, different agent — used to leak via the session fallback.
{
agentId: otherAgentId,
sessionId: testSessionId,
title: 'theirs rinabrown84@gmail.com',
userId,
},
]);
const result = await caller.searchTopics({
agentId: testAgentId,
keywords: 'rinabrown84@gmail.com',
});
expect(result.map((t) => t.title)).toEqual(['mine rinabrown84@gmail.com']);
});
});
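Reviewer note: the two regression tests above hinge on which column scopes the search. The sketch below contrasts the agentId-only scope with the old session-based scope on in-memory rows. It uses a plain substring match in place of the BM25 query, and `TopicRow`, `searchByAgent`, and `searchBySession` are hypothetical names for illustration, not the router's code.

```typescript
// Hypothetical row shape with only the columns relevant to scoping.
interface TopicRow {
  agentId: string | null;
  sessionId: string | null;
  title: string;
}

// New behavior: scope by agentId only, mirroring the topics list.
const searchByAgent = (rows: TopicRow[], agentId: string, keywords: string): TopicRow[] =>
  rows.filter((row) => row.agentId === agentId && row.title.includes(keywords));

// Old behavior: resolve the agent to its session and scope by sessionId.
const searchBySession = (rows: TopicRow[], sessionId: string, keywords: string): TopicRow[] =>
  rows.filter((row) => row.sessionId === sessionId && row.title.includes(keywords));

const rows: TopicRow[] = [
  // Agent-native topic: agentId set, no sessionId.
  { agentId: 'agent-a', sessionId: null, title: 'mine release notes' },
  // Another agent's topic that happens to share the session mapping.
  { agentId: 'agent-b', sessionId: 'session-1', title: 'theirs release notes' },
];

console.log(searchByAgent(rows, 'agent-a', 'release').map((row) => row.title));
console.log(searchBySession(rows, 'session-1', 'release').map((row) => row.title));
```

The session-scoped query both misses the agent-native row and returns the other agent's row, which are the two failure modes the tests guard against.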
describe('updateTopic', () => {
@@ -719,7 +767,7 @@ describe('Topic Router Integration Tests', () => {
sessionId: testSessionId,
});
const allTopics = await caller.getAllTopics();
const allTopics = await caller.queryTopics();
expect(allTopics).toHaveLength(2);
});
@@ -4,12 +4,15 @@ import { pushTokenRouter } from '@/server/routers/lambda/pushToken';
const mockUpsert = vi.fn();
const mockUnregister = vi.fn();
const mockDeleteByExpoTokenAndDevice = vi.fn();
vi.mock('@/database/models/pushToken', () => ({
PushTokenModel: vi.fn(() => ({
unregister: mockUnregister,
upsert: mockUpsert,
})),
deletePushTokenByExpoTokenAndDevice: (...args: unknown[]) =>
mockDeleteByExpoTokenAndDevice(...args),
}));
const createCaller = (ctxOverrides: Partial<any> = {}) => {
@@ -91,18 +94,90 @@ describe('pushTokenRouter', () => {
});
describe('unregister', () => {
it('should call model.unregister with deviceId', async () => {
it('should delete by (expoToken, deviceId) when expoToken is provided', async () => {
mockDeleteByExpoTokenAndDevice.mockResolvedValueOnce(undefined);
const caller = createCaller();
const result = await caller.unregister({
deviceId: 'device-1',
expoToken: 'ExponentPushToken[abc]',
});
expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalledWith(expect.anything(), {
deviceId: 'device-1',
expoToken: 'ExponentPushToken[abc]',
});
expect(result).toEqual({ success: true });
// Legacy (userId, deviceId) path must not fire when expoToken is present
expect(mockUnregister).not.toHaveBeenCalled();
});
it('should fall back to (userId, deviceId) for legacy clients with a session', async () => {
// Path B — v1.0.7 only sends deviceId; if the request still carries a
// valid session we MUST delete the row, otherwise PushChannel keeps
// notifying a signed-out device (Expo DeviceNotRegistered only fires on
// uninstall, not logout).
mockUnregister.mockResolvedValueOnce(undefined);
const caller = createCaller();
await caller.unregister({ deviceId: 'device-1' });
const result = await caller.unregister({ deviceId: 'device-1' });
expect(mockUnregister).toHaveBeenCalledWith('device-1');
expect(mockDeleteByExpoTokenAndDevice).not.toHaveBeenCalled();
expect(result).toEqual({ success: true });
});
it('should silently succeed without expoToken AND without session', async () => {
// Path C — v1.0.7 + dead session: the only safe move is silent OK.
// Orphan row will be cleaned up by the process-push-receipts worker via
// Expo DeviceNotRegistered receipts. Returning 200 here stops the storm.
const caller = createCaller({ userId: undefined });
const result = await caller.unregister({ deviceId: 'device-1' });
expect(mockDeleteByExpoTokenAndDevice).not.toHaveBeenCalled();
expect(mockUnregister).not.toHaveBeenCalled();
expect(result).toEqual({ success: true });
});
it('should succeed for an unauthenticated caller carrying expoToken', async () => {
// New clients (>=1.0.8) hit Path A regardless of session.
const caller = createCaller({ userId: undefined });
const result = await caller.unregister({
deviceId: 'device-1',
expoToken: 'ExponentPushToken[abc]',
});
expect(result).toEqual({ success: true });
expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalled();
expect(mockUnregister).not.toHaveBeenCalled();
});
it('should prefer expoToken precision over the legacy userId fallback', async () => {
// If both are available, always take Path A — the (expoToken, deviceId)
// pair is more precise and doesn't risk deleting a wrong row.
const caller = createCaller();
await caller.unregister({
deviceId: 'device-1',
expoToken: 'ExponentPushToken[abc]',
});
expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalled();
expect(mockUnregister).not.toHaveBeenCalled();
});
it('should reject empty deviceId', async () => {
const caller = createCaller();
await expect(caller.unregister({ deviceId: '' })).rejects.toThrow();
});
it('should reject empty expoToken when provided', async () => {
const caller = createCaller();
await expect(
caller.unregister({ deviceId: 'device-1', expoToken: '' }),
).rejects.toThrow();
});
});
});
@@ -36,6 +36,7 @@ export const compareDocumentHistoryItemsInputSchema = z.object({
});
export const updateDocumentInputSchema = z.object({
breakAutosaveWindow: z.boolean().optional(),
content: z.string().optional(),
editorData: z.string().optional(),
fileType: z.string().optional(),
@@ -58,6 +59,7 @@ export interface DocumentHistoryListItem {
isCurrent: boolean;
savedAt: string;
saveSource: DocumentHistorySaveSource;
userId: string;
}
export interface ListHistoryOutput {
@@ -123,6 +125,7 @@ export interface CompareHistoryItemsInput {
}
export interface UpdateDocumentInput {
breakAutosaveWindow?: boolean;
content?: string;
editorData?: string;
fileType?: string;
+60
@@ -17,6 +17,8 @@ import { workspaceMembers } from '@/database/schemas';
import { router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
import { AgentService } from '@/server/services/agent';
import { EditLockService } from '@/server/services/editLock';
import { publishResourceEvent } from '@/server/services/resourceEvents';
import { TransferErrorCode } from '@/types/transferError';
const agentProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
@@ -28,6 +30,7 @@ const agentProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) =>
agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
agentService: new AgentService(ctx.serverDB, ctx.userId, wsId),
chatGroupModel: new ChatGroupModel(ctx.serverDB, ctx.userId, wsId),
editLockService: new EditLockService(ctx.userId),
fileModel: new FileModel(ctx.serverDB, ctx.userId, wsId),
knowledgeBaseModel: new KnowledgeBaseModel(ctx.serverDB, ctx.userId, wsId),
sessionModel: new SessionModel(ctx.serverDB, ctx.userId, wsId),
@@ -440,6 +443,19 @@ export const agentRouter = router({
}),
)
.mutation(async ({ input, ctx }) => {
// Collaborative edit lock: reject writes to a workspace agent another
// member is actively editing. Inert until a client acquires the lock.
if (ctx.workspaceId) {
const blockedBy = await ctx.editLockService.getBlockingHolder('agent', input.agentId);
if (blockedBy) {
throw new TRPCError({
cause: { data: { code: 'DocumentLocked' } },
code: 'CONFLICT',
message: 'Agent is being edited by another user',
});
}
}
// Use AgentService to update and return the updated agent data
return ctx.agentService.updateAgentConfig(input.agentId, input.value);
}),
@@ -458,4 +474,48 @@ export const agentRouter = router({
.mutation(async ({ input, ctx }) => {
return ctx.agentModel.update(input.id, { pinned: input.pinned });
}),
acquireAgentLock: agentProcedure
.use(withScopedPermission('agent:update'))
.input(z.object({ agentId: z.string() }))
.mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const prev = await ctx.editLockService.getActiveHolder('agent', input.agentId);
const result = await ctx.editLockService.acquire('agent', input.agentId);
if ((result.holderId ?? null) !== (prev ?? null)) {
void publishResourceEvent(
{ id: input.agentId, type: 'agent' },
{ actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
);
}
return result;
}),
getAgentLock: agentProcedure
.use(withScopedPermission('agent:update'))
.input(z.object({ agentId: z.string() }))
.query(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const holder = await ctx.editLockService.getActiveHolder('agent', input.agentId);
return {
expiresAt: null,
holderId: holder ?? null,
lockedByOther: Boolean(holder) && holder !== ctx.userId,
};
}),
releaseAgentLock: agentProcedure
.use(withScopedPermission('agent:update'))
.input(z.object({ agentId: z.string() }))
.mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return;
// Only broadcast "unlocked" when we actually released our own lock — if the
// lease expired and another member took over, the lock is still held.
const released = await ctx.editLockService.release('agent', input.agentId);
if (!released) return;
void publishResourceEvent(
{ id: input.agentId, type: 'agent' },
{ actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
);
}),
});
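The lock procedures above rely on an `EditLockService` whose implementation is not part of this diff. As a rough illustration of the lease semantics the call sites imply (acquire, a holder that blocks other users but not itself, release only by the current holder), here is a minimal in-memory sketch. The class name `LeaseLockSketch`, the shared `Map` store, the explicit `now` argument, and the TTL are all assumptions made for the example and are not the service's real API or storage.

```typescript
// Hypothetical sketch only: the real EditLockService is not shown in this diff.
interface Lease {
  expiresAt: number;
  holderId: string;
}

class LeaseLockSketch {
  private readonly store: Map<string, Lease>;
  private readonly ttlMs: number;
  private readonly userId: string;

  constructor(store: Map<string, Lease>, userId: string, ttlMs: number) {
    this.store = store;
    this.userId = userId;
    this.ttlMs = ttlMs;
  }

  // Holder of an unexpired lease, or null when the lease is missing or expired.
  getActiveHolder(key: string, now: number): string | null {
    const lease = this.store.get(key);
    return lease && lease.expiresAt > now ? lease.holderId : null;
  }

  // Same as getActiveHolder, except the caller never blocks itself.
  getBlockingHolder(key: string, now: number): string | null {
    const holder = this.getActiveHolder(key, now);
    return holder && holder !== this.userId ? holder : null;
  }

  // Takes or renews the lease unless another user holds an unexpired one.
  acquire(key: string, now: number) {
    const current = this.store.get(key);
    if (current && current.expiresAt > now && current.holderId !== this.userId) {
      return { expiresAt: current.expiresAt, holderId: current.holderId, lockedByOther: true };
    }
    const expiresAt = now + this.ttlMs;
    this.store.set(key, { expiresAt, holderId: this.userId });
    return { expiresAt, holderId: this.userId, lockedByOther: false };
  }

  // Returns true only when the caller actually held the lease it released.
  release(key: string, now: number): boolean {
    if (this.getActiveHolder(key, now) !== this.userId) return false;
    this.store.delete(key);
    return true;
  }
}
```

The boolean from `release` is what lets a caller skip the "unlocked" broadcast when its lease had already expired and another member took over, which is the case the router comments describe.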
@@ -14,6 +14,8 @@ import { type ChatGroupConfig } from '@/database/types/chatGroup';
import { router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
import { AgentGroupService } from '@/server/services/agentGroup';
import { EditLockService } from '@/server/services/editLock';
import { publishResourceEvent } from '@/server/services/resourceEvents';
import { TransferErrorCode } from '@/types/transferError';
/**
@@ -55,6 +57,7 @@ const agentGroupProcedure = wsCompatProcedure.use(serverDatabase).use(async (opt
agentGroupService: new AgentGroupService(ctx.serverDB, ctx.userId, wsId),
agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
chatGroupModel: new ChatGroupModel(ctx.serverDB, ctx.userId, wsId),
editLockService: new EditLockService(ctx.userId),
userModel: new UserModel(ctx.serverDB, ctx.userId),
},
});
@@ -402,6 +405,19 @@ export const agentGroupRouter = router({
}),
)
.mutation(async ({ input, ctx }) => {
// Collaborative edit lock: reject writes to a workspace group another
// member is actively editing. Inert until a client acquires the lock.
if (ctx.workspaceId) {
const blockedBy = await ctx.editLockService.getBlockingHolder('chatGroup', input.id);
if (blockedBy) {
throw new TRPCError({
cause: { data: { code: 'DocumentLocked' } },
code: 'CONFLICT',
message: 'Group is being edited by another user',
});
}
}
return ctx.chatGroupModel.update(input.id, {
...input.value,
config: ctx.agentGroupService.normalizeGroupConfig(
@@ -409,6 +425,47 @@ export const agentGroupRouter = router({
),
});
}),
acquireGroupLock: agentGroupProcedureWrite
.input(z.object({ id: z.string() }))
.mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const prev = await ctx.editLockService.getActiveHolder('chatGroup', input.id);
const result = await ctx.editLockService.acquire('chatGroup', input.id);
if ((result.holderId ?? null) !== (prev ?? null)) {
void publishResourceEvent(
{ id: input.id, type: 'chatGroup' },
{ actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
);
}
return result;
}),
getGroupLock: agentGroupProcedureWrite
.input(z.object({ id: z.string() }))
.query(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const holder = await ctx.editLockService.getActiveHolder('chatGroup', input.id);
return {
expiresAt: null,
holderId: holder ?? null,
lockedByOther: Boolean(holder) && holder !== ctx.userId,
};
}),
releaseGroupLock: agentGroupProcedureWrite
.input(z.object({ id: z.string() }))
.mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return;
// Only broadcast "unlocked" when we actually released our own lock — if the
// lease expired and another member took over, the lock is still held.
const released = await ctx.editLockService.release('chatGroup', input.id);
if (!released) return;
void publishResourceEvent(
{ id: input.id, type: 'chatGroup' },
{ actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
);
}),
});
export type AgentGroupRouter = typeof agentGroupRouter;
@@ -85,6 +85,7 @@ export const agentSignalRouter = router({
return enqueueAgentSignalSourceEvent(sourceEvent, {
agentId: input.agentId,
userId: ctx.userId,
workspaceId: ctx.workspaceId ?? undefined,
});
}),
listReceipts: agentSignalProcedure
-1
@@ -370,7 +370,6 @@ export const aiChatRouter = router({
{ assistantMessage, userMessage },
{
...(modelTiming ? { timing: modelTiming } : {}),
touchTopicUpdatedAt: !isCreateNewTopic,
},
);
},
+6 -1
@@ -268,9 +268,14 @@ export const connectorRouter = router({
await ctx.connectorModel.update(input.id, {
...patch,
// undefined → leave untouched; null → clear; object → encrypt the JSON string.
// When credentials are cleared, also drop the cached expiry timestamp so
// token-refresh logic doesn't act on a stale value for the new server.
...(credentials === undefined
? {}
: { credentials: credentials ? JSON.stringify(credentials) : null }),
: {
credentials: credentials ? JSON.stringify(credentials) : null,
...(credentials === null ? { tokenExpiresAt: null } : {}),
}),
} as any);
}),
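The tri-state handling in this hunk (undefined leaves the column alone, null clears it and the cached expiry, an object is serialized) can be isolated as a small pure function. The sketch below is only an illustration: `buildCredentialPatch` and the `CredentialPatch` type are hypothetical names, and encryption of the JSON string, which the comment mentions, is assumed to happen elsewhere.

```typescript
// Hypothetical helper mirroring the spread logic in the hunk above.
type Credentials = Record<string, unknown>;

interface CredentialPatch {
  credentials?: string | null;
  tokenExpiresAt?: null;
}

function buildCredentialPatch(credentials: Credentials | null | undefined): CredentialPatch {
  // undefined: the caller did not mention credentials, so touch nothing.
  if (credentials === undefined) return {};
  // null: clear the credentials and drop the stale expiry timestamp with them.
  if (credentials === null) return { credentials: null, tokenExpiresAt: null };
  // object: store the JSON string and leave the expiry untouched.
  return { credentials: JSON.stringify(credentials) };
}
```

Keeping `tokenExpiresAt` out of the object branch matches the diff, where the expiry is reset only when credentials are explicitly cleared.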
+53 -1
@@ -163,6 +163,50 @@ export const deviceRouter = router({
}),
),
/**
* Rename a branch in a directory on a remote device, via the device's
* `renameGitBranch` RPC.
*/
renameGitBranch: deviceProcedure
.input(
z.object({
deviceId: z.string(),
from: z.string(),
path: z.string(),
to: z.string(),
}),
)
.mutation(async ({ ctx, input }) =>
deviceGateway.renameGitBranch({
deviceId: input.deviceId,
from: input.from,
path: input.path,
to: input.to,
userId: ctx.userId,
}),
),
/**
* Delete a branch in a directory on a remote device, via the device's
* `deleteGitBranch` RPC.
*/
deleteGitBranch: deviceProcedure
.input(
z.object({
branch: z.string(),
deviceId: z.string(),
path: z.string(),
}),
)
.mutation(async ({ ctx, input }) =>
deviceGateway.deleteGitBranch({
branch: input.branch,
deviceId: input.deviceId,
path: input.path,
userId: ctx.userId,
}),
),
/**
* Pull (`--ff-only`) the current branch of a directory on a remote device, via
* the device's `pullGitBranch` RPC.
@@ -275,9 +319,17 @@ export const deviceRouter = router({
* receives render data, not a `localfile://` URL; saving remains unsupported.
*/
getLocalFilePreview: deviceProcedure
.input(z.object({ deviceId: z.string(), path: z.string(), workingDirectory: z.string() }))
.input(
z.object({
accept: z.enum(['image']).optional(),
deviceId: z.string(),
path: z.string(),
workingDirectory: z.string(),
}),
)
.query(async ({ ctx, input }) =>
deviceGateway.getLocalFilePreview({
accept: input.accept,
deviceId: input.deviceId,
path: input.path,
userId: ctx.userId,
@@ -253,6 +253,27 @@ export const documentRouter = router({
return ctx.documentService.queryDocuments(input);
}),
acquireDocumentLock: documentProcedure
.use(withScopedPermission('document:update'))
.input(z.object({ id: z.string() }))
.mutation(async ({ ctx, input }) => {
return ctx.documentService.acquireDocumentLock(input.id);
}),
getDocumentLock: documentProcedure
.use(withScopedPermission('document:update'))
.input(z.object({ id: z.string() }))
.query(async ({ ctx, input }) => {
return ctx.documentService.getDocumentLock(input.id);
}),
releaseDocumentLock: documentProcedure
.use(withScopedPermission('document:update'))
.input(z.object({ id: z.string() }))
.mutation(async ({ ctx, input }) => {
await ctx.documentService.releaseDocumentLock(input.id);
}),
updateDocument: documentProcedure
.use(withScopedPermission('document:update'))
.input(updateDocumentInputSchema)
+53 -7
@@ -1,10 +1,13 @@
import { z } from 'zod';
import { PushTokenModel } from '@/database/models/pushToken';
import { authedProcedure, router } from '@/libs/trpc/lambda';
import {
deletePushTokenByExpoTokenAndDevice,
PushTokenModel,
} from '@/database/models/pushToken';
import { authedProcedure, publicProcedure, router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
const pushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
const authedPushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
const { ctx } = opts;
return opts.next({
@@ -13,7 +16,7 @@ const pushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts)
});
export const pushTokenRouter = router({
register: pushTokenProcedure
register: authedPushTokenProcedure
.input(
z.object({
appVersion: z.string().optional(),
@@ -27,10 +30,53 @@ export const pushTokenRouter = router({
return ctx.pushTokenModel.upsert(input);
}),
unregister: pushTokenProcedure
.input(z.object({ deviceId: z.string().min(1) }))
/**
* Public on purpose: clients call this during sign-out, and in the wild many
* of those calls arrive after the session is already gone (expired OIDC
* token / cleared cookie). Authenticating by session here causes a 401
* storm on every such logout.
*
* Authorization model (Path A, new clients >= 1.0.8): the caller presents the
* (deviceId, expoToken) pair it received at registration. Holding both = proof
* of ownership of the row, same trust model as APNs/FCM unregister.
*
* Backwards compat for v1.0.7 (only sends `deviceId`):
* - Path B: when the request still carries a valid session, fall back to
* the original (userId, deviceId) delete. This covers the *active*
* sign-out path so PushChannel doesn't keep notifying a signed-out device
* until the user uninstalls (Expo's DeviceNotRegistered receipt only
* fires on uninstall, not on logout).
* - Path C: when there's no session either, silently succeed. The orphan
* row will be cleaned up by the existing `process-push-receipts` worker
* via Expo's DeviceNotRegistered receipts. Returning 200 here is what
* actually stops the 401 storm in production.
*/
unregister: publicProcedure
.use(serverDatabase)
.input(
z.object({
deviceId: z.string().min(1),
expoToken: z.string().min(1).optional(),
}),
)
.mutation(async ({ ctx, input }) => {
return ctx.pushTokenModel.unregister(input.deviceId);
const { deviceId, expoToken } = input;
// Path A: new clients — precise delete by (expoToken, deviceId), no session needed
if (expoToken) {
await deletePushTokenByExpoTokenAndDevice(ctx.serverDB, { deviceId, expoToken });
return { success: true };
}
// Path B: legacy v1.0.7 + valid session — fall back to (userId, deviceId)
if (ctx.userId) {
const pushTokenModel = new PushTokenModel(ctx.serverDB, ctx.userId);
await pushTokenModel.unregister(deviceId);
return { success: true };
}
// Path C: legacy v1.0.7 with no session — silent OK, cron worker cleans up
return { success: true };
}),
});
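The three-way branch in `unregister` reduces to a decision on two inputs: whether an `expoToken` was sent and whether the request carries a user id. A minimal sketch of that decision as a pure function follows; `selectUnregisterPath` and its types are hypothetical names introduced for illustration and do not appear in the PR.

```typescript
// Hypothetical sketch of the path selection described in the doc comment above.
type UnregisterPath = 'A' | 'B' | 'C';

interface UnregisterInput {
  deviceId: string;
  expoToken?: string;
}

function selectUnregisterPath(input: UnregisterInput, userId?: string): UnregisterPath {
  // Path A: the (expoToken, deviceId) pair is enough on its own, no session needed.
  if (input.expoToken) return 'A';
  // Path B: legacy client with a live session falls back to (userId, deviceId).
  if (userId) return 'B';
  // Path C: legacy client without a session; nothing is deleted, success is still reported.
  return 'C';
}
```

Checking `expoToken` first is what makes Path A win when both a token and a session are present, matching the "prefer expoToken precision" test.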
+55
@@ -14,6 +14,8 @@ import { TopicModel } from '@/database/models/topic';
import { workspaceMembers } from '@/database/schemas';
import { router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
import { EditLockService } from '@/server/services/editLock';
import { publishResourceEvent } from '@/server/services/resourceEvents';
import { TaskService } from '@/server/services/task';
import { TaskLifecycleService } from '@/server/services/taskLifecycle';
import { TaskRunnerService } from '@/server/services/taskRunner';
@@ -26,6 +28,7 @@ const taskProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) =>
ctx: {
agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
briefModel: new BriefModel(ctx.serverDB, ctx.userId, wsId),
editLockService: new EditLockService(ctx.userId),
taskLifecycle: new TaskLifecycleService(ctx.serverDB, ctx.userId, wsId),
taskModel: new TaskModel(ctx.serverDB, ctx.userId, wsId),
taskService: new TaskService(ctx.serverDB, ctx.userId, wsId),
@@ -927,6 +930,20 @@ export const taskRouter = router({
const model = ctx.taskModel;
await assertAssigneeAgentBelongsToUser(ctx.agentModel, data.assigneeAgentId);
const resolved = await resolveOrThrow(model, id);
// Collaborative edit lock: reject writes to a workspace task another member
// is actively editing. Inert until a client acquires the lock.
if (ctx.workspaceId) {
const blockedBy = await ctx.editLockService.getBlockingHolder('task', resolved.id);
if (blockedBy) {
throw new TRPCError({
cause: { data: { code: 'DocumentLocked' } },
code: 'CONFLICT',
message: 'Task is being edited by another user',
});
}
}
const resolvedParentTaskId =
parentTaskId === undefined
? undefined
@@ -947,6 +964,44 @@ export const taskRouter = router({
}
}),
acquireTaskLock: taskProcedureWrite.input(idInput).mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const resolved = await resolveOrThrow(ctx.taskModel, input.id);
const prev = await ctx.editLockService.getActiveHolder('task', resolved.id);
const result = await ctx.editLockService.acquire('task', resolved.id);
if ((result.holderId ?? null) !== (prev ?? null)) {
void publishResourceEvent(
{ id: resolved.id, type: 'task' },
{ actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
);
}
return result;
}),
getTaskLock: taskProcedureWrite.input(idInput).query(async ({ ctx, input }) => {
if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const resolved = await resolveOrThrow(ctx.taskModel, input.id);
const holder = await ctx.editLockService.getActiveHolder('task', resolved.id);
return {
expiresAt: null,
holderId: holder ?? null,
lockedByOther: Boolean(holder) && holder !== ctx.userId,
};
}),
releaseTaskLock: taskProcedureWrite.input(idInput).mutation(async ({ ctx, input }) => {
if (!ctx.workspaceId) return;
const resolved = await resolveOrThrow(ctx.taskModel, input.id);
// Only broadcast "unlocked" when we actually released our own lock — if the
// lease expired and another member took over, the lock is still held.
const released = await ctx.editLockService.release('task', resolved.id);
if (!released) return;
void publishResourceEvent(
{ id: resolved.id, type: 'task' },
{ actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
);
}),
updateConfig: taskProcedureWrite
.input(idInput.merge(z.object({ config: z.record(z.unknown()) })))
.mutation(async ({ input, ctx }) => {
+35 -4
@@ -162,6 +162,18 @@ export const topicRouter = router({
return ctx.topicModel.batchDeleteBySessionId(resolved.sessionId);
}),
batchMoveTopics: topicProcedure
.use(withScopedPermission('topic:update'))
.input(
z.object({
targetAgentId: z.string(),
topicIds: z.array(z.string()),
}),
)
.mutation(async ({ input, ctx }) => {
return ctx.topicModel.batchMoveToAgent(input.topicIds, input.targetAgentId);
}),
cloneTopic: topicProcedure
.use(withScopedPermission('topic:create'))
.input(z.object({ id: z.string(), newTitle: z.string().optional() }))
@@ -239,9 +251,18 @@ export const topicRouter = router({
return ctx.topicShareModel.create(input.topicId, input.visibility);
}),
getAllTopics: topicProcedure.query(async ({ ctx }) => {
return ctx.topicModel.queryAll();
}),
queryTopics: topicProcedure
.input(
z
.object({
pageSize: z.number().max(500).optional(),
statuses: z.array(z.string()).optional(),
})
.optional(),
)
.query(async ({ input, ctx }) => {
return ctx.topicModel.queryTopics({ pageSize: input?.pageSize, statuses: input?.statuses });
}),
getShareInfo: topicProcedure
.input(z.object({ topicId: z.string() }))
@@ -570,7 +591,17 @@ export const topicRouter = router({
ctx.workspaceId ?? undefined,
);
return ctx.topicModel.queryByKeyword(input.keywords, resolved.sessionId);
// Scope the search exactly like the topics list (`query`): by agentId
// directly (the new agent system stamps every topic with an agentId).
// Passing only the resolved sessionId used to miss every agentId-scoped
// topic — the cause of "no topics match" in the per-agent Topics search.
// `containerId` is only the fallback for legacy callers that pass no
// agentId/groupId.
return ctx.topicModel.queryByKeyword(input.keywords, {
agentId: input.agentId,
containerId: resolved.sessionId,
groupId: input.groupId,
});
}),
/**
+12 -10
@@ -1,11 +1,12 @@
import { z } from 'zod';
import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
import { AgentOperationModel } from '@/database/models/agentOperation';
import { LlmGenerationTracingModel } from '@/database/models/llmGenerationTracing';
import { VerifyCheckResultModel } from '@/database/models/verifyCheckResult';
import { VerifyCriterionModel } from '@/database/models/verifyCriterion';
import { VerifyRubricModel } from '@/database/models/verifyRubric';
import { authedProcedure, router } from '@/libs/trpc/lambda';
import { router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
import {
VerifyExecutorService,
@@ -35,18 +36,19 @@ const checkItemSchema = z.object({
verifierType: verifierTypeSchema,
});
const verifyProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
const verifyProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
const { ctx } = opts;
const workspaceId = ctx.workspaceId ?? undefined;
return opts.next({
ctx: {
criterionModel: new VerifyCriterionModel(ctx.serverDB, ctx.userId),
executorService: new VerifyExecutorService(ctx.serverDB, ctx.userId),
tracingModel: new LlmGenerationTracingModel(ctx.serverDB, ctx.userId),
feedbackService: new VerifyFeedbackService(ctx.serverDB, ctx.userId),
operationModel: new AgentOperationModel(ctx.serverDB, ctx.userId),
planGenerator: new VerifyPlanGeneratorService(ctx.serverDB, ctx.userId),
resultModel: new VerifyCheckResultModel(ctx.serverDB, ctx.userId),
rubricModel: new VerifyRubricModel(ctx.serverDB, ctx.userId),
criterionModel: new VerifyCriterionModel(ctx.serverDB, ctx.userId, workspaceId),
executorService: new VerifyExecutorService(ctx.serverDB, ctx.userId, workspaceId),
tracingModel: new LlmGenerationTracingModel(ctx.serverDB, ctx.userId, workspaceId),
feedbackService: new VerifyFeedbackService(ctx.serverDB, ctx.userId, workspaceId),
operationModel: new AgentOperationModel(ctx.serverDB, ctx.userId, workspaceId),
planGenerator: new VerifyPlanGeneratorService(ctx.serverDB, ctx.userId, workspaceId),
resultModel: new VerifyCheckResultModel(ctx.serverDB, ctx.userId, workspaceId),
rubricModel: new VerifyRubricModel(ctx.serverDB, ctx.userId, workspaceId),
},
});
});
+3 -14
@@ -3,8 +3,7 @@ import { z } from 'zod';
import { withScopedPermission } from '@/business/server/trpc-middlewares/rbacPermission';
import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
import { TopicModel } from '@/database/models/topic';
import { getServerDB } from '@/database/server';
import { publicProcedure, router } from '@/libs/trpc/lambda';
import { router } from '@/libs/trpc/lambda';
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
import { type BatchTaskResult } from '@/types/service';
@@ -95,12 +94,7 @@ export const topicRouter = router({
return data.id;
}),
getAllTopics: topicProcedure.query(async ({ ctx }) => {
return ctx.topicModel.queryAll();
}),
// TODO: this procedure should be used with authedProcedure
getTopics: publicProcedure
getTopics: topicProcedure
.input(
z.object({
containerId: z.string().nullable().optional(),
@@ -109,12 +103,7 @@ export const topicRouter = router({
}),
)
.query(async ({ input, ctx }) => {
if (!ctx.userId) return [];
const serverDB = await getServerDB();
const topicModel = new TopicModel(serverDB, ctx.userId, ctx.workspaceId ?? undefined);
return topicModel.query(input);
return ctx.topicModel.query(input);
}),
hasTopics: topicProcedure.query(async ({ ctx }) => {
@@ -231,6 +231,57 @@ describe('AgentService', () => {
// Avatar should not be present for non-builtin agents
expect((result as any)?.avatar).toBeUndefined();
});
it('should NOT inherit the member personal default model for a workspace inbox', async () => {
// Workspace inbox is persisted with an empty model/provider.
const mockAgent = {
id: 'agent-1',
slug: 'inbox',
};
const serverDefaultConfig = { model: 'system-default-model', provider: 'system-provider' };
const mockAgentModel = {
getBuiltinAgent: vi.fn().mockResolvedValue(mockAgent),
};
(AgentModel as any).mockImplementation(() => mockAgentModel);
(parseAgentConfig as any).mockReturnValue(serverDefaultConfig);
// The member opening the workspace inbox has a personal default model.
mockUserModel.getUserSettingsDefaultAgentConfig.mockResolvedValueOnce({
config: { model: 'opus-4.6', provider: 'anthropic' },
});
const workspaceService = new AgentService(mockDb, mockUserId, mockWorkspaceId);
const result = await workspaceService.getBuiltinAgent('inbox');
// Should fall back to the system default, NOT the member's personal model.
expect(result?.model).toBe('system-default-model');
expect(result?.provider).toBe('system-provider');
});
it('should still apply the personal default model for a personal inbox', async () => {
const mockAgent = {
id: 'agent-1',
slug: 'inbox',
};
const mockAgentModel = {
getBuiltinAgent: vi.fn().mockResolvedValue(mockAgent),
};
(AgentModel as any).mockImplementation(() => mockAgentModel);
(parseAgentConfig as any).mockReturnValue({});
mockUserModel.getUserSettingsDefaultAgentConfig.mockResolvedValueOnce({
config: { model: 'user-preferred-model', provider: 'user-provider' },
});
// No workspaceId → personal scope keeps the personal default behavior.
const newService = new AgentService(mockDb, mockUserId);
const result = await newService.getBuiltinAgent('inbox');
expect(result?.model).toBe('user-preferred-model');
expect(result?.provider).toBe('user-provider');
});
});
describe('getAgentConfig', () => {
+16 -4
@@ -174,6 +174,13 @@ export class AgentService {
* 2. serverDefaultAgentConfig - from environment variable
* 3. userDefaultAgentConfig - from user settings (defaultAgent.config)
* 4. agent - actual agent config from database
*
* Workspace exception: a workspace is a shared resource, so its agents must
* NOT inherit any individual member's *personal* default model. Otherwise a
* shared agent persisted with an empty model (e.g. the workspace inbox)
* resolves to whoever opens it: the creator's personal default leaks in and
* the workspace looks "initialized" with their model. For workspace-scoped
* reads we skip the user layer and fall back to the system default instead.
*/
private mergeDefaultConfig(
agent: any,
@@ -181,12 +188,17 @@ export class AgentService {
): LobeAgentConfig | null {
if (!agent) return null;
const userDefaultAgentConfig =
(defaultAgentConfig as { config?: PartialDeep<LobeAgentConfig> })?.config || {};
// Merge configs in order: DEFAULT -> server -> user -> agent
// Merge configs in order: DEFAULT -> server -> [user] -> agent
const serverDefaultAgentConfig = getServerDefaultAgentConfig();
const baseConfig = merge(DEFAULT_AGENT_CONFIG, serverDefaultAgentConfig);
// Skip the personal default layer for workspace-scoped agents (see above).
if (this.workspaceId) {
return merge(baseConfig, cleanObject(agent));
}
const userDefaultAgentConfig =
(defaultAgentConfig as { config?: PartialDeep<LobeAgentConfig> })?.config || {};
const withUserConfig = merge(baseConfig, userDefaultAgentConfig);
return merge(withUserConfig, cleanObject(agent));
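The layering rule in `mergeDefaultConfig` can be illustrated without the real `merge` and `cleanObject` helpers. The sketch below is a simplification under stated assumptions: it uses shallow object spread instead of a deep merge, `dropEmpty` is a hypothetical stand-in for `cleanObject`, and `ModelConfig` covers only the two fields the tests assert on.

```typescript
// Hypothetical, shallow approximation of the layered merge; not the service's code.
interface ModelConfig {
  model?: string;
  provider?: string;
}

interface ConfigLayers {
  agent: ModelConfig;
  server: ModelConfig;
  system: ModelConfig;
  user: ModelConfig;
}

// Stand-in for cleanObject: keep only fields that carry a non-empty value.
function dropEmpty(value: ModelConfig): ModelConfig {
  const out: ModelConfig = {};
  if (value.model) out.model = value.model;
  if (value.provider) out.provider = value.provider;
  return out;
}

function resolveConfig(layers: ConfigLayers, workspaceId?: string): ModelConfig {
  // DEFAULT -> server always apply.
  const base = { ...layers.system, ...layers.server };
  // The personal layer is skipped for workspace-scoped reads.
  const withUser = workspaceId ? base : { ...base, ...layers.user };
  // The persisted agent config wins last, but empty fields do not override.
  return { ...withUser, ...dropEmpty(layers.agent) };
}
```

With an agent persisted without a model, a workspace read resolves to the server default while a personal read resolves to the user's preferred model, which is the pair of behaviors the new `AgentService` tests check.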
@@ -10,6 +10,7 @@ import type { LobeChatDatabase } from '@/database/type';
import { AgentRuntimeCoordinator } from '@/server/modules/AgentRuntime/AgentRuntimeCoordinator';
import { OperationTraceRecorder } from './OperationTraceRecorder';
import { createDefaultSnapshotStore } from './snapshotStore';
const log = debug('lobe-server:abandon-operation');
@@ -127,25 +128,3 @@ export class AbandonOperationService {
return result;
}
}
function createDefaultSnapshotStore(): ISnapshotStore | null {
if (process.env.ENABLE_AGENT_S3_TRACING === '1') {
try {
const { S3SnapshotStore } = require('@/server/modules/AgentTracing');
return new S3SnapshotStore();
} catch {
/* S3SnapshotStore not available */
}
}
if (process.env.NODE_ENV === 'development') {
try {
const { FileSnapshotStore } = require('@lobechat/agent-tracing');
return new FileSnapshotStore();
} catch {
/* agent-tracing not available */
}
}
return null;
}
@@ -25,7 +25,12 @@ import {
invokeAgentSpanName,
tracer as agentRuntimeTracer,
} from '@lobechat/observability-otel/modules/agent-runtime';
import { type ChatToolPayload, type ExecSubAgentParams, type UIChatMessage } from '@lobechat/types';
import {
type ChatToolPayload,
type ExecSubAgentParams,
type ExecVirtualSubAgentParams,
type UIChatMessage,
} from '@lobechat/types';
import debug from 'debug';
import urlJoin from 'url-join';
@@ -56,6 +61,7 @@ import { CompletionLifecycle } from './CompletionLifecycle';
import { hookDispatcher } from './hooks';
import { HumanInterventionHandler } from './HumanInterventionHandler';
import { OperationTraceRecorder } from './OperationTraceRecorder';
import { createDefaultSnapshotStore } from './snapshotStore';
import { buildStepPresentation, formatTokenCount } from './stepPresentation';
import {
type AgentExecutionParams,
@@ -126,13 +132,17 @@ const toAgentSignalSnapshotEvents = (
*/
export interface AgentRuntimeDelegate {
/**
* Fork a sub-agent through the full high-level pipeline
* Run a legacy agent invocation through the full high-level pipeline
* (AiAgentService.execSubAgent -> execAgent: agent-config resolution, tool
* engine, context engineering, createOperation). Returns a deferred result;
* the parent op parks (`waiting_for_async_tool`) until the completion bridge
* backfills the placeholder and resumes it.
* engine, context engineering, createOperation).
*/
execSubAgent?: (params: ExecSubAgentParams) => Promise<unknown>;
/**
* Fork a `lobe-agent.callSubAgent` virtual child run. The child is marked as a
* sub-agent and owns the completion bridge that backfills the parent tool
* placeholder before resuming the parked parent operation.
*/
execVirtualSubAgent?: (params: ExecVirtualSubAgentParams) => Promise<unknown>;
}
export interface AgentRuntimeServiceOptions {
@@ -247,7 +257,7 @@ export class AgentRuntimeService {
this.queueService =
options?.queueService === null ? null : (options?.queueService ?? new QueueService());
this.traceRecorder = new OperationTraceRecorder(
options?.snapshotStore ?? this.createDefaultSnapshotStore(),
options?.snapshotStore ?? createDefaultSnapshotStore(),
);
this.agentFactory = options?.agentFactory;
this.delegate = options?.delegate ?? {};
@@ -1864,10 +1874,7 @@ export class AgentRuntimeService {
if (!tool || typeof tool !== 'object') continue;
const toolPayload = tool as { id?: unknown; result_msg_id?: unknown };
if (
typeof toolPayload.id === 'string' &&
typeof toolPayload.result_msg_id === 'string'
) {
if (typeof toolPayload.id === 'string' && typeof toolPayload.result_msg_id === 'string') {
toolResultMessageIds.set(toolPayload.id, toolPayload.result_msg_id);
}
}
@@ -1944,6 +1951,7 @@ export class AgentRuntimeService {
userTimezone: metadata?.userTimezone,
evalContext: metadata?.evalContext,
execSubAgent: this.delegate.execSubAgent,
execVirtualSubAgent: this.delegate.execVirtualSubAgent,
hookDispatcher,
loadAgentState: this.coordinator.loadAgentState.bind(this.coordinator),
messageModel: this.messageModel,
@@ -1967,34 +1975,6 @@ export class AgentRuntimeService {
return { agent, runtime };
}
/**
* Create default snapshot store based on environment.
* - ENABLE_AGENT_S3_TRACING=1 -> S3SnapshotStore
* - NODE_ENV=development -> FileSnapshotStore
* - Otherwise -> null (no tracing)
*/
private createDefaultSnapshotStore(): ISnapshotStore | null {
if (process.env.ENABLE_AGENT_S3_TRACING === '1') {
try {
const { S3SnapshotStore } = require('@/server/modules/AgentTracing');
return new S3SnapshotStore();
} catch {
// S3SnapshotStore not available
}
}
if (process.env.NODE_ENV === 'development') {
try {
const { FileSnapshotStore } = require('@lobechat/agent-tracing');
return new FileSnapshotStore();
} catch {
// agent-tracing not available
}
}
return null;
}
/**
* Compute device context from DB messages at step boundary.
* Uses findInMessages visitor to scan tool messages for device activation.
@@ -344,11 +344,16 @@ export class CompletionLifecycle {
metadata?.assistantMessageId,
metadata?.userId || this.userId,
);
void runVerifyOnCompletion(this.serverDB, metadata?.userId || this.userId, {
deliverable: event.lastAssistantContent ?? '',
goal,
operationId,
});
void runVerifyOnCompletion(
this.serverDB,
metadata?.userId || this.userId,
{
deliverable: event.lastAssistantContent ?? '',
goal,
operationId,
},
this.workspaceId,
);
}
if (reason === 'error') {
@@ -0,0 +1,71 @@
// @vitest-environment node
import { afterEach, describe, expect, it, vi } from 'vitest';
import { createDefaultSnapshotStore, shouldUseAgentS3Tracing } from '../snapshotStore';
const s3SnapshotStoreMock = vi.fn(() => ({ kind: 's3' }));
const fileSnapshotStoreMock = vi.fn(() => ({ kind: 'file' }));
const setEnv = (nodeEnv: string, agentS3Tracing?: string) => {
vi.stubEnv('NODE_ENV', nodeEnv);
vi.stubEnv('ENABLE_AGENT_S3_TRACING', agentS3Tracing);
};
const loadModule = vi.fn((moduleName: string) => {
if (moduleName === '@/server/modules/AgentTracing') {
return { S3SnapshotStore: s3SnapshotStoreMock };
}
if (moduleName === '@lobechat/agent-tracing') {
return { FileSnapshotStore: fileSnapshotStoreMock };
}
throw new Error(`Unexpected module: ${moduleName}`);
});
describe('agent runtime snapshot store defaults', () => {
afterEach(() => {
vi.unstubAllEnvs();
vi.clearAllMocks();
});
it('enables S3 tracing by default in production when env is unset', () => {
setEnv('production');
expect(shouldUseAgentS3Tracing()).toBe(true);
expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 's3' });
expect(loadModule).toHaveBeenCalledWith('@/server/modules/AgentTracing');
expect(s3SnapshotStoreMock).toHaveBeenCalledTimes(1);
expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
});
it('uses the local file snapshot store in development when env is unset', () => {
setEnv('development');
expect(shouldUseAgentS3Tracing()).toBe(false);
expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 'file' });
expect(loadModule).toHaveBeenCalledWith('@lobechat/agent-tracing');
expect(s3SnapshotStoreMock).not.toHaveBeenCalled();
expect(fileSnapshotStoreMock).toHaveBeenCalledTimes(1);
});
it('lets ENABLE_AGENT_S3_TRACING=1 force S3 tracing outside production', () => {
setEnv('development', '1');
expect(shouldUseAgentS3Tracing()).toBe(true);
expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 's3' });
expect(loadModule).toHaveBeenCalledWith('@/server/modules/AgentTracing');
expect(s3SnapshotStoreMock).toHaveBeenCalledTimes(1);
expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
});
it('lets an explicit ENABLE_AGENT_S3_TRACING value disable the production default', () => {
setEnv('production', '0');
expect(shouldUseAgentS3Tracing()).toBe(false);
expect(createDefaultSnapshotStore(loadModule)).toBeNull();
expect(loadModule).not.toHaveBeenCalled();
expect(s3SnapshotStoreMock).not.toHaveBeenCalled();
expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
});
});
@@ -0,0 +1,59 @@
import type { ISnapshotStore } from '@lobechat/agent-tracing';
const ENABLE_AGENT_S3_TRACING_VALUE = '1';
type SnapshotStoreConstructor = new () => ISnapshotStore;
type SnapshotStoreModuleLoader = (moduleName: string) => unknown;
interface FileSnapshotStoreModule {
FileSnapshotStore: SnapshotStoreConstructor;
}
interface S3SnapshotStoreModule {
S3SnapshotStore: SnapshotStoreConstructor;
}
const nodeRequire: SnapshotStoreModuleLoader = (moduleName) => require(moduleName);
export const shouldUseAgentS3Tracing = () => {
const explicitValue = process.env.ENABLE_AGENT_S3_TRACING;
if (explicitValue !== undefined) return explicitValue === ENABLE_AGENT_S3_TRACING_VALUE;
return process.env.NODE_ENV === 'production';
};
/**
* Create default snapshot store based on environment.
* - ENABLE_AGENT_S3_TRACING=1 -> S3SnapshotStore
* - NODE_ENV=production with ENABLE_AGENT_S3_TRACING unset -> S3SnapshotStore
* - NODE_ENV=development -> FileSnapshotStore
* - Otherwise -> null (no tracing)
*/
export const createDefaultSnapshotStore = (
loadModule: SnapshotStoreModuleLoader = nodeRequire,
): ISnapshotStore | null => {
if (shouldUseAgentS3Tracing()) {
try {
const { S3SnapshotStore } = loadModule(
'@/server/modules/AgentTracing',
) as S3SnapshotStoreModule;
return new S3SnapshotStore();
} catch {
// S3SnapshotStore not available
}
}
if (process.env.NODE_ENV === 'development') {
try {
const { FileSnapshotStore } = loadModule(
'@lobechat/agent-tracing',
) as FileSnapshotStoreModule;
return new FileSnapshotStore();
} catch {
// agent-tracing not available
}
}
return null;
};
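The selection rules in `shouldUseAgentS3Tracing` and `createDefaultSnapshotStore` can be summarized as a pure function. This is only a sketch for reading the diff, not the code in the PR: `resolveSnapshotStoreKind`, `TracingEnv`, and the `canLoad` callback are hypothetical names, and `canLoad` stands in for the `try`/`catch` around the module loader.

```typescript
type SnapshotStoreKind = 's3' | 'file' | null;

interface TracingEnv {
  ENABLE_AGENT_S3_TRACING?: string;
  NODE_ENV?: string;
}

// Mirrors shouldUseAgentS3Tracing: an explicit value always wins,
// otherwise S3 tracing defaults to on in production only.
const shouldUseS3 = (env: TracingEnv): boolean => {
  if (env.ENABLE_AGENT_S3_TRACING !== undefined) return env.ENABLE_AGENT_S3_TRACING === '1';
  return env.NODE_ENV === 'production';
};

// Hypothetical helper: `canLoad` returning false models a loader that throws,
// in which case selection falls through to the next rule.
const resolveSnapshotStoreKind = (
  env: TracingEnv,
  canLoad: (kind: 's3' | 'file') => boolean = () => true,
): SnapshotStoreKind => {
  if (shouldUseS3(env) && canLoad('s3')) return 's3';
  if (env.NODE_ENV === 'development' && canLoad('file')) return 'file';
  return null;
};
```

One consequence worth noting: when S3 tracing is forced in development but the S3 module cannot be loaded, the fall-through still reaches the file store.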
@@ -21,6 +21,12 @@ vi.mock('@/database/models/thread', () => ({
ThreadModel: vi.fn().mockImplementation(() => mockThreadModel),
}));
vi.mock('@/database/models/agentOperation', () => ({
AgentOperationModel: vi.fn().mockImplementation(() => ({
findById: vi.fn().mockResolvedValue({ trigger: 'cli' }),
})),
}));
// Mock other models
vi.mock('@/database/models/agent', () => ({
AgentModel: vi.fn().mockImplementation(() => ({
@@ -115,7 +121,7 @@ describe('AiAgentService.execSubAgent', () => {
service = new AiAgentService(mockDb, userId);
});
describe('successful task execution', () => {
describe('successful isolated execution', () => {
it('should create Thread with correct parameters', async () => {
// Mock execAgent to return success
vi.spyOn(service, 'execAgent').mockResolvedValue({
@@ -208,6 +214,7 @@ describe('AiAgentService.execSubAgent', () => {
agentId: 'agent-1',
appContext: {
groupId: 'group-1',
isSubAgent: false,
threadId: 'thread-123',
topicId: 'topic-1',
},
@@ -223,6 +230,46 @@ describe('AiAgentService.execSubAgent', () => {
});
});
it('should run deferred lobe-agent children through execVirtualSubAgent', async () => {
const execAgentSpy = vi.spyOn(service, 'execAgent').mockResolvedValue({
agentId: 'agent-1',
assistantMessageId: 'assistant-msg-1',
autoStarted: true,
createdAt: new Date().toISOString(),
message: 'Agent operation created successfully',
messageId: 'queue-msg-1',
operationId: 'op-123',
status: 'created',
success: true,
timestamp: new Date().toISOString(),
topicId: 'topic-1',
userMessageId: 'user-msg-1',
});
await service.execVirtualSubAgent({
agentId: 'agent-1',
instruction: 'Nested research task',
parentMessageId: 'tool-msg-1',
parentOperationId: 'parent-op-1',
topicId: 'topic-1',
});
expect(execAgentSpy).toHaveBeenCalledWith(
expect.objectContaining({
appContext: expect.objectContaining({
isSubAgent: true,
threadId: 'thread-123',
topicId: 'topic-1',
}),
hooks: expect.arrayContaining([
expect.objectContaining({ id: 'sub-agent-bridge', type: 'onComplete' }),
]),
parentOperationId: 'parent-op-1',
trigger: 'cli',
}),
);
});
it('should store operationId and startedAt in Thread metadata', async () => {
vi.spyOn(service, 'execAgent').mockResolvedValue({
agentId: 'agent-1',
@@ -409,7 +456,7 @@ describe('AiAgentService.execSubAgent', () => {
parentMessageId: 'parent-msg-1',
topicId: 'topic-1',
}),
).rejects.toThrow('Failed to create thread for task execution');
).rejects.toThrow('Failed to create thread for agent execution');
});
it('should throw error when Thread creation throws', async () => {
@@ -427,7 +474,7 @@ describe('AiAgentService.execSubAgent', () => {
});
});
describe('task message summary update', () => {
describe('source message summary update', () => {
it('should pass sourceMessageId (parentMessageId) to callbacks for summary update', async () => {
const execAgentSpy = vi.spyOn(service, 'execAgent').mockResolvedValue({
agentId: 'agent-1',
+91 -59
@@ -36,6 +36,7 @@ import type {
ExecGroupAgentResult,
ExecSubAgentParams,
ExecSubAgentResult,
ExecVirtualSubAgentParams,
LobeAgentAgencyConfig,
MessagePluginItem,
UserInterventionConfig,
@@ -318,9 +319,10 @@ export class AiAgentService {
// high-level pipelines mid-step. See AgentRuntimeDelegate. New high-level
// capabilities the runtime calls into go in this `delegate` object.
//
// `execSubAgent` is an auto-bound arrow field, so no `.bind(this)`.
// Arrow fields are auto-bound, so no `.bind(this)`.
delegate: {
execSubAgent: this.execSubAgent,
execVirtualSubAgent: this.execVirtualSubAgent,
},
workspaceId: wsId,
});
@@ -415,9 +417,10 @@ export class AiAgentService {
* Execute a single agent step against this service's runtime.
*
* Delegates to the internal AgentRuntimeService, which is already wired with
* the `execSubAgent` fork callback. The QStash step worker drives stepping
* through here so `lobe-agent.callSubAgent` can fork sub-agents; building a
* bare runtime there would lose the callback and fail with SUB_AGENT_UNAVAILABLE.
* the agent-invocation fork callbacks. The QStash step worker drives stepping
* through here so `lobe-agent.callSubAgent` can fork virtual sub-agents;
* building a bare runtime there would lose the callback and fail with
* SUB_AGENT_UNAVAILABLE.
*/
executeStep(params: AgentExecutionParams): Promise<AgentExecutionResult> {
return this.agentRuntimeService.executeStep(params);
@@ -2296,7 +2299,7 @@ export class AiAgentService {
: undefined;
// 13. Create user message in database
// Include threadId if provided (for SubAgent task execution in isolated Thread)
// Include threadId if provided (for isolated agent execution)
const userMessageRecord = runFromHistory
? undefined
: await this.messageModel.create({
@@ -2344,7 +2347,7 @@ export class AiAgentService {
}
// 14. Create assistant message placeholder in database
// Include threadId if provided (for SubAgent task execution in isolated Thread)
// Include threadId if provided (for isolated agent execution)
const assistantMessageRecord = await this.messageModel.create({
agentId: persistAgentId,
content: LOADING_FLAT,
@@ -2856,35 +2859,46 @@ export class AiAgentService {
}
/**
* Execute SubAgent task (supports both Group and Single Agent mode)
* Execute an agent in an isolated Thread context.
*
* This method is called by Supervisor (Group mode) or Agent (Single mode)
* to delegate tasks to SubAgents. Each task runs in an isolated Thread context.
*
* - Group mode: pass groupId, Thread will be associated with the Group
* - Single Agent mode: omit groupId, Thread will only be associated with the Agent
*
* Flow:
* 1. Create Thread (type='isolation', status='processing')
* 2. Delegate to execAgent with threadId in appContext
* 3. Store operationId in Thread metadata
* Group/callAgent paths use this entry. It does not mark the child as a
* virtual sub-agent and it does not install the async completion bridge.
*/
// Arrow field (not a method) so it stays bound to this instance when handed to
// AgentRuntimeService as the `execSubAgent` fork callback — no `.bind(this)`.
execSubAgent = async (params: ExecSubAgentParams): Promise<ExecSubAgentResult> => {
const {
groupId,
topicId,
parentMessageId,
agentId,
instruction,
title,
parentOperationId,
resumeParentOnComplete,
} = params;
// Arrow field (not a method) so it stays bound when handed to AgentRuntimeService.
execSubAgent = async (params: ExecSubAgentParams): Promise<ExecSubAgentResult> =>
this.execAgentThreadRun(params, {
isSubAgent: false,
logScope: 'execSubAgent',
});
/**
* Execute a virtual sub-agent created by `lobe-agent.callSubAgent`.
*
* This path is a child operation of the current agent run. It is marked as a
* sub-agent so it cannot recursively spawn more sub-agents, and it registers
* the bridge that backfills the parent's placeholder tool message.
*/
execVirtualSubAgent = async (params: ExecVirtualSubAgentParams): Promise<ExecSubAgentResult> =>
this.execAgentThreadRun(params, {
isSubAgent: true,
logScope: 'execVirtualSubAgent',
resumeParentOnComplete: true,
});
private async execAgentThreadRun(
params: ExecSubAgentParams | ExecVirtualSubAgentParams,
options: {
isSubAgent: boolean;
logScope: 'execSubAgent' | 'execVirtualSubAgent';
resumeParentOnComplete?: boolean;
},
): Promise<ExecSubAgentResult> {
const { groupId, topicId, parentMessageId, agentId, instruction, title, parentOperationId } =
params;
log(
'execSubAgent: agentId=%s, groupId=%s, topicId=%s, instruction=%s',
'%s: agentId=%s, groupId=%s, topicId=%s, instruction=%s',
options.logScope,
agentId,
groupId,
topicId,
@@ -2903,7 +2917,7 @@ export class AiAgentService {
.catch(() => {});
}
// 1. Create Thread for isolated task execution
// 1. Create Thread for isolated agent execution
const thread = await this.threadModel.create({
agentId,
groupId,
@@ -2914,10 +2928,10 @@ export class AiAgentService {
});
if (!thread) {
throw new Error('Failed to create thread for task execution');
throw new Error('Failed to create thread for agent execution');
}
log('execSubAgent: created thread %s', thread.id);
log('%s: created thread %s', options.logScope, thread.id);
// 2. Update Thread status to processing with startedAt timestamp
const startedAt = new Date().toISOString();
@@ -2926,14 +2940,19 @@ export class AiAgentService {
status: ThreadStatus.Processing,
});
// 3. Create hooks for updating Thread metadata and task message
const threadHooks = this.createThreadHooks(thread.id, startedAt, parentMessageId);
// For the deferred-tool path, also register the completion bridge that
// 3. Create hooks for updating Thread metadata and source message
const threadHooks = this.createThreadHooks(
thread.id,
startedAt,
parentMessageId,
options.logScope,
);
// For the virtual sub-agent path, also register the completion bridge that
// backfills the parent's placeholder tool message and resumes the parked
// parent op once the whole batch is done. Registered last so its
// tool-message backfill (content + pluginState) is the final write.
// parent op once the child run is done. Registered last so its tool-message
// backfill (content + pluginState) is the final write.
const hooks =
resumeParentOnComplete && parentOperationId
options.resumeParentOnComplete && parentOperationId
? [
...threadHooks,
this.createSubAgentBridgeHook(parentOperationId, parentMessageId, thread.id),
@@ -2953,16 +2972,23 @@ export class AiAgentService {
).findById(parentOperationId);
inheritedTrigger = parentOp?.trigger ?? undefined;
} catch (error) {
log('execSubAgent: failed to read parent operation trigger: %O', error);
log('%s: failed to read parent operation trigger: %O', options.logScope, error);
}
}
const appContext: NonNullable<InternalExecAgentParams['appContext']> = {
groupId,
isSubAgent: options.isSubAgent,
threadId: thread.id,
topicId,
};
// 4. Delegate to execAgent with threadId in appContext and hooks
// The instruction will be created as user message in the Thread
// Use headless mode to skip human approval in async task execution
// Use headless mode to skip human approval in async agent execution
const result = await this.execAgent({
agentId,
appContext: { groupId, threadId: thread.id, topicId },
appContext,
autoStart: true,
hooks,
parentOperationId,
@@ -2972,7 +2998,8 @@ export class AiAgentService {
});
log(
'execSubAgent: delegated to execAgent, operationId=%s, success=%s',
'%s: delegated to execAgent, operationId=%s, success=%s',
options.logScope,
result.operationId,
result.success,
);
@@ -3028,7 +3055,7 @@ export class AiAgentService {
success: result.success ?? false,
threadId: thread.id,
};
};
}
/**
* Create step lifecycle callbacks for updating Thread metadata
@@ -3036,12 +3063,13 @@ export class AiAgentService {
*
* @param threadId - The Thread ID to update
* @param startedAt - The start time ISO string
* @param sourceMessageId - The task message ID (sourceMessageId from Thread) to update with summary
* @param sourceMessageId - The source message ID from Thread to update with summary
*/
private createThreadMetadataCallbacks(
threadId: string,
startedAt: string,
sourceMessageId: string,
logScope: 'execSubAgent' | 'execVirtualSubAgent' = 'execSubAgent',
): StepLifecycleCallbacks {
// Accumulator for tracking metrics across steps
let accumulatedToolCalls = 0;
@@ -3067,9 +3095,9 @@ export class AiAgentService {
totalToolCalls: accumulatedToolCalls,
},
});
log('execSubAgent: updated thread %s metadata after step %d', threadId, state.stepCount);
log('%s: updated thread %s metadata after step %d', logScope, threadId, state.stepCount);
} catch (error) {
log('execSubAgent: failed to update thread metadata: %O', error);
log('%s: failed to update thread metadata: %O', logScope, error);
}
},
@@ -3101,13 +3129,13 @@ export class AiAgentService {
}
}
// Log error when task fails
// Log error when the isolated run fails
if (reason === 'error' && finalState.error) {
console.error('execSubAgent: task failed for thread %s:', threadId, finalState.error);
console.error('%s: run failed for thread %s:', logScope, threadId, finalState.error);
}
try {
// Extract summary from last assistant message and update task message content
// Extract summary from last assistant message and update source message content
const lastAssistantMessage = finalState.messages
?.slice()
.reverse()
@@ -3117,7 +3145,7 @@ export class AiAgentService {
await this.messageModel.update(sourceMessageId, {
content: lastAssistantMessage.content,
});
log('execSubAgent: updated task message %s with summary', sourceMessageId);
log('%s: updated source message %s with summary', logScope, sourceMessageId);
}
// Format error for proper serialization (Error objects don't serialize with JSON.stringify)
@@ -3140,13 +3168,14 @@ export class AiAgentService {
});
log(
'execSubAgent: thread %s completed with status %s, reason: %s',
'%s: thread %s completed with status %s, reason: %s',
logScope,
threadId,
status,
reason,
);
} catch (error) {
console.error('execSubAgent: failed to update thread on completion: %O', error);
console.error('%s: failed to update thread on completion: %O', logScope, error);
}
},
};
@@ -3160,6 +3189,7 @@ export class AiAgentService {
threadId: string,
startedAt: string,
sourceMessageId: string,
logScope: 'execSubAgent' | 'execVirtualSubAgent',
): AgentHook[] {
let accumulatedToolCalls = 0;
@@ -3186,7 +3216,7 @@ export class AiAgentService {
},
});
} catch (error) {
log('Thread hook afterStep: failed to update metadata: %O', error);
log('%s: thread hook afterStep failed to update metadata: %O', logScope, error);
}
},
id: 'thread-metadata-update',
@@ -3226,14 +3256,15 @@ export class AiAgentService {
if (event.reason === 'error' && finalState.error) {
console.error(
'Thread hook onComplete: task failed for thread %s:',
'%s: thread hook onComplete run failed for thread %s:',
logScope,
threadId,
finalState.error,
);
}
try {
// Update task message with summary
// Update source message with summary
const lastAssistantMessage = finalState.messages
?.slice()
.reverse()
@@ -3263,13 +3294,14 @@ export class AiAgentService {
});
log(
'Thread hook onComplete: thread %s status=%s reason=%s',
'%s: thread hook onComplete thread %s status=%s reason=%s',
logScope,
threadId,
status,
event.reason,
);
} catch (error) {
console.error('Thread hook onComplete: failed to update: %O', error);
console.error('%s: thread hook onComplete failed to update: %O', logScope, error);
}
},
id: 'thread-completion',
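The split between `execSubAgent` and `execVirtualSubAgent` relies on two things: arrow fields that stay bound when handed over as delegate callbacks, and one private helper that varies by an options object. A minimal sketch of that shape follows. `SketchAgentService`, `RunOptions`, and `RunResult` are hypothetical, and the helper is synchronous for brevity, whereas the real methods are async and create threads and operations.

```typescript
interface RunOptions {
  isSubAgent: boolean;
  logScope: 'execSubAgent' | 'execVirtualSubAgent';
  resumeParentOnComplete?: boolean;
}

interface RunResult {
  hooks: string[];
  isSubAgent: boolean;
  owner: string;
}

class SketchAgentService {
  private readonly owner: string;

  constructor(owner: string) {
    this.owner = owner;
  }

  // Arrow fields capture the instance, so they can be passed around detached.
  execSubAgent = (parentOperationId?: string): RunResult =>
    this.run(parentOperationId, { isSubAgent: false, logScope: 'execSubAgent' });

  execVirtualSubAgent = (parentOperationId?: string): RunResult =>
    this.run(parentOperationId, {
      isSubAgent: true,
      logScope: 'execVirtualSubAgent',
      resumeParentOnComplete: true,
    });

  private run(parentOperationId: string | undefined, options: RunOptions): RunResult {
    const hooks = ['thread-metadata-update', 'thread-completion'];
    // The bridge is appended last, and only when there is a parent to resume.
    if (options.resumeParentOnComplete && parentOperationId) hooks.push('sub-agent-bridge');
    return { hooks, isSubAgent: options.isSubAgent, owner: this.owner };
  }
}

// Detached references, as in the `delegate` object handed to the runtime.
const sketchService = new SketchAgentService('user-1');
const sketchDelegate = {
  execSubAgent: sketchService.execSubAgent,
  execVirtualSubAgent: sketchService.execVirtualSubAgent,
};
```

In this sketch the bridge hook is skipped when `parentOperationId` is missing, even on the virtual path, which matches the `options.resumeParentOnComplete && parentOperationId` guard in the diff.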
@@ -990,6 +990,7 @@ export class BotMessageRouter {
agentId,
db: serverDB,
userId,
workspaceId: workspaceId ?? undefined,
},
{ ignoreError: true },
);
@@ -1175,6 +1176,7 @@ export class BotMessageRouter {
agentId,
db: serverDB,
userId,
workspaceId: workspaceId ?? undefined,
},
{ ignoreError: true },
);
@@ -1392,6 +1394,7 @@ export class BotMessageRouter {
agentId,
db: serverDB,
userId,
workspaceId: workspaceId ?? undefined,
},
{ ignoreError: true },
);
@@ -718,7 +718,37 @@ describe('DeviceGateway', () => {
{ deviceId: 'dev-1', timeout: 30_000, userId: 'user-1' },
{
method: 'getLocalFilePreview',
params: { path: '/proj/App.tsx', workingDirectory: '/proj' },
params: { accept: undefined, path: '/proj/App.tsx', workingDirectory: '/proj' },
},
);
});
it('forwards image-only preview constraints to the device rpc', async () => {
configure();
const data = {
preview: {
base64: 'aW1hZ2U=',
contentType: 'image/png',
type: 'image',
},
success: true,
};
mockClient.invokeRpc.mockResolvedValue({ data, success: true });
const proxy = new DeviceGateway();
await proxy.getLocalFilePreview({
accept: 'image',
deviceId: 'dev-1',
path: '/proj/image.png',
userId: 'user-1',
workingDirectory: '/proj',
});
expect(mockClient.invokeRpc).toHaveBeenCalledWith(
{ deviceId: 'dev-1', timeout: 30_000, userId: 'user-1' },
{
method: 'getLocalFilePreview',
params: { accept: 'image', path: '/proj/image.png', workingDirectory: '/proj' },
},
);
});
@@ -14,9 +14,11 @@ import type {
DeviceGitBranchInfo,
DeviceGitBranchListItem,
DeviceGitCheckoutResult,
DeviceGitDeleteBranchResult,
DeviceGitFileRevertResult,
DeviceGitLinkedPullRequestResult,
DeviceGitRemoteBranchListItem,
DeviceGitRenameBranchResult,
DeviceGitSyncResult,
DeviceGitWorkingTreeFiles,
DeviceGitWorkingTreePatches,
@@ -272,6 +274,73 @@ export class DeviceGateway {
}
}
/**
* Rename a branch in a directory on a remote device via the `renameGitBranch`
* device RPC.
*/
async renameGitBranch(params: {
deviceId: string;
from: string;
path: string;
timeout?: number;
to: string;
userId: string;
}): Promise<DeviceGitRenameBranchResult> {
const { userId, deviceId, from, to, path, timeout = 30_000 } = params;
const client = this.getClient();
if (!client) return { error: 'Device gateway not configured', success: false };
try {
const result = await client.invokeRpc<DeviceGitRenameBranchResult>(
{ deviceId, timeout, userId },
{ method: 'renameGitBranch', params: { from, path, to } },
);
if (!result.success || !result.data) {
log('renameGitBranch: failed for deviceId=%s — %s', deviceId, result.error);
return { error: result.error || 'Rename failed', success: false };
}
return result.data;
} catch (error) {
log('renameGitBranch: error for deviceId=%s — %O', deviceId, error);
return { error: (error as Error)?.message || 'Rename failed', success: false };
}
}
/**
* Delete a branch in a directory on a remote device via the `deleteGitBranch`
* device RPC.
*/
async deleteGitBranch(params: {
branch: string;
deviceId: string;
path: string;
timeout?: number;
userId: string;
}): Promise<DeviceGitDeleteBranchResult> {
const { userId, deviceId, branch, path, timeout = 30_000 } = params;
const client = this.getClient();
if (!client) return { error: 'Device gateway not configured', success: false };
try {
const result = await client.invokeRpc<DeviceGitDeleteBranchResult>(
{ deviceId, timeout, userId },
{ method: 'deleteGitBranch', params: { branch, path } },
);
if (!result.success || !result.data) {
log('deleteGitBranch: failed for deviceId=%s — %s', deviceId, result.error);
return { error: result.error || 'Delete failed', success: false };
}
return result.data;
} catch (error) {
log('deleteGitBranch: error for deviceId=%s — %O', deviceId, error);
return { error: (error as Error)?.message || 'Delete failed', success: false };
}
}
/**
* Pull (`--ff-only`) the current branch of a directory on a remote device via
* the `pullGitBranch` device RPC.
@@ -473,20 +542,24 @@ export class DeviceGateway {
* exposing a `localfile://` URL to web callers.
*/
async getLocalFilePreview(params: {
accept?: 'image';
deviceId: string;
path: string;
timeout?: number;
userId: string;
workingDirectory: string;
}): Promise<DeviceLocalFilePreviewResult> {
const { userId, deviceId, path, workingDirectory, timeout = 30_000 } = params;
const { accept, userId, deviceId, path, workingDirectory, timeout = 30_000 } = params;
const client = this.getClient();
if (!client) return { error: 'Device gateway not configured', success: false };
try {
const result = await client.invokeRpc<DeviceLocalFilePreviewResult>(
{ deviceId, timeout, userId },
{ method: 'getLocalFilePreview', params: { path, workingDirectory } },
{
method: 'getLocalFilePreview',
params: { accept, path, workingDirectory },
},
);
if (!result.success || !result.data) {
@@ -665,7 +738,7 @@ export class DeviceGateway {
}
async executeToolCall(
params: { deviceId: string; userId: string },
params: { deviceId: string; operationId?: string; userId: string },
toolCall: { apiName: string; arguments: string; identifier: string },
timeout = 30_000,
): Promise<DeviceToolCallResult> {
@@ -679,7 +752,8 @@ export class DeviceGateway {
}
log(
'executeToolCall: userId=%s, deviceId=%s, tool=%s/%s',
'executeToolCall: operationId=%s, userId=%s, deviceId=%s, tool=%s/%s',
params.operationId ?? 'N/A',
params.userId,
params.deviceId,
toolCall.identifier,
@@ -688,7 +762,12 @@ export class DeviceGateway {
try {
return await client.executeToolCall(
{ deviceId: params.deviceId, timeout, userId: params.userId },
{
deviceId: params.deviceId,
operationId: params.operationId,
timeout,
userId: params.userId,
},
toolCall,
);
} catch (error) {
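`renameGitBranch`, `deleteGitBranch`, and `getLocalFilePreview` share one shape: return early when no client is configured, invoke the RPC with a 30-second default timeout, and normalize both failed envelopes and thrown errors into `{ error, success: false }`. A hedged sketch of that pattern as a single helper is below. `callDeviceRpc`, `SketchRpcClient`, and `RpcEnvelope` are hypothetical names; the PR itself repeats the pattern per method rather than extracting it.

```typescript
interface RpcEnvelope<T> {
  data?: T;
  error?: string;
  success: boolean;
}

interface RpcFailure {
  error: string;
  success: false;
}

// Hypothetical minimal client shape; the real gateway client has more surface.
interface SketchRpcClient {
  invokeRpc<T>(
    target: { deviceId: string; timeout: number; userId: string },
    call: { method: string; params: Record<string, unknown> },
  ): Promise<RpcEnvelope<T>>;
}

async function callDeviceRpc<T>(
  client: SketchRpcClient | null,
  target: { deviceId: string; timeout?: number; userId: string },
  call: { method: string; params: Record<string, unknown> },
  fallbackError: string,
): Promise<T | RpcFailure> {
  if (!client) return { error: 'Device gateway not configured', success: false };

  const { deviceId, userId, timeout = 30_000 } = target;
  try {
    const result = await client.invokeRpc<T>({ deviceId, timeout, userId }, call);
    // A transport-level success with no payload is still treated as a failure.
    if (!result.success || !result.data) {
      return { error: result.error || fallbackError, success: false };
    }
    return result.data;
  } catch (error) {
    return { error: (error as Error)?.message || fallbackError, success: false };
  }
}
```

A call such as `callDeviceRpc(client, { deviceId, userId }, { method: 'renameGitBranch', params: { from, path, to } }, 'Rename failed')` would then cover the rename case, at the cost of losing the per-method log lines the diff keeps.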
@@ -3,7 +3,10 @@ import { documentHistories, documents, files, users } from '@lobechat/database/s
import { and, desc, eq } from 'drizzle-orm';
import { afterEach, beforeEach, describe, expect, it } from 'vitest';
import { DOCUMENT_HISTORY_SOURCE_LIMITS } from '@/const/documentHistory';
import {
DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS,
DOCUMENT_HISTORY_SOURCE_LIMITS,
} from '@/const/documentHistory';
import { getTestDB } from '@/database/core/getTestDB';
import { DocumentModel } from '@/database/models/document';
import { FileModel } from '@/database/models/file';
@@ -420,7 +423,7 @@ describe('DocumentHistoryService', () => {
documentId: doc.id,
editorData: { v: i },
saveSource: 'autosave',
savedAt: new Date(2026, 3, 1, 0, i, 0),
savedAt: new Date(2026, 3, 1, 0, i * 10, 0),
});
}
@@ -463,6 +466,182 @@ describe('DocumentHistoryService', () => {
});
});
describe('autosave window coalescing', () => {
const base = new Date('2026-04-01T10:00:00Z');
const minutes = (n: number) => new Date(base.getTime() + n * 60 * 1000);
const listRows = (documentId: string) =>
serverDB
.select()
.from(documentHistories)
.where(eq(documentHistories.documentId, documentId))
.orderBy(desc(documentHistories.savedAt), desc(documentHistories.id));
it('should overwrite the latest autosave row within the window', async () => {
const doc = await createTestDocument('Hello');
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 1 },
saveSource: 'autosave',
savedAt: base,
});
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 2 },
saveSource: 'autosave',
savedAt: minutes(5),
});
const rows = await listRows(doc.id);
expect(rows).toHaveLength(1);
expect(rows[0].editorData).toEqual({ v: 2 });
expect(rows[0].savedAt).toEqual(minutes(5));
});
it('should insert a new row once the save falls into the next window bucket', async () => {
const doc = await createTestDocument('Hello');
const windowMinutes = DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS / 60_000;
for (const [i, at] of [
base,
minutes(5),
minutes(windowMinutes),
minutes(windowMinutes + 5),
].entries()) {
await historyService.createHistory({
documentId: doc.id,
editorData: { v: i + 1 },
saveSource: 'autosave',
savedAt: at,
});
}
const rows = await listRows(doc.id);
expect(rows).toHaveLength(2);
expect(rows[0].editorData).toEqual({ v: 4 });
expect(rows[0].savedAt).toEqual(minutes(windowMinutes + 5));
expect(rows[1].editorData).toEqual({ v: 2 });
expect(rows[1].savedAt).toEqual(minutes(5));
});
it('should start a new window when a non-autosave version is the latest', async () => {
const doc = await createTestDocument('Hello');
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 1 },
saveSource: 'autosave',
savedAt: base,
});
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 2 },
saveSource: 'manual',
savedAt: minutes(1),
});
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 3 },
saveSource: 'autosave',
savedAt: minutes(2),
});
const rows = await listRows(doc.id);
expect(rows).toHaveLength(3);
expect(rows.map((r) => r.saveSource)).toEqual(['autosave', 'manual', 'autosave']);
});
it('should never coalesce manual saves', async () => {
const doc = await createTestDocument('Hello');
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 1 },
saveSource: 'manual',
savedAt: base,
});
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 2 },
saveSource: 'manual',
savedAt: minutes(1),
});
const rows = await listRows(doc.id);
expect(rows).toHaveLength(2);
});
it('should insert a new row within the same window when breakAutosaveWindow is true', async () => {
const doc = await createTestDocument('Hello');
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 1 },
saveSource: 'autosave',
savedAt: base,
});
await historyService.createHistory({
breakAutosaveWindow: true,
documentId: doc.id,
editorData: { v: 2 },
saveSource: 'autosave',
savedAt: minutes(3),
});
const rows = await listRows(doc.id);
expect(rows).toHaveLength(2);
expect(rows[0].editorData).toEqual({ v: 2 });
expect(rows[0].savedAt).toEqual(minutes(3));
expect(rows[1].editorData).toEqual({ v: 1 });
expect(rows[1].savedAt).toEqual(base);
});
it('should coalesce into the break row on the next autosave without the flag', async () => {
const doc = await createTestDocument('Hello');
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 1 },
saveSource: 'autosave',
savedAt: base,
});
await historyService.createHistory({
breakAutosaveWindow: true,
documentId: doc.id,
editorData: { v: 2 },
saveSource: 'autosave',
savedAt: minutes(3),
});
await historyService.createHistory({
documentId: doc.id,
editorData: { v: 3 },
saveSource: 'autosave',
savedAt: minutes(5),
});
const rows = await listRows(doc.id);
expect(rows).toHaveLength(2);
expect(rows[0].editorData).toEqual({ v: 3 });
expect(rows[0].savedAt).toEqual(minutes(5));
expect(rows[1].editorData).toEqual({ v: 1 });
expect(rows[1].savedAt).toEqual(base);
});
});
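The coalescing behavior these tests pin down can be modeled in memory. The sketch below is an assumption about the rule, inferred only from the test names and expectations: autosaves are grouped into fixed buckets of the window length, and a new autosave overwrites the latest row only when that row is also an autosave in the same bucket and `breakAutosaveWindow` is not set. The 10-minute `SKETCH_WINDOW_MS` is a placeholder, since the real `DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS` value is not shown in this diff, and `createHistorySketch` is a hypothetical stand-in for the service method.

```typescript
type SaveSource = 'autosave' | 'manual';

interface HistoryRow {
  editorData: unknown;
  saveSource: SaveSource;
  savedAt: Date;
}

// Assumed window length for this sketch only.
const SKETCH_WINDOW_MS = 10 * 60 * 1000;

const bucketOf = (date: Date): number => Math.floor(date.getTime() / SKETCH_WINDOW_MS);

// Rows are kept newest-first, matching the ordering used by the tests.
const createHistorySketch = (
  rows: HistoryRow[],
  input: HistoryRow & { breakAutosaveWindow?: boolean },
): HistoryRow[] => {
  const { breakAutosaveWindow, ...row } = input;
  const latest = rows[0];

  const canCoalesce =
    !breakAutosaveWindow &&
    row.saveSource === 'autosave' &&
    latest?.saveSource === 'autosave' &&
    bucketOf(latest.savedAt) === bucketOf(row.savedAt);

  // Coalescing replaces the latest row; otherwise a new row is prepended.
  return canCoalesce ? [row, ...rows.slice(1)] : [row, ...rows];
};
```

Under this reading, a manual save between two autosaves always splits them, and a row inserted with `breakAutosaveWindow` becomes the new coalescing target for later autosaves in the same bucket.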
describe('getDocumentHistoryItem', () => {
it('should resolve head as current document state', async () => {
const editorData = createValidEditorData('Head content');
@@ -5,14 +5,22 @@ import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import { DocumentModel } from '@/database/models/document';
import { FileModel } from '@/database/models/file';
import { EditLockService } from '../../editLock';
import { FileService } from '../../file';
import { publishResourceEvent } from '../../resourceEvents';
import { DocumentHistoryService } from '../history';
import { DocumentService } from '../index';
vi.mock('@/server/modules/AgentRuntime/redis', () => ({ getAgentRuntimeRedisClient: () => null }));
vi.mock('@/database/models/document');
vi.mock('@/database/models/file');
vi.mock('../../file');
vi.mock('../history');
// Spy on the realtime broadcast so we can assert lock.changed is published only
// on a genuine state change (holder edge / actual release).
vi.mock('../../resourceEvents', () => ({ publishResourceEvent: vi.fn() }));
const publishResourceEventMock = vi.mocked(publishResourceEvent);
vi.mock('@lobechat/file-loaders', () => ({
loadFile: vi.fn(),
UnsupportedFileTypeError: class UnsupportedFileTypeError extends Error {
@@ -794,6 +802,168 @@ describe('DocumentService', () => {
'Document not found: missing-doc',
);
});
it('should reject a workspace save when another member holds the edit lock', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.findById.mockResolvedValue(createCurrentDocument({ workspaceId: 'ws-1' }));
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
await expect(wsService.updateDocument('doc-1', { content: 'x' })).rejects.toMatchObject({
code: 'CONFLICT',
});
expect(mockDocumentModel.update).not.toHaveBeenCalled();
});
it('should allow a workspace save when no other member holds the lock', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.update.mockResolvedValue({ id: 'doc-1' });
mockDocumentModel.findById.mockResolvedValue(createCurrentDocument({ workspaceId: 'ws-1' }));
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
await wsService.updateDocument('doc-1', { content: 'x' });
expect(mockDocumentModel.update).toHaveBeenCalled();
});
it('allows a metadata-only save while another member holds the lock (only the body is locked)', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.update.mockResolvedValue({ id: 'doc-1' });
// Current body matches what the autosave re-sends — only title changes.
mockDocumentModel.findById.mockResolvedValue(
createCurrentDocument({ content: 'body', editorData: { blocks: [] }, workspaceId: 'ws-1' }),
);
const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
await wsService.updateDocument('doc-1', {
content: 'body',
editorData: { blocks: [] },
title: 'New Title',
});
// Content unchanged → the lock guard never runs and the meta save lands.
expect(guardSpy).not.toHaveBeenCalled();
expect(mockDocumentModel.update).toHaveBeenCalledWith(
'doc-1',
expect.objectContaining({ title: 'New Title' }),
);
});
it('rejects a body change while locked even when the content string is unchanged', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.findById.mockResolvedValue(
createCurrentDocument({ editorData: { blocks: [] }, workspaceId: 'ws-1' }),
);
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
// editorData changed (historyAppended) → guard runs even with no `content`.
await expect(
wsService.updateDocument('doc-1', { editorData: { blocks: [{ type: 'paragraph' }] } }),
).rejects.toMatchObject({ code: 'CONFLICT' });
expect(mockDocumentModel.update).not.toHaveBeenCalled();
});
});
describe('document edit lock', () => {
it('reports unlocked for personal documents without touching the lock service', async () => {
const acquireSpy = vi.spyOn(EditLockService.prototype, 'acquire');
const result = await service.acquireDocumentLock('doc-1');
expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
expect(acquireSpy).not.toHaveBeenCalled();
});
it('delegates to the edit lock service in workspace mode', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
const expiresAt = new Date(Date.now() + 60_000);
const acquireSpy = vi
.spyOn(EditLockService.prototype, 'acquire')
.mockResolvedValue({ expiresAt, holderId: userId, lockedByOther: false });
const result = await wsService.acquireDocumentLock('doc-1');
expect(acquireSpy).toHaveBeenCalledWith('document', 'doc-1');
expect(result).toEqual({ expiresAt, holderId: userId, lockedByOther: false });
});
it('reports another member as holder when the lock is taken', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
const expiresAt = new Date(Date.now() + 60_000);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt,
holderId: 'other-user',
lockedByOther: true,
});
const result = await wsService.acquireDocumentLock('doc-1');
expect(result).toEqual({ expiresAt, holderId: 'other-user', lockedByOther: true });
});
it('releaseDocumentLock is a no-op for personal documents', async () => {
const releaseSpy = vi.spyOn(EditLockService.prototype, 'release');
await service.releaseDocumentLock('doc-1');
expect(releaseSpy).not.toHaveBeenCalled();
});
it('releaseDocumentLock delegates to the lock service in workspace mode', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
const releaseSpy = vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
await wsService.releaseDocumentLock('doc-1');
expect(releaseSpy).toHaveBeenCalledWith('document', 'doc-1');
});
it('acquireDocumentLock broadcasts lock.changed on a holder edge (first claim)', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(undefined);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
await wsService.acquireDocumentLock('doc-1');
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'doc-1', type: 'document' },
expect.objectContaining({ data: { holderId: userId }, type: 'lock.changed' }),
);
});
it('acquireDocumentLock does NOT broadcast on a steady-state heartbeat (same holder)', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(userId);
vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
expiresAt: new Date(),
holderId: userId,
lockedByOther: false,
});
await wsService.acquireDocumentLock('doc-1');
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
it('releaseDocumentLock broadcasts unlocked only when it actually freed the lock', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
await wsService.releaseDocumentLock('doc-1');
expect(publishResourceEventMock).toHaveBeenCalledWith(
{ id: 'doc-1', type: 'document' },
expect.objectContaining({ data: { holderId: null }, type: 'lock.changed' }),
);
});
it('releaseDocumentLock does NOT broadcast when the lease expired / was taken over', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(false);
await wsService.releaseDocumentLock('doc-1');
expect(publishResourceEventMock).not.toHaveBeenCalled();
});
});
describe('saveDocumentHistory', () => {
@@ -837,6 +1007,37 @@ describe('DocumentService', () => {
);
expect(mockDocumentHistoryService.createHistory).not.toHaveBeenCalled();
});
it('does not check the lock for personal documents', async () => {
mockDocumentModel.findById.mockResolvedValue({ id: 'doc-1', editorData: { blocks: [] } });
const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
await service.saveDocumentHistory('doc-1', { blocks: [] }, 'llm_call');
expect(guardSpy).not.toHaveBeenCalled();
expect(mockDocumentHistoryService.createHistory).toHaveBeenCalled();
});
it('rejects a workspace history snapshot when another member holds the lock', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.findById.mockResolvedValue({ id: 'doc-1', editorData: { blocks: [] } });
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
await expect(
wsService.saveDocumentHistory('doc-1', { blocks: [] }, 'llm_call'),
).rejects.toMatchObject({ code: 'CONFLICT' });
expect(mockDocumentHistoryService.createHistory).not.toHaveBeenCalled();
});
it('allows a workspace history snapshot when no other member holds the lock', async () => {
const wsService = new DocumentService(mockDb, userId, 'ws-1');
mockDocumentModel.findById.mockResolvedValue({ id: 'doc-1', editorData: { blocks: [] } });
vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
await wsService.saveDocumentHistory('doc-1', { blocks: [] }, 'llm_call');
expect(mockDocumentHistoryService.createHistory).toHaveBeenCalled();
});
});
describe('trySaveCurrentDocumentHistory', () => {
@@ -4,6 +4,7 @@ import { documentHistories, documents } from '@lobechat/database/schemas';
import { and, desc, eq, gte, inArray, lt, or } from 'drizzle-orm';
import {
DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS,
DOCUMENT_HISTORY_QUERY_LIST_LIMIT,
DOCUMENT_HISTORY_SOURCE_LIMITS,
} from '@/const/documentHistory';
@@ -46,6 +47,7 @@ export class DocumentHistoryService {
buildWorkspaceWhere({ userId: this.userId, workspaceId: this.workspaceId }, documentHistories);
createHistory = async (params: {
breakAutosaveWindow?: boolean;
documentId: string;
editorData: Record<string, any>;
saveSource: DocumentHistorySaveSource;
@@ -61,6 +63,32 @@ export class DocumentHistoryService {
throw new Error('Document not found');
}
// Autosave versions coalesce into fixed 10-min windows (Notion-like),
// bucketed on the clock grid so the anchor stays immutable even though the
// overwritten row's savedAt keeps moving — a sliding anchor would collapse
// an entire continuous editing session into a single version.
// Any non-autosave version in between closes the window.
if (params.saveSource === 'autosave' && !params.breakAutosaveWindow) {
const latest = await this.db.query.documentHistories.findFirst({
orderBy: [desc(documentHistories.savedAt), desc(documentHistories.id)],
where: and(eq(documentHistories.documentId, params.documentId), this.historiesOwnership()),
});
const withinWindow =
latest?.saveSource === 'autosave' &&
Math.floor(latest.savedAt.getTime() / DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS) ===
Math.floor(params.savedAt.getTime() / DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS);
if (withinWindow) {
await this.db
.update(documentHistories)
.set({ editorData: params.editorData, savedAt: params.savedAt })
.where(and(eq(documentHistories.id, latest.id), this.historiesOwnership()));
return;
}
}
await this.db.insert(documentHistories).values({
documentId: params.documentId,
editorData: params.editorData,
@@ -185,6 +213,7 @@ export class DocumentHistoryService {
isCurrent: true,
saveSource: 'system',
savedAt: headDocument.updatedAt,
userId: headDocument.userId,
});
}
@@ -193,6 +222,7 @@ export class DocumentHistoryService {
isCurrent: false,
saveSource: row.saveSource as DocumentHistorySaveSource,
savedAt: row.savedAt,
userId: row.userId,
}));
// If head consumed a slot and we fetched a full page of history rows,
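
The clock-grid bucketing used by the autosave coalescing in `createHistory` can be illustrated with a minimal sketch. `WINDOW_MS`, `bucketOf`, and `sameAutosaveWindow` are hypothetical names introduced here for illustration; the 10-minute value is taken from the code comment, since the real `DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS` constant is not shown in this diff.

```typescript
// Assumed stand-in for DOCUMENT_HISTORY_AUTOSAVE_WINDOW_MS (10 minutes per the
// comment in createHistory; the actual constant is not visible in this diff).
const WINDOW_MS = 10 * 60 * 1000;

// Index of the fixed window a timestamp falls into, counted from the Unix epoch.
const bucketOf = (date: Date): number => Math.floor(date.getTime() / WINDOW_MS);

// Two saves coalesce only when they land in the same fixed window. The anchor
// is the grid itself, so overwriting a row's savedAt cannot slide the window.
const sameAutosaveWindow = (a: Date, b: Date): boolean => bucketOf(a) === bucketOf(b);

const at = (iso: string): Date => new Date(iso);

// 10:01 and 10:09 share the 10:00-10:10 window, so the later autosave would
// overwrite the earlier row.
const overwrite = sameAutosaveWindow(at('2026-01-01T10:01:00Z'), at('2026-01-01T10:09:00Z'));

// 10:09 and 10:11 are two minutes apart but straddle the 10:10 grid line, so
// the later autosave would insert a new row.
const insertNew = !sameAutosaveWindow(at('2026-01-01T10:09:00Z'), at('2026-01-01T10:11:00Z'));

console.log({ insertNew, overwrite });
```

One consequence of bucketing on the grid rather than on elapsed time: two autosaves a few minutes apart can still produce separate versions when they fall on opposite sides of a window boundary.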
+116 -1
@@ -15,13 +15,16 @@ import { isValidEditorData } from '@/libs/editor/isValidEditorData';
import { normalizeEditorDataDiffNodes } from '@/libs/editor/normalizeDiffNodes';
import { type LobeDocument } from '@/types/document';
import { EditLockService } from '../editLock';
import { FileService } from '../file';
import { publishResourceEvent } from '../resourceEvents';
import { DocumentHistoryService } from './history';
import type {
CompareDocumentHistoryItemsParams,
CompareDocumentHistoryItemsResult,
DocumentHistoryAccessOptions,
DocumentHistorySaveSource,
DocumentLockResult,
GetDocumentHistoryItemParams,
ListDocumentHistoryParams,
ListDocumentHistoryResult,
@@ -50,6 +53,7 @@ export class DocumentService {
private documentModel: DocumentModel;
private documentHistoryServiceInstance?: DocumentHistoryService;
private fileServiceInstance?: FileService;
private editLockService: EditLockService;
private db: LobeChatDatabase;
private workspaceId?: string;
@@ -60,6 +64,7 @@ export class DocumentService {
this.workspaceId = workspaceId;
this.fileModel = new FileModel(db, userId, workspaceId);
this.documentModel = new DocumentModel(db, userId, workspaceId);
this.editLockService = new EditLockService(userId);
}
private get fileService() {
@@ -202,6 +207,63 @@ export class DocumentService {
return this.documentModel.findById(id);
}
/**
* Acquire (or refresh) the collaborative edit lock for a workspace document.
*
* Doubles as the heartbeat: an active editor calls this on an interval to keep
* the lease alive, and a locked-out member calls it to take the lock over once
* it frees up. Locking only applies in workspace context; personal documents
* always report as unlocked.
*/
async acquireDocumentLock(id: string): Promise<DocumentLockResult> {
if (!this.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const prevHolder = await this.editLockService.getActiveHolder('document', id);
const result = await this.editLockService.acquire('document', id);
// Broadcast only on a holder edge (first claim / takeover). This method also
// serves the periodic heartbeat, so a steady-state refresh (same holder)
// must not emit an event.
if ((result.holderId ?? null) !== (prevHolder ?? null)) {
void publishResourceEvent(
{ id, type: 'document' },
{ actorId: this.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
);
}
return result;
}
/**
* Read-only peek of the current edit lock (does not acquire). Lets a client
* render a workspace page read-only on open when another member holds it.
*/
async getDocumentLock(id: string): Promise<DocumentLockResult> {
if (!this.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
const holder = await this.editLockService.getActiveHolder('document', id);
return {
expiresAt: null,
holderId: holder ?? null,
lockedByOther: Boolean(holder) && holder !== this.userId,
};
}
/**
* Release the edit lock if the current user holds it. No-op in personal mode.
*/
async releaseDocumentLock(id: string): Promise<void> {
if (!this.workspaceId) return;
// Only broadcast "unlocked" when we actually released our own lock — if the
// lease had expired and another member took over, the lock is still held and
// a bogus holderId:null would wrongly flip their viewers to editable.
const released = await this.editLockService.release('document', id);
if (!released) return;
void publishResourceEvent(
{ id, type: 'document' },
{ actorId: this.userId, data: { holderId: null }, type: 'lock.changed' },
);
}
async listDocumentHistory(
params: ListDocumentHistoryParams,
options?: DocumentHistoryAccessOptions,
@@ -236,6 +298,21 @@ export class DocumentService {
throw new Error(`Document not found: ${documentId}`);
}
// Same collaborative edit-lock guard as updateDocument: don't record a
// history snapshot for a workspace document another member is editing, so a
// locked-out actor (e.g. a Copilot mutation that will itself be rejected)
// can't pollute the version timeline.
if (this.workspaceId) {
const blockedBy = await this.editLockService.getBlockingHolder('document', documentId);
if (blockedBy) {
throw new TRPCError({
cause: { data: { code: 'DocumentLocked' } },
code: 'CONFLICT',
message: 'Document is being edited by another user',
});
}
}
const normalizedEditorData = normalizeEditorDataDiffNodes(editorData);
const savedAt = new Date();
await this.documentHistoryService.createHistory({
@@ -331,7 +408,8 @@ export class DocumentService {
* Update document
*/
async updateDocument(id: string, params: UpdateDocumentParams): Promise<UpdateDocumentResult> {
return this.db.transaction(async (tx) => {
let changed = false;
const result = await this.db.transaction(async (tx) => {
const transactionDb = tx as unknown as LobeChatDatabase;
const documentModel = new DocumentModel(transactionDb, this.userId, this.workspaceId);
const fileModel = new FileModel(transactionDb, this.userId, this.workspaceId);
@@ -361,6 +439,26 @@ export class DocumentService {
nextEditorDataAccepted !== undefined &&
!isEqual(nextEditorDataAccepted, currentEditorDataAccepted);
// Collaborative edit lock guard: reject writes to a workspace document that
// another member is actively editing, so concurrent edits can't clobber
// each other. Only the rich-text BODY is locked — metadata-only saves
// (title/emoji) pass through, since the autosave always re-sends the
// unchanged body. The lease auto-expires in Redis; when Redis is down this
// returns null (fail-open) so the lock can't block saving.
const contentChanged =
historyAppended ||
(params.content !== undefined && params.content !== currentDocument.content);
if (this.workspaceId && contentChanged) {
const blockedBy = await this.editLockService.getBlockingHolder('document', id);
if (blockedBy) {
throw new TRPCError({
cause: { data: { code: 'DocumentLocked' } },
code: 'CONFLICT',
message: 'Document is being edited by another user',
});
}
}
const updates: Record<string, unknown> = {};
if (params.content !== undefined) {
@@ -390,11 +488,15 @@ export class DocumentService {
updates.parentId = params.parentId;
}
// The lock lease is refreshed by the client heartbeat (acquireDocumentLock),
// so a save does not need to touch it.
let savedAt: Date | undefined;
if (historyAppended) {
savedAt = new Date();
await documentHistoryService.createHistory({
breakAutosaveWindow: params.breakAutosaveWindow,
documentId: id,
editorData: currentEditorDataAccepted,
saveSource: params.saveSource ?? 'autosave',
@@ -413,12 +515,25 @@ export class DocumentService {
await fileModel.update(currentDocument.fileId, fileUpdates);
}
changed = Object.keys(updates).length > 0 || historyAppended;
return {
historyAppended,
id,
savedAt,
};
});
// Notify other workspace members that the document changed so their open
// editor refreshes immediately (best-effort; the heartbeat is the fallback).
if (this.workspaceId && changed) {
void publishResourceEvent(
{ id, type: 'document' },
{ actorId: this.userId, type: 'doc.updated' },
);
}
return result;
}
/**
@@ -22,6 +22,7 @@ export interface DocumentHistoryListItem {
isCurrent: boolean;
savedAt: Date;
saveSource: DocumentHistorySaveSource;
userId: string;
}
export interface DocumentHistoryItemResult {
@@ -54,6 +55,7 @@ export interface ListDocumentHistoryResult {
export type DatabaseLike = LobeChatDatabase | Transaction;
export interface UpdateDocumentParams {
breakAutosaveWindow?: boolean;
content?: string;
editorData?: Record<string, any>;
fileType?: string;
@@ -73,3 +75,12 @@ export interface UpdateDocumentResult {
export interface SaveDocumentHistoryResult {
savedAt: Date;
}
export interface DocumentLockResult {
/** Lease expiry of the active lock, if any. */
expiresAt: Date | null;
/** The user id currently holding the lock, or null when unlocked. */
holderId: string | null;
/** True when another active user holds the lock (caller is locked out). */
lockedByOther: boolean;
}
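
The holder-edge check that `acquireDocumentLock` uses to decide whether to broadcast `lock.changed` can be isolated as a small pure function. This is a sketch only: `isHolderEdge` and `HolderId` are hypothetical names, not part of the diff; the comparison itself mirrors the inline `(result.holderId ?? null) !== (prevHolder ?? null)` expression.

```typescript
// Hypothetical type: getActiveHolder yields string | undefined, while
// DocumentLockResult.holderId is string | null, so both forms must be accepted.
type HolderId = string | null | undefined;

// True only when the effective holder changed. undefined and null are both
// normalized to null so that "no holder" compares equal to "no holder".
const isHolderEdge = (prev: HolderId, next: HolderId): boolean =>
  (next ?? null) !== (prev ?? null);

const firstClaim = isHolderEdge(undefined, 'user-1'); // free -> held: broadcast
const heartbeat = isHolderEdge('user-1', 'user-1'); // same holder: stay silent
const takeover = isHolderEdge('user-1', 'user-2'); // holder switched: broadcast
const redisDown = isHolderEdge(undefined, null); // unlocked before and after: stay silent

console.log({ firstClaim, heartbeat, redisDown, takeover });
```

The normalization matters for the fail-open path: when Redis is unavailable, the previous holder is `undefined` and the acquire result carries `holderId: null`, and without the `?? null` step that pair would count as a change on every heartbeat.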
@@ -0,0 +1,148 @@
import { describe, expect, it, vi } from 'vitest';
import { EditLockService } from '../index';
/**
* Minimal in-memory fake of the ioredis calls EditLockService uses:
* `set(k, v, 'EX', ttl[, 'NX'])`, `get(k)`, and the compare-and-delete `eval`.
*/
const makeFakeRedis = () => {
const store = new Map<string, string>();
return {
eval: vi.fn(async (_script: string, _numKeys: number, key: string, arg: string) => {
if (store.get(key) === arg) {
store.delete(key);
return 1;
}
return 0;
}),
get: vi.fn(async (key: string) => store.get(key) ?? null),
set: vi.fn(async (key: string, value: string, ...args: unknown[]) => {
if (args.includes('NX') && store.has(key)) return null;
store.set(key, value);
return 'OK';
}),
store,
};
};
describe('EditLockService', () => {
it('acquires a free lock and reports the caller as holder', async () => {
const redis = makeFakeRedis();
const svc = new EditLockService('user-1', redis as any);
const result = await svc.acquire('document', 'doc-1');
expect(result.holderId).toBe('user-1');
expect(result.lockedByOther).toBe(false);
expect(result.expiresAt).toBeInstanceOf(Date);
expect(redis.store.get('editlock:document:doc-1')).toBe('user-1');
});
it('reports another member as holder when the lock is already taken', async () => {
const redis = makeFakeRedis();
await new EditLockService('user-1', redis as any).acquire('document', 'doc-1');
const result = await new EditLockService('user-2', redis as any).acquire('document', 'doc-1');
expect(result).toEqual({ expiresAt: null, holderId: 'user-1', lockedByOther: true });
});
it('lets the holder refresh their own lease', async () => {
const redis = makeFakeRedis();
const svc = new EditLockService('user-1', redis as any);
await svc.acquire('document', 'doc-1');
const result = await svc.acquire('document', 'doc-1');
expect(result.holderId).toBe('user-1');
expect(result.lockedByOther).toBe(false);
});
it('getActiveHolder reports the current holder, or undefined when free', async () => {
const redis = makeFakeRedis();
expect(
await new EditLockService('user-1', redis as any).getActiveHolder('document', 'doc-1'),
).toBeUndefined();
await new EditLockService('user-1', redis as any).acquire('document', 'doc-1');
expect(
await new EditLockService('user-2', redis as any).getActiveHolder('document', 'doc-1'),
).toBe('user-1');
});
it('keys locks per resource type, so the same id does not collide across types', async () => {
const redis = makeFakeRedis();
await new EditLockService('user-1', redis as any).acquire('document', 'shared-id');
// A different resource family with the same id is independently lockable.
const result = await new EditLockService('user-2', redis as any).acquire('agent', 'shared-id');
expect(result.holderId).toBe('user-2');
expect(result.lockedByOther).toBe(false);
expect(redis.store.get('editlock:document:shared-id')).toBe('user-1');
expect(redis.store.get('editlock:agent:shared-id')).toBe('user-2');
});
it('getBlockingHolder returns the holder only when it is someone else', async () => {
const redis = makeFakeRedis();
await new EditLockService('user-1', redis as any).acquire('document', 'doc-1');
expect(
await new EditLockService('user-2', redis as any).getBlockingHolder('document', 'doc-1'),
).toBe('user-1');
expect(
await new EditLockService('user-1', redis as any).getBlockingHolder('document', 'doc-1'),
).toBeNull();
});
it('only releases the lock for the current holder', async () => {
const redis = makeFakeRedis();
await new EditLockService('user-1', redis as any).acquire('document', 'doc-1');
// A non-holder release is a no-op and reports it did not release.
expect(await new EditLockService('user-2', redis as any).release('document', 'doc-1')).toBe(
false,
);
expect(redis.store.get('editlock:document:doc-1')).toBe('user-1');
// The holder can release, and reports the lock was actually freed.
expect(await new EditLockService('user-1', redis as any).release('document', 'doc-1')).toBe(
true,
);
expect(redis.store.has('editlock:document:doc-1')).toBe(false);
});
it('degrades to unlocked / no-op when Redis is unavailable', async () => {
const svc = new EditLockService('user-1', null);
expect(await svc.acquire('document', 'doc-1')).toEqual({
expiresAt: null,
holderId: null,
lockedByOther: false,
});
expect(await svc.getBlockingHolder('document', 'doc-1')).toBeNull();
await expect(svc.release('document', 'doc-1')).resolves.toBe(false);
});
it('fails open when Redis is configured but commands reject (unreachable)', async () => {
// ioredis is non-null but every command rejects after retries — the write
// guards must not turn this into a 500; treat the resource as unlocked.
const down = new Error('Connection is closed.');
const redis = {
eval: vi.fn().mockRejectedValue(down),
get: vi.fn().mockRejectedValue(down),
set: vi.fn().mockRejectedValue(down),
};
const svc = new EditLockService('user-1', redis as any);
expect(await svc.acquire('document', 'doc-1')).toEqual({
expiresAt: null,
holderId: null,
lockedByOther: false,
});
expect(await svc.getActiveHolder('document', 'doc-1')).toBeUndefined();
expect(await svc.getBlockingHolder('document', 'doc-1')).toBeNull();
await expect(svc.release('document', 'doc-1')).resolves.toBe(false);
});
});
+153
@@ -0,0 +1,153 @@
import debug from 'debug';
import type { Redis } from 'ioredis';
import { getAgentRuntimeRedisClient } from '@/server/modules/AgentRuntime/redis';
const log = debug('lobe-server:edit-lock');
/** Lease lifetime in seconds; clients heartbeat well within this to keep it alive. */
export const EDIT_LOCK_TTL_SECONDS = 30;
/** Editable resource families that can take a collaborative edit lock. */
export type EditLockResourceType = 'agent' | 'chatGroup' | 'document' | 'task';
export interface EditLockResult {
/** Lease expiry of the active lock, if the caller now holds it. */
expiresAt: Date | null;
/** The user id currently holding the lock, or null when unlocked. */
holderId: string | null;
/** True when another user holds the lock (caller is locked out). */
lockedByOther: boolean;
}
const UNLOCKED: EditLockResult = { expiresAt: null, holderId: null, lockedByOther: false };
const lockKey = (type: EditLockResourceType, id: string) => `editlock:${type}:${id}`;
// Release only if the caller still holds the lock (compare-and-delete), so a
// stale releaser can't drop a lease another member has since taken over.
const RELEASE_SCRIPT = `
if redis.call('get', KEYS[1]) == ARGV[1] then
return redis.call('del', KEYS[1])
end
return 0
`;
/**
* Redis-backed collaborative edit lock, keyed by (resourceType, resourceId).
*
* Intentionally a thin, table-agnostic lease: there is no DB schema, so it
* applies uniformly to any editable resource (documents, briefs, ...) and can be
* removed wholesale once real-time co-editing lands; the keys simply expire.
*
* The lock is advisory: when Redis is unavailable every method degrades to
* "unlocked" so the lock infrastructure can never block editing or saving.
*/
export class EditLockService {
private userId: string;
private explicitRedis: Redis | null | undefined;
private lazyRedis: Redis | null = null;
private lazyResolved = false;
constructor(userId: string, redis?: Redis | null) {
this.userId = userId;
this.explicitRedis = redis;
}
/**
* The Redis client, resolved lazily on first use. Resolving eagerly in the
* constructor would read server-only env (`getAgentRuntimeRedisClient`) the
* moment any owning service is built, which throws in client/test contexts
* that construct the service but never take a lock.
*/
private get redis(): Redis | null {
if (this.explicitRedis !== undefined) return this.explicitRedis;
if (!this.lazyResolved) {
this.lazyRedis = getAgentRuntimeRedisClient();
this.lazyResolved = true;
}
return this.lazyRedis;
}
/**
* Acquire the lock when it is free (or already mine), refreshing the lease;
* otherwise report whoever currently holds it. Doubles as the heartbeat.
*/
async acquire(type: EditLockResourceType, id: string): Promise<EditLockResult> {
const redis = this.redis;
if (!redis) return UNLOCKED;
const key = lockKey(type, id);
try {
// Claim only when the key is absent (NX). The TTL gives automatic expiry, so
// a hard-closed tab frees the lock without any cleanup job.
const claimed = await redis.set(key, this.userId, 'EX', EDIT_LOCK_TTL_SECONDS, 'NX');
if (claimed) return this.held();
const holder = await redis.get(key);
if (holder === this.userId) {
// Already mine — refresh the lease (heartbeat).
await redis.set(key, this.userId, 'EX', EDIT_LOCK_TTL_SECONDS);
return this.held();
}
if (holder) return { expiresAt: null, holderId: holder, lockedByOther: true };
// Freed between the NX and the GET — try once more.
const reclaimed = await redis.set(key, this.userId, 'EX', EDIT_LOCK_TTL_SECONDS, 'NX');
return reclaimed ? this.held() : UNLOCKED;
} catch (error) {
// Fail-open: a Redis outage (configured but unreachable) must never block
// editing — report unlocked rather than surfacing the command rejection.
log('acquire failed for %s:%s %O', type, id, error);
return UNLOCKED;
}
}
/** Current holder of the lock, or undefined when unlocked / Redis is down. */
async getActiveHolder(type: EditLockResourceType, id: string): Promise<string | undefined> {
const redis = this.redis;
if (!redis) return undefined;
try {
const holder = await redis.get(lockKey(type, id));
return holder ?? undefined;
} catch (error) {
// Fail-open: a Redis outage must not turn the write guards into 500s.
log('getActiveHolder failed for %s:%s %O', type, id, error);
return undefined;
}
}
/**
* The holder when someone *other* than the caller holds the lock, else null.
* Used by write guards; returns null when Redis is down (fail-open).
*/
async getBlockingHolder(type: EditLockResourceType, id: string): Promise<string | null> {
const holder = await this.getActiveHolder(type, id);
return holder && holder !== this.userId ? holder : null;
}
/**
* Release the lock, but only if the caller still holds it (compare-and-delete).
* Returns true only when the caller's lock was actually deleted; false when
* the lease had already expired or another member has since taken it over, so
* callers can avoid broadcasting a bogus "unlocked" event.
*/
async release(type: EditLockResourceType, id: string): Promise<boolean> {
if (!this.redis) return false;
try {
const deleted = await this.redis.eval(RELEASE_SCRIPT, 1, lockKey(type, id), this.userId);
return deleted === 1;
} catch (error) {
log('release failed for %s:%s %O', type, id, error);
return false;
}
}
private held(): EditLockResult {
return {
expiresAt: new Date(Date.now() + EDIT_LOCK_TTL_SECONDS * 1000),
holderId: this.userId,
lockedByOther: false,
};
}
}
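
The lease semantics of `EditLockService` (claim when free, refresh when already held, expire by TTL, compare-and-delete on release) can be sketched without Redis. The `InMemoryLease` class below is a hypothetical illustration with an injectable clock, not the service's implementation; it only models the behavior that the Redis `SET ... EX ... NX` call and the release script provide.

```typescript
interface LeaseEntry {
  expiresAt: number;
  holder: string;
}

// Hypothetical in-memory model of the lease; names and shape are assumptions.
class InMemoryLease {
  private entries = new Map<string, LeaseEntry>();
  private now: () => number;
  private ttlMs: number;

  constructor(ttlMs: number, now: () => number) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the entry only if its lease is still alive (models Redis key expiry).
  private live(key: string): LeaseEntry | undefined {
    const entry = this.entries.get(key);
    if (entry && entry.expiresAt <= this.now()) {
      this.entries.delete(key);
      return undefined;
    }
    return entry;
  }

  // Claim a free lock or refresh my own lease; otherwise report the holder.
  acquire(key: string, userId: string): { holderId: string; lockedByOther: boolean } {
    const entry = this.live(key);
    if (entry && entry.holder !== userId) return { holderId: entry.holder, lockedByOther: true };
    this.entries.set(key, { expiresAt: this.now() + this.ttlMs, holder: userId });
    return { holderId: userId, lockedByOther: false };
  }

  // Compare-and-delete: only the current holder can free the lock.
  release(key: string, userId: string): boolean {
    const entry = this.live(key);
    if (!entry || entry.holder !== userId) return false;
    this.entries.delete(key);
    return true;
  }
}

let clock = 0;
const lease = new InMemoryLease(30_000, () => clock);
const key = 'editlock:document:doc-1';

const first = lease.acquire(key, 'user-1'); // user-1 claims the free lock
const blocked = lease.acquire(key, 'user-2'); // user-2 is locked out
clock = 31_000; // user-1 stops heartbeating and the lease lapses
const takeover = lease.acquire(key, 'user-2'); // user-2 takes over
const staleRelease = lease.release(key, 'user-1'); // user-1 no longer holds it

console.log({ blocked, first, staleRelease, takeover });
```

The last step shows why `release` returns a boolean: a stale release from the previous holder must neither drop the new holder's lease nor trigger an "unlocked" broadcast.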
File diff suppressed because it is too large
@@ -0,0 +1,261 @@
// @vitest-environment node
import type { AgentStreamEvent } from '@lobechat/agent-gateway-client';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import {
__resetOperationStatesForTesting,
HeterogeneousPersistenceHandler,
} from '../HeterogeneousPersistenceHandler';
/**
* Regression for the MAIN-chain analog of #15808 (which only fixed the SUBAGENT
* coordinator).
*
* The main-agent reducer (`packages/heterogeneous-agents/src/mainAgentCoordinator`)
* cuts a turn purely on the adapter's `stream_start { newStep: true }` signal;
* it tracks NO CC `message.id`, and `openTurn` mints a fresh random assistant id
* via `ctx.newId('message')`. So unlike the subagent path (which now persists the
* turn's CC message.id on `metadata.subagentMessageId` and dedupes a replayed
* turn), the main chain has NO DB-homed idempotency key for a turn.
*
* The serverless failure mode:
* - `processedKeys` (the per-event dedupe set) lives ONLY in the in-memory
* `operationStates` map. On a cold replica it is empty.
* - The ingest contract (see `ingest()` doc) is: a handler that throws leaves
* its event unmarked, the throw bubbles to the producer, and the producer
* re-sends the WHOLE batch. Already-applied events are skipped "via the
* dedupe map", but that map is in-memory, so on a cold replica retry every
* event (including the `newStep`) is reprocessed.
* - Reprocessing `newStep` re-runs `openTurn`, which mints a SECOND assistant.
* The first one (created before the throw, already carrying the turn's usage
* but no flushed content) is orphaned as an empty shell: content empty,
* tools 0, usage present. Exactly the "empty-shell row" (空壳条) in the reported triad.
*
* This test simulates a mid-batch DB failure on replica A, then a cold replica
* (`__resetOperationStatesForTesting()`) processing the producer's resend.
*/
interface FakeMessage {
agentId: string | null;
content: string;
id: string;
metadata?: any;
model?: string;
parentId?: string | null;
plugin?: any;
reasoning?: any;
role: 'user' | 'assistant' | 'tool' | 'task' | 'system';
threadId?: string | null;
tool_call_id?: string;
tools?: any[];
topicId: string | null;
}
const SEED = 'asst-seed';
const OP = 'op-1';
const TOPIC = 'topic-1';
const createHarness = () => {
let nextMsgIdSeq = 0;
const messages = new Map<string, FakeMessage>();
// Faithful topic-metadata store: the real TopicModel.updateMetadata DEEP-MERGES
// into the JSONB column. The main-chain cold-replica recovery reads
// `heteroCurrentMsgId` from here, so a no-op mock (as in the subagent test)
// would not exercise the path under test.
let topicMetadata: Record<string, any> = {
runningOperation: { assistantMessageId: SEED, operationId: OP },
};
// Trip a single mid-batch DB failure: the Nth `messageModel.update` throws once.
let updateCalls = 0;
let failUpdateAtCall = -1;
// Seed the run's first-turn assistant (already has content, like a real run
// where `newStep` opens the SECOND turn).
messages.set(SEED, {
agentId: null,
content: 'first turn answer',
id: SEED,
role: 'assistant',
topicId: TOPIC,
});
const messageModel = {
create: vi.fn(async (input: Partial<FakeMessage>, id?: string) => {
nextMsgIdSeq += 1;
const msgId = id ?? `msg_${nextMsgIdSeq}`;
const msg = {
agentId: input.agentId ?? null,
content: input.content ?? '',
id: msgId,
metadata: input.metadata,
model: input.model,
parentId: input.parentId ?? null,
plugin: input.plugin,
reasoning: input.reasoning,
role: input.role!,
threadId: input.threadId ?? null,
tool_call_id: input.tool_call_id,
tools: input.tools,
topicId: input.topicId ?? null,
} as FakeMessage;
messages.set(msgId, msg);
return msg;
}),
update: vi.fn(async (id: string, patch: Partial<FakeMessage>) => {
updateCalls += 1;
if (updateCalls === failUpdateAtCall) {
throw new Error('simulated mid-batch DB failure');
}
const existing = messages.get(id);
if (!existing) return { success: false };
const next = { ...existing, ...patch };
if (patch.metadata && existing.metadata) {
next.metadata = { ...existing.metadata, ...patch.metadata };
}
messages.set(id, next);
return { success: true };
}),
updateToolMessage: vi.fn(async (id: string, patch: any) => {
const existing = messages.get(id);
if (!existing) return { success: false };
messages.set(id, { ...existing, content: patch.content ?? existing.content });
return { success: true };
}),
findById: vi.fn(async (id: string) => messages.get(id) ?? null),
query: vi.fn(async (params: { threadId?: string; topicId?: string }) => {
if (params?.threadId)
return [...messages.values()].filter((m) => m.threadId === params.threadId);
return [...messages.values()].filter((m) => !m.threadId && m.topicId === params?.topicId);
}),
getLastChildToolMessageId: vi.fn(async (assistantMessageId: string) => {
const match = [...messages.values()].findLast(
(m) => m.role === 'tool' && m.parentId === assistantMessageId && !m.threadId,
);
return match?.id;
}),
listMessagePluginsByTopic: vi.fn(async () =>
[...messages.values()]
.filter((m) => m.role === 'tool' && m.tool_call_id)
.map((m) => ({ id: m.id, toolCallId: m.tool_call_id! })),
),
};
const topicModel = {
findById: vi.fn(async (id: string) => {
if (id !== TOPIC) return null;
return { agentId: null, id, metadata: topicMetadata };
}),
updateMetadata: vi.fn(async (_id: string, patch: Record<string, any>) => {
// Merge top-level keys into the stored metadata (a shallow stand-in for the real model's deep merge).
topicMetadata = { ...topicMetadata, ...patch };
}),
};
const threadModel = {
create: vi.fn(async () => {}),
findById: vi.fn(async () => null),
queryByTopicId: vi.fn(async () => []),
update: vi.fn(async () => {}),
};
const handler = new HeterogeneousPersistenceHandler({
messageModel: messageModel as any,
threadModel: threadModel as any,
topicModel: topicModel as any,
});
return {
handler,
messages,
setFailUpdateAtCall: (n: number) => {
failUpdateAtCall = n;
},
};
};
const buildEvent = (
type: AgentStreamEvent['type'],
stepIndex: number,
data: Record<string, unknown>,
): AgentStreamEvent => ({
data,
operationId: OP,
stepIndex,
timestamp: 1_700_000_000_000 + stepIndex,
type,
});
describe('HeterogeneousPersistenceHandler — main turn survives a cold-replica retry', () => {
beforeEach(() => __resetOperationStatesForTesting());
afterEach(() => __resetOperationStatesForTesting());
it('does NOT fork one main turn into a duplicate + empty shell when a batch is retried on a cold replica', async () => {
const h = createHarness();
// The producer's batch for a turn boundary: open a new turn, record its
// usage, then a tool batch. We trip the DB to fail on the tool-batch
// Phase-1 update, AFTER the turn's usage has already been written to the
// new assistant — so the orphan left behind is a true usage-bearing empty
// shell. `update` call order on replica A: #1 = openTurn flush of the seed's
// first-turn content, #2 = recordUsage on the new assistant, #3 = tools[]
// Phase 1 (← throws).
const batch = [
buildEvent('stream_start', 1, {
messageId: 'cc-msg-2',
newStep: true,
provider: 'claude-code',
}),
buildEvent('step_complete', 1, {
phase: 'turn_metadata',
usage: { totalInputTokens: 64_700, totalTokens: 64_700 },
}),
buildEvent('stream_chunk', 1, {
chunkType: 'tools_calling',
toolsCalling: [
{ apiName: 'Bash', arguments: '{}', id: 'tc-1', identifier: 'bash', type: 'default' },
],
}),
];
// ── Replica A: processes newStep (creates the turn assistant) + usage, then
// THROWS on the tool-batch write. The batch is left un-acked. ──
h.setFailUpdateAtCall(3);
await expect(
h.handler.ingest({
assistantMessageId: SEED,
events: batch,
operationId: OP,
topicId: TOPIC,
}),
).rejects.toThrow('simulated mid-batch DB failure');
// ── Cold replica: warm operation state (incl. processedKeys) is gone; the DB
// persists. The producer re-sends the SAME batch. ──
__resetOperationStatesForTesting();
// ── Replica B: full batch succeeds this time. ──
await h.handler.ingest({
assistantMessageId: SEED,
events: batch,
operationId: OP,
topicId: TOPIC,
});
// One `newStep` must yield exactly ONE new turn assistant (besides the seed).
const turnAssistants = [...h.messages.values()].filter(
(m) => m.role === 'assistant' && m.id !== SEED,
);
// Empty-shell detector: an assistant with usage but no content and no child tools.
const childToolsOf = (asstId: string) =>
[...h.messages.values()].filter((m) => m.role === 'tool' && m.parentId === asstId);
const emptyShells = turnAssistants.filter(
(m) => !m.content && childToolsOf(m.id).length === 0 && !!m.metadata?.usage,
);
expect(emptyShells).toHaveLength(0);
expect(turnAssistants).toHaveLength(1);
});
});
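The failure mode described in this test's header comment can be reduced to a few lines. The sketch below is a hypothetical illustration, not the handler's code: `FakeDb`, `turnKey`, `openTurnInMemory` and `openTurnIdempotent` are invented names, and the real recovery path appears to read `heteroCurrentMsgId` from topic metadata rather than a per-row key.

```typescript
// Hypothetical sketch: none of these names exist in HeterogeneousPersistenceHandler.
interface Assistant {
  id: string;
  turnKey: string;
}

// Stand-in for the persistent store that survives a replica restart.
class FakeDb {
  readonly assistants: Assistant[] = [];
  private seq = 0;

  findByTurnKey(turnKey: string): Assistant | undefined {
    return this.assistants.find((a) => a.turnKey === turnKey);
  }

  insert(turnKey: string): Assistant {
    this.seq += 1;
    const row: Assistant = { id: `asst_${this.seq}`, turnKey };
    this.assistants.push(row);
    return row;
  }
}

// In-memory dedupe only: the `processed` set is lost when a cold replica takes over.
const openTurnInMemory = (db: FakeDb, processed: Set<string>, turnKey: string): void => {
  if (processed.has(turnKey)) return;
  db.insert(turnKey);
  processed.add(turnKey);
};

// DB-homed dedupe: consult the persisted key before creating a row.
const openTurnIdempotent = (db: FakeDb, turnKey: string): Assistant =>
  db.findByTurnKey(turnKey) ?? db.insert(turnKey);

const turnKey = 'op-1:step-1';

const naiveDb = new FakeDb();
openTurnInMemory(naiveDb, new Set<string>(), turnKey); // replica A
openTurnInMemory(naiveDb, new Set<string>(), turnKey); // cold replica B starts with an empty set
const naiveCount = naiveDb.assistants.length;

const safeDb = new FakeDb();
openTurnIdempotent(safeDb, turnKey); // replica A
openTurnIdempotent(safeDb, turnKey); // cold replica B finds the persisted row
const safeCount = safeDb.assistants.length;
```

With the in-memory set the retried turn is created twice; with the persisted lookup the retry resolves to the existing row.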
@@ -0,0 +1,445 @@
// @vitest-environment node
import type { AgentStreamEvent } from '@lobechat/agent-gateway-client';
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
import {
__resetOperationStatesForTesting,
HeterogeneousPersistenceHandler,
} from '../HeterogeneousPersistenceHandler';
/**
* Regression for the SERVER-ONLY "large number of meaningless SubAgents" bug.
*
* Root cause: `HeterogeneousPersistenceHandler` keeps per-operation state in a
* module-level `operationStates` map. On Vercel serverless, consecutive ingest
* batches for one operation can land on DIFFERENT (cold) replicas, so that map
* is empty on the next batch. `loadOrCreateState` rehydrates the MAIN-agent
* state from DB (accumulatedContent, toolState, toolMsgIdByCallId,
* currentAssistantMessageId) but initializes `subagentState` with an empty
* `createSubagentRunsState()` and NEVER reconstructs the in-flight subagent
* runs from DB.
*
* Consequence: when a subagent run spans multiple batches, the first subagent
* event seen by each fresh replica hits the `!existing` branch of `ensureRun`
* and creates a BRAND-NEW thread for a `parentToolCallId` that already has one.
* The duplicates get the generic "Subagent" title because spawnMetadata only
* rides the first subagent event per parent (adapter `announcedSpawns`).
*
* The desktop client never hits this: it has a single long-lived
* `subagentState` closure for the whole run.
*
* This test simulates a cold replica between batches via
* `__resetOperationStatesForTesting()` (the in-memory map is dropped while the
* mock DB `threads` / `messages` persists, exactly like a fresh Lambda).
*/
interface FakeMessage {
agentId: string | null;
content: string;
id: string;
metadata?: any;
model?: string;
parentId?: string | null;
plugin?: any;
reasoning?: any;
role: 'user' | 'assistant' | 'tool' | 'task' | 'system';
threadId?: string | null;
tool_call_id?: string;
tools?: any[];
topicId: string | null;
}
interface FakeThread {
id: string;
metadata?: any;
sourceMessageId?: string | null;
status: string;
title: string;
topicId: string;
type: string;
}
const createHarness = (params: {
assistantMessageId: string;
operationId: string;
topicId: string;
}) => {
let nextMsgIdSeq = 0;
const messages = new Map<string, FakeMessage>();
const threads = new Map<string, FakeThread>();
messages.set(params.assistantMessageId, {
agentId: null,
content: '',
id: params.assistantMessageId,
role: 'assistant',
topicId: params.topicId,
});
const messageModel = {
create: vi.fn(async (input: Partial<FakeMessage>, id?: string) => {
nextMsgIdSeq += 1;
const msgId = id ?? `msg_${nextMsgIdSeq}`;
const msg: FakeMessage = {
agentId: input.agentId ?? null,
content: input.content ?? '',
id: msgId,
metadata: input.metadata,
model: input.model,
parentId: input.parentId ?? null,
plugin: input.plugin,
provider: undefined,
reasoning: input.reasoning,
role: input.role!,
threadId: input.threadId ?? null,
tool_call_id: input.tool_call_id,
topicId: input.topicId ?? null,
} as FakeMessage;
messages.set(msgId, msg);
return msg;
}),
update: vi.fn(async (id: string, patch: Partial<FakeMessage>) => {
const existing = messages.get(id);
if (!existing) return { success: false };
// Mirror the real MessageModel.update: metadata is DEEP-MERGED, not
// replaced — so e.g. a usage write doesn't clobber subagentMessageId.
const next = { ...existing, ...patch };
if (patch.metadata && existing.metadata) {
next.metadata = { ...existing.metadata, ...patch.metadata };
}
messages.set(id, next);
return { success: true };
}),
updateToolMessage: vi.fn(async (id: string, patch: any) => {
const existing = messages.get(id);
if (!existing) return { success: false };
messages.set(id, { ...existing, content: patch.content ?? existing.content });
return { success: true };
}),
findById: vi.fn(async (id: string) => messages.get(id) ?? null),
query: vi.fn(async (params: { threadId?: string; topicId?: string }) => {
if (params?.threadId) {
return [...messages.values()].filter((m) => m.threadId === params.threadId);
}
return [...messages.values()].filter((m) => !m.threadId && m.topicId === params?.topicId);
}),
getLastChildToolMessageId: vi.fn(async (assistantMessageId: string) => {
const match = [...messages.values()].findLast(
(m) => m.role === 'tool' && m.parentId === assistantMessageId && !m.threadId,
);
return match?.id;
}),
listMessagePluginsByTopic: vi.fn(async (_topicId: string) => {
// Mirror the real query: every persisted tool row's (toolCallId → id).
return [...messages.values()]
.filter((m) => m.role === 'tool' && m.tool_call_id)
.map((m) => ({ id: m.id, toolCallId: m.tool_call_id! }));
}),
};
const threadModel = {
create: vi.fn(async (input: Partial<FakeThread>) => {
const thread: FakeThread = {
id: input.id!,
metadata: input.metadata,
sourceMessageId: input.sourceMessageId,
status: input.status ?? 'active',
title: input.title ?? '',
topicId: input.topicId ?? params.topicId,
type: input.type ?? 'isolation',
};
threads.set(thread.id, thread);
return thread;
}),
findById: vi.fn(async (id: string) => threads.get(id) ?? null),
queryByTopicId: vi.fn(async (topicId: string) =>
[...threads.values()].filter((t) => t.topicId === topicId),
),
update: vi.fn(async (id: string, patch: Partial<FakeThread>) => {
const existing = threads.get(id);
if (!existing) return;
threads.set(id, { ...existing, ...patch });
}),
};
const topicModel = {
findById: vi.fn(async (id: string) => {
if (id !== params.topicId) return null;
return {
agentId: null,
id,
metadata: {
runningOperation: {
assistantMessageId: params.assistantMessageId,
operationId: params.operationId,
},
},
};
}),
updateMetadata: vi.fn(async () => {}),
};
const handler = new HeterogeneousPersistenceHandler({
messageModel: messageModel as any,
threadModel: threadModel as any,
topicModel: topicModel as any,
});
return { handler, messages, threadModel, threads };
};
const buildEvent = (
type: AgentStreamEvent['type'],
stepIndex: number,
data: Record<string, unknown>,
): AgentStreamEvent => ({
data,
operationId: 'op-1',
stepIndex,
timestamp: 1_700_000_000_000 + stepIndex,
type,
});
const innerTool = (id: string) => ({
apiName: 'Bash',
arguments: '{}',
id,
identifier: 'bash',
type: 'default',
});
describe('HeterogeneousPersistenceHandler — subagent run survives a cold replica', () => {
beforeEach(() => __resetOperationStatesForTesting());
afterEach(() => __resetOperationStatesForTesting());
it('does NOT spawn a duplicate thread when a later batch of the SAME subagent run lands on a fresh replica', async () => {
const h = createHarness({
assistantMessageId: 'asst-1',
operationId: 'op-1',
topicId: 'topic-1',
});
const PARENT = 'tc-spawn-1';
// ── Batch 1 (replica A): first subagent turn. Carries spawnMetadata, so the
// thread is created with a real title. ──
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 0, {
chunkType: 'tools_calling',
subagent: {
parentToolCallId: PARENT,
spawnMetadata: {
description: 'Explore session/agent topic data model',
prompt: 'investigate',
subagentType: 'Explore',
},
subagentMessageId: 'sub-msg-1',
},
toolsCalling: [innerTool('inner-1')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
expect(h.threads.size).toBe(1);
// ── Cold replica: the warm in-memory operation state is gone, but the DB
// (threads + messages) persists. ──
__resetOperationStatesForTesting();
// ── Batch 2 (replica B): the SAME subagent run continues with a new turn.
// Mirroring the adapter, this later event carries NO spawnMetadata. ──
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 1, {
chunkType: 'tools_calling',
subagent: {
parentToolCallId: PARENT,
subagentMessageId: 'sub-msg-2',
},
toolsCalling: [innerTool('inner-2')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
// The continuation must attach to the EXISTING thread, not fork a new one.
expect(h.threads.size).toBe(1);
// And we must never produce a generic-titled "Subagent" duplicate.
expect([...h.threads.values()].some((t) => t.title === 'Subagent')).toBe(false);
});
// P1: a tools_calling batch reprocessed on a cold replica (BatchIngester
// retry, or a turn split across a cold boundary so the cumulative array is
// re-seen) must NOT mint a second tool message for an inner tool the run
// already persisted. Rehydration restores `lifetimeToolCallIds`, and the
// reducer de-dupes against it.
it('does NOT re-create an already-persisted inner tool row after a cold replica', async () => {
const h = createHarness({
assistantMessageId: 'asst-1',
operationId: 'op-1',
topicId: 'topic-1',
});
const PARENT = 'tc-spawn-1';
// Batch 1: turn sub-msg-1 persists inner-1.
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 0, {
chunkType: 'tools_calling',
subagent: {
parentToolCallId: PARENT,
spawnMetadata: { prompt: 'go', subagentType: 'Explore' },
subagentMessageId: 'sub-msg-1',
},
toolsCalling: [innerTool('inner-1')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
__resetOperationStatesForTesting(); // cold replica
// Batch 2 (replica B): the SAME turn's cumulative array is re-seen (inner-1
// again) plus a new inner-2.
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 1, {
chunkType: 'tools_calling',
subagent: { parentToolCallId: PARENT, subagentMessageId: 'sub-msg-1' },
toolsCalling: [innerTool('inner-1'), innerTool('inner-2')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
const toolRows = (callId: string) =>
[...h.messages.values()].filter((m) => m.role === 'tool' && m.tool_call_id === callId);
// inner-1 persisted exactly once (no duplicate row), inner-2 once.
expect(toolRows('inner-1')).toHaveLength(1);
expect(toolRows('inner-2')).toHaveLength(1);
expect(h.threads.size).toBe(1);
});
// P2: a stale `Processing` isolation thread left by a PRIOR operation on the
// same topic must not be rehydrated into — or finalized by — the current
// operation. The rehydration is scoped by `metadata.operationId`.
it('ignores a stale Processing thread from a different operation on the same topic', async () => {
const h = createHarness({
assistantMessageId: 'asst-1',
operationId: 'op-2',
topicId: 'topic-1',
});
// Seed a thread (+ its in-thread assistant) left Processing by op-1.
h.threads.set('thd-stale', {
id: 'thd-stale',
metadata: { operationId: 'op-1', sourceToolCallId: 'tc-old' },
sourceMessageId: 'asst-old',
status: 'processing',
title: 'Old Subagent',
topicId: 'topic-1',
type: 'isolation',
});
h.messages.set('stale-asst', {
agentId: null,
content: '',
id: 'stale-asst',
parentId: 'asst-old',
role: 'assistant',
threadId: 'thd-stale',
topicId: 'topic-1',
} as any);
// op-2 runs and terminates. The terminal orphan-drain would finalize every
// run in the reducer state — so if the stale thread were merged in, it would
// be flipped to Active here.
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 0, { chunkType: 'text', content: 'working' }),
buildEvent('agent_runtime_end', 1, {}),
],
operationId: 'op-2',
topicId: 'topic-1',
});
// The unrelated thread is untouched: still Processing, never updated.
expect(h.threads.get('thd-stale')!.status).toBe('processing');
expect(h.threadModel.update).not.toHaveBeenCalledWith('thd-stale', expect.anything());
});
// The in-thread analog of the cold-replica bug: one CC subagent turn continued
// on a fresh replica must NOT fork into a second in-thread assistant. The turn's
// CC message.id is persisted on the assistant's metadata and recovered into
// `currentSubagentMessageId`, so a continuation is recognized as the SAME turn.
it('does NOT fragment one CC subagent turn across a cold replica (no split / empty shell)', async () => {
const h = createHarness({
assistantMessageId: 'asst-1',
operationId: 'op-1',
topicId: 'topic-1',
});
const PARENT = 'tc-spawn-1';
// Batch 1: turn sub-1's first tool → lazy-create thread + user + in-thread
// assistant (stamped subagentMessageId=sub-1) + tool t1.
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 0, {
chunkType: 'tools_calling',
subagent: {
parentToolCallId: PARENT,
spawnMetadata: { prompt: 'go', subagentType: 'Explore' },
subagentMessageId: 'sub-1',
},
toolsCalling: [innerTool('t1')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
const threadId = [...h.threads.keys()][0];
const assistantsOf = () =>
[...h.messages.values()].filter((m) => m.role === 'assistant' && m.threadId === threadId);
expect(assistantsOf()).toHaveLength(1);
// The turn id was persisted so a cold replica can recover it.
expect(assistantsOf()[0].metadata?.subagentMessageId).toBe('sub-1');
__resetOperationStatesForTesting(); // cold replica
// Batch 2 (fresh replica): SAME turn sub-1 continues (cumulative [t1, t2]).
await h.handler.ingest({
assistantMessageId: 'asst-1',
events: [
buildEvent('stream_chunk', 1, {
chunkType: 'tools_calling',
subagent: { parentToolCallId: PARENT, subagentMessageId: 'sub-1' },
toolsCalling: [innerTool('t1'), innerTool('t2')],
}),
],
operationId: 'op-1',
topicId: 'topic-1',
});
// Still exactly ONE in-thread assistant — no fork, no empty shell.
const assistants = assistantsOf();
expect(assistants).toHaveLength(1);
// Both tool rows hang off that same assistant (t1 not duplicated).
const toolRows = [...h.messages.values()].filter(
(m) => m.role === 'tool' && (m.tool_call_id === 't1' || m.tool_call_id === 't2'),
);
expect(toolRows).toHaveLength(2);
expect(new Set(toolRows.map((m) => m.parentId))).toEqual(new Set([assistants[0].id]));
});
});
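The cold-replica rehydration these tests exercise (P2 in particular) can be pictured as a filter over persisted thread rows. This is a minimal sketch under assumptions: `ThreadRow` and `rehydrateRuns` are hypothetical names, and only the `metadata.operationId` / `metadata.sourceToolCallId` fields and the `processing` status are taken from the fixtures above. The handler's real logic is not shown in this diff.

```typescript
// Hypothetical row shape, modeled on the `thd-stale` fixture in the test above.
interface ThreadRow {
  id: string;
  metadata: { operationId: string; sourceToolCallId: string };
  status: 'active' | 'processing';
}

// Rebuild the parentToolCallId -> threadId map for ONE operation.
// Threads from other operations on the same topic are skipped, so a stale
// `processing` thread is never adopted (or finalized) by the current run.
const rehydrateRuns = (threads: ThreadRow[], operationId: string): Map<string, string> => {
  const runs = new Map<string, string>();
  for (const thread of threads) {
    if (thread.status !== 'processing') continue;
    if (thread.metadata.operationId !== operationId) continue;
    runs.set(thread.metadata.sourceToolCallId, thread.id);
  }
  return runs;
};

const persisted: ThreadRow[] = [
  { id: 'thd-stale', metadata: { operationId: 'op-1', sourceToolCallId: 'tc-old' }, status: 'processing' },
  { id: 'thd-live', metadata: { operationId: 'op-2', sourceToolCallId: 'tc-spawn-1' }, status: 'processing' },
];

const runsForOp2 = rehydrateRuns(persisted, 'op-2');
```

A later subagent event for `tc-spawn-1` would then resolve to `thd-live` instead of creating a new thread.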
@@ -486,9 +486,9 @@ describe('HeterogeneousPersistenceHandler', () => {
if (id === 'asst-1') order.push('update-asst');
return origUpdate(id, patch);
});
- h.messageModel.create.mockImplementation(async (input: any) => {
+ h.messageModel.create.mockImplementation(async (input: any, id?: string) => {
order.push(input.role === 'tool' ? 'create-tool' : 'create-other');
- return origCreate(input);
+ return origCreate(input, id);
});
const tool = {
@@ -767,6 +767,91 @@ describe('HeterogeneousPersistenceHandler', () => {
expect(step2Asst!.parentId).toBe('tool-row-only');
});
it('chains off the latest tool row when parallel tools are only partially backfilled', async () => {
// Regression for main-chain breaks with parallel/multi tool calls:
// tool A is visible in assistant.tools[].result_msg_id, while tool B's
// row exists but Phase 3 has not backfilled assistant.tools[] yet. The
// step anchor must be tool B, not the earlier resolved tool A.
const h = createHarness({
assistantMessageId: 'asst-init',
operationId: 'op-1',
topicId: 'topic-1',
});
const metaState: FakeTopicMetadata = {
runningOperation: { assistantMessageId: 'asst-init', operationId: 'op-1' },
};
h.topicModel.findById.mockImplementation(async (id: string) => {
if (id !== 'topic-1') return null;
return { agentId: null, id, metadata: { ...metaState } };
});
h.topicModel.updateMetadata.mockImplementation(async (_id: string, patch: any) => {
Object.assign(metaState, patch);
});
await h.handler.ingest({
events: [buildEvent('stream_start', 1, { newStep: true })],
operationId: 'op-1',
topicId: 'topic-1',
});
const step1Asst = [...h.messages.values()].find(
(m) => m.role === 'assistant' && m.id !== 'asst-init',
)!;
h.messages.set('tool-a-backfilled', {
agentId: null,
content: 'tool A result',
id: 'tool-a-backfilled',
parentId: step1Asst.id,
role: 'tool',
threadId: null,
tool_call_id: 'tc-a',
topicId: 'topic-1',
});
h.messages.set('tool-b-row-only', {
agentId: null,
content: 'tool B result',
id: 'tool-b-row-only',
parentId: step1Asst.id,
role: 'tool',
threadId: null,
tool_call_id: 'tc-b',
topicId: 'topic-1',
});
h.messages.set(step1Asst.id, {
...h.messages.get(step1Asst.id)!,
tools: [
{
apiName: 'Read',
arguments: '{}',
id: 'tc-a',
identifier: 'read',
result_msg_id: 'tool-a-backfilled',
type: 'default',
},
{
apiName: 'Bash',
arguments: '{}',
id: 'tc-b',
identifier: 'bash',
type: 'default',
},
],
});
await h.handler.ingest({
events: [buildEvent('stream_start', 2, { newStep: true })],
operationId: 'op-1',
topicId: 'topic-1',
});
const step2Asst = [...h.messages.values()].find(
(m) => m.role === 'assistant' && m.id !== 'asst-init' && m.id !== step1Asst.id,
);
expect(step2Asst).toBeDefined();
expect(step2Asst!.parentId).toBe('tool-b-row-only');
});
it('ignores subagent tool rows (threadId set) when resolving the step anchor', async () => {
// A subagent tool row lives on its own thread and must never anchor the
// main-agent wire. If the only `role:'tool'` child carries a threadId,
@@ -0,0 +1,25 @@
import { describe, expect, it } from 'vitest';
import { publishResourceEvent, resourceChannelId } from '../index';
describe('resourceEvents', () => {
it('formats a stable channel id per resource', () => {
expect(resourceChannelId({ id: 'doc-1', type: 'document' })).toBe('resource:document:doc-1');
});
it('publish is best-effort and never throws (no Redis → in-memory)', async () => {
await expect(
publishResourceEvent(
{ id: 'doc-1', type: 'document' },
{ actorId: 'u1', type: 'doc.updated' },
),
).resolves.toBeUndefined();
await expect(
publishResourceEvent(
{ id: 'doc-1', type: 'document' },
{ actorId: 'u1', data: { holderId: null }, type: 'lock.changed' },
),
).resolves.toBeUndefined();
});
});
@@ -0,0 +1,82 @@
import debug from 'debug';
// Import the transport pieces from their concrete modules rather than the
// `@/server/modules/AgentRuntime` barrel: the barrel re-exports RuntimeExecutors,
// which eagerly constructs the ModelRuntime ApiKeyManager at module load and
// throws in client/test contexts. These leaf modules pull no ModelRuntime.
import { inMemoryStreamEventManager } from '@/server/modules/AgentRuntime/InMemoryStreamEventManager';
import { getAgentRuntimeRedisClient } from '@/server/modules/AgentRuntime/redis';
import { StreamEventManager } from '@/server/modules/AgentRuntime/StreamEventManager';
import type { IStreamEventManager } from '@/server/modules/AgentRuntime/types';
import type { ReceivedResourceEvent, ResourceEvent, ResourceRef } from './types';
export type { ReceivedResourceEvent, ResourceEvent, ResourceRef, ResourceType } from './types';
const log = debug('lobe-server:resource-events');
/** Redis Stream / in-memory channel key for a resource. */
export const resourceChannelId = (ref: ResourceRef): string => `resource:${ref.type}:${ref.id}`;
/**
* Select the underlying transport. We deliberately bypass
* `createStreamEventManager()`: its `GatewayStreamNotifier` wrapper POSTs every
* published event to the agent gateway, which must not see resource events.
* Evaluated per call so it picks up Redis becoming (un)available.
*/
const getManager = (): IStreamEventManager =>
getAgentRuntimeRedisClient() !== null ? new StreamEventManager() : inMemoryStreamEventManager;
/**
* Realtime event fan-out for editable resources, keyed by (resourceType, id).
*
* A thin, table-agnostic wrapper over the existing Redis-Streams transport so
* presence and (eventually) real-time co-editing can reuse the same channel.
* The lease/lock is advisory and this channel is best-effort: publishing never
* throws, and with no Redis the in-memory manager keeps single-instance dev
* working while clients fall back to their polling heartbeat.
*/
export const publishResourceEvent = async (
ref: ResourceRef,
event: ResourceEvent,
): Promise<void> => {
try {
await getManager().publishStreamEvent(resourceChannelId(ref), {
// The agent StreamEvent shape (stepIndex + closed `type` union) is an
// implementation detail of the transport; cast at this single boundary.
data: { actorId: event.actorId, ...event.data },
stepIndex: 0,
type: event.type,
} as unknown as Parameters<IStreamEventManager['publishStreamEvent']>[1]);
} catch (error) {
// Best-effort: a transport hiccup must never break the caller's save/lock op.
log('publishResourceEvent failed for %s:%s %O', ref.type, ref.id, error);
}
};
/**
* Subscribe to a resource's events until `signal` aborts. Only events published
* after subscription are delivered (no history replay).
*/
export const subscribeResourceEvents = async (
ref: ResourceRef,
onEvent: (event: ReceivedResourceEvent) => void,
signal: AbortSignal,
): Promise<void> => {
await getManager().subscribeStreamEvents(
resourceChannelId(ref),
'$',
(events) => {
for (const e of events) {
const { actorId, ...rest } = (e.data ?? {}) as Record<string, unknown>;
onEvent({
actorId: typeof actorId === 'string' ? actorId : '',
data: rest,
timestamp: e.timestamp,
type: e.type as unknown as ResourceEvent['type'],
});
}
},
signal,
);
};
@@ -0,0 +1,21 @@
/** Editable resource families that can broadcast realtime events. */
export type ResourceType = 'agent' | 'chatGroup' | 'document' | 'task';
export interface ResourceRef {
id: string;
type: ResourceType;
}
export type ResourceEventType = 'doc.updated' | 'lock.changed';
export interface ResourceEvent {
/** User id that triggered the event; lets subscribers ignore self-originated events. */
actorId: string;
/** Event-specific payload (e.g. `{ holderId }` for `lock.changed`). */
data?: Record<string, unknown>;
type: ResourceEventType;
}
export interface ReceivedResourceEvent extends ResourceEvent {
timestamp: number;
}
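How `actorId` travels through the transport, and how a subscriber might use it to skip its own events, can be sketched as follows. The types are copied from the file above; `WireEvent`, `pack`, `unpack` and `isFromOther` are hypothetical helpers that mirror what `publishResourceEvent` and `subscribeResourceEvents` do inline, not exports of the module.

```typescript
type ResourceEventType = 'doc.updated' | 'lock.changed';

interface ResourceEvent {
  actorId: string;
  data?: Record<string, unknown>;
  type: ResourceEventType;
}

interface ReceivedResourceEvent extends ResourceEvent {
  timestamp: number;
}

// Assumed minimal shape of what the transport stores and hands back.
interface WireEvent {
  data: Record<string, unknown>;
  timestamp: number;
  type: string;
}

// Publish side: fold actorId into the payload, as publishResourceEvent does.
const pack = (event: ResourceEvent, timestamp: number): WireEvent => ({
  data: { actorId: event.actorId, ...event.data },
  timestamp,
  type: event.type,
});

// Subscribe side: split actorId back out, defaulting to '' when it is missing.
const unpack = (wire: WireEvent): ReceivedResourceEvent => {
  const { actorId, ...rest } = wire.data;
  return {
    actorId: typeof actorId === 'string' ? actorId : '',
    data: rest,
    timestamp: wire.timestamp,
    type: wire.type as ResourceEventType,
  };
};

// Lets a subscriber ignore events it triggered itself.
const isFromOther = (event: ReceivedResourceEvent, selfId: string): boolean =>
  event.actorId !== selfId;

const received = unpack(
  pack({ actorId: 'u1', data: { holderId: null }, type: 'lock.changed' }, 1_700_000_000_000),
);
```

Here `received.data` carries only `holderId`, and `isFromOther(received, 'u1')` is false, so user `u1` would skip its own lock change.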
@@ -706,6 +706,7 @@ export class TaskService {
activities: activities.length > 0 ? activities : undefined,
topicCount: topics.length > 0 ? topics.length : undefined,
workspace: workspaceFolders.length > 0 ? workspaceFolders : undefined,
workspaceId: task.workspaceId ?? null,
};
}
@@ -64,6 +64,7 @@ describe('localSystemRuntime', () => {
it('should call deviceGateway.executeToolCall with correct arguments when a proxy function is invoked', async () => {
const context: ToolExecutionContext = {
activeDeviceId: 'device-1',
operationId: 'op-1',
toolManifestMap: {},
userId: 'user-1',
};
@@ -78,7 +79,7 @@ describe('localSystemRuntime', () => {
const result = await proxy[apiName](args);
expect(mockExecuteToolCall).toHaveBeenCalledWith(
- { deviceId: 'device-1', userId: 'user-1' },
+ { deviceId: 'device-1', operationId: 'op-1', userId: 'user-1' },
{
apiName,
arguments: JSON.stringify(args),
@@ -43,9 +43,9 @@ export const agentManagementRuntime: ServerRuntimeRegistration = {
): Promise<ToolExecutionResult> => {
const { agentId, instruction, taskTitle, timeout } = params;
- // Server runtime always uses the task path because there is no
- // client-side `registerAfterCompletion` callback available to execute
- // synchronous agent calls.
+ // Server runtime always uses the legacy async invocation path because
+ // there is no client-side `registerAfterCompletion` callback available
+ // to execute synchronous agent calls.
return {
content: `🚀 Triggered async task to call agent "${agentId}"${taskTitle ? `: ${taskTitle}` : ''}`,
state: {
@@ -10,6 +10,7 @@ interface LobeDeliveryCheckerRuntimeContext {
operationId?: string;
serverDB: LobeChatDatabase;
userId: string;
workspaceId?: string;
}
const buildError = (content: string, code: string): BuiltinServerRuntimeOutput => ({
@@ -28,11 +29,13 @@ class LobeDeliveryCheckerExecutionRuntime {
private operationId?: string;
private db: LobeChatDatabase;
private userId: string;
private workspaceId?: string;
constructor(context: LobeDeliveryCheckerRuntimeContext) {
this.operationId = context.operationId;
this.db = context.serverDB;
this.userId = context.userId;
this.workspaceId = context.workspaceId;
}
generateVerifyPlan = async (params: {
@@ -64,7 +67,7 @@ class LobeDeliveryCheckerExecutionRuntime {
// criteria + a rubric, snapshot it onto this operation, and confirm it. The
// tool call is human-reviewed (humanIntervention); this runs post-approval.
const { VerifyPlanGeneratorService } = await import('@/server/services/verify');
- const planGenerator = new VerifyPlanGeneratorService(this.db, this.userId);
+ const planGenerator = new VerifyPlanGeneratorService(this.db, this.userId, this.workspaceId);
const { items, rubricId } = await planGenerator.createPlanFromCriteria({
criteria,
operationId: this.operationId,
@@ -110,6 +113,7 @@ export const lobeDeliveryCheckerRuntime: ServerRuntimeRegistration = {
operationId: context.operationId,
serverDB: context.serverDB,
userId: context.userId,
workspaceId: context.workspaceId,
});
},
identifier: LobeDeliveryCheckerIdentifier,
@@ -18,7 +18,11 @@ export const localSystemRuntime: ServerRuntimeRegistration = {
for (const api of LocalSystemManifest.api) {
proxy[api.name] = async (args: any) => {
return deviceGateway.executeToolCall(
- { deviceId: context.activeDeviceId!, userId: context.userId! },
+ {
+ deviceId: context.activeDeviceId!,
+ operationId: context.operationId,
+ userId: context.userId!,
+ },
{
apiName: api.name,
arguments: JSON.stringify(args),
@@ -1,7 +1,7 @@
import { builtinSkills } from '@lobechat/builtin-skills';
import { LocalSystemApiName, LocalSystemIdentifier } from '@lobechat/builtin-tool-local-system';
// Note: only `readFile` is wired through deviceGateway. Directory enumeration is
- // left to the model via `local-system.listFiles` so we don't double-fetch.
+ // left to the model via `local-system.globFiles` so we don't double-fetch.
import {
type CommandResult,
type ExecScriptActivatedSkill,
@@ -15,6 +15,7 @@ interface VerifyResultRuntimeContext {
operationId?: string;
serverDB: LobeChatDatabase;
userId: string;
workspaceId?: string;
}
/**
@@ -27,11 +28,13 @@ class VerifyResultExecutionRuntime {
private operationId?: string;
private db: LobeChatDatabase;
private userId: string;
private workspaceId?: string;
constructor(context: VerifyResultRuntimeContext) {
this.operationId = context.operationId;
this.db = context.serverDB;
this.userId = context.userId;
this.workspaceId = context.workspaceId;
}
submitVerifyResult = async (params: SubmitVerifyResultParams) => {
@@ -47,11 +50,13 @@ class VerifyResultExecutionRuntime {
}
// The verifier runs as a sub-agent; the row to update belongs to the parent run.
- const op = await new AgentOperationModel(this.db, this.userId).findById(this.operationId);
+ const op = await new AgentOperationModel(this.db, this.userId, this.workspaceId).findById(
+ this.operationId,
+ );
const targetOperationId = op?.parentOperationId ?? this.operationId;
const status = params.verdict === 'passed' ? 'passed' : 'failed';
- await new VerifyCheckResultModel(this.db, this.userId).updateByCheckItem(
+ await new VerifyCheckResultModel(this.db, this.userId, this.workspaceId).updateByCheckItem(
targetOperationId,
params.checkItemId,
{
@@ -66,10 +71,12 @@ class VerifyResultExecutionRuntime {
verdict: params.verdict,
},
);
- await new VerifyStatusService(this.db, this.userId).recompute(targetOperationId);
+ await new VerifyStatusService(this.db, this.userId, this.workspaceId).recompute(
+ targetOperationId,
+ );
// This may be the last check to resolve — kick auto-repair if the run failed
// with auto_repair checks (no-op until everything has a terminal result).
- await maybeAutoRepair(this.db, this.userId, targetOperationId);
+ await maybeAutoRepair(this.db, this.userId, targetOperationId, this.workspaceId);
log(
'submitted verdict %s for check %s (op %s)',
@@ -94,6 +101,7 @@ export const verifyResultRuntime: ServerRuntimeRegistration = {
operationId: context.operationId,
serverDB: context.serverDB,
userId: context.userId,
workspaceId: context.workspaceId,
});
},
identifier: VerifyToolIdentifier,

Some files were not shown because too many files have changed in this diff