#### 💻 Change Type
- [ ] ✨ feat
- [ ] 🐛 fix
- [ ] ♻️ refactor
- [ ] 💄 style
- [x] 👷 build
- [ ] ⚡️ perf
- [ ] ✅ test
- [ ] 📝 docs
- [ ] 🔨 chore
#### 🔗 Related Issue
- None
#### 🔀 Description of Change
- Extract the document history database changes from the feature branch
onto a branch based on main.
- Add the document history migration, schema, relations, model, and
database tests only.
- Exclude UI, router, and service-layer changes so the PR stays focused
on the database layer.
#### 🧪 How to Test
- Run: cd packages/database && bunx vitest run --silent=passed-only
src/models/__tests__/document.test.ts
src/models/__tests__/documentHistory.test.ts
- [x] Tested locally
- [x] Added or updated tests
- [ ] No tests needed
#### 📸 Screenshots / Videos
| Before | After |
| ------ | ----- |
| N/A | N/A |
#### 📝 Additional Information
- This PR intentionally targets main because the database migration
needs to land on the release branch first.
* update
* update
* 🔧 chore: update CLI build command in electron-builder and ensure proper newline in package.json
* Changed the CLI build command from 'npm run build' to 'npm run build:cli' in electron-builder.mjs.
* Added a newline at the end of package.json for consistency.
Signed-off-by: Innei <tukon479@gmail.com>
---------
Signed-off-by: Innei <tukon479@gmail.com>
Co-authored-by: Innei <tukon479@gmail.com>
* feat: add some lost lobe-kb builtin tools
* feat: add the list files and get file detail
* feat: add the list files and get file detail
* fix: update the search limit
* ♻️ refactor: add backgroundColor to TaskParticipant and rename name to title
Add backgroundColor field and rename name→title in TaskParticipant interface
to match agent avatar data. Add LobeAI fallback for inbox agent in
getAgentAvatarsByIds when avatar/title are missing.
Update `pageEditor.editorPlaceholder` from `Start writing your page. Press / to open the command menu` to `Press "/" for AI and commands.` across all supported locales and the default locale source.
* 🐛 fix: default execAgent approval mode to headless
Backend execAgent calls should run headlessly by default since only
frontend scenarios require manual human approval. This prevents cron
jobs and other server-side triggers from unexpectedly waiting for
human intervention.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✅ test: add regression test for headless approval default
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: createAgent uses agentModel.create directly
The createAgent router was still going through sessionModel.create,
which is a legacy path that doesn't pass all agent fields (like
agencyConfig) to the agents table. Switch to agentModel.create
which directly inserts into the agents table with full field support.
- Add CreateAgentSchema in types package for proper input validation
- Remove dependency on insertAgentSchema from database package
- Remove sessionId from CreateAgentResult
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🏷️ chore: mark session-based agent creation as deprecated
Add @deprecated JSDoc tags to the legacy session-based agent creation
path (session router, SessionService, SessionModel.create, session store,
insertAgentSchema). New code should use agent.createAgent / agentModel.create
directly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: honor groupId when creating agents
Pass input.groupId as sessionGroupId to agentModel.create so that
agents created from a sidebar folder are correctly assigned to that group.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: resolve type errors from createAgent refactor
- Remove sessionId fallback in AddAgent.tsx and ForkAndChat.tsx
- Use z.custom<T>() for agencyConfig and tts in CreateAgentSchema
to match agentModel.create parameter types
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: extract agent-stream into @lobechat/agent-gateway-client package
Move the Agent Gateway WebSocket client from src/libs/agent-stream/ into
a standalone workspace package at packages/agent-gateway-client/. This
eliminates the duplicate AgentStreamEvent type in apps/cli and provides
a single source of truth for the Gateway WS protocol types shared by
SPA, server, and CLI consumers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* add agent-gateway-client
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor(chat): remove reject-only button, unify to rejected_continue
Server-side `decision='rejected'` and `decision='rejected_continue'`
share the exact same code path — both surface the rejection to the
LLM as user feedback. Having a separate "reject only" button added UI
complexity without behavioural difference.
- Remove the "仅拒绝" button from InterventionBar popover; the single
"拒绝" button now calls `rejectAndContinueToolCall` directly
- `rejectToolCalling` Gateway branch sends `rejected_continue` instead
of `rejected` so all rejection paths use one decision value
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update ApprovalActions.tsx
* ✨ feat(tool): add executors field to BuiltinToolManifest and dispatch page-agent to client
Add `executors?: ('client' | 'server')[]` to `BuiltinToolManifest` so
each builtin tool declares where it can run. The server-side dispatch
logic in `aiAgent/index.ts` now reads this field instead of hardcoding
per-identifier checks.
- `lobe-local-system`: `executors: ['client', 'server']` — runs on
client via Electron IPC or server via Remote Device proxy
- `lobe-page-agent`: `executors: ['client']` — requires EditorRuntime,
client-only
- Stdio MCP plugins still use the `customParams.mcp.type` heuristic
(not manifest-driven)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix(gateway): route approve/reject via lab flag, not transient server op state
After the coordinator fix for `waiting_for_human` (#13860) the paused
`execServerAgentRuntime` op is marked `completed` client-side as soon
as the server emits `agent_runtime_end`. `startOperation` then runs
`cleanupCompletedOperations(30_000)`, which deletes any op completed
more than 30 seconds ago — so by the time the user sees the
InterventionBar and clicks approve/reject, the running (or recently
completed) server op is gone.
The previous `#hasRunningServerOp` check therefore kept returning
false against a live Gateway backend, flipping approve/reject into
the client-mode `internal_execAgentRuntime` branch and stranding the
server-side paused conversation.
Switch the helper to `#shouldUseGatewayResume`, which checks the same
`isGatewayModeEnabled()` lab flag used to route the initial send. The
signal now mirrors how the conversation was dispatched and survives
the op-cleanup window.
New regression test exercises the post-coordinator-fix state: the
paused `execServerAgentRuntime` op is explicitly `completed` before
the approve call runs, and we still expect the Gateway branch to
fire with `decision='approved'`.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(gateway): clean up paused server op after human approve/reject
In Gateway mode with userInterventionConfig.approvalMode='ask', the
paused execServerAgentRuntime op was never released — the loading
spinner kept spinning after the user approved, rejected, or
reject-and-continued, and reject-only silently did nothing on the
server.
- ToolAction.rejectToolCall now delegates to chatStore.rejectToolCalling
so the Gateway resume op actually fires with decision='rejected';
previously it only mutated local intervention state and the server's
paused op waited forever.
- AgentRuntimeCoordinator treats waiting_for_human as end-of-stream so
the coordinator emits agent_runtime_end when request_human_approve
flips state, letting the client close the paused op via the normal
terminal-event path.
- conversationControl adds #completeRunningServerOps as a fallback
guard in the approve/reject/reject-continue Gateway branches — if
the server-side signal is delayed or missing, the client still clears
the orphan op before starting the resume op.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(gateway): defer paused-op cleanup until resume starts successfully
If `executeGatewayAgent` failed (transient network/auth/server error),
the paused `execServerAgentRuntime` op was already marked completed
locally by the pre-call `#completeRunningServerOps`. Retries would
then see no running server op, miss `#hasRunningServerOp`, and fall
through to the non-Gateway client-mode path — while the backend was
still paused awaiting human input.
Snapshot the paused op IDs before the resume call and retire them
only inside the try block after `executeGatewayAgent` resolves. On
failure the running marker stays intact so a retry still lands on
the Gateway branch and can re-issue the resume.
The helper was renamed from `#completeRunningServerOps(context)` to
`#completeOpsById(ids)` to reflect the new contract: callers must
snapshot beforehand, not re-query at completion time (which would
incorrectly match the new resume op too).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(gateway): avoid double reject dispatch in reject-and-continue
Now that `rejectToolCall` delegates to `chatStore.rejectToolCalling`,
the chained `await get().rejectToolCall(...)` inside
`rejectAndContinueToolCall` fired a full halting reject before the
continue call. In Gateway mode that meant two resume ops on the same
tool_call_id (`decision='rejected'` followed by
`decision='rejected_continue'`) racing server-side; in client mode it
duplicated reject bookkeeping that `chatStore.rejectAndContinueToolCalling`
already handles internally.
Drop the chained call and fire `onToolRejected` inline so hook
semantics are preserved. `chatStore.rejectAndContinueToolCalling` is
now the single entry point for both the rejection persist and the
continue dispatch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix(toolEngineering): drop manifests missing `api` before feeding ToolsEngine
`ToolsEngine.convertManifestsToTools` calls `manifest.api.map(...)`
without a null check, so any manifest that is truthy but lacks a valid
`api` array crashes the entire tools build with "Cannot read properties
of undefined (reading 'map')". This takes down anything that touches
the tools pipeline on that agent — including TokenTag in ChatInput,
which is why users see the crash on the chat page load path.
Manifests are merged from 5 sources (installed plugins, builtin tools,
Klavis, LobeHub skills, caller-supplied extras), only some of which
filter falsy entries, and none validate `api`. Guard defensively at
the merge point and log the offending source + identifier so the
underlying bad data can be traced.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(builtin-tool-gtd): add server runtime for GTD tool
Implement server-side execution runtime so the GTD tool works when
agents run in a pure server context (bot platforms, async task workers,
QStash workflows). Previously only the client executor existed, which
relied on `useNotebookStore` and `notebookService` and would break on
the server.
- `packages/builtin-tool-gtd/src/ExecutionRuntime/index.ts`: pure
`GTDExecutionRuntime` class with an injected service interface,
covering createPlan/updatePlan/createTodos/updateTodos/clearTodos
and execTask/execTasks. Since server runtime has no stepContext,
todo state is read from / written back to the Plan document's
`metadata.todos` field.
- `src/server/services/toolExecution/serverRuntimes/gtd.ts`: factory
wiring `DocumentModel` + `TopicDocumentModel` into the runtime and
registering under `GTDIdentifier`.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor(builtin-tool-gtd): share runtime logic between executor and server
Make the client executor a thin adapter over `GTDExecutionRuntime` so
all processing logic (todo reducer, plan CRUD flow, execTask state
builder, output formatting) lives in one place. Previously the server
runtime was a near-duplicate of the client executor.
- Expand `GTDRuntimeContext` with `currentTodos`, `messageId`, `signal`
so both callers can thread their environment through:
- client supplies `currentTodos` from stepContext / pluginState via
`getTodosFromContext`, and `messageId` for execTask parentMessageId
- server lets the runtime resolve todos from the plan document's
metadata when `currentTodos` is not supplied
- Split service surface into `updatePlan` (user-facing: goal / desc /
context — client routes through `useNotebookStore` to refresh SWR)
vs `updatePlanMetadata` (silent todos sync — client stays on the
raw `notebookService`)
- Runtime methods now return `BuiltinToolResult` (superset of
`BuiltinServerRuntimeOutput`), so `stop: true` on execTask /
execTasks is typed cleanly without `@ts-expect-error`
Net effect: `executor/index.ts` shrinks from 510 → 134 lines; the
server factory just maps models to the service interface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🌐 chore: translate non-English comments to English in lambda router tests
Translated all Chinese/CJK comments to English in 6 test files under
src/server/routers/lambda/__tests__/. Code logic and string literals
are unchanged; only explanatory comments were translated.
Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
💄 style(chat): tighten `execServerAgentRuntime` loading copy
Current text was trying to do too much in one line — status + two
separate user affordances — and read as an explanation, not a status.
Replaces it with a status-first line that mentions where the work is
happening and the single reassurance users actually need.
- EN: "Task is running in the server. You are safe to leave this page."
- zh-CN: "任务正在服务器运行,您可以放心离开此页面。"
Only en-US and zh-CN are edited; CI translates the rest from the
default file.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(conversation): improve workflow display when user intervention is pending
Made-with: Cursor
* 🐛 fix(builtin-tool-activator): add ActivatedToolInfo typing to requestedTools for tsgo compat
requestedTools was inferred as `{ identifier, name }[]` which lacks the
`avatar` property required by `ActivatedToolInfo`, causing tsgo errors.
`messageModel.findById(parentMessageId)` only returns the row from the
`messages` table — the tool-call metadata (identifier / apiName /
arguments / type / toolCallId) lives in the separate `message_plugins`
table. The resumeApproval path was reading `(resumeParentMessage as any).plugin`
and `(resumeParentMessage as any).tool_call_id`, both always undefined,
which meant:
- Approved tool calls were dispatched with `identifier: undefined`,
causing the server-side tool executor to throw
`Builtin tool "undefined" is not implemented`. The follow-up LLM
step could still describe success (it sees the user prompt + picks
plausible output) but the tool message content is permanently the
error string.
- The toolCallId mismatch guard was silently disabled because the
stored value was always null → validation always passed regardless
of what the client sent.
Fix: query `messagePlugins.findFirst` by message id, use the fetched
row for both the toolCallId equality check and the approvedToolCall
payload that the runtime dispatches.
Tests:
- Mock `db.query.messagePlugins.findFirst` with the plugin fields so
existing asserts on `approvedToolCall.identifier`/`apiName` pass
against real values.
- Move `tool_call_id` / identifier / apiName / arguments / type out of
the mock `messages` row fixture into a separate `pendingToolPlugin`
fixture that mirrors the actual DB layout.
- Flip the "toolCallId mismatch" guard test to mutate the plugin mock
(not the message mock) — this is exactly the class of bug the fetch
guards against, so the test would have masked it before.
- New guard test: throw when `messagePlugins.findFirst` returns
undefined (stale message id, wrong user, etc.).
Discovered during E2E verification of LOBE-7152 approve flow — the
approve decision was flipping to the new op correctly but every tool
execution was failing with the "undefined" error.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✨ feat(chat): server-mode human approval via new Gateway op + resumeApproval
When the current agent runtime is Gateway-mode (execServerAgentRuntime),
approve / reject / reject_continue now start a **new** Gateway op carrying
a `resumeApproval` decision instead of resuming the paused op in place
over tRPC — mirroring the "interrupt + new op" pattern from LOBE-7142
(stop/interrupt). This sidesteps the stepIndex / executeStep early-exit
race that was blocking the in-place resume path and matches the Linear
spec for LOBE-7152. Client mode is unchanged.
### Client
- `conversationControl.ts`
- `approveToolCalling` / `rejectToolCalling` / `rejectAndContinueToolCalling`:
server-mode branch calls `executeGatewayAgent({ message: '',
parentMessageId: toolMessageId, resumeApproval: { decision, ... } })`.
The local runtime never spins up; the new op's `agent_runtime_end`
clears loading.
- `#hasRunningServerOp` replaces the old `#getServerOperationId` helper
(we no longer need the paused op's id). Forwards scope/groupId/
subAgentId from `ConversationContext` into the operation lookup so
group/thread conversations correctly resolve their running server op
— `operationsByContext` is keyed on the full `messageMapKey`.
- `gateway.ts` — `executeGatewayAgent` takes an optional `resumeApproval`
and forwards it to `aiAgentService.execAgentTask`.
- `services/aiAgent.ts` — `ExecAgentTaskParams.resumeApproval` with new
`ResumeApprovalParam` shape (decision + parentMessageId + toolCallId
+ optional rejectionReason).
- `gatewayEventHandler.ts` — kept the `toolMessageIds` branch that fetches
pending tool messages on `tools_calling`.
- `services/agentRuntime/{type,index}.ts` — removed the short-lived
`toolMessageId` / `reject_continue` additions; this flow no longer
routes through `processHumanIntervention`.
- `store/chat/slices/operation/selectors.ts` — `getOperationsByContext` /
`hasRunningOperationByContext` now take `MessageMapKeyInput` so scope/
group/subAgent fields are honoured end-to-end.
### Server
- `ExecAgentSchema` / `InternalExecAgentParams.resumeApproval` — optional
`{ decision, parentMessageId, rejectionReason?, toolCallId }`.
- `AiAgentService.execAgent`
- `resumeApproval` implies resume semantics (skip user-message creation,
reuse `parentMessageId` as the target tool message). Folded into a
single `effectiveResume` flag so the existing resume branches apply.
- Validates parent is a `role='tool'` message whose `tool_call_id`
matches the request — guards stale / double-clicks.
- Writes the decision to DB before `historyMessages` is fetched so the
runtime sees the updated tool message on the first step:
* `approved` → `intervention: { status: 'approved' }`
* `rejected` / `rejected_continue` → tool content =
"User reject this tool calling [with reason: X]",
`intervention: { status: 'rejected', rejectedReason }`.
- Branches initial runtime context:
* `approved` → `phase: 'human_approved_tool'` + `approvedToolCall`
payload rebuilt from the tool message plugin → runtime executes
the tool.
* `rejected` / `rejected_continue` → `phase: 'user_input'` with
empty content → LLM re-reads history (now including the rejected
tool) and responds. Both decisions share this path: the client
split is only about optimistic writes and button UX; once the
rejection is persisted there's nothing meaningful to differentiate
server-side.
### Tests
- `conversationControl.test.ts` — rewrote the three server-mode blocks
to spy `executeGatewayAgent` and assert the `resumeApproval` payload
shape. Added a regression test covering group-scope lookup so dropping
scope/groupId from `#hasRunningServerOp` breaks the suite.
- `execAgent.resumeApproval.test.ts` (new) — covers approved and the
unified rejected branches (parameterized), the no-reason fallback, and
the role/tool_call_id validation guards.
Relates to LOBE-7152.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: forward serverUrl in WS auth for apiKey verification
The agent gateway verifies an apiKey by calling
\`\${serverUrl}/api/v1/users/me\` with the token, so \`serverUrl\` has to be
part of the WebSocket auth handshake. The device-gateway-client already
does this; \`lh agent run\` was missing it, producing
"Gateway auth failed: Missing serverUrl for apiKey auth".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🔨 chore: bump cli to 0.0.7
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🧹 chore: remove builtin-agent-onboarding and consolidate web onboarding
- Merge agent system role into builtin-agents; colocate toolSystemPrompt in builtin-tool-web-onboarding
- Drop unused QuestionRenderer client bundle
- Gate onboarding footer switch/skip on AGENT_ONBOARDING_ENABLED for agent route
Made-with: Cursor
* 🧪 test: fix onboarding layout translation mock
* 🧪 test: align onboarding layout test with feature flag
* 🧪 test: type onboarding business const mock
When `call_llm` pushed the assistant turn into `state.messages`, it
dropped the DB id even though the row was already persisted. The
downstream `request_human_approve` executor filters parent lookup on
`m.role === 'assistant' && m.id`, and the DB fallback query is not
reliably finding the just-written row on every topology — so when
human-approve fires on the fresh LLM turn the op errors out with
"No assistant message found as parent for pending tool messages".
Attach `assistantMessageItem.id` to the pushed message so the existing
in-memory lookup hits, and nextContext's `parentMessageId` and
`state.messages` agree on a single source of truth.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Introduced a new `document_histories` table to track changes made to documents, including fields for `editor_data`, `save_source`, and `saved_at`.
- Updated foreign key relationships to link `document_histories` with `documents` and `users`.
- Modified existing models and tests to accommodate the new document history functionality, including changes to pagination and retrieval methods.
- Removed the versioning system from documents in favor of a more flexible history tracking approach.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(agent-runtime): implement server-side human approval flow
Port the client-mode human approval executors (request_human_approve,
call_tool resumption, handleHumanIntervention) to the server agent
runtime so that execServerAgentRuntime can correctly pause on
waiting_for_human and resume on approve / reject / reject_continue.
- request_human_approve now creates one `role='tool'` message per pending
tool call with `pluginIntervention: { status: 'pending' }` and ships
the `{ toolCallId → toolMessageId }` mapping on the `tools_calling`
stream chunk.
- call_tool gains a `skipCreateToolMessage` branch that updates the
pre-existing tool message in-place (prevents duplicate rows / parent_id
FK violations that show up as LOBE-7154 errors).
- AgentRuntimeService.handleHumanIntervention implements all three
paths: approve → `phase: 'human_approved_tool'`; reject → interrupted
with `reason: 'human_rejected'`; reject_continue → `phase: 'user_input'`.
- ProcessHumanIntervention schema carries `toolMessageId` and a new
`reject_continue` action; schema remains permissive (handler no-ops on
missing toolMessageId) to keep legacy callers working.
Fixes LOBE-7151
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(agent-runtime): address LOBE-7151 review (P1 reject_continue, P2 duplicate tool msg)
P1 — reject_continue with remaining pending tools must NOT resume the LLM.
Previously `handleHumanIntervention` kept `status='waiting_for_human'` but
returned `nextContext: { phase: 'user_input' }`, which `executeStep` would
hand to `runtime.step` immediately, breaking batch semantics. Now when
other tools are still pending, the rejection is persisted but no context
is returned; the `user_input` continuation only fires when this is the
last pending tool.
P2 — request_human_approve was pushing an empty placeholder
`{ role: 'tool', tool_call_id, content: '' }` into `newState.messages`
to "reflect" the newly-created pending DB row. On resume, the `call_tool`
skip-create path appends the real tool result, leaving two entries for
the same `tool_call_id` in runtime state. The downstream short-circuit
(`phase=human_approved_tool` → `call_tool`) doesn't consult
state.messages, so the placeholder was unused cost. Removed.
Also fixes a TS 2339 in the skipCreateToolMessage test where
`nextContext.payload` is typed `{}` and needed an explicit cast.
Tests: 99 pass (82 RuntimeExecutors + 17 handleHumanIntervention), type-check clean.
Verified end-to-end via the human-approval eval — it now exercises a
multi-turn retry path (LLM calls the gated tool twice) and both
approvals resolve cleanly through to `completionReason=done`.
Relates to LOBE-7151
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* pin @react-pdf/renderer
* 🐛 fix(deps): pin @react-pdf/image to 3.0.4 to avoid privatized @react-pdf/svg
@react-pdf/image@3.1.0 (auto-resolved via layout@4.6.0 ← renderer@4.4.1)
declares `@react-pdf/svg@^1.1.0` as a dependency, but the svg package was
unpublished/made private on npm (returns 404). CI installs blow up with
ERR_PNPM_FETCH_404.
Upstream issue: https://github.com/diegomura/react-pdf/issues/3377
Pin image to 3.0.4 (the last release before the broken svg dep was
introduced) via pnpm.overrides until react-pdf publishes a fix.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: fail fast when tool/assistant message persist hits a missing parent
When a conversation parent was deleted mid-operation (LOBE-7154), the
runtime was silently swallowing the parent_id FK violation in three tool
persist paths and continuing with a stale parentMessageId. The next LLM
call hit the same FK without context, surfacing as a raw SQL error to
the user after burning several LLM + tool call round trips.
Changes
- packages/types: add AgentRuntimeErrorType.ConversationParentMissing
- new messagePersistErrors.ts helper: FK detection + structured error
constructor + persist-fatal marker (keeps RuntimeExecutors smaller)
- RuntimeExecutors:
- call_tool: publish error event + re-throw on persist failure;
outer catch propagates when persist-fatal
- call_tools_batch: same, mark so the per-tool outer catch doesn't
swallow and fall back to the already-deleted parent
- resolve_aborted_tools: same pattern
- call_llm: preflight parent existence via findById so we fail before
the LLM call instead of after
- tests: replace old swallow-on-fail expectations, add LOBE-7158 cases
for each executor plus focused unit tests for the helper module
Fixes LOBE-7158
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 chore: publish normalized ConversationParentMissing on persist failure
Review feedback on LOBE-7158: the three persist catches were emitting
the raw DB exception as a stream `error` event before normalizing it.
Clients treat `error` events as terminal and surface `event.data.error`
directly, so the raw SQL text leaked to users and ended the stream
before the typed `ConversationParentMissing` throw could propagate.
Move normalization ahead of the publish in call_tool, call_tools_batch,
and resolve_aborted_tools so the stream event always carries the
intended business error. Add a regression assertion on the
call_tool FK test that the error event's `errorType` is
`ConversationParentMissing` and no `Failed query` text leaks through.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Drop the `motion/react` slide + fade transition on NavPanel content
switches (e.g. navigating from `/` to `/agent`). The new content now
renders directly without the 0.28s x-translate animation.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✨ feat: add headless approval and apiKey ws auth to `lh agent run`
Two fixes so `lh agent run` works end-to-end against the WebSocket agent
gateway when the user is authenticated via LOBEHUB_CLI_API_KEY.
- Default to `userInterventionConfig: { approvalMode: 'headless' }` when
running the agent from the CLI. Without this flag the runtime waits
for human tool-call approval and local-device commands hang forever.
Users who want interactive approval can pass `--no-headless`.
- Pass `tokenType` (`jwt` | `apiKey`) in the WebSocket auth handshake so
the gateway knows how to verify the token. Previously the CLI sent
only the raw token value and the gateway assumed JWT, rejecting valid
API keys.
Fixes LOBE-6939
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(agent-runtime): harden classifyLLMError so it never masks the original provider error
Production traces across multiple providers (openrouter, openai, google)
surface a single opaque error — `e.trim is not a function` with
`errorType: 'unknown'` — hiding whatever the upstream actually returned.
Root cause: `normalizeCode` / `normalizeErrorType` assumed their input is
always `string | undefined` (matching the TypeScript signature), but real
provider error objects frequently carry a numeric `code` (HTTP status) or
a structured object in `errorType`. `value?.trim()` short-circuits only
on null/undefined, so a truthy non-string turns into a TypeError that
the outer catch records as the "final" error, erasing the upstream one.
Fixes:
- Guard `normalizeCode` / `normalizeErrorType` on `typeof value ===
'string'`, widen parameter type to `unknown`.
- Wrap the whole `classifyLLMError` in a try/catch that falls back to a
conservative `stop` decision and preserves the best-effort message of
the ORIGINAL error. A classifier that throws is worse than a
classifier that's wrong — it must never shadow the real failure.
- `bestEffortMessage` swallows property-access errors (hostile Proxy
etc.) to guarantee the fallback itself can't throw.
Regression tests cover: numeric `code`, structured `errorType`, nested
OpenAI-SDK-shaped `error.error.code`, and a hostile Proxy that throws on
every property access.
This is a forcing function for root-cause diagnosis: after this lands,
the real upstream errors behind the 'e.trim' mask will finally surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Remove fallback warning in classifyLLMError
Removed console warning for classification failure.
* 🐛 fix(agent-runtime): treat numeric provider code as status fallback
Bare HTTP proxies sometimes surface the HTTP status ONLY as a numeric `code`
on the error object (no `status`/`statusCode`, no digits in the message).
After widening `normalizeCode` to require `typeof === 'string'`, those numeric
codes were dropped entirely and auth/permission failures fell through to
retry — wasting the full retry budget on permanent errors.
Forward numeric `raw.code` / `nested?.code` / `nestedError?.code` into the
status chain (after the real status/statusCode lookups, before the
message-digit extractor) so classifyKind still maps 401/403 → stop and
429/5xx → retry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: detect truncated tool_call arguments in builtin tools
When an LLM hits max_tokens mid tool_call, the arguments JSON is
truncated. The previous flow passed `{}` to the tool, which returned a
generic "required field missing" error; the model re-tried with the same
payload and the truncation repeated — one observed trace burned 17 min
and $2.46 on 5 blind retries.
Detect structural truncation (unclosed braces/brackets/strings) in
BuiltinToolsExecutor before schema validation, and return a dedicated
TRUNCATED_ARGUMENTS error telling the model to reduce payload size or
raise max_tokens instead of retrying.
Fixes LOBE-7148
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 chore: echo raw arguments string and reject all unparseable JSON
Two improvements based on review:
- Append the received arguments string to the error content so the model
can verify the payload is exactly what it produced (stops it from
blaming upstream or guessing what went wrong).
- Treat ANY unparseable non-empty argsStr as an error (new code
INVALID_JSON_ARGUMENTS), not just truncation. The previous fallback
of passing `{}` to the tool produced generic "missing field" errors
that hid the real cause. Empty argsStr still falls through to `{}`
for tools that take no parameters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: wire Gateway-mode stop button to WS interrupt
Frontend half of [LOBE-7142](https://linear.app/lobehub/issue/LOBE-7142)
— the stop button previously silently failed in Gateway mode because:
1. `stopGenerateMessage` only filtered `execAgentRuntime`, so
`execServerAgentRuntime` ops (Gateway) were skipped.
2. Even if the local op got cancelled, nothing bridged the cancel to
the server-side agent loop running behind the Agent Gateway WS.
## Changes
**`conversationControl.ts::stopGenerateMessage`** — extend the type
filter to include both op types so both client-side and Gateway-mode
runs are cancelled from the same entry point.
**`gateway.ts::executeGatewayAgent` + `reconnectToGatewayOperation`** —
register an `onOperationCancel` handler on the local `gatewayOpId` that
forwards the server-side operation id to `interruptGatewayAgent(...)`,
which sends `{ type: 'interrupt' }` over the Agent Gateway WS. The
closure cleanly resolves the "local op id vs server op id" mapping —
no metadata lookup needed.
**`operation/actions.ts::cancelOperation`** — `isAborting` flag was
gated on `execAgentRuntime`. Extend to `execServerAgentRuntime` too so
the UI loading state transitions out immediately on Gateway-mode stop,
without waiting for the round-trip `session_complete` from the server.
## What this doesn't do (follow-ups)
- **Backend**: new `POST /api/agent/interrupt` route + Redis LPUSH
(LOBE-7145). Without it, the WS interrupt reaches Agent Gateway but
never gets forwarded to cloud.
- **Agent loop**: `AgentRuntimeService.executeStep` LPOP polling of the
interrupt key (LOBE-7146). Without it, the state never flips to
`interrupted` server-side.
- **Agent Gateway DO** (external repo): `_forwardInterrupt` HTTP POST
from the WS interrupt handler (LOBE-7147).
With only this PR merged, clicking stop will clear the local UI state
and send the WS frame correctly — the server-side loop keeps running
until those three are merged too.
## Tests
- `conversationControl.test.ts`: +1 — stopGenerateMessage cancels
`execServerAgentRuntime`, invokes the onCancel handler, sets
`isAborting: true`.
- `gateway.test.ts`: +1 — `executeGatewayAgent` registers a handler
against the local opId, handler invokes `interruptGatewayAgent`
with the server opId.
All 123 touched-slice tests pass; type-check clean.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🔨 chore: switch Gateway stop to direct tRPC instead of WS roundtrip
Rewiring only — no new behaviour on top of the previous commit. See
the discussion in PR #13815 for the full reasoning.
TL;DR the WS-based path (client → Agent Gateway WS → DO forwards
HTTP → cloud route → Redis LPUSH → loop LPOP) has the same end-effect
as the tRPC-direct path (client → tRPC → AgentRuntimeService
.interruptOperation → DB state flip), except:
- the tRPC path is one hop instead of three
- the tRPC path reuses infrastructure that's *already on canary* —
`aiAgentService.interruptTask` → `AiAgentService.interruptTask` →
`AgentRuntimeService.interruptOperation` → `coordinator.saveAgentState`
with status='interrupted' — and the existing step-boundary polling
in `executeStep` (AgentRuntimeService.ts:474, 565) already picks it up
- zero new server code required; zero Agent Gateway (external repo)
coordination required
The only reason the WS path was in the original spec (LOBE-7142) was
symmetry with the Phase 6.4 tool_execute/tool_result path, but
`interrupt` is a one-shot control signal, not stream data — there's
no actual benefit to routing it through the same channel. Mid-step
abort would require threading an AbortSignal into `runtime.step(...)`,
which WS doesn't help with either.
Closes out the need for LOBE-7145 / LOBE-7146 / LOBE-7147.
Changes:
- `gateway.ts`: both `executeGatewayAgent` and
`reconnectToGatewayOperation` register the cancel handler against
the local op id, but the handler body now calls
`aiAgentService.interruptTask({ operationId: serverOpId })` via
tRPC instead of `this.interruptGatewayAgent(serverOpId)` (which sent
the WS interrupt frame).
- `gateway.test.ts`: adjust the one new test case to verify the
tRPC call rather than the WS-path spy; add `interruptTask` to the
`aiAgentService` mock.
`AgentStreamClient.sendInterrupt()` and `interruptGatewayAgent()` are
kept as-is — public API, might be useful elsewhere. Just not called
from the cancel handler anymore.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: gateway sync
* fix: skip error connection
* feat: add disconnect all & MESSAGE_GATEWAY_ENABLED env vairable
* chore: add gateway test case
* chore: clean lobehub connnections when switch to message gateway
* chore: optimize disconnect all
* chore: disconnect gateway connnections when using lobehub gateway
* chore: clean up exsiting gateway connections after reconnect and avoid gateway callback when not enabled
* ✨ feat: receive and execute executor=client tools on desktop Electron
Frontend half of LOBE-7076 (Phase 6.4). Pairs with server PR #13790,
which adds the `clientRuntime` signal + `hasClientExecutor` gate so
`local-system` and stdio MCP can enter the manifest for desktop callers.
Data flow, client side:
Agent Gateway WS
└─ tool_execute event ──► AgentStreamClient
└─ 'agent_event' ──► gatewayEventHandler (case 'tool_execute')
└─ internal_executeClientTool (fire-and-forget)
├─ parse args → params
├─ mark pendingClientToolExecutions[toolCallId]
├─ dispatch: builtin → invokeExecutor,
│ else → mcpService.invokeMcpToolCall
├─ clear pending
└─ AgentStreamClient.sendToolResult(...)
└─ WS → /api/agent/tool-result → LPUSH
→ server BLPOP unblocks → loop continues
Key guarantees:
- `internal_executeClientTool` never throws; ALL error paths (parse
failure, no executor match, thrown executor, missing connection, MCP
error) still call `sendToolResult({ success: false, error })`. The
server's BLPOP must never hang on a silent client.
- `case 'tool_execute'` uses `void`, not `await`. A long-running tool
must not block subsequent `stream_chunk` / `tool_end` events on the
same WebSocket.
- UI loading state is kept separate from `toolCallingStreamIds` (the
LLM-streaming animation) via a dedicated
`pendingClientToolExecutions: Record<toolCallId, true>` map, so a
renderer can show a distinct "running on device" indicator without
entangling existing selectors.
Client → server signal:
`executeGatewayAgent` now passes `clientRuntime: isDesktop ? 'desktop' : 'web'`
so the server knows this Electron caller can receive `tool_execute`.
Tests: 39 new cases across AgentStreamClient / internal_executeClientTool
/ gatewayEventHandler covering success, error, MCP fallback, pending
state lifecycle, and fire-and-forget semantics. 148 total in affected
suites.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: pass server operationId to tool_result dispatch (operationId mismatch)
The gateway event handler received `tool_execute` events but the resulting
`internal_executeClientTool` call looked up `gatewayConnections` by the
*local* operation id (e.g. `op_8chrnd`) instead of the *server-side*
operation id (e.g. `op_1776171452938_...`) the WS connection is actually
keyed on. `conn` was therefore always `undefined`, the early-return in
`send(...)` swallowed the response, and the server's BLPOP waiter timed
out after 60 s.
This was reproducible on canary E2E: server logs showed
`dispatching client tool lobe-local-system/readLocalFile` followed by
`client tool ... timed out after 60027ms`, with no outbound `tool_result`
frame ever reaching the Agent Gateway.
Fix: thread a distinct `gatewayOperationId` through
`createGatewayEventHandler` and use it for the `case 'tool_execute'`
dispatch. The existing `operationId` (used for `dispatchContext` →
`internal_dispatchMessage` keying) is untouched. Both `executeGatewayAgent`
and `reconnectToGatewayOperation` now pass the server id explicitly; when
a caller omits it, it falls back to the local `operationId` for backwards
compatibility.
Verified live on canary: WS now shows
`[in] tool_execute` → `[out] tool_result success=true content=...` and
the agent returns the real local-file contents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: slove the execAgents tools exec types not correct
* fix: should inject source:discovery when tools type is lost
* fix: delete the source inject test
* fix: slack not respond to text commands
* feat: add slack slash commands instructions
* chore: add slack validate in test connections
* chore: update slack docs
* chore: remove text commands for slack
* fix: execAgent should get all tools manifests
* fix: should add the tools source into payload source
* fix: add the discoverable tools into tools enginer
* fix: update the test, should include the discoverable tools
* ✨ fix: implement stable navigation hook and refactor navigation handling
- Introduced `useStableNavigate` hook to provide a stable `navigate` function that can be used across the application.
- Refactored components to utilize the new stable navigation approach, replacing direct access to the navigation function from the global store.
- Updated `NavigatorRegistrar` to sync the `navigate` function into a ref for consistent access.
- Removed deprecated navigation handling from various components and actions, ensuring a cleaner and more maintainable codebase.
Signed-off-by: Innei <tukon479@gmail.com>
* 🐛 fix: refactor navigation handling to prevent state mutation
- Updated navigation reference handling in the global store to use a dedicated function for creating navigation refs, ensuring that the initial state is not mutated by nested writes.
- Adjusted tests and components to utilize the new navigation ref creation method, enhancing stability and maintainability of navigation logic.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ test: mock Electron's net.fetch in unit tests
- Added a mock for Electron's net.fetch in the AuthCtr and BackendProxyProtocolManager tests to ensure proper handling of remote server requests.
- This change allows tests to simulate network interactions without relying on the actual fetch implementation, improving test reliability.
Signed-off-by: Innei <tukon479@gmail.com>
---------
Signed-off-by: Innei <tukon479@gmail.com>
messageModel.query() calls inside RuntimeExecutors were missing a
postProcessUrl callback, so imageList/videoList/fileList entries retained
raw S3 keys (e.g. `files/user_xxx/icon.png`). After the first tool batch,
the refreshed state fed those raw keys straight into the next LLM call,
and providers like Anthropic reject anything that isn't an absolute URL or
data URI ("Invalid image URL"). Wire a lazy FileService-backed
postProcessUrl into all three query sites (topic reference resolution,
compression, and post-batch refresh) so imageLists stay resolved across
multi-step operations.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix: dispatch executor=client tools to desktop caller even with DEVICE_GATEWAY configured
Two fixes to make Phase 6.4 (LOBE-7076) actually reach a desktop caller on
canary, where DEVICE_GATEWAY is configured and a separate remote device
may be registered.
### 1. AgentToolsEngine: suppress RemoteDevice for desktop callers
The `lobe-remote-device` tool is meant for the legacy "tunnel commands to
a separately registered desktop" flow. When the caller itself is a
desktop Electron client, that's redundant — and worse, the LLM was
picking `listOnlineDevices` + `activateDevice` *first*, then routing the
subsequent `readLocalFile` to a different registered host (a remote
Linux VM in our E2E trace, returning ENOENT for a path that only exists
on the caller).
Adds `&& !hasClientExecutor` to the RemoteDevice enable rule. Desktop
callers now see only `local-system` in their manifest.
### 2. aiAgent.execAgent: mark executor='client' for desktop callers
The existing gate was `if (!gatewayConfigured) { executorMap[...] = 'client' }`.
On canary, `gatewayConfigured === true` (DEVICE_GATEWAY set), so
`local-system` / stdio MCP stayed server-executed and were dispatched to
the Remote Device proxy instead of back to the caller's Agent Gateway WS.
Extends the gate to:
`if (clientRuntime === 'desktop' || !gatewayConfigured)`
So a caller that explicitly signals it can receive `tool_execute` bypasses
the DEVICE_GATEWAY heuristic. Legacy behaviour unchanged for web callers
and for callers that don't send `clientRuntime`.
### Tests
- AgentToolsEngine: +1 case verifying RemoteDevice is suppressed when
`clientRuntime === 'desktop'` even with `gatewayConfigured: true`
- execAgent.deviceToolPipeline: +3 cases
- local-system gets executor='client' for desktop + DEVICE_GATEWAY
- stdio MCP gets executor='client' for desktop + DEVICE_GATEWAY
- web caller preserves legacy routing (executor unset)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: enable executor=client tools for desktop Electron callers
Adds a `clientRuntime` signal to execAgent so the server knows the caller
itself can execute `executor: 'client'` tools (local-system, stdio MCP) over
its Agent Gateway WebSocket. This is the missing server piece for Phase 6.4
(LOBE-7076): previously `local-system` only entered the manifest when a
*separately registered* remote device was online & auto-activated, so a
desktop Electron caller sitting on the other end of the Gateway WS could
never actually be dispatched to via `tool_execute`.
The new signal is orthogonal to the legacy device-proxy `deviceContext` —
it describes the caller itself, not a third-party device. The enable rule
for LocalSystemManifest simply gets one extra OR branch:
local && gatewayConfigured && (hasClientExecutor || legacy-device-online-activated)
`toolExecutorMap[LocalSystemManifest.identifier] = 'client'` (LOBE-7067)
then kicks in as soon as the manifest entry is present, so
`RuntimeExecutors.call_tool` (LOBE-7068) will push `tool_execute` over the
Agent Gateway WS to this caller.
Plumbing:
- packages/types: `ExecAgentParams.clientRuntime?: 'desktop' | 'web'`
- lambda router: accepts + forwards `clientRuntime`
- aiAgent service: forwards to `createServerAgentToolsEngine`
- AgentToolsEngine: +1 field, +1 OR branch in LocalSystem enable rule.
Zero changes to `runtimeMode` / `platform` / `RemoteDeviceManifest` /
`deviceContext` semantics.
Tests: 3 new cases in AgentToolsEngine covering desktop / web / gateway-off
branches; 3 new cases in execAgent.deviceToolPipeline verifying the
`clientRuntime` param is forwarded verbatim.
Follow-up (separate PR): frontend receives `tool_execute`, runs the tool
via Electron IPC, and sends `tool_result` back over the same WS.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: untangle runtime / platform / device-proxy flags in AgentToolsEngine
Renames and separates two orthogonal concerns that used to share the
misleading `isDesktopClient` name:
- `hasClientExecutor` — caller itself can receive `tool_execute` over
the Agent Gateway WS (Phase 6.4). Property of the caller.
- `hasDeviceProxy` — server has a device-proxy configured that tunnels
to a separately registered device (legacy Remote Device). Property of
the server.
`platform` is now derived from the caller (`clientRuntime`) first,
falling back to the device-proxy signal for backwards compat — it was
previously derived purely from the server's proxy config, which
conflated "server can reach a desktop" with "caller is a desktop".
LocalSystem enable rule restructured to read in natural order:
runtimeMode === 'local' // user opted in
&& hasDeviceProxy // server has a Gateway path
&& (hasClientExecutor || ...) // an execution target exists
Behavior is identical to the previous commit; this is a pure rename /
regrouping refactor. 38 existing tests still pass without changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: decouple hasClientExecutor from hasDeviceProxy in local-system gate
The previous rule required `hasDeviceProxy` as a shared prerequisite for
BOTH enable paths, which is wrong: `hasDeviceProxy` reflects the legacy
device-proxy (`deviceProxy.isConfigured`), while Phase 6.4's
`tool_execute` rides the Agent Gateway WebSocket that this request is
already on. The two systems are orthogonal — a desktop caller on the
Gateway WS can receive `tool_execute` without any device-proxy being
configured server-side.
Correct enable rule:
runtimeMode === 'local'
&& (hasClientExecutor // Phase 6.4, self
|| (hasDeviceProxy && deviceOnline && autoActivated)) // legacy
Updated the `still requires gateway to be configured` test, which was
asserting the incorrect coupling, to instead verify that agent-level
`runtimeMode.desktop === 'none'` opt-out is respected for desktop
callers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add image-to-video options to CLI generate video command
Why: CLI only supported text-to-video. Backend already accepts imageUrl/endImageUrl
for image-to-video, but the CLI had no way to pass them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* update cli version
* update cli version
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* use Electron's net.fetch() so system trusted certs are honored
* 🐛 fix(tests): mock netFetch in unit tests broken by net.fetch migration
Both LocalFileCtr and RemoteServerConfigCtr tests were patching
global.fetch / stubGlobal, which no longer intercepts calls now that
the controllers route through Electron's net.fetch via @/utils/net-fetch.
Hoist the fetch mock and point vi.mock('@/utils/net-fetch') at it directly.
Tools flagged as `executor: 'client'` are dispatched via `dispatchClientTool`
through the Agent Gateway WS path. In cloud deployments where the gateway is
configured but no desktop device is connected, this path 404s on
`/api/operations/tool-execute` and the tool fails with `dispatch_failed`.
Only mark local-system and stdio MCP plugins as `'client'` when the gateway
is NOT configured (standalone Electron). When deviceContext is available,
tool routing goes through the RemoteDevice proxy instead.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix(desktop): use low urgency for Linux notifications to prevent GNOME Shell freeze
On Linux/GNOME Shell, desktop notifications with urgency 'normal' appear
as banner pop-ups. Clicking the dismiss (X) button on these banners can
cause the system to freeze for 30-45 seconds due to heavy gnome-shell
CPU and memory usage.
Setting urgency to 'low' on Linux routes notifications to the message
tray instead of displaying them as banners, which avoids the problematic
X button interaction. The urgency option is ignored on macOS and Windows.
Fixes#13538
Co-authored-by: octo-patch <octo-patch@github.com>
* ✨ feat(task): add participants array to task.list response
Return a participants array per task (id / type / avatar / name) so
clients can show avatar groups on task cards. For now participants
only contains the assignee agent; future iterations can aggregate
comment authors and topic executors.
Also extract TaskItem into @lobechat/types as an explicit type
definition so it no longer relies on drizzle schema inference.
* ♻️ refactor(task): extract NewTask to @lobechat/types
Remove the drizzle $inferInsert NewTask from schemas and define it
explicitly in @lobechat/types alongside TaskItem.
* ✅ test(task): cover participants in task.list response
✨ feat(agent-runtime): dispatch client-executor tools via Agent Gateway WS
Wire the block-await dispatch path for tools marked as `executor: 'client'`:
- `aiAgent/index.ts` (6.3a) — derive `toolExecutorMap` from manifests:
* `local-system` builtin → `'client'` (requires Electron IPC)
* MCP plugins with `customParams.mcp.type === 'stdio'` → `'client'`
(subprocess runs on the user's machine)
Purely manifest-driven; no new context / capability fields needed.
- `dispatchClientTool` (6.3b) — helper that:
* Pushes a `tool_execute` event via `streamManager.sendToolExecute`
* Block-awaits on Redis BLPOP via `ToolResultWaiter`
* Returns a `ToolExecutionResultResponse`-shaped object (drop-in with
the existing server path)
* Never throws — timeouts / gateway errors / missing infra all
produce a failed-but-structured result so the agent loop continues
- `RuntimeExecutors.call_tool` / `call_tools_batch` — route to
`dispatchClientTool` when `payload.executor === 'client'` AND the
stream manager exposes `sendToolExecute`. Otherwise fall through to
the existing server path unchanged. Response API (`source: 'client'`)
interrupt branch is untouched.
Capped at 270s per tool to match Vercel's streaming function window;
longer tools will be handled by the resumable path in Phase 6.3c.
Covered by:
- 5 unit tests on `dispatchClientTool` (gateway missing, redis missing,
happy path, timeout, dispatch error)
- 286 existing tests still pass in adjacent suites
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace 6 per-path Next.js `route.ts` handlers (using `@upstash/workflow/nextjs` serve) with a single Hono app mounted at `[[...route]]`. Workflow logic moves to `src/server/workflows-hono/memory-user-memory/`; all public URLs remain unchanged so existing `MemoryExtractionWorkflowService.triggerXxx` callers need no update.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(agent-runtime): add ToolResultWaiter for Redis BLPOP-based tool result await
Introduce ToolResultWaiter — a Promise-based wrapper around Redis BLPOP
that server-side agent loops will use to block-await client-side tool
execution results delivered via the callback API (LPUSH on another
connection).
Design highlights:
- Takes two ioredis clients: a dedicated blocking connection for BLPOP
(must not be shared with business traffic) and a normal producing
connection for side effects (cancel sentinel).
- `waitForResult(id, timeoutMs)` returns the parsed payload or null on
timeout / cancel, never throws for timeout (caller decides fallback).
- `waitForResults(ids[], timeoutMs)` fans out via Promise.all, aligning
results with input order.
- `cancel(id)` LPUSHes a poison-pill sentinel to wake a pending waiter,
used when the agent loop is terminated mid-tool.
Covered by unit tests (6 cases: push-before / push-after / timeout /
batch / cancel / malformed payload).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(agent-runtime): use multi-key BLPOP in waitForResults to avoid N×timeout latency
Promise.all-ing waitForResult over a shared blocking Redis connection
actually serializes: BLPOP holds the socket, so calls run back-to-back
rather than concurrently. A batch of N where some results never arrive
would take up to N × timeoutMs to resolve, stalling tool-call loops
and delaying cancellation.
Rewrite waitForResults to use Redis's multi-key BLPOP in a loop with a
shared deadline: each iteration blocks on all remaining keys with the
remaining budget, wakes when any one arrives, drops that key, and
re-enters with the rest. Total latency is bounded by one timeoutMs
regardless of N. Single-key waitForResult now delegates to this path.
Covered by a new regression test asserting that an N=3 batch of
never-arriving keys completes in ~1 timeout window, not N×.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✨ feat(api): add POST /api/agent/tool-result callback endpoint
Agent Gateway forwards client tool execution results to this endpoint;
the handler LPUSHes into a per-toolCallId Redis list with a 120s TTL so
the server-side agent loop's BLPOP can wake and continue.
- Auth via AGENT_GATEWAY_SERVICE_TOKEN bearer header
- Zod-validated body: { toolCallId, content, success, error? }
- Key: tool_result:{toolCallId}
- Idempotency not required; duplicates sit under TTL until expired
No runtime caller yet — wiring lands with the BLPOP waiter in LOBE-7068.
Covered by unit tests (6 cases: missing/wrong token, missing token env,
invalid body, Redis unavailable, happy path, Redis write error).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✨ feat(agent-runtime): add GatewayStreamNotifier.sendToolExecute
Expose a request-response-style push for tool_execute on top of the
existing Gateway HTTP pipe. Callers use this to delegate tool execution
to the client; failures surface back to the caller so the agent loop
can decide whether to fall back to the interrupt-resume path.
- `IStreamEventManager.sendToolExecute?` — optional interface method,
only the Gateway-backed notifier implements it (InMemory/Redis-only
managers intentionally leave it undefined)
- `GatewayStreamNotifier.sendToolExecute(operationId, ToolExecuteData)`
POSTs to Gateway `/api/operations/tool-execute`
- New private `httpPostAwait` helper preserves the 5s timeout but,
unlike the fire-and-forget `httpPost`, rejects on non-ok / network
failure so callers can react
No runtime caller yet; the dispatch branch lands with LOBE-7068.
Covered by unit tests (3 new cases: happy path payload, non-ok
response, network error).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(agent-stream): add tool_execute / tool_result protocol types
Introduce the type-level scaffold for the Gateway-mediated client tool
execution flow:
- `tool_execute` server→client event with `ToolExecuteData` payload
(toolCallId, identifier, apiName, arguments, executionTimeoutMs)
- `tool_result` client→server message with success/error and content,
added to the `ClientMessage` union
No runtime wiring yet; this PR is pure type scaffolding so subsequent
server (Redis BLPOP waiter, Gateway notifier, RuntimeExecutors branch)
and client (gateway handler) work can land independently.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update types.ts
* 💄 style(agent-stream): reorder ToolResultMessage fields for perfectionist
Move `error?` before `state?` to satisfy `perfectionist/sort-interfaces`
after the `state?: any` field was added to align with ChatToolResult.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(agent): support multimodal input for server-side agent execution
Wires already-uploaded file IDs through the Gateway-mode execAgent path so
SPA-attached images / documents / videos reach the LLM when the agent runs
server-side. Resolves attachments via FileModel.findByIds, classifies by
MIME, parses documents idempotently, and persists the messages_files link
for history replay.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(agent): dedupe repeated fileIds before writing messages_files
messages_files has a composite PK on (file_id, message_id); a fileIds array
containing the same id twice would fail the insert and abort execAgent. Dedupe
the input while preserving caller-provided order so rendering stays stable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ToolExecutor ('client' | 'server') as a new orthogonal dimension
alongside ToolSource to describe where a tool invocation is dispatched.
Thread executorMap through OperationToolSet / ResolvedToolSet / AgentState
and attach executor to the ChatToolPayload emitted in onToolsCalling.
Defaults remain empty (all server-side), so behavior is unchanged. This
is pure scaffolding to unblock subsequent work on client-side dispatch.
Also remove the unused 'plugin' value from ToolSource (no downstream
consumers branched on it; installed plugins now labeled 'mcp').
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix: guard non-string content in context-engine to prevent `e.trim is not a function`
Two unguarded `.trim()` / string-concatenation paths in the context-engine
could throw or produce garbage text when a message's `content` is not a
plain string (multimodal parts array, null tool turns). Both are reached
in normal chat and trigger `e.trim is not a function` in production.
- `resolveTopicReferences`: filter out non-string content in the fallback
`lookupMessages` path before calling `.trim()`. Without this guard, the
outer try/catch swallows the TypeError and drops the whole fallback.
- `MessageContent` processor: normalize `message.content` (string or
parts array) before concatenating file context, instead of relying on
implicit `toString()` coercion which emitted `[object Object]` into
the LLM prompt.
Adds regression tests for both paths.
🐛 fix(local-system): restore loc param when calling readLocalFile IPC
The `denormalizeParams` method in `LocalSystemExecutionRuntime` was
missing a case for `readLocalFile`. It fell through to `default`, which
passed `{startLine, endLine, path}` as-is to the IPC layer. However,
the IPC handler (`LocalFileCtr.readFile`) expects `LocalReadFileParams`
with `loc?: [number, number]`, not `startLine`/`endLine`. As a result,
`loc` was always `undefined` on the IPC side, causing `readLocalFile`
to default to `[0, 200]` and always return content from line 0.
Fix: add an explicit `readLocalFile` case that reconstructs the `loc`
tuple from `startLine` and `endLine` before forwarding to the IPC layer.
Fixes#13735
Co-authored-by: octo-patch <octo-patch@github.com>
* 🐛 fix: refine ProviderBizError classification for insufficient balance and quota limit errors
Extract inline "Insufficient Balance" check into a dedicated `isInsufficientQuotaError` utility with case-insensitive matching and broader patterns. Add "too many tokens" pattern to `isQuotaLimitError` for Moonshot rate-limit messages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* update
* 🐛 fix: remove "account has been deactivated" from InsufficientQuota patterns
Account deactivation can be triggered by policy, security, or account review — not just billing. Classifying it as InsufficientQuota misleads users into topping up balance when the fix is usually permission or support escalation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add AccountDeactivated error type for deactivated/suspended accounts
Separate account deactivation from InsufficientQuota so users get actionable guidance (contact support) instead of misleading billing advice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: preserve error message in ChatCompletionErrorPayload for ProviderBizError
Add `message` field to `ChatCompletionErrorPayload` and extract SDK error messages in `handleOpenAIError` and `handleAnthropicError`, so downstream consumers (agent tracing, error state) receive human-readable error details instead of generic "ProviderBizError".
Closes LOBE-7019
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: guard nullish error in handleAnthropicError
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: resolve author info (avatar + name) for task activity list
Add `author` field to `TaskDetailActivity` with `{id, type, name, avatar}`.
Backend resolves agent/user info via batch queries in `getTaskDetail`:
- Topics: author is the task's assignee agent
- Briefs: author is the brief's agentId
- Comments: author is authorAgentId or authorUserId
Fixes LOBE-7013
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: move author resolution queries to model layer
Replace direct db.select() calls in TaskService with:
- AgentModel.getAgentAvatarsByIds() for agent info
- UserModel.findByIds() for user info
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix: show loading state for assistant message during sendMessage phase
During optimistic update, the assistant message content is "..." but the
loading indicator was not shown because isGenerating only checks
AI_RUNTIME_OPERATION_TYPES (execAgentRuntime), not sendMessage. Include
isCreating state so the loading dots appear immediately when message is sent.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add delete action to agent profile dropdown menu
Add a "Delete" option to the three-dot menu in Agent Profile header,
with confirmation modal. Uses existing `removeAgent` from homeStore.
Fixes LOBE-6582
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: navigate to home after deleting agent from profile
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: complete operation and show error on gateway error event
- Error event handler writes inline error immediately via
internal_dispatchMessage, then fetches from DB for richer detail.
This ensures the UI always shows an error even when the server
hasn't persisted the error into the message table.
- disconnected listener only fires onSessionComplete after a terminal
agent event (agent_runtime_end / error), not on auth failures or
explicit disconnect calls.
- Track terminal events via agent_event listener with dedup guard to
prevent double-firing onSessionComplete.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: persist error into assistant message on agent runtime failure
When an agent runtime step fails, the error was written to error_logs
and Redis state but not to the assistant message in the DB. This caused
the frontend to show an empty message after fetchAndReplaceMessages,
since the message had no error field set.
Now dispatchCompletionHooks writes the error to the assistant message
via messageModel.update when reason is 'error', matching the pattern
used by updateAbortedAssistantMessage.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add `cancelIfRunning` to TaskTopicModel: atomically cancel only if topic
is still running, preventing overwrite of concurrent completed/timeout transitions
- Skip topic cancellation when `interruptTask` fails, keeping DB state
consistent with the still-running remote operation
- Add test for interrupt failure scenario
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat(subscription): add cross-platform subscription i18n and mobile subscription router
- Add crossPlatform.title/desc/manageOnMobile translations for 18 languages
- Register mobileSubscriptionRouter in mobile tRPC router
- Add mobileSubscription business router placeholder
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ Restructure sidebar layout: extract Lobe AI entry, move New Agent button
- Extract Lobe AI (InboxItem) from agent list to standalone top entry in sidebar body
- Move "New Agent" button from header to below Lobe AI entry
- Add "Create" to bottom menu items alongside Community and Resources
- Filter hidden items in BottomMenu component
Fixes LOBE-6938
https://claude.ai/code/session_01RtfXck3GUngoLAgP2yHArz
* ✨ Add unified Recents section to home page
- New TRPC router `recent.getAll` aggregating topics, documents, files, and tasks
- New client service and SWR-based store integration for recents data
- Unified Recents component on home page with type-based icons
- Items sorted by updatedAt, limited to 10, mixed across all types
Fixes LOBE-6938
https://claude.ai/code/session_01RtfXck3GUngoLAgP2yHArz
* ⚡ Prefetch agent config on hover for faster page loads
- Add usePrefetchAgent hook using SWR mutate to warm cache
- Trigger prefetch on mouseEnter for sidebar agent items
- Reduces or eliminates loading screen when navigating to agent pages
Fixes LOBE-6938
https://claude.ai/code/session_01RtfXck3GUngoLAgP2yHArz
* ✨ Redesign agent homepage with info, recent topics, and tasks
- New AgentHome feature replacing the old AgentWelcome component
- Agent info section: avatar, name, description, opening questions
- Recent Topics: horizontal scrollable cards for agent-specific topics
- Tasks section: list with status labels for agent-assigned tasks
- Preserve ToolAuthAlert for tool authorization flows
Fixes LOBE-6938
https://claude.ai/code/session_01RtfXck3GUngoLAgP2yHArz
* fix: common misstakes in layout
* chore: add fetch Recents cache
* chore: add back createagents
* chore: add back lobe ai
* feat: add display count
* feat: add create agent button
* feat: add sidebar section order
* chore: move divider
* ✨ feat: show current page size in display items submenu
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add sidebar display management with customize sidebar modal
- Add "Hide section" and "Customize sidebar" to Recents/Agents dropdown menus
- Create CustomizeSidebarModal with eye toggle for section visibility
- BottomMenu (Community/Resources) also manageable via modal
- Show customize sidebar button in footer when all sections hidden
- Add hiddenSidebarSections to store with localStorage persistence
- Rename "Display Items" to "Show" in dropdown menus
- Add 12px margin between accordion sections and bottom menu
- Add i18n keys for en-US and zh-CN
Fixes LOBE-6938
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 style: use SlidersHorizontal icon for customize sidebar
Replace Settings2/PanelLeft icon with SlidersHorizontal to avoid
confusion with the settings gear icon.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 style: refine sidebar customization UX
- Move Settings entry from Footer to BottomMenu alongside Community/Resources
- Add Settings to Customize sidebar modal with eye toggle
- Allow hiding all sections (remove disabled constraint)
- Move Customize sidebar button next to help button in Footer
- Merge Agent dropdown: group Create items with Category items
- Use SlidersHorizontal icon for Customize sidebar
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add recents item actions and "more" drawer
- Add inline rename (same as Agent Topic) and delete to Recents items
- Topic/document/file support rename + delete, task supports delete only
- Add "more" button when items exceed pageSize, opens AllRecentsDrawer
- AllRecentsDrawer shows all cached recents from store (up to 50)
- Fetch max(pageSize, 50) items to support drawer without extra request
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add create agent/group modal with ChatInput and examples
- Add CreateAgentModal using base-ui Modal with ChatInputProvider
- Show suggestion examples (agent/group mode) in 2-column grid
- Submit triggers sendAsAgent/sendAsGroup to auto-generate via Agent Builder
- "Create Blank" button for skipping the prompt
- Integrate modal into AgentModalProvider for shared state across sidebar
- Wire up AddButton, NewAgentButton, and dropdown menus to open modal
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: optimitic update rename
* chore: prefetch agent detail
* feat: add recent topic meta data
* feat: add recents search
* ⚡ perf: optimize recents API with single UNION query and prefetch
- Replace 3 separate DB queries with single UNION ALL query (RecentModel)
- Add optimistic updates for rename and delete actions
- Add hover prefetch for resources (usePrefetchResource)
- Add hover prefetch for agent config on topic/task items
- Change default pageSize to 5 for both Agents and Recents
- Unify delete confirmation messages per item type
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* chore: adjust settings page
* chore: optimize side bar
* feat: recents support right click
* chore: add pin icon to Agents
* chore: add custom side bar modal
* chore: reserve rencent drawer status
* feat: add prefetch route
* feat: add LobeAI prefetch
* fix: document and task rename and delete operation lost
* fix: group route id
* fix: lint error
---------
Co-authored-by: Claude <noreply@anthropic.com>
* chore: bump lucide-react from ^0.577.0 to ^1.8.0
Breaking change: Github icon was removed from lucide-react v1.x (brand icons removed).
Replaced with Github from @lobehub/icons in 5 affected files.
* fix: use GithubIcon from @lobehub/ui/icons instead of @lobehub/icons
When a task's status changes from `running` to another state (backlog/paused/completed/canceled),
automatically cancel all associated running topics and interrupt their operations.
This prevents 409 CONFLICT errors when users try to re-run a task after manually changing its status.
Fixes LOBE-6719
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## 📦 Weekly Release 20260410
This release includes **67 commits**. Key user-facing updates below.
### New Features and Enhancements
- Introduced **Prompt Rewrite & Translate** feature for assisted input
editing.
- Added **Skill Panel** with dedicated skills tab in the skill store and
fixed skill icon rendering.
- Introduced `lh notify` CLI command for external agent callbacks.
- Added `migrate openclaw` CLI command.
- Added **GraphAgent** and `agentFactory` for graph-driven agent
execution (experimental).
- New topic auto-creation every 4 hours for long-running sessions.
### Models and Provider Expansion
- Added a new provider: **StreamLake (快手万擎)**.
- Added **GLM-5.1** model support with Kimi CodingPlan fixes.
- Added **Seedance 2.0** & **Seedance 2.0 Fast** video generation models
(pricing adjusted with 20% service fee).
- Expanded AIGC parameter support for image and video generation.
- Improved model type normalization for better provider compatibility.
- Multi-media and multiple connection mode support for ComfyUI
integration.
### Desktop Improvements
- **Embedded CLI** in the desktop app with PATH installation support.
- Added Electron version display in system tools settings.
- Fixed RuntimeConfig instant-apply working directory with recent list.
- Fixed desktop locale restore — now uses stored URL parameter instead
of system locale.
- Improved remote re-auth for batched tRPC and clean OIDC on gateway
disconnect.
### Stability, Security, and UX Fixes
- **Security**: prevented path traversal in
`TempFileManager.writeTempFile`; patched IDOR in
`addFilesToKnowledgeBase`; upgraded `better-auth` with hardened
`humanIntervention` requirement in builtin-tool-activator.
- **Context engine**: added `typeof` guard before `.trim()` calls to
prevent runtime crashes.
- **Agent runtime**: preserved reasoning state across OpenAI providers;
fixed service error serialization producing `[object Object]`; surfaced
error `reasonDetail` in `agent_runtime_end` events.
- **Knowledge Base**: cleaned up vector storage when deleting knowledge
bases.
- **Templates**: allow templates to specify `policyLoad` so default docs
are fully injected.
- **Skills**: inject current agents information when `lobehub_skill` is
activated; filter current agent out of available agents list; fix
`agents_documents` overriding `systemRole`.
- **Google Tools**: use `parametersJsonSchema` for Google tool schemas.
- **Web Crawler**: prevent happy-dom CSS parsing crash in
`htmlToMarkdown`.
- **Mobile/UI**: fixed video page icon collision, missing locale keys,
model query param; hidden LocalFile actions on topic share page; allow
manual close of hidden builtin tools.
- **Auth**: `ENABLE_MOCK_DEV_USER` now supported in `checkAuth` and
openapi auth middleware.
- **Sandbox**: stopped using `sanitizeHTMLContent` to block scripts &
sandbox styles.
### Refactors
- Library/resource tree store for hierarchy and move sync.
- Removed legacy `messageLoadingIds` from chat store.
- Removed promptfoo configs and dependencies.
- `OnboardingContextInjector` wired into context engine.
### Credits
Huge thanks to these contributors (alphabetical):
@arvinxx @canisminor1990 @cy948 @hardy-one @hezhijie0327 @Innei
@MarcellGu @ONLY-yours @rdmclin2 @rivertwilight @sxjeru @tjx666
Add `typeof !== 'string'` checks before `.trim()` calls in BaseSystemRoleProvider,
SystemRoleInjector, and BaseProcessor to prevent TypeError when a non-string truthy
value (e.g. object, array, number) is passed at runtime.
* fix(builtin-tool-activator): add humanIntervention required field to activateTools manifest
- Add humanIntervention: "required" to the activateTools API manifest
- Update better-auth dependency from 1.4.6 to 1.4.9 (GHSA-xg6x-h9c9-2m83, 分数: 7.4)
* Downgrade better-auth version to 1.4.6
Thanks for your correction.
* ✨ feat: add gateway mode branch to regenerateUserMessage
When gateway mode is enabled, regenerateUserMessage now calls
executeGatewayAgent with parentMessageId instead of running
internal_execAgentRuntime locally. The server handles branching
and agent execution.
Fixes LOBE-6934
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: switch branch before gateway regeneration and keep operation open
- Move switchMessageBranch before the gateway/client branch so
activeBranchIndex is advanced and the UI shows the new response
immediately (fixes regression from client path)
- Add onComplete callback to executeGatewayAgent so callers can
run cleanup when the gateway session finishes
- Keep regenerate operation running until onComplete fires,
preventing duplicate concurrent regenerations via isMessageRegenerating
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: fix Kimi K2.5 model icon display by using deploymentName
- Change model id from 'k2p5' to 'kimi-k2.5' to match Moonshot icon keywords
- Add deploymentName 'k2p5' for API calls to use original model name
- Add KimiCodingPlan to providersWithDeploymentName list
This allows the model icon to display correctly while maintaining
backward compatibility with the API using the original 'k2p5' name.
* 🐛 fix: normalize messages for KimiCodingPlan thinking models
Add message normalization for Kimi K2.5 and K2 Thinking models to ensure
every assistant message has a thinking block when thinking is enabled.
This fixes the issue where regenerating with KimiCodingPlan after using
other providers would fail with "thinking is enabled but reasoning_content
is missing" error, because historical messages from other providers don't
have reasoning fields.
The normalization adds a placeholder thinking block when:
1. Thinking is enabled for Kimi K2.5/K2 Thinking models
2. Assistant message doesn't have reasoning content
* ✨ feat(siliconcloud): add GLM-5.1 model support
Add GLM-5.1 (Pro) model configuration with:
- 198K context window
- Function call and reasoning capabilities
- Tiered pricing (0-32k / 32k+)
- reasoningBudgetToken32k extension parameter
* 🐛 fix: use hardcoded maxOutput mapping for KimiCodingPlan models
Replace getModelPropertyWithFallback with a simple hardcoded mapping to fix
the issue where max_tokens lookup fails when using deploymentName (k2p5).
The model id is converted to deploymentName in ChatService layer before
reaching the provider, causing getModelPropertyWithFallback('k2p5', ...) to
fail since the model card uses id 'kimi-k2.5'.
By using a hardcoded mapping that supports both model id and deploymentName,
we avoid the lookup issue while keeping the code simple (KimiCodingPlan only
has a few models).
* ✅ test(kimiCodingPlan): add tests for thinking and max_tokens handling
Add comprehensive tests for KimiCodingPlan provider covering:
- Hardcoded maxOutput mapping for k2p5, kimi-k2.5, kimi-k2-thinking
- Thinking parameter handling for kimi-k2.5 and kimi-k2-thinking models
- Message normalization with forceThinking for assistant messages
- Tool calls with reasoning content to prevent API error
* ✅ test(kimiCodingPlan): add tests for thinking and max_tokens handling
Add comprehensive tests for KimiCodingPlan provider covering:
- Hardcoded maxOutput mapping for k2p5, kimi-k2.5, kimi-k2-thinking
- Thinking parameter handling for kimi-k2.5 and kimi-k2-thinking models
- Message normalization with forceThinking for assistant messages
- Tool calls with reasoning content to prevent API error
* refactor(workflow): rewrite WorkflowSummary with status dot and minimal flat style
* refactor(workflow): rewrite WorkflowCollapse with unified borderless container
* ✨ feat(workflow): add WorkflowExpandedList component and fix type errors
* ♻️ refactor(workflow): add missing Workflow components with Minimal Flat design
- WorkflowReasoningLine: cssVar tokens, aligned padding
- WorkflowToolDetail: new expandable result panel with motion animation
- WorkflowToolLine: expand chevron, getToolColor, detail panel integration
- WorkflowExpandedList: flat rendering with reasoning + tool lines
* Add tool call collapse support
Made-with: Cursor
* 💄 style(workflow): align WorkflowCollapse UI with @lobehub/ui design system
- Align border-radius, gap, padding tokens across all Workflow components
- Replace chevron expand/collapse with status icons (CheckCircle2, CircleX, Loader2)
- Use @lobehub/ui Highlighter for tool detail panel with JSON auto-formatting
- Use @lobehub/ui Flexbox for WorkflowExpandedList with proper gap and padding
- Fix delete action to use removeToolFromMessage instead of deleteAssistantMessage
- Wire debug button to existing Tool/Debug panel with full tabs
- Fix auto-collapse to only trigger on incomplete→complete transition
- Single ChevronDown with rotation for WorkflowSummary (match @lobehub/ui pattern)
* 💄 style(workflow): use AccordionItem and inspectorTextStyles for WorkflowCollapse
- Replace custom WorkflowSummary with @lobehub/ui AccordionItem
- Use StatusIndicator pattern (Block outlined 24x24) for status icon
- Apply inspectorTextStyles.root for title text (colorTextSecondary)
- Remove WorkflowSummary.tsx (dead code)
- Match Tool component AccordionItem usage (paddingBlock/Inline=4, borderless)
* 💄 style(workflow): remove divider and gap from WorkflowExpandedList
* 💄 style(workflow): align WorkflowCollapse title bar with Thinking component
* 💄 style(workflow): unify inner item spacing, font size, and colors
* ✨ feat(workflow): add streaming scroll behavior with max-height and auto-scroll
* 💄 refactor(assistant-group): refine workflow collapse UI and duration
- Use Accordion for collapse; align tool/reasoning lines with generation state
- Show workflow header duration from summed block performance, not reasoning only
Made-with: Cursor
* ✨ feat(inspector): enhance ActivateToolsInspector to display not found tools count
- Added localization for not found tools message in English, Chinese, and default locales.
- Updated ActivateToolsInspector to show a tooltip with the count of tools not found.
- Modified StatusIndicator to support a warning state for scenarios where no tools are activated but some are not found.
Signed-off-by: Innei <tukon479@gmail.com>
* 💄 style(workflow): simplify padding in WorkflowExpandedList component
- Removed unnecessary paddingInline from Flexbox elements in WorkflowExpandedList for cleaner layout.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(assistant-group): introduce constants and utility functions for workflow management
- Added constants for workflow timing, limits, and tool display names to enhance the assistant group's functionality.
- Implemented utility functions for processing and scoring post-tool answers, improving the workflow's response handling.
- Created new components for rendering content blocks and managing scroll behavior in the assistant group.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(assistant-group): enhance ContentBlock and Group components with content handling logic
- Added logic to conditionally render message content based on content availability and tool presence in ContentBlock.
- Introduced utility functions to determine substantive content and reasoning in Group, improving block partitioning for workflow management.
- Updated partitioning logic to handle trailing reasoning candidates and streamline answer and working block separation.
Signed-off-by: Innei <tukon479@gmail.com>
* 🙈 chore(gitignore): clarify superpowers local paths
Document that `.superpowers/` and `docs/superpowers/` are plugin/local outputs
and must not be committed.
Made-with: Cursor
* 👷 chore(ci): restore auto-tag-release workflow from canary
Revert unintended workflow edits so release tagging stays on main with
sync-main-to-canary dispatch.
Made-with: Cursor
---------
Signed-off-by: Innei <tukon479@gmail.com>
* 🐛 feat(db): add findExclusiveFileIds, deleteWithFiles, deleteAllWithFiles to KnowledgeBaseModel
Add methods to safely clean up vector storage when deleting knowledge bases:
- findExclusiveFileIds: identifies files belonging only to a specific KB
- deleteWithFiles: deletes KB and its exclusive files with chunks/embeddings
- deleteAllWithFiles: bulk version for deleting all user KBs
* 🐛 fix(kb): wire vector cleanup in TRPC router, OpenAPI service, and client
- TRPC removeKnowledgeBase: use deleteWithFiles when removeFiles=true + S3 cleanup
- TRPC removeAllKnowledgeBases: use deleteAllWithFiles + S3 cleanup
- OpenAPI deleteKnowledgeBase: use deleteWithFiles + S3 cleanup
- Client service: default removeFiles=true when deleting knowledge base
* 🐛 fix(knowledgeBase): change default behavior of deleteKnowledgeBase to not remove files and update related tests
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(knowledgeBase): add optional query parameter to deleteKnowledgeBase for file removal
- Introduced `removeFiles` query parameter to control the deletion of exclusive files and derived data when deleting a knowledge base.
- Updated `KnowledgeBaseController`, `KnowledgeBaseService`, and related schemas to support this new functionality.
This change enhances the flexibility of the delete operation, allowing users to choose whether to remove associated files.
Signed-off-by: Innei <tukon479@gmail.com>
* 🐛 fix: cascade knowledge base deletion and add orphan cleanup runbook
* ✨ feat(knowledgeRepo): implement cascading deletion for file-backed documents
- Enhanced the `KnowledgeRepo` to ensure that when a document with an associated file is deleted, all related data (files, chunks, embeddings) are also removed.
- Introduced a new method `deleteDocumentWithRelations` to handle the cascading deletion logic.
- Updated tests to verify that all related entities are deleted when a file-backed document is removed.
This change improves data integrity by ensuring that no orphaned records remain after deletions.
Signed-off-by: Innei <tukon479@gmail.com>
* Defer DocumentService file initialization
* Fix flaky database tests and knowledge repo fixtures
* Add deletion regression tests for folders and external files
* ⏪ chore: remove kb orphan cleanup files from pr
---------
Signed-off-by: Innei <tukon479@gmail.com>
* 🌐 chore: update execServerAgentRuntime i18n copy
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: extend execAgent with parentMessageId for regeneration/continue via Gateway
Add parentMessageId support to the execAgent API, enabling regeneration and continue-generation flows through the Gateway WebSocket path. When parentMessageId is provided, user message creation is skipped (resume mode) and the new assistant message branches from the specified parent.
Fixes LOBE-6933
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: propagate parentMessageId through execAgents batch and fix test types
- Forward parentMessageId in execAgents executeTask to maintain batch parity with execAgent
- Fix ExecAgentResult mock types in gateway tests
- Fix messages table insert type cast in server router test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(modelParse): enhance model type normalization and add tests for invalid types
* feat(modelParse): optimize imports and improve model type handling
* 🐛 fix: buffer and deduplicate events during resume to prevent out-of-order display
When reconnecting with empty lastEventId (page reload), live broadcast
events can arrive before resume replay completes, causing content to
appear out of order. Now AgentStreamClient enters resume mode: buffers
all events, waits for a 500ms gap (resume replay is dense, live events
are sparse), then deduplicates by event ID and emits in order.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: clear runningOperation on agent finish + resume timeout for completed sessions
- RuntimeExecutors.finish clears topic metadata.runningOperation when
agent reaches terminal state, so stale entries don't trigger reconnect
- AgentStreamClient resume mode: add 3s timeout for empty buffer —
if no events arrive after resume request, session has already completed,
emit session_complete and disconnect
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: eagerly fetch messages after topic switch to avoid skeleton flash
After switchTopic in Gateway mode, immediately fetch messages from DB
and replace in store, so the UI renders content right away instead of
showing a skeleton loading state while SWR re-fetches.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: eliminate skeleton flash on gateway topic switch
Match the client-mode pattern: fetch messages from DB and replaceMessages
BEFORE calling switchTopic with skipRefreshMessage: true. This ensures
messages are already in the store when the topic switches, preventing
a skeleton loading flash.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: flush resume buffer on session_complete before disconnect
session_complete is a top-level ServerMessage (not an agent_event), so
it bypassed the resume buffer. When it arrived during resume mode,
disconnect() cleared the buffer and all replayed events were lost.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: limit resume buffering to explicit reconnect scenarios only
Resume mode was triggered for ALL new connections (lastEventId always
empty on first connect), delaying live streaming for normal operations.
Now resume buffering requires explicit opt-in via resumeOnConnect option,
which is only set by reconnectToGatewayOperation (page-reload reconnect).
Normal executeGatewayAgent connections stream events immediately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: should inject current agnets information when actived the lobehub skill
* fix: not inject the agent systemRole in lobehub skill inject
* fix: should use the isLobeHubSkillActive hook to judge
* fix: change the tools inject to vars replace function
* fix: add the lost topic id & agent title
* fix: later the PlaceholderVariablesProcessor
* fix: update the description
* ✨ feat: add StreamLake (快手) support
* style: add thinking support
style: add thinking support
style: add thinking support
style: add thinking support
style: add thinking support
🐛 fix(server): prevent path traversal in TempFileManager.writeTempFile
Use path.basename() to strip directory components from user-supplied
filenames before writing temp files, preventing arbitrary file write
via crafted filenames like "../../app/startServer.js".
Fixes LOBE-6904
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: persist runningOperation to topic metadata for gateway reconnect
- Add runningOperation field to ChatTopicMetadata type
- execAgent writes { operationId, assistantMessageId } to topic metadata
after creating the operation
- onSessionComplete clears runningOperation from metadata (best-effort)
- Extend updateTopicMetadata tRPC schema + service to support the field
Fixes LOBE-6905
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add refreshGatewayToken tRPC endpoint
Signs a fresh JWT for Gateway WebSocket reconnection after page reload.
The token is scoped to the authenticated user via signUserJWT.
Fixes LOBE-6906
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: auto-reconnect to running Gateway operation on topic load
- Add reconnectToGatewayOperation to GatewayActionImpl — refreshes JWT,
creates local operation, and connects WebSocket with event replay
- Add useGatewayReconnect hook — checks topic metadata.runningOperation
when entering a topic and triggers reconnection
- Wire hook into ConversationArea
Fixes LOBE-6907
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: preserve thread scope in reconnect context and subscribe to topic metadata
- Store scope + threadId in topic metadata.runningOperation
- reconnectToGatewayOperation uses stored scope/threadId instead of
hardcoded main/null
- useGatewayReconnect subscribes to runningOperation via useChatStore
selector so it triggers when topic data arrives from SWR (not just
on mount when data may be empty)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: update device tests to allow runningOperation metadata writes
The tests asserted updateMetadata was never called, but now execAgent
persists runningOperation. Changed to assert no device-binding metadata
was written (boundDeviceId), which is the actual intent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: use SWR for gateway reconnect lifecycle
Replace useEffect + ref with useSWR keyed by operationId. SWR
naturally deduplicates (same key = no re-fetch), handles the async
reconnect, and doesn't fire when key is null (no runningOperation).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: validate topic has running operation before issuing gateway token
refreshGatewayToken now requires topicId, verifies the topic belongs to
the user and has a runningOperation in metadata before signing a JWT.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 style: break signin title into two lines
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Fix signin.title formatting in auth.json
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: allow templates to specify policyLoad so default docs are fully injected
All documents were hardcoded to PolicyLoad.PROGRESSIVE on creation,
causing CLAW template docs (IDENTITY, SOUL, BOOTSTRAP, AGENTS) to be
progressively disclosed instead of fully injected into context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: forward policyLoad through upsertDocument and persist on update
- Add policyLoad to UpsertDocumentParams and pass it through to model
- Add policyLoad param to update() so upsert's existing-document path
writes the value instead of silently discarding it
- Ensures re-running template init migrates pre-existing docs to ALWAYS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: change update() to use named params object instead of positional args
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: change create() and upsert() to use named params object
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✅ test: improve agentDocuments test coverage to 99%
Add tests for uncovered branches:
- normalizeLoadRule default branch (unknown rule)
- explicit 'always' rule match
- by-time-range with NaN dates
- resolveDocumentLoadPosition fallback paths
- composeToolPolicyUpdate with existing context values
- upsert create path for new filenames
- getAgentContext empty docs path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: preserve policyLoad when copying documents
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✅ fix: align test assertion with refactored create() params object signature
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🐛 fix(database): add ownership check in addFilesToKnowledgeBase to prevent IDOR
Verify that the target knowledge base belongs to the authenticated user
before inserting files, preventing unauthorized file injection into
other users' knowledge bases.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: reuse existing messages in execAgent when existingMessageIds provided
When existingMessageIds contains [userMsgId, assistantMsgId], skip
creating new messages and reuse the existing ones. This fixes duplicate
messages in Gateway mode where sendMessageInServer already created
the messages before execAgentTask is called.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: allow clicking NavItem while loading
Loading state should only show a visual indicator, not block onClick.
This fixes topic sidebar items being unclickable during agent execution.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Revert "🐛 fix: reuse existing messages in execAgent when existingMessageIds provided"
This reverts commit 43b808024d5c4a0074b692a85083a72046ab47e0.
* 🐛 fix: skip sendMessageInServer in Gateway mode to avoid duplicate messages
Gateway mode now calls execAgentTask directly instead of going through
sendMessageInServer first. The backend creates user + assistant messages
and topic in one call. executeGatewayAgent handles topic switching
internally after receiving the server response.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🌐 chore: add i18n for execServerAgentRuntime operation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: move temp message cleanup after executeGatewayAgent succeeds
Keep temp messages visible during the gateway call so the UI isn't
blank. On failure, mark the operation as failed instead of silently
returning — temp messages remain so the user sees something went wrong.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: remove manual temp message cleanup in gateway mode
switchTopic handles new topic navigation, and fetchAndReplaceMessages
replaces the message list from DB — no need to manually delete temp
messages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: clear _new key temp messages when gateway creates new topic
Pass clearNewKey: true to switchTopic so temp messages from the
optimistic create don't persist in the _new key after switching
to the server-created topic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: import ExecAgentResult from @lobechat/types
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(desktop): embed CLI in app and PATH install
Made-with: Cursor
* ✨ feat(desktop): add CLI command execution feature and UI integration
- Implemented `runCliCommand` method in `ElectronSystemService` to execute CLI commands.
- Added `CliTestSection` component for testing CLI commands within the app.
- Updated `SystemCtr` to include CLI command execution functionality.
- Enhanced `generateCliWrapper` to create short aliases for CLI commands.
- Integrated CLI testing UI in the system tools settings page.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat: enhance working directory handling for desktop
- Updated working directory logic to prioritize topic-level settings over agent-level.
- Introduced local storage management for agent working directories.
- Modified tests to reflect changes in working directory behavior.
- Added checks to ensure working directory retrieval is only performed on desktop environments.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(desktop): implement CLI command routing and cleanup
- Introduced `CliCtr` for executing CLI commands, enhancing the desktop application with CLI capabilities.
- Updated `ShellCommandCtr` to route specific commands to `CliCtr`, improving command handling.
- Removed legacy CLI path installation methods from `SystemCtr` and related services.
- Cleaned up localization files by removing obsolete entries related to CLI path installation.
Signed-off-by: Innei <tukon479@gmail.com>
* 🚸 settings(system-tools): show CLI embedded test only in dev mode
Made-with: Cursor
---------
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat: integrate Gateway connection management into chat store
Add GatewayActionImpl to aiChat slice for managing Agent Gateway
WebSocket connections per operationId. Includes connect, disconnect,
interrupt, and status tracking. Also type the execAgentTask return value.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add Gateway mode branch in sendMessage for server-side agent execution
When agentGatewayUrl is set in server config (enableQueueAgentRuntime),
sendMessage now triggers server-side agent execution via execAgentTask
and receives events through the Agent Gateway WebSocket, instead of
running the agent loop client-side.
Includes:
- Expose agentGatewayUrl in GlobalServerConfig when queue mode is enabled
- Gateway event handler mapping stream events to UI message updates
- Fallback to client-side agent loop when Gateway is not configured
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: emit disconnected event on intentional disconnect
disconnect() was only calling setStatus('disconnected') but not emitting
the 'disconnected' event. This caused the store's cleanup listener to
never fire after terminal events (agent_runtime_end), leaving stale
connections in gatewayConnections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: enhance Gateway event handler for multi-step agent streaming
Support multi-step agent execution display (LLM → tool calls → next LLM)
using hybrid approach: real-time streaming for current step, DB refresh at
step transitions.
Fixes LOBE-6874
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: wire up Gateway JWT token from execAgent to connectToGateway
Pass the RS256 JWT token returned by execAgentTask to connectToGateway
for WebSocket authentication. Also use ExecAgentResult from @lobechat/types
instead of local duplicate definition.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: handle wss:// protocol in AgentStreamClient buildWsUrl
When gatewayUrl already uses ws:// or wss:// protocol, use it directly
instead of stripping and re-adding the protocol prefix. Previously,
wss://host would become ws://wss://host (double protocol).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: queue gateway events to ensure stream_chunk waits for refreshMessages
Use a sequential Promise chain to process gateway events, so that
stream_chunk dispatches only run after stream_start's refreshMessages
resolves. Previously, chunks arrived before the new assistant message
existed in dbMessagesMap, causing updates to be silently dropped.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: pass operationId context to internal_dispatchMessage in gateway handler
Without operationId, internal_dispatchMessage falls back to global state
to compute the messageMapKey, which may differ from the key where
refreshMessages stored the server-created messages. Passing operationId
ensures the correct conversation context is resolved.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: resolve gateway streaming display issues
- Use fetchAndReplaceMessages (direct DB fetch + replaceMessages) instead
of refreshMessages which mutates an orphaned SWR key
- Create dedicated execServerAgentRuntime operation with correct topicId
context for internal_dispatchMessage to resolve the right messageMapKey
- Complete operation on agent_runtime_end instead of relying on
onSessionComplete callback
- Keep loading state active between steps (only clear on agent_runtime_end)
so users don't think the session ended during tool execution gaps
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: maintain loading state across gateway step transitions
- Create dedicated execServerAgentRuntime operation with correct topicId
- Use fetchAndReplaceMessages instead of orphaned refreshMessages SWR key
- Re-apply loading after tool_end refresh so UI stays active between steps
- Complete operation on agent_runtime_end
- Add record-app-screen.sh for automated screen recording
- Output recordings to .records/ (gitignored)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: show loading on assistant message immediately in stream_start
Set loading on the current assistant message BEFORE awaiting
fetchAndReplaceMessages, so the UI shows a loading indicator while
waiting for the DB response instead of appearing frozen.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: drive gateway loading state via operation system instead of messageLoadingIds
Associate the assistant message with the gateway operation via
associateMessageWithOperation so the Conversation store's operation-based
loading detection (isGenerating) works correctly. This shows the proper
loading skeleton on the assistant message while waiting for gateway events.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: remove unused internal_toggleMessageLoading from gateway handler
Loading state is now fully driven by the operation system via
associateMessageWithOperation + completeOperation. The old
messageLoadingIds-based approach is no longer needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: rewrite record-app-screen.sh to use CDP screenshot assembly
Replace broken ffmpeg avfoundation live recording (corrupts on kill) with
agent-browser CDP screenshot capture + ffmpeg assembly on stop. This works
reliably on any screen including external monitors.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add Gateway Mode lab toggle and fix CI type error
- Add enableGatewayMode to UserLabSchema as experimental feature
- Add lab selector and settings UI toggle in Advanced > Labs
- Gateway mode now requires both server config (agentGatewayUrl) AND
user opt-in via Labs toggle
- Fix TS2322: result.token (string | undefined) → fallback to ''
- Add i18n keys for gateway mode feature
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: hide Gateway Mode toggle when agentGatewayUrl is not configured
Only show the lab toggle when the server has AGENT_GATEWAY_URL set,
so users without gateway infrastructure don't see the option.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 style: move Gateway Mode toggle below Input Markdown in labs section
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: remove default AGENT_GATEWAY_URL value and make schema optional
Without an explicit env var, the gateway URL should be undefined so the
lab toggle and gateway mode are not available.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 📝 docs: update SKILL.md to reference record-app-screen.sh
Replace outdated record-gateway-demo.sh references with the renamed
record-app-screen.sh and its start/stop lifecycle documentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 📝 docs: add record-app-screen reference doc and slim down SKILL.md
Move detailed recording documentation to references/record-app-screen.md
and keep SKILL.md concise with a link to the full reference.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: guard GatewayStreamNotifier with AGENT_GATEWAY_URL check
AGENT_GATEWAY_URL is now optional, so check both URL and service token
before wrapping with GatewayStreamNotifier to avoid TS2345.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: extract gateway execution logic to GatewayActionImpl
Move server-side gateway execution logic from conversationLifecycle.ts
into GatewayActionImpl.startGatewayExecution(). The sendMessage flow
now does a simple early return when gateway mode is active, keeping
the existing client-mode code path untouched.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ♻️ refactor: split gateway into isGatewayModeEnabled check + executeGatewayAgent
Replace fire-and-forget startGatewayExecution with explicit check/execute
pattern. Caller does: if (check) { await execute(); return; } — giving
proper error handling and clearer control flow.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(ResourceManager): integrate tree store for folder management and enhance file operations
- Added `useTreeStore` to manage folder structure and state, replacing previous file store dependencies.
- Updated `EmptyPlaceholder` to utilize `currentFolderId` for file uploads.
- Refactored `MoveToFolderModal` to use tree store for moving items, improving folder navigation.
- Enhanced drag-and-drop functionality in `DndContextWrapper` to support moving items between folders.
- Removed obsolete `LibraryHierarchy` state management, streamlining folder operations.
- Improved file renaming and deletion processes to ensure tree state consistency.
This update enhances the overall file management experience by leveraging a dedicated tree store for better performance and maintainability.
Signed-off-by: Innei <tukon479@gmail.com>
* ✨ feat(TreeAction): enhance resource movement and update handling
- Updated mutation logic for moving resources to differentiate between items visible in the Explorer and those not visible, improving performance and user experience.
- Added refresh functionality for the file list after resource updates (move, update, delete) to ensure the Explorer reflects the latest state.
- Refactored mutation methods to use async/await for better readability and error handling.
This update streamlines resource management within the tree structure, ensuring a more responsive and consistent user interface.
Signed-off-by: Innei <tukon479@gmail.com>
* Fix file updates and tree move fallback regressions
---------
Signed-off-by: Innei <tukon479@gmail.com>
🐛 fix: hide LocalFile actions (Open/Show in Folder) in share page
In topic share pages, the LocalFile component was showing 'Open' and
'Show in Folder' action buttons on hover, which are desktop-only
operations not available to share page viewers.
- Add 'readonly' prop to LocalFile component to disable interactive actions
- Detect share page context via topicShareId in LocalFile Render plugin
- Skip Popover rendering when readonly is true
* ♻️ refactor: remove legacy messageLoadingIds from chat store
The messageLoadingIds state and internal_toggleMessageLoading action in the
chat store have been fully superseded by the operation system. The state was
being written to but never read by any consumer — all UI components and
selectors already use operation-based selectors (isMessageGenerating,
isMessageProcessing, etc.).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 📝 chore: update skill docs to remove messageLoadingIds references
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: replace messageLoadingIds with operationSelectors in generation action
The Conversation store's regenerateUserMessage was reading messageLoadingIds
from the chat store to check if a message is already being processed. Replace
with operationSelectors.isMessageProcessing which is the correct way to check
operation state.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: add operationsByMessage to test mocks for operation selector
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat(cli): add `lh notify` command for external agent callbacks
Add a new `lh notify` CLI command and server-side TRPC endpoint that allows
external agents (e.g. Claude Code) to send callback messages to a topic and
trigger the agent loop to process them.
Fixes LOBE-6888
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🔧 chore(cli): replace sessionId with agentId and threadId in notify command
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
♻️ refactor: remove promptfoo configs and dependencies from packages
Migrate all prompt evaluation tests to the cloud repo's agent-evals framework.
Remove promptfoo directories, configs, dependencies, and generator scripts
from @lobechat/prompts, @lobechat/memory-user-memory, and @lobechat/builtin-tool-memory.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: use parametersJsonSchema for Google tool schemas to support full JSON Schema
Replace Google's restrictive Schema subset with parametersJsonSchema, which accepts
standard JSON Schema directly. This eliminates the need for resolveRefs and
sanitizeSchemaForGoogle, fixing nullable enum (LOBE-6607) and $ref (LOBE-6680) issues.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: update remaining tests to use parametersJsonSchema
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 💄 fix(RuntimeConfig): instant-apply working directory with recent list
Remove Save/Cancel buttons from working directory selector.
Directories now apply immediately on click. Show recent directories
list with checkmark for active selection and "Choose a different folder"
entry at bottom.
* ✨ feat(SystemCtr): enhance folder selection to return repository type
Updated the `selectFolder` method to return an object containing the selected folder path and its repository type (either 'git' or 'github'). Added a new private method `detectRepoType` to determine the repository type based on the presence of a `.git/config` file. Introduced a new utility for managing recent directories, allowing the application to display appropriate icons based on the repository type in the UI.
Signed-off-by: Innei <tukon479@gmail.com>
---------
Signed-off-by: Innei <tukon479@gmail.com>
* ♻️ refactor: remove redundant update-status call from GatewayStreamNotifier
Gateway now handles session completion directly in pushEvent when it
receives agent_runtime_end, so the separate update-status HTTP call
is no longer needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✅ test: update GatewayStreamNotifier tests for removed update-status call
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
✨ feat: generate JWT token for Gateway WebSocket auth in execAgent
Sign a short-lived RS256 JWT via signUserJWT(userId) when creating an agent
operation, and return it in ExecAgentResult.token so the client can
authenticate with the Agent Gateway WebSocket.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Disable CSS file loading and JS evaluation in happy-dom Window (root cause)
- Add try-catch around Readability.parse() for defense in depth
- Add regression tests for invalid CSS selectors and external stylesheet links
Closes LOBE-6869
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: support nested subtask tree in task.detail
Replace flat subtask list with recursive nested tree structure.
Backend builds the complete subtask tree in one response,
eliminating the need for separate getTaskTree API calls.
Fixes LOBE-6814
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix: return empty array for root subtasks instead of undefined
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 📝 docs: add cli-backend-testing skill
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Each instruction interface now extends AgentInstructionBase directly instead of intersection
- Group instructions by category: LLM, Tool, Task, Human Interaction, Control
- Extract AgentHookType and AgentHookEvent into agent-runtime package
- Keep AgentHook, AgentHookWebhook, SerializedHook in server layer (webhook is server-specific)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add GraphAgent and agentFactory for graph-driven agent execution
- Add GraphAgent: a decorator around GeneralChatAgent that drives execution via declarative ReasoningGraph
- Agent nodes: delegate to GeneralChatAgent for tool-calling loops, then extract structured output
- LLM nodes: single structured LLM call
- Programmatic transition evaluation (not LLM-driven)
- Backtracking with configurable limits
- Add AgentInstruction.stepLabel: allows any Agent to label steps for display in stream events and hooks
- Add agentFactory to AgentRuntimeServiceOptions: external injection of custom Agent implementations
- Add stepLabel propagation: stream_start/stream_end events and afterStep hooks carry the label
- Fix: sanitize null bytes in MessageModel.create content (consistent with existing plugin argument sanitization)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* 🐛 fix(agent-runtime): validate graph node existence and preserve transitions at backtrack limit
- Add node existence check in startNode to prevent runtime crash on invalid entry/transition targets
- Evaluate all transitions even when backtrack limit is reached; only suppress actual backtrack targets
* 🐛(device-gateway-client): prevent uncaught error when closing connecting WebSocket
Detach ws event listeners safely, temporarily handle close-phase errors, and guard ws.close() so logout/token clear does not surface a main-process uncaught exception.
Made-with: Cursor
* 🧹 refactor(tests): remove unused mockProps from ComfyUIForm test
Cleaned up the ComfyUIForm test by removing the unused mockProps object, streamlining the test setup for better clarity and maintainability.
Signed-off-by: Innei <tukon479@gmail.com>
* Hide onboarding finish tool call and preserve close error listener
---------
Signed-off-by: Innei <tukon479@gmail.com>
🐛 fix(desktop): use stored locale from URL parameter instead of system language
When the desktop app restarts, the UI language was reverting to the system
language instead of respecting the user's saved language preference.
Root cause: The inline script in index.html was setting document.documentElement.lang
from navigator.language (system language) before i18n initialization could read
the stored locale from Electron store.
Fix: Check the URL's `lng` query parameter first (which is set by Electron main
process from stored settings in Browser.ts:buildUrlWithLocale()), then fall back
to navigator.language.
Fixes#13616https://claude.ai/code/session_0128LZAbJL1a5vkGboH4U5FP
Co-authored-by: Claude <noreply@anthropic.com>
* 🐛 fix(desktop): remote re-auth for batched tRPC and clean OIDC on disconnect
- Notify authorization required when X-Auth-Required is set, not only on HTTP 401 (207 batch)
- Show AuthRequiredModal after remote config init; do not gate on dataSyncConfig.active
- Desktop: market 401 only silent refresh; avoid community sign-in UI (AuthRequiredModal handles cloud)
- Disconnect: clearRemoteServerConfig to wipe encrypted OIDC tokens
Made-with: Cursor
* 🐛 Reset user-data Zustand stores on remote disconnect and sync refresh
- Add ResetableStoreAction helper and batched reset via userDataStores
- Wire reset into Electron remote disconnect and refreshUserData
- Handle refreshUserData failures in data sync SWR onSuccess
Made-with: Cursor
* 🐛 fix(useUserAvatar): refactor desktop environment checks to use mockConstEnv
- Replace direct manipulation of mockIsDesktop with mockConstEnv.isDesktop for better encapsulation.
- Update all relevant test cases to utilize the new mock structure, ensuring consistent behavior across tests.
This change improves the clarity and maintainability of the test code.
Signed-off-by: Innei <tukon479@gmail.com>
* 🐛 test: update mocks for ShikiLobeTheme and refactor session/agent mocks
- Added ShikiLobeTheme mock to ComfyUIForm and AddFilesToKnowledgeBase tests for consistent theming.
- Refactored session and agent mocks to use async imports, improving test isolation and performance.
This enhances the clarity and maintainability of the test suite.
Signed-off-by: Innei <tukon479@gmail.com>
---------
Signed-off-by: Innei <tukon479@gmail.com>
* 🤖 chore(skills): add electron-dev.sh script and update local-testing skill
Add reusable electron-dev.sh script with start/stop/status/restart commands
that reliably manages all Electron processes (main + helpers + vite).
Update SKILL.md to reference the script instead of inline bash commands.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* ✨ feat: add AgentStreamClient for Agent Gateway WebSocket communication
Browser-compatible WebSocket client for receiving agent execution events
from the Agent Gateway. Supports auto-reconnect with exponential backoff,
heartbeat keep-alive, and event replay via lastEventId resume.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix: add the availableAgents into the prompt inject
* fix: should auto inject the avaiable agents into context when use the auto model
* fix: update the prompt
* fix: test fixed
* ♻️ refactor(onboarding): add OnboardingContextInjector and wire context engine
Made-with: Cursor
* 🔧 refactor(onboarding): update tool call references to use `lobe-user-interaction________builtin`
Modified onboarding documentation and utility functions to standardize the use of the `lobe-user-interaction________builtin` tool call for structured input collection, enhancing clarity and consistency across the codebase.
Signed-off-by: Innei <tukon479@gmail.com>
* 🔧 refactor(onboarding): standardize tool call references to `lobe-user-interaction____askUserQuestion____builtin`
Updated documentation and utility functions to replace instances of the `lobe-user-interaction________builtin` tool call with `lobe-user-interaction____askUserQuestion____builtin`, ensuring consistency in structured input collection across the onboarding process.
Signed-off-by: Innei <tukon479@gmail.com>
* ♻️ refactor(onboarding): move onboarding context before first user
* ♻️ refactor(context-engine): add virtual last user provider
* update v3
* 🐛 fix(onboarding): add early exit escape hatch for boundary cases
The `<next_actions>` directive only prompted finishOnboarding in the
summary phase, but phase transition required all fields + 5 discovery
exchanges — a condition extreme cases rarely meet. This left the model
stuck in discovery, never calling finishOnboarding.
- Add EARLY EXIT hint in discovery phase next_actions
- Add universal completion-signal REMINDER across all phases
- Add minimum-viable discovery fallback in systemRole
- Add explicit completion signal list in Early Exit section
- Add off-topic redirect limit in Boundaries
- Add CRITICAL persistence rule in toolSystemRole
* ✅ test(context-engine): fix OnboardingContextInjector tests to match BaseFirstUserContentProvider
Remove brittle MessagesEngine onboarding test that hardcoded XML content.
---------
Signed-off-by: Innei <tukon479@gmail.com>
🐛 fix(prompts): enforce user perspective in input completion prompt
The autocomplete prompt was generating completions from the AI assistant's
perspective (e.g., "How can I help you?") instead of the user's perspective.
Added explicit perspective constraints with good/bad examples.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:31:14 +08:00
1367 changed files with 88306 additions and 18254 deletions
description: 'Bot platform architecture (Discord, Slack, Telegram, Feishu/Lark, QQ, WeChat). Use when working on inbound webhooks, Chat SDK message routing, agent execution from chat platforms, queue-mode callbacks, gateway lifecycle (websocket/polling), bot provider CRUD/credentials, or platform-specific clients/adapters/schemas. Triggers on bot, channel, webhook, mention, Chat SDK, agent bot provider, gateway, bot-callback, qstash bot.'
---
# Bot System
> **Last updated: 2026-04-08.** Implementation evolves quickly — this doc is a map, not the source of truth. Always read the key files below to verify behavior, especially per-platform quirks. Update this doc when the architecture changes.
LobeChat agents can answer inside external chat platforms. Inbound messages flow through the Chat SDK (`chat` npm package), get routed to the right agent by `(platform, applicationId)`, executed via `AiAgentService`, and replied back through a per-platform `PlatformClient`. There are **two execution modes** (in-memory vs queue/QStash) and **three connection modes** (`webhook`, `websocket`, `polling`).
`supportsMarkdown=false` ⇒ outbound markdown is stripped to plain text via `stripMarkdown` and the AI is told not to use markdown. `supportsMessageEdit=false` ⇒ no progress edits — only the final reply is sent.
**Multi-mode connection** — Slack/Feishu/Lark/QQ shipped as websocket but support `webhook` per-provider via `settings.connectionMode`. Legacy rows without that field stay on `webhook` (see `LEGACY_WEBHOOK_PLATFORMS` in `platforms/utils.ts`) — **never add new platforms to that list**.
→ returns immediately, callbacks land at /api/agent/webhooks/bot-callback
```
The router caches loaded bots in memory. Cache is **invalidated** by `BotMessageRouter.invalidateBot(platform, appId)` whenever the TRPC `update`/`delete` mutations run, so new credentials/settings take effect on the next webhook.
## Execution Modes
### In-memory (default)
`AgentBridgeService.executeWithInMemoryCallbacks` wraps `execAgent` with `stepCallbacks`. Lives in one process — Promise-based wait, 30-min timeout, edits the same `progressMessage` after every step. Topic title is summarized inline via `SystemAgentService`.
### Queue (`isQueueAgentRuntimeEnabled`)
`AgentBridgeService.executeWithWebhooks`:
1. Posts the `renderStart` placeholder, captures `progressMessageId`.
2. Calls `execAgent` with `stepWebhook` and `completionWebhook` pointing at `${INTERNAL_APP_URL ?? APP_URL}/api/agent/webhooks/bot-callback`, plus `webhookDelivery: 'qstash'`.
3. Returns immediately; the bridge `finally` block keeps the active-thread marker held until the `completion` callback fires.
`/api/agent/webhooks/bot-callback/route.ts` verifies the QStash signature and hands off to `BotCallbackService.handleCallback`:
-`type: 'step'` → `handleStep` re-renders `renderStepProgress`, edits `progressMessageId` (skipped if `displayToolCalls=false` or platform `supportsMessageEdit=false`).
-`type: 'completion'` → `handleCompletion` writes the final reply (or error/interrupted message), removes the 👀 reaction, clears active-thread tracker, fires async `summarizeTopicTitle`.
`BotCallbackService.createMessenger` reloads provider + credentials from DB and rebuilds a `PlatformClient` per call (no in-memory state).
## Commands
Defined in `BotMessageRouter.buildCommands` and registered via two paths:
- **Text-based fallback** (Telegram/Feishu/QQ/Lark/WeChat): `bot.onNewMessage(/^\/(new|stop)(\s|$|@)/, ...)` plus a per-mention `tryDispatch` so commands work even before subscribe.
Built-in commands:
-`/new` — clears `topicId` in thread state, next message starts a fresh topic.
-`/stop` — interrupts the active execution (calls `AiAgentService.interruptTask` if `operationId` is known; otherwise queues a deferred stop via `requestStop`/`pendingStopThreads`, also aborts the startup phase via `startupControllers`).
To add a command, append to `buildCommands` — it auto-registers everywhere; on Telegram it also surfaces in the `/` menu via `client.registerBotCommands` → `setMyCommands`.
## Active-thread State (statics on `AgentBridgeService`)
-`activeThreads: Set<threadId>` — prevents duplicate runs per thread (must guard before stale-topic check, otherwise concurrent messages can drop).
-`activeOperations: Map<threadId, operationId>` — needed by `/stop` once `execAgent` returns.
-`startupControllers: Map<threadId, AbortController>` — cancels pre-`operationId` work (topic/tool prep).
-`pendingStopThreads: Set<threadId>` — `/stop` arrived before `operationId` existed; consumed once available.
In **queue mode**, the bridge `finally` skips cleanup so the marker persists until `BotCallbackService.handleCompletion` calls `clearActiveThread`.
## Topic Lifecycle in Threads
-`handleMention` always treats the message as the start of a new conversation.
-`handleSubscribedMessage` reads `topicId` from `thread.state`. If the topic is stale (`> 4 hours` since `updatedAt`), state is cleared and it retries as a fresh mention.
- If `execAgent` fails with a Postgres FK violation on `topic_id` (cached topic was deleted), the bridge clears state and retries as a mention.
-`subscribe()` is gated by `client.shouldSubscribe(threadId)` — Discord top-level channels return `false` so we don't follow up there.
## Attachments
`AgentBridgeService.extractFiles` resolves attachments in priority order:
1.`att.buffer` — already downloaded by the adapter (WeChat/Feishu inbound).
2.`att.fetchData()` — adapter-provided lazy download with auth (Telegram, Slack, Feishu history). **Required** when URLs are token-protected — naive `fetch(url)` later in `ingestAttachment.ts` has no credentials.
3.`att.url` — public CDN fallback (Discord, public QQ).
`inferMimeType` / `inferName` patch Telegram-style `photo` payloads (no `mimeType`/`name` from Bot API → defaults to `image/jpeg`) so vision models actually see them. Quoted-message attachments are also pulled from `raw.referenced_message.attachments` (Discord).
## Concurrency
`settings.concurrency` is `'queue'` or `'debounce'`:
-`debounce` → Chat SDK debounces inbound messages by `debounceMs`; `mergeSkippedMessages` joins skipped texts/attachments into the current message before handing to the agent.
-`queue` → Chat SDK serializes per-thread; the bridge's own `activeThreads` set is still required because in queue mode the SDK lock releases before the agent finishes.
## Gateway (persistent platforms)
Webhook platforms run fine in serverless functions. Persistent platforms (`websocket`, `polling`) need a long-running listener — that's the **gateway**.
- Iterates registered platforms and starts every enabled persistent provider with `durationMs = 10min`, then in `after(...)` polls `BotConnectQueue` every 30s for new connect requests, until the window expires.
-`getEffectiveConnectionMode(platform, settings)` is the only place that resolves per-provider mode — respect it everywhere.
**`POST /api/agent/gateway/start/route.ts`** is the non-Vercel `ensureRunning` entry point (`Bearer ${KEY_VAULTS_SECRET}`).
**Runtime status** is stored in Redis at `bot:runtime-status:platform:appId` with TTL ≈ `durationMs + 60s`. States: `starting | connected | disconnected | failed | queued`. Updated by each `PlatformClient.start/stop` and by the gateway service.
## Platform Definitions
Each platform exposes a `PlatformDefinition` registered in `platforms/index.ts`:
`schema` drives both server validation (`mergeWithDefaults`, `extractDefaults`) **and** the auto-generated UI form. Top-level keys `applicationId` / `credentials` / `settings` map to DB columns. Common settings fields live in `platforms/const.ts` (`displayToolCallsField`, `serverIdField`, `userIdField`).
Each platform implements `PlatformClient` (see `platforms/types.ts`):
`ClientFactory.validateCredentials` is called from the TRPC `testConnection` mutation — implement it to hit the platform API and return useful per-field errors.
- User-scoped: `create / update / delete / query / findById / findByAgentId / findEnabledByApplicationId`. Credentials are encrypted/decrypted via the injected `KeyVaultsGateKeeper`.
- Static (system-wide): `findByPlatformAndAppId`, `findEnabledByPlatform` — used by webhook routing & gateway sync, since they don't have a user context yet.
Client service: `src/services/agentBotProvider.ts`. Store actions: `src/store/agent/slices/bot/action.ts`. UI: `src/routes/(main)/agent/channel/{list,detail}` — settings form is auto-generated from each platform's `schema`.
## Reply Templates
`src/server/services/bot/replyTemplate.ts` exports `renderStart`, `renderStepProgress`, `renderFinalReply`, `renderError`, `renderStopped`, `splitMessage`. Step progress carries elapsed time, last LLM content, last tools, totals; final reply uses `client.formatMarkdown` then `client.formatReply` (which optionally appends `formatUsageStats`). `splitMessage(text, charLimit)` chunks at paragraph → line → hard cut.
-`const.ts` — `DEFAULT_X_CONNECTION_MODE`, history limits, etc.
-`protocol-spec.md` — protocol notes (every existing platform has one)
2. Pick the right `connectionMode` — webhook is much simpler if the platform supports it.
3. If the platform can't render markdown, set `supportsMarkdown: false` and implement `formatMarkdown` via `stripMarkdown`.
4. If it can't edit messages, set `supportsMessageEdit: false` — `BotCallbackService` will skip step edits and only send the final reply.
5. Implement `validateCredentials` so the UI's "Test connection" button gives useful errors.
6. Add the platform icon in `src/routes/(main)/agent/channel/const.ts` and register the platform in `src/server/services/bot/platforms/index.ts`.
7. Add i18n keys under `channel.*` in `src/locales/default/setting.ts` (or wherever the channel namespace lives) — the schema's `label`/`description`/`placeholder`/`enumLabels` are i18n keys.
- **If reachable** (returns any HTTP status): server is running. Skip to Step 2.
- **If unreachable**: start the server:
```bash
# From cloud repo root
pnpm run dev:next
```
To **restart** (pick up server-side code changes):
```bash
lsof -ti:3011 | xargs kill
pnpm run dev:next
```
**Important:** Server-side code changes in the submodule (`lobehub/src/server/`, `lobehub/packages/`) require a server restart. Next.js hot-reload may not pick up changes in submodule packages.
- **If file exists and contains `"serverUrl": "http://localhost:3011"`**: already authenticated. Skip to Step 3.
- **If file missing or points to wrong server**: login is needed. Ask the user to run:
```bash
! cd lobehub/apps/cli &&LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3011
```
> Login requires interactive browser authorization (OIDC Device Code Flow), so the user must run it themselves via `!` prefix. After login, credentials are saved to `lobehub/apps/cli/.lobehub-dev/` and persist across sessions.
### Step 3: Test with CLI Commands
CLI runs from source (`bun src/index.ts`), so CLI-side code changes take effect immediately without rebuilding.
```bash
cd lobehub/apps/cli
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts <command>
```
### Step 4: Clean Up Test Data
Delete any test data created during verification:
```bash
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts task delete < id > -y
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts agent delete < id > -y
2.**Read images**: If the issue description contains images, MUST use `mcp__linear-server__extract_images` to read image content for full context
3.**Check for sub-issues**: Use`mcp__linear-server__list_issues` with `parentId` filter
4.**Mark as In Progress**: When starting to plan or implement an issue, immediately update status to **"In Progress"** via`mcp__linear-server__update_issue`
5.**Update issue status** when completing: `mcp__linear-server__update_issue`
agent-browser --cdp 9222 snapshot # Explicit CDP port
```
## iOS Simulator (Mobile Safari)
@@ -247,7 +247,7 @@ agent-browser -p ios close
```bash
agent-browser dashboard install
agent-browser dashboard start # Background server on port 4848
agent-browser dashboard start # Background server on port 4848
agent-browser dashboard stop
```
@@ -258,37 +258,43 @@ Use `-p <provider>` to run against cloud browsers: `agentcore`, `browserbase`, `
## Browser Engine Selection
```bash
agent-browser --engine lightpanda open example.com # 10x faster, 10x less memory
agent-browser --engine lightpanda open example.com # 10x faster, 10x less memory
```
## Electron (LobeHub Desktop)
### Setup
### Setup / Teardown
Use the `electron-dev.sh` script to manage the Electron dev environment. It handles process lifecycle, waits for SPA readiness, and reliably kills all child processes (main + helpers + vite).
# Part 2: osascript (Native macOS App Bot Testing)
Use AppleScript via `osascript` to control native macOS desktop apps for bot testing. This works with any app that supports macOS Accessibility, without needing CDP or Chromium.
Use AppleScript via `osascript` to control native macOS desktop apps for bot testing. Works with any app that supports macOS Accessibility, no CDP or Chromium needed.
## Core osascript Patterns
The pattern is the same for every platform:
### Activate an App
1.**Activate** the app (`tell application "X" to activate`)
2.**Navigate** to a channel/chat (Quick Switcher `Cmd+K` or Search `Cmd+F`)
3.**Send** a message (clipboard paste `Cmd+V` + Enter)
4.**Wait** for the bot response
5.**Screenshot** for verification (`screencapture` + `Read` tool)
```bash
osascript -e 'tell application "Discord" to activate'
```
## Per-Platform References
### Type Text
Pick the file for your target platform — each contains activation, navigation, send-message, and verification snippets specific to that app:
```bash
# Type character by character (reliable, but slow for long text)
osascript -e 'tell application "System Events" to keystroke "Hello world"'
**App name:**`微信` or `WeChat` | **Process name:**`WeChat`
### Activate & Navigate
```bash
# Activate WeChat
osascript -e 'tell application "微信" to activate'
sleep 1
# Search for a contact/bot (Cmd+F)
osascript -e '
tell application "System Events"
keystroke "f" using command down
delay 0.5
keystroke "TestBot"
delay 1
key code 36 -- Enter to select
end tell
'
sleep 2
```
### Send Message
```bash
# After navigating to a chat, the input is focused
osascript -e '
tell application "System Events"
keystroke "Hello bot!"
delay 0.3
key code 36
end tell
'
```
### Send Long Message (clipboard)
```bash
osascript -e '
tell application "微信" to activate
delay 0.5
set the clipboard to "Please help me with this task..."
tell application "System Events"
keystroke "v" using command down
delay 0.3
key code 36
end tell
'
```
### Verify Response
```bash
sleep 10
screencapture /tmp/wechat-bot-response.png
```
### WeChat-Specific Notes
- WeChat macOS app name can be `微信` or `WeChat` depending on system language. Try both:
```bash
osascript -e 'tell application "微信" to activate' 2> /dev/null \
|| osascript -e 'tell application "WeChat" to activate'
```
- WeChat uses **Enter** to send (not Cmd+Enter by default, but configurable)
- For multi-line messages without sending, use **Shift+Enter**:
```bash
osascript -e 'tell application "System Events" to key code 36 using shift down'
```
---
## Client: Lark / 飞书
**App name:** `Lark` or `飞书` | **Process name:** `Lark` or `飞书`
### Activate & Navigate
```bash
# Activate Lark (auto-detects Lark or 飞书)
osascript -e 'tell application "Lark" to activate' 2> /dev/null \
|| osascript -e 'tell application "飞书" to activate'
sleep 1
# Quick Switcher / Search (Cmd+K)
osascript -e 'tell application "System Events" to keystroke "k" using command down'
sleep 0.5
osascript -e '
set the clipboard to "bot-testing"
tell application "System Events"
keystroke "v" using command down
delay 1.5
key code 36 -- Enter
end tell
'
sleep 2
```
### Send Message to Bot
```bash
osascript -e '
set the clipboard to "@MyBot help me with this task"
tell application "System Events"
keystroke "v" using command down
delay 0.3
key code 36 -- Enter
end tell
'
```
### Verify Response
```bash
sleep 10
screencapture /tmp/lark-bot-response.png
```
### Lark-Specific Notes
- App name varies: `Lark` (international) vs `飞书` (China mainland) — the script auto-detects
- Uses `Cmd+K` for quick search (same as Discord/Slack)
- Enter sends message by default
---
## Client: QQ
**App name:** `QQ` | **Process name:** `QQ`
### Activate & Navigate
```bash
osascript -e 'tell application "QQ" to activate'
sleep 1
# Search for contact/group (Cmd+F)
osascript -e '
tell application "System Events"
keystroke "f" using command down
delay 0.8
end tell
'
osascript -e '
set the clipboard to "bot-testing"
tell application "System Events"
keystroke "v" using command down
delay 1.5
key code 36 -- Enter
end tell
'
sleep 2
```
### Send Message to Bot
```bash
osascript -e '
set the clipboard to "Hello bot!"
tell application "System Events"
keystroke "v" using command down
delay 0.3
key code 36 -- Enter
end tell
'
```
### Verify Response
```bash
sleep 10
screencapture /tmp/qq-bot-response.png
```
### QQ-Specific Notes
- Enter sends message by default; Shift+Enter for newlines
- Uses `Cmd+F` for search
- Always use clipboard paste for CJK characters
---
## Common Bot Testing Workflow (osascript)
Regardless of platform, the pattern is:
```bash
APP_NAME="Discord" # or "Slack", "Telegram", "微信"
CHANNEL="bot-testing"
MESSAGE="Hello bot!"
WAIT_SECONDS=10
# 1. Activate
osascript -e "tell application \"$APP_NAME\" to activate"
sleep 1
# 2. Navigate to channel/chat (via Quick Switcher or Search)
osascript -e 'tell application "System Events" to keystroke "k" using command down'
sleep 0.5
osascript -e "tell application \"System Events\" to keystroke \"$CHANNEL\""
sleep 1
osascript -e 'tell application "System Events" to key code 36'
sleep 2
# 3. Send message
osascript -e "set the clipboard to \"$MESSAGE\""
osascript -e '
tell application "System Events"
keystroke "v" using command down
delay 0.3
key code 36
end tell
'
# 4. Wait for bot response
sleep "$WAIT_SECONDS"
# 5. Screenshot for verification
screencapture /tmp/"${APP_NAME,,}"-bot-test.png
echo "Result saved to /tmp/${APP_NAME,,}-bot-test.png"
```
### Tips
- **Use clipboard paste** (`Cmd+V`) for messages containing special characters or long text — `keystroke` can mangle non-ASCII
- **Add `delay`** between actions — apps need time to process UI events
- **Screenshot for verification** — use `screencapture` + `Read` tool for visual checks
- **Use a dedicated test channel/chat** — avoid polluting real conversations
- **Check app name** — some apps have different names in different locales (e.g., `微信` vs `WeChat`)
- **Accessibility permissions required** — System Events automation requires granting Accessibility access in System Preferences > Privacy & Security > Accessibility
For **shared osascript patterns** (activate, type, paste, screenshot, read accessibility, common workflow template, gotchas), see [references/osascript-common.md](./references/osascript-common.md). Read this first if you're new to osascript automation.
---
@@ -995,16 +410,18 @@ echo "Result saved to /tmp/${APP_NAME,,}-bot-test.png"
Ready-to-use scripts in `.agents/skills/local-testing/scripts/`:
| `test-discord-bot.sh` | Send message to Discord bot via osascript |
| `test-slack-bot.sh` | Send message to Slack bot via osascript |
| `test-telegram-bot.sh` | Send message to Telegram bot via osascript |
| `test-wechat-bot.sh` | Send message to WeChat bot via osascript |
| `test-lark-bot.sh` | Send message to Lark / 飞书 bot via osascript |
| `test-qq-bot.sh` | Send message to QQ bot via osascript |
### Window Screenshot Utility
@@ -1061,25 +478,16 @@ Each script: activates the app, navigates to the channel/contact, pastes the mes
# Screen Recording
Record automated demos by combining `ffmpeg` screen capture with `agent-browser` automation. The script `.agents/skills/local-testing/scripts/record-electron-demo.sh` handles the full lifecycle for Electron.
### Usage
Record automated demos using `record-app-screen.sh` (start/stop lifecycle, CDP screenshots + ffmpeg assembly). See [references/record-app-screen.md](references/record-app-screen.md) for full documentation.
1. Starts Electron with CDP and waits for SPA to load
2. Detects window position, screen, and Retina scale via Swift/CGWindowList
3. Records only the Electron window region using `ffmpeg -f avfoundation` with crop
4. Runs the demo (built-in or custom script receiving CDP port as `$1`)
5. Stops recording and cleans up
Outputs to `.records/` directory (gitignored): `<name>.mp4` (video) + `<name>/` (screenshots every 3s).
---
@@ -1098,20 +506,11 @@ The script automatically:
### Electron-specific
- **`npx electron-vite dev` must run from `apps/desktop/`** — running from project root fails silently
- **Always use `electron-dev.sh stop` to clean up** — `pkill -f "Electron"` only kills the main process; helper processes (GPU, renderer, network) survive. The script finds and kills all of them via PID matching against the project's electron binary path.
- **`npx electron-vite dev` must run from `apps/desktop/`** — running from project root fails silently. The `electron-dev.sh` script handles this automatically.
- **Don't resize the Electron window after load** — resizing triggers full SPA reload
- **Store is at `window.__LOBE_STORES`** not `window.__ZUSTAND_STORES__`
### osascript
- **Accessibility permission required** — first run will prompt for access; grant it in System Preferences > Privacy & Security > Accessibility for Terminal / iTerm / Claude Code
- **`keystroke` is slow for long text** — always use clipboard paste (`Cmd+V`) for messages over \~20 characters
- **`keystroke` can mangle non-ASCII** — use clipboard paste for Chinese, emoji, or special characters
- **`key code 36` is Enter** — this is the hardware key code, works regardless of keyboard layout
- **`entire contents` is extremely slow** — avoid for complex UIs; use screenshots instead
- **App name varies by locale** — `微信` vs `WeChat`, `企业微信` vs `WeCom`; handle both
- **WeChat Enter sends immediately** — use `Shift+Enter` for newlines within a message
- **Rate limiting** — don't send messages too fast; platforms may throttle or flag automated input
- **Lark / 飞书 app name varies** — `Lark` (international) vs `飞书` (China mainland); scripts auto-detect
- **QQ uses `Cmd+F` for search** — not `Cmd+K` like Discord/Slack/Lark
- **Bot response times vary** — AI-powered bots may take 10-60s; use generous sleep values
See [references/osascript-common.md](./references/osascript-common.md#gotchas) for the full osascript gotchas list (accessibility permissions, `keystroke` non-ASCII issues, locale-specific app names, rate limiting, etc.).
Shared AppleScript / `osascript` patterns used by all platform bot tests. Read this first, then refer to the per-platform file for app-specific quirks.
## Core Patterns
### Activate an App
```bash
osascript -e 'tell application "Discord" to activate'
```
### Type Text
```bash
# Type character by character (reliable, but slow for long text)
osascript -e 'tell application "System Events" to keystroke "Hello world"'
# Press Enter
osascript -e 'tell application "System Events" to key code 36'
# Press Tab
osascript -e 'tell application "System Events" to key code 48'
# Press Escape
osascript -e 'tell application "System Events" to key code 53'
```
### Paste from Clipboard (fast, for long text)
```bash
# Set clipboard and paste — much faster than keystroke for long messages
osascript -e 'set the clipboard to "Your long message here"'
osascript -e 'tell application "System Events" to keystroke "v" using command down'
```
Or in one shot:
```bash
osascript -e '
set the clipboard to "Your long message here"
tell application "System Events" to keystroke "v" using command down
'
```
### Keyboard Shortcuts
```bash
# Cmd+K (quick switcher in Discord/Slack)
osascript -e 'tell application "System Events" to keystroke "k" using command down'
# Cmd+F (search)
osascript -e 'tell application "System Events" to keystroke "f" using command down'
# Cmd+N (new message/chat)
osascript -e 'tell application "System Events" to keystroke "n" using command down'
# Cmd+Shift+K (example: multi-modifier)
osascript -e 'tell application "System Events" to keystroke "k" using {command down, shift down}'
```
### Click at Position
```bash
# Click at absolute screen coordinates
osascript -e '
tell application "System Events"
click at {500, 300}
end tell
'
```
### Get Window Info
```bash
# Get window position and size
osascript -e '
tell application "System Events"
tell process "Discord"
get {position, size} of window 1
end tell
end tell
'
```
### Screenshot
```bash
# Full screen
screencapture /tmp/screenshot.png
# Interactive region select
screencapture -i /tmp/screenshot.png
# Specific window (by window ID from CGWindowList)
# Get all UI elements of the frontmost window (can be slow/large)
osascript -e '
tell application "System Events"
tell process "Discord"
entire contents of window 1
end tell
end tell
'
# Get a specific element's value
osascript -e '
tell application "System Events"
tell process "Discord"
get value of text field 1 of window 1
end tell
end tell
'
```
> **Warning:** `entire contents` can be extremely slow on complex UIs. Prefer screenshots + `Read` tool for visual verification.
### Read Screen Text via Clipboard
For reading the latest message or response from an app:
```bash
# Select all text in the focused area and copy
osascript -e '
tell application "System Events"
keystroke "a" using command down
keystroke "c" using command down
end tell
'
sleep 0.5
# Read clipboard
pbpaste
```
---
## Common Bot Testing Workflow
Regardless of platform, the pattern is:
```bash
APP_NAME="Discord"# or "Slack", "Telegram", "微信"
CHANNEL="bot-testing"
MESSAGE="Hello bot!"
WAIT_SECONDS=10
# 1. Activate
osascript -e "tell application \"$APP_NAME\" to activate"
sleep 1
# 2. Navigate to channel/chat (via Quick Switcher or Search)
osascript -e 'tell application "System Events" to keystroke "k" using command down'
sleep 0.5
osascript -e "tell application \"System Events\" to keystroke \"$CHANNEL\""
sleep 1
osascript -e 'tell application "System Events" to key code 36'
sleep 2
# 3. Send message
osascript -e "set the clipboard to \"$MESSAGE\""
osascript -e '
tell application "System Events"
keystroke "v" using command down
delay 0.3
key code 36
end tell
'
# 4. Wait for bot response
sleep "$WAIT_SECONDS"
# 5. Screenshot for verification
screencapture /tmp/"${APP_NAME,,}"-bot-test.png
echo"Result saved to /tmp/${APP_NAME,,}-bot-test.png"
```
### Tips
- **Use clipboard paste** (`Cmd+V`) for messages containing special characters or long text — `keystroke` can mangle non-ASCII
- **Add `delay`** between actions — apps need time to process UI events
- **Screenshot for verification** — use `screencapture` + `Read` tool for visual checks
- **Use a dedicated test channel/chat** — avoid polluting real conversations
- **Check app name** — some apps have different names in different locales (e.g., `微信` vs `WeChat`)
- **Accessibility permissions required** — System Events automation requires granting Accessibility access in System Preferences > Privacy & Security > Accessibility
---
## Gotchas
- **Accessibility permission required** — first run will prompt for access; grant it in System Preferences > Privacy & Security > Accessibility for Terminal / iTerm / Claude Code
- **`keystroke` is slow for long text** — always use clipboard paste (`Cmd+V`) for messages over \~20 characters
- **`keystroke` can mangle non-ASCII** — use clipboard paste for Chinese, emoji, or special characters
- **`key code 36` is Enter** — this is the hardware key code, works regardless of keyboard layout
- **`entire contents` is extremely slow** — avoid for complex UIs; use screenshots instead
- **App name varies by locale** — `微信` vs `WeChat`, `企业微信` vs `WeCom`; handle both
- **WeChat Enter sends immediately** — use `Shift+Enter` for newlines within a message
- **Rate limiting** — don't send messages too fast; platforms may throttle or flag automated input
- **Lark / 飞书 app name varies** — `Lark` (international) vs `飞书` (China mainland); scripts auto-detect
- **QQ uses `Cmd+F` for search** — not `Cmd+K` like Discord/Slack/Lark
- **Bot response times vary** — AI-powered bots may take 10-60s; use generous sleep values
General-purpose screen recording tool for the Electron app. Captures CDP screenshots as video frames and gallery snapshots, then assembles into an MP4 on stop.
## Why CDP Screenshots Instead of ffmpeg Screen Capture
- **Works on any screen** — CDP screenshots capture the browser viewport directly, so external monitors, Retina scaling, and window positioning are all handled automatically
- **No signal handling issues** — ffmpeg-static (npm) produces corrupt MP4 files when killed (missing moov atom). CDP screenshots avoid this entirely
- **Consistent output** — Screenshots are resolution-independent and don't require crop coordinate calculations
## Commands
```bash
# Start recording (Electron must be running with CDP)
@@ -6,6 +6,9 @@ description: React component development guide. Use when working with React comp
# React Component Writing Guide
- Use antd-style for complex styles; for simple cases, use inline `style` attribute
- **Prefer `createStaticStyles` with `cssVar.*`** (zero-runtime) — module-level, no hook call required
- Only fall back to `createStyles` + `token` when styles genuinely need runtime computation (dynamic props, JS color fns like `readableColor`/`chroma`)
- See `.cursor/docs/createStaticStyles_migration_guide.md` for full pattern
- Use `Flexbox` and `Center` from `@lobehub/ui` for layouts (see `references/layout-kit.md`)
description: Guide for using Recent Data (topics, resources, pages). Use when working with recently accessed items, implementing recent lists, or accessing session store recent data. Triggers on recent data usage or implementation tasks.
user-invocable: false
---
# Recent Data Usage Guide
Recent data (recentTopics, recentResources, recentPages) is stored in session store.
@@ -5,6 +5,14 @@ description: "Version release workflow. Use when the user mentions 'release', 'h
# Version Release Workflow
## Mandatory Companion Skill
For every `/version-release` execution, you MUST load and apply:
-`../microcopy/SKILL.md`
Changelog style guidance is now fully embedded in this skill. Keep release facts unchanged, and only improve structure, readability, and tone.
## Overview
The primary development branch is **canary**. All day-to-day development happens on canary. When releasing, canary is merged into main. After merge, `auto-tag-release.yml` automatically handles tagging, version bumping, creating a GitHub Release, and syncing back to the canary branch.
@@ -150,6 +158,166 @@ All release PR bodies (both Minor and Patch) must include a user-facing changelo
- Weekly Release: See `reference/changelog-example/weekly-release.md`
- DB Migration: See `reference/changelog-example/db-migration.md`
@@ -4,16 +4,27 @@ A changelog reference for database migration release PR bodies.
---
This release includes a **database schema migration**involving **5 new tables** for the Agent Evaluation Benchmark system.
This release includes a **database schema migration**for Agent Evaluation Benchmark. We are adding **5 new tables** so benchmark setup, runs, and run-topic records can be stored in a complete and queryable structure.
- Added 5 new tables: `agent_eval_benchmarks`, `agent_eval_datasets`, `agent_eval_records`, `agent_eval_runs`, `agent_eval_run_topics`
Previously, benchmark-related data lacked a full lifecycle model, whichmade it harder to track evaluation flow from dataset to run results. This migration introduces the missing relational layer so benchmark configuration, execution, and analysis records stay connected.
### Notes for Self-hosted Users
In practical terms, this reduces ambiguity for downstream features and gives operators a cleaner foundation for troubleshooting and reporting.
- The migration runs automatically on application startup
- No manual intervention required
Added tables:
-`agent_eval_benchmarks`
-`agent_eval_datasets`
-`agent_eval_records`
-`agent_eval_runs`
-`agent_eval_run_topics`
## Notes for self-hosted users
- Migration runs automatically during app startup.
- No manual SQL action is required in standard deployments.
- As with any schema release, we still recommend database backup and rollout during a low-traffic window.
The migration owner: @{pr-author} — responsible for this database schema change, reach out for any migration-related issues.
@@ -4,42 +4,47 @@ A real-world changelog reference for weekly patch release PR bodies.
---
This release includes **82 commits** , Key updates are below.
This weekly release includes **82 commits**. The throughline is simple: less friction when moving from idea to execution. Across agent workflows, model coverage, and desktop polish, this release removes several small blockers that used to interrupt momentum.
### New Features and Enhancements
The result is not one headline feature, but a noticeably smoother week-to-week experience. Teams can evaluate agents with clearer structure, ship richer media flows, and spend less time debugging provider and platform edge cases.
- Added **Agent Benchmark** support for more systematic agent performance evaluation.
- Introduced the **video generation** feature end-to-end, including entry points, sidebar "new" badge support, and skeleton loading for topic switching.
- Expanded memory capabilities: support for memory effort/tool permission configuration and improved timeout calculation for memory analysis tasks.
- Added desktop editor support for image upload via file picker.
## Agent workflows and media generation
### Models and Provider Expansion
Previously, some agent evaluation and media generation flows still felt fragmented: setup was manual, discoverability was uneven, and switching between topics could interrupt context. This release adds **Agent Benchmark** support and lands the **video generation** path end-to-end, from entry point to generation feedback.
- Added a new provider: **Straico**.
- Added/updated support for:
- Claude Sonnet 4.6
- Gemini 3.1 Pro Preview
- Qwen3.5 series
- Grok Imagine (`grok-imagine-image`)
- MiniMax 2.5
- Added related i18n copy and model parameter adaptations.
In practice, this means users can discover and run these workflows with fewer detours. Sidebar "new" indicators improve visibility, skeleton loading makes topic switches feel less abrupt, and memory-related controls now behave more predictably under real workload pressure.
### Desktop Improvements
We also expanded memory controls with effort and tool-permission configuration, and improved timeout calculation for memory analysis tasks so longer runs fail less often in production-like usage.
- Improved DMG background assets and desktop release workflow.
## Models and provider coverage
### Stability, Security, and UX Fixes
Provider diversity matters most when teams can adopt new models without rewriting glue code every sprint. This release adds **Straico** and updates support for Claude Sonnet 4.6, Gemini 3.1 Pro Preview, Qwen3.5, Grok Imagine (`grok-imagine-image`), and MiniMax 2.5.
- Fixed multiple video generation pipeline issues: precharge refund handling, webhook token verification, pricing parameter usage, asset cleanup, and type safety.
- Fixed `sanitizeFileName` path traversal risks and added unit tests.
-Fixed MCP media URL generation with duplicated `APP_URL` prefix.
Use these updates to:
-route requests to newly available providers
- test newer model families without custom patching
- keep model parameters and related i18n copy aligned across providers
This keeps model exploration practical: faster evaluation loops, fewer adaptation surprises, and cleaner cross-provider behavior.
## Desktop and platform polish
Desktop receives a set of quality-of-life upgrades that reduce "death by a thousand cuts" moments. We integrated `electron-liquid-glass` for macOS Tahoe and improved DMG background assets and packaging flow for more consistent release output.
The desktop editor now supports image upload from the file picker, which shortens everyday authoring steps and removes one more reason to switch tools mid-task.
## Improvements and fixes
- Fixed multiple video pipeline issues across precharge refund handling, webhook token verification, pricing parameter usage, asset cleanup, and type safety.
- Fixed path traversal risk in `sanitizeFileName` and added corresponding unit tests.
- Fixed MCP media URL generation when `APP_URL` was duplicated in output paths.
- Fixed Qwen3 embedding failures caused by batch-size limits.
- Fixed multiple UI/interaction issues, including mobile header agent selector/topic count, ChatInput scrolling behavior, and tooltip stacking context.
- Fixed several UIinteraction issues, including mobile header agent selector/topic count, ChatInput scrolling behavior, and tooltip stacking context.
- Fixed missing `@napi-rs/canvas` native bindings in Docker standalone builds.
- Improved GitHub Copilot authentication retry behavior and response error handling in edge cases.
if grep -iq "^${ISSUE_AUTHOR}$" .github/maintainers.txt; then
echo "is_team=true" >> "$GITHUB_OUTPUT"
else
echo "is_team=false" >> "$GITHUB_OUTPUT"
fi
- name:Copy triage prompts
run:|
mkdir -p /tmp/claude-prompts
@@ -62,7 +72,7 @@ jobs:
**IMPORTANT**:
- Follow ALL steps in the issue-triage.md guide
- Apply labels according to the guide's rules
- Post a mention comment to the appropriate team member(s) based on team-assignment.md
- ${{ steps.check-team.outputs.is_team == 'true' && 'The issue author is a team member. Do NOT post any @mention comment.' || 'Post a mention comment to the appropriate team member(s) based on team-assignment.md' }}
'You need to maintain the component format of the mdx file; the output text does not need to be wrapped in any code block syntax on the outermost layer.\n'+
@@ -6,7 +6,7 @@ Guidelines for using Claude Code in this LobeHub repository.
- Next.js 16 + React 19 + TypeScript
- SPA inside Next.js with `react-router-dom`
-`@lobehub/ui`, antd for components; antd-style for CSS-in-JS
-`@lobehub/ui`, antd for components; antd-style for CSS-in-JS — **prefer `createStaticStyles` with `cssVar.*`** (zero-runtime); only fall back to `createStyles` + `token` when styles genuinely need runtime computation. See `.cursor/docs/createStaticStyles_migration_guide.md`.
- react-i18next for i18n; zustand for state management
- SWR for data fetching; TRPC for type-safe backend
title: LobeHub Plugin Ecosystem - Functionality Extensions and Development Resources
description: >-
Discover how the LobeHub plugin ecosystem enhances the utility and flexibility
of the LobeHub assistant, along with the development resources and plugin
development guidelines provided.
title: 'Plugin System: Extend Your Agents with Community Skills'
description: LobeHub now supports a plugin ecosystem that lets Agents access real-time information, interact with external services, and handle specialized tasks without leaving the conversation.
tags:
- LobeHub
- Plugins
@@ -13,12 +10,29 @@ tags:
# Supported Plugin System
The LobeHub plugin ecosystem is a significant extension of its core functionalities, greatly enhancing the utility and flexibility of the LobeHub assistant.
LobeHub now supports plugins that extend what your Agents can do. Instead of being limited to built-in capabilities, Agents can now pull live data, interact with external platforms, and handle specialized workflows through community-built extensions.
By leveraging plugins, the LobeHub assistants are capable of accessing and processing real-time information, such as searching online for data and providing users with timely and relevant insights.
## Access real-time information
Moreover, these plugins are not solely limited to news aggregation; they can also extend to other practical functionalities, such as quickly retrieving documents, generating images, obtaining data from various platforms such as Bilibili and Steam, and interacting with an array of third-party services.
Previously, conversations were limited to the knowledge cutoff of the underlying model. Now, with plugins like web search, your Agents can fetch current information—news, documentation, stock prices, or weather—right when you need it.
To learn more, please refer to the [Plugin Usage](/en/docs/usage/plugins/basic). Additionally, quality voice options (OpenAI Audio, Microsoft Edge Speech) are available to cater to users from different regions and cultural backgrounds. Users can select suitable voices based on personal preferences or specific situations, providing a personalized communication experience.
Use plugins to:
- Run web searches and get up-to-date answers
- Query documentation sites and technical references
- Retrieve platform data from services like Bilibili or Steam
- Generate images on demand during a conversation
## Community-powered flexibility
The plugin system is designed to grow with community contributions. Developers can build and share custom plugins that add new capabilities to any Agent. Users simply enable the plugins they need for their specific workflow.
This means your Agents become more specialized over time. A coding assistant might enable documentation search and code execution plugins. A creative assistant might use image generation and content research tools. The same underlying Agent adapts to different contexts through its enabled plugins.
## Voice options for natural interaction
Alongside plugin capabilities, LobeHub now offers quality voice synthesis options including OpenAI Audio and Microsoft Edge Speech. Choose a voice that matches your preference or scenario for more personalized interactions.
Learn more about plugin usage in our [documentation](/en/docs/usage/plugins/basic).
LobeHub supports various large language models with visual recognition
capabilities, allowing users to upload or drag and drop images. The assistant
will recognize the content and engage in intelligent dialogue, creating a more
intelligent and diverse chat environment.
title: 'Visual Recognition: Chat With Images, Not Just Text'
description: LobeHub now supports multimodal models including GPT-4 Vision, Google Gemini Pro Vision, and GLM-4 Vision. Upload or drag images into conversations and your Agent will understand and respond to visual content.
tags:
- Visual Recognition
- LobeHub
@@ -17,6 +11,25 @@ tags:
# Supported Models for Visual Recognition
LobeHub now supports several large language models with visual recognition capabilities, including OpenAI's [`gpt-4-vision`](https://platform.openai.com/docs/guides/vision), Google Gemini Pro vision, and Zhiyuan GLM-4 Vision. This empowers LobeHub with multimodal interaction capabilities. Users can effortlessly upload images or drag and drop them into the chat window, where the assistant can recognize the image content and engage in intelligent dialogue, building a smarter and more diverse chat experience.
Conversations in LobeHub are no longer limited to text. We now support several large language models with visual recognition capabilities, including OpenAI's [`gpt-4-vision`](https://platform.openai.com/docs/guides/vision), Google Gemini Pro Vision, and Zhiyuan GLM-4 Vision.
This feature opens up new avenues for interaction, allowing communication that extends beyond text to include rich visual elements. Whether sharing images during everyday use or interpreting graphics in specific industries, the assistant delivers an exceptional conversational experience. Additionally, we have carefully selected a range of high-quality voice options (OpenAI Audio, Microsoft Edge Speech) to cater to users from different regions and cultural backgrounds. Users can choose a suitable voice based on personal preferences or specific contexts, thus receiving a more personalized communication experience.
## Share images naturally
Upload an image or drag it directly into the chat window, and your Agent can understand the visual content and continue the discussion in context. This works for screenshots, photos, diagrams, or any visual reference you need to share.
This brings a more natural multimodal experience to both everyday and professional scenarios:
- Share photos from your day and discuss them
- Upload UI screenshots for design feedback
- Share diagrams and get explanations
- Reference visual content without describing it in words
## Context-aware visual understanding
The assistant doesn't just see the image—it understands it within the ongoing conversation. Ask follow-up questions about specific details, compare multiple images, or use visuals as reference material for complex discussions.
For specialized fields, this means clearer context and more practical responses. Medical imaging discussions, architectural reviews, or technical diagram analysis all become more natural when both parties can see the same visual reference.
## Voice options for personalized interaction
To better serve users across regions and preferences, we've also added quality voice options from OpenAI Audio and Microsoft Edge Speech. Choose a voice that fits your style or scenario for more personalized interactions.
LobeHub supports Text-to-Speech (TTS) and Speech-to-Text (STT) technologies,
offering high-quality voice options for a personalized communication
experience. Learn more about Lobe TTS Toolkit.
title: 'Voice Conversations: Talk Naturally With Your Agents'
description: LobeHub now supports Text-to-Speech (TTS) and Speech-to-Text (STT), enabling natural voice interactions. Speak with your Agents and hear responses in clear, personalized voices.
tags:
- TTS
- STT
@@ -14,6 +11,24 @@ tags:
# Supporting TTS & STT Voice Conversations
LobeHub supports Text-to-Speech (TTS) and Speech-to-Text (STT) technologies, allowing our application to transform textual information into clear voice output. Users can interact with our conversational agents as if they were talking to a real person. There are various voice options for users to choose from, providing the right audio source for their assistant. Additionally, for those who prefer auditory learning or seek to gain information while on the go, TTS offers an excellent solution.
LobeHub now supports Text-to-Speech (TTS) and Speech-to-Text (STT), turning typed conversations into natural voice interactions. You can speak with your Agents and hear their responses, making the experience closer to talking with a real person.
In LobeHub, we have carefully curated a selection of high-quality voice options (OpenAI Audio, Microsoft Edge Speech) to cater to users from different regions and cultural backgrounds. Users can select suitable voices based on personal preferences or specific scenarios, thus achieving a personalized communication experience.
## Natural voice interaction
With TTS, your Agents can read responses aloud in clear, natural-sounding voices. With STT, you can dictate messages instead of typing. Together, they enable hands-free interaction—useful when you're multitasking, on the move, or simply prefer speaking to typing.
This is especially helpful for:
- Auditory learners who process information better by hearing
- Users who want to stay productive while commuting or away from a keyboard
- Anyone who finds voice more accessible or convenient than text
## Personalized voice selection
Different Agents can have different voices. Choose a voice that matches each Agent's personality or purpose. A professional assistant might use a calm, measured tone. A creative collaborator might sound more expressive.
We've curated high-quality voices from OpenAI Audio and Microsoft Edge Speech to serve users across regions and preferences. Select the voice that fits your usage style or scenario.
## A complete communication loop
Voice support closes the gap between human and AI interaction styles. Speak naturally, hear responses aloud, and maintain context just like you would in a spoken conversation. The rest of LobeHub's features—plugins, multimodal support, context management—work seamlessly alongside voice mode.
LobeHub now supports the latest text-to-image generation technology, allowing
users to directly invoke the text-to-image tool during conversations with the
assistant for creative purposes. By utilizing AI tools such as DALL-E 3,
MidJourney, and Pollinations, assistants can turn your ideas into images,
making the creative process more intimate and immersive.
title: 'Text-to-Image: Create Visuals Directly in Chat'
description: LobeHub now supports text-to-image generation. Invoke DALL-E 3, MidJourney, or Pollinations directly during conversations to turn your ideas into images without leaving the chat.
tags:
- Text-to-Image
- LobeHub
@@ -16,4 +11,18 @@ tags:
# Support for Text-to-Image Generation
The latest text-to-image generation technology is now supported, enabling LobeHub users to directly use the text-to-image tool during conversations with their assistant. By harnessing the capabilities of AI tools like [`DALL-E 3`](https://openai.com/dall-e-3), [`MidJourney`](https://www.midjourney.com/), and [`Pollinations`](https://pollinations.ai/), assistants can now transform your ideas into images. Thisallows for a more intimate and immersive creative process.
LobeHub now supports text-to-image generation, so you can create images directly while chatting with your Agents. With tools like [`DALL-E 3`](https://openai.com/dall-e-3), [`MidJourney`](https://www.midjourney.com/), and [`Pollinations`](https://pollinations.ai/), your Agents can turn your descriptions into visuals within the same conversation flow.
## Creative workflow without switching tools
Previously, generating AI images meant leaving your conversation, opening a separate tool, writing a prompt, waiting for results, then copying the image back. Now you simply describe what you want, and your Agent produces the image right there in the chat.
This keeps creative momentum flowing. Iterate on ideas quickly—request adjustments, explore variations, or refine descriptions—all without context switching. The conversation history maintains your creative direction, so you can reference previous ideas and build on them.
## Private and immersive creation
Image generation happens within your existing conversation, keeping your creative process contained and private. No need to manage separate accounts or jump between platforms. Your prompts, iterations, and final images stay in one place, organized alongside the rest of your discussion.
## Multiple generation options
Different tools excel at different styles. DALL-E 3 offers detailed, precise renders. MidJourney produces artistic, atmospheric results. Pollinations provides fast, accessible generation. Your Agent can help you choose and use the right tool for each creative task.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.