🐛 fix(memory): keep list after editing memory

🐛 fix(chat): preserve subAgentId/documentId in message bucket key context (#15865)
* 🐛 fix(chat): preserve subAgentId/documentId in message bucket key context

  `replaceMessages` and `internal_getConversationContext` rebuilt the conversation context with a hand-picked field whitelist, silently dropping `subAgentId` (and others). Since `messageMapKey` uses `subAgentId` as the group_agent scope subTopicId, group-agent writes collapsed into the wrong bucket.

  Spread the whole context instead and only special-case the fields that need a fallback/assertion (agentId, topicId), so every bucket-key field carries through.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* ✅ test(database): deterministic ordering in topic.duplicate test

  Both seed messages were inserted in one transaction with no explicit createdAt, so they shared the same `now()` default. `duplicate`'s `orderBy(createdAt)` then returned the tied rows in arbitrary order, making the positional assertions flaky.

  Give them distinct createdAt (user before assistant) so the order is well-defined.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:36:07 +00:00 · 2026-06-15 14:08:40 +08:00 · 2026-06-15 13:57:09 +08:00 · 2026-06-15 13:56:52 +08:00 · 2026-06-15 13:44:55 +08:00 · 2026-06-15 13:11:21 +08:00
1121 changed files with 36562 additions and 15362 deletions
@@ -19,9 +19,23 @@ also run as full cloud automation. Every test session follows the same
 five-step contract:

 ```
-Step 0: Env + Auth  →  Step 1: Pick surface  →  Step 2: Run  →  Step 3: Structured report
+Step -1: Plan approval  →  Step 0: Env + Auth  →  Step 1: Pick surface  →  Step 2: Run  →  Step 3: Structured report
 ```

+## Step -1 — Plan approval for non-trivial tests
+
+Skip directly to Step 0 if: the test is a single re-run after a fix, the plan
+was already agreed on, or the user gave exact commands.
+
+Otherwise, propose a test plan (surface, cases, expected evidence, assumptions)
+and use the runtime structured question tool (`request_user_input` /
+ask-user-question equivalent) with two fixed choices:
+
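+1. `开始执行 (Recommended)` ("Start execution"): the test plan looks fine, so start executing.
+2. `先讨论下` ("Discuss first"): the plan has problems, so discuss it first.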
+
+Wait for the user's choice before proceeding.
+
 ## Step 0 — Environment setup + auth check (mandatory)

 Step 0 is about getting the environment ready: **dependencies are healthy**
@@ -29,6 +43,36 @@ and **auth is green**. A test run that dies halfway on a missing dependency or
 a login wall wastes the whole session — clear both gates BEFORE writing a
 single test step.

+### 0.0 Resolve the current test environment
+
+Before starting a dev server, checking auth, opening agent-browser, or writing
+test steps, print and confirm the current local test environment:
+
+```bash
+./.agents/skills/agent-testing/scripts/test-env.sh
+```
+
+This command is the source of truth for local test ports. It reads the current
+shell plus `.env` files using the same precedence as `scripts/runWithEnv.mts`,
+then prints:
+
+- `APP_URL`
+- `PORT`
+- `SERVER_URL`
+- `AUTH_TRUSTED_ORIGINS`
+- `SPA_PORT`
+- `MOBILE_SPA_PORT`
+- `DESKTOP_PORT`
+
+For commands that need these values, export them from the same resolver:
+
+```bash
+eval "$(./.agents/skills/agent-testing/scripts/test-env.sh --exports)"
+```
+
+Do not rely on hard-coded port tables. If the printed values do not match the
+running dev server, fix/export the env first, then continue.
+
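The "printed values must match the running dev server" rule in section 0.0 can be approximated with a small consistency check. This is a minimal sketch, not part of `test-env.sh`: the helper names `port_from_url` and `env_ports_agree` are hypothetical, and it assumes `APP_URL` carries an explicit port that should equal `PORT`.

```bash
# Hypothetical helper: extract the port from a URL such as http://localhost:3010/chat.
# Returns non-zero when the URL has no explicit port. IPv6 literals are not handled.
port_from_url() {
  local hostport
  hostport="${1#*://}"        # drop the scheme
  hostport="${hostport%%/*}"  # drop any path
  case "$hostport" in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *) return 1 ;;
  esac
}

# Hypothetical check: succeeds only when the port inside APP_URL equals PORT.
env_ports_agree() {
  local url_port
  url_port="$(port_from_url "$1")" || return 1
  [ "$url_port" = "$2" ]
}
```

After `eval "$(... test-env.sh --exports)"`, a call such as `env_ports_agree "$APP_URL" "$PORT" || echo "env mismatch" >&2` would flag a stale export before any test step is written.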
 ### 0.1 Dependencies are installed — root AND standalone apps

 The root pnpm workspace does **NOT** cover every app: `pnpm-workspace.yaml`
@@ -38,9 +82,9 @@ lists `packages/**`, `e2e`, `apps/server`, and only `apps/desktop/src/main` —
 refresh them, so install in every app the test will touch:

 ```bash
-pnpm install                      # root workspace
-cd apps/desktop && pnpm install   # Electron surface
-cd apps/cli && pnpm install       # CLI surface
+pnpm install                    # root workspace
+cd apps/desktop && pnpm install # Electron surface
+cd apps/cli && pnpm install     # CLI surface
 ```

 Symptom of a stale standalone install: the build/launch fails to resolve a
@@ -55,27 +99,129 @@ directory — a script launched while `cwd` is `apps/desktop` fails with
 `No such file or directory`. Verify `pwd` is the repo root before launching
 long-running scripts.

-### 0.3 Auth is green
+### 0.3 Init local dev env without `.env`

-**Auth is the gate for all automated testing.**
+For Web smoke against local code, start a **normal local dev environment**.
+First check the repo root for `.env`:
+
+- If `.env` exists, use the existing local configuration and start the dev
+  server normally.
+- If `.env` does not exist, use the agent-testing env bootstrap.
+
+Do not start the standalone e2e server as the product under test.
+
+Use `scripts/init-dev-env.sh`. It follows the e2e setup pattern — Postgres,
+migrations, auth/key-vault/S3 test env, seed user — but it is owned by this
+skill and starts the repo's dev server (`pnpm run dev:next` / `bun run dev`),
+not `e2e/scripts/setup.ts --start`. The script hard-blocks when root `.env`
+exists, so it cannot accidentally override a user's local config. When `.env`
+exists, do not call any `init-dev-env.sh` subcommand.
+
+Decision flow:

 ```bash
-./.agents/skills/agent-testing/scripts/setup-auth.sh status
+if [[ -f .env ]]; then
+  bun run dev
+else
+  ./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
+  ./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+  ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
+fi
 ```

-| Surface  | Mechanism                                         | One-key path                   | Standard check                 |
-| -------- | ------------------------------------------------- | ------------------------------ | ------------------------------ |
-| CLI      | OIDC Device Code Flow (`apps/cli/.lobehub-dev`)   | `setup-auth.sh cli`            | `setup-auth.sh status`         |
-| Web      | better-auth cookie injection into `agent-browser` | `pbpaste \| setup-auth.sh web` | `setup-auth.sh web-verify`     |
-| Electron | App's own persistent login state                  | Log in once in the app         | `app-probe.sh auth`            |
-| Bot      | Native apps already logged in                     | —                              | per-platform screenshot        |
+Bootstrap flow when no `.env` exists:
+
+```bash
+# From repo root. Managed DB flow requires Docker Desktop.
+./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
+```
+
+If using an existing Postgres instead of the managed Docker DB, set
+`DATABASE_URL` and skip `setup-db`:
+
+```bash
+DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate
+DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
+```
+
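The three bootstrap branches described above (root `.env` present, existing Postgres via `DATABASE_URL`, managed Docker DB) can be summarized as one planning function. This is a sketch under assumptions: `plan_bootstrap` is a hypothetical name, and it only prints the commands the text lists rather than running them.

```bash
# Hypothetical planner: print the commands for the current situation.
#   $1 = repo root, $2 = optional DATABASE_URL of an existing Postgres
plan_bootstrap() {
  local repo_root="$1" database_url="${2:-}"
  if [ -f "$repo_root/.env" ]; then
    # Existing local config: never call init-dev-env.sh.
    echo "bun run dev"
  elif [ -n "$database_url" ]; then
    # Existing Postgres: skip setup-db, run migrations instead.
    printf 'init-dev-env.sh %s\n' migrate seed-user dev
  else
    # Managed Docker DB (requires Docker Desktop).
    printf 'init-dev-env.sh %s\n' setup-db seed-user dev
  fi
}
```

Printing the plan first keeps the `.env` hard-block visible to the reader; the real script enforces that block on its own.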
+For backend-only checks, `dev-next` is available, but Web smoke needs the
+full-stack `dev` command so Next can proxy the SPA HTML from Vite:
+
+```bash
+./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
+```
+
+Useful subcommands:
+
+```bash
+./.agents/skills/agent-testing/scripts/init-dev-env.sh env       # print exports
+./.agents/skills/agent-testing/scripts/init-dev-env.sh write     # write .records/env/agent-testing-dev.env
+./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate   # migrations only
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user # seed user + CLI API key
+./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash    # local QStash for workflow paths
+./.agents/skills/agent-testing/scripts/init-dev-env.sh clean-db  # remove managed DB container
+```
+
+Default script env:
+
+- `APP_URL=http://localhost:3010`
+- `DATABASE_URL=postgresql://postgres:postgres@localhost:5433/postgres`
+- `DATABASE_DRIVER=node`
+- `FEATURE_FLAGS=-agent_self_iteration` so local smoke does not require QStash
+- Local QStash defaults (`QSTASH_URL`, `QSTASH_TOKEN`, signing keys) are exported;
+  run `init-dev-env.sh qstash` in a separate terminal when the path under test
+  triggers QStash/Workflow.
+- `KEY_VAULTS_SECRET`, `AUTH_SECRET`, auth verification off
+- S3 mock vars
+- Managed DB container: `lobehub-agent-testing-postgres`
+
+`seed-user` creates `agent-testing@lobehub.com` / `TestPassword123!` with
+onboarding already completed, plus a local API key in
+`.records/env/agent-testing-cli.env` for CLI automation. When running Cucumber
+against this dev server, pass the same script env into the test process too;
+Cucumber has its own `BeforeAll` seed path and it must see `DATABASE_URL`
+instead of silently skipping setup:
+
+```bash
+cd e2e
+# Only in the no-.env branch.
+eval "$(../.agents/skills/agent-testing/scripts/init-dev-env.sh env)"
+BASE_URL=http://localhost:3010 HEADLESS=true bun run test:smoke
+```
+
+### 0.4 Auth is green for the selected surface
+
+**Auth is the gate for automated testing, but the gate is surface-scoped.**
+Pick the intended surface first when it is already clear from the task, then
+check only that surface. Do not block a Web test on CLI device-code auth or an
+Electron login state unless the test spans those surfaces.
+
+```bash
+./.agents/skills/agent-testing/scripts/setup-auth.sh status --surface web
+```
+
+Use `status` with no `--surface` only for cross-surface test plans.
+
+| Surface  | Mechanism                                     | One-key path             | Standard check                            |
+| -------- | --------------------------------------------- | ------------------------ | ----------------------------------------- |
+| CLI      | Seeded API key, device-code fallback          | `setup-auth.sh cli-seed` | `setup-auth.sh status --surface cli`      |
+| Web      | Seeded better-auth login into `agent-browser` | `setup-auth.sh web-seed` | `setup-auth.sh status --surface web`      |
+| Electron | App's own persistent login state              | Log in once in the app   | `setup-auth.sh status --surface electron` |
+| Bot      | Native apps already logged in                 | —                        | per-platform screenshot                   |

 Login-state checks are standardized — do NOT hand-roll `window.__LOBE_STORES`
 eval snippets; use `scripts/app-probe.sh auth` (returns `{ isSignedIn, userId }`,
 works for Electron CDP and web sessions via `AB_TARGET`).

-If `status` is not all green, fix auth first (the steps that need a human must be
-requested from the user explicitly). Full background and failure modes:
+For Web tests, the test surface is always `agent-browser --session lobehub-dev`.
+Use `setup-auth.sh web-seed` first in the seeded local env. The user's normal
+Chrome is only a source for copying the Cookie header when seed auth is not
+available or `status --surface web` still fails. If Chrome is already logged in,
+do not open a login page; verify agent-browser first, then request the Network
+`Cookie:` header only if that verification fails. Full background and failure modes:
 [references/auth.md](./references/auth.md).
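
The surface-scoped gate in the table above maps each surface to one standard check. A minimal sketch of that mapping follows; `auth_check_cmd` is a hypothetical helper, and the `all` keyword for cross-surface plans is an assumption of this example, not a flag of `setup-auth.sh`.

```bash
# Hypothetical lookup: print the standard auth check for one surface.
auth_check_cmd() {
  case "$1" in
    cli|web|electron) echo "setup-auth.sh status --surface $1" ;;
    bot) echo "per-platform screenshot" ;;
    all) echo "setup-auth.sh status" ;;  # cross-surface test plans only
    *) echo "unknown surface: $1" >&2; return 2 ;;
  esac
}
```

For example, `auth_check_cmd web` prints the Web-only check, so a Web test is not blocked on CLI or Electron login state.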

 ## Step 1 — Pick the surface by change scope
@@ -148,17 +294,19 @@ Surface guides above carry the detailed workflows. Shared infrastructure:

 All under `.agents/skills/agent-testing/scripts/`:

-| Script                    | Usage                                                                          |
-| ------------------------- | ------------------------------------------------------------------------------ |
-| `setup-auth.sh`           | One-stop auth setup & status check (`status` / `cli` / `web`)                  |
-| `app-probe.sh`            | LobeHub app probes: `auth` / `route` / `ops` / `goto <path>` / `errors`        |
-| `record-gif.sh`           | Frame-sequence → GIF for time-based behavior (streaming, timers, animations)   |
-| `report-init.sh`          | Scaffold a structured test report (Step 3)                                     |
-| `electron-dev.sh`         | Manage Electron dev env (start/stop/status/restart, CDP 9222)                  |
-| `capture-app-window.sh`   | Screenshot a specific app window (general; used by bot tests)                  |
-| `record-app-screen.sh`    | Record app screen (video + periodic screenshots)                               |
-| `record-electron-demo.sh` | Record Electron app demo with ffmpeg                                           |
-| `agent-gateway/`          | Gateway probe / dump / analyze tools                                           |
+| Script                    | Usage                                                                        |
+| ------------------------- | ---------------------------------------------------------------------------- |
+| `test-env.sh`             | Print/export the resolved local test env and ports                           |
+| `setup-auth.sh`           | Auth setup & status (`status` / `cli-seed` / `web-seed` / `cli` / `web`)     |
+| `init-dev-env.sh`         | Self-contained local dev env (`setup-db` / `seed-user` / `dev-next` / `dev`) |
+| `app-probe.sh`            | LobeHub app probes: `auth` / `route` / `ops` / `goto <path>` / `errors`      |
+| `record-gif.sh`           | Frame-sequence → GIF for time-based behavior (streaming, timers, animations) |
+| `report-init.sh`          | Scaffold a structured test report (Step 3)                                   |
+| `electron-dev.sh`         | Manage Electron dev env (start/stop/status/restart, CDP 9222)                |
+| `capture-app-window.sh`   | Screenshot a specific app window (general; used by bot tests)                |
+| `record-app-screen.sh`    | Record app screen (video + periodic screenshots)                             |
+| `record-electron-demo.sh` | Record Electron app demo with ffmpeg                                         |
+| `agent-gateway/`          | Gateway probe / dump / analyze tools                                         |

 `app-probe.sh` is the LobeHub-specific fast path into app state — auth check,
 current route, running operations, and `goto <path>` quick navigation
@@ -174,12 +322,13 @@ not a chat-only summary. Scaffold it up front and fill it as you test:
 ```bash
 DIR=$(./.agents/skills/agent-testing/scripts/report-init.sh my-feature "Verify my feature")
 # ... test, saving screenshots / CLI transcripts into $DIR/assets/ ...
-# fill $DIR/report.md (case table, embedded evidence, verdict) and $DIR/result.json
+# fill $DIR/report.md (scope, case table with inline evidence, verdict, score) and $DIR/result.json
 ```

 Reports live in `.records/reports/<timestamp>-<slug>/` (gitignored): `report.md`
-(human-readable, with embedded screenshots), `result.json` (machine-readable
-pass/fail + score), `assets/` (evidence). Format spec and evidence rules:
+(human-readable, with screenshots/GIFs embedded directly in the case table),
+`result.json` (machine-readable pass/fail + score), `assets/` (evidence).
+Format spec and evidence rules:
 [references/report.md](./references/report.md).

 Two hard rules worth front-loading:
@@ -187,6 +336,21 @@ Two hard rules worth front-loading:
 - **Report language = the user's conversation language.** Write the ENTIRE
  `report.md` (headings included) in the language the user is conversing in —
  no mixed English. `result.json` keys/status values stay English.
+- **The case table is the main reading surface.** Prefer the compact
+  `# | case | result | key observation | evidence` shape and embed the
+  screenshot/GIF in the evidence cell. Use separate evidence sections only for
+  long CLI transcripts, HAR summaries, or supplemental detail.
+- **Visual evidence must render inline.** Screenshots and GIFs in `report.md`
+  must use Markdown image syntax like `![case 1](assets/case1.png)`. Do not
+  use bare file paths, Markdown links, or local file links as the primary
+  visual evidence; those make the report unreadable without opening each asset.
+- **Final replies must include visual evidence links.** When a run includes UI
+  screenshots or GIFs, include the report directory and the most important
+  visual artifacts in the final chat response. Each item must include a stable
+  label, an evidence caption describing the observed UI outcome, and a
+  repo-relative path, for example:
+  `[Image #1 - error toast shows provider auth failure](<report-dir>/assets/foo.png)`.
+  Use repo-relative paths, not absolute paths.
 - **Time-based behavior needs a GIF, not a screenshot.** If a case asserts
  change over time (streaming output, a ticking timer, loading states,
  animations), record it with `scripts/record-gif.sh` and embed the GIF —
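
The "visual evidence must render inline" rule in this hunk lends itself to a quick lint. The following is a rough sketch, assuming evidence files live under `assets/` and end in `.png` or `.gif`; `find_non_inline_evidence` is a hypothetical helper and is not part of `report-init.sh`.

```bash
# Hypothetical lint: list report lines that reference an image asset
# without Markdown image syntax (bare paths or plain [text](path) links).
# Exit status is 0 when offenders are found, 1 when the report is clean.
find_non_inline_evidence() {
  # 1) strip every ![alt](path) occurrence, 2) flag any asset path left over.
  # Line numbers stay accurate because sed never deletes a line.
  sed -E 's/!\[[^]]*\]\([^)]*\)//g' "$1" | grep -nE 'assets/[^ )]*\.(png|gif)'
}
```

Running it as `find_non_inline_evidence "$DIR/report.md"` before writing the final reply gives a list of cells to convert to `![...](...)` form.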
@@ -13,17 +13,18 @@ flakiness.

 ## Prerequisites

-| Requirement  | Details                                                                           |
-| ------------ | --------------------------------------------------------------------------------- |
-| Dev server   | `localhost:3010` — see [../references/dev-server.md](../references/dev-server.md) |
+| Requirement  | Details                                                                                                                                        |
+| ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| Dev server   | `localhost:3010` — see [../references/dev-server.md](../references/dev-server.md)                                                              |
 | CLI source   | `apps/cli/` — runs from source, no rebuild; standalone `node_modules` — run `pnpm install` inside `apps/cli/` (root install does not cover it) |
-| CLI dev mode | `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated credentials                          |
-| Auth         | Device Code Flow login — see [../references/auth.md](../references/auth.md)       |
+| CLI dev mode | `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated settings                                                                                          |
+| Auth         | Seeded API key first; Device Code Flow only as fallback — see [../references/auth.md](../references/auth.md)                                   |

 All CLI dev commands run from `apps/cli/`. Subsequent examples use `$CLI`:

 ```bash
-CLI="LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts"
+source ../../.records/env/agent-testing-cli.env
+CLI="bun src/index.ts"
 ```

 ## Workflow
@@ -39,14 +40,23 @@ check, start, and restart commands. Server-side code changes require a restart.
 ./.agents/skills/agent-testing/scripts/setup-auth.sh status
 ```

-If the CLI is not logged in, **the user must run the login themselves**
-(interactive browser authorization):
+If the CLI is not ready in the seeded local environment:
+
+```bash
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+source .records/env/agent-testing-cli.env
+./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
+```
+
+If the target environment is not seeded, use the interactive fallback:

 ```bash
 cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
 ```

-Credentials persist in `apps/cli/.lobehub-dev/`. Details:
+Seeded API-key auth does not store credentials. It writes local settings under
+`$HOME/.lobehub-dev` and requires the generated env file to be sourced before
+CLI commands. Details:
 [../references/auth.md](../references/auth.md).

 ### Step 3 — Test with CLI commands
@@ -133,10 +143,10 @@ $CLI provider test <provider-id>

 ## Troubleshooting

-| Issue                       | Solution                                        |
-| --------------------------- | ----------------------------------------------- |
-| `No authentication found`   | Run `login --server http://localhost:3010`      |
-| `UNAUTHORIZED` on API calls | Token expired; re-run login                     |
-| `ECONNREFUSED`              | Dev server not running — see dev-server.md      |
-| CLI shows old data/behavior | Server needs restart to pick up code changes    |
-| Login opens wrong server    | Must use `--server` flag (env var doesn't work) |
+| Issue                       | Solution                                                                                               |
+| --------------------------- | ------------------------------------------------------------------------------------------------------ |
+| `No authentication found`   | Source `.records/env/agent-testing-cli.env`, or run device-code `login --server http://localhost:3010` |
+| `UNAUTHORIZED` on API calls | Re-run `init-dev-env.sh seed-user` and re-source the env file; for device-code fallback, re-run login  |
+| `ECONNREFUSED`              | Dev server not running — see dev-server.md                                                             |
+| CLI shows old data/behavior | Server needs restart to pick up code changes                                                           |
+| Login opens wrong server    | Must use `--server` flag (env var doesn't work)                                                        |
@@ -1,37 +1,72 @@
 # Auth Setup for Local Agent Testing

-**Auth is the gate for all automated testing.** Prepare and verify it before
-writing any test step. The one-stop entry point is:
+**Auth is the gate for all automated testing.** Complete
+[Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) first so
+`SERVER_URL` and ports are resolved, then verify auth before writing any test
+step.
+
+Initialize helpers first:

 ```bash
-SCRIPT=".agents/skills/agent-testing/scripts/setup-auth.sh"
-
-$SCRIPT status        # check server + CLI + web auth readiness
-$SCRIPT cli           # interactive CLI device-code login (must be run by the user)
-pbpaste | $SCRIPT web # inject a copied Cookie header into the agent-browser session
-$SCRIPT web-verify    # live-check that the agent-browser session is authenticated
+SCRIPT="./.agents/skills/agent-testing/scripts/setup-auth.sh"
+TEST_ENV="./.agents/skills/agent-testing/scripts/test-env.sh"
+eval "$($TEST_ENV --exports)"
 ```

-`SERVER_URL` defaults to `http://localhost:3010` (this repo's `dev:next` port).
-Override it when testing against another server (e.g. `SERVER_URL=http://localhost:3011`
-in the cloud repo).
+Quick reference after initialization:
+
+| Command                        | Purpose                                            |
+| ------------------------------ | -------------------------------------------------- |
+| `$SCRIPT status`               | Check all surfaces (server + CLI + web + Electron) |
+| `$SCRIPT status --surface web` | Check only the Web surface gate                    |
+| `$SCRIPT cli-seed`             | Configure CLI API-key auth from the seeded key     |
+| `$SCRIPT cli`                  | Interactive CLI device-code login (user must run)  |
+| `$SCRIPT open-chrome`          | Open Chrome at `SERVER_URL` with DevTools          |
+| `$SCRIPT web-seed`             | Sign in the seeded user and inject cookies         |
+| `pbpaste \| $SCRIPT web`       | Inject a copied Cookie header into agent-browser   |
+| `$SCRIPT web-verify`           | Live-check agent-browser session auth              |
+
+Use `localhost` for Web auth; better-auth cookies are stored for `localhost`,
+not `127.0.0.1`.

 ## Per-surface overview

-| Surface  | Mechanism                                | Persistence                                                       | Human interaction                               |
-| -------- | ---------------------------------------- | ----------------------------------------------------------------- | ----------------------------------------------- |
-| CLI      | OIDC Device Code Flow                    | `apps/cli/.lobehub-dev/settings.json`                             | Yes — browser authorization, every token expiry |
-| Web      | better-auth cookie injection             | `~/.lobehub-agent-testing/web-state.json` + agent-browser session | Copy the Cookie header once per token rotation  |
-| Electron | App's own login state                    | Electron user-data dir                                            | Log in once manually in the app                 |
-| Bot      | Native apps (Discord/WeChat/…) logged in | Each app's own session                                            | Once per app                                    |
+| Surface  | Mechanism                                | Persistence                                                       | Human interaction                              |
+| -------- | ---------------------------------------- | ----------------------------------------------------------------- | ---------------------------------------------- |
+| CLI      | Seeded API key or OIDC Device Code Flow  | `.records/env/agent-testing-cli.env` + `$HOME/.lobehub-dev`       | No for seed path; yes for device-code fallback |
+| Web      | Seeded better-auth login or cookie copy  | `~/.lobehub-agent-testing/web-state.json` + agent-browser session | No for seed path; copy cookie only as fallback |
+| Electron | App's own login state                    | Electron user-data dir                                            | Log in once manually in the app                |
+| Bot      | Native apps (Discord/WeChat/…) logged in | Each app's own session                                            | Once per app                                   |

-## CLI — Device Code Flow
+## CLI — Seeded API key

+For the self-contained no-root-`.env` dev environment, seed the baseline user
+and API key once:
+
+```bash
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+source .records/env/agent-testing-cli.env
+./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
+```
+
+The seed step writes `LOBE_API_KEY` for humans and maps it to the CLI's current
+auth variable, `LOBEHUB_CLI_API_KEY`. It also sets `LOBEHUB_SERVER` so CLI
+commands hit the local server without needing a stored device-code token.
+
+Use this for automated CLI verification:
+
+```bash
+cd apps/cli
+source ../../.records/env/agent-testing-cli.env
+bun src/index.ts <command>
+```
+
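Because the seeded path depends on the env file being sourced, a guard before CLI commands makes the failure explicit. This is a minimal sketch: `require_seeded_cli_env` is a hypothetical helper, and it assumes the two variables named above (`LOBEHUB_CLI_API_KEY`, `LOBEHUB_SERVER`) are the only ones the CLI needs.

```bash
# Hypothetical guard: fail early when the seeded CLI env has not been sourced.
require_seeded_cli_env() {
  local missing=0
  if [ -z "${LOBEHUB_CLI_API_KEY:-}" ]; then
    echo "LOBEHUB_CLI_API_KEY is unset: source .records/env/agent-testing-cli.env" >&2
    missing=1
  fi
  if [ -z "${LOBEHUB_SERVER:-}" ]; then
    echo "LOBEHUB_SERVER is unset: source .records/env/agent-testing-cli.env" >&2
    missing=1
  fi
  return "$missing"
}
```

A typical use would be `require_seeded_cli_env && bun src/index.ts <command>` from `apps/cli/`, which turns a confusing `No authentication found` into a direct hint.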
+## CLI — Device Code Flow fallback
+
+Use device-code login only when testing against a non-seeded environment.
 Credentials are isolated from the user's real CLI config via
-`LOBEHUB_CLI_HOME=.lobehub-dev` (kept inside `apps/cli/`, gitignored).
-
-Login requires interactive browser authorization, so **the user must run it
-themselves** (e.g. via the `!` prefix in Claude Code):
+`LOBEHUB_CLI_HOME=.lobehub-dev`; the current CLI stores them under
+`$HOME/.lobehub-dev`.

 ```bash
 cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
@@ -40,10 +75,30 @@ cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server htt
 - The `--server` flag is required — an env var does NOT work and login will hit
  the wrong server without it.
 - Check state without logging in: `setup-auth.sh status` (verifies
-  `settings.json` exists and `serverUrl` matches).
+  `LOBEHUB_CLI_API_KEY` when present, otherwise checks the stored server URL).
 - `UNAUTHORIZED` on API calls means the token expired — re-run login.

-## Web — better-auth cookie injection (agent-browser)
+## Web — seeded better-auth login
+
+The Web test surface is `agent-browser --session lobehub-dev`. The user's
+ordinary Chrome is only a cookie source; Chrome screenshots, Chrome Network
+records, and Chrome logged-in state do not prove the agent-browser test session
+is authenticated.
+
+For the seeded local dev environment, use the automatic path:
+
+```bash
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
+```
+
+`web-seed` posts the seeded email/password to
+`/api/auth/sign-in/email`, stores the returned cookie jar under
+`~/.lobehub-agent-testing/`, converts it to Playwright `storageState`, loads it
+into the `agent-browser` session, and verifies the session does not land on
+`/signin`.
+
+## Web — manual cookie injection fallback

 `agent-browser --headed` on macOS often creates the Chromium window off-screen —
 the user can't see or interact with it, so manual login inside the agent-browser
@@ -53,31 +108,19 @@ user's own logged-in Chrome and inject it as a Playwright-style state file.
 Do **not** use this on production URLs — only local dev. Treat the cookie as a
 secret: don't paste it into shared logs, PRs, or commit it anywhere.

-### One-key path
+### Web — decision flow

-1. Ask the user to copy the Cookie header **from a Network request, NOT
-   `document.cookie`** (`document.cookie` cannot see HttpOnly cookies, which is
-   exactly where better-auth puts its session):
-   - Open the logged-in tab (`http://localhost:<port>/…`) in Chrome.
-   - `Cmd+Option+I` → **Network** tab → refresh → click any same-origin request.
-   - Under **Request Headers**, right-click the `Cookie:` line → **Copy value**.
-2. Inject and verify in one shot:
-
-```bash
-pbpaste | ./.agents/skills/agent-testing/scripts/setup-auth.sh web
-```
-
-The script filters the header down to the better-auth cookies
-(`better-auth.session_token`, `better-auth.state`), builds the Playwright
-`storageState` JSON, loads it into the `agent-browser` session (default name
-`lobehub-dev`), opens `SERVER_URL`, and asserts the URL is not `/signin`.
+1. `$SCRIPT status --surface web` — green? Start testing. Do not ask for a Cookie header.
+2. Not green and using the seeded local env → `$SCRIPT web-seed`.
+3. Still not green or not using the seed env → `$SCRIPT open-chrome` opens Chrome at `SERVER_URL` with DevTools.
+4. User copies the `Cookie:` header from Network tab → any same-origin request → Request Headers → right-click `Cookie:` → **Copy value**. Must be from Network, NOT `document.cookie` (HttpOnly cookies are invisible to `document.cookie`).
+5. `pbpaste | $SCRIPT web` — filters to better-auth cookies (`session_token`, `session_data`, `state`), builds Playwright `storageState`, loads it into the `agent-browser` session (`lobehub-dev`), opens `SERVER_URL`, and asserts the URL is not `/signin`.
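
The filtering in step 5 can be pictured with a short sketch. It is illustrative only and is not the implementation inside `setup-auth.sh`: `filter_better_auth_cookies` is a hypothetical name, and the `better-auth.` name prefix is assumed from the cookie names used elsewhere in this document.

```bash
# Hypothetical filter: keep only better-auth cookies from a raw Cookie header.
# Input:  the header value, e.g. "a=1; better-auth.session_token=...; b=2"
# Output: one name=value pair per line.
filter_better_auth_cookies() {
  printf '%s\n' "$1" |
    tr ';' '\n' |                      # one cookie per line
    sed -E 's/^[[:space:]]+//' |       # trim the space after each ';'
    grep -E '^better-auth\.(session_token|session_data|state)='
}
```

An empty result (grep exits non-zero) corresponds to the `no better-auth cookies found` symptom: the pasted value was not a raw `Cookie:` header.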

 ### Using the authenticated session

 ```bash
-agent-browser --session lobehub-dev open "http://localhost:3010/"
+agent-browser --session lobehub-dev open "$SERVER_URL/"
 agent-browser --session lobehub-dev snapshot -i | head -20
-# Look for the user's avatar/name in the sidebar, or absence of the signin form.
 ```

 ### Notes
@@ -90,12 +133,12 @@ agent-browser --session lobehub-dev snapshot -i | head -20

 ### Common failure modes

-| Symptom                                       | Cause                                                                     | Fix                                               |
-| --------------------------------------------- | ------------------------------------------------------------------------- | ------------------------------------------------- |
-| Still redirects to `/signin` after injection  | User pasted from `document.cookie` → missed HttpOnly session              | Re-pull from Network request Headers, not console |
-| Script reports `no better-auth cookies found` | Separator wrong, or user pasted URL-decoded value                         | Keep the raw `Cookie:` header as-is               |
-| Login works briefly then expires              | `better-auth.session_token` rotated (user logged out / signed in again)   | Re-copy and re-inject                             |
-| Domain mismatch                               | Cookie domain must be `localhost` literally, no leading dot for local dev | —                                                 |
+| Symptom                                       | Cause                                                                     | Fix                                                                                            |
+| --------------------------------------------- | ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
+| Still redirects to `/signin` after injection  | User pasted from `document.cookie` → missed HttpOnly session              | Re-pull from Network request Headers, not console                                              |
+| Script reports `no better-auth cookies found` | User pasted the wrong value, or the cookie parser regressed               | Keep the raw `Cookie:` header as-is; run `scripts/setup-auth.test.sh` if the input looks valid |
+| Login works briefly then expires              | `better-auth.session_token` rotated (user logged out / signed in again)   | Re-copy and re-inject                                                                          |
+| Domain mismatch                               | Cookie domain must be `localhost` literally, no leading dot for local dev | —                                                                                              |

 ## Electron

@@ -3,33 +3,71 @@
 Single source of truth for starting / restarting the backend that all test
 surfaces (CLI, Electron, Web) hit.

+## Resolve ports first
+
+Run `test-env.sh` as described in
+[SKILL.md Step 0.0](../SKILL.md#00-resolve-the-current-test-environment)
+before starting or probing any local test surface.
+
 ## Ports & modes

-| Command             | What it runs                                              | Port                              |
-| ------------------- | --------------------------------------------------------- | --------------------------------- |
-| `pnpm run dev:next` | Next.js backend (API + auth)                              | `3010`                            |
-| `bun run dev`       | Full-stack (Next.js + Vite SPA, via `devStartupSequence`) | `3010` (API) + SPA                |
-| `bun run dev:spa`   | Vite SPA only, proxies API to `3010`                      | `9876` (prints a Debug Proxy URL) |
+| Command             | What it runs                                              | Port source         |
+| ------------------- | --------------------------------------------------------- | ------------------- |
+| `pnpm run dev:next` | Next.js backend (API + auth)                              | `PORT`              |
+| `bun run dev`       | Full-stack (Next.js + Vite SPA, via `devStartupSequence`) | `PORT` + `SPA_PORT` |
+| `bun run dev:spa`   | Vite SPA only, proxies API to `PORT`                      | `SPA_PORT`          |

-In the **cloud repo** (where this repo is the `lobehub/` submodule) the dev
-server conventionally runs on `3011` — set `SERVER_URL=http://localhost:3011`
-for the scripts in this skill when testing there.
+In the **cloud repo** (where this repo is the `lobehub/` submodule), the
+worktree directory name selects a fallback `SERVER_URL`, used only when neither
+`.env` nor the shell environment provides a value:
+
+| Workspace directory | Default `SERVER_URL`             |
+| ------------------- | -------------------------------- |
+| `lobehub`           | `http://localhost:3010`          |
+| `lobehub-cloud`     | `http://localhost:3020`          |
+| `lobehub-cloud-1`   | `http://localhost:3021`          |
+| `lobehub-cloud-N`   | `http://localhost:$((3020 + N))` |
+
+`test-env.sh` and `setup-auth.sh` both use the resolved env first and these
+worktree defaults only as fallback. Treat the dev-server terminal output as the
+final source of truth when testing a non-standard port, then export it for every
+agent-testing command:
+
+```bash
+export SERVER_URL=http://localhost:<port-from-dev-output>
+```

 ## Health check

 ```bash
-curl -s -o /dev/null -w '%{http_code}' http://localhost:3010/
+curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/"
 ```
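
After a start or restart the server may need a moment before the health check succeeds. A minimal polling sketch, assuming any 2xx/3xx status counts as healthy; `wait_for_server`, `probe_http`, and the `PROBE` override are hypothetical names introduced here so the loop can be exercised without a running server.

```bash
# Hypothetical helper: print the HTTP status code for a URL ("000" when the
# connection fails).
probe_http() {
  curl -s -o /dev/null -w '%{http_code}' "$1" 2> /dev/null || true
}

# Poll until the probe reports a 2xx/3xx status or the attempts run out.
# Usage: wait_for_server URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_server() {
  local url="$1" attempts="${2:-30}" delay="${3:-1}" code i
  for ((i = 1; i <= attempts; i++)); do
    # PROBE can name a replacement probe function (assumption, for testing).
    code="$("${PROBE:-probe_http}" "$url")"
    if [[ "$code" =~ ^[23] ]]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

Typical use would be `wait_for_server "$SERVER_URL/" 60 1 || echo "server did not come up"`.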

 ## Start / restart

 ```bash
-# Start (from repo root)
+# Start backend only.
+# With root .env: use the existing local config.
 pnpm run dev:next

+# Without root .env: use the self-contained agent-testing env.
+./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
+
+# Full-stack SPA + backend. Required for Web smoke.
+# With root .env:
+bun run dev
+
+# Without root .env:
+./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
+
+# Local QStash. Run in a separate terminal only when testing workflow paths.
+./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash
+
 # Restart — required to pick up server-side code changes
-lsof -ti:3010 | xargs kill
+lsof -ti:"$PORT" | xargs kill
 pnpm run dev:next
+# or, when no root .env exists:
+# ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
 ```

 ## When a server restart is needed
@@ -48,8 +86,13 @@ in doubt.

 ## Troubleshooting

-| Issue                     | Solution                                                |
-| ------------------------- | ------------------------------------------------------- |
-| `ECONNREFUSED`            | Server not running — start it                           |
-| `EADDRINUSE` on the port  | Already running — `lsof -ti:<port> \| xargs kill` first |
-| Stale data / old behavior | Server needs a restart to pick up code changes          |
+| Issue                     | Solution                                                                                      |
+| ------------------------- | --------------------------------------------------------------------------------------------- |
+| `ECONNREFUSED`            | Server not running — start it                                                                 |
+| `EADDRINUSE` on the port  | Already running — `lsof -ti:<port> \| xargs kill` first                                       |
+| Stale data / old behavior | Server needs a restart to pick up code changes                                                |
+| QStash workflow failures  | Run `init-dev-env.sh qstash` and ensure the dev server inherited the script's `QSTASH_*` env  |
+
+Marketplace/community endpoints are not part of the local agent-testing auth
+gate. Do not block local product-chain verification on marketplace API auth
+unless the change explicitly targets marketplace behavior.
@@ -11,7 +11,7 @@ output):

 ```
 .records/reports/<YYYYMMDD-HHMMSS>-<slug>/
-├── report.md      # human-readable report (embedded screenshots, case table, verdict)
+├── report.md      # human-readable report (case table with inline screenshots, verdict)
 ├── result.json    # machine-readable results (pass/fail counts, score)
 └── assets/        # evidence: screenshots, HAR files, CLI transcripts
 ```
@@ -25,13 +25,16 @@ output):
   ```

   The script creates the directory, pre-fills branch / commit / date in both
-   files, and prints the directory path.
+   files, and prints the directory path. The scaffold uses the compact report
+   shape below; translate its headings and table labels to the user's language
+   before delivery if needed.

 2. **Collect evidence as you test** — every asserted behavior gets one evidence
   item in `$DIR/assets/`:
   - UI (static state): `agent-browser screenshot` or `capture-app-window.sh`,
     then **verify the screenshot with the Read tool before citing it** —
     never cite an image you haven't looked at.
+
   - UI (time-based behavior): **screenshot vs GIF is a judgment you must
     make per case.** If the assertion is about change over time — streaming
     output, a ticking timer, loading/progress states, animations,
@@ -48,33 +51,91 @@ output):

     Embed it like an image: `![case 2](assets/case2-streaming.gif)`. Verify
     at least the first/last frames visually (Read the GIF) before citing.
+
   - CLI: exact command + trimmed output (`$CLI task list | tee "$DIR/assets/task-list.txt"`).
+
   - Network: `agent-browser network requests` dumps or HAR files.

 3. **Fill `report.md` as you go** — don't reconstruct from memory at the end.
+   The primary evidence belongs in the case table itself: each row should pair
+   the assertion with the screenshot/GIF or non-visual artifact that proves it,
+   so readers can scan the result without jumping between sections. UI evidence
+   must render inline with Markdown image syntax; a plain link or file path is
+   not acceptable as primary visual evidence.

 4. **Set the verdict** in both `report.md` and `result.json`, then link the
-   report directory in your final answer to the user.
+   report directory in your final answer to the user. If UI evidence exists,
+   list the key screenshot/GIF links in the final chat response. Use Markdown
+   link text as the evidence caption, for example:
+   `[Image #1 - observed outcome](<report-dir>/assets/case1.png)`.

 ## Report language (hard rule)

 **`report.md` MUST be written in the language the user is conversing in** —
 the whole file, headings included. If the conversation is in Chinese, the
-report is in Chinese; do not mix English prose into it. The scaffold's English
-headings are placeholders — translate them when filling. Exceptions that stay
-as-is: code/commands, identifiers, log excerpts, and `result.json` (its keys
-and status values are machine-read and stay English; the `title` and case
-`name` fields follow the user's language).
+report is in Chinese; do not mix English prose into it. The scaffold headings
+are placeholders: translate them while filling in the report whenever the user
+is conversing in a language other than the scaffold's. Exceptions that stay
+as-is: code/commands, identifiers, log excerpts, and `result.json` (its keys
+and status values are machine-read and stay English; the `title` and case
+`name` fields follow the user's language).

 ## report.md sections

-| Section         | Content                                                                            |
-| --------------- | ---------------------------------------------------------------------------------- |
-| **Scope**       | What changed / what is being verified; branch + commit                             |
-| **Environment** | Server URL, surfaces used (cli / electron / web / bot), relevant versions          |
-| **Cases**       | Table: `# \| case \| surface \| steps \| expected \| actual \| status \| evidence` |
-| **Evidence**    | Embedded screenshots/GIFs (`![case 1](assets/case1.png)`), fenced CLI transcripts  |
-| **Verdict**     | Pass/fail/blocked counts, optional 0–100 score, open issues / follow-ups           |
+Default report shape:
+
+| Section          | Content                                                                                      |
+| ---------------- | -------------------------------------------------------------------------------------------- |
+| **Scope**        | What changed / what is being verified; branch, commit, date, surface, entry URL/page, focus  |
+| **Cases**        | Compact table: `# \| Case \| Result \| Key observation \| Evidence`                          |
+| **Verdict**      | Overall verdict first (`pass` / `partial` / `fail`), then the concise reasons and follow-ups |
+| **Verification** | Commands or automated checks run in this session, with trimmed results                       |
+| **Score**        | Pass/fail/blocked counts, optional 0–100 score                                               |
+
+The case table is the main reading surface. Prefer one clear row per user
+scenario or regression assertion, and put the screenshot/GIF directly in the
+`Evidence` cell:
+
+```markdown
+| #   | Case                     | Result | Key observation                                                   | Evidence                                         |
+| --- | ------------------------ | ------ | ----------------------------------------------------------------- | ------------------------------------------------ |
+| 1   | Create a new page        | pass   | Title and body persisted after refresh                            | ![created page](assets/new-page-created.png)     |
+| 2   | Respect requested length | fail   | Requested about 600 Chinese characters; final body was about 1286 | ![final article](assets/write-article-final.png) |
+```
+
+## Inline visual evidence
+
+Screenshots and GIFs must be embedded so the report shows the image inline:
+
+```markdown
+![case 1 result](assets/case1-result.png)
+![streaming response](assets/case2-streaming.gif)
+```
+
+Do **not** use these as the primary evidence for UI cases:
+
+```markdown
+[case 1 result](assets/case1-result.png)
+assets/case1-result.png
+file:///tmp/case1-result.png
+```
+
+Links are acceptable for non-visual artifacts such as CLI transcripts, HAR
+files, or long logs. For videos, embed a representative screenshot/GIF inline in
+the case row and link the full video as supplemental evidence.
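
The inline-embed rule can be spot-checked mechanically. The sketch below is a hypothetical lint, not part of the skill's scripts: it flags lines where an image file is referenced through a plain Markdown link rather than `![alt](path)`. It assumes evidence images end in `.png`, `.gif`, `.jpg`, or `.jpeg`, and it does not catch bare paths or `file://` URLs.

```bash
# Hypothetical lint: print "line:content" for every plain Markdown link that
# points at an image file. Inline embeds (![alt](path)) are not reported
# because the "[" must be at line start or preceded by a character other
# than "!".
find_plain_image_links() {
  grep -nE '(^|[^!])\[[^]]*]\([^)]*\.(png|gif|jpe?g)\)' "$1"
}
```

Running `find_plain_image_links "$DIR/report.md"` before delivery should print nothing for a report that follows the rule; links to transcripts or HAR files are left alone.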
+
+Avoid the old wide table with separate `steps`, `expected`, and `actual`
+columns unless the test is purely non-visual and truly needs that breakdown.
+For UI reports, those columns make screenshot-backed reading harder. Put
+procedural detail in the row's key observation only when it changes the
+interpretation of the result.
+
+Use an extra evidence/detail section only when the inline table cannot carry
+the material cleanly, such as long CLI transcripts, HAR summaries, or multiple
+screenshots for one case. In that situation, keep the table evidence cell as an
+inline visual proof for UI cases or a concise link for non-visual artifacts,
+then put the longer material under `Verification` or a brief
+`Additional Evidence` section.

 Status values: `pass` / `fail` / `blocked` (couldn't run — e.g. auth or env
 missing; a blocked case is not a pass).
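
The counts in the `Score` section can be derived from the compact case table instead of being tallied by hand. A rough sketch, assuming the five-column table shape shown above with the result in the third column and the values `pass` / `fail` / `blocked`; `count_case_results` is a hypothetical helper, not part of the scaffold script.

```bash
# Hypothetical helper: tally the Result column of the compact case table.
# Splitting on "|" makes the row number field 2 and the Result field 4.
count_case_results() {
  awk -F'|' '
    # Data rows are the ones whose first cell is a plain number.
    $2 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      result = $4
      gsub(/[ \t]/, "", result)
      counts[result]++
    }
    END {
      printf "pass=%d fail=%d blocked=%d\n", counts["pass"], counts["fail"], counts["blocked"]
    }
  ' "$1"
}
```

A `Case` cell containing an escaped `\|` would shift the columns, so treat the output as a cross-check rather than the source of truth.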
@@ -115,7 +176,8 @@ word the user reads first: `pass`, `fail`, or `partial`.
 ## Rules

 - **No evidence, no claim** — every `pass`/`fail` in the case table must link
-  at least one asset.
+  at least one asset. UI cases must inline-embed their primary screenshot/GIF;
+  non-visual CLI/network cases may link transcripts, HAR files, or logs.
 - **Screenshots must be visually verified** with the Read tool before being
  cited.
 - **Report failures faithfully** — a failing case with clear evidence is a good
@@ -0,0 +1,407 @@
+#!/usr/bin/env bash
+# init-dev-env.sh — self-contained local dev env for agent testing.
+#
+# This script initializes the env needed to run LobeHub's normal local dev
+# server without depending on a root .env file. It follows the same shape as
+# the e2e bootstrap (Postgres + migrations + auth/key-vault/S3 test env), but
+# starts the repo's dev server, not the standalone e2e server.
+#
+# Guardrail: if repo-root .env exists, every non-help command exits immediately.
+# Existing local config always wins.
+#
+# Usage:
+#   init-dev-env.sh env              # print shell exports
+#   init-dev-env.sh write [file]     # write a source-able env file
+#   init-dev-env.sh setup-db         # start local Postgres and run migrations
+#   init-dev-env.sh migrate          # run DB migrations against the configured DB
+#   init-dev-env.sh seed-user        # seed the baseline test user + CLI API key
+#   init-dev-env.sh qstash           # run local Upstash QStash dev server
+#   init-dev-env.sh dev-next         # exec `pnpm run dev:next` with this env
+#   init-dev-env.sh dev              # exec `bun run dev` with this env
+#   init-dev-env.sh clean-db         # remove the managed Postgres container
+#
+# Overrides:
+#   SERVER_PORT=3010 DB_PORT=5433 DB_CONTAINER=lobehub-agent-testing-postgres QSTASH_DEV_PORT=8080
+
+set -euo pipefail
+
+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
+ROOT_ENV_FILE="$REPO_ROOT/.env"
+
+SERVER_PORT="${SERVER_PORT:-3010}"
+DB_PORT="${DB_PORT:-5433}"
+DB_CONTAINER="${DB_CONTAINER:-lobehub-agent-testing-postgres}"
+DATABASE_URL="${DATABASE_URL:-postgresql://postgres:postgres@localhost:${DB_PORT}/postgres}"
+ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-dev.env"
+CLI_ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-cli.env"
+AGENT_TESTING_API_KEY="${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}"
+QSTASH_DEV_PORT="${QSTASH_DEV_PORT:-8080}"
+QSTASH_LOCAL_TOKEN="${QSTASH_LOCAL_TOKEN:-eyJVc2VySUQiOiJkZWZhdWx0VXNlciIsIlBhc3N3b3JkIjoiZGVmYXVsdFBhc3N3b3JkIn0=}"
+QSTASH_LOCAL_CURRENT_SIGNING_KEY="${QSTASH_LOCAL_CURRENT_SIGNING_KEY:-sig_7kYjw48mhY7kAjqNGcy6cr29RJ6r}"
+QSTASH_LOCAL_NEXT_SIGNING_KEY="${QSTASH_LOCAL_NEXT_SIGNING_KEY:-sig_5ZB6DVzB1wjE8S6rZ7eenA8Pdnhs}"
+
+ok() { printf '  \033[32m✔\033[0m %s\n' "$1"; }
+bad() { printf '  \033[31m✘\033[0m %s\n' "$1"; }
+note() { printf '      %s\n' "$1"; }
+
+guard_no_root_env() {
+  if [[ -f "$ROOT_ENV_FILE" ]]; then
+    bad "root .env exists: $ROOT_ENV_FILE"
+    note "Use the existing local configuration instead of init-dev-env.sh."
+    note "Start normally from repo root, e.g. pnpm run dev:next or bun run dev."
+    exit 1
+  fi
+}
+
+apply_env() {
+  export APP_URL="${APP_URL:-http://localhost:${SERVER_PORT}}"
+  export AUTH_EMAIL_VERIFICATION="${AUTH_EMAIL_VERIFICATION:-0}"
+  export AUTH_SECRET="${AUTH_SECRET:-agent-testing-local-auth-secret-32chars}"
+  export DATABASE_DRIVER="${DATABASE_DRIVER:-node}"
+  export DATABASE_URL
+  export FEATURE_FLAGS="${FEATURE_FLAGS:--agent_self_iteration}"
+  export KEY_VAULTS_SECRET="${KEY_VAULTS_SECRET:-r2gbBPKyJ8ZRKCLKt+I3DImfcL+wGxaQyRC56xtm9Uk=}"
+  export NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION="${NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION:-0}"
+  export NODE_OPTIONS="${NODE_OPTIONS:---max-old-space-size=6144}"
+  export PORT="${PORT:-$SERVER_PORT}"
+  export QSTASH_CURRENT_SIGNING_KEY="${QSTASH_CURRENT_SIGNING_KEY:-$QSTASH_LOCAL_CURRENT_SIGNING_KEY}"
+  export QSTASH_DEV_PORT
+  export QSTASH_NEXT_SIGNING_KEY="${QSTASH_NEXT_SIGNING_KEY:-$QSTASH_LOCAL_NEXT_SIGNING_KEY}"
+  export QSTASH_TOKEN="${QSTASH_TOKEN:-$QSTASH_LOCAL_TOKEN}"
+  export QSTASH_URL="${QSTASH_URL:-http://127.0.0.1:${QSTASH_DEV_PORT}}"
+  export S3_ACCESS_KEY_ID="${S3_ACCESS_KEY_ID:-agent-testing-access-key}"
+  export S3_BUCKET="${S3_BUCKET:-agent-testing-bucket}"
+  export S3_ENDPOINT="${S3_ENDPOINT:-https://agent-testing-s3.localhost}"
+  export S3_SECRET_ACCESS_KEY="${S3_SECRET_ACCESS_KEY:-agent-testing-secret-key}"
+}
+
+env_keys() {
+  printf '%s\n' \
+    APP_URL \
+    AUTH_EMAIL_VERIFICATION \
+    AUTH_SECRET \
+    DATABASE_DRIVER \
+    DATABASE_URL \
+    FEATURE_FLAGS \
+    KEY_VAULTS_SECRET \
+    NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION \
+    NODE_OPTIONS \
+    PORT \
+    QSTASH_CURRENT_SIGNING_KEY \
+    QSTASH_DEV_PORT \
+    QSTASH_NEXT_SIGNING_KEY \
+    QSTASH_TOKEN \
+    QSTASH_URL \
+    S3_ACCESS_KEY_ID \
+    S3_BUCKET \
+    S3_ENDPOINT \
+    S3_SECRET_ACCESS_KEY
+}
+
+print_env() {
+  apply_env
+  while IFS= read -r key; do
+    printf 'export %s=%q\n' "$key" "${!key}"
+  done < <(env_keys)
+}
+
+write_env() {
+  local file="${1:-$ENV_FILE_DEFAULT}"
+  apply_env
+  mkdir -p "$(dirname "$file")"
+  {
+    printf '# Source this file before starting LobeHub local dev server.\n'
+    printf '# Generated by %s\n' "$0"
+    while IFS= read -r key; do
+      printf 'export %s=%q\n' "$key" "${!key}"
+    done < <(env_keys)
+  } > "$file"
+  ok "wrote env file: $file"
+  note "source it with: source $file"
+}
+
+require_docker() {
+  if ! command -v docker > /dev/null 2>&1; then
+    bad "docker CLI is not available"
+    note "Install/start Docker Desktop, or provide DATABASE_URL for an existing Postgres."
+    return 1
+  fi
+}
+
+wait_for_db() {
+  printf '      waiting for Postgres'
+  until docker exec "$DB_CONTAINER" pg_isready -U postgres > /dev/null 2>&1; do
+    printf '.'
+    sleep 2
+  done
+  printf '\n'
+}
+
+start_db() {
+  require_docker
+
+  if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
+    ok "Postgres container already running: $DB_CONTAINER"
+  elif docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
+    docker start "$DB_CONTAINER" > /dev/null
+    ok "started existing Postgres container: $DB_CONTAINER"
+  else
+    docker run -d \
+      --name "$DB_CONTAINER" \
+      -e POSTGRES_PASSWORD=postgres \
+      -p "${DB_PORT}:5432" \
+      paradedb/paradedb:latest > /dev/null
+    ok "created Postgres container: $DB_CONTAINER"
+  fi
+
+  wait_for_db
+}
+
+migrate_db() {
+  apply_env
+  cd "$REPO_ROOT"
+  bun run db:migrate
+}
+
+seed_user() {
+  apply_env
+  export AGENT_TESTING_API_KEY
+  export AGENT_TESTING_CLI_ENV_FILE="${AGENT_TESTING_CLI_ENV_FILE:-$CLI_ENV_FILE_DEFAULT}"
+  cd "$REPO_ROOT"
+  node <<'NODE'
+const bcrypt = require('bcryptjs');
+const crypto = require('node:crypto');
+const fs = require('node:fs');
+const path = require('node:path');
+const pg = require('pg');
+
+const databaseUrl = process.env.DATABASE_URL;
+if (!databaseUrl) {
+  throw new Error('DATABASE_URL is required to seed the baseline test user.');
+}
+
+const TEST_USER = {
+  email: 'agent-testing@lobehub.com',
+  fullName: 'Agent Testing User',
+  id: 'user_agent_testing_001',
+  password: 'TestPassword123!',
+  username: 'agent_testing_user',
+};
+
+const TEST_API_KEY = {
+  id: 'api_key_agent_testing_001',
+  key: process.env.AGENT_TESTING_API_KEY || 'sk-lh-agenttesting0001',
+  name: 'Agent Testing CLI API Key',
+};
+
+const validateApiKeyFormat = (apiKey) => /^sk-lh-[\da-z]{16}$/.test(apiKey);
+
+const hashApiKey = (apiKey) => {
+  const secret = process.env.KEY_VAULTS_SECRET;
+  if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
+
+  return crypto.createHmac('sha256', secret).update(apiKey).digest('hex');
+};
+
+const encryptWithKeyVaultsSecret = (plaintext) => {
+  const secret = process.env.KEY_VAULTS_SECRET;
+  if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
+
+  const rawKey = Buffer.from(secret, 'base64');
+  if (![16, 24, 32].includes(rawKey.length)) {
+    throw new Error(
+      `KEY_VAULTS_SECRET must decode to 16, 24, or 32 bytes, got ${rawKey.length} bytes.`,
+    );
+  }
+
+  const iv = crypto.randomBytes(12);
+  const cipher = crypto.createCipheriv(`aes-${rawKey.length * 8}-gcm`, rawKey, iv);
+  const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
+  const authTag = cipher.getAuthTag();
+
+  return `${iv.toString('hex')}:${authTag.toString('hex')}:${encrypted.toString('hex')}`;
+};
+
+const writeCliEnvFile = () => {
+  const file = process.env.AGENT_TESTING_CLI_ENV_FILE || '.records/env/agent-testing-cli.env';
+  fs.mkdirSync(path.dirname(file), { recursive: true });
+  fs.writeFileSync(
+    file,
+    [
+      '# Source this file before running LobeHub CLI agent tests.',
+      '# Generated by init-dev-env.sh seed-user',
+      `export LOBE_API_KEY=${TEST_API_KEY.key}`,
+      `export LOBEHUB_CLI_API_KEY="${'${LOBE_API_KEY}'}"`,
+      `export LOBEHUB_SERVER=${process.env.APP_URL}`,
+      'export LOBEHUB_CLI_HOME=.lobehub-dev',
+      '',
+    ].join('\n'),
+  );
+
+  return file;
+};
+
+const client = new pg.Client({ connectionString: databaseUrl });
+
+(async () => {
+  if (!validateApiKeyFormat(TEST_API_KEY.key)) {
+    throw new Error(`Invalid AGENT_TESTING_API_KEY format: ${TEST_API_KEY.key}`);
+  }
+
+  await client.connect();
+  const now = new Date().toISOString();
+  const onboarding = JSON.stringify({ finishedAt: now, version: 1 });
+  const passwordHash = await bcrypt.hash(TEST_USER.password, 10);
+  const encryptedApiKey = encryptWithKeyVaultsSecret(TEST_API_KEY.key);
+  const apiKeyHash = hashApiKey(TEST_API_KEY.key);
+
+  await client.query(
+    `INSERT INTO users (id, email, normalized_email, username, full_name, email_verified, onboarding, created_at, updated_at, last_active_at)
+     VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $8, $8)
+     ON CONFLICT (id) DO UPDATE SET onboarding = $7, updated_at = $8`,
+    [
+      TEST_USER.id,
+      TEST_USER.email,
+      TEST_USER.email.toLowerCase(),
+      TEST_USER.username,
+      TEST_USER.fullName,
+      true,
+      onboarding,
+      now,
+    ],
+  );
+
+  await client.query(
+    `INSERT INTO accounts (id, user_id, account_id, provider_id, password, created_at, updated_at)
+     VALUES ($1, $2, $3, $4, $5, $6, $6)
+     ON CONFLICT DO NOTHING`,
+    [
+      'agent_testing_account_001',
+      TEST_USER.id,
+      TEST_USER.email,
+      'credential',
+      passwordHash,
+      now,
+    ],
+  );
+
+  await client.query(
+    `INSERT INTO api_keys (id, name, key, key_hash, enabled, expires_at, user_id, workspace_id, created_at, updated_at)
+     VALUES ($1, $2, $3, $4, $5, NULL, $6, NULL, $7, $7)
+     ON CONFLICT (id) DO UPDATE
+     SET name = EXCLUDED.name,
+         key = EXCLUDED.key,
+         key_hash = EXCLUDED.key_hash,
+         enabled = EXCLUDED.enabled,
+         expires_at = NULL,
+         updated_at = EXCLUDED.updated_at`,
+    [
+      TEST_API_KEY.id,
+      TEST_API_KEY.name,
+      encryptedApiKey,
+      apiKeyHash,
+      true,
+      TEST_USER.id,
+      now,
+    ],
+  );
+
+  const cliEnvFile = writeCliEnvFile();
+
+  console.log('seeded baseline user:');
+  console.log(`  email: ${TEST_USER.email}`);
+  console.log(`  password: ${TEST_USER.password}`);
+  console.log('seeded baseline API key:');
+  console.log(`  LOBE_API_KEY: ${TEST_API_KEY.key}`);
+  console.log(`  CLI env: ${cliEnvFile}`);
+})()
+  .finally(() => client.end())
+  .catch((error) => {
+    console.error(error);
+    process.exit(1);
+  });
+NODE
+}
+
+cmd_status() {
+  apply_env
+  echo "agent-testing local dev env:"
+  note "APP_URL=$APP_URL"
+  note "DATABASE_URL=$DATABASE_URL"
+  note "PORT=$PORT"
+  note "QSTASH_URL=$QSTASH_URL"
+  if command -v docker > /dev/null 2>&1; then
+    ok "docker CLI available"
+    if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
+      ok "managed Postgres running: $DB_CONTAINER"
+    else
+      note "managed Postgres is not running: $DB_CONTAINER"
+    fi
+  else
+    bad "docker CLI is not available"
+  fi
+}
+
+cmd_qstash() {
+  apply_env
+  cd "$REPO_ROOT"
+  note "starting local QStash dev server at $QSTASH_URL"
+  note "keep this process running while testing workflow paths"
+  exec pnpm run qstash -- -port "$QSTASH_DEV_PORT"
+}
+
+cmd_dev_next() {
+  apply_env
+  cd "$REPO_ROOT"
+  exec pnpm run dev:next
+}
+
+cmd_dev() {
+  apply_env
+  cd "$REPO_ROOT"
+  exec bun run dev
+}
+
+cmd_clean_db() {
+  require_docker
+  if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
+    docker stop "$DB_CONTAINER" > /dev/null
+  fi
+  if docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
+    docker rm "$DB_CONTAINER" > /dev/null
+    ok "removed Postgres container: $DB_CONTAINER"
+  else
+    note "Postgres container not found: $DB_CONTAINER"
+  fi
+}
+
+usage() {
+  sed -n '3,24p' "$0" >&2
+}
+
+COMMAND="${1:-status}"
+
+case "$COMMAND" in
+  help|-h|--help) usage; exit 0 ;;
+  *) guard_no_root_env ;;
+esac
+
+case "$COMMAND" in
+  env) print_env ;;
+  write) shift; write_env "${1:-}" ;;
+  setup-db)
+    start_db
+    migrate_db
+    ;;
+  migrate) migrate_db ;;
+  seed-user) seed_user ;;
+  qstash) cmd_qstash ;;
+  dev-next) cmd_dev_next ;;
+  dev) cmd_dev ;;
+  clean-db) cmd_clean_db ;;
+  status) cmd_status ;;
+  *)
+    usage
+    exit 2
+    ;;
+esac
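
When overriding `AGENT_TESTING_API_KEY`, the value has to satisfy the format check inside `seed_user` (`sk-lh-` followed by 16 lowercase letters or digits), otherwise seeding aborts after the database connection is already configured. A small pre-flight sketch in shell; `is_valid_agent_api_key` is a hypothetical helper that mirrors the regular expression in the embedded Node script.

```bash
# Hypothetical pre-flight check mirroring /^sk-lh-[\da-z]{16}$/ from the
# seed script. LC_ALL=C keeps the a-z range limited to ASCII lowercase.
is_valid_agent_api_key() {
  local LC_ALL=C
  [[ "$1" =~ ^sk-lh-[0-9a-z]{16}$ ]]
}
```

For example, `is_valid_agent_api_key "$AGENT_TESTING_API_KEY" || echo "bad key format"` could run before `init-dev-env.sh seed-user`.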
@@ -24,39 +24,53 @@ DATE_HUMAN=$(date '+%Y-%m-%d %H:%M')
 DATE_ISO=$(date '+%Y-%m-%dT%H:%M:%S%z')

 cat > "$DIR/report.md" << EOF
-# Test Report: $TITLE
+# Test Report: $TITLE

-## Scope
+## Scope

-<!-- What changed / what is being verified -->
+<!-- Test goal / change scope / key risks -->

- Branch: \`$BRANCH\`
- Commit: \`$COMMIT\`
- Date: $DATE_HUMAN
+- Branch: \`$BRANCH\`
+- Current commit: \`$COMMIT\`
+- Date: $DATE_HUMAN
+- Surface: <!-- CLI / Electron + CDP / Web / Bot:<platform> -->
+- Test page / entry: <!-- e.g. /settings or http://localhost:3010 -->
+- Focus: <!-- the experience, feature, or regression point this round cares about most -->

-## Environment
+## Cases

- Server: <!-- e.g. http://localhost:3010 -->
- Surfaces: <!-- cli / electron / web / bot:<platform> -->
+| # | Case | Result  | Key observation | Evidence |
+| - | ---- | ------- | --------------- | -------- |
+| 1 |      | pending |                 | ![case 1](assets/case1.png) |

-## Cases
+## Verdict

-| # | Case | Surface | Steps | Expected | Actual | Status | Evidence |
-| - | ---- | ------- | ----- | -------- | ------ | ------ | -------- |
-| 1 |      |         |       |          |        |        |          |
+Overall verdict: \`pending\`.

-## Evidence
+<!-- Summarize in 1-2 paragraphs what the user most needs to know; failures and blocked cases must state their impact explicitly. -->

-<!-- Embed screenshots: ![case 1](assets/case1.png) -->
-<!-- CLI transcripts in fenced blocks, with the exact command -->
+Still to handle / follow up:

-## Verdict
+- <!-- TODO -->

- Passed: 0 / 0
- Failed: 0
- Blocked: 0
- Score (optional): —
- Open issues / follow-ups:
+## Verification
+
+<!-- If automated or command-line checks were run, keep the trimmed commands and results; otherwise write "No additional automated verification was run". -->
+
+\`\`\`bash
+# command
+\`\`\`
+
+Result:
+
+- <!-- TODO -->
+
+## Score
+
+- Passed: 0
+- Failed: 0
+- Blocked: 0
+- Score: — / 100
 EOF

 cat > "$DIR/result.json" << EOF
@@ -5,29 +5,114 @@
 # test step. Background and failure modes: ../references/auth.md
 #
 # Usage:
-#   setup-auth.sh status        # check server + CLI + web auth readiness
+#   setup-auth.sh status        # check server + CLI + web + Electron readiness
+#   setup-auth.sh status --surface web  # check only the Web surface gate
+#   setup-auth.sh cli-seed      # configure CLI API-key auth from seeded local env
 #   setup-auth.sh cli           # interactive CLI device-code login (run by a human)
+#   setup-auth.sh open-chrome   # open SERVER_URL in Chrome and show DevTools
+#   setup-auth.sh web-seed      # sign in seeded user and inject cookies automatically
 #   setup-auth.sh web           # stdin = Cookie header -> inject into agent-browser session
 #   setup-auth.sh web-verify    # live-check the agent-browser session is authenticated
 #
 # Env:
-#   SERVER_URL  (default http://localhost:3010)   dev server under test
+#   SERVER_URL  (default from test-env.sh)        dev server under test
 #   SESSION     (default lobehub-dev)             agent-browser session name
 #   AUTH_DIR    (default ~/.lobehub-agent-testing) where web state is persisted
+#   SEED_EMAIL / SEED_PASSWORD                    seeded better-auth login

 set -euo pipefail

-SERVER_URL="${SERVER_URL:-http://localhost:3010}"
+REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
+
+workspace_root_for_port() {
+  local root="$REPO_ROOT"
+  local name
+  name="$(basename "$root")"
+
+  if [[ "$name" == "lobehub" ]]; then
+    local parent
+    parent="$(cd "$root/.." && pwd)"
+    local parent_name
+    parent_name="$(basename "$parent")"
+    if [[ "$parent_name" == lobehub-cloud* ]]; then
+      root="$parent"
+    fi
+  fi
+
+  printf '%s\n' "$root"
+}
+
+default_server_url() {
+  local env_resolver resolved
+  env_resolver="$(dirname "${BASH_SOURCE[0]}")/test-env.sh"
+  if [[ -x "$env_resolver" ]]; then
+    resolved="$("$env_resolver" --value SERVER_URL 2> /dev/null || true)"
+    if [[ -n "$resolved" ]]; then
+      printf '%s\n' "$resolved"
+      return 0
+    fi
+  fi
+
+  local root name suffix port
+  root="$(workspace_root_for_port)"
+  name="$(basename "$root")"
+
+  case "$name" in
+    lobehub-cloud)
+      port=3020
+      ;;
+    lobehub-cloud-*)
+      suffix="${name#lobehub-cloud-}"
+      if [[ "$suffix" =~ ^[0-9]+$ ]]; then
+        port=$((3020 + 10#$suffix))
+      else
+        port=3010
+      fi
+      ;;
+    *)
+      port=3010
+      ;;
+  esac
+
+  printf 'http://localhost:%s\n' "$port"
+}
+
+SERVER_URL="${SERVER_URL:-$(default_server_url)}"
 SESSION="${SESSION:-lobehub-dev}"
 AUTH_DIR="${AUTH_DIR:-$HOME/.lobehub-agent-testing}"
 STATE_FILE="$AUTH_DIR/web-state.json"
-REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
-CLI_HOME="$REPO_ROOT/apps/cli/.lobehub-dev"
+CLI_HOME_NAME="${LOBEHUB_CLI_HOME:-.lobehub-dev}"
+CLI_HOME="$HOME/${CLI_HOME_NAME#/}"
+CLI_CREDENTIALS_FILE="$CLI_HOME/credentials.json"
+SEED_EMAIL="${SEED_EMAIL:-agent-testing@lobehub.com}"
+SEED_PASSWORD="${SEED_PASSWORD:-TestPassword123!}"
+SEED_API_KEY="${SEED_API_KEY:-${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}}"
+CLI_ENV_FILE="${CLI_ENV_FILE:-$REPO_ROOT/.records/env/agent-testing-cli.env}"

 ok()   { printf '  \033[32m✔\033[0m %s\n' "$1"; }
 bad()  { printf '  \033[31m✘\033[0m %s\n' "$1"; }
 note() { printf '      %s\n' "$1"; }

+usage() {
+  cat << EOF
+Usage:
+  $0 status [--surface all|cli|web|electron]
+  $0 cli-seed
+  $0 cli
+  $0 open-chrome [--dry-run]
+  $0 web-seed
+  $0 web
+  $0 web-verify
+
+Env:
+  SERVER_URL=$SERVER_URL
+  SESSION=$SESSION
+  AUTH_DIR=$AUTH_DIR
+  SEED_EMAIL=$SEED_EMAIL
+  CLI_HOME=$CLI_HOME
+EOF
+}
+
 check_server() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/" 2> /dev/null || true)
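A minimal sketch of the fallback port mapping that `default_server_url` applies when `test-env.sh` does not return a value and `SERVER_URL` is unset. The function name `port_for_workspace` is hypothetical and not part of the script; the sketch takes the workspace directory name as an argument instead of deriving it from `REPO_ROOT`, and it strips leading zeros by hand where the script uses bash's `10#` prefix.

```shell
# Hypothetical helper (not in setup-auth.sh): map a workspace directory name
# to the dev-server port used when nothing else resolves SERVER_URL.
port_for_workspace() (
  name="$1"
  case "$name" in
    lobehub-cloud)
      echo 3020
      ;;
    lobehub-cloud-*)
      suffix="${name#lobehub-cloud-}"
      case "$suffix" in
        '' | *[!0-9]*)
          # Non-numeric suffix: fall back to the OSS default.
          echo 3010
          ;;
        *)
          # Drop leading zeros so "08" is read as decimal 8, not invalid octal.
          suffix="${suffix#"${suffix%%[!0]*}"}"
          echo $((3020 + ${suffix:-0}))
          ;;
      esac
      ;;
    *)
      echo 3010
      ;;
  esac
)

port_for_workspace lobehub-cloud-2   # prints 3022
```

Numbered cloud worktrees therefore get consecutive ports starting at 3020, while any other directory name keeps the OSS default of 3010.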
@@ -41,11 +126,35 @@ check_server() {
 }

 check_cli() {
-  if [[ -f "$CLI_HOME/settings.json" ]] && grep -q "$SERVER_URL" "$CLI_HOME/settings.json"; then
-    ok "CLI logged in to $SERVER_URL (creds: apps/cli/.lobehub-dev)"
+  local api_key="${LOBEHUB_CLI_API_KEY:-${LOBE_API_KEY:-}}"
+  if [[ -n "$api_key" ]]; then
+    local body_file code
+    body_file="$(mktemp)"
+    code=$(curl -sS -o "$body_file" -w '%{http_code}' \
+      -H "Authorization: Bearer $api_key" \
+      "$SERVER_URL/api/v1/users/me?includeCount=0" 2> /dev/null || true)
+
+    if [[ "$code" =~ ^[23] ]]; then
+      rm -f "$body_file"
+      ok "CLI API-key auth valid for $SERVER_URL"
+      return 0
+    fi
+
+    bad "CLI API-key auth failed for $SERVER_URL (http_code='$code')"
+    note "seed the local API key first:"
+    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
+    note "source $CLI_ENV_FILE"
+    rm -f "$body_file"
+    return 1
+  fi
+
+  if [[ -f "$CLI_HOME/settings.json" ]] && grep -Fq -- "$SERVER_URL" "$CLI_HOME/settings.json" && [[ -f "$CLI_CREDENTIALS_FILE" ]]; then
+    ok "CLI device-code credentials configured for $SERVER_URL (creds: $CLI_HOME)"
  else
    bad "CLI not logged in to $SERVER_URL"
-    note "ask the user to run:"
+    note "automated path:"
+    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user && source $CLI_ENV_FILE && $0 cli-seed"
+    note "interactive fallback:"
    note "cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server $SERVER_URL"
    return 1
  fi
@@ -54,13 +163,24 @@ check_cli() {
 check_web() {
  if [[ -f "$STATE_FILE" ]]; then
    ok "web auth state saved ($STATE_FILE)"
-    note "live-verify: $0 web-verify"
  else
    bad "no web auth state for agent-browser"
-    note "copy the Cookie header from Chrome DevTools (Network tab), then:"
+    note "for the seeded local user, run: $0 web-seed"
+    note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
    note "pbpaste | $0 web   (see references/auth.md)"
    return 1
  fi
+  cmd_web_verify --skip-server-check
+}
+
+check_agent_browser() {
+  if command -v agent-browser > /dev/null 2>&1; then
+    ok "agent-browser available"
+  else
+    bad "agent-browser command not found"
+    note "install or expose agent-browser before Web/Electron UI testing"
+    return 1
+  fi
 }

 check_electron() {
@@ -84,16 +204,75 @@ check_electron() {
 }

 cmd_status() {
-  echo "agent-testing auth status (SERVER_URL=$SERVER_URL):"
+  local surface="all"
+  while [[ $# -gt 0 ]]; do
+    case "$1" in
+      --surface)
+        if [[ $# -lt 2 ]]; then
+          echo "--surface requires one of: all, cli, web, electron" >&2
+          return 2
+        fi
+        surface="${2:-}"
+        shift 2
+        ;;
+      --surface=*)
+        surface="${1#*=}"
+        shift
+        ;;
+      all|cli|web|electron)
+        surface="$1"
+        shift
+        ;;
+      -h|--help)
+        usage
+        return 0
+        ;;
+      *)
+        echo "unknown status option: $1" >&2
+        usage >&2
+        return 2
+        ;;
+    esac
+  done
+
+  case "$surface" in
+    all|cli|web|electron) ;;
+    "")
+      echo "--surface requires one of: all, cli, web, electron" >&2
+      return 2
+      ;;
+    *)
+      echo "unknown surface: $surface" >&2
+      usage >&2
+      return 2
+      ;;
+  esac
+
+  echo "agent-testing auth status (surface=$surface, SERVER_URL=$SERVER_URL):"
  local rc=0
-  check_server || rc=1
-  check_cli || rc=1
-  check_web || rc=1
-  check_electron || rc=1
+  case "$surface" in
+    all)
+      check_server || rc=1
+      check_cli || rc=1
+      check_web || rc=1
+      check_electron || rc=1
+      ;;
+    cli)
+      check_server || rc=1
+      check_cli || rc=1
+      ;;
+    web)
+      check_server || rc=1
+      check_web || rc=1
+      ;;
+    electron)
+      check_electron || rc=1
+      ;;
+  esac
  if [[ $rc -eq 0 ]]; then
-    echo "all green — safe to start automated testing."
+    echo "$surface auth green — safe to start automated testing on this surface."
  else
-    echo "auth NOT ready — fix the ✘ items before writing any test step."
+    echo "$surface auth NOT ready — fix the ✘ items before writing any test step."
  fi
  return $rc
 }
@@ -105,23 +284,148 @@ cmd_cli() {
  LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server "$SERVER_URL"
 }

+write_cli_seed_env() {
+  mkdir -p "$(dirname "$CLI_ENV_FILE")"
+  cat > "$CLI_ENV_FILE" << EOF
+# Source this file before running LobeHub CLI agent tests.
+# Generated by setup-auth.sh cli-seed
+export LOBE_API_KEY="$SEED_API_KEY"
+export LOBEHUB_CLI_API_KEY="\${LOBE_API_KEY}"
+export LOBEHUB_SERVER="$SERVER_URL"
+export LOBEHUB_CLI_HOME=.lobehub-dev
+EOF
+}
+
+write_cli_settings() {
+  mkdir -p "$CLI_HOME"
+  python3 - "$CLI_HOME/settings.json" "$SERVER_URL" << 'PY'
+import json
+import os
+import sys
+
+path, server_url = sys.argv[1], sys.argv[2]
+os.makedirs(os.path.dirname(path), exist_ok=True)
+with open(path, "w") as f:
+    json.dump({"serverUrl": server_url}, f, indent=2)
+    f.write("\n")
+os.chmod(path, 0o600)
+PY
+}
+
+cmd_cli_seed() {
+  check_server || return 1
+  write_cli_seed_env
+  write_cli_settings
+  ok "wrote CLI seed env: $CLI_ENV_FILE"
+  note "source it before CLI commands: source $CLI_ENV_FILE"
+  note "settings saved at: $CLI_HOME/settings.json"
+  LOBE_API_KEY="$SEED_API_KEY" LOBEHUB_CLI_API_KEY="$SEED_API_KEY" check_cli
+}
+
+cmd_open_chrome() {
+  local mode="${1:-}"
+  if [[ "$mode" != "" && "$mode" != "--dry-run" ]]; then
+    echo "unknown open-chrome option: $mode" >&2
+    usage >&2
+    return 2
+  fi
+
+  if [[ "$mode" == "--dry-run" ]]; then
+    echo "would open Google Chrome at $SERVER_URL/"
+    echo "would press Cmd+Option+I to open DevTools"
+    echo "would open DevTools command menu and run 'Show Network'"
+    return 0
+  fi
+
+  if [[ "$(uname -s)" != "Darwin" ]]; then
+    bad "open-chrome is macOS-only"
+    note "open $SERVER_URL/ in your browser and open DevTools manually"
+    return 1
+  fi
+
+  if ! command -v osascript > /dev/null 2>&1; then
+    bad "osascript not found"
+    note "open $SERVER_URL/ in Chrome and press Cmd+Option+I manually"
+    return 1
+  fi
+
+  SERVER_URL="$SERVER_URL" osascript << 'OSA'
+set targetUrl to (system attribute "SERVER_URL") & "/"
+
+tell application "Google Chrome"
+  activate
+  if (count of windows) = 0 then
+    make new window
+  end if
+  tell front window to make new tab with properties {URL:targetUrl}
+end tell
+
+delay 1
+
+tell application "System Events"
+  tell process "Google Chrome"
+    set frontmost to true
+    keystroke "i" using {command down, option down}
+    delay 1
+    keystroke "p" using {command down, shift down}
+    delay 0.2
+    keystroke "Show Network"
+    key code 36
+  end tell
+end tell
+OSA
+  ok "opened Chrome at $SERVER_URL/ and requested DevTools Network panel"
+}
+
+cookie_header_from_jar() {
+  local jar="$1"
+  awk '
+    BEGIN { first = 1 }
+    /^$/ { next }
+    /^#/ {
+      if ($0 !~ /^#HttpOnly_/) next
+      sub(/^#HttpOnly_/, "")
+    }
+    NF >= 7 {
+      if (!first) printf "; "
+      printf "%s=%s", $6, $7
+      first = 0
+    }
+    END {
+      if (!first) printf "\n"
+    }
+  ' "$jar"
+}
+
 # Build a Playwright storageState file from a raw Cookie header on stdin,
 # keeping only the better-auth cookies. See references/auth.md for why the
 # header must come from a Network request (HttpOnly) and why httpOnly=false.
 cmd_web() {
  mkdir -p "$AUTH_DIR"
-  python3 - "$STATE_FILE" << 'PY'
-import json, sys, time
+  local raw
+  raw="$(cat)"
+  COOKIE_INPUT="$raw" python3 - "$STATE_FILE" << 'PY'
+import json, os, sys, time

-raw = sys.stdin.read().strip()
-if raw.lower().startswith("cookie:"):
-    raw = raw.split(":", 1)[1].strip()
+raw = os.environ.get("COOKIE_INPUT", "").strip()
+cookie_lines = []
+for line in raw.splitlines():
+    stripped = line.strip()
+    if not stripped:
+        continue
+    if stripped.lower().startswith("cookie:"):
+        cookie_lines.append(stripped.split(":", 1)[1].strip())
+    else:
+        cookie_lines.append(stripped)

-WANTED = {"better-auth.session_token", "better-auth.state"}
+raw = "; ".join(cookie_lines)
+
+WANTED = {"better-auth.session_token", "better-auth.session_data", "better-auth.state"}
 exp = int(time.time()) + 30 * 24 * 3600  # 30 days

 cookies = []
-for pair in raw.split("; "):
+for pair in raw.split(";"):
+    pair = pair.strip()
    if "=" not in pair:
        continue
    name, _, value = pair.partition("=")
@@ -146,14 +450,79 @@ with open(sys.argv[1], "w") as f:
    json.dump({"cookies": cookies, "origins": []}, f, indent=2)
 print(f"wrote {len(cookies)} cookie(s) to {sys.argv[1]}")
 PY
-  agent-browser --session "$SESSION" state load "$STATE_FILE"
  cmd_web_verify
 }

+cmd_web_seed() {
+  check_server || return 1
+  mkdir -p "$AUTH_DIR"
+
+  local cookie_jar="$AUTH_DIR/web-seed-cookie.jar"
+  local response_body="$AUTH_DIR/web-seed-response.json"
+  local payload code
+  payload="$(
+    SEED_EMAIL="$SEED_EMAIL" SEED_PASSWORD="$SEED_PASSWORD" python3 - << 'PY'
+import json
+import os
+
+print(json.dumps({
+    "callbackURL": "/",
+    "email": os.environ["SEED_EMAIL"],
+    "password": os.environ["SEED_PASSWORD"],
+}))
+PY
+  )"
+
+  code=$(curl -sS -o "$response_body" -w '%{http_code}' \
+    -c "$cookie_jar" \
+    -H 'Content-Type: application/json' \
+    -X POST "$SERVER_URL/api/auth/sign-in/email" \
+    --data "$payload" 2> /dev/null || true)
+
+  if [[ ! "$code" =~ ^[23] ]]; then
+    bad "seed user sign-in failed at $SERVER_URL/api/auth/sign-in/email (http_code='$code')"
+    note "make sure the seed user exists:"
+    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
+    return 1
+  fi
+
+  local cookie_header
+  cookie_header="$(cookie_header_from_jar "$cookie_jar")"
+  if [[ -z "$cookie_header" ]]; then
+    bad "seed sign-in succeeded but no cookies were written to $cookie_jar"
+    return 1
+  fi
+
+  printf '%s\n' "$cookie_header" | cmd_web
+}
+
 cmd_web_verify() {
-  agent-browser --session "$SESSION" open "$SERVER_URL/" > /dev/null
+  local skip_server_check="${1:-}"
+  if [[ "$skip_server_check" != "--skip-server-check" ]]; then
+    check_server || return 1
+  fi
+  if [[ ! -f "$STATE_FILE" ]]; then
+    bad "no web auth state for agent-browser"
+    note "for the seeded local user, run: $0 web-seed"
+    note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
+    note "pbpaste | $0 web"
+    return 1
+  fi
+  check_agent_browser || return 1
+  if ! agent-browser --session "$SESSION" state load "$STATE_FILE" > /dev/null; then
+    bad "failed to load web auth state into agent-browser session '$SESSION'"
+    return 1
+  fi
+  if ! agent-browser --session "$SESSION" open "$SERVER_URL/" > /dev/null; then
+    bad "failed to open $SERVER_URL in agent-browser session '$SESSION'"
+    return 1
+  fi
  local url
-  url=$(agent-browser --session "$SESSION" get url)
+  url=$(agent-browser --session "$SESSION" get url 2> /dev/null || true)
+  if [[ -z "$url" ]]; then
+    bad "agent-browser session '$SESSION' did not report a current URL"
+    return 1
+  fi
  if [[ "$url" == *"/signin"* || "$url" == *"/login"* ]]; then
    bad "agent-browser session '$SESSION' NOT authenticated (landed on $url)"
    note "re-copy the Cookie header and re-run: pbpaste | $0 web"
@@ -163,12 +532,22 @@ cmd_web_verify() {
 }

 case "${1:-status}" in
-  status) cmd_status ;;
+  status)
+    shift || true
+    cmd_status "$@"
+    ;;
+  cli-seed) cmd_cli_seed ;;
  cli) cmd_cli ;;
+  open-chrome)
+    shift || true
+    cmd_open_chrome "$@"
+    ;;
+  web-seed) cmd_web_seed ;;
  web) cmd_web ;;
  web-verify) cmd_web_verify ;;
+  -h|--help) usage ;;
  *)
-    echo "Usage: $0 {status|cli|web|web-verify}" >&2
+    echo "Usage: $0 {status|cli-seed|cli|open-chrome|web-seed|web|web-verify}" >&2
    exit 2
    ;;
 esac
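The cookie filtering that `cmd_web` performs in its embedded Python can be sketched in plain shell, which may help when checking a copied header by hand. This is an illustration under assumptions: `keep_auth_cookies` is a hypothetical name, it only prints the filtered header and does not write a storageState file, and it recognizes a `Cookie:` prefix only in the `Cookie`/`cookie` spellings.

```shell
# Hypothetical helper (not in setup-auth.sh): read a raw Cookie header on
# stdin and print only the better-auth cookies, joined with "; ".
keep_auth_cookies() {
  sed 's/^[Cc]ookie:[[:space:]]*//' |
    tr ';' '\n' |
    awk '
      {
        # Trim surrounding blanks from each name=value pair.
        sub(/^[ \t]+/, "")
        sub(/[ \t]+$/, "")
        eq = index($0, "=")
        if (eq == 0) next
        name = substr($0, 1, eq - 1)
        if (name == "better-auth.session_token" ||
            name == "better-auth.session_data" ||
            name == "better-auth.state") {
          printf "%s%s", sep, $0
          sep = "; "
        }
      }
      END { if (sep != "") printf "\n" }
    '
}

printf 'Cookie: theme=dark; better-auth.state=abc\n' | keep_auth_cookies   # prints better-auth.state=abc
```

Empty output means the pasted header carried none of the three cookies, which corresponds to the failure case the smoke test exercises with `foo=bar`.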
@@ -0,0 +1,197 @@
+#!/usr/bin/env bash
+# Smoke tests for setup-auth.sh. Uses a temporary agent-browser stub and local
+# HTTP server, so it does not need real browser auth.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SCRIPT="$SCRIPT_DIR/setup-auth.sh"
+
+fail() {
+  echo "FAIL: $*" >&2
+  exit 1
+}
+
+assert_contains() {
+  local file="$1"
+  local text="$2"
+  grep -Fq "$text" "$file" || fail "expected '$text' in $file"
+}
+
+tmp_dir="$(mktemp -d)"
+server_pid=""
+
+cleanup() {
+  if [[ -n "$server_pid" ]]; then
+    kill "$server_pid" > /dev/null 2>&1 || true
+    wait "$server_pid" > /dev/null 2>&1 || true
+  fi
+  rm -rf "$tmp_dir"
+}
+trap cleanup EXIT
+export HOME="$tmp_dir/home"
+
+port="$(python3 - << 'PY'
+import socket
+
+sock = socket.socket()
+sock.bind(("127.0.0.1", 0))
+print(sock.getsockname()[1])
+sock.close()
+PY
+)"
+
+python3 - "$port" << 'PY' > "$tmp_dir/http.log" 2>&1 &
+from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
+import sys
+
+
+class Handler(BaseHTTPRequestHandler):
+    def do_GET(self):
+        if self.path.startswith("/api/v1/users/me"):
+            if self.headers.get("authorization") != "Bearer sk-lh-agenttesting0001":
+                self.send_response(401)
+                self.end_headers()
+                self.wfile.write(b'{"success":false}')
+                return
+
+            self.send_response(200)
+            self.send_header("Content-Type", "application/json")
+            self.end_headers()
+            self.wfile.write(b'{"success":true,"data":{"id":"user_agent_testing_001"}}')
+            return
+
+        self.send_response(200)
+        self.end_headers()
+        self.wfile.write(b"ok")
+
+    def do_POST(self):
+        length = int(self.headers.get("content-length") or "0")
+        if length:
+            self.rfile.read(length)
+
+        if self.path != "/api/auth/sign-in/email":
+            self.send_response(404)
+            self.end_headers()
+            return
+
+        self.send_response(200)
+        self.send_header(
+            "Set-Cookie",
+            "better-auth.session_token=seed.token; Path=/; HttpOnly; SameSite=Lax",
+        )
+        self.send_header(
+            "Set-Cookie",
+            "better-auth.session_data=seed.data; Path=/; HttpOnly; SameSite=Lax",
+        )
+        self.send_header("Content-Type", "application/json")
+        self.end_headers()
+        self.wfile.write(b'{"ok":true}')
+
+    def log_message(self, format, *args):
+        return
+
+
+ThreadingHTTPServer(("localhost", int(sys.argv[1])), Handler).serve_forever()
+PY
+server_pid="$!"
+
+server_url="http://localhost:$port"
+for _ in {1..50}; do
+  if curl -s -o /dev/null "$server_url/"; then
+    break
+  fi
+  sleep 0.1
+done
+curl -s -o /dev/null "$server_url/" || fail "test HTTP server did not start"
+
+mkdir -p "$tmp_dir/bin" "$tmp_dir/auth"
+cat > "$tmp_dir/bin/agent-browser" << 'SH'
+#!/usr/bin/env bash
+set -euo pipefail
+
+if [[ "${1:-}" == "--session" ]]; then
+  shift 2
+fi
+
+case "${1:-}" in
+  state)
+    [[ "${2:-}" == "load" ]] || exit 2
+    [[ -f "${3:-}" ]] || exit 1
+    ;;
+  open)
+    printf '%s\n' "${2:-}" > "${AGENT_BROWSER_URL_FILE:?}"
+    ;;
+  get)
+    [[ "${2:-}" == "url" ]] || exit 2
+    cat "${AGENT_BROWSER_URL_FILE:?}"
+    ;;
+  *)
+    echo "unexpected agent-browser command: $*" >&2
+    exit 2
+    ;;
+esac
+SH
+chmod +x "$tmp_dir/bin/agent-browser"
+
+export PATH="$tmp_dir/bin:$PATH"
+export AUTH_DIR="$tmp_dir/auth"
+export SESSION="setup-auth-test"
+export SERVER_URL="$server_url"
+export AGENT_BROWSER_URL_FILE="$tmp_dir/current-url"
+
+cookie_header="Cookie: foo=bar; better-auth.session_token=test.token; better-auth.session_data=encoded%3D; theme=dark"
+printf '%s\n' "$cookie_header" | "$SCRIPT" web > "$tmp_dir/web.out"
+
+python3 - "$AUTH_DIR/web-state.json" << 'PY'
+import json, sys
+
+with open(sys.argv[1]) as f:
+    state = json.load(f)
+
+names = {cookie["name"] for cookie in state["cookies"]}
+expected = {"better-auth.session_token", "better-auth.session_data"}
+if names != expected:
+    raise SystemExit(f"unexpected cookies: {sorted(names)}")
+PY
+
+"$SCRIPT" web-seed > "$tmp_dir/web-seed.out"
+
+python3 - "$AUTH_DIR/web-state.json" << 'PY'
+import json, sys
+
+with open(sys.argv[1]) as f:
+    state = json.load(f)
+
+values = {cookie["name"]: cookie["value"] for cookie in state["cookies"]}
+expected = {
+    "better-auth.session_token": "seed.token",
+    "better-auth.session_data": "seed.data",
+}
+if values != expected:
+    raise SystemExit(f"unexpected seeded cookies: {values}")
+PY
+
+"$SCRIPT" status --surface web > "$tmp_dir/status.out"
+assert_contains "$tmp_dir/status.out" "surface=web"
+assert_contains "$tmp_dir/status.out" "web auth green"
+
+"$SCRIPT" cli-seed > "$tmp_dir/cli-seed.out"
+assert_contains "$tmp_dir/cli-seed.out" "CLI API-key auth valid"
+assert_contains "$tmp_dir/cli-seed.out" "settings saved at: $HOME/.lobehub-dev/settings.json"
+
+if "$SCRIPT" status --surface cli > "$tmp_dir/cli-no-env.out"; then
+  fail "cli status without API key unexpectedly passed"
+fi
+assert_contains "$tmp_dir/cli-no-env.out" "CLI not logged in"
+
+LOBEHUB_CLI_API_KEY=sk-lh-agenttesting0001 "$SCRIPT" status --surface cli > "$tmp_dir/cli-status.out"
+assert_contains "$tmp_dir/cli-status.out" "CLI API-key auth valid"
+assert_contains "$tmp_dir/cli-status.out" "cli auth green"
+
+if printf 'foo=bar\n' | "$SCRIPT" web > "$tmp_dir/invalid.out" 2> "$tmp_dir/invalid.err"; then
+  fail "invalid cookie unexpectedly passed"
+fi
+assert_contains "$tmp_dir/invalid.err" "no better-auth cookies found"
+
+echo "setup-auth tests passed"
@@ -0,0 +1,377 @@
+#!/usr/bin/env bash
+# Print the resolved local test environment for agent-testing.
+#
+# This is intentionally read-only. It mirrors scripts/runWithEnv.mts precedence:
+# .env -> .env.$NODE_ENV -> .env.local -> .env.$NODE_ENV.local, then shell env.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+REPO_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)"
+NODE_ENV="${NODE_ENV:-development}"
+
+VALUE_APP_URL=""
+VALUE_PORT=""
+VALUE_SERVER_URL=""
+VALUE_AUTH_TRUSTED_ORIGINS=""
+VALUE_SPA_PORT=""
+VALUE_MOBILE_SPA_PORT=""
+VALUE_DESKTOP_PORT=""
+
+SOURCE_APP_URL=""
+SOURCE_PORT=""
+SOURCE_SERVER_URL=""
+SOURCE_AUTH_TRUSTED_ORIGINS=""
+SOURCE_SPA_PORT=""
+SOURCE_MOBILE_SPA_PORT=""
+SOURCE_DESKTOP_PORT=""
+
+LOADED_ENV_FILES=""
+
+keys() {
+  printf '%s\n' \
+    APP_URL \
+    PORT \
+    SERVER_URL \
+    AUTH_TRUSTED_ORIGINS \
+    SPA_PORT \
+    MOBILE_SPA_PORT \
+    DESKTOP_PORT
+}
+
+trim() {
+  local value="$1"
+  value="${value#"${value%%[![:space:]]*}"}"
+  value="${value%"${value##*[![:space:]]}"}"
+  printf '%s' "$value"
+}
+
+workspace_root() {
+  local root="$REPO_ROOT"
+  local name
+  name="$(basename "$root")"
+
+  if [[ "$name" == "lobehub" ]]; then
+    local parent parent_name
+    parent="$(cd "$root/.." && pwd)"
+    parent_name="$(basename "$parent")"
+    if [[ "$parent_name" == lobehub-cloud* ]]; then
+      root="$parent"
+    fi
+  fi
+
+  printf '%s\n' "$root"
+}
+
+workspace_offset() {
+  local name="$1"
+
+  case "$name" in
+    lobehub-cloud)
+      printf '0\n'
+      ;;
+    lobehub-cloud-*)
+      local suffix="${name#lobehub-cloud-}"
+      if [[ "$suffix" =~ ^[0-9]+$ ]]; then
+        printf '%s\n' "$((10#$suffix))"
+      else
+        printf '\n'
+      fi
+      ;;
+    *)
+      printf '\n'
+      ;;
+  esac
+}
+
+default_port() {
+  local base="$1"
+  local fallback="$2"
+  local root name offset
+  root="$(workspace_root)"
+  name="$(basename "$root")"
+  offset="$(workspace_offset "$name")"
+
+  if [[ -n "$offset" ]]; then
+    printf '%s\n' "$((base + offset))"
+  else
+    printf '%s\n' "$fallback"
+  fi
+}
+
+url_port() {
+  local url="$1"
+  local hostport
+  hostport="${url#*://}"
+  hostport="${hostport%%/*}"
+
+  if [[ "$hostport" == *:* ]]; then
+    local port="${hostport##*:}"
+    if [[ "$port" =~ ^[0-9]+$ ]]; then
+      printf '%s\n' "$port"
+      return 0
+    fi
+  fi
+
+  return 1
+}
+
+url_origin() {
+  local url="$1"
+  local scheme rest hostport
+  if [[ "$url" == *"://"* ]]; then
+    scheme="${url%%://*}"
+    rest="${url#*://}"
+    hostport="${rest%%/*}"
+    printf '%s://%s\n' "$scheme" "$hostport"
+  else
+    printf '%s\n' "$url"
+  fi
+}
+
+set_value() {
+  local key="$1"
+  local value="$2"
+  local source="$3"
+
+  case "$key" in
+    APP_URL) VALUE_APP_URL="$value"; SOURCE_APP_URL="$source" ;;
+    PORT) VALUE_PORT="$value"; SOURCE_PORT="$source" ;;
+    SERVER_URL) VALUE_SERVER_URL="$value"; SOURCE_SERVER_URL="$source" ;;
+    AUTH_TRUSTED_ORIGINS) VALUE_AUTH_TRUSTED_ORIGINS="$value"; SOURCE_AUTH_TRUSTED_ORIGINS="$source" ;;
+    SPA_PORT) VALUE_SPA_PORT="$value"; SOURCE_SPA_PORT="$source" ;;
+    MOBILE_SPA_PORT) VALUE_MOBILE_SPA_PORT="$value"; SOURCE_MOBILE_SPA_PORT="$source" ;;
+    DESKTOP_PORT) VALUE_DESKTOP_PORT="$value"; SOURCE_DESKTOP_PORT="$source" ;;
+  esac
+}
+
+value_for() {
+  case "$1" in
+    APP_URL) printf '%s\n' "$VALUE_APP_URL" ;;
+    PORT) printf '%s\n' "$VALUE_PORT" ;;
+    SERVER_URL) printf '%s\n' "$VALUE_SERVER_URL" ;;
+    AUTH_TRUSTED_ORIGINS) printf '%s\n' "$VALUE_AUTH_TRUSTED_ORIGINS" ;;
+    SPA_PORT) printf '%s\n' "$VALUE_SPA_PORT" ;;
+    MOBILE_SPA_PORT) printf '%s\n' "$VALUE_MOBILE_SPA_PORT" ;;
+    DESKTOP_PORT) printf '%s\n' "$VALUE_DESKTOP_PORT" ;;
+  esac
+}
+
+source_for() {
+  case "$1" in
+    APP_URL) printf '%s\n' "$SOURCE_APP_URL" ;;
+    PORT) printf '%s\n' "$SOURCE_PORT" ;;
+    SERVER_URL) printf '%s\n' "$SOURCE_SERVER_URL" ;;
+    AUTH_TRUSTED_ORIGINS) printf '%s\n' "$SOURCE_AUTH_TRUSTED_ORIGINS" ;;
+    SPA_PORT) printf '%s\n' "$SOURCE_SPA_PORT" ;;
+    MOBILE_SPA_PORT) printf '%s\n' "$SOURCE_MOBILE_SPA_PORT" ;;
+    DESKTOP_PORT) printf '%s\n' "$SOURCE_DESKTOP_PORT" ;;
+  esac
+}
+
+is_tracked_key() {
+  case "$1" in
+    APP_URL|PORT|SERVER_URL|AUTH_TRUSTED_ORIGINS|SPA_PORT|MOBILE_SPA_PORT|DESKTOP_PORT) return 0 ;;
+    *) return 1 ;;
+  esac
+}
+
+parse_env_file() {
+  local file="$1"
+  local root="$2"
+  local label="${file#"$root"/}"
+  local line key value
+
+  [[ -f "$file" ]] || return 0
+  if [[ -z "$LOADED_ENV_FILES" ]]; then
+    LOADED_ENV_FILES="$label"
+  else
+    LOADED_ENV_FILES="$LOADED_ENV_FILES, $label"
+  fi
+
+  while IFS= read -r line || [[ -n "$line" ]]; do
+    line="$(trim "$line")"
+    [[ -z "$line" || "$line" == \#* ]] && continue
+
+    if [[ "$line" == export[[:space:]]* ]]; then
+      line="$(trim "${line#export}")"
+    fi
+
+    [[ "$line" == *=* ]] || continue
+    key="$(trim "${line%%=*}")"
+    value="$(trim "${line#*=}")"
+    is_tracked_key "$key" || continue
+
+    if [[ "$value" == \"* && "$value" == *\" && ${#value} -ge 2 ]]; then
+      value="${value:1:${#value}-2}"
+    elif [[ "$value" == \'* && "$value" == *\' && ${#value} -ge 2 ]]; then
+      value="${value:1:${#value}-2}"
+    fi
+
+    set_value "$key" "$value" "$label"
+  done < "$file"
+}
+
+apply_env_files() {
+  local root="$1"
+  parse_env_file "$root/.env" "$root"
+  parse_env_file "$root/.env.$NODE_ENV" "$root"
+  parse_env_file "$root/.env.local" "$root"
+  parse_env_file "$root/.env.$NODE_ENV.local" "$root"
+}
+
+apply_shell_overrides() {
+  local key value
+  while IFS= read -r key; do
+    if [[ -n "${!key+x}" ]]; then
+      value="${!key}"
+      set_value "$key" "$value" "shell"
+    fi
+  done < <(keys)
+}
+
+resolve_defaults() {
+  local app_port spa_port mobile_spa_port desktop_port
+  app_port="$(default_port 3020 3010)"
+  spa_port="$(default_port 9800 9876)"
+  mobile_spa_port="$(default_port 3810 3012)"
+  desktop_port="$(default_port 3030 3015)"
+
+  if [[ -z "$VALUE_APP_URL" ]]; then
+    set_value APP_URL "http://localhost:$app_port" "inferred"
+  fi
+
+  if [[ -z "$VALUE_PORT" ]]; then
+    if app_port="$(url_port "$VALUE_APP_URL")"; then
+      set_value PORT "$app_port" "inferred from APP_URL"
+    else
+      set_value PORT "$(default_port 3020 3010)" "inferred"
+    fi
+  fi
+
+  if [[ -z "$VALUE_SERVER_URL" ]]; then
+    set_value SERVER_URL "$VALUE_APP_URL" "from APP_URL"
+  fi
+
+  if [[ -z "$VALUE_SPA_PORT" ]]; then
+    set_value SPA_PORT "$spa_port" "inferred"
+  fi
+
+  if [[ -z "$VALUE_MOBILE_SPA_PORT" ]]; then
+    set_value MOBILE_SPA_PORT "$mobile_spa_port" "inferred"
+  fi
+
+  if [[ -z "$VALUE_DESKTOP_PORT" ]]; then
+    set_value DESKTOP_PORT "$desktop_port" "inferred"
+  fi
+
+  if [[ -z "$VALUE_AUTH_TRUSTED_ORIGINS" ]]; then
+    set_value AUTH_TRUSTED_ORIGINS "$(url_origin "$VALUE_APP_URL"),http://localhost:$VALUE_SPA_PORT" "inferred"
+  fi
+}
+
+contains_origin() {
+  local list="$1"
+  local expected="$2"
+  local item items
+  IFS=',' read -r -a items <<< "$list"
+  for item in "${items[@]}"; do
+    item="$(trim "$item")"
+    [[ "$item" == "$expected" ]] && return 0
+  done
+  return 1
+}
+
+print_exports() {
+  local key value
+  while IFS= read -r key; do
+    value="$(value_for "$key")"
+    printf 'export %s=%q\n' "$key" "$value"
+  done < <(keys)
+}
+
+print_value() {
+  local key="$1"
+  if ! is_tracked_key "$key"; then
+    echo "unknown key: $key" >&2
+    exit 2
+  fi
+  value_for "$key"
+}
+
+print_human() {
+  local root="$1"
+  local key value source
+
+  echo "agent-testing test env:"
+  printf '  workspace: %s\n' "$root"
+  printf '  NODE_ENV: %s\n' "$NODE_ENV"
+  printf '  env files: %s\n' "${LOADED_ENV_FILES:-none}"
+  echo
+  echo "resolved values:"
+  while IFS= read -r key; do
+    value="$(value_for "$key")"
+    source="$(source_for "$key")"
+    printf '  %-22s   (%s)\n' "$key=$value" "$source"
+  done < <(keys)
+  echo
+  echo "checks:"
+
+  local app_origin spa_origin app_port
+  app_origin="$(url_origin "$VALUE_APP_URL")"
+  spa_origin="http://localhost:$VALUE_SPA_PORT"
+  if app_port="$(url_port "$VALUE_APP_URL")" && [[ "$app_port" == "$VALUE_PORT" ]]; then
+    printf '  OK   PORT matches APP_URL (%s)\n' "$VALUE_PORT"
+  else
+    printf '  WARN PORT (%s) does not match APP_URL (%s)\n' "$VALUE_PORT" "$VALUE_APP_URL"
+  fi
+
+  if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$app_origin"; then
+    printf '  OK   AUTH_TRUSTED_ORIGINS includes %s\n' "$app_origin"
+  else
+    printf '  WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$app_origin"
+  fi
+
+  if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$spa_origin"; then
+    printf '  OK   AUTH_TRUSTED_ORIGINS includes %s\n' "$spa_origin"
+  else
+    printf '  WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$spa_origin"
+  fi
+}
+
+usage() {
+  cat << EOF
+Usage:
+  $0                 # print resolved test environment
+  $0 --exports       # print source-able export lines
+  $0 --value KEY     # print one resolved value
+
+Tracked keys:
+  APP_URL PORT SERVER_URL AUTH_TRUSTED_ORIGINS SPA_PORT MOBILE_SPA_PORT DESKTOP_PORT
+EOF
+}
+
+ROOT="$(workspace_root)"
+apply_env_files "$ROOT"
+apply_shell_overrides
+resolve_defaults
+
+case "${1:-}" in
+  "")
+    print_human "$ROOT"
+    ;;
+  --exports)
+    print_exports
+    ;;
+  --value)
+    print_value "${2:-}"
+    ;;
+  -h|--help)
+    usage
+    ;;
+  *)
+    echo "unknown option: $1" >&2
+    usage >&2
+    exit 2
+    ;;
+esac
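The precedence that `test-env.sh` documents in its header (`.env`, then `.env.$NODE_ENV`, then `.env.local`, then `.env.$NODE_ENV.local`, with the shell environment winning over all of them) can be sketched as a single lookup. This is a simplified illustration, not the script's implementation: `resolve_key` is a hypothetical name, it only sees exported variables, and it skips the quote stripping and inferred defaults that the real script applies.

```shell
# Hypothetical helper (not in test-env.sh).
# Usage: resolve_key KEY FILE...   (files listed from lowest to highest precedence)
resolve_key() (
  key="$1"
  shift

  # An exported shell variable beats every env file.
  if value="$(printenv "$key")"; then
    printf '%s\n' "$value"
    exit 0
  fi

  # Otherwise read the files that exist, in order, and keep the last assignment.
  for file in "$@"; do
    if [ -f "$file" ]; then
      cat "$file"
      echo   # guard against a missing trailing newline
    fi
  done | awk -v key="$key" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)
      if (line ~ /^#/) next
      sub(/^export[ \t]+/, "", line)
      eq = index(line, "=")
      if (eq == 0) next
      k = substr(line, 1, eq - 1)
      sub(/[ \t]+$/, "", k)
      if (k != key) next
      v = substr(line, eq + 1)
      sub(/^[ \t]+/, "", v)
      sub(/[ \t]+$/, "", v)
      value = v
    }
    END { print value }
  '
)

# Example call, assuming the current directory is the workspace root:
# resolve_key SPA_PORT .env .env.development .env.local .env.development.local
```

Later files override earlier ones simply because their assignments are read last; missing files are skipped, as `parse_env_file` does.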
@@ -0,0 +1,57 @@
+#!/usr/bin/env bash
+# Smoke tests for test-env.sh.
+
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+fail() {
+  echo "FAIL: $*" >&2
+  exit 1
+}
+
+assert_eq() {
+  local actual="$1"
+  local expected="$2"
+  [[ "$actual" == "$expected" ]] || fail "expected '$expected', got '$actual'"
+}
+
+assert_contains() {
+  local file="$1"
+  local text="$2"
+  grep -Fq "$text" "$file" || fail "expected '$text' in $file"
+}
+
+tmp_dir="$(mktemp -d)"
+trap 'rm -rf "$tmp_dir"' EXIT
+
+mkdir -p "$tmp_dir/lobehub-cloud-1/.agents/skills" "$tmp_dir/lobehub/.agents/skills"
+ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing"
+ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub/.agents/skills/agent-testing"
+
+cloud_script="$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing/scripts/test-env.sh"
+oss_script="$tmp_dir/lobehub/.agents/skills/agent-testing/scripts/test-env.sh"
+
+assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:3021"
+assert_eq "$("$cloud_script" --value SPA_PORT)" "9801"
+assert_eq "$("$cloud_script" --value MOBILE_SPA_PORT)" "3811"
+assert_eq "$("$cloud_script" --value DESKTOP_PORT)" "3031"
+assert_eq "$("$oss_script" --value SERVER_URL)" "http://localhost:3010"
+
+cat > "$tmp_dir/lobehub-cloud-1/.env" << 'EOF'
+APP_URL=http://localhost:4123
+PORT=4123
+AUTH_TRUSTED_ORIGINS=http://localhost:4123,http://localhost:9823
+SPA_PORT=9823
+MOBILE_SPA_PORT=3823
+DESKTOP_PORT=3043
+EOF
+
+assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:4123"
+assert_eq "$("$cloud_script" --value SPA_PORT)" "9823"
+"$cloud_script" --exports > "$tmp_dir/exports.out"
+assert_contains "$tmp_dir/exports.out" "export APP_URL=http://localhost:4123"
+assert_contains "$tmp_dir/exports.out" "export SERVER_URL=http://localhost:4123"
+assert_contains "$tmp_dir/exports.out" "export AUTH_TRUSTED_ORIGINS=http://localhost:4123\\,http://localhost:9823"
+
+echo "test-env tests passed"
@@ -10,23 +10,32 @@ backend-only changes prefer [../cli/index.md](../cli/index.md).

 ## Prerequisites

+- Complete [Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) (resolve ports) and [Step -1](../SKILL.md#step--1--plan-approval-for-non-trivial-tests) (plan approval) first.
 - Local dev server running — [../references/dev-server.md](../references/dev-server.md)
-- Web auth injected into agent-browser — [../references/auth.md](../references/auth.md):
+- Web auth verified in agent-browser — prefer `setup-auth.sh web-seed`, see [auth decision flow](../references/auth.md#web--decision-flow).
+
+## Option A — agent-browser with seeded auth (recommended)

 ```bash
-pbpaste | ./.agents/skills/agent-testing/scripts/setup-auth.sh web # after copying the Cookie header
+./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
+./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
 ```

-## Option A — agent-browser with injected auth (recommended)
+Then drive the verified session:

 ```bash
 SESSION=lobehub-dev

-agent-browser --session $SESSION open "http://localhost:3010/"
+agent-browser --session $SESSION open "$SERVER_URL/"
 agent-browser --session $SESSION snapshot -i
 # interact via refs — full command reference: ../references/agent-browser.md
 ```

+Use this session as the evidence source. Do not use ordinary Chrome screenshots
+or Chrome Network records as proof for Web tests; ordinary Chrome is only a
+fallback source for copying cookies into agent-browser when the seeded login is
+not available.
+
 ### Watch the API while driving the UI

 ```bash
@@ -53,6 +53,12 @@ For Modal specifically, see the dedicated **modal** skill — use the imperative
 | Layout       | Center, DraggablePanel, Flexbox, Grid, Header, MaskShadow                             |
 | Navigation   | Burger, Menu, SideNav, Tabs                                                           |

+## Loading indicators
+
+**Do NOT use antd `Spin` / `<Spin />`.** Use a project loader
+(`NeuralNetworkLoading`, `DotsLoading`, …) — see the **ux** skill ("Loading
+visuals") for the component table and when to use each.
+
 ## State

 When a feature component manages more than 3 pieces of state (`useState`/`useReducer`/derived state), extract the logic into a custom hook (e.g. `useXxx`). Keep the component focused on rendering — the hook holds state and handlers, so logic can be unit-tested without rendering the component.
@@ -112,6 +118,7 @@ errorElement: <ErrorBoundary />;
 | ------------------------------------------------------------------ | --------------------------------------------------------------------------- |
 | Using `next/link` in SPA                                           | Use `react-router-dom` `Link`                                               |
 | Using antd directly                                                | Use `@lobehub/ui/base-ui` first, then `@lobehub/ui`                         |
+| antd `Spin` / `<Spin />` for loading                               | Use `NeuralNetworkLoading` / project loaders (see the **ux** skill)         |
 | `import { Select } from '@lobehub/ui'`                             | `import { Select } from '@lobehub/ui/base-ui'`                              |
 | `import { Modal } from '@lobehub/ui'` + `<Modal open>` declarative | `createModal` / `confirmModal` from `@lobehub/ui/base-ui` (see modal skill) |
 | `import { DropdownMenu/Popover/Switch } from '@lobehub/ui'`        | Import same name from `@lobehub/ui/base-ui` instead                         |
@@ -43,6 +43,9 @@ cd packages/database && TEST_SERVER_DB=1 bunx vitest run --silent='passed-only'
 2. **Tests must pass type check** - Run `bun run type-check` after writing tests
 3. **After 1-2 failed fix attempts, stop and ask for help**
 4. **Test behavior, not implementation details**
+5. **Regression tests for bug fixes** - After fixing a bug, add a regression test that fails before the fix and passes after, to prevent recurrence
+6. **No new component tests** - Only update existing React component tests. Complex logic should be extracted into hooks and tested there instead
+7. **All source changes before any test changes** - Complete all source file edits first, then update tests in a separate pass. Interleaving disrupts reasoning about the source changes, especially across many files

 ## Basic Test Structure

@@ -0,0 +1,176 @@
+---
+name: ux
+description: 'LobeHub product design values / principles / checklists. Load this skill whenever the work touches user-interface features or implementation — designing or building any user-facing flow — to get better UX results.'
+user-invocable: false
+---
+
+# UX — Design Values & Execution Checklists
+
+How LobeHub products should feel, and concrete rules to get there. Use this when
+**building or reviewing** any user-facing flow. For component/styling choices see
+**react**, for wording see **microcopy**, for imperative modal wiring see **modal**.
+
+## Design values (设计价值观)
+
+LobeHub follows four product design values — **自然 Natural・意义感 Meaningful・
+确定性 Certainty・生长性 Growth**. Read them before designing:
+**[references/design-values.md](references/design-values.md)** (definitions +
+conflict priority).
+
+> The checklists below are the execution layer. Each item is tagged with the
+> value(s) it serves; for what those values mean, see the file above.
+
+## 1. Flow & momentum (操作链路)・自然・意义感
+
+Every action chain must **push the user forward**, never dead-end or block the flow.
+
+- [ ] **Forward momentum** — after any operation, lead the user to the next step,
+      don't just stop. _(意义感)_
+- [ ] **Success state = primary "go to result", secondary "dismiss"** — the strong
+      button is the forward action (take me to the result); "Done" is the weak/
+      secondary button. ✅ After moving topics: primary = "Go to «target»", secondary
+      \= "Done". _(意义感・自然)_
+- [ ] **Bulk ⇄ single-item parity** — an action on a multi-select toolbar must also
+      be reachable on a single item (its context menu), and vice versa. _(确定性)_
+- [ ] **Confirm → in-progress → done, in one surface** — bulk/irreversible/async
+      ops use a modal state machine: a confirm step stating exactly what happens →
+      an in-progress view with **dismissal locked** → a done (or error) view in the
+      same modal. Never fire-and-forget with only a toast; never leave a dead
+      spinner. _(确定性・意义感)_
+
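The confirm → in-progress → done rule above can be sketched as a small pure state machine. This is only an illustration under assumptions: the names `BulkOpState`, `BulkOpEvent`, `transition`, and `canDismiss` are hypothetical and are not the project's `createModal` wiring (see the **modal** skill for that).

```typescript
// Hypothetical types for illustration; not part of the LobeHub codebase.
type BulkOpState =
  | { step: 'confirm' }
  | { step: 'inProgress' }
  | { step: 'done' }
  | { step: 'error'; reason: string };

type BulkOpEvent =
  | { type: 'CONFIRM' }
  | { type: 'SUCCEED' }
  | { type: 'FAIL'; reason: string }
  | { type: 'RETRY' };

// Pure transition function: events that are invalid for the current step are ignored.
function transition(state: BulkOpState, event: BulkOpEvent): BulkOpState {
  switch (state.step) {
    case 'confirm':
      return event.type === 'CONFIRM' ? { step: 'inProgress' } : state;
    case 'inProgress':
      if (event.type === 'SUCCEED') return { step: 'done' };
      if (event.type === 'FAIL') return { step: 'error', reason: event.reason };
      return state;
    case 'error':
      return event.type === 'RETRY' ? { step: 'inProgress' } : state;
    case 'done':
      return state;
  }
}

// Dismissal is locked only while the operation is running.
function canDismiss(state: BulkOpState): boolean {
  return state.step !== 'inProgress';
}
```

Keeping the transitions in one function means the same modal surface can render each step from `state.step`, and the "dismissal locked" rule becomes a single derived check rather than scattered flags.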
+## 2. States: empty / loading / error (状态设计)・意义感・确定性
+
+Every data surface has **four** states — design all of them, not just "has data".
+
+- [ ] **Empty state is a purpose-built page, not a blank screen.** It explains what
+      this is, why it's empty, and gives a clear next action (CTA + value props).
+      ✅ Devices: an empty "Connect your first device" page with primary/secondary
+      connect paths and "what you can do once connected" cards — ❌ not a bare title
+      over skeleton rows or a blank body. _(意义感)_
+- [ ] **Distinguish the empty variants** — "no data yet" (onboarding CTA) vs
+      "no match for filters" (clear-filters affordance) are different screens. _(确定性)_
+- [ ] **Loading state** designed (skeleton / NeuralNetworkLoading), not a flash of
+      blank or layout shift. _(自然)_
+- [ ] **Error state** designed — surface the reason and a retry/back path. _(意义感)_
+
+## 3. Buttons & focus (按钮与焦点)・确定性
+
+- [ ] **One primary button per surface.** The single primary CTA tells the user the
+      core action; everything else is secondary/tertiary. Never a pile of primary
+      buttons competing for attention. _(确定性)_
+
+## 4. Lists at scale (列表与规模)・确定性・自然
+
+A list/data page must be designed for its **whole range of sizes**, not just the
+demo data.
+
+- [ ] **Walk the scale: 1 / 2 / 5 / 20 / 100 / 1k–10k rows.** Pick the right
+      mechanism per range — plain render → load-more / pagination → virtual scroll;
+      add batch-select / bulk actions once counts get large. _(确定性)_
+- [ ] **Co-design empty / loading / error with the data state** (see §2). A list
+      isn't done until all four render well. _(自然)_
+
+## 5. Option visibility (选项可见性)・确定性・意义感
+
+- [ ] **Pickers list every valid target.** Watch for options dropped by backend
+      list queries (pagination, `virtual` flags, scope filters) and add them back.
+      ✅ The default "LobeAI" (inbox) agent is `virtual` and excluded from the
+      sidebar list, so the move picker re-adds it. An empty picker must mean
+      "genuinely none", never "we filtered out the only option". _(意义感)_
+
+## 6. Loading visuals (Loading 视觉)・自然
+
+**Never use antd `Spin`** — it doesn't match the product's loading visual. Use a
+project loader:
+
+| Need                        | Component                                                                     |
+| --------------------------- | ----------------------------------------------------------------------------- |
+| Default loading (in-flight) | `NeuralNetworkLoading` from `@/components/NeuralNetworkLoading` (`size` prop) |
+| Inline dots                 | `DotsLoading` / `BubblesLoading` from `@/components`                          |
+| Branded full-page           | `Loading` from `@/components/Loading/BrandTextLoading`                        |
+| List / card placeholder     | a skeleton (e.g. `SkeletonList`)                                              |
+
+When in doubt, reach for `NeuralNetworkLoading` — it's the default in-flight
+indicator (e.g. modal "in progress" states).
+
+## 7. Discoverability & growth (可发现性与生长)・生长性
+
+The product should grow with the user — deeper power shows up as needs deepen.
+
+- [ ] **Progressive disclosure** — keep the novice path clean; reveal advanced
+      capabilities as the user gets there, don't dump everything at once. _(生长性・自然)_
+- [ ] **Surface related actions at the moment of need** — make the next capability
+      discoverable in context (e.g. after the first item exists, offer what to do
+      with it), not buried in a far-off menu. _(生长性・意义感)_
+
+## 8. Entity lifecycle completeness (实体生命周期完整性)・意义感・确定性
+
+The recurring trap: a feature ships only the **display** of a list, but edit /
+delete / management are never built — so the user can add something and then be
+stuck with it. For every entity a user can see, design its **full lifecycle**:
+create / read / update / delete, plus state transitions (enable/disable,
+connect/disconnect, install/uninstall). A read-only list the user can't manage
+breaks the flow.
+
+**The allowed operation set depends on the entity's source / ownership** — decide
+it explicitly _before_ building. Worked example, the tools/connectors list:
+
+| Entity class                        | Add     | Edit      | Remove             |
+| ----------------------------------- | ------- | --------- | ------------------ |
+| Official / built-in (skills, tools) | —       | —         | ✗ not removable    |
+| Community (installed MCP)           | install | configure | uninstall / remove |
+| User-custom (custom connector)      | create  | edit      | delete             |
+
+- [ ] **No display-only features.** For every listed entity, enumerate CRUD +
+      lifecycle ops and build the ones that apply. _(意义感)_
+- [ ] **Operation set per source/ownership class** — built-in may be read-only;
+      anything the user _installed_ must be removable; anything the user _created_
+      must be editable **and** deletable. _(确定性)_
+- [ ] **Each item exposes its allowed ops** (hover action / context menu / detail
+      page), and there's a clear entry point to add/create where applicable. _(自然)_
+- [ ] **An intentionally-absent op is a documented decision, not an oversight**
+      (e.g. official tools can't be deleted — by design). _(确定性)_
+
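One way to make the "operation set per source/ownership class" decision explicit is a lookup keyed by entity source. The sketch below mirrors the tools/connectors table above; the identifiers `EntitySource`, `EntityOp`, `ALLOWED_OPS`, and `isOpAllowed` are hypothetical names chosen for illustration, not existing project APIs.

```typescript
// Hypothetical names for illustration; the table in the checklist is the source of truth.
type EntitySource = 'builtin' | 'community' | 'custom';
type EntityOp = 'install' | 'configure' | 'uninstall' | 'create' | 'edit' | 'delete';

// An empty list is a documented decision: built-in entities are read-only by design.
const ALLOWED_OPS: Record<EntitySource, readonly EntityOp[]> = {
  builtin: [],
  community: ['install', 'configure', 'uninstall'],
  custom: ['create', 'edit', 'delete'],
};

// Used by hover actions / context menus to decide which entries to show.
function isOpAllowed(source: EntitySource, op: EntityOp): boolean {
  return ALLOWED_OPS[source].includes(op);
}
```

Because every source class must appear as a key, adding a new class without deciding its operations becomes a type error rather than a silent display-only list.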
+## 9. Capability-gated features・Certainty・Meaningful
+
+A feature can be fully built and still produce a broken result when the selected
+model — or its still-loading config — **can't deliver the capability the feature
+depends on** (for example, an agentic run on a model without tool calling). This
+is usually the user's configuration choice, not a defect; but if the product stays
+silent the user reads it as the product being broken. When a feature's success
+depends on a capability the current config may lack, the product owes a
+**proactive, non-blocking reminder** — a guardrail, not a gate.
+
+- [ ] **Surface the mismatch, don't fail silently.** When a feature needs a model
+      capability (tool calling, vision, reasoning, long context) the current model
+      lacks, show a soft inline warning at the point of action — never a hard block
+      or a modal that stops the user. _(Meaningful)_
+- [ ] **Stay reactive.** The reminder clears the moment the user switches to a
+      capable model — derive it from live state, not a one-shot check. _(Natural)_
+- [ ] **Don't warn while config is loading.** A capability that hasn't resolved yet
+      looks "unsupported"; warning then is a false alarm — exactly the glitch users
+      mistake for a product bug. Warn only on a _resolved_ unsupported state. _(Certainty)_
+- [ ] **Scope to the mode that needs it.** Show only when the capability-dependent
+      mode is on; one reminder per root cause, never a pile of overlapping notices. _(Natural・Certainty)_
+- [ ] **State the problem and the remedy.** The copy says what's wrong _and_ what
+      the user should do about it. _(Meaningful)_
+
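The capability-gated rules above (scope to the mode, stay reactive, do not warn while loading) can be expressed as one pure derivation from live state. This is a minimal sketch under assumptions: `Capability`, `ModelConfigState`, `FeatureState`, and `capabilityWarning` are hypothetical shapes, not the project's actual store or model config types.

```typescript
// Hypothetical shapes for illustration; not the project's actual store or model config types.
type Capability = 'toolCalling' | 'vision' | 'reasoning' | 'longContext';

interface ModelConfigState {
  // `undefined` while the model config is still loading.
  capabilities: ReadonlySet<Capability> | undefined;
}

interface FeatureState {
  // Whether the capability-dependent mode is switched on.
  modeEnabled: boolean;
  required: Capability;
}

// Returns the missing capability to warn about, or null when no reminder should show.
function capabilityWarning(model: ModelConfigState, feature: FeatureState): Capability | null {
  if (!feature.modeEnabled) return null; // scope to the mode that needs it
  if (model.capabilities === undefined) return null; // unresolved config is not "unsupported"
  return model.capabilities.has(feature.required) ? null : feature.required;
}
```

Since the result is recomputed from current state on every call, the reminder clears as soon as the user switches to a capable model, with no one-shot flag to reset.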
+## Quick review checklist
+
+- [ ] Action leads the user forward; success offers a primary "go to result".
+- [ ] Bulk action has a single-item entry (and vice versa).
+- [ ] Async/bulk/irreversible action: confirm → in-progress (locked) → done/error.
+- [ ] Empty / loading / error states are all designed; empty is a real page with a CTA.
+- [ ] Exactly one primary button per surface.
+- [ ] List designed across 1 → 10k rows (virtual scroll / pagination / batch as needed).
+- [ ] Pickers show all valid targets (default/inbox included); empty = truly none.
+- [ ] No antd `Spin`; use `NeuralNetworkLoading` / project loaders.
+- [ ] Advanced capability is progressively disclosed / discoverable at the moment of need.
+- [ ] Listed entities have their full lifecycle (not display-only); ops match source (built-in / installed / custom).
+- [ ] Capability-gated feature warns (soft, reactive, load-gated) when the model can't deliver it; copy gives the remedy.
+
+## Related skills
+
+- **modal** — imperative `createModal` state-machine wiring for confirm/progress/done.
+- **microcopy** — wording for confirm / done / empty / error states.
+- **react** — component priority, `Button` usage, styling.
@@ -0,0 +1,51 @@
+# LobeHub Design Values (设计价值观)
+
+The philosophy behind every LobeHub interface. Read this before designing or
+reviewing a flow; the per-aspect execution rules live in the parent
+[SKILL.md](../SKILL.md) and each checklist item is tagged with the value(s) it serves.
+
+Adapted from Ant Design's design values
+(<https://ant.design/docs/spec/values-cn>, <https://zhuanlan.zhihu.com/p/44809866>).
+LobeHub adopts all four.
+
+## 自然 (Natural)
+
+Minimise cognitive load. Digital products keep getting more complex while human
+attention stays scarce — so the interface should feel as effortless as the
+physical world. The next step should be obvious without thinking; the product
+proactively carries the user forward (sensible defaults, AI-assisted decisions,
+smooth transitions) rather than making them stop and figure things out.
+
+## 意义感 (Meaningful)
+
+Every screen is rooted in the user's real goal, not an isolated feature. Make the
+objective clear, give immediate feedback on the result of each action, and always
+point at the next meaningful step. Calibrate difficulty — neither a patronising
+over-simplification nor an overwhelming wall — so the user keeps a sense of
+progress and accomplishment.
+
+## 确定性 (Certainty)
+
+Low-entropy, predictable interactions. Reuse the same patterns, components, and
+wording so behaviour is never surprising. Keep a single clear focus per surface,
+and design **every** state (empty / loading / error / success) so nothing is left
+undefined. Restraint over cleverness: fewer, consistent rules beat many bespoke
+ones.
+
+## 生长性 (Growth)
+
+The product grows together with the user. As needs deepen and roles evolve,
+surface advanced capabilities progressively and make related features
+discoverable at the moment they become relevant — without crowding the novice
+path. Bridge product value to the user's changing scenarios and aim for
+human–machine symbiosis (人机共生): the user and the agent co-evolve, each making
+the other more capable over time.
+
+## Priority when values conflict
+
+For moment-to-moment interaction decisions: **意义感 ≳ 自然 > 确定性** — never
+sacrifice the user's goal or forward momentum just to keep things uniform.
+
+**生长性 (Growth)** is a longer-horizon lens: weigh it when shaping how a feature
+is discovered and how it scales with the user, not when resolving a single-screen
+layout trade-off.
@@ -49,4 +49,4 @@ Migration owner: @{pr-author}

 The migration owner is responsible for rollout follow-up and incident handling for this schema change.

-> **Note for Claude**: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'` or from commit metadata. Do not hardcode a username.
+> \[!NOTE]: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'` or from commit metadata. Do not hardcode a username.
@@ -18,4 +18,4 @@

@{pr-author}

-> **Note for Claude**: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'`. Do not hardcode a username.
+> \[!NOTE]: Replace `{pr-author}` with the actual PR author. Retrieve via `gh pr view <number> --json author --jq '.author.login'`. Do not hardcode a username.
@@ -86,7 +86,7 @@ New AI model or provider support, typically contributed via community PRs.
 - These PR title prefixes (`feat` / `style`) are in the auto-tag trigger list
 - No special branch naming or manual release steps required — merging the PR triggers auto patch +1

-### When Claude is involved
+### When an agent is involved

 If asked to add model support, just create a normal feature PR. The title prefix will trigger the release automatically.

@@ -425,14 +425,14 @@ OPENAI_API_KEY=sk-xxxxxxxxx
 # MCP_TOOL_TIMEOUT=60000

 # #######################################
-# ######### Klavis Service ##############
+# ######### Composio Service ############
 # #######################################

-# Klavis API Key for accessing Strata hosted MCP servers
-# Get your API key from: https://klavis.io
+# Composio API Key for accessing hosted integrations (Gmail, Slack, etc.)
+# Get your API key from: https://composio.dev
 # IMPORTANT: This key is stored server-side only and NEVER exposed to the client
-# When this key is set, Klavis integration will be automatically enabled
-# KLAVIS_API_KEY=your_klavis_api_key_here
+# When this key is set, Composio integration will be automatically enabled
+# COMPOSIO_API_KEY=your_composio_api_key_here

 # #######################################
 # #### Message Gateway (IM Integration) ##
@@ -445,6 +445,15 @@ OPENAI_API_KEY=sk-xxxxxxxxx
 # MESSAGE_GATEWAY_URL=https://message-gateway.lobehub.com
 # MESSAGE_GATEWAY_SERVICE_TOKEN=your_service_token_here

+# #######################################
+# ######### Agent Gateway Mode ##########
+# #######################################
+
+# Enable Gateway Mode for self-hosted deployments. Requires AGENT_GATEWAY_URL.
+# ENABLE_AGENT_GATEWAY=1
+# AGENT_GATEWAY_URL=https://agent-gateway.example.com
+# AGENT_GATEWAY_SERVICE_TOKEN=your_service_token_here
+
 # #######################################
 # ########### Messenger Bot #############
 # #######################################
@@ -19,12 +19,6 @@ jobs:
    steps:
      - uses: actions/checkout@v6

-      - name: Clean issue notice
-        uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
-        with:
-          actions: 'close-issues'
-          labels: '🚨 Sync Fail'
-
      - name: Sync upstream changes
        id: sync
        uses: aormsby/Fork-Sync-With-Upstream-action@v3.4
@@ -33,22 +27,4 @@ jobs:
          upstream_sync_branch: main
          target_sync_branch: main
          target_repo_token: ${{ secrets.GITHUB_TOKEN }} # automatically generated, no need to set
-          test_mode: false
-
-      - name: Sync check
-        if: failure()
-        uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
-        with:
-          actions: 'create-issue'
-          title: '🚨 同步失败 | Sync Fail'
-          labels: '🚨 Sync Fail'
-          body: |
-            Due to a change in the workflow file of the [LobeChat][lobechat] upstream repository, GitHub has automatically suspended the scheduled automatic update. You need to manually sync your fork. Please refer to the detailed [Tutorial][tutorial-en-US] for instructions.
-
-            由于 [LobeChat][lobechat] 上游仓库的 workflow 文件变更，导致 GitHub 自动暂停了本次自动更新，你需要手动 Sync Fork 一次，请查看 [详细教程][tutorial-zh-CN]
-
-            ![](https://github-production-user-asset-6210df.s3.amazonaws.com/17870709/273954625-df80c890-0822-4ac2-95e6-c990785cbed5.png)
-
-            [lobechat]: https://github.com/lobehub/lobe-chat
-            [tutorial-zh-CN]: https://lobehub.com/zh/docs/self-hosting/advanced/upstream-sync
-            [tutorial-en-US]: https://lobehub.com/docs/self-hosting/advanced/upstream-sync
+          test_mode: false
@@ -59,6 +59,7 @@ bun.lockb
 # Build outputs
 dist/
 public/_spa/
+public/_spa-auth/
 public/spa/
 es/
 lib/
@@ -92,10 +93,8 @@ public/swe-worker*

 # Generated files
 src/app/spa/[variants]/[[...path]]/spaHtmlTemplates.ts
+src/app/spa-auth/authHtmlTemplate.ts
 public/*.js
-public/sitemap.xml
-public/sitemap-index.xml
-sitemap*.xml
 robots.txt

 # Git hooks
@@ -136,3 +136,5 @@ bun run type-check
 ### Code Review

 Before reviewing a PR / diff / branch change, read the **review-checklist** skill (`.agents/skills/review-checklist/SKILL.md`) — it lists the recurring mistakes specific to this codebase.
+
+When designing or reviewing user-facing flows (empty/loading/error states, confirmations, async feedback, button hierarchy, lists at scale, pickers), follow the **ux** skill (`.agents/skills/ux/SKILL.md`) — LobeHub's design values (自然 / 意义感 / 确定性 / 生长性) plus per-aspect execution checklists.
@@ -29,6 +29,7 @@
  },
  "devDependencies": {
    "@lobechat/agent-gateway-client": "workspace:*",
+    "@lobechat/device-control": "workspace:*",
    "@lobechat/device-gateway-client": "workspace:*",
    "@lobechat/device-identity": "workspace:*",
    "@lobechat/heterogeneous-agents": "workspace:*",
@@ -1,5 +1,6 @@
 packages:
  - '../../packages/agent-gateway-client'
+  - '../../packages/device-control'
  - '../../packages/device-gateway-client'
  - '../../packages/device-identity'
  - '../../packages/heterogeneous-agents'
@@ -2,9 +2,16 @@ import fs from 'node:fs';
 import os from 'node:os';
 import path from 'node:path';

+import {
+  defaultGetLocalFilePreview,
+  defaultGetProjectFileIndex,
+  type DeviceControlDeps,
+  executeDeviceRpc,
+} from '@lobechat/device-control';
 import type {
  AgentRunRequestMessage,
  DeviceSystemInfo,
+  RpcRequestMessage,
  SystemInfoRequestMessage,
  ToolCallRequestMessage,
 } from '@lobechat/device-gateway-client';
@@ -262,19 +269,23 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {

  // Handle tool call requests
  client.on('tool_call_request', async (request: ToolCallRequestMessage) => {
-    const { requestId, timeout, toolCall } = request;
+    const { operationId, requestId, timeout, toolCall } = request;
    if (isDaemonChild) {
-      appendLog(`[TOOL] ${toolCall.apiName} (${requestId})`);
+      appendLog(
+        `[TOOL] ${toolCall.apiName}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
+      );
    } else {
-      log.toolCall(toolCall.apiName, requestId, toolCall.arguments);
+      log.toolCall(toolCall.apiName, requestId, toolCall.arguments, operationId);
    }

    const result = await executeToolCall(toolCall.apiName, toolCall.arguments, timeout);

    if (isDaemonChild) {
-      appendLog(`[RESULT] ${result.success ? 'OK' : 'FAIL'} (${requestId})`);
+      appendLog(
+        `[RESULT] ${result.success ? 'OK' : 'FAIL'}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
+      );
    } else {
-      log.toolResult(requestId, result.success, result.content);
+      log.toolResult(requestId, result.success, result.content, operationId);
    }

    client.sendToolCallResponse({
@@ -288,6 +299,31 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
    });
  });

+  // Handle generic server-internal device RPCs (git / workspace / file ops).
+  // Shares the `@lobechat/device-control` dispatcher with the desktop app so the
+  // CLI exposes the same remote-device control surface. File preview / index use
+  // the package's portable defaults (no preview-protocol approval on the CLI).
+  const deviceControlDeps: DeviceControlDeps = {
+    getLocalFilePreview: defaultGetLocalFilePreview,
+    getProjectFileIndex: defaultGetProjectFileIndex,
+  };
+
+  client.on('rpc_request', async (request: RpcRequestMessage) => {
+    const { method, params, requestId } = request;
+    if (isDaemonChild) appendLog(`[RPC] ${method} (${requestId})`);
+    else info(`Received rpc_request: method=${method} (${requestId})`);
+
+    try {
+      const data = await executeDeviceRpc(method, params, deviceControlDeps);
+      client.sendRpcResponse({ requestId, result: { data, success: true } });
+    } catch (err) {
+      const message = err instanceof Error ? err.message : String(err);
+      if (isDaemonChild) appendLog(`[RPC ERROR] ${method}: ${message} (${requestId})`);
+      else error(`rpc_request method=${method} failed: ${message}`);
+      client.sendRpcResponse({ requestId, result: { error: message, success: false } });
+    }
+  });
+
  // Handle gateway-dispatched agent runs (heterogeneous agents, e.g. Claude
  // Code). Mirrors the desktop app: spawn `lh hetero exec`, which owns the full
  // execution + server-ingest pipeline. Ack with the spawn outcome — `accepted`
@@ -649,6 +649,53 @@ describe('hetero exec command', () => {
    ]);
  });

+  it('finishes with result "error" when a terminal error event is pushed despite a clean exit', async () => {
+    // CC relays an API/rate-limit error as an in-stream `error` event but still
+    // exits 0. The finish result must NOT be derived from the exit code alone,
+    // otherwise the topic/task is wrongly marked completed.
+    mockSpawnAgent.mockReturnValue(
+      createFakeHandle({
+        events: [
+          {
+            data: {
+              error: 'API Error: Server is temporarily limiting requests · Rate limited',
+              message: 'API Error: Server is temporarily limiting requests · Rate limited',
+            },
+            operationId: 'op-err',
+            stepIndex: 0,
+            timestamp: 1,
+            type: 'error',
+          },
+        ],
+        exitCode: 0,
+      }),
+    );
+
+    await runCmd([
+      'hetero',
+      'exec',
+      '--type',
+      'claude-code',
+      '--prompt',
+      'hi',
+      '--topic',
+      'topic-1',
+      '--operation-id',
+      'op-err',
+      '--render',
+      'none',
+    ]);
+
+    expect(mockHeteroFinishMutate).toHaveBeenCalledTimes(1);
+    expect(mockHeteroFinishMutate.mock.calls[0][0]).toMatchObject({
+      error: {
+        message: 'API Error: Server is temporarily limiting requests · Rate limited',
+        type: 'AgentRuntimeError',
+      },
+      result: 'error',
+    });
+  });
+
  it('resets the per-message text accumulator at message boundaries (no cross-message duplication)', async () => {
    // The `replace` snapshot accumulator must not span
    // message boundaries. Two assistant messages separated by a
@@ -467,6 +467,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
   *   sessionId     — CC session id from `system.init` (undefined on resume failure)
   *   ingestError   — true when a batch could not be flushed after retries
   *   resumeNotFound — true when a resume-not-found error was intercepted
+   *   sawTerminalError — true when a terminal `error` event was pushed to the
+   *                      ingester (CC can relay an API/rate-limit error this way
+   *                      and still exit 0, so the exit code alone is not enough)
+   *   terminalErrorMessage — the message from that terminal `error` event, used
+   *                      as the task-level error detail in the finish payload
   *   stderrContent  — accumulated stderr (only when interceptResumeErrors=true)
   */
  const runOneAgent = async (
@@ -477,9 +482,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
    code: number | null;
    ingestError: boolean;
    resumeNotFound: boolean;
+    sawTerminalError: boolean;
    sessionId: string | undefined;
    signal: NodeJS.Signals | null;
    stderrContent: string;
+    terminalErrorMessage: string | undefined;
  }> => {
    // One raw-dump file pair per spawn attempt (the resume retry is a second
    // attempt). The stdout tee runs inside `spawnAgent` before the adapter.
@@ -549,6 +556,8 @@ const exec = async (options: ExecOptions): Promise<void> => {
    // into the ingester.  When intercepting resume errors, a matching
    // `error` event is withheld from the ingester and flags a retry instead.
    let resumeNotFound = false;
+    let sawTerminalError = false;
+    let terminalErrorMessage: string | undefined;
    const ingestError = false;
    try {
      for await (const event of handle.events) {
@@ -563,6 +572,16 @@ const exec = async (options: ExecOptions): Promise<void> => {
            continue;
          }
        }
+        // A terminal `error` event (e.g. an API/rate-limit error relayed by CC)
+        // must mark the run as failed even when the child exits 0 — track it so
+        // the finish result is not derived from the exit code alone. Capture the
+        // message too, so the finish payload can surface it as the task-level
+        // error detail (CC relays these on stdout, not stderr).
+        if (event.type === 'error') {
+          sawTerminalError = true;
+          const data = event.data as Record<string, unknown> | undefined;
+          terminalErrorMessage = String(data?.message ?? data?.error ?? '') || undefined;
+        }
        if (emitJsonl) process.stdout.write(`${JSON.stringify(event)}\n`);
        serverIngester?.push(event);
      }
@@ -608,9 +627,11 @@ const exec = async (options: ExecOptions): Promise<void> => {
      code,
      ingestError,
      resumeNotFound,
+      sawTerminalError,
      sessionId: handle.sessionId,
      signal,
      stderrContent,
+      terminalErrorMessage,
    };
  };

@@ -675,16 +696,23 @@ const exec = async (options: ExecOptions): Promise<void> => {
      result = { ...result, ingestError: true };
    }

-    const exitedClean = !result.ingestError && (code === 0 || signal === 'SIGTERM');
+    // CC relays API/rate-limit errors as an in-stream terminal `error` event but
+    // still exits 0, so the exit code alone would report `success`. Treat any
+    // pushed terminal error as a failed run so the topic/task is marked failed.
+    const exitedClean =
+      !result.ingestError && !result.sawTerminalError && (code === 0 || signal === 'SIGTERM');

-    // When the run failed, pass stderr as the error detail so the server can
-    // surface a useful message instead of the generic "Agent execution failed"
-    // fallback.  Trim to the last 1 KB — the tail is most informative and
-    // keeps the tRPC payload small.
+    // When the run failed, pass an error detail so the server surfaces a useful
+    // message instead of the generic "Agent execution failed" fallback. Prefer
+    // the in-stream terminal error (CC relays API/rate-limit errors here while
+    // exiting 0, so stderr is empty); otherwise fall back to the stderr tail.
+    // Trim to the last 1 KB — the tail is most informative and keeps the tRPC
+    // payload small.
    const stderrTail = result.stderrContent.trim();
+    const errorDetail = result.terminalErrorMessage || stderrTail;
    const finishError =
-      !exitedClean && stderrTail
-        ? { message: stderrTail.slice(-1024), type: 'AgentRuntimeError' }
+      !exitedClean && errorDetail
+        ? { message: errorDetail.slice(-1024), type: 'AgentRuntimeError' }
        : undefined;

    try {
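The finish logic in the hunk above can be read in isolation as a pure function. The sketch below is an illustrative restatement, not the code in the commit: `RunOutcome`, `FinishPayload`, and `deriveFinish` are hypothetical names, the field names follow the diff, and the `'success'` / `'error'` result values are taken from the diff's comments and test expectation.

```typescript
// Hypothetical, simplified shapes; field names follow the diff, the helper itself is illustrative.
interface RunOutcome {
  code: number | null;
  ingestError: boolean;
  sawTerminalError: boolean;
  signal: string | null;
  stderrContent: string;
  terminalErrorMessage: string | undefined;
}

interface FinishPayload {
  error?: { message: string; type: 'AgentRuntimeError' };
  result: 'success' | 'error';
}

function deriveFinish(run: RunOutcome): FinishPayload {
  // A clean exit code is not enough: an in-stream terminal error also fails the run.
  const exitedClean =
    !run.ingestError && !run.sawTerminalError && (run.code === 0 || run.signal === 'SIGTERM');
  if (exitedClean) return { result: 'success' };

  // Prefer the in-stream error message; fall back to the stderr tail.
  const detail = run.terminalErrorMessage || run.stderrContent.trim();
  return detail
    ? { error: { message: detail.slice(-1024), type: 'AgentRuntimeError' }, result: 'error' }
    : { result: 'error' };
}
```

Isolating the decision this way makes the regression case (exit code 0 with a terminal `error` event) testable without spawning a child process.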
@@ -1,4 +1,3 @@
-/* eslint-disable no-console */
 import pc from 'picocolors';

 let verbose = false;
@@ -41,18 +40,20 @@ export const log = {
    console.log(`${timestamp()} ${pc.bold('[STATUS]')} ${color(status)}`);
  },

-  toolCall: (apiName: string, requestId: string, args?: string) => {
+  toolCall: (apiName: string, requestId: string, args?: string, operationId?: string) => {
    console.log(
-      `${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)} ${pc.dim(`(${requestId})`)}`,
+      `${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
    );
    if (args && verbose) {
      console.log(`  ${pc.dim(args)}`);
    }
  },

-  toolResult: (requestId: string, success: boolean, content?: string) => {
+  toolResult: (requestId: string, success: boolean, content?: string, operationId?: string) => {
    const icon = success ? pc.green('OK') : pc.red('FAIL');
-    console.log(`${timestamp()} ${pc.magenta('[RESULT]')} ${icon} ${pc.dim(`(${requestId})`)}`);
+    console.log(
+      `${timestamp()} ${pc.magenta('[RESULT]')} ${icon}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
+    );
    if (content && verbose) {
      const preview = content.length > 200 ? content.slice(0, 200) + '...' : content;
      console.log(`  ${pc.dim(preview)}`);
@@ -56,6 +56,7 @@
    "@electron-toolkit/utils": "^4.0.0",
    "@lobechat/chat-adapter-imessage": "workspace:*",
    "@lobechat/desktop-bridge": "workspace:*",
+    "@lobechat/device-control": "workspace:*",
    "@lobechat/device-gateway-client": "workspace:*",
    "@lobechat/device-identity": "workspace:*",
    "@lobechat/electron-client-ipc": "workspace:*",
@@ -8,6 +8,7 @@ packages:
  - '../../packages/electron-client-ipc'
  - '../../packages/file-loaders'
  - '../../packages/desktop-bridge'
+  - '../../packages/device-control'
  - '../../packages/device-gateway-client'
  - '../../packages/device-identity'
  - '../../packages/local-file-shell'
@@ -3,6 +3,7 @@ import fs from 'node:fs';
 import os from 'node:os';
 import path from 'node:path';

+import { type DeviceControlDeps, executeDeviceRpc as runDeviceRpc } from '@lobechat/device-control';
 import type {
  AgentRunRequestMessage,
  GatewayMcpStdioParams,
@@ -13,11 +14,8 @@ import type {
  GetCommandOutputParams,
  GlobFilesParams,
  GrepContentParams,
-  InitWorkspaceParams,
  KillCommandParams,
  ListLocalFileParams,
-  ListProjectSkillsParams,
-  LocalFilePreviewUrlParams,
  LocalReadFileParams,
  LocalReadFilesParams,
  LocalSearchFilesParams,
@@ -30,15 +28,16 @@ import { type ILocalSystemService, LocalSystemExecutionRuntime } from '@lobechat

 import GatewayConnectionService from '@/services/gatewayConnectionSrv';
 import ImessageBridgeService from '@/services/imessageBridgeSrv';
+import { createLogger } from '@/utils/logger';

-import GitCtr from './GitCtr';
 import HeterogeneousAgentCtr from './HeterogeneousAgentCtr';
 import { ControllerModule, IpcMethod } from './index';
 import LocalFileCtr from './LocalFileCtr';
 import McpCtr from './McpCtr';
 import RemoteServerConfigCtr from './RemoteServerConfigCtr';
 import ShellCommandCtr from './ShellCommandCtr';
-import WorkspaceCtr from './WorkspaceCtr';
+
+const logger = createLogger('controllers:GatewayConnectionCtr');

 /**
 * Inject the lh-notify protocol into the first turn of a new hetero-agent session.
@@ -167,14 +166,6 @@ export default class GatewayConnectionCtr extends ControllerModule {
    return this.app.getController(LocalFileCtr);
  }

-  private get workspaceCtr() {
-    return this.app.getController(WorkspaceCtr);
-  }
-
-  private get gitCtr() {
-    return this.app.getController(GitCtr);
-  }
-
  private get shellCommandCtr() {
    return this.app.getController(ShellCommandCtr);
  }
@@ -353,91 +344,33 @@ export default class GatewayConnectionCtr extends ControllerModule {
    return this.localSystemRuntime;
  }

+  /**
+   * Platform-specific handlers the shared `@lobechat/device-control` dispatcher
+   * delegates to. Git + workspace-scan methods run inside device-control over
+   * `@lobechat/local-file-shell`; only file preview / index (and preview
+   * approval) are desktop-specific and routed back to the controllers here.
+   */
+  private get deviceControlDeps(): DeviceControlDeps {
+    return {
+      approveProjectRoot: async (root) => {
+        try {
+          await this.app.localFileProtocolManager.approveIndexedProjectRoot(root);
+        } catch (error) {
+          logger.error(`Failed to approve project preview root ${root}:`, error);
+        }
+      },
+      getLocalFilePreview: (params) => this.localFileCtr.getLocalFilePreview(params),
+      getProjectFileIndex: (params) => this.localFileCtr.getProjectFileIndex(params),
+    };
+  }
+
  /**
   * Dispatch a generic server-internal device RPC (not an agent tool call) by
-   * method name. Currently only `initWorkspace` (scan the bound project root for
-   * skills + AGENTS.md); add new server-only device methods here.
+   * method name. The dispatch logic lives in `@lobechat/device-control` so the
+   * desktop main process and the CLI daemon share one device RPC surface.
   */
  private async executeDeviceRpc(method: string, params: unknown): Promise<unknown> {
-    switch (method) {
-      case 'initWorkspace': {
-        return this.workspaceCtr.initWorkspace(params as InitWorkspaceParams);
-      }
-
-      case 'getGitBranch': {
-        return this.gitCtr.getGitBranch((params as { path: string }).path);
-      }
-
-      case 'getLinkedPullRequest': {
-        return this.gitCtr.getLinkedPullRequest(params as { branch: string; path: string });
-      }
-
-      case 'getGitWorkingTreeStatus': {
-        return this.gitCtr.getGitWorkingTreeStatus((params as { path: string }).path);
-      }
-
-      case 'getGitAheadBehind': {
-        return this.gitCtr.getGitAheadBehind((params as { path: string }).path);
-      }
-
-      case 'listGitBranches': {
-        return this.gitCtr.listGitBranches((params as { path: string }).path);
-      }
-
-      case 'checkoutGitBranch': {
-        return this.gitCtr.checkoutGitBranch(
-          params as { branch: string; create?: boolean; path: string },
-        );
-      }
-
-      case 'pullGitBranch': {
-        return this.gitCtr.pullGitBranch(params as { path: string });
-      }
-
-      case 'pushGitBranch': {
-        return this.gitCtr.pushGitBranch(params as { path: string });
-      }
-
-      case 'getGitWorkingTreePatches': {
-        return this.gitCtr.getGitWorkingTreePatches((params as { path: string }).path);
-      }
-
-      case 'getGitWorkingTreeFiles': {
-        return this.gitCtr.getGitWorkingTreeFiles((params as { path: string }).path);
-      }
-
-      case 'getProjectFileIndex': {
-        return this.localFileCtr.getProjectFileIndex(params as { scope?: string });
-      }
-
-      case 'getLocalFilePreview': {
-        return this.localFileCtr.getLocalFilePreview(params as LocalFilePreviewUrlParams);
-      }
-
-      case 'listProjectSkills': {
-        return this.workspaceCtr.listProjectSkills(params as ListProjectSkillsParams);
-      }
-
-      case 'getGitBranchDiff': {
-        return this.gitCtr.getGitBranchDiff(params as { baseRef?: string; path: string });
-      }
-
-      case 'listGitRemoteBranches': {
-        return this.gitCtr.listGitRemoteBranches((params as { path: string }).path);
-      }
-
-      case 'revertGitFile': {
-        return this.gitCtr.revertGitFile(params as { filePath: string; path: string });
-      }
-
-      case 'statPath': {
-        return this.workspaceCtr.statPath(params as { path: string });
-      }
-
-      default: {
-        throw new Error(`Unknown device RPC method: ${method}`);
-      }
-    }
+    return runDeviceRpc(method, params, this.deviceControlDeps);
  }

  private async executeToolCall(
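
Reviewer note: the `executeDeviceRpc` change above swaps a per-method `switch` for a shared dispatcher that receives platform-specific handlers as dependencies. The block below is a minimal sketch of that pattern, not the actual `@lobechat/device-control` implementation. `SketchDeps`, `sketchHandlers`, and `dispatchDeviceRpcSketch` are hypothetical names, and the `initWorkspace` body is a placeholder.

```typescript
// Hypothetical, simplified stand-ins; not the real `@lobechat/device-control` API.
interface SketchDeps {
  approveProjectRoot: (root: string) => Promise<void>;
  getProjectFileIndex: (params: { scope?: string }) => Promise<string[]>;
}

type SketchHandler = (params: unknown, deps: SketchDeps) => Promise<unknown>;

// Method table: shared logic lives here, platform-specific work goes through `deps`.
const sketchHandlers: Record<string, SketchHandler> = {
  getProjectFileIndex: (params, deps) => deps.getProjectFileIndex(params as { scope?: string }),
  initWorkspace: async (params, deps) => {
    const { scope } = params as { scope: string };
    // Placeholder for the real scan: only the approval side effect is modelled.
    await deps.approveProjectRoot(scope);
    return { root: scope };
  },
};

const dispatchDeviceRpcSketch = async (
  method: string,
  params: unknown,
  deps: SketchDeps,
): Promise<unknown> => {
  // Own-property lookup so inherited keys such as `toString` are rejected.
  const handler = Object.prototype.hasOwnProperty.call(sketchHandlers, method)
    ? sketchHandlers[method]
    : undefined;
  if (!handler) throw new Error(`Unknown device RPC method: ${method}`);
  return handler(params, deps);
};
```

Under this assumption each host (desktop main process, CLI daemon) supplies only its own `deps` object, and the method table is written once.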
@@ -23,7 +23,7 @@ import type {
  HeteroExecImageRef,
 } from '@lobechat/heterogeneous-agents/protocol';
 import { buildHeteroExecStdinPayload } from '@lobechat/heterogeneous-agents/protocol';
-import type { AgentStreamEvent } from '@lobechat/heterogeneous-agents/spawn';
+import type { AgentStreamEvent, UsageData } from '@lobechat/heterogeneous-agents/spawn';
 import {
  AgentStreamPipeline,
  buildAgentInput,
@@ -188,6 +188,21 @@ interface AgentSession {
  modelVerificationLastAttemptAt?: number;
  modelVerificationLastAttemptSessionId?: string;
  process?: ChildProcess;
+  /**
+   * Absolute CLI path resolved by spawn preflight detection. Used for spawn()
+   * when the configured command is bare: detection can find the CLI through
+   * the login-shell PATH or a well-known install location (e.g. the Codex.app
+   * bundled CLI) that plain spawn() with the inherited env can't resolve.
+   */
+  resolvedCommandPath?: string;
+  /**
+   * PATH the preflight detector used to resolve `resolvedCommandPath`, set only
+   * when it fell back to the login-shell PATH. Merged into the child PATH at
+   * spawn so a `#!/usr/bin/env node` shim still finds its interpreter — the
+   * shim resolving in preflight doesn't guarantee `node` is on the leaner
+   * inherited PATH (Finder-launched Electron).
+   */
+  resolvedCommandSearchPath?: string;
  resumeSessionId?: string;
  sessionId: string;
  verifiedModel?: string;
@@ -470,11 +485,20 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
            session.agentType === 'claude-code' ? 'claude-code' : 'codex',
            command,
          );
-    const cliMissingError = this.buildCliMissingError(session);

-    if (!status || status.available || !cliMissingError) return;
+    if (!status || status.available) {
+      // Spawn through the detector-resolved absolute path when the configured
+      // command is bare — detection may have located the CLI somewhere plain
+      // spawn() can't (login-shell PATH, Codex.app bundled CLI, …).
+      const useResolvedPath = Boolean(status?.path) && !command.includes(path.sep);
+      session.resolvedCommandPath = useResolvedPath ? status!.path : undefined;
+      // Carry the login-shell PATH the detector resolved through, so a
+      // `#!/usr/bin/env node` shim spawned by absolute path still finds `node`.
+      session.resolvedCommandSearchPath = useResolvedPath ? status!.resolvedPathEnv : undefined;
+      return;
+    }

-    return cliMissingError;
+    return this.buildCliMissingError(session);
  }

  private get shouldTraceCliOutput(): boolean {
@@ -911,6 +935,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
    let spawnPlan;
    let traceSession;
    let cwd: string;
+    let initialCumulativeUsage: UsageData | undefined;
    let spawnEnv: NodeJS.ProcessEnv;
    try {
      const driver = getHeterogeneousAgentDriver(session.agentType);
@@ -934,7 +959,12 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
      // Forward the user's proxy settings to the CLI. The main-process undici
      // dispatcher doesn't reach child processes — they need env vars.
      const proxyEnv = buildProxyEnv(this.app.storeManager.get('networkProxy'));
-      spawnEnv = { ...buildInheritedSpawnEnv(), ...proxyEnv, ...session.env };
+      const inheritedEnv = buildInheritedSpawnEnv();
+      // When preflight resolved the CLI via the login-shell PATH, spawn with
+      // that PATH (a superset of the inherited one) so a `#!/usr/bin/env node`
+      // shim finds its interpreter. `session.env` still wins if it sets PATH.
+      if (session.resolvedCommandSearchPath) inheritedEnv.PATH = session.resolvedCommandSearchPath;
+      spawnEnv = { ...inheritedEnv, ...proxyEnv, ...session.env };

      if (session.agentType === 'codex') {
        const initialModel = await resolveCodexInitialModel({
@@ -945,6 +975,12 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
          session.model = initialModel.model;
          session.modelSource = initialModel.source;
        }
+
+        if (session.agentSessionId) {
+          initialCumulativeUsage = (
+            await readCodexSessionModel(session.agentSessionId, { env: spawnEnv })
+          )?.cumulativeUsage;
+        }
      }

      traceSession = await this.createCliTraceSession({
@@ -966,7 +1002,10 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
    }
    const useStdin = spawnPlan.stdinPayload !== undefined;
    const cliArgs = spawnPlan.args;
-    const resolvedCliSpawnPlan = await resolveCliSpawnPlan(session.command, cliArgs);
+    const resolvedCliSpawnPlan = await resolveCliSpawnPlan(
+      session.resolvedCommandPath ?? session.command,
+      cliArgs,
+    );

    logger.info(
      'Spawning agent:',
@@ -1001,6 +1040,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
        reject,
        resolve,
        session,
+        initialCumulativeUsage,
        spawnEnv,
        traceSession,
        useStdin,
@@ -1070,6 +1110,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {

  private handleSpawnedAgentProcess({
    cwd,
+    initialCumulativeUsage,
    intervention,
    params,
    proc,
@@ -1088,6 +1129,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
    reject: (reason?: unknown) => void;
    resolve: () => void;
    session: AgentSession;
+    initialCumulativeUsage?: UsageData | undefined;
    spawnEnv: NodeJS.ProcessEnv;
    spawnPlan: HeterogeneousAgentBuildPlan;
    traceSession: CliTraceSession | undefined;
@@ -1128,6 +1170,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
    const pipeline = new AgentStreamPipeline({
      agentType: session.agentType,
      cwd,
+      initialCumulativeUsage,
      initialModel: session.model,
      operationId: params.operationId,
    });
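
Reviewer note: the preflight and spawn changes above combine three rules: a bare command is replaced by the detector-resolved absolute path, the detector's login-shell PATH is carried into the child env, and `session.env` still takes precedence. The following is a minimal sketch of those rules as one pure function, written for illustration only. `DetectedCliSketch`, `SpawnTargetSketch`, and `resolveSpawnTargetSketch` are hypothetical names and do not exist in the codebase.

```typescript
import { sep } from 'node:path';

// Hypothetical shape mirroring the `path` / `resolvedPathEnv` fields used in this diff.
interface DetectedCliSketch {
  available: boolean;
  path?: string;
  resolvedPathEnv?: string;
}

interface SpawnTargetSketch {
  command: string;
  env: Record<string, string | undefined>;
}

const resolveSpawnTargetSketch = (
  configuredCommand: string,
  status: DetectedCliSketch | undefined,
  inheritedEnv: Record<string, string | undefined>,
  sessionEnv: Record<string, string | undefined> = {},
): SpawnTargetSketch => {
  // Only a bare command (no path separator) is swapped for the detected absolute path.
  const useResolved = Boolean(status?.path) && !configuredCommand.includes(sep);

  // Copy so the caller's inherited env is never mutated.
  const env = { ...inheritedEnv };
  if (useResolved && status?.resolvedPathEnv) env.PATH = status.resolvedPathEnv;

  return {
    command: useResolved && status?.path ? status.path : configuredCommand,
    // Session-level overrides are spread last, so an explicit PATH there still wins.
    env: { ...env, ...sessionEnv },
  };
};
```

Keeping the decision in a pure function like this would make the three cases covered by the new tests (bare command, login-shell PATH, explicit path) checkable without spawning a process.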
@@ -437,11 +437,13 @@ export default class LocalFileCtr extends ControllerModule {

  @IpcMethod()
  async getLocalFilePreviewUrl({
+    accept,
    path: filePath,
    workingDirectory,
  }: LocalFilePreviewUrlParams): Promise<LocalFilePreviewUrlResult> {
    try {
      const url = await this.app.localFileProtocolManager.createPreviewUrl({
+        accept,
        filePath,
        workspaceRoot: workingDirectory,
      });
@@ -459,11 +461,13 @@ export default class LocalFileCtr extends ControllerModule {

  @IpcMethod()
  async getLocalFilePreview({
+    accept,
    path: filePath,
    workingDirectory,
  }: LocalFilePreviewUrlParams): Promise<LocalFilePreviewResult> {
    try {
      const preview = await this.app.localFileProtocolManager.readPreviewFile({
+        accept,
        filePath,
        workspaceRoot: workingDirectory,
      });
@@ -1,244 +1,53 @@
-import { readdir, readFile, stat } from 'node:fs/promises';
-import path from 'node:path';
-
+import {
+  initWorkspace as runInitWorkspace,
+  listProjectSkills as runListProjectSkills,
+  statPath as runStatPath,
+  type WorkspaceScanDeps,
+} from '@lobechat/device-control';
 import {
  type InitWorkspaceParams,
  type InitWorkspaceResult,
  type ListProjectSkillsParams,
  type ListProjectSkillsResult,
-  type ProjectSkillItem,
 } from '@lobechat/electron-client-ipc';

-import { detectRepoType } from '@/utils/git';
 import { createLogger } from '@/utils/logger';

 import { ControllerModule, IpcMethod } from './index';

 const logger = createLogger('controllers:WorkspaceCtr');

-const SKILL_FRONTMATTER_RE = /^---\r?\n([\s\S]*?)\r?\n---/;
-
-// Cap recursion to guard against pathological directory trees.
-const MAX_SKILL_FILE_COUNT = 1000;
-
-const toPosixRelativePath = (filePath: string) => filePath.split(path.sep).join('/');
-
-const listSkillFilesRecursive = async (dir: string): Promise<string[]> => {
-  const results: string[] = [];
-  const stack: string[] = [dir];
-
-  while (stack.length > 0 && results.length < MAX_SKILL_FILE_COUNT) {
-    const current = stack.pop()!;
-    let entries;
-    try {
-      entries = await readdir(current, { withFileTypes: true });
-    } catch {
-      continue;
-    }
-    for (const entry of entries) {
-      if (entry.name.startsWith('.')) continue;
-      const full = path.join(current, entry.name);
-      if (entry.isDirectory()) {
-        stack.push(full);
-      } else if (entry.isFile()) {
-        results.push(toPosixRelativePath(path.relative(dir, full)));
-        if (results.length >= MAX_SKILL_FILE_COUNT) break;
-      }
-    }
-  }
-  return results.sort();
-};
-
-// Parse a minimal YAML frontmatter block for SKILL.md files.
-// Only handles `key: value` lines; multi-line block scalars fall back to the first line.
-const parseSkillFrontmatter = (raw: string): Record<string, string> => {
-  const match = raw.match(SKILL_FRONTMATTER_RE);
-  if (!match) return {};
-
-  const fields: Record<string, string> = {};
-  for (const line of match[1].split(/\r?\n/)) {
-    const colonIdx = line.indexOf(':');
-    if (colonIdx === -1) continue;
-    const key = line.slice(0, colonIdx).trim();
-    if (!key || key.startsWith('#')) continue;
-    let value = line.slice(colonIdx + 1).trim();
-    if (value.startsWith('|') || value.startsWith('>')) continue;
-    if (
-      (value.startsWith('"') && value.endsWith('"')) ||
-      (value.startsWith("'") && value.endsWith("'"))
-    ) {
-      value = value.slice(1, -1);
-    }
-    fields[key] = value;
-  }
-  return fields;
-};
-
 /**
 * WorkspaceCtr
 *
- * Owns "project workspace" scanning: discovering agent skills (`.agents/skills`
- * / `.claude/skills`) and project-root instructions (`AGENTS.md` / `CLAUDE.md`)
- * under a bound project directory. Split out of LocalFileCtr so the
- * workspace/agent-config concern is distinct from generic local file ops.
+ * Thin IPC layer over `@lobechat/device-control`'s workspace-scan helpers
+ * (skills discovery under `.agents/skills` / `.claude/skills` + project-root
+ * instructions). The scan logic is shared with the device-control RPC dispatch
+ * so the local desktop IPC path, the remote device RPC, and the CLI all run
+ * identical scans; the desktop-only preview-protocol approval is injected here.
 */
 export default class WorkspaceCtr extends ControllerModule {
  static override readonly groupName = 'workspace';

-  /**
-   * Scan one skill source directory (e.g. `.agents/skills`) under `root` and
-   * return parsed frontmatter for each `SKILL.md`. Returns `[]` when the source
-   * directory is absent or unreadable. Unsorted — callers sort/merge.
-   */
-  private async scanSkillsInSource(
-    root: string,
-    source: ProjectSkillItem['source'],
-  ): Promise<ProjectSkillItem[]> {
-    const dir = path.join(root, source);
-    let entries;
-    try {
-      entries = await readdir(dir, { withFileTypes: true });
-    } catch {
-      // Directory does not exist or is not readable.
-      return [];
-    }
-
-    const skills = await Promise.all(
-      entries
-        .filter((entry) => entry.isDirectory() || entry.isSymbolicLink())
-        .map(async (entry) => {
-          const skillDir = path.join(dir, entry.name);
-          const skillFile = path.join(skillDir, 'SKILL.md');
-          try {
-            const raw = await readFile(skillFile, 'utf8');
-            const fields = parseSkillFrontmatter(raw);
-            const files = await listSkillFilesRecursive(skillDir);
-            return {
-              description: fields.description || undefined,
-              fileCount: files.length,
-              files,
-              name: fields.name || entry.name,
-              path: skillFile,
-              skillDir,
-              source,
-            };
-          } catch {
-            return null;
-          }
-        }),
-    );
-
-    return skills.filter((skill): skill is ProjectSkillItem => skill !== null);
+  private get scanDeps(): WorkspaceScanDeps {
+    return { approveProjectRoot: (root) => this.approveProjectRootForPreview(root) };
  }

-  /**
-   * Scan agent skill directories under the project root and return parsed
-   * frontmatter for each SKILL.md. Used by the hetero agent's working sidebar
-   * to surface skills available in the current project. Returns the first
-   * source directory that yields any skills (`.agents/skills` wins).
-   */
  @IpcMethod()
  async listProjectSkills(params: ListProjectSkillsParams): Promise<ListProjectSkillsResult> {
-    const root = params.scope;
-    const sources = ['.agents/skills', '.claude/skills'] as const;
-
-    for (const source of sources) {
-      const skills = (await this.scanSkillsInSource(root, source)).sort((a, b) =>
-        a.name.localeCompare(b.name),
-      );
-
-      if (skills.length > 0) {
-        await this.approveProjectRootForPreview(root);
-        return { root, skills, source };
-      }
-    }
-
-    return { root, skills: [], source: null };
+    return runListProjectSkills(params, this.scanDeps);
  }

-  /**
-   * One-call "workspace init" scan of a bound project directory: merge the
-   * project skills from BOTH `.agents/skills` and `.claude/skills` (deduped by
-   * name, `.agents/skills` winning) and read the project-root agent
-   * instructions file (`AGENTS.md`, else `CLAUDE.md`). Driven server-side at run
-   * start via the generic device RPC (not an LLM-visible tool) and cached onto
-   * `devices.workingDirs[].workspace`.
-   *
-   * Approves the root for the `lobe-file://` preview protocol (same as
-   * `listProjectSkills`) so the user can later click through to the scanned
-   * skills / instructions in the UI.
-   */
  @IpcMethod()
  async initWorkspace(params: InitWorkspaceParams): Promise<InitWorkspaceResult> {
-    const root = params.scope;
-    const sources = ['.agents/skills', '.claude/skills'] as const;
-
-    const seen = new Set<string>();
-    const skills: ProjectSkillItem[] = [];
-    for (const source of sources) {
-      for (const skill of await this.scanSkillsInSource(root, source)) {
-        if (seen.has(skill.name)) continue;
-        seen.add(skill.name);
-        skills.push(skill);
-      }
-    }
-    skills.sort((a, b) => a.name.localeCompare(b.name));
-
-    const instructions = await this.readWorkspaceInstructions(root);
-
-    // Approve regardless of what was found — the run is now bound to this root,
-    // so any later click-through to it should resolve through the preview
-    // protocol even if the project carries neither skills nor instructions.
-    await this.approveProjectRootForPreview(root);
-
-    return { instructions, root, skills };
+    return runInitWorkspace(params, this.scanDeps);
  }

-  /**
-   * Check whether a path exists on this device and is a directory, plus its git
-   * repo type (`git` / `github` / none). Used to validate a manually-entered
-   * working directory from a web / remote client (which can't browse this
-   * device's filesystem) before binding it, and to render the right dir icon.
-   */
  @IpcMethod()
  async statPath(params: {
    path: string;
  }): Promise<{ exists: boolean; isDirectory: boolean; repoType?: 'git' | 'github' }> {
-    try {
-      const stats = await stat(params.path);
-      if (!stats.isDirectory()) return { exists: true, isDirectory: false };
-      const repoType = await detectRepoType(params.path);
-      return { exists: true, isDirectory: true, repoType };
-    } catch {
-      return { exists: false, isDirectory: false };
-    }
-  }
-
-  /**
-   * Read the project-root agent instructions files. Collects every present
-   * candidate (`AGENTS.md`, then `CLAUDE.md`) rather than first-match, since both
-   * can coexist. Each body is capped so a pathologically large file can't bloat
-   * the cached `workingDirs` payload or the injected system role.
-   */
-  private async readWorkspaceInstructions(
-    root: string,
-  ): Promise<InitWorkspaceResult['instructions']> {
-    const MAX_INSTRUCTIONS_BYTES = 64 * 1024;
-    const candidates = ['AGENTS.md', 'CLAUDE.md'] as const;
-
-    const instructions: InitWorkspaceResult['instructions'] = [];
-    for (const source of candidates) {
-      try {
-        const raw = await readFile(path.join(root, source), 'utf8');
-        const content =
-          raw.length > MAX_INSTRUCTIONS_BYTES ? raw.slice(0, MAX_INSTRUCTIONS_BYTES) : raw;
-        instructions.push({ content, source });
-      } catch {
-        // File absent or unreadable; skip it.
-      }
-    }
-
-    return instructions;
+    return runStatPath(params);
  }

  private async approveProjectRootForPreview(root: string) {
@@ -480,6 +480,87 @@ describe('HeterogeneousAgentCtr', () => {
      expect(spawnCalls).toHaveLength(0);
    });

+    it('spawns through the detector-resolved absolute path when the bare command is off PATH', async () => {
+      // Codex desktop app case: `codex` is not on PATH, but the preflight
+      // detector finds the CLI bundled inside Codex.app. Spawning the bare
+      // command would ENOENT — spawn must use the resolved absolute path.
+      const resolvedPath = '/Applications/Codex.app/Contents/Resources/codex';
+      const detect = vi.fn().mockResolvedValue({ available: true, path: resolvedPath });
+      const { proc } = createFakeProc();
+      nextFakeProc = proc;
+
+      const ctr = new HeterogeneousAgentCtr({
+        appStoragePath,
+        storeManager: { get: vi.fn() },
+        toolDetectorManager: { detect },
+      } as any);
+      const { sessionId } = await ctr.startSession({
+        agentType: 'codex',
+        command: 'codex',
+      });
+      await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
+
+      expect(spawnCalls[0].command).toBe(resolvedPath);
+    });
+
+    it('carries the detector login-shell PATH into the spawn env for `env node` shims', async () => {
+      // `codex` resolved via the login-shell PATH (mise/nvm). Spawning the
+      // absolute shim under the leaner inherited PATH would fail at its
+      // `#!/usr/bin/env node` shebang — the resolved PATH must reach the child.
+      const resolvedPath = '/Users/h/.local/share/mise/shims/codex';
+      const searchPath = '/Users/h/.local/share/mise/shims:/usr/bin:/bin';
+      const detect = vi
+        .fn()
+        .mockResolvedValue({ available: true, path: resolvedPath, resolvedPathEnv: searchPath });
+      const { proc } = createFakeProc();
+      nextFakeProc = proc;
+
+      const ctr = new HeterogeneousAgentCtr({
+        appStoragePath,
+        storeManager: { get: vi.fn() },
+        toolDetectorManager: { detect },
+      } as any);
+      const { sessionId } = await ctr.startSession({ agentType: 'codex', command: 'codex' });
+      await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
+
+      expect(spawnCalls[0].command).toBe(resolvedPath);
+      expect(spawnCalls[0].options.env.PATH).toBe(searchPath);
+    });
+
+    it('keeps an explicit path-like command for spawn instead of the detector result', async () => {
+      // detectHeterogeneousCliCommand validates the custom path via --version.
+      execFileMock.mockImplementation(
+        (
+          _file: string,
+          _args: string[],
+          optionsOrCallback: unknown,
+          callback?: (error: Error | null, result: { stderr: string; stdout: string }) => void,
+        ) => {
+          const resolvedCallback =
+            typeof optionsOrCallback === 'function' ? optionsOrCallback : callback;
+          (resolvedCallback as any)?.(null, { stderr: '', stdout: 'codex-cli 0.99.0' });
+        },
+      );
+
+      const detect = vi.fn();
+      const { proc } = createFakeProc();
+      nextFakeProc = proc;
+
+      const ctr = new HeterogeneousAgentCtr({
+        appStoragePath,
+        storeManager: { get: vi.fn() },
+        toolDetectorManager: { detect },
+      } as any);
+      const { sessionId } = await ctr.startSession({
+        agentType: 'codex',
+        command: '/custom/bin/codex',
+      });
+      await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
+
+      expect(detect).not.toHaveBeenCalled();
+      expect(spawnCalls[0].command).toBe('/custom/bin/codex');
+    });
+
    it('passes prompt via stdin to codex exec instead of argv', async () => {
      const prompt = '--run a shell-like prompt safely';
      const { cliArgs, command, writes } = await runSendPrompt(prompt);
@@ -225,6 +225,7 @@ describe('LocalFileCtr', () => {
      });

      expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
+        accept: undefined,
        filePath: '/workspace/app.ts',
        workspaceRoot: '/workspace',
      });
@@ -247,6 +248,28 @@ describe('LocalFileCtr', () => {
        success: false,
      });
    });
+
+    it('should forward image-only preview URL constraints', async () => {
+      mockLocalFileProtocolManager.createPreviewUrl.mockResolvedValue(
+        'localfile://file/workspace/image.png?token=abc',
+      );
+
+      const result = await localFileCtr.getLocalFilePreviewUrl({
+        accept: 'image',
+        path: '/workspace/image.png',
+        workingDirectory: '/workspace',
+      });
+
+      expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
+        accept: 'image',
+        filePath: '/workspace/image.png',
+        workspaceRoot: '/workspace',
+      });
+      expect(result).toEqual({
+        success: true,
+        url: 'localfile://file/workspace/image.png?token=abc',
+      });
+    });
  });

  describe('getLocalFilePreview', () => {
@@ -263,6 +286,7 @@ describe('LocalFileCtr', () => {
      });

      expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
+        accept: undefined,
        filePath: '/workspace/app.ts',
        workspaceRoot: '/workspace',
      });
@@ -289,6 +313,34 @@ describe('LocalFileCtr', () => {
        success: false,
      });
    });
+
+    it('should forward image-only preview read constraints', async () => {
+      mockLocalFileProtocolManager.readPreviewFile.mockResolvedValue({
+        buffer: Buffer.from('image-bytes'),
+        contentType: 'image/png',
+        realPath: '/workspace/image.png',
+      });
+
+      const result = await localFileCtr.getLocalFilePreview({
+        accept: 'image',
+        path: '/workspace/image.png',
+        workingDirectory: '/workspace',
+      });
+
+      expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
+        accept: 'image',
+        filePath: '/workspace/image.png',
+        workspaceRoot: '/workspace',
+      });
+      expect(result).toEqual({
+        preview: {
+          base64: Buffer.from('image-bytes').toString('base64'),
+          contentType: 'image/png',
+          type: 'image',
+        },
+        success: true,
+      });
+    });
  });

  describe('handleWriteFile', () => {
@@ -54,6 +54,21 @@ export interface PreviewFileReadResult {
  realPath: string;
 }

+type PreviewFileAccept = 'image';
+
+const normalizeContentType = (contentType: string): string =>
+  contentType.split(';')[0].trim().toLowerCase();
+
+const isAcceptedPreviewContentType = (
+  contentType: string,
+  accept?: PreviewFileAccept,
+): boolean => {
+  if (!accept) return true;
+
+  const normalizedContentType = normalizeContentType(contentType);
+  return accept === 'image' && normalizedContentType.startsWith('image/');
+};
+
 /**
 * Custom `localfile://` protocol for project file previews.
 *
@@ -213,16 +228,26 @@ export class LocalFileProtocolManager {
  }

  async createPreviewUrl({
+    accept,
    filePath,
    workspaceRoot,
  }: {
+    accept?: PreviewFileAccept;
    filePath: string;
    workspaceRoot: string;
  }): Promise<string | null> {
    const normalizedFilePath = normalizeAbsolutePath(filePath);
    if (!normalizedFilePath) return null;

-    const realFilePath = await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
+    const realFilePath = accept
+      ? (
+          await this.readPreviewFile({
+            accept,
+            filePath,
+            workspaceRoot,
+          })
+        )?.realPath
+      : await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
    if (!realFilePath) return null;

    this.cleanupExpiredTokens();
@@ -237,9 +262,11 @@ export class LocalFileProtocolManager {
  }

  async readPreviewFile({
+    accept,
    filePath,
    workspaceRoot,
  }: {
+    accept?: PreviewFileAccept;
    filePath: string;
    workspaceRoot: string;
  }): Promise<PreviewFileReadResult | null> {
@@ -250,9 +277,12 @@ export class LocalFileProtocolManager {
    if (!fileStat.isFile()) return null;

    const buffer = await readFile(realFilePath);
+    const contentType = resolveLocalFileMimeType(realFilePath, buffer);
+    if (!isAcceptedPreviewContentType(contentType, accept)) return null;
+
    return {
      buffer,
-      contentType: resolveLocalFileMimeType(realFilePath, buffer),
+      contentType,
      realPath: realFilePath,
    };
  }
@@ -15,6 +15,15 @@ export interface ToolStatus {
  error?: string;
  lastChecked?: Date;
  path?: string;
+  /**
+   * PATH value used to resolve/validate the command, surfaced only when it
+   * differs from the detector process's `process.env.PATH` (e.g. resolution
+   * fell back to the login-shell PATH). A caller that spawns the resolved
+   * `path` must carry this into the child's PATH, or a `#!/usr/bin/env node`
+   * shim that resolved here still fails with `env: node: No such file or
+   * directory` under the leaner inherited env.
+   */
+  resolvedPathEnv?: string;
  version?: string;
 }

@@ -119,6 +119,21 @@ describe('LocalFileProtocolManager', () => {
    expect(response.headers.get('Content-Type')).toBe('text/plain; charset=utf-8');
  });

+  it('does not mint image-only preview URLs for text files', async () => {
+    const manager = new LocalFileProtocolManager();
+    await manager.approveWorkspaceRoot('/Users/alice/project');
+    mockReadFile.mockResolvedValue(Buffer.from('const value = 1;'));
+
+    const url = await manager.createPreviewUrl({
+      accept: 'image',
+      filePath: '/Users/alice/project/App.tsx',
+      workspaceRoot: '/Users/alice/project',
+    });
+
+    expect(url).toBeNull();
+    expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
+  });
+
  it('decodes percent-encoded characters in the path', async () => {
    const manager = new LocalFileProtocolManager();
    manager.registerHandler();
@@ -296,6 +311,21 @@ describe('LocalFileProtocolManager', () => {
    expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
  });

+  it('does not return text payloads for image-only preview reads', async () => {
+    const manager = new LocalFileProtocolManager();
+    await manager.approveIndexedProjectRoot('/Users/alice/project');
+    mockReadFile.mockResolvedValue(Buffer.from('SECRET=value'));
+
+    const result = await manager.readPreviewFile({
+      accept: 'image',
+      filePath: '/Users/alice/project/.env',
+      workspaceRoot: '/Users/alice/project',
+    });
+
+    expect(result).toBeNull();
+    expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/.env');
+  });
+
  it('does not read preview payloads outside the approved workspace root', async () => {
    const manager = new LocalFileProtocolManager();
    await manager.approveIndexedProjectRoot('/Users/alice/project');
@@ -16,6 +16,12 @@ import type { App } from '../App';
 // Create logger
 const logger = createLogger('core:Tray');

+// Debounce window for distinguishing a single-click from the leading edge of
+// a double-click. Electron delivers two `click` events before `double-click`,
+// so we defer the single-click action until this window passes — the
+// `double-click` handler clears it if it arrives in time.
+const CLICK_DEBOUNCE_MS = 250;
+
 export interface TrayOptions {
  /**
   * Tray icon path (relative to resource directory)
@@ -54,6 +60,12 @@ export class Tray {
   */
  private _contextMenu?: ElectronMenu;

+  /**
+   * Pending single-click timer. Cleared by the double-click handler so a
+   * double-click never accidentally fires startSession before showMainWindow.
+   */
+  private _clickTimer?: NodeJS.Timeout;
+
  /**
   * Identifier
   */
@@ -118,10 +130,25 @@ export class Tray {
      // Set default context menu
      this.setContextMenu();

-      // Left-click: open Quick Composer.
+      // Left-click: deferred so a follow-up `double-click` can pre-empt it.
      this._tray.on('click', () => {
        logger.debug(`[${this.identifier}] Tray clicked`);
-        this.onClick();
+        if (this._clickTimer) clearTimeout(this._clickTimer);
+        this._clickTimer = setTimeout(() => {
+          this._clickTimer = undefined;
+          this.onClick();
+        }, CLICK_DEBOUNCE_MS);
+      });
+
+      // Double-click (macOS / Windows): cancel the pending single-click and
+      // surface the main window instead.
+      this._tray.on('double-click', () => {
+        logger.debug(`[${this.identifier}] Tray double-clicked`);
+        if (this._clickTimer) {
+          clearTimeout(this._clickTimer);
+          this._clickTimer = undefined;
+        }
+        this.onDoubleClick();
      });

      // Right-click: pop the stored context menu manually so left-click stays
@@ -189,6 +216,18 @@ export class Tray {
    }
  }

+  /**
+   * Handle tray double-click event — surfaces the main window.
+   */
+  onDoubleClick() {
+    logger.debug(`[${this.identifier}] Tray double-click → showMainWindow`);
+    try {
+      this.app.browserManager.showMainWindow();
+    } catch (error) {
+      logger.error(`[${this.identifier}] Failed to show main window:`, error);
+    }
+  }
+
  /**
   * Replace the tray context menu with a pre-built Electron Menu instance.
   * Stored in-house and popped up manually on right-click to preserve
@@ -259,6 +298,10 @@ export class Tray {
   */
  destroy() {
    logger.debug(`Destroying tray instance: ${this.identifier}`);
+    if (this._clickTimer) {
+      clearTimeout(this._clickTimer);
+      this._clickTimer = undefined;
+    }
    if (this._tray) {
      this._tray.destroy();
      this._tray = undefined;
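
Review note: the click handling added to `Tray` above can be read as a small state machine around one pending timer. The sketch below isolates that pattern outside of Electron. `ClickDisambiguator` and its member names are hypothetical; only `CLICK_DEBOUNCE_MS = 250` and the clear-then-reschedule logic come from the diff.

```typescript
const CLICK_DEBOUNCE_MS = 250;

// Hypothetical helper mirroring the timer logic in Tray's click handlers.
class ClickDisambiguator {
  private timer?: ReturnType<typeof setTimeout>;
  private readonly onSingle: () => void;
  private readonly onDouble: () => void;
  private readonly delayMs: number;

  constructor(onSingle: () => void, onDouble: () => void, delayMs = CLICK_DEBOUNCE_MS) {
    this.onSingle = onSingle;
    this.onDouble = onDouble;
    this.delayMs = delayMs;
  }

  // True while a deferred single-click action is still waiting to fire.
  get pending(): boolean {
    return this.timer !== undefined;
  }

  // Each click restarts the window, so a burst fires the action only once.
  click(): void {
    this.cancel();
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.onSingle();
    }, this.delayMs);
  }

  // A double-click pre-empts the deferred single-click action.
  doubleClick(): void {
    this.cancel();
    this.onDouble();
  }

  // Also the right call on teardown, matching the cleanup in destroy().
  cancel(): void {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
  }
}
```

The trade-off is that every single click is delayed by the debounce window, which is the price of never running the single-click action ahead of a double-click.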
@@ -189,7 +189,7 @@ describe('Tray', () => {
      expect(mockElectronTray.setContextMenu).not.toHaveBeenCalled();
    });

-    it('should register both click and right-click listeners', () => {
+    it('should register click, double-click and right-click listeners', () => {
      tray = new Tray(
        {
          iconPath: 'tray.png',
@@ -200,6 +200,7 @@ describe('Tray', () => {

      const events = mockElectronTray.on.mock.calls.map((c: any[]) => c[0]);
      expect(events).toContain('click');
+      expect(events).toContain('double-click');
      expect(events).toContain('right-click');
    });

@@ -346,6 +347,96 @@ describe('Tray', () => {
    });
  });

+  describe('onDoubleClick', () => {
+    beforeEach(() => {
+      tray = new Tray(
+        {
+          iconPath: 'tray.png',
+          identifier: 'test-tray',
+        },
+        mockApp,
+      );
+    });
+
+    it('should show the main window', () => {
+      tray.onDoubleClick();
+
+      expect(mockApp.browserManager.showMainWindow).toHaveBeenCalled();
+    });
+
+    it('should not start the capture session', () => {
+      tray.onDoubleClick();
+
+      expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
+    });
+
+    it('should not throw when showMainWindow throws', () => {
+      vi.mocked(mockApp.browserManager.showMainWindow).mockImplementationOnce(() => {
+        throw new Error('window failed');
+      });
+
+      expect(() => tray.onDoubleClick()).not.toThrow();
+    });
+  });
+
+  describe('click vs double-click handling', () => {
+    let clickHandler: (() => void) | undefined;
+    let doubleClickHandler: (() => void) | undefined;
+
+    beforeEach(() => {
+      vi.useFakeTimers();
+      tray = new Tray(
+        {
+          iconPath: 'tray.png',
+          identifier: 'test-tray',
+        },
+        mockApp,
+      );
+
+      clickHandler = mockElectronTray.on.mock.calls.find((c: any[]) => c[0] === 'click')?.[1];
+      doubleClickHandler = mockElectronTray.on.mock.calls.find(
+        (c: any[]) => c[0] === 'double-click',
+      )?.[1];
+    });
+
+    afterEach(() => {
+      vi.useRealTimers();
+    });
+
+    it('should debounce single click before calling startSession', () => {
+      expect(clickHandler).toBeDefined();
+
+      clickHandler?.();
+      expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
+
+      vi.advanceTimersByTime(250);
+      expect(mockApp.screenCaptureManager.startSession).toHaveBeenCalledTimes(1);
+    });
+
+    it('should cancel the pending single click when double-click fires', () => {
+      expect(clickHandler).toBeDefined();
+      expect(doubleClickHandler).toBeDefined();
+
+      clickHandler?.();
+      clickHandler?.();
+      doubleClickHandler?.();
+
+      vi.advanceTimersByTime(1000);
+
+      expect(mockApp.screenCaptureManager.startSession).not.toHaveBeenCalled();
+      expect(mockApp.browserManager.showMainWindow).toHaveBeenCalledTimes(1);
+    });
+
+    it('should only fire startSession once per single-click burst', () => {
+      clickHandler?.();
+      clickHandler?.();
+
+      vi.advanceTimersByTime(250);
+
+      expect(mockApp.screenCaptureManager.startSession).toHaveBeenCalledTimes(1);
+    });
+  });
+
  describe('updateIcon', () => {
    beforeEach(() => {
      tray = new Tray(
@@ -1,5 +1,6 @@
 import * as childProcess from 'node:child_process';
 import * as os from 'node:os';
+import path from 'node:path';

 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

@@ -180,6 +181,76 @@ describe('cliAgentDetectors', () => {
      expect(status.path).toBe('/usr/local/bin/claude');
      expect(execMock).not.toHaveBeenCalled();
      expect(execFileMock).toHaveBeenCalledTimes(2);
+      // Resolved on the inherited PATH — nothing extra to carry into spawn.
+      expect(status.resolvedPathEnv).toBeUndefined();
+    });
+
+    it('falls back to the Codex.app bundled CLI when `codex` is not on any PATH', async () => {
+      const originalPath = process.env.PATH;
+      const originalShell = process.env.SHELL;
+      // Deterministic env: no SHELL → no login-shell lookup, merged PATH
+      // equals process.env.PATH → no second `which` attempt.
+      process.env.PATH = '/usr/bin:/bin';
+      delete process.env.SHELL;
+
+      try {
+        callExecFileError(new Error('not found')); // which codex
+        callExecFile('codex-cli 0.138.0'); // bundled CLI --version
+
+        const { codexDetector } = await import('../cliAgentDetectors');
+        const status = await codexDetector.detect();
+
+        expect(status.available).toBe(true);
+        expect(status.path).toBe('/Applications/Codex.app/Contents/Resources/codex');
+        expect(status.version).toBe('codex-cli 0.138.0');
+
+        expect(execFileMock).toHaveBeenCalledTimes(2);
+        expect(execFileMock.mock.calls[0]![0]).toBe('which');
+        expect(execFileMock.mock.calls[1]![0]).toBe(
+          '/Applications/Codex.app/Contents/Resources/codex',
+        );
+      } finally {
+        process.env.PATH = originalPath;
+        if (originalShell === undefined) delete process.env.SHELL;
+        else process.env.SHELL = originalShell;
+      }
+    });
+
+    it('stays unavailable when neither PATH nor the well-known locations have codex', async () => {
+      const originalPath = process.env.PATH;
+      const originalShell = process.env.SHELL;
+      process.env.PATH = '/usr/bin:/bin';
+      delete process.env.SHELL;
+
+      try {
+        callExecFileError(new Error('not found')); // which codex
+        callExecFileError(new Error('ENOENT')); // /Applications candidate
+        callExecFileError(new Error('ENOENT')); // ~/Applications candidate
+
+        const { codexDetector } = await import('../cliAgentDetectors');
+        const status = await codexDetector.detect();
+
+        expect(status.available).toBe(false);
+        expect(execFileMock).toHaveBeenCalledTimes(3);
+        expect(execFileMock.mock.calls[2]![0]).toBe(
+          path.join(os.homedir(), 'Applications', 'Codex.app', 'Contents', 'Resources', 'codex'),
+        );
+      } finally {
+        process.env.PATH = originalPath;
+        if (originalShell === undefined) delete process.env.SHELL;
+        else process.env.SHELL = originalShell;
+      }
+    });
+
+    it('does not probe well-known locations for an explicit path-like command', async () => {
+      callExecFileError(new Error('ENOENT')); // /custom/bin/codex --version
+
+      const { detectHeterogeneousCliCommand } = await import('../cliAgentDetectors');
+      const status = await detectHeterogeneousCliCommand('codex', '/custom/bin/codex');
+
+      expect(status.available).toBe(false);
+      // Only the explicit path's --version attempt — no fallback probing.
+      expect(execFileMock).toHaveBeenCalledTimes(1);
    });

    it('falls back to the login shell PATH for tools installed by shell setup', async () => {
@@ -200,6 +271,12 @@ describe('cliAgentDetectors', () => {
        expect(status.available).toBe(true);
        expect(status.path).toBe('/Users/Hanam/.local/share/mise/shims/gemini');
        expect(status.version).toBe('gemini 0.2.0');
+        // The login-shell PATH that resolved the shim must be surfaced so the
+        // spawn site can carry it into the child env (mise/nvm `node` lives
+        // there, not on the leaner inherited PATH).
+        expect(status.resolvedPathEnv).toBe(
+          '/opt/homebrew/bin:/Users/Hanam/.local/share/mise/shims:/usr/bin:/bin',
+        );

        expect(execFileMock).toHaveBeenCalledTimes(4);
        expect(execFileMock.mock.calls[0]![0]).toBe('which');
@@ -1,5 +1,5 @@
 import { exec, execFile } from 'node:child_process';
-import { platform } from 'node:os';
+import { homedir, platform } from 'node:os';
 import path from 'node:path';
 import { promisify } from 'node:util';

@@ -190,6 +190,11 @@ const detectValidatedCommand = async (
    return {
      available: true,
      path: resolvedPath,
+      // `env` is set only when resolution fell back to the login-shell PATH.
+      // Surface that PATH so the spawn site can carry it into the child env —
+      // otherwise a `#!/usr/bin/env node` shim resolved here can't find `node`
+      // under the leaner inherited PATH (Finder-launched Electron).
+      resolvedPathEnv: env?.PATH,
      version: output.split(/\r?\n/)[0],
    };
  } catch {
@@ -209,6 +214,27 @@ const HETEROGENEOUS_CLI_AGENT_OPTIONS = {
  Pick<ValidatedDetectorOptions, 'validateKeywords'>
 >;

+// Well-known absolute install locations probed when a bare command isn't on
+// PATH. The Codex desktop app bundles a fully functional CLI inside Codex.app
+// (sharing ~/.codex auth/config) but never symlinks it into PATH, so
+// `which codex` misses an otherwise working install.
+const getWellKnownCommandPaths = (agentType: HeterogeneousCliAgentType): string[] => {
+  if (platform() !== 'darwin') return [];
+
+  switch (agentType) {
+    case 'codex': {
+      const bundledCli = path.join('Codex.app', 'Contents', 'Resources', 'codex');
+      return [
+        path.join('/Applications', bundledCli),
+        path.join(homedir(), 'Applications', bundledCli),
+      ];
+    }
+    default: {
+      return [];
+    }
+  }
+};
+
 export const detectHeterogeneousCliCommand = async (
  agentType: HeterogeneousCliAgentType,
  command: string,
@@ -216,7 +242,20 @@ export const detectHeterogeneousCliCommand = async (
  const validator = HETEROGENEOUS_CLI_AGENT_OPTIONS[agentType];
  if (!validator) return { available: false };

-  return detectValidatedCommand(command, validator);
+  const status = await detectValidatedCommand(command, validator);
+  if (status.available) return status;
+
+  // A bare command missing from PATH may still live at a well-known install
+  // location (e.g. the Codex desktop app's bundled CLI). Don't second-guess
+  // an explicit user-configured path.
+  if (!command.trim().includes(path.sep)) {
+    for (const candidate of getWellKnownCommandPaths(agentType)) {
+      const fallbackStatus = await detectValidatedCommand(candidate, validator);
+      if (fallbackStatus.available) return fallbackStatus;
+    }
+  }
+
+  return status;
 };

 /**
@@ -261,14 +300,17 @@ export const claudeCodeDetector: IToolDetector = createValidatedDetector({
 /**
 * OpenAI Codex CLI
 * @see https://github.com/openai/codex
+ *
+ * Goes through `detectHeterogeneousCliCommand` so the Codex.app bundled-CLI
+ * fallback applies here too, keeping the manager path and the custom-command
+ * path in sync.
 */
-export const codexDetector: IToolDetector = createValidatedDetector({
-  candidates: ['codex'],
+export const codexDetector: IToolDetector = {
  description: 'Codex - OpenAI agentic coding CLI',
+  detect: () => detectHeterogeneousCliCommand('codex', 'codex'),
  name: 'codex',
  priority: 2,
-  validateKeywords: ['codex'],
-});
+};

 /**
 * Google Gemini CLI
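
Review note: the `resolvedPathEnv` doc comment says a caller that spawns the resolved `path` must carry that PATH into the child env, but no spawn site appears in this part of the diff. The sketch below shows one way such a caller might build the env. `ResolvedTool`, `EnvMap`, and `buildChildEnv` are hypothetical names introduced for illustration.

```typescript
// Hypothetical subset of ToolStatus, limited to the fields used here.
interface ResolvedTool {
  path?: string;
  resolvedPathEnv?: string;
}

type EnvMap = Record<string, string | undefined>;

// Build the env for the child process. PATH is overridden only when the
// detector reported a PATH that differs from the inherited one, so the
// common case (resolved on the inherited PATH) passes the env through as is.
const buildChildEnv = (status: ResolvedTool, inheritedEnv: EnvMap): EnvMap =>
  status.resolvedPathEnv === undefined
    ? { ...inheritedEnv }
    : { ...inheritedEnv, PATH: status.resolvedPathEnv };
```

A spawn site could then pass `{ env: buildChildEnv(status, process.env) }` as the options to `child_process.spawn`, so a `#!/usr/bin/env node` shim finds `node` on the login-shell PATH.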
@@ -15,13 +15,21 @@ const mocks = vi.hoisted(() => ({
  ),
 }));

-const mockGlobalConfigDependencies = (enableBusinessFeatures: boolean) => {
+interface MockGlobalConfigOptions {
+  agentGatewayUrl?: string;
+  enableAgentGateway?: boolean;
+}
+
+const mockGlobalConfigDependencies = (
+  enableBusinessFeatures: boolean,
+  options: MockGlobalConfigOptions = {},
+) => {
  vi.doMock('@lobechat/business-const', () => ({
    ENABLE_BUSINESS_FEATURES: enableBusinessFeatures,
  }));

-  vi.doMock('@/config/klavis', () => ({
-    klavisEnv: {},
+  vi.doMock('@/config/composio', () => ({
+    composioEnv: {},
  }));

  vi.doMock('@/const/version', () => ({
@@ -29,7 +37,12 @@ const mockGlobalConfigDependencies = (enableBusinessFeatures: boolean) => {
  }));

  vi.doMock('@/envs/app', () => ({
-    appEnv: {},
+    appEnv: {
+      ...(options.agentGatewayUrl ? { AGENT_GATEWAY_URL: options.agentGatewayUrl } : {}),
+      ...(options.enableAgentGateway === undefined
+        ? {}
+        : { ENABLE_AGENT_GATEWAY: options.enableAgentGateway }),
+    },
    getAppConfig: vi.fn(() => ({
      DEFAULT_AGENT_CONFIG: '',
    })),
@@ -113,6 +126,18 @@ const loadCapturedProviderConfig = async (enableBusinessFeatures: boolean) => {
  >;
 };

+const loadServerConfig = async (
+  enableBusinessFeatures: boolean,
+  options?: MockGlobalConfigOptions,
+) => {
+  vi.resetModules();
+  mocks.genServerAiProvidersConfig.mockClear();
+  mockGlobalConfigDependencies(enableBusinessFeatures, options);
+
+  const { getServerGlobalConfig } = await import('./index');
+  return getServerGlobalConfig();
+};
+
 describe('getServerGlobalConfig', () => {
  afterEach(() => {
    vi.restoreAllMocks();
@@ -139,4 +164,36 @@ describe('getServerGlobalConfig', () => {
    expect(providerConfig[ModelProvider.OpenAI]).toBeUndefined();
    expect(providerConfig[ModelProvider.DeepSeek].enabled).toBe(true);
  });
+
+  it('should enable gateway mode for business builds', async () => {
+    await expect(loadServerConfig(true)).resolves.toMatchObject({
+      enableGatewayMode: true,
+    });
+  });
+
+  it('should enable gateway mode for self-hosted builds only when explicitly enabled with a gateway url', async () => {
+    await expect(
+      loadServerConfig(false, {
+        agentGatewayUrl: 'https://gateway.test.com',
+        enableAgentGateway: true,
+      }),
+    ).resolves.toMatchObject({
+      agentGatewayUrl: 'https://gateway.test.com',
+      enableGatewayMode: true,
+    });
+
+    await expect(
+      loadServerConfig(false, {
+        agentGatewayUrl: 'https://gateway.test.com',
+        enableAgentGateway: false,
+      }),
+    ).resolves.toMatchObject({
+      agentGatewayUrl: 'https://gateway.test.com',
+      enableGatewayMode: false,
+    });
+
+    await expect(loadServerConfig(false, { enableAgentGateway: true })).resolves.toMatchObject({
+      enableGatewayMode: false,
+    });
+  });
 });
@@ -1,7 +1,7 @@
 import { ENABLE_BUSINESS_FEATURES } from '@lobechat/business-const';
 import { ModelProvider } from 'model-bank';

-import { klavisEnv } from '@/config/klavis';
+import { composioEnv } from '@/config/composio';
 import { isDesktop } from '@/const/version';
 import { appEnv, getAppConfig } from '@/envs/app';
 import { authEnv } from '@/envs/auth';
@@ -104,7 +104,9 @@ export const getServerGlobalConfig = async () => {
    disableEmailPassword: authEnv.AUTH_DISABLE_EMAIL_PASSWORD,
    enableBusinessFeatures: ENABLE_BUSINESS_FEATURES,
    enableEmailVerification: authEnv.AUTH_EMAIL_VERIFICATION,
-    enableKlavis: !!klavisEnv.KLAVIS_API_KEY,
+    enableComposio: !!composioEnv.COMPOSIO_API_KEY,
+    enableGatewayMode:
+      ENABLE_BUSINESS_FEATURES || (!!appEnv.ENABLE_AGENT_GATEWAY && !!appEnv.AGENT_GATEWAY_URL),
    enableLobehubSkill: !!(appEnv.MARKET_TRUSTED_CLIENT_SECRET && appEnv.MARKET_TRUSTED_CLIENT_ID),
    enableMagicLink: authEnv.AUTH_ENABLE_MAGIC_LINK,
    enableMarketTrustedClient: !!(
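
Review note: the `enableGatewayMode` expression above combines three inputs, and the new tests cover four combinations of them. The sketch below restates the same boolean logic as a standalone function so the cases can be checked directly. `GatewayInputs` and `resolveGatewayMode` are hypothetical names; the expression itself is taken from the diff.

```typescript
// Hypothetical input shape standing in for ENABLE_BUSINESS_FEATURES and appEnv.
interface GatewayInputs {
  agentGatewayUrl?: string;
  businessFeatures: boolean;
  enableAgentGateway?: boolean;
}

// Business builds always use gateway mode. Self-hosted builds need both the
// explicit flag and a non-empty gateway URL.
const resolveGatewayMode = ({
  agentGatewayUrl,
  businessFeatures,
  enableAgentGateway,
}: GatewayInputs): boolean =>
  businessFeatures || (!!enableAgentGateway && !!agentGatewayUrl);
```

Note that a self-hosted build with the flag set but no URL stays in non-gateway mode, which matches the last assertion in the test file.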
@@ -14,14 +14,14 @@ import {
 } from '@lobechat/agent-runtime';
 import { LobeActivatorIdentifier } from '@lobechat/builtin-tool-activator';
 import {
+  type ComposioServiceSummary,
  type CredSummary,
+  generateComposioServicesList,
  generateCredsList,
-  generateKlavisServicesList,
-  type KlavisServiceSummary,
 } from '@lobechat/builtin-tool-creds';
 import { LocalSystemManifest } from '@lobechat/builtin-tool-local-system';
 import { BRANDING_PROVIDER } from '@lobechat/business-const';
-import { KLAVIS_SERVER_TYPES } from '@lobechat/const';
+import { COMPOSIO_APP_TYPES } from '@lobechat/const';
 import {
  type AgentContextDocument,
  type AgentGroupConfig,
@@ -61,13 +61,14 @@ import { chainCompressContext } from '@lobechat/prompts';
 import {
  type ChatToolPayload,
  type ExecSubAgentParams,
+  type ExecVirtualSubAgentParams,
  type MessageToolCall,
  type UIChatMessage,
 } from '@lobechat/types';
 import { sanitizeToolCallArguments, serializePartsForStorage } from '@lobechat/utils';
 import debug from 'debug';

-import { klavisEnv } from '@/config/klavis';
+import { composioEnv } from '@/config/composio';
 import { type MessageModel, MessageModel as MessageModelClass } from '@/database/models/message';
 import { TopicModel } from '@/database/models/topic';
 import { UserModel } from '@/database/models/user';
@@ -323,7 +324,7 @@ const buildPostProcessUrl = (
 };

 /**
- * Build the per-tool-call server sub-agent runner injected into the tool
+ * Build the per-tool-call server virtual sub-agent runner injected into the tool
 * execution context. Closes over the current tool payload + parent message so
 * the `callSubAgent` server tool can fork a child op without re-deriving the
 * message anchor (which it cannot do correctly from its own context).
@@ -331,17 +332,18 @@ const buildPostProcessUrl = (
 * The runner creates the pending placeholder tool message that anchors the
 * isolation thread (so the UI shows a loading state and the completion bridge
 * has a message to backfill), then kicks off the child op asynchronously and
- * returns immediately. Returns `undefined` when sub-agent execution is not
- * available (no `execSubAgent` callback, or missing agent/topic context).
+ * returns immediately. Returns `undefined` when virtual sub-agent execution is
+ * not available (no `execVirtualSubAgent` callback, or missing agent/topic
+ * context).
 */
-const buildServerSubAgentRunner = (
+const buildServerVirtualSubAgentRunner = (
  ctx: RuntimeExecutorContext,
  state: AgentState,
  chatToolPayload: ChatToolPayload,
  parentMessageId: string,
 ): ServerSubAgentRunner | undefined => {
-  const execSubAgent = ctx.execSubAgent;
-  if (!execSubAgent) return undefined;
+  const execVirtualSubAgent = ctx.execVirtualSubAgent;
+  if (!execVirtualSubAgent) return undefined;

  const agentId = state.metadata?.agentId;
  const topicId = ctx.topicId ?? state.metadata?.topicId;
@@ -364,16 +366,15 @@ const buildServerSubAgentRunner = (
        topicId,
      });

-      // 2. Fork the child op anchored to the placeholder. `resumeParentOnComplete`
-      //    tells execSubAgent to register the completion bridge that
-      //    backfills this tool message and resumes the parent op.
-      const result = (await execSubAgent({
+      // 2. Fork the virtual child op anchored to the placeholder. The virtual
+      //    entry marks the child as `isSubAgent` and registers the completion
+      //    bridge that backfills this tool message and resumes the parent op.
+      const result = (await execVirtualSubAgent({
        agentId: targetAgentId ?? agentId,
        groupId: state.metadata?.groupId ?? undefined,
        instruction,
        parentMessageId: placeholder.id,
        parentOperationId: ctx.operationId,
-        resumeParentOnComplete: true,
        timeout,
        title: description,
        topicId,
@@ -387,7 +388,7 @@ const buildServerSubAgentRunner = (
          await ctx.messageModel.deleteMessage(placeholder.id);
        } catch (error) {
          log(
-            'buildServerSubAgentRunner: failed to clean up placeholder %s: %O',
+            'buildServerVirtualSubAgentRunner: failed to clean up placeholder %s: %O',
            placeholder.id,
            error,
          );
@@ -522,11 +523,17 @@ export interface RuntimeExecutorContext {
  discordContext?: any;
  evalContext?: EvalContext;
  /**
-   * Callback to spawn a sub-agent task server-side.
+   * Callback to run a legacy agent invocation server-side.
   * Injected by AiAgentService so exec_sub_agent / exec_sub_agents executors
-   * can dispatch callAgent-triggered tasks without a circular import.
+   * can dispatch callAgent-triggered runs without a circular import.
   */
  execSubAgent?: (params: ExecSubAgentParams) => Promise<unknown>;
+  /**
+   * Callback to fork a `lobe-agent.callSubAgent` virtual child run. Unlike
+   * execSubAgent, this path installs the async completion bridge and marks the
+   * child operation as a sub-agent.
+   */
+  execVirtualSubAgent?: (params: ExecVirtualSubAgentParams) => Promise<unknown>;
  hookDispatcher?: HookDispatcher;
  loadAgentState?: (operationId: string) => Promise<AgentState | null>;
  messageModel: MessageModel;
@@ -992,39 +999,38 @@ export const createRuntimeExecutors = (
          }
        }

-        // {{KLAVIS_SERVICES_LIST}} — used by lobe-creds system role (Klavis integrations section).
-        // Mirrors client-side: klavisStoreSelectors.getServers() filtered by connection status.
-        let klavisServicesListStr = '';
-        if (ctx.serverDB && ctx.userId && !!klavisEnv.KLAVIS_API_KEY) {
+        // {{COMPOSIO_SERVICES_LIST}} — used by lobe-creds system role (Composio integrations section).
+        let composioServicesListStr = '';
+        if (ctx.serverDB && ctx.userId && !!composioEnv.COMPOSIO_API_KEY) {
          try {
            const { PluginModel } = await import('@/database/models/plugin');
            const pluginModel = new PluginModel(ctx.serverDB, ctx.userId, ctx.workspaceId);
            const allPlugins = await pluginModel.query();
-            const validKlavisIds = new Set(KLAVIS_SERVER_TYPES.map((t) => t.identifier));
+            const validComposioIds = new Set(COMPOSIO_APP_TYPES.map((t) => t.identifier));
            const connectedIds = new Set(
              allPlugins
                .filter(
                  (p) =>
-                    validKlavisIds.has(p.identifier) &&
-                    (p.customParams as any)?.klavis?.isAuthenticated === true,
+                    validComposioIds.has(p.identifier) &&
+                    (p.customParams as any)?.composio?.status === 'ACTIVE',
                )
                .map((p) => p.identifier),
            );
-            const connected: KlavisServiceSummary[] = KLAVIS_SERVER_TYPES.filter((t) =>
+            const connected: ComposioServiceSummary[] = COMPOSIO_APP_TYPES.filter((t) =>
              connectedIds.has(t.identifier),
            ).map((t) => ({ identifier: t.identifier, name: t.label }));
-            const available: KlavisServiceSummary[] = KLAVIS_SERVER_TYPES.filter(
+            const available: ComposioServiceSummary[] = COMPOSIO_APP_TYPES.filter(
              (t) => !connectedIds.has(t.identifier),
            ).map((t) => ({ identifier: t.identifier, name: t.label }));
-            klavisServicesListStr = generateKlavisServicesList(connected, available);
+            composioServicesListStr = generateComposioServicesList(connected, available);
            log(
-              'Fetched Klavis services for {{KLAVIS_SERVICES_LIST}}: connected=%d, available=%d',
+              'Fetched Composio services for {{COMPOSIO_SERVICES_LIST}}: connected=%d, available=%d',
              connected.length,
              available.length,
            );
          } catch (error) {
            log(
-              'Failed to fetch Klavis services for {{KLAVIS_SERVICES_LIST}} substitution: %O',
+              'Failed to fetch Composio services for {{COMPOSIO_SERVICES_LIST}} substitution: %O',
              error,
            );
          }
@@ -1048,7 +1054,7 @@ export const createRuntimeExecutors = (
            sandbox_enabled: sandboxEnabled,
            sandbox_uploaded_files: sandboxUploadedFiles,
            CREDS_LIST: credsListStr,
-            KLAVIS_SERVICES_LIST: klavisServicesListStr,
+            COMPOSIO_SERVICES_LIST: composioServicesListStr,
            // Memory tool variables
            memory_effort: memoryEffort,
          },
@@ -2439,7 +2445,7 @@ export const createRuntimeExecutors = (
          execution = { attempts: 1, result: dispatchResult };
        } else {
          // Inject source from sourceMap so BuiltinToolsExecutor can route
-          // lobehubSkill / klavis tools correctly (LLM responses don't carry source)
+          // lobehubSkill / composio tools correctly (LLM responses don't carry source)
          if (toolSource && !chatToolPayload.source) {
            chatToolPayload.source = toolSource;
          }
@@ -2476,7 +2482,7 @@ export const createRuntimeExecutors = (
                scope: state.metadata?.scope,
                serverDB: ctx.serverDB,
                skipResultTruncation: true,
-                subAgent: buildServerSubAgentRunner(
+                subAgent: buildServerVirtualSubAgentRunner(
                  ctx,
                  state,
                  chatToolPayload,
@@ -2718,14 +2724,15 @@ export const createRuntimeExecutors = (

        log('[%s:%d] Tool execution completed', operationId, stepIndex);

-        // When the tool result carries an execSubAgent / execSubAgents state the
-        // GeneralChatAgent needs `stop: true` in the payload to detect it and
-        // emit the matching exec_sub_agent / exec_sub_agents instruction.  Without
-        // this flag the agent falls through to the normal LLM-call path and the
-        // sub-agent is never spawned.
-        const execTaskStateType = executionResult.state?.type as string | undefined;
-        const isExecTaskState =
-          execTaskStateType === 'execSubAgent' || execTaskStateType === 'execSubAgents';
+        // When a legacy callAgent task result carries execSubAgent / execSubAgents
+        // state, the GeneralChatAgent needs `stop: true` in the payload to detect
+        // it and emit the matching exec_sub_agent / exec_sub_agents instruction.
+        // Without this flag the agent falls through to the normal LLM-call path
+        // and the background agent run is never spawned.
+        const legacyAgentInvocationStateType = executionResult.state?.type as string | undefined;
+        const isLegacyAgentInvocationState =
+          legacyAgentInvocationStateType === 'execSubAgent' ||
+          legacyAgentInvocationStateType === 'execSubAgents';

        executeToolSpan.setAttributes(
          buildExecuteToolResultAttributes({ attempts: execution.attempts, success: isSuccess }),
@@ -2741,7 +2748,7 @@ export const createRuntimeExecutors = (
              isSuccess,
              // Pass tool message ID as parentMessageId for the next LLM call
              parentMessageId: toolMessageId,
-              ...(isExecTaskState && { stop: true }),
+              ...(isLegacyAgentInvocationState && { stop: true }),
              toolCall: chatToolPayload,
              toolCallId: chatToolPayload.id,
            },
@@ -3018,7 +3025,7 @@ export const createRuntimeExecutors = (
              execution = { attempts: 1, result: dispatchResult };
            } else {
              // Inject source from sourceMap so BuiltinToolsExecutor can route
-              // lobehubSkill / klavis tools correctly (LLM responses don't carry source)
+              // lobehubSkill / composio tools correctly (LLM responses don't carry source)
              const batchToolSource =
                state.operationToolSet?.sourceMap?.[chatToolPayload.identifier] ??
                state.toolSourceMap?.[chatToolPayload.identifier];
@@ -3048,7 +3055,7 @@ export const createRuntimeExecutors = (
                    scope: state.metadata?.scope,
                    serverDB: ctx.serverDB,
                    skipResultTruncation: true,
-                    subAgent: buildServerSubAgentRunner(
+                    subAgent: buildServerVirtualSubAgentRunner(
                      ctx,
                      state,
                      chatToolPayload,
@@ -76,11 +76,11 @@ vi.mock('model-bank', () => ({
  LOBE_DEFAULT_MODEL_LIST: mockBuiltinModels,
 }));

-// klavisEnv uses @t3-oss/env-nextjs which throws in jsdom (treats it as client context)
-vi.mock('@/config/klavis', () => ({
-  getKlavisConfig: vi.fn(),
-  getServerKlavisApiKey: vi.fn().mockReturnValue(undefined),
-  klavisEnv: { KLAVIS_API_KEY: undefined },
+// composioEnv uses @t3-oss/env-nextjs which throws in jsdom (treats it as client context)
+vi.mock('@/config/composio', () => ({
+  getComposioConfig: vi.fn(),
+  getServerComposioApiKey: vi.fn().mockReturnValue(undefined),
+  composioEnv: { COMPOSIO_API_KEY: undefined },
 }));

 // fileEnv uses @t3-oss/env-core; stub the only field the runtime reads so the
@@ -132,6 +132,14 @@ describe('formatErrorForState', () => {
      expect(result.countAsFailure).toBeUndefined();
      expect(result.numericId).toBeUndefined();
    });
+
+    it('classifies a raw Drizzle "Failed query" Error via its message instead of a bare 500', () => {
+      const result = formatErrorForState(new Error('Failed query: rollback\nparams: '));
+
+      expect(result.type).toBe(AgentRuntimeErrorType.DatabasePersistError);
+      expect(result.numericId).toBe(7004);
+      expect(result.attribution).toBe('harness');
+    });
  });

  describe('ProviderBizError refinement', () => {
@@ -86,7 +86,7 @@ export const createServerToolsEngine = (
  // Combine all manifests, then drop anything whose identifier the caller
  // has explicitly forbidden for this turn. The post-merge filter closes
  // the second half of the wall: an installed plugin or a
-  // Skill/Klavis manifest claiming `lobe-remote-device` would otherwise
+  // Skill/Composio manifest claiming `lobe-remote-device` would otherwise
  // slip through `buildAllowedBuiltinTools` (which only touches the
  // builtin source).
  const combinedManifests = [...pluginManifests, ...builtinManifests, ...additionalManifests];
@@ -256,7 +256,7 @@ export const createServerAgentToolsEngine = (
      : isChatMode
        ? chatModeAllowedToolIds
        : defaultToolIds,
-    // Post-merge wall: a plugin or Skill/Klavis manifest claiming a
+    // Post-merge wall: a plugin or Skill/Composio manifest claiming a
    // device identifier survives `buildAllowedBuiltinTools` (which only
    // filters the builtin source). Excluding the identifiers here drops
    // them from the combined `manifestSchemas` so the activator cannot
@@ -22,7 +22,7 @@ export interface ServerAgentToolsContext {
 * Configuration options for createServerToolsEngine
 */
 export interface ServerAgentToolsEngineConfig {
-  /** Additional manifests to include (e.g., Klavis tools) */
+  /** Additional manifests to include (e.g., Composio tools) */
  additionalManifests?: LobeToolManifest[];
  /**
   * Override the list of builtin tools fed into the engine's
@@ -39,7 +39,7 @@ export interface ServerAgentToolsEngineConfig {
  /**
   * Identifiers to drop from `manifestSchemas` after combining plugin,
   * builtin, and additional manifests. Filtering builtins alone is not
-   * enough: an installed plugin or a Skill/Klavis manifest can declare
+   * enough: an installed plugin or a Skill/Composio manifest can declare
   * `identifier: 'lobe-remote-device'` and slip past `buildAllowedBuiltinTools`.
   * This is the final post-merge wall referenced in .
   */
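The post-merge wall described in the comments above can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's actual code: `Manifest`, `combineAndFilter`, and the `web-search` identifier are hypothetical, and the real `LobeToolManifest` carries more fields.

```typescript
// Hypothetical, simplified manifest shape (assumption for illustration only).
interface Manifest {
  identifier: string;
  source: 'plugin' | 'builtin' | 'additional';
}

// Merge every source first, then drop excluded identifiers from the combined
// list, so a non-builtin manifest cannot reintroduce a forbidden identifier.
function combineAndFilter(
  pluginManifests: Manifest[],
  builtinManifests: Manifest[],
  additionalManifests: Manifest[],
  excludeIdentifiers: string[] = [],
): Manifest[] {
  const excluded = new Set(excludeIdentifiers);
  return [...pluginManifests, ...builtinManifests, ...additionalManifests].filter(
    (manifest) => !excluded.has(manifest.identifier),
  );
}

const filteredManifests = combineAndFilter(
  // A plugin claiming the device identifier.
  [{ identifier: 'lobe-remote-device', source: 'plugin' }],
  [{ identifier: 'web-search', source: 'builtin' }],
  // An additional (e.g. Composio) manifest claiming the same identifier.
  [{ identifier: 'lobe-remote-device', source: 'additional' }],
  ['lobe-remote-device'],
);
```

Filtering after the merge, rather than per source, is what keeps the exclusion independent of where a manifest came from.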
@@ -9,10 +9,16 @@ import { KnowledgeBaseModel } from '@/database/models/knowledgeBase';
 import { SessionModel } from '@/database/models/session';
 import { UserModel } from '@/database/models/user';
 import { AgentService } from '@/server/services/agent';
+import { EditLockService } from '@/server/services/editLock';
+import { publishResourceEvent } from '@/server/services/resourceEvents';
 import { KnowledgeType } from '@/types/knowledgeBase';

 import { agentRouter } from '../agent';

+vi.mock('@/server/services/resourceEvents', () => ({ publishResourceEvent: vi.fn() }));
+
+const publishResourceEventMock = vi.mocked(publishResourceEvent);
+
 vi.mock('@/database/models/user', () => ({
  UserModel: {
    findById: vi.fn(),
@@ -329,4 +335,122 @@ describe('agentRouter', () => {
      expect(agentModelMock.update).toHaveBeenCalledWith(mockInput.id, { pinned: false });
    });
  });
+
+  describe('edit lock', () => {
+    const wsCtx = () => ({ ...mockCtx, workspaceId: 'ws-1' });
+
+    describe('updateAgentConfig write guard', () => {
+      it('rejects the update when another member holds the lock', async () => {
+        agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
+        vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
+
+        const caller = agentRouter.createCaller(wsCtx());
+
+        await expect(
+          caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } }),
+        ).rejects.toMatchObject({ code: 'CONFLICT' });
+        expect(agentServiceMock.updateAgentConfig).not.toHaveBeenCalled();
+      });
+
+      it('allows the update when no other member holds the lock', async () => {
+        agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
+        vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
+
+        const caller = agentRouter.createCaller(wsCtx());
+        await caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } });
+
+        expect(agentServiceMock.updateAgentConfig).toHaveBeenCalledWith('agent-1', {
+          systemRole: 'x',
+        });
+      });
+
+      it('does not check the lock for personal (non-workspace) agents', async () => {
+        agentServiceMock.updateAgentConfig = vi.fn().mockResolvedValue({ id: 'agent-1' });
+        const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
+
+        const caller = agentRouter.createCaller(mockCtx);
+        await caller.updateAgentConfig({ agentId: 'agent-1', value: { systemRole: 'x' } });
+
+        expect(guardSpy).not.toHaveBeenCalled();
+        expect(agentServiceMock.updateAgentConfig).toHaveBeenCalled();
+      });
+    });
+
+    describe('acquireAgentLock', () => {
+      it('returns unlocked without touching the lock service for personal agents', async () => {
+        const acquireSpy = vi.spyOn(EditLockService.prototype, 'acquire');
+
+        const caller = agentRouter.createCaller(mockCtx);
+        const result = await caller.acquireAgentLock({ agentId: 'agent-1' });
+
+        expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
+        expect(acquireSpy).not.toHaveBeenCalled();
+      });
+
+      it('broadcasts lock.changed on a holder edge (first claim)', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(undefined);
+        vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
+          expiresAt: new Date(),
+          holderId: userId,
+          lockedByOther: false,
+        });
+
+        const caller = agentRouter.createCaller(wsCtx());
+        await caller.acquireAgentLock({ agentId: 'agent-1' });
+
+        expect(publishResourceEventMock).toHaveBeenCalledWith(
+          { id: 'agent-1', type: 'agent' },
+          expect.objectContaining({ data: { holderId: userId }, type: 'lock.changed' }),
+        );
+      });
+
+      it('does NOT broadcast on a steady-state heartbeat (same holder)', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(userId);
+        vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
+          expiresAt: new Date(),
+          holderId: userId,
+          lockedByOther: false,
+        });
+
+        const caller = agentRouter.createCaller(wsCtx());
+        await caller.acquireAgentLock({ agentId: 'agent-1' });
+
+        expect(publishResourceEventMock).not.toHaveBeenCalled();
+      });
+    });
+
+    describe('getAgentLock', () => {
+      it('reports another member as the holder', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue('other-user');
+
+        const caller = agentRouter.createCaller(wsCtx());
+        const result = await caller.getAgentLock({ agentId: 'agent-1' });
+
+        expect(result).toEqual({ expiresAt: null, holderId: 'other-user', lockedByOther: true });
+      });
+    });
+
+    describe('releaseAgentLock', () => {
+      it('broadcasts unlocked only when it actually freed the lock', async () => {
+        vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
+
+        const caller = agentRouter.createCaller(wsCtx());
+        await caller.releaseAgentLock({ agentId: 'agent-1' });
+
+        expect(publishResourceEventMock).toHaveBeenCalledWith(
+          { id: 'agent-1', type: 'agent' },
+          expect.objectContaining({ data: { holderId: null }, type: 'lock.changed' }),
+        );
+      });
+
+      it('does NOT broadcast when the lease expired / was taken over', async () => {
+        vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(false);
+
+        const caller = agentRouter.createCaller(wsCtx());
+        await caller.releaseAgentLock({ agentId: 'agent-1' });
+
+        expect(publishResourceEventMock).not.toHaveBeenCalled();
+      });
+    });
+  });
 });
@@ -7,9 +7,15 @@ import * as ChatGroupModelModule from '@/database/models/chatGroup';
 import * as UserModelModule from '@/database/models/user';
 import * as AgentGroupRepoModule from '@/database/repositories/agentGroup';
 import * as ChatGroupServiceModule from '@/server/services/agentGroup';
+import { EditLockService } from '@/server/services/editLock';
+import { publishResourceEvent } from '@/server/services/resourceEvents';

 import { agentGroupRouter } from '../agentGroup';

+vi.mock('@/server/services/resourceEvents', () => ({ publishResourceEvent: vi.fn() }));
+
+const publishResourceEventMock = vi.mocked(publishResourceEvent);
+
 describe('agentGroupRouter', () => {
  const userId = 'testUserId';
  let mockCtx: any;
@@ -439,4 +445,126 @@ describe('agentGroupRouter', () => {
      expect(result).toEqual(mockUpdatedGroup);
    });
  });
+
+  describe('edit lock', () => {
+    const wsCtx = () => ({ serverDB: {}, userId, workspaceId: 'ws-1' });
+
+    describe('updateGroup write guard', () => {
+      it('rejects the update when another member holds the lock', async () => {
+        vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue('other-user');
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+
+        await expect(
+          caller.updateGroup({ id: 'group-1', value: { title: 'New' } }),
+        ).rejects.toMatchObject({ code: 'CONFLICT' });
+        expect(chatGroupModelMock.update).not.toHaveBeenCalled();
+      });
+
+      it('allows the update when no other member holds the lock', async () => {
+        vi.spyOn(EditLockService.prototype, 'getBlockingHolder').mockResolvedValue(null);
+        chatGroupModelMock.update.mockResolvedValue({ id: 'group-1' });
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        await caller.updateGroup({ id: 'group-1', value: { title: 'New' } });
+
+        expect(chatGroupModelMock.update).toHaveBeenCalled();
+      });
+
+      it('does not check the lock for personal (non-workspace) groups', async () => {
+        const guardSpy = vi.spyOn(EditLockService.prototype, 'getBlockingHolder');
+        chatGroupModelMock.update.mockResolvedValue({ id: 'group-1' });
+
+        const caller = agentGroupRouter.createCaller(mockCtx);
+        await caller.updateGroup({ id: 'group-1', value: { title: 'New' } });
+
+        expect(guardSpy).not.toHaveBeenCalled();
+        expect(chatGroupModelMock.update).toHaveBeenCalled();
+      });
+    });
+
+    describe('acquireGroupLock', () => {
+      it('returns unlocked without touching the lock service for personal groups', async () => {
+        const acquireSpy = vi.spyOn(EditLockService.prototype, 'acquire');
+
+        const caller = agentGroupRouter.createCaller(mockCtx);
+        const result = await caller.acquireGroupLock({ id: 'group-1' });
+
+        expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
+        expect(acquireSpy).not.toHaveBeenCalled();
+      });
+
+      it('broadcasts lock.changed on a holder edge (first claim)', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(undefined);
+        vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
+          expiresAt: new Date(),
+          holderId: userId,
+          lockedByOther: false,
+        });
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        await caller.acquireGroupLock({ id: 'group-1' });
+
+        expect(publishResourceEventMock).toHaveBeenCalledWith(
+          { id: 'group-1', type: 'chatGroup' },
+          expect.objectContaining({ data: { holderId: userId }, type: 'lock.changed' }),
+        );
+      });
+
+      it('does NOT broadcast on a steady-state heartbeat (same holder)', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue(userId);
+        vi.spyOn(EditLockService.prototype, 'acquire').mockResolvedValue({
+          expiresAt: new Date(),
+          holderId: userId,
+          lockedByOther: false,
+        });
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        await caller.acquireGroupLock({ id: 'group-1' });
+
+        expect(publishResourceEventMock).not.toHaveBeenCalled();
+      });
+    });
+
+    describe('getGroupLock', () => {
+      it('reports another member as the holder', async () => {
+        vi.spyOn(EditLockService.prototype, 'getActiveHolder').mockResolvedValue('other-user');
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        const result = await caller.getGroupLock({ id: 'group-1' });
+
+        expect(result).toEqual({ expiresAt: null, holderId: 'other-user', lockedByOther: true });
+      });
+
+      it('returns unlocked for personal groups', async () => {
+        const caller = agentGroupRouter.createCaller(mockCtx);
+        const result = await caller.getGroupLock({ id: 'group-1' });
+
+        expect(result).toEqual({ expiresAt: null, holderId: null, lockedByOther: false });
+      });
+    });
+
+    describe('releaseGroupLock', () => {
+      it('broadcasts unlocked only when it actually freed the lock', async () => {
+        vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(true);
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        await caller.releaseGroupLock({ id: 'group-1' });
+
+        expect(publishResourceEventMock).toHaveBeenCalledWith(
+          { id: 'group-1', type: 'chatGroup' },
+          expect.objectContaining({ data: { holderId: null }, type: 'lock.changed' }),
+        );
+      });
+
+      it('does NOT broadcast when the lease expired / was taken over', async () => {
+        vi.spyOn(EditLockService.prototype, 'release').mockResolvedValue(false);
+
+        const caller = agentGroupRouter.createCaller(wsCtx());
+        await caller.releaseGroupLock({ id: 'group-1' });
+
+        expect(publishResourceEventMock).not.toHaveBeenCalled();
+      });
+    });
+  });
 });
@@ -119,7 +119,7 @@ describe('aiChatRouter', () => {
    expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledTimes(1);
    expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledWith(
      expect.any(Object),
-      expect.objectContaining({ touchTopicUpdatedAt: false }),
+      expect.not.objectContaining({ touchTopicUpdatedAt: expect.anything() }),
    );

    expect(mockGet).toHaveBeenCalledWith(
@@ -161,7 +161,7 @@ describe('aiChatRouter', () => {
    expect(mockCreateMessage).toHaveBeenCalled();
    expect(mockCreateUserAndAssistantMessages).toHaveBeenCalledWith(
      expect.any(Object),
-      expect.objectContaining({ touchTopicUpdatedAt: true }),
+      expect.not.objectContaining({ touchTopicUpdatedAt: expect.anything() }),
    );
    expect(mockGet).toHaveBeenCalledWith(
      expect.objectContaining({
@@ -6,7 +6,7 @@ import { eq } from 'drizzle-orm';
 import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';

 import { topicRouter } from '../../topic';
-import { cleanupTestUser, createTestContext, createTestUser } from './setup';
+import { cleanupTestUser, createTestAgent, createTestContext, createTestUser } from './setup';

 // We need to mock getServerDB to return our test database instance
 let testDB: LobeChatDatabase;
@@ -332,31 +332,79 @@ describe('Topic Router Integration Tests', () => {
    });
  });

-  // BM25 search requires pg_search extension (ParadeDB), not available in integration test DB
+  // BM25 search requires pg_search extension (ParadeDB), not available in the
+  // default integration test DB (PGlite). Run with TEST_SERVER_DB=1 +
+  // DATABASE_TEST_URL pointing at a ParadeDB instance to exercise these.
  describe.skip('searchTopics', () => {
    it('should search topics using agentId', async () => {
      const caller = topicRouter.createCaller(createTestContext(userId));

-      // Create test topics
-      await caller.createTopic({
-        title: 'TypeScript Discussion',
-        sessionId: testSessionId,
-      });
+      // Topics are agent-native: stored with agentId directly.
+      await serverDB.insert(topics).values([
+        { agentId: testAgentId, title: 'TypeScript Discussion', userId },
+        { agentId: testAgentId, title: 'JavaScript Basics', userId },
+      ]);

-      await caller.createTopic({
-        title: 'JavaScript Basics',
-        sessionId: testSessionId,
-      });
-
-      // Search using agentId
      const result = await caller.searchTopics({
-        keywords: 'TypeScript',
        agentId: testAgentId,
+        keywords: 'TypeScript',
      });

      expect(result.length).toBeGreaterThan(0);
      expect(result[0].title).toContain('TypeScript');
    });
+
+    // Regression for the "No topics match these filters" bug: topics created by
+    // the new agent system carry `agentId` directly with a NULL `sessionId`.
+    // The old search resolved agentId -> sessionId and filtered by the
+    // container only, so these rows were never matched even though the topics
+    // list (which filters by agentId) showed them.
+    it('should find agentId-scoped topics that have no sessionId', async () => {
+      const caller = topicRouter.createCaller(createTestContext(userId));
+
+      // Insert a topic the way the agent runtime does: agentId set, sessionId null.
+      await serverDB.insert(topics).values({
+        agentId: testAgentId,
+        sessionId: null,
+        title: 'rinabrown84@gmail.com',
+        userId,
+      });
+
+      const result = await caller.searchTopics({
+        agentId: testAgentId,
+        keywords: 'rinabrown84@gmail.com',
+      });
+
+      expect(result.length).toBeGreaterThan(0);
+      expect(result[0].title).toBe('rinabrown84@gmail.com');
+    });
+
+    // The agent scope mirrors the topics list exactly (agentId only). A row that
+    // shares this agent's resolved session but is owned by a DIFFERENT agent
+    // must not leak in — the bug the constrained-session-fallback review flagged.
+    it('should not leak another agent topic that shares the session mapping', async () => {
+      const caller = topicRouter.createCaller(createTestContext(userId));
+
+      const otherAgentId = await createTestAgent(serverDB, userId);
+
+      await serverDB.insert(topics).values([
+        { agentId: testAgentId, title: 'mine rinabrown84@gmail.com', userId },
+        // Same session, different agent — used to leak via the session fallback.
+        {
+          agentId: otherAgentId,
+          sessionId: testSessionId,
+          title: 'theirs rinabrown84@gmail.com',
+          userId,
+        },
+      ]);
+
+      const result = await caller.searchTopics({
+        agentId: testAgentId,
+        keywords: 'rinabrown84@gmail.com',
+      });
+
+      expect(result.map((t) => t.title)).toEqual(['mine rinabrown84@gmail.com']);
+    });
  });

  describe('updateTopic', () => {
@@ -719,7 +767,7 @@ describe('Topic Router Integration Tests', () => {
        sessionId: testSessionId,
      });

-      const allTopics = await caller.getAllTopics();
+      const allTopics = await caller.queryTopics();

      expect(allTopics).toHaveLength(2);
    });
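The scoping difference that the `searchTopics` regression tests above describe can be illustrated in memory. This is a hedged sketch only: `TopicRow`, `searchBySession`, and `searchByAgent` are hypothetical names, a substring match stands in for BM25, and the mapping of `agent-a` to `session-1` is assumed.

```typescript
// Hypothetical row shape; the real topics table has more columns.
interface TopicRow {
  agentId: string | null;
  sessionId: string | null;
  title: string;
}

const topicRows: TopicRow[] = [
  // Agent-native topic: agentId set, sessionId null.
  { agentId: 'agent-a', sessionId: null, title: 'mine alpha' },
  // Different agent that shares the session mapping.
  { agentId: 'agent-b', sessionId: 'session-1', title: 'theirs alpha' },
];

// Old behaviour as described in the test comments: filter by the resolved
// session only, which misses agent-native rows and can match another agent's.
const searchBySession = (sessionId: string, keyword: string): TopicRow[] =>
  topicRows.filter((row) => row.sessionId === sessionId && row.title.includes(keyword));

// Agent scope: filter by agentId only, mirroring the topics list.
const searchByAgent = (agentId: string, keyword: string): TopicRow[] =>
  topicRows.filter((row) => row.agentId === agentId && row.title.includes(keyword));

// Assume 'agent-a' resolves to 'session-1' in the legacy lookup.
const legacyTitles = searchBySession('session-1', 'alpha').map((row) => row.title);
const scopedTitles = searchByAgent('agent-a', 'alpha').map((row) => row.title);
```

With these rows, the session-only filter returns the other agent's topic and misses the agent-native one, while the agent-scoped filter returns only the caller's topic.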
@@ -4,12 +4,15 @@ import { pushTokenRouter } from '@/server/routers/lambda/pushToken';

 const mockUpsert = vi.fn();
 const mockUnregister = vi.fn();
+const mockDeleteByExpoTokenAndDevice = vi.fn();

 vi.mock('@/database/models/pushToken', () => ({
  PushTokenModel: vi.fn(() => ({
    unregister: mockUnregister,
    upsert: mockUpsert,
  })),
+  deletePushTokenByExpoTokenAndDevice: (...args: unknown[]) =>
+    mockDeleteByExpoTokenAndDevice(...args),
 }));

 const createCaller = (ctxOverrides: Partial<any> = {}) => {
@@ -91,18 +94,90 @@ describe('pushTokenRouter', () => {
  });

  describe('unregister', () => {
-    it('should call model.unregister with deviceId', async () => {
+    it('should delete by (expoToken, deviceId) when expoToken is provided', async () => {
+      mockDeleteByExpoTokenAndDevice.mockResolvedValueOnce(undefined);
+      const caller = createCaller();
+
+      const result = await caller.unregister({
+        deviceId: 'device-1',
+        expoToken: 'ExponentPushToken[abc]',
+      });
+
+      expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalledWith(expect.anything(), {
+        deviceId: 'device-1',
+        expoToken: 'ExponentPushToken[abc]',
+      });
+      expect(result).toEqual({ success: true });
+      // Legacy (userId, deviceId) path must not fire when expoToken is present
+      expect(mockUnregister).not.toHaveBeenCalled();
+    });
+
+    it('should fall back to (userId, deviceId) for legacy clients with a session', async () => {
+      // Path B — v1.0.7 only sends deviceId; if the request still carries a
+      // valid session we MUST delete the row, otherwise PushChannel keeps
+      // notifying a signed-out device (Expo DeviceNotRegistered only fires on
+      // uninstall, not logout).
      mockUnregister.mockResolvedValueOnce(undefined);
      const caller = createCaller();

-      await caller.unregister({ deviceId: 'device-1' });
+      const result = await caller.unregister({ deviceId: 'device-1' });

      expect(mockUnregister).toHaveBeenCalledWith('device-1');
+      expect(mockDeleteByExpoTokenAndDevice).not.toHaveBeenCalled();
+      expect(result).toEqual({ success: true });
+    });
+
+    it('should silently succeed without expoToken AND without session', async () => {
+      // Path C — v1.0.7 + dead session: the only safe move is silent OK.
+      // Orphan row will be cleaned up by the process-push-receipts worker via
+      // Expo DeviceNotRegistered receipts. Returning 200 here stops the storm.
+      const caller = createCaller({ userId: undefined });
+
+      const result = await caller.unregister({ deviceId: 'device-1' });
+
+      expect(mockDeleteByExpoTokenAndDevice).not.toHaveBeenCalled();
+      expect(mockUnregister).not.toHaveBeenCalled();
+      expect(result).toEqual({ success: true });
+    });
+
+    it('should succeed for an unauthenticated caller carrying expoToken', async () => {
+      // New clients (>=1.0.8) hit Path A regardless of session.
+      const caller = createCaller({ userId: undefined });
+
+      const result = await caller.unregister({
+        deviceId: 'device-1',
+        expoToken: 'ExponentPushToken[abc]',
+      });
+
+      expect(result).toEqual({ success: true });
+      expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalled();
+      expect(mockUnregister).not.toHaveBeenCalled();
+    });
+
+    it('should prefer expoToken precision over the legacy userId fallback', async () => {
+      // If both are available, always take Path A — the (expoToken, deviceId)
+      // pair is more precise and doesn't risk deleting a wrong row.
+      const caller = createCaller();
+
+      await caller.unregister({
+        deviceId: 'device-1',
+        expoToken: 'ExponentPushToken[abc]',
+      });
+
+      expect(mockDeleteByExpoTokenAndDevice).toHaveBeenCalled();
+      expect(mockUnregister).not.toHaveBeenCalled();
    });

    it('should reject empty deviceId', async () => {
      const caller = createCaller();
      await expect(caller.unregister({ deviceId: '' })).rejects.toThrow();
    });
+
+    it('should reject empty expoToken when provided', async () => {
+      const caller = createCaller();
+      await expect(
+        caller.unregister({ deviceId: 'device-1', expoToken: '' }),
+      ).rejects.toThrow();
+    });
  });
 });
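The three unregister paths exercised by the tests above can be summarized as a small decision function. This is a sketch under assumptions, not the router's implementation: `UnregisterRequest` and `resolveUnregisterPath` are hypothetical names, and input validation (for example rejecting an empty `expoToken`) is assumed to have happened earlier.

```typescript
// Path A: delete by (expoToken, deviceId); Path B: legacy (userId, deviceId);
// Path C: no token and no session, so succeed silently without deleting.
type UnregisterPath = 'A' | 'B' | 'C';

// Hypothetical request shape; userId is present only when the session is valid.
interface UnregisterRequest {
  deviceId: string;
  expoToken?: string;
  userId?: string;
}

function resolveUnregisterPath(request: UnregisterRequest): UnregisterPath {
  // expoToken always wins: the (expoToken, deviceId) pair is the most precise key.
  if (request.expoToken !== undefined) return 'A';
  // Legacy clients send only deviceId; a live session still allows a delete.
  if (request.userId !== undefined) return 'B';
  // Nothing safe to delete by; the orphan row is left for later cleanup.
  return 'C';
}

const paths = [
  resolveUnregisterPath({ deviceId: 'device-1', expoToken: 'ExponentPushToken[abc]' }),
  resolveUnregisterPath({
    deviceId: 'device-1',
    expoToken: 'ExponentPushToken[abc]',
    userId: 'user-1',
  }),
  resolveUnregisterPath({ deviceId: 'device-1', userId: 'user-1' }),
  resolveUnregisterPath({ deviceId: 'device-1' }),
];
```

Checking `expoToken` first is what makes Path A win when both a token and a session are available.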
@@ -36,6 +36,7 @@ export const compareDocumentHistoryItemsInputSchema = z.object({
 });

 export const updateDocumentInputSchema = z.object({
+  breakAutosaveWindow: z.boolean().optional(),
  content: z.string().optional(),
  editorData: z.string().optional(),
  fileType: z.string().optional(),
@@ -58,6 +59,7 @@ export interface DocumentHistoryListItem {
  isCurrent: boolean;
  savedAt: string;
  saveSource: DocumentHistorySaveSource;
+  userId: string;
 }

 export interface ListHistoryOutput {
@@ -123,6 +125,7 @@ export interface CompareHistoryItemsInput {
 }

 export interface UpdateDocumentInput {
+  breakAutosaveWindow?: boolean;
  content?: string;
  editorData?: string;
  fileType?: string;
@@ -17,6 +17,8 @@ import { workspaceMembers } from '@/database/schemas';
 import { router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';
 import { AgentService } from '@/server/services/agent';
+import { EditLockService } from '@/server/services/editLock';
+import { publishResourceEvent } from '@/server/services/resourceEvents';
 import { TransferErrorCode } from '@/types/transferError';

 const agentProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
@@ -28,6 +30,7 @@ const agentProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) =>
      agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
      agentService: new AgentService(ctx.serverDB, ctx.userId, wsId),
      chatGroupModel: new ChatGroupModel(ctx.serverDB, ctx.userId, wsId),
+      editLockService: new EditLockService(ctx.userId),
      fileModel: new FileModel(ctx.serverDB, ctx.userId, wsId),
      knowledgeBaseModel: new KnowledgeBaseModel(ctx.serverDB, ctx.userId, wsId),
      sessionModel: new SessionModel(ctx.serverDB, ctx.userId, wsId),
@@ -440,6 +443,19 @@ export const agentRouter = router({
      }),
    )
    .mutation(async ({ input, ctx }) => {
+      // Collaborative edit lock: reject writes to a workspace agent another
+      // member is actively editing. Inert until a client acquires the lock.
+      if (ctx.workspaceId) {
+        const blockedBy = await ctx.editLockService.getBlockingHolder('agent', input.agentId);
+        if (blockedBy) {
+          throw new TRPCError({
+            cause: { data: { code: 'DocumentLocked' } },
+            code: 'CONFLICT',
+            message: 'Agent is being edited by another user',
+          });
+        }
+      }
+
      // Use AgentService to update and return the updated agent data
      return ctx.agentService.updateAgentConfig(input.agentId, input.value);
    }),
@@ -458,4 +474,48 @@ export const agentRouter = router({
    .mutation(async ({ input, ctx }) => {
      return ctx.agentModel.update(input.id, { pinned: input.pinned });
    }),
+
+  acquireAgentLock: agentProcedure
+    .use(withScopedPermission('agent:update'))
+    .input(z.object({ agentId: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+      const prev = await ctx.editLockService.getActiveHolder('agent', input.agentId);
+      const result = await ctx.editLockService.acquire('agent', input.agentId);
+      if ((result.holderId ?? null) !== (prev ?? null)) {
+        void publishResourceEvent(
+          { id: input.agentId, type: 'agent' },
+          { actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
+        );
+      }
+      return result;
+    }),
+
+  getAgentLock: agentProcedure
+    .use(withScopedPermission('agent:update'))
+    .input(z.object({ agentId: z.string() }))
+    .query(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+      const holder = await ctx.editLockService.getActiveHolder('agent', input.agentId);
+      return {
+        expiresAt: null,
+        holderId: holder ?? null,
+        lockedByOther: Boolean(holder) && holder !== ctx.userId,
+      };
+    }),
+
+  releaseAgentLock: agentProcedure
+    .use(withScopedPermission('agent:update'))
+    .input(z.object({ agentId: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return;
+      // Only broadcast "unlocked" when we actually released our own lock — if the
+      // lease expired and another member took over, the lock is still held.
+      const released = await ctx.editLockService.release('agent', input.agentId);
+      if (!released) return;
+      void publishResourceEvent(
+        { id: input.agentId, type: 'agent' },
+        { actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
+      );
+    }),
 });
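The broadcast rule used by the lock procedures above (publish `lock.changed` only when the holder actually changes, and on release only when the caller really held the lock) can be sketched with an in-memory stand-in. `InMemoryLock` and `isHolderEdge` are hypothetical; the real `EditLockService` is not shown in this diff, and lease expiry is deliberately left out.

```typescript
type HolderId = string | null | undefined;

// A holder edge is any change of holder; null and undefined both mean "unlocked".
const isHolderEdge = (prev: HolderId, next: HolderId): boolean =>
  (next ?? null) !== (prev ?? null);

// Hypothetical in-memory lock that records the holderId of every broadcast.
class InMemoryLock {
  private holder: string | null = null;
  readonly events: Array<string | null> = [];

  acquire(userId: string): { holderId: string | null; lockedByOther: boolean } {
    const prev = this.holder;
    if (this.holder === null) this.holder = userId;
    // First claim broadcasts; a heartbeat by the same holder does not.
    if (isHolderEdge(prev, this.holder)) this.events.push(this.holder);
    return { holderId: this.holder, lockedByOther: this.holder !== userId };
  }

  release(userId: string): boolean {
    // Releasing a lock held by someone else frees nothing and stays silent.
    if (this.holder !== userId) return false;
    this.holder = null;
    this.events.push(null);
    return true;
  }
}

const lock = new InMemoryLock();
lock.acquire('alice'); // first claim: broadcast
lock.acquire('alice'); // heartbeat: no broadcast
const bobAttempt = lock.acquire('bob'); // blocked by alice: no broadcast
const bobReleased = lock.release('bob'); // not the holder: no broadcast
const aliceReleased = lock.release('alice'); // real release: broadcast
```

Broadcasting only on edges keeps periodic heartbeats from flooding subscribers with identical events.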
@@ -14,6 +14,8 @@ import { type ChatGroupConfig } from '@/database/types/chatGroup';
 import { router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';
 import { AgentGroupService } from '@/server/services/agentGroup';
+import { EditLockService } from '@/server/services/editLock';
+import { publishResourceEvent } from '@/server/services/resourceEvents';
 import { TransferErrorCode } from '@/types/transferError';

 /**
@@ -55,6 +57,7 @@ const agentGroupProcedure = wsCompatProcedure.use(serverDatabase).use(async (opt
      agentGroupService: new AgentGroupService(ctx.serverDB, ctx.userId, wsId),
      agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
      chatGroupModel: new ChatGroupModel(ctx.serverDB, ctx.userId, wsId),
+      editLockService: new EditLockService(ctx.userId),
      userModel: new UserModel(ctx.serverDB, ctx.userId),
    },
  });
@@ -402,6 +405,19 @@ export const agentGroupRouter = router({
      }),
    )
    .mutation(async ({ input, ctx }) => {
+      // Collaborative edit lock: reject writes to a workspace group another
+      // member is actively editing. Inert until a client acquires the lock.
+      if (ctx.workspaceId) {
+        const blockedBy = await ctx.editLockService.getBlockingHolder('chatGroup', input.id);
+        if (blockedBy) {
+          throw new TRPCError({
+            cause: { data: { code: 'DocumentLocked' } },
+            code: 'CONFLICT',
+            message: 'Group is being edited by another user',
+          });
+        }
+      }
+
      return ctx.chatGroupModel.update(input.id, {
        ...input.value,
        config: ctx.agentGroupService.normalizeGroupConfig(
@@ -409,6 +425,47 @@ export const agentGroupRouter = router({
        ),
      });
    }),
+
+  acquireGroupLock: agentGroupProcedureWrite
+    .input(z.object({ id: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+      const prev = await ctx.editLockService.getActiveHolder('chatGroup', input.id);
+      const result = await ctx.editLockService.acquire('chatGroup', input.id);
+      if ((result.holderId ?? null) !== (prev ?? null)) {
+        void publishResourceEvent(
+          { id: input.id, type: 'chatGroup' },
+          { actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
+        );
+      }
+      return result;
+    }),
+
+  getGroupLock: agentGroupProcedureWrite
+    .input(z.object({ id: z.string() }))
+    .query(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+      const holder = await ctx.editLockService.getActiveHolder('chatGroup', input.id);
+      return {
+        expiresAt: null,
+        holderId: holder ?? null,
+        lockedByOther: Boolean(holder) && holder !== ctx.userId,
+      };
+    }),
+
+  releaseGroupLock: agentGroupProcedureWrite
+    .input(z.object({ id: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      if (!ctx.workspaceId) return;
+      // Only broadcast "unlocked" when we actually released our own lock — if the
+      // lease expired and another member took over, the lock is still held.
+      const released = await ctx.editLockService.release('chatGroup', input.id);
+      if (!released) return;
+      void publishResourceEvent(
+        { id: input.id, type: 'chatGroup' },
+        { actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
+      );
+    }),
 });

 export type AgentGroupRouter = typeof agentGroupRouter;
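
Review note: the lock procedures above rely on `EditLockService`, whose implementation is not part of this hunk. The following is a minimal sketch of the lease semantics the call sites appear to assume (`getActiveHolder`, `getBlockingHolder`, `acquire`, `release`). The class name `SketchEditLock`, the in-memory `Map` store, the injected clock, and the 30-second TTL are all assumptions for illustration. The real service is async and may behave differently.

```typescript
// Hypothetical stand-in for a lease-based edit lock. Not the EditLockService
// from this change: store, clock, and TTL are assumptions.
type Lease = { expiresAt: number; holderId: string };
type LockResult = { expiresAt: number | null; holderId: string | null; lockedByOther: boolean };

class SketchEditLock {
  private store: Map<string, Lease>;
  private userId: string;
  private now: () => number;
  private ttlMs: number;

  constructor(store: Map<string, Lease>, userId: string, now: () => number, ttlMs = 30_000) {
    this.store = store;
    this.userId = userId;
    this.now = now;
    this.ttlMs = ttlMs;
  }

  // Holder of an unexpired lease, or null when the resource is free.
  getActiveHolder(type: string, id: string): string | null {
    const lease = this.store.get(`${type}:${id}`);
    return lease && lease.expiresAt > this.now() ? lease.holderId : null;
  }

  // Holder only when it is someone other than the caller.
  getBlockingHolder(type: string, id: string): string | null {
    const holder = this.getActiveHolder(type, id);
    return holder !== null && holder !== this.userId ? holder : null;
  }

  // Take or renew the lease unless another user currently holds it.
  acquire(type: string, id: string): LockResult {
    const key = `${type}:${id}`;
    const current = this.store.get(key);
    if (current && current.expiresAt > this.now() && current.holderId !== this.userId) {
      return { expiresAt: current.expiresAt, holderId: current.holderId, lockedByOther: true };
    }
    const expiresAt = this.now() + this.ttlMs;
    this.store.set(key, { expiresAt, holderId: this.userId });
    return { expiresAt, holderId: this.userId, lockedByOther: false };
  }

  // True only when the caller's own active lease was removed.
  release(type: string, id: string): boolean {
    if (this.getActiveHolder(type, id) !== this.userId) return false;
    this.store.delete(`${type}:${id}`);
    return true;
  }
}
```

Under these assumptions, `release` returning `false` after an expired lease is what lets `releaseGroupLock` skip the `lock.changed` broadcast when another member has already taken over.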
@@ -85,6 +85,7 @@ export const agentSignalRouter = router({
      return enqueueAgentSignalSourceEvent(sourceEvent, {
        agentId: input.agentId,
        userId: ctx.userId,
+        workspaceId: ctx.workspaceId ?? undefined,
      });
    }),
  listReceipts: agentSignalProcedure
@@ -370,7 +370,6 @@ export const aiChatRouter = router({
            { assistantMessage, userMessage },
            {
              ...(modelTiming ? { timing: modelTiming } : {}),
-              touchTopicUpdatedAt: !isCreateNewTopic,
            },
          );
        },
@@ -0,0 +1,256 @@
+import { type ToolManifest } from '@lobechat/types';
+import { TRPCError } from '@trpc/server';
+import { z } from 'zod';
+
+import { getServerComposioAuthConfigId } from '@/config/composio';
+import { PluginModel } from '@/database/models/plugin';
+import { getComposioClient } from '@/libs/composio';
+import { authedProcedure, router } from '@/libs/trpc/lambda';
+import { serverDatabase } from '@/libs/trpc/lambda/middleware';
+
+const composioProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
+  const client = getComposioClient();
+  const pluginModel = new PluginModel(opts.ctx.serverDB, opts.ctx.userId);
+
+  return opts.next({
+    ctx: { ...opts.ctx, composioClient: client, pluginModel },
+  });
+});
+
+export const composioRouter = router({
+  createConnection: composioProcedure
+    .input(
+      z.object({
+        appSlug: z.string(),
+        identifier: z.string(),
+        label: z.string(),
+      }),
+    )
+    .mutation(async ({ input, ctx }) => {
+      const { appSlug, identifier, label } = input;
+      const { userId } = ctx;
+
+      const callbackUrl = `${process.env.APP_URL || process.env.NEXTAUTH_URL || ''}/api/composio/oauth/callback`;
+
+      // Prefer a pre-configured auth config (e.g. a custom/white-label config
+      // created in the Composio dashboard), pinned per toolkit via env. Falls
+      // back to discovering an existing config for this toolkit, and finally to
+      // auto-creating a Composio-managed one.
+      let authConfigId = getServerComposioAuthConfigId(identifier);
+      if (!authConfigId) {
+        const authConfigs = await (ctx.composioClient.authConfigs as any).list();
+        let authConfig = authConfigs?.items?.find(
+          (c: any) => c.toolkit?.slug?.toLowerCase() === appSlug.toLowerCase(),
+        );
+        if (!authConfig) {
+          authConfig = await (ctx.composioClient.authConfigs as any).create(appSlug, {
+            name: appSlug,
+            type: 'use_composio_managed_auth',
+          });
+        }
+        authConfigId = authConfig.id;
+      }
+
+      if (!authConfigId) {
+        throw new TRPCError({
+          code: 'INTERNAL_SERVER_ERROR',
+          message: `Failed to resolve a Composio auth config for "${appSlug}".`,
+        });
+      }
+
+      // Composio-managed OAuth auth configs no longer support `initiate`; use
+      // `link` (POST /api/v3/connected_accounts/link) to get the redirect URL.
+      const connReq = await (ctx.composioClient.connectedAccounts as any).link(
+        userId,
+        authConfigId,
+        { callbackUrl },
+      );
+
+      let rawTools: any[] = [];
+      try {
+        const toolsResp = await (ctx.composioClient.tools as any).getRawComposioTools({
+          toolkits: [appSlug],
+        });
+        rawTools = toolsResp?.items || toolsResp || [];
+      } catch {
+        // tools may not be available before auth
+      }
+
+      const manifest: ToolManifest = {
+        api: Array.isArray(rawTools)
+          ? rawTools.map((tool: any) => ({
+              description: tool.description || '',
+              name: tool.slug || tool.name || '',
+              parameters: tool.inputParameters ||
+                tool.inputSchema || {
+                  properties: {},
+                  type: 'object',
+                },
+            }))
+          : [],
+        identifier,
+        meta: {
+          avatar: '🔌',
+          description: `Composio: ${label}`,
+          title: label,
+        },
+        type: 'default',
+      };
+
+      await ctx.pluginModel.create({
+        customParams: {
+          composio: {
+            appSlug,
+            authConfigId,
+            connectedAccountId: connReq.id,
+            redirectUrl: connReq.redirectUrl,
+            status: 'PENDING',
+          },
+        },
+        identifier,
+        manifest,
+        source: 'composio',
+        type: 'plugin',
+      });
+
+      return {
+        authConfigId,
+        connectedAccountId: connReq.id,
+        identifier,
+        redirectUrl: connReq.redirectUrl,
+      };
+    }),
+
+  deleteConnection: composioProcedure
+    .input(
+      z.object({
+        connectedAccountId: z.string(),
+        identifier: z.string(),
+      }),
+    )
+    .mutation(async ({ input, ctx }) => {
+      try {
+        await (ctx.composioClient.connectedAccounts as any).delete(input.connectedAccountId);
+      } catch (error) {
+        console.warn('[Composio] Failed to delete remote connection:', error);
+      }
+
+      await ctx.pluginModel.delete(input.identifier);
+
+      return { success: true };
+    }),
+
+  getComposioPlugins: composioProcedure.query(async ({ ctx }) => {
+    const allPlugins = await ctx.pluginModel.query();
+    return allPlugins.filter((plugin) => plugin.customParams?.composio);
+  }),
+
+  getConnection: composioProcedure
+    .input(
+      z.object({
+        connectedAccountId: z.string(),
+      }),
+    )
+    .query(async ({ input, ctx }) => {
+      try {
+        const account = await (ctx.composioClient.connectedAccounts as any).get(
+          input.connectedAccountId,
+        );
+        return {
+          appSlug: account?.toolkit?.slug || '',
+          connectedAccountId: input.connectedAccountId,
+          error: undefined as 'AUTH_ERROR' | undefined,
+          status: (account?.status || 'PENDING') as string,
+        };
+      } catch (error) {
+        const errorMessage = error instanceof Error ? error.message : String(error);
+        const isAuthError = errorMessage.includes('401') || errorMessage.includes('Unauthorized');
+
+        if (isAuthError) {
+          return {
+            appSlug: '',
+            connectedAccountId: input.connectedAccountId,
+            error: 'AUTH_ERROR' as const,
+            status: 'FAILED',
+          };
+        }
+        throw error;
+      }
+    }),
+
+  removeComposioPlugin: composioProcedure
+    .input(z.object({ identifier: z.string() }))
+    .mutation(async ({ input, ctx }) => {
+      await ctx.pluginModel.delete(input.identifier);
+      return { success: true };
+    }),
+
+  updateComposioPlugin: composioProcedure
+    .input(
+      z.object({
+        appSlug: z.string(),
+        authConfigId: z.string(),
+        connectedAccountId: z.string(),
+        identifier: z.string(),
+        label: z.string(),
+        redirectUrl: z.string().optional(),
+        status: z.string(),
+        tools: z.array(
+          z.object({
+            description: z.string().optional(),
+            inputSchema: z.any().optional(),
+            name: z.string(),
+          }),
+        ),
+      }),
+    )
+    .mutation(async ({ input, ctx }) => {
+      const {
+        identifier,
+        label,
+        appSlug,
+        authConfigId,
+        connectedAccountId,
+        tools,
+        status,
+        redirectUrl,
+      } = input;
+
+      const existingPlugin = await ctx.pluginModel.findById(identifier);
+
+      const manifest: ToolManifest = {
+        api: tools.map((tool) => ({
+          description: tool.description || '',
+          name: tool.name,
+          parameters: tool.inputSchema || { properties: {}, type: 'object' },
+        })),
+        identifier,
+        meta: existingPlugin?.manifest?.meta || {
+          avatar: '🔌',
+          description: `Composio: ${label}`,
+          title: label,
+        },
+        type: 'default',
+      };
+
+      const customParams = {
+        composio: { appSlug, authConfigId, connectedAccountId, redirectUrl, status },
+      };
+
+      if (existingPlugin) {
+        await ctx.pluginModel.update(identifier, { customParams, manifest });
+      } else {
+        await ctx.pluginModel.create({
+          customParams,
+          identifier,
+          manifest,
+          source: 'composio',
+          type: 'plugin',
+        });
+      }
+
+      return { savedCount: tools.length };
+    }),
+});
+
+export type ComposioRouter = typeof composioRouter;
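
Review note: `createConnection` resolves the auth config in three steps (pinned via env, then discovered by toolkit slug, then auto-created). The sketch below isolates that ordering so it can be tested without the Composio SDK. The `AuthConfigDeps` interface and `resolveAuthConfigId` are hypothetical names introduced here, and the three callbacks stand in for `getServerComposioAuthConfigId`, `authConfigs.list()` and `authConfigs.create()`. It is an illustration of the fallback order, not the SDK's API.

```typescript
// Hypothetical shapes: only the fields the fallback logic reads.
type AuthConfig = { id: string; toolkit?: { slug?: string } };

interface AuthConfigDeps {
  create: (appSlug: string) => Promise<AuthConfig>; // stand-in for auto-creating a managed config
  list: () => Promise<AuthConfig[]>; // stand-in for listing existing configs
  pinned: (identifier: string) => string | undefined; // stand-in for the env lookup
}

async function resolveAuthConfigId(
  identifier: string,
  appSlug: string,
  deps: AuthConfigDeps,
): Promise<string> {
  // 1. A pinned config wins and avoids any remote call.
  const pinnedId = deps.pinned(identifier);
  if (pinnedId) return pinnedId;

  // 2. Reuse an existing config for the same toolkit (case-insensitive slug match).
  const configs = await deps.list();
  const match = configs.find((c) => c.toolkit?.slug?.toLowerCase() === appSlug.toLowerCase());
  if (match) return match.id;

  // 3. Fall back to creating a new config.
  const created = await deps.create(appSlug);
  return created.id;
}
```

Keeping the pinned lookup first means a deployment with a custom config never triggers the list or create calls for that toolkit.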
@@ -268,9 +268,14 @@ export const connectorRouter = router({
      await ctx.connectorModel.update(input.id, {
        ...patch,
        // undefined → leave untouched; null → clear; object → encrypt the JSON string.
+        // When credentials are cleared, also drop the cached expiry timestamp so
+        // token-refresh logic doesn't act on a stale value for the new server.
        ...(credentials === undefined
          ? {}
-          : { credentials: credentials ? JSON.stringify(credentials) : null }),
+          : {
+              credentials: credentials ? JSON.stringify(credentials) : null,
+              ...(credentials === null ? { tokenExpiresAt: null } : {}),
+            }),
      } as any);
    }),
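
Review note: the tri-state handling of `credentials` in this hunk (undefined leaves the column untouched, null clears it along with `tokenExpiresAt`, an object is serialized) can be read as a small pure function. `buildCredentialsPatch` and its types are hypothetical names used only for this sketch, and encryption of the JSON string is assumed to happen elsewhere.

```typescript
type Credentials = Record<string, unknown>;
type CredentialsPatch = { credentials?: string | null; tokenExpiresAt?: null };

// Hypothetical helper mirroring the spread logic in the hunk above.
function buildCredentialsPatch(credentials: Credentials | null | undefined): CredentialsPatch {
  // undefined: caller did not mention credentials, so emit no keys at all.
  if (credentials === undefined) return {};
  // null: clear the credentials and the cached expiry together.
  if (credentials === null) return { credentials: null, tokenExpiresAt: null };
  // object: store the JSON string; the expiry is left untouched.
  return { credentials: JSON.stringify(credentials) };
}
```

The empty object in the first branch matters because spreading it into the update payload adds no keys, so the existing column values survive.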

@@ -358,7 +363,7 @@ export const connectorRouter = router({
    }),

  /**
-   * Sync tools from a client-provided list (for Lobehub OAuth skills, Klavis, etc.
+   * Sync tools from a client-provided list (for Lobehub OAuth skills, Composio, etc.
   * that already have their tool list available on the client side).
   * Idempotent — safe to call whenever the detail panel opens.
   */
@@ -163,6 +163,50 @@ export const deviceRouter = router({
      }),
    ),

+  /**
+   * Rename a branch in a directory on a remote device, via the device's
+   * `renameGitBranch` RPC.
+   */
+  renameGitBranch: deviceProcedure
+    .input(
+      z.object({
+        deviceId: z.string(),
+        from: z.string(),
+        path: z.string(),
+        to: z.string(),
+      }),
+    )
+    .mutation(async ({ ctx, input }) =>
+      deviceGateway.renameGitBranch({
+        deviceId: input.deviceId,
+        from: input.from,
+        path: input.path,
+        to: input.to,
+        userId: ctx.userId,
+      }),
+    ),
+
+  /**
+   * Delete a branch in a directory on a remote device, via the device's
+   * `deleteGitBranch` RPC.
+   */
+  deleteGitBranch: deviceProcedure
+    .input(
+      z.object({
+        branch: z.string(),
+        deviceId: z.string(),
+        path: z.string(),
+      }),
+    )
+    .mutation(async ({ ctx, input }) =>
+      deviceGateway.deleteGitBranch({
+        branch: input.branch,
+        deviceId: input.deviceId,
+        path: input.path,
+        userId: ctx.userId,
+      }),
+    ),
+
  /**
   * Pull (`--ff-only`) the current branch of a directory on a remote device, via
   * the device's `pullGitBranch` RPC.
@@ -275,9 +319,17 @@ export const deviceRouter = router({
   * receives render data, not a `localfile://` URL; saving remains unsupported.
   */
  getLocalFilePreview: deviceProcedure
-    .input(z.object({ deviceId: z.string(), path: z.string(), workingDirectory: z.string() }))
+    .input(
+      z.object({
+        accept: z.enum(['image']).optional(),
+        deviceId: z.string(),
+        path: z.string(),
+        workingDirectory: z.string(),
+      }),
+    )
    .query(async ({ ctx, input }) =>
      deviceGateway.getLocalFilePreview({
+        accept: input.accept,
        deviceId: input.deviceId,
        path: input.path,
        userId: ctx.userId,
@@ -253,6 +253,27 @@ export const documentRouter = router({
      return ctx.documentService.queryDocuments(input);
    }),

+  acquireDocumentLock: documentProcedure
+    .use(withScopedPermission('document:update'))
+    .input(z.object({ id: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      return ctx.documentService.acquireDocumentLock(input.id);
+    }),
+
+  getDocumentLock: documentProcedure
+    .use(withScopedPermission('document:update'))
+    .input(z.object({ id: z.string() }))
+    .query(async ({ ctx, input }) => {
+      return ctx.documentService.getDocumentLock(input.id);
+    }),
+
+  releaseDocumentLock: documentProcedure
+    .use(withScopedPermission('document:update'))
+    .input(z.object({ id: z.string() }))
+    .mutation(async ({ ctx, input }) => {
+      await ctx.documentService.releaseDocumentLock(input.id);
+    }),
+
  updateDocument: documentProcedure
    .use(withScopedPermission('document:update'))
    .input(updateDocumentInputSchema)
@@ -37,6 +37,7 @@ import { briefRouter } from './brief';
 import { changelogRouter } from './changelog';
 import { chunkRouter } from './chunk';
 import { comfyuiRouter } from './comfyui';
+import { composioRouter } from './composio';
 import { configRouter } from './config';
 import { connectorRouter } from './connector';
 import { deviceRouter } from './device';
@@ -50,7 +51,6 @@ import { generationTopicRouter } from './generationTopic';
 import { homeRouter } from './home';
 import { imageRouter } from './image';
 import { importerRouter } from './importer';
-import { klavisRouter } from './klavis';
 import { knowledgeRouter } from './knowledge';
 import { knowledgeBaseRouter } from './knowledgeBase';
 import { llmGenerationTracingRouter } from './llmGenerationTracing';
@@ -115,7 +115,8 @@ export const lambdaRouter = router({
  home: homeRouter,
  image: imageRouter,
  importer: importerRouter,
-  klavis: klavisRouter,
+  composio: composioRouter,
+
  knowledge: knowledgeRouter,
  knowledgeBase: knowledgeBaseRouter,
  llmGenerationTracing: llmGenerationTracingRouter,
@@ -1,284 +0,0 @@
-import { type ToolManifest } from '@lobechat/types';
-import { z } from 'zod';
-
-import { withScopedPermission } from '@/business/server/trpc-middlewares/rbacPermission';
-import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
-import { PluginModel } from '@/database/models/plugin';
-import { getKlavisClient } from '@/libs/klavis';
-import { router } from '@/libs/trpc/lambda';
-import { serverDatabase } from '@/libs/trpc/lambda/middleware';
-
-/**
- * Klavis procedure with API key validation and database access
- */
-const klavisProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
-  const client = getKlavisClient();
-  const wsId = opts.ctx.workspaceId ?? undefined;
-  const pluginModel = new PluginModel(opts.ctx.serverDB, opts.ctx.userId, wsId);
-
-  return opts.next({
-    ctx: { ...opts.ctx, klavisClient: client, pluginModel },
-  });
-});
-
-export const klavisRouter = router({
-  /**
-   * Create a single MCP server instance and save to database
-   * Returns: { serverUrl, instanceId, oauthUrl?, identifier, serverName }
-   */
-  createServerInstance: klavisProcedure
-    .use(withScopedPermission('agent:update'))
-    .input(
-      z.object({
-        /** Identifier for storage (e.g., 'google-calendar') */
-        identifier: z.string(),
-        /** Server name for Klavis API (e.g., 'Google Calendar') */
-        serverName: z.string(),
-        userId: z.string(),
-      }),
-    )
-    .mutation(async ({ input, ctx }) => {
-      const { serverName, userId, identifier } = input;
-
-      // Create a single server instance
-      const response = await ctx.klavisClient.mcpServer.createServerInstance({
-        serverName: serverName as any,
-        userId,
-      });
-
-      const { serverUrl, instanceId, oauthUrl } = response;
-
-      // Get the tool list for this server
-      const toolsResponse = await ctx.klavisClient.mcpServer.getTools(serverName as any);
-      const tools = toolsResponse.tools || [];
-
-      // Save to database using the provided identifier (format: lowercase, spaces replaced with hyphens)
-      const manifest: ToolManifest = {
-        api: tools.map((tool: any) => ({
-          description: tool.description || '',
-          name: tool.name,
-          parameters: tool.inputSchema || { properties: {}, type: 'object' },
-        })),
-        identifier,
-        meta: {
-          avatar: '🔌',
-          description: `LobeHub Mcp Server: ${serverName}`,
-          title: serverName,
-        },
-        type: 'default',
-      };
-
-      // Save to database with oauthUrl and isAuthenticated status
-      const isAuthenticated = !oauthUrl; // If there's no oauthUrl, authentication is not required or already authenticated
-      await ctx.pluginModel.create({
-        customParams: {
-          klavis: {
-            instanceId,
-            isAuthenticated,
-            oauthUrl,
-            serverName,
-            serverUrl,
-          },
-        },
-        identifier,
-        manifest,
-        source: 'klavis',
-        type: 'plugin',
-      });
-
-      return {
-        identifier,
-        instanceId,
-        isAuthenticated,
-        oauthUrl,
-        serverName,
-        serverUrl,
-      };
-    }),
-
-  /**
-   * Delete a server instance
-   */
-  deleteServerInstance: klavisProcedure
-    .use(withScopedPermission('agent:update'))
-    .input(
-      z.object({
-        /** Identifier for storage (e.g., 'google-calendar') */
-        identifier: z.string(),
-        instanceId: z.string(),
-      }),
-    )
-    .mutation(async ({ input, ctx }) => {
-      // Call Klavis API to delete server instance
-      await ctx.klavisClient.mcpServer.deleteServerInstance(input.instanceId);
-
-      // Delete from database (using identifier)
-      await ctx.pluginModel.delete(input.identifier);
-
-      return { success: true };
-    }),
-
-  /**
-   * Get Klavis plugins from database
-   */
-  getKlavisPlugins: klavisProcedure.query(async ({ ctx }) => {
-    const allPlugins = await ctx.pluginModel.query();
-    // Filter plugins that have klavis customParams
-    return allPlugins.filter((plugin) => plugin.customParams?.klavis);
-  }),
-
-  /**
-   * Get server instance status from Klavis API
-   * Returns error object instead of throwing on auth errors (useful for polling)
-   */
-  getServerInstance: klavisProcedure
-    .input(
-      z.object({
-        instanceId: z.string(),
-      }),
-    )
-    .query(async ({ input, ctx }) => {
-      try {
-        const response = await ctx.klavisClient.mcpServer.getServerInstance(input.instanceId);
-        return {
-          authNeeded: response.authNeeded,
-          error: undefined,
-          externalUserId: response.externalUserId,
-          instanceId: response.instanceId,
-          isAuthenticated: response.isAuthenticated,
-          oauthUrl: response.oauthUrl,
-          platform: response.platform,
-          serverName: response.serverName,
-        };
-      } catch (error) {
-        // Check if this is an authentication error
-        const errorMessage = error instanceof Error ? error.message : String(error);
-        const isAuthError =
-          errorMessage.includes('Invalid API key or instance ID') ||
-          errorMessage.includes('Status code: 401');
-
-        // For auth errors, return error object instead of throwing
-        // This prevents 500 errors in logs during polling
-        if (isAuthError) {
-          return {
-            authNeeded: true,
-            error: 'AUTH_ERROR',
-            externalUserId: undefined,
-            instanceId: input.instanceId,
-            isAuthenticated: false,
-            oauthUrl: undefined,
-            platform: undefined,
-            serverName: undefined,
-          };
-        }
-
-        // For other errors, still throw
-        throw error;
-      }
-    }),
-
-  getUserIntergrations: klavisProcedure
-    .input(
-      z.object({
-        userId: z.string(),
-      }),
-    )
-    .query(async ({ input, ctx }) => {
-      const response = await ctx.klavisClient.user.getUserIntegrations(input.userId);
-
-      return {
-        integrations: response.integrations,
-      };
-    }),
-
-  /**
-   * Remove Klavis plugin from database by identifier
-   */
-  removeKlavisPlugin: klavisProcedure
-    .use(withScopedPermission('agent:update'))
-    .input(
-      z.object({
-        /** Identifier for storage (e.g., 'google-calendar') */
-        identifier: z.string(),
-      }),
-    )
-    .mutation(async ({ input, ctx }) => {
-      await ctx.pluginModel.delete(input.identifier);
-      return { success: true };
-    }),
-
-  /**
-   * Update Klavis plugin with tools and auth status in database
-   */
-  updateKlavisPlugin: klavisProcedure
-    .use(withScopedPermission('agent:update'))
-    .input(
-      z.object({
-        /** Identifier for storage (e.g., 'google-calendar') */
-        identifier: z.string(),
-        instanceId: z.string(),
-        isAuthenticated: z.boolean(),
-        oauthUrl: z.string().optional(),
-        /** Server name for Klavis API (e.g., 'Google Calendar') */
-        serverName: z.string(),
-        serverUrl: z.string(),
-        tools: z.array(
-          z.object({
-            description: z.string().optional(),
-            inputSchema: z.any().optional(),
-            name: z.string(),
-          }),
-        ),
-      }),
-    )
-    .mutation(async ({ input, ctx }) => {
-      const { identifier, serverName, serverUrl, instanceId, tools, isAuthenticated, oauthUrl } =
-        input;
-
-      // Get existing plugin (using identifier)
-      const existingPlugin = await ctx.pluginModel.findById(identifier);
-
-      // Build manifest containing all tools
-      const manifest: ToolManifest = {
-        api: tools.map((tool) => ({
-          description: tool.description || '',
-          name: tool.name,
-          parameters: tool.inputSchema || { properties: {}, type: 'object' },
-        })),
-        identifier,
-        meta: existingPlugin?.manifest?.meta || {
-          avatar: '🔌',
-          description: `LobeHub Mcp Server: ${serverName}`,
-          title: serverName,
-        },
-        type: 'default',
-      };
-
-      const customParams = {
-        klavis: {
-          instanceId,
-          isAuthenticated,
-          oauthUrl,
-          serverName,
-          serverUrl,
-        },
-      };
-
-      // Update or create plugin
-      if (existingPlugin) {
-        await ctx.pluginModel.update(identifier, { customParams, manifest });
-      } else {
-        await ctx.pluginModel.create({
-          customParams,
-          identifier,
-          manifest,
-          source: 'klavis',
-          type: 'plugin',
-        });
-      }
-
-      return { savedCount: tools.length };
-    }),
-});
-
-export type KlavisRouter = typeof klavisRouter;
@@ -53,7 +53,7 @@ export const oauthDeviceFlowRouter = router({
      );

      if (!providerDetail?.keyVaults) {
-        return { isAuthenticated: false };
+        return { status: 'PENDING' };
      }

      const keyVaults = providerDetail.keyVaults as Record<string, any>;
@@ -63,12 +63,12 @@ export const oauthDeviceFlowRouter = router({
        return {
          avatarUrl: keyVaults.githubAvatarUrl as string | undefined,
          expiresAt: keyVaults.oauthTokenExpiresAt || keyVaults.bearerTokenExpiresAt,
-          isAuthenticated: true,
+          status: 'ACTIVE',
          username: keyVaults.githubUsername as string | undefined,
        };
      }

-      return { isAuthenticated: false };
+      return { status: 'PENDING' };
    }),

  /**
@@ -1,10 +1,13 @@
 import { z } from 'zod';

-import { PushTokenModel } from '@/database/models/pushToken';
-import { authedProcedure, router } from '@/libs/trpc/lambda';
+import {
+  deletePushTokenByExpoTokenAndDevice,
+  PushTokenModel,
+} from '@/database/models/pushToken';
+import { authedProcedure, publicProcedure, router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';

-const pushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
+const authedPushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
  const { ctx } = opts;

  return opts.next({
@@ -13,7 +16,7 @@ const pushTokenProcedure = authedProcedure.use(serverDatabase).use(async (opts)
 });

 export const pushTokenRouter = router({
-  register: pushTokenProcedure
+  register: authedPushTokenProcedure
    .input(
      z.object({
        appVersion: z.string().optional(),
@@ -27,10 +30,53 @@ export const pushTokenRouter = router({
      return ctx.pushTokenModel.upsert(input);
    }),

-  unregister: pushTokenProcedure
-    .input(z.object({ deviceId: z.string().min(1) }))
+  /**
+   * Public on purpose: clients call this during sign-out, and in the wild many
+   * of those calls arrive after the session is already gone (expired OIDC
+   * token / cleared cookie). Authenticating by session here causes a 401
+   * storm on every such logout.
+   *
+   * Authorization model (Path A — new clients ≥ 1.0.8): the caller presents the
+   * (deviceId, expoToken) pair it received at registration. Holding both = proof
+   * of ownership of the row, same trust model as APNs/FCM unregister.
+   *
+   * Backwards compat for v1.0.7 (only sends `deviceId`):
+   *  - Path B — when the request still carries a valid session, fall back to
+   *    the original (userId, deviceId) delete. This covers the *active*
+   *    sign-out path so PushChannel doesn't keep notifying a signed-out device
+   *    until the user uninstalls (Expo's DeviceNotRegistered receipt only
+   *    fires on uninstall, not on logout).
+   *  - Path C — when there's no session either, silently succeed. The orphan
+   *    row will be cleaned up by the existing `process-push-receipts` worker
+   *    via Expo's DeviceNotRegistered receipts. Returning 200 here is what
+   *    actually stops the 401 storm in production.
+   */
+  unregister: publicProcedure
+    .use(serverDatabase)
+    .input(
+      z.object({
+        deviceId: z.string().min(1),
+        expoToken: z.string().min(1).optional(),
+      }),
+    )
    .mutation(async ({ ctx, input }) => {
-      return ctx.pushTokenModel.unregister(input.deviceId);
+      const { deviceId, expoToken } = input;
+
+      // Path A: new clients — precise delete by (expoToken, deviceId), no session needed
+      if (expoToken) {
+        await deletePushTokenByExpoTokenAndDevice(ctx.serverDB, { deviceId, expoToken });
+        return { success: true };
+      }
+
+      // Path B: legacy v1.0.7 + valid session — fall back to (userId, deviceId)
+      if (ctx.userId) {
+        const pushTokenModel = new PushTokenModel(ctx.serverDB, ctx.userId);
+        await pushTokenModel.unregister(deviceId);
+        return { success: true };
+      }
+
+      // Path C: legacy v1.0.7 with no session — silent OK, cron worker cleans up
+      return { success: true };
    }),
 });
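
Review note: the three unregister paths described in the comment above reduce to a short decision function. The sketch below shows only the branch selection, with `chooseUnregisterPath` and the path labels as hypothetical names. The database deletes performed by the real mutation are deliberately left out.

```typescript
type UnregisterInput = { deviceId: string; expoToken?: string };
type UnregisterPath = 'token-pair' | 'session' | 'noop';

// Hypothetical helper: picks which unregister path applies to a request.
function chooseUnregisterPath(
  input: UnregisterInput,
  sessionUserId?: string | null,
): UnregisterPath {
  // Path A: new clients send expoToken, so the (deviceId, expoToken) pair is enough.
  if (input.expoToken) return 'token-pair';
  // Path B: legacy client that still has a session, so delete by (userId, deviceId).
  if (sessionUserId) return 'session';
  // Path C: legacy client with no session, so succeed silently and let the worker clean up.
  return 'noop';
}
```

Note that Path A takes priority even when a session is present, which matches the order of the checks in the mutation.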

@@ -14,6 +14,8 @@ import { TopicModel } from '@/database/models/topic';
 import { workspaceMembers } from '@/database/schemas';
 import { router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';
+import { EditLockService } from '@/server/services/editLock';
+import { publishResourceEvent } from '@/server/services/resourceEvents';
 import { TaskService } from '@/server/services/task';
 import { TaskLifecycleService } from '@/server/services/taskLifecycle';
 import { TaskRunnerService } from '@/server/services/taskRunner';
@@ -26,6 +28,7 @@ const taskProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) =>
    ctx: {
      agentModel: new AgentModel(ctx.serverDB, ctx.userId, wsId),
      briefModel: new BriefModel(ctx.serverDB, ctx.userId, wsId),
+      editLockService: new EditLockService(ctx.userId),
      taskLifecycle: new TaskLifecycleService(ctx.serverDB, ctx.userId, wsId),
      taskModel: new TaskModel(ctx.serverDB, ctx.userId, wsId),
      taskService: new TaskService(ctx.serverDB, ctx.userId, wsId),
@@ -927,6 +930,20 @@ export const taskRouter = router({
      const model = ctx.taskModel;
      await assertAssigneeAgentBelongsToUser(ctx.agentModel, data.assigneeAgentId);
      const resolved = await resolveOrThrow(model, id);
+
+      // Collaborative edit lock: reject writes to a workspace task another member
+      // is actively editing. Inert until a client acquires the lock.
+      if (ctx.workspaceId) {
+        const blockedBy = await ctx.editLockService.getBlockingHolder('task', resolved.id);
+        if (blockedBy) {
+          throw new TRPCError({
+            cause: { data: { code: 'DocumentLocked' } },
+            code: 'CONFLICT',
+            message: 'Task is being edited by another user',
+          });
+        }
+      }
+
      const resolvedParentTaskId =
        parentTaskId === undefined
          ? undefined
@@ -947,6 +964,44 @@ export const taskRouter = router({
    }
  }),

+  acquireTaskLock: taskProcedureWrite.input(idInput).mutation(async ({ ctx, input }) => {
+    if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+    const resolved = await resolveOrThrow(ctx.taskModel, input.id);
+    const prev = await ctx.editLockService.getActiveHolder('task', resolved.id);
+    const result = await ctx.editLockService.acquire('task', resolved.id);
+    if ((result.holderId ?? null) !== (prev ?? null)) {
+      void publishResourceEvent(
+        { id: resolved.id, type: 'task' },
+        { actorId: ctx.userId, data: { holderId: result.holderId }, type: 'lock.changed' },
+      );
+    }
+    return result;
+  }),
+
+  getTaskLock: taskProcedureWrite.input(idInput).query(async ({ ctx, input }) => {
+    if (!ctx.workspaceId) return { expiresAt: null, holderId: null, lockedByOther: false };
+    const resolved = await resolveOrThrow(ctx.taskModel, input.id);
+    const holder = await ctx.editLockService.getActiveHolder('task', resolved.id);
+    return {
+      expiresAt: null,
+      holderId: holder ?? null,
+      lockedByOther: Boolean(holder) && holder !== ctx.userId,
+    };
+  }),
+
+  releaseTaskLock: taskProcedureWrite.input(idInput).mutation(async ({ ctx, input }) => {
+    if (!ctx.workspaceId) return;
+    const resolved = await resolveOrThrow(ctx.taskModel, input.id);
+    // Only broadcast "unlocked" when we actually released our own lock — if the
+    // lease expired and another member took over, the lock is still held.
+    const released = await ctx.editLockService.release('task', resolved.id);
+    if (!released) return;
+    void publishResourceEvent(
+      { id: resolved.id, type: 'task' },
+      { actorId: ctx.userId, data: { holderId: null }, type: 'lock.changed' },
+    );
+  }),
+
  updateConfig: taskProcedureWrite
    .input(idInput.merge(z.object({ config: z.record(z.unknown()) })))
    .mutation(async ({ input, ctx }) => {
@@ -162,6 +162,18 @@ export const topicRouter = router({
      return ctx.topicModel.batchDeleteBySessionId(resolved.sessionId);
    }),

+  batchMoveTopics: topicProcedure
+    .use(withScopedPermission('topic:update'))
+    .input(
+      z.object({
+        targetAgentId: z.string(),
+        topicIds: z.array(z.string()),
+      }),
+    )
+    .mutation(async ({ input, ctx }) => {
+      return ctx.topicModel.batchMoveToAgent(input.topicIds, input.targetAgentId);
+    }),
+
  cloneTopic: topicProcedure
    .use(withScopedPermission('topic:create'))
    .input(z.object({ id: z.string(), newTitle: z.string().optional() }))
@@ -239,9 +251,18 @@ export const topicRouter = router({
      return ctx.topicShareModel.create(input.topicId, input.visibility);
    }),

-  getAllTopics: topicProcedure.query(async ({ ctx }) => {
-    return ctx.topicModel.queryAll();
-  }),
+  queryTopics: topicProcedure
+    .input(
+      z
+        .object({
+          pageSize: z.number().max(500).optional(),
+          statuses: z.array(z.string()).optional(),
+        })
+        .optional(),
+    )
+    .query(async ({ input, ctx }) => {
+      return ctx.topicModel.queryTopics({ pageSize: input?.pageSize, statuses: input?.statuses });
+    }),

  getShareInfo: topicProcedure
    .input(z.object({ topicId: z.string() }))
@@ -570,7 +591,17 @@ export const topicRouter = router({
        ctx.workspaceId ?? undefined,
      );

-      return ctx.topicModel.queryByKeyword(input.keywords, resolved.sessionId);
+      // Scope the search exactly like the topics list (`query`): by agentId
+      // directly (the new agent system stamps every topic with an agentId).
+      // Passing only the resolved sessionId used to miss every agentId-scoped
+      // topic — the cause of "no topics match" in the per-agent Topics search.
+      // `containerId` is only the fallback for legacy callers that pass no
+      // agentId/groupId.
+      return ctx.topicModel.queryByKeyword(input.keywords, {
+        agentId: input.agentId,
+        containerId: resolved.sessionId,
+        groupId: input.groupId,
+      });
    }),

  /**
@@ -1,11 +1,12 @@
 import { z } from 'zod';

+import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
 import { AgentOperationModel } from '@/database/models/agentOperation';
 import { LlmGenerationTracingModel } from '@/database/models/llmGenerationTracing';
 import { VerifyCheckResultModel } from '@/database/models/verifyCheckResult';
 import { VerifyCriterionModel } from '@/database/models/verifyCriterion';
 import { VerifyRubricModel } from '@/database/models/verifyRubric';
-import { authedProcedure, router } from '@/libs/trpc/lambda';
+import { router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';
 import {
  VerifyExecutorService,
@@ -35,18 +36,19 @@ const checkItemSchema = z.object({
  verifierType: verifierTypeSchema,
 });

-const verifyProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
+const verifyProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
  const { ctx } = opts;
+  const workspaceId = ctx.workspaceId ?? undefined;
  return opts.next({
    ctx: {
-      criterionModel: new VerifyCriterionModel(ctx.serverDB, ctx.userId),
-      executorService: new VerifyExecutorService(ctx.serverDB, ctx.userId),
-      tracingModel: new LlmGenerationTracingModel(ctx.serverDB, ctx.userId),
-      feedbackService: new VerifyFeedbackService(ctx.serverDB, ctx.userId),
-      operationModel: new AgentOperationModel(ctx.serverDB, ctx.userId),
-      planGenerator: new VerifyPlanGeneratorService(ctx.serverDB, ctx.userId),
-      resultModel: new VerifyCheckResultModel(ctx.serverDB, ctx.userId),
-      rubricModel: new VerifyRubricModel(ctx.serverDB, ctx.userId),
+      criterionModel: new VerifyCriterionModel(ctx.serverDB, ctx.userId, workspaceId),
+      executorService: new VerifyExecutorService(ctx.serverDB, ctx.userId, workspaceId),
+      tracingModel: new LlmGenerationTracingModel(ctx.serverDB, ctx.userId, workspaceId),
+      feedbackService: new VerifyFeedbackService(ctx.serverDB, ctx.userId, workspaceId),
+      operationModel: new AgentOperationModel(ctx.serverDB, ctx.userId, workspaceId),
+      planGenerator: new VerifyPlanGeneratorService(ctx.serverDB, ctx.userId, workspaceId),
+      resultModel: new VerifyCheckResultModel(ctx.serverDB, ctx.userId, workspaceId),
+      rubricModel: new VerifyRubricModel(ctx.serverDB, ctx.userId, workspaceId),
    },
  });
 });
@@ -3,8 +3,7 @@ import { z } from 'zod';
 import { withScopedPermission } from '@/business/server/trpc-middlewares/rbacPermission';
 import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
 import { TopicModel } from '@/database/models/topic';
-import { getServerDB } from '@/database/server';
-import { publicProcedure, router } from '@/libs/trpc/lambda';
+import { router } from '@/libs/trpc/lambda';
 import { serverDatabase } from '@/libs/trpc/lambda/middleware';
 import { type BatchTaskResult } from '@/types/service';

@@ -95,12 +94,7 @@ export const topicRouter = router({
      return data.id;
    }),

-  getAllTopics: topicProcedure.query(async ({ ctx }) => {
-    return ctx.topicModel.queryAll();
-  }),
-
-  // TODO: this procedure should be used with authedProcedure
-  getTopics: publicProcedure
+  getTopics: topicProcedure
    .input(
      z.object({
        containerId: z.string().nullable().optional(),
@@ -109,12 +103,7 @@ export const topicRouter = router({
      }),
    )
    .query(async ({ input, ctx }) => {
-      if (!ctx.userId) return [];
-
-      const serverDB = await getServerDB();
-      const topicModel = new TopicModel(serverDB, ctx.userId, ctx.workspaceId ?? undefined);
-
-      return topicModel.query(input);
+      return ctx.topicModel.query(input);
    }),

  hasTopics: topicProcedure.query(async ({ ctx }) => {
@@ -0,0 +1,115 @@
+import { TRPCError } from '@trpc/server';
+import { z } from 'zod';
+
+import { PluginModel } from '@/database/models/plugin';
+import { getComposioClient } from '@/libs/composio';
+import { authedProcedure, publicProcedure, router } from '@/libs/trpc/lambda';
+import { serverDatabase } from '@/libs/trpc/lambda/middleware';
+import { MCPService } from '@/server/services/mcp';
+
+const composioProcedure = authedProcedure.use(serverDatabase).use(async (opts) => {
+  const composioClient = getComposioClient();
+  const pluginModel = new PluginModel(opts.ctx.serverDB, opts.ctx.userId);
+  return opts.next({ ctx: { ...opts.ctx, composioClient, pluginModel } });
+});
+
+export const composioToolsRouter = router({
+  executeAction: composioProcedure
+    .input(
+      z.object({
+        identifier: z.string(),
+        toolArgs: z.record(z.unknown()).optional(),
+        toolSlug: z.string(),
+      }),
+    )
+    .mutation(async ({ ctx, input }) => {
+      // Resolve the connected account server-side from the caller's own plugin
+      // record (PluginModel is user-scoped). Never trust a connectedAccountId
+      // supplied by the client — that would let a user drive another user's
+      // connection.
+      const plugin = await ctx.pluginModel.findById(input.identifier);
+      const connectedAccountId = plugin?.customParams?.composio?.connectedAccountId;
+
+      if (!connectedAccountId) {
+        throw new TRPCError({
+          code: 'NOT_FOUND',
+          message: `No Composio connection found for "${input.identifier}".`,
+        });
+      }
+
+      const result = await (ctx.composioClient.tools as any).execute(input.toolSlug, {
+        arguments: input.toolArgs || {},
+        connectedAccountId,
+        // Toolkit version resolves to "latest"; allow manual execution without a
+        // pinned version (Composio otherwise throws ComposioToolVersionRequiredError).
+        dangerouslySkipVersionCheck: true,
+        userId: ctx.userId,
+      });
+
+      if (!result) {
+        return {
+          content: 'Unknown error',
+          state: { content: [{ text: 'Unknown error', type: 'text' }], isError: true },
+          success: false,
+        };
+      }
+
+      const data = result as any;
+      const content = data?.data || data?.result || data;
+      const contentStr = typeof content === 'string' ? content : JSON.stringify(content);
+
+      return await MCPService.processToolCallResult({
+        content: [{ text: contentStr, type: 'text' }],
+        isError: false,
+      });
+    }),
+
+  getActions: publicProcedure.input(z.object({ appSlug: z.string() })).query(async ({ input }) => {
+    const client = getComposioClient();
+    const response = await (client.tools as any).getRawComposioTools({
+      toolkits: [input.appSlug],
+    });
+
+    const items = response?.items || response || [];
+    const tools = Array.isArray(items)
+      ? items.map((tool: any) => ({
+          description: tool.description || '',
+          inputSchema: tool.inputParameters ||
+            tool.inputSchema || {
+              properties: {},
+              type: 'object',
+            },
+          name: tool.slug || tool.name || '',
+        }))
+      : [];
+
+    return { tools };
+  }),
+
+  listActions: composioProcedure
+    .input(z.object({ appSlug: z.string() }))
+    .query(async ({ ctx, input }) => {
+      // Use getRawComposioTools (raw tool defs with slug/inputParameters), NOT
+      // tools.get() — the latter returns provider-wrapped (OpenAI-format) tools
+      // whose name/params live under `.function`, so slug/name/inputSchema come
+      // back empty and every tool collapses to the same `${identifier}____` name.
+      const response = await (ctx.composioClient.tools as any).getRawComposioTools({
+        toolkits: [input.appSlug],
+      });
+
+      const items = response?.items || response || [];
+      const tools = Array.isArray(items)
+        ? items.map((tool: any) => ({
+            description: tool.description || '',
+            inputSchema: tool.inputParameters ||
+              tool.inputSchema || {
+                properties: {},
+                type: 'object',
+              },
+            name: tool.slug || tool.name || '',
+          }))
+        : [];
+
+      return { tools };
+    }),
+});
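Review note: `getActions` and `listActions` in the new Composio router map the raw tool list with identical logic. The sketch below shows how that mapping might be pulled into one helper. `normalizeComposioTools` and `NormalizedTool` are hypothetical names that do not appear in the diff, and the response shape (`items` array or bare array) is assumed only from what the two procedures already handle.

```typescript
// Hypothetical shared helper; the names here are illustrative, not from the diff.
interface NormalizedTool {
  description: string;
  inputSchema: Record<string, unknown>;
  name: string;
}

// Accepts either `{ items: [...] }` or a bare array, as both procedures do.
const normalizeComposioTools = (response: any): NormalizedTool[] => {
  const items = response?.items || response || [];
  if (!Array.isArray(items)) return [];

  return items.map((tool: any) => ({
    description: tool.description || '',
    // Fall back to an empty object schema when the tool declares no parameters.
    inputSchema: tool.inputParameters || tool.inputSchema || { properties: {}, type: 'object' },
    name: tool.slug || tool.name || '',
  }));
};

const sampleTools = normalizeComposioTools({
  items: [{ description: 'Send a mail', slug: 'GMAIL_SEND' }, { name: 'fallback-name' }],
});
console.log(sampleTools.map((t) => t.name));
```

With a helper like this, both procedures could return `{ tools: normalizeComposioTools(response) }`, so a later change to the fallback schema would only need to be made once.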
@@ -1,13 +1,14 @@
 import { publicProcedure, router } from '@/libs/trpc/lambda';

-import { klavisRouter } from './klavis';
+import { composioToolsRouter } from './composio';
 import { marketRouter } from './market';
 import { mcpRouter } from './mcp';
 import { searchRouter } from './search';

 export const toolsRouter = router({
  healthcheck: publicProcedure.query(() => "i'm live!"),
-  klavis: klavisRouter,
+  composio: composioToolsRouter,
+
  market: marketRouter,
  mcp: mcpRouter,
  search: searchRouter,
@@ -1,141 +0,0 @@
-import { z } from 'zod';
-
-import { wsCompatProcedure } from '@/business/server/trpc-middlewares/workspaceAuth';
-import { ConnectorModel } from '@/database/models/connector';
-import { ConnectorToolModel } from '@/database/models/connectorTool';
-import { ConnectorToolPermission } from '@/database/schemas';
-import { getKlavisClient } from '@/libs/klavis';
-import { publicProcedure, router } from '@/libs/trpc/lambda';
-import { serverDatabase } from '@/libs/trpc/lambda/middleware';
-import { MCPService } from '@/server/services/mcp';
-
-/**
- * Klavis procedure with client initialized in context
- */
-const klavisProcedure = wsCompatProcedure.use(serverDatabase).use(async (opts) => {
-  const klavisClient = getKlavisClient();
-
-  return opts.next({
-    ctx: { ...opts.ctx, klavisClient },
-  });
-});
-
-/**
- * Klavis router for tools
- * Contains callTool and listTools which call external Klavis API
- */
-export const klavisRouter = router({
-  /**
-   * Call a tool on a Klavis Strata server
-   */
-  callTool: klavisProcedure
-    .input(
-      z.object({
-        /** Klavis server identifier (e.g. 'gmail', 'google-calendar') for precise permission lookup */
-        identifier: z.string().optional(),
-        serverUrl: z.string(),
-        toolArgs: z.record(z.unknown()).optional(),
-        toolName: z.string(),
-      }),
-    )
-    .mutation(async ({ ctx, input }) => {
-      // ── Connector tool permission gate ────────────────────────────────────
-      // Use identifier + toolName when available for a precise lookup (avoids
-      // same-name collisions across connectors). Falls back to toolName-only
-      // if identifier is absent (legacy callers).
-      if (ctx.userId && ctx.serverDB) {
-        const wsId = ctx.workspaceId ?? undefined;
-        const connectorToolModel = new ConnectorToolModel(ctx.serverDB, ctx.userId, wsId);
-        let connectorTool:
-          | Awaited<ReturnType<typeof connectorToolModel.findByToolName>>
-          | undefined;
-
-        if (input.identifier) {
-          const connectorModel = new ConnectorModel(ctx.serverDB, ctx.userId, wsId);
-          const [connector] = await connectorModel.queryByIdentifiers([input.identifier]);
-          if (connector) {
-            const tools = await connectorToolModel.queryByConnector(connector.id);
-            connectorTool = tools.find((t) => t.toolName === input.toolName);
-          }
-        } else {
-          connectorTool = await connectorToolModel.findByToolName(input.toolName);
-        }
-
-        if (connectorTool?.permission === ConnectorToolPermission.disabled) {
-          const message =
-            `The tool "${input.toolName}" has been disabled by the user and cannot be executed. ` +
-            `Please inform the user that this tool is currently disabled. ` +
-            `They can re-enable it in Settings > Connectors.`;
-          return {
-            content: message,
-            state: { content: [{ text: message, type: 'text' }], isError: false },
-            success: true,
-          };
-        }
-      }
-      // ── End permission gate ───────────────────────────────────────────────
-
-      const response = await ctx.klavisClient.mcpServer.callTools({
-        serverUrl: input.serverUrl,
-        toolArgs: input.toolArgs,
-        toolName: input.toolName,
-      });
-
-      // Handle error case
-      if (!response.success || !response.result) {
-        return {
-          content: response.error || 'Unknown error',
-          state: {
-            content: [{ text: response.error || 'Unknown error', type: 'text' }],
-            isError: true,
-          },
-          success: false,
-        };
-      }
-
-      // Process the response using the common MCP tool call result processor
-      const processedResult = await MCPService.processToolCallResult({
-        content: (response.result.content || []) as any[],
-        isError: response.result.isError,
-      });
-
-      return processedResult;
-    }),
-
-  /**
-   * Get tools by server name (public endpoint, no auth required)
-   */
-  getTools: publicProcedure
-    .input(
-      z.object({
-        serverName: z.string(),
-      }),
-    )
-    .query(async ({ input }) => {
-      const klavisClient = getKlavisClient();
-      const response = await klavisClient.mcpServer.getTools(input.serverName as any);
-
-      return {
-        tools: response.tools,
-      };
-    }),
-
-  /**
-   * List tools available on a Klavis Strata server
-   */
-  listTools: klavisProcedure
-    .input(
-      z.object({
-        serverUrl: z.string(),
-      }),
-    )
-    .query(async ({ ctx, input }) => {
-      const response = await ctx.klavisClient.mcpServer.listTools({
-        serverUrl: input.serverUrl,
-      });
-
-      return {
-        tools: response.tools,
-      };
-    }),
-});
@@ -231,6 +231,57 @@ describe('AgentService', () => {
      // Avatar should not be present for non-builtin agents
      expect((result as any)?.avatar).toBeUndefined();
    });
+
+    it('should NOT inherit the member personal default model for a workspace inbox', async () => {
+      // Workspace inbox is persisted with an empty model/provider.
+      const mockAgent = {
+        id: 'agent-1',
+        slug: 'inbox',
+      };
+      const serverDefaultConfig = { model: 'system-default-model', provider: 'system-provider' };
+
+      const mockAgentModel = {
+        getBuiltinAgent: vi.fn().mockResolvedValue(mockAgent),
+      };
+
+      (AgentModel as any).mockImplementation(() => mockAgentModel);
+      (parseAgentConfig as any).mockReturnValue(serverDefaultConfig);
+      // The member opening the workspace inbox has a personal default model.
+      mockUserModel.getUserSettingsDefaultAgentConfig.mockResolvedValueOnce({
+        config: { model: 'opus-4.6', provider: 'anthropic' },
+      });
+
+      const workspaceService = new AgentService(mockDb, mockUserId, mockWorkspaceId);
+      const result = await workspaceService.getBuiltinAgent('inbox');
+
+      // Should fall back to the system default, NOT the member's personal model.
+      expect(result?.model).toBe('system-default-model');
+      expect(result?.provider).toBe('system-provider');
+    });
+
+    it('should still apply the personal default model for a personal inbox', async () => {
+      const mockAgent = {
+        id: 'agent-1',
+        slug: 'inbox',
+      };
+
+      const mockAgentModel = {
+        getBuiltinAgent: vi.fn().mockResolvedValue(mockAgent),
+      };
+
+      (AgentModel as any).mockImplementation(() => mockAgentModel);
+      (parseAgentConfig as any).mockReturnValue({});
+      mockUserModel.getUserSettingsDefaultAgentConfig.mockResolvedValueOnce({
+        config: { model: 'user-preferred-model', provider: 'user-provider' },
+      });
+
+      // No workspaceId → personal scope keeps the personal default behavior.
+      const newService = new AgentService(mockDb, mockUserId);
+      const result = await newService.getBuiltinAgent('inbox');
+
+      expect(result?.model).toBe('user-preferred-model');
+      expect(result?.provider).toBe('user-provider');
+    });
  });

  describe('getAgentConfig', () => {
@@ -174,6 +174,13 @@ export class AgentService {
   * 2. serverDefaultAgentConfig - from environment variable
   * 3. userDefaultAgentConfig - from user settings (defaultAgent.config)
   * 4. agent - actual agent config from database
+   *
+   * Workspace exception: a workspace is a shared resource, so its agents must
+   * NOT inherit any individual member's *personal* default model. Otherwise a
+   * shared agent persisted with an empty model (e.g. the workspace inbox)
+   * resolves to whoever opens it — the creator's personal default leaks in and
+   * the workspace looks "initialized" with their model. For workspace-scoped
+   * reads we skip the user layer and fall back to the system default instead.
   */
  private mergeDefaultConfig(
    agent: any,
@@ -181,12 +188,17 @@ export class AgentService {
  ): LobeAgentConfig | null {
    if (!agent) return null;

-    const userDefaultAgentConfig =
-      (defaultAgentConfig as { config?: PartialDeep<LobeAgentConfig> })?.config || {};
-
-    // Merge configs in order: DEFAULT -> server -> user -> agent
+    // Merge configs in order: DEFAULT -> server -> [user] -> agent
    const serverDefaultAgentConfig = getServerDefaultAgentConfig();
    const baseConfig = merge(DEFAULT_AGENT_CONFIG, serverDefaultAgentConfig);
+
+    // Skip the personal default layer for workspace-scoped agents (see above).
+    if (this.workspaceId) {
+      return merge(baseConfig, cleanObject(agent));
+    }
+
+    const userDefaultAgentConfig =
+      (defaultAgentConfig as { config?: PartialDeep<LobeAgentConfig> })?.config || {};
    const withUserConfig = merge(baseConfig, userDefaultAgentConfig);

    return merge(withUserConfig, cleanObject(agent));
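Review note: the workspace exception in `mergeDefaultConfig` can be illustrated with a reduced model of the layering. This is a minimal sketch under assumptions: `mergeDefined` is a hypothetical shallow stand-in for the real deep `merge` plus `cleanObject`, and `resolveConfig` and `SketchAgentConfig` are illustrative names, not part of the diff.

```typescript
type SketchAgentConfig = { model?: string; provider?: string };

// Hypothetical stand-in for merge + cleanObject: later layers win, undefined is ignored.
const mergeDefined = (base: SketchAgentConfig, override: SketchAgentConfig): SketchAgentConfig => {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(override)) {
    if (value !== undefined) out[key] = value;
  }
  return out as SketchAgentConfig;
};

const resolveConfig = (layers: {
  agent: SketchAgentConfig;
  system: SketchAgentConfig;
  user: SketchAgentConfig;
  workspaceId?: string;
}): SketchAgentConfig => {
  // Workspace scope: skip the member's personal defaults entirely.
  if (layers.workspaceId) return mergeDefined(layers.system, layers.agent);
  // Personal scope: system -> user -> agent.
  return mergeDefined(mergeDefined(layers.system, layers.user), layers.agent);
};

const system = { model: 'system-default-model', provider: 'system-provider' };
const user = { model: 'user-preferred-model', provider: 'user-provider' };

const workspaceInbox = resolveConfig({ agent: {}, system, user, workspaceId: 'ws-1' });
const personalInbox = resolveConfig({ agent: {}, system, user });
console.log(workspaceInbox.model, personalInbox.model);
```

The two calls mirror the two new `AgentService` tests: an inbox persisted with an empty model resolves to the system default inside a workspace and to the member's personal default outside one.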
@@ -10,6 +10,7 @@ import type { LobeChatDatabase } from '@/database/type';
 import { AgentRuntimeCoordinator } from '@/server/modules/AgentRuntime/AgentRuntimeCoordinator';

 import { OperationTraceRecorder } from './OperationTraceRecorder';
+import { createDefaultSnapshotStore } from './snapshotStore';

 const log = debug('lobe-server:abandon-operation');

@@ -127,25 +128,3 @@ export class AbandonOperationService {
    return result;
  }
 }
-
-function createDefaultSnapshotStore(): ISnapshotStore | null {
-  if (process.env.ENABLE_AGENT_S3_TRACING === '1') {
-    try {
-      const { S3SnapshotStore } = require('@/server/modules/AgentTracing');
-      return new S3SnapshotStore();
-    } catch {
-      /* S3SnapshotStore not available */
-    }
-  }
-
-  if (process.env.NODE_ENV === 'development') {
-    try {
-      const { FileSnapshotStore } = require('@lobechat/agent-tracing');
-      return new FileSnapshotStore();
-    } catch {
-      /* agent-tracing not available */
-    }
-  }
-
-  return null;
-}
@@ -1696,7 +1696,7 @@ describe('AgentRuntimeService', () => {
      expect(casSpy).not.toHaveBeenCalled();
    });

-    it('arms a one-shot verify when the parent has not parked yet and scheduleVerifyOnHold is set', async () => {
+    it('arms the first verify (attempt 1, 15s) when the parent has not parked yet and scheduleVerifyOnHold is set', async () => {
      // Child completed before the parent's parking step persisted its state.
      mockCoordinator.loadAgentState.mockResolvedValue({
        pendingToolsCalling: [],
@@ -1712,14 +1712,15 @@ describe('AgentRuntimeService', () => {
      expect(won).toBe(false);
      expect(mockQueueService.scheduleMessage).toHaveBeenCalledWith(
        expect.objectContaining({
+          delay: 15_000,
          operationId: parentOpId,
-          payload: { verifyAsyncToolBarrier: true },
+          payload: { asyncToolVerifyAttempt: 1, verifyAsyncToolBarrier: true },
          stepIndex: 2,
        }),
      );
    });

-    it('arms a one-shot verify when the barrier is unsatisfied and scheduleVerifyOnHold is set', async () => {
+    it('arms a verify when the barrier is unsatisfied and scheduleVerifyOnHold is set', async () => {
      mockCoordinator.loadAgentState.mockResolvedValue({
        pendingToolsCalling: [{ id: 'tc1' }],
        status: 'waiting_for_async_tool',
@@ -1736,7 +1737,104 @@ describe('AgentRuntimeService', () => {

      expect(won).toBe(false);
      expect(mockQueueService.scheduleMessage).toHaveBeenCalledWith(
-        expect.objectContaining({ payload: { verifyAsyncToolBarrier: true } }),
+        expect.objectContaining({
+          payload: { asyncToolVerifyAttempt: 1, verifyAsyncToolBarrier: true },
+        }),
+      );
+    });
+
+    it('re-arms the next verify with exponential backoff while the barrier holds', async () => {
+      mockCoordinator.loadAgentState.mockResolvedValue({
+        pendingToolsCalling: [{ id: 'tc1' }],
+        status: 'waiting_for_async_tool',
+        stepCount: 1,
+      });
+      (service as any).serverDB.query = {
+        messagePlugins: { findFirst: vi.fn().mockResolvedValue(null) },
+      };
+
+      // A verify handler running as attempt 2 re-arms attempt 3 (60s).
+      await service.tryResumeParentFromAsyncTool(
+        { parentOperationId: parentOpId },
+        { scheduleVerifyOnHold: true, verifyAttempt: 3 },
+      );
+
+      expect(mockQueueService.scheduleMessage).toHaveBeenCalledWith(
+        expect.objectContaining({
+          delay: 60_000,
+          payload: { asyncToolVerifyAttempt: 3, verifyAsyncToolBarrier: true },
+        }),
+      );
+    });
+
+    it('stops re-arming once the bounded attempts are exhausted', async () => {
+      mockCoordinator.loadAgentState.mockResolvedValue({
+        pendingToolsCalling: [{ id: 'tc1' }],
+        status: 'waiting_for_async_tool',
+        stepCount: 1,
+      });
+      (service as any).serverDB.query = {
+        messagePlugins: { findFirst: vi.fn().mockResolvedValue(null) },
+      };
+
+      const won = await service.tryResumeParentFromAsyncTool(
+        { parentOperationId: parentOpId },
+        { scheduleVerifyOnHold: true, verifyAttempt: 6 },
+      );
+
+      expect(won).toBe(false);
+      expect(mockQueueService.scheduleMessage).not.toHaveBeenCalled();
+    });
+
+    it('trusts a just-backfilled message id without re-reading it (read-your-writes)', async () => {
+      mockCoordinator.loadAgentState.mockResolvedValue({
+        pendingToolsCalling: [{ id: 'tc1' }],
+        status: 'waiting_for_async_tool',
+        stepCount: 3,
+      });
+      // Plugin row exists (created at park) but its state still reads stale.
+      const findById = vi.fn().mockResolvedValue({ content: '' });
+      (service as any).serverDB.query = {
+        messagePlugins: {
+          findFirst: vi.fn().mockResolvedValue({ id: 'msg-tc1', state: null, toolCallId: 'tc1' }),
+        },
+      };
+      (service as any).messageModel.findById = findById;
+      const casSpy = vi
+        .spyOn(AgentOperationModel.prototype, 'tryResumeFromAsyncTool')
+        .mockResolvedValue(true);
+
+      const won = await service.tryResumeParentFromAsyncTool(
+        { parentOperationId: parentOpId },
+        { knownFulfilledMessageId: 'msg-tc1' },
+      );
+
+      expect(won).toBe(true);
+      expect(casSpy).toHaveBeenCalledWith(parentOpId);
+      // The stale read must be skipped — barrier trusted the local backfill.
+      expect(findById).not.toHaveBeenCalled();
+    });
+
+    it('arms a fallback verify when a parked op has no pending tools', async () => {
+      mockCoordinator.loadAgentState.mockResolvedValue({
+        pendingToolsCalling: [],
+        status: 'waiting_for_async_tool',
+        stepCount: 4,
+      });
+      const casSpy = vi.spyOn(AgentOperationModel.prototype, 'tryResumeFromAsyncTool');
+
+      const won = await service.tryResumeParentFromAsyncTool(
+        { parentOperationId: parentOpId },
+        { scheduleVerifyOnHold: true },
+      );
+
+      expect(won).toBe(false);
+      expect(casSpy).not.toHaveBeenCalled();
+      expect(mockQueueService.scheduleMessage).toHaveBeenCalledWith(
+        expect.objectContaining({
+          payload: { asyncToolVerifyAttempt: 1, verifyAsyncToolBarrier: true },
+          stepIndex: 4,
+        }),
      );
    });

@@ -1805,7 +1903,7 @@ describe('AgentRuntimeService', () => {
      });
      expect(resumeSpy).toHaveBeenCalledWith(
        { parentOperationId: 'parent-op-1' },
-        { scheduleVerifyOnHold: true },
+        { knownFulfilledMessageId: 'tool-msg-1', scheduleVerifyOnHold: true },
      );
    });

@@ -20,12 +20,18 @@ import {
  trace as otelTrace,
 } from '@lobechat/observability-otel/api';
 import {
+  asyncToolResumeCounter,
  buildInvokeAgentAttributes,
  buildInvokeAgentResultAttributes,
  invokeAgentSpanName,
  tracer as agentRuntimeTracer,
 } from '@lobechat/observability-otel/modules/agent-runtime';
-import { type ChatToolPayload, type ExecSubAgentParams, type UIChatMessage } from '@lobechat/types';
+import {
+  type ChatToolPayload,
+  type ExecSubAgentParams,
+  type ExecVirtualSubAgentParams,
+  type UIChatMessage,
+} from '@lobechat/types';
 import debug from 'debug';
 import urlJoin from 'url-join';

@@ -56,6 +62,7 @@ import { CompletionLifecycle } from './CompletionLifecycle';
 import { hookDispatcher } from './hooks';
 import { HumanInterventionHandler } from './HumanInterventionHandler';
 import { OperationTraceRecorder } from './OperationTraceRecorder';
+import { createDefaultSnapshotStore } from './snapshotStore';
 import { buildStepPresentation, formatTokenCount } from './stepPresentation';
 import {
  type AgentExecutionParams,
@@ -78,13 +85,37 @@ if (process.env.VERCEL) {
 const log = debug('lobe-server:agent-runtime-service');

 /**
- * Delay before a one-shot `verifyAsyncToolBarrier` re-check fires after a
+ * Base delay before the first `verifyAsyncToolBarrier` re-check fires after a
 * sub-agent completion found the parent not yet resumable. Long enough for
 * the parent's parking step to finish persisting, short enough that a lost
- * resume is recovered promptly.
+ * resume is recovered promptly. Subsequent attempts back off exponentially —
+ * see {@link asyncToolVerifyDelayMs}.
 */
 const ASYNC_TOOL_VERIFY_DELAY_MS = 15_000;

+/**
+ * Maximum number of bounded watchdog re-checks armed per parked parent. The
+ * watchdog re-arms after each unsatisfied check (instead of the old single
+ * shot) so a transient miss — a read-replica lag, a sibling dying between
+ * backfill and resume — is retried rather than leaving the parent stuck in
+ * `waiting_for_async_tool` forever. With exponential backoff from a 15s base,
+ * 5 attempts span ~15s → ~7.75min total before giving up. See LOBE-10385.
+ */
+const ASYNC_TOOL_VERIFY_MAX_ATTEMPTS = 5;
+
+/** Hard ceiling on a single backoff delay so late attempts don't overshoot. */
+const ASYNC_TOOL_VERIFY_MAX_DELAY_MS = 240_000;
+
+/**
+ * Exponential backoff delay for the Nth (1-based) watchdog re-check:
+ * 15s, 30s, 60s, 120s, 240s, capped at {@link ASYNC_TOOL_VERIFY_MAX_DELAY_MS}.
+ */
+const asyncToolVerifyDelayMs = (attempt: number): number =>
+  Math.min(
+    ASYNC_TOOL_VERIFY_DELAY_MS * 2 ** (Math.max(1, attempt) - 1),
+    ASYNC_TOOL_VERIFY_MAX_DELAY_MS,
+  );
+
 /**
 * Format error for storage in message pluginError metadata.
 * Handles Error objects which don't serialize properly with JSON.stringify.
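Review note: the backoff schedule described in the constants above can be checked with a standalone sketch. The three constants and the delay formula are copied from the diff; `nextVerify` is a hypothetical helper that models the bounded re-arm, inferred only from the test expectation that attempt 6 schedules nothing, and is not the actual `maybeScheduleAsyncToolVerify` implementation.

```typescript
const BASE_DELAY_MS = 15_000; // ASYNC_TOOL_VERIFY_DELAY_MS in the diff
const MAX_DELAY_MS = 240_000; // ASYNC_TOOL_VERIFY_MAX_DELAY_MS in the diff
const MAX_ATTEMPTS = 5; // ASYNC_TOOL_VERIFY_MAX_ATTEMPTS in the diff

// Delay for the Nth (1-based) re-check: doubles per attempt, capped at MAX_DELAY_MS.
const verifyDelayMs = (attempt: number): number =>
  Math.min(BASE_DELAY_MS * 2 ** (Math.max(1, attempt) - 1), MAX_DELAY_MS);

// Hypothetical: what would be scheduled for a given attempt, or null once exhausted.
const nextVerify = (attempt = 1): { attempt: number; delay: number } | null =>
  attempt > MAX_ATTEMPTS ? null : { attempt, delay: verifyDelayMs(attempt) };

const schedule = Array.from({ length: MAX_ATTEMPTS }, (_, i) => verifyDelayMs(i + 1));
const totalMs = schedule.reduce((sum, ms) => sum + ms, 0);
console.log(schedule, totalMs / 60_000);
```

The five delays sum to 465 seconds, which matches the "~7.75min total" figure in the `ASYNC_TOOL_VERIFY_MAX_ATTEMPTS` comment.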
@@ -126,13 +157,17 @@ const toAgentSignalSnapshotEvents = (
 */
 export interface AgentRuntimeDelegate {
  /**
-   * Fork a sub-agent through the full high-level pipeline
+   * Run a legacy agent invocation through the full high-level pipeline
   * (AiAgentService.execSubAgent → execAgent: agent-config resolution, tool
-   * engine, context engineering, createOperation). Returns a deferred result;
-   * the parent op parks (`waiting_for_async_tool`) until the completion bridge
-   * backfills the placeholder and resumes it.
+   * engine, context engineering, createOperation).
   */
  execSubAgent?: (params: ExecSubAgentParams) => Promise<unknown>;
+  /**
+   * Fork a `lobe-agent.callSubAgent` virtual child run. The child is marked as a
+   * sub-agent and owns the completion bridge that backfills the parent tool
+   * placeholder before resuming the parked parent operation.
+   */
+  execVirtualSubAgent?: (params: ExecVirtualSubAgentParams) => Promise<unknown>;
 }

 export interface AgentRuntimeServiceOptions {
@@ -247,7 +282,7 @@ export class AgentRuntimeService {
    this.queueService =
      options?.queueService === null ? null : (options?.queueService ?? new QueueService());
    this.traceRecorder = new OperationTraceRecorder(
-      options?.snapshotStore ?? this.createDefaultSnapshotStore(),
+      options?.snapshotStore ?? createDefaultSnapshotStore(),
    );
    this.agentFactory = options?.agentFactory;
    this.delegate = options?.delegate ?? {};
@@ -580,15 +615,28 @@ export class AgentRuntimeService {
      resumeAsyncTool,
      toolMessageId,
      verifyAsyncToolBarrier,
+      asyncToolVerifyAttempt,
      externalRetryCount = 0,
    } = params;

    // Watchdog re-check for a parked async-tool wait: re-run the barrier + CAS
    // without claiming the step lock or executing anything. Idempotent — the
    // CAS guarantees at most one real resume regardless of how many checks run.
+    // Opt back into `scheduleVerifyOnHold` with the next attempt so an
+    // unsatisfied barrier re-arms (bounded backoff) instead of giving up after
+    // a single shot — the core LOBE-10385 fix.
    if (verifyAsyncToolBarrier) {
-      log('[%s][%d] Running async-tool barrier verify', operationId, stepIndex);
-      const resumed = await this.tryResumeParentFromAsyncTool({ parentOperationId: operationId });
+      const attempt = asyncToolVerifyAttempt ?? 1;
+      log(
+        '[%s][%d] Running async-tool barrier verify (attempt %d)',
+        operationId,
+        stepIndex,
+        attempt,
+      );
+      const resumed = await this.tryResumeParentFromAsyncTool(
+        { parentOperationId: operationId },
+        { scheduleVerifyOnHold: true, verifyAttempt: attempt + 1 },
+      );
      return {
        nextStepScheduled: resumed,
        state: {},
@@ -1617,12 +1665,30 @@ export class AgentRuntimeService {
   */
  async tryResumeParentFromAsyncTool(
    params: { parentOperationId: string },
-    options?: { scheduleVerifyOnHold?: boolean },
+    options?: {
+      /**
+       * Message id of a tool placeholder the caller just backfilled to a
+       * terminal state. Trusted by the barrier as fulfilled without re-reading
+       * `message_plugins` — closes the read-your-writes gap where the barrier
+       * query hits a read replica that hasn't seen the just-committed write.
+       */
+      knownFulfilledMessageId?: string;
+      scheduleVerifyOnHold?: boolean;
+      /** 1-based watchdog attempt to arm when the parent isn't resumable yet. */
+      verifyAttempt?: number;
+    },
  ): Promise<boolean> {
    const { parentOperationId } = params;

    const state = await this.coordinator.loadAgentState(parentOperationId);
-    if (!state) return false;
+    if (!state) {
+      // State expired (Redis TTL) or never persisted — nothing left to resume.
+      // Surface it: a missing state at completion time is how a parent silently
+      // strands. There is no stepCount/status to arm a verify against.
+      log('[%s] async-tool resume: parent state missing/expired, cannot resume', parentOperationId);
+      asyncToolResumeCounter.add(1, { outcome: 'no_state' });
+      return false;
+    }

    if (state.status !== 'waiting_for_async_tool') {
      // Not parked (yet). Either the op already resumed/finished — nothing to
@@ -1633,12 +1699,27 @@ export class AgentRuntimeService {
    }

    const pending = (state.pendingToolsCalling ?? []) as ChatToolPayload[];
-    if (pending.length === 0) return false;
+    if (pending.length === 0) {
+      // Parked but no pending tools recorded — usually the parked snapshot's
+      // `pendingToolsCalling` hasn't finished persisting yet. Warn, report, and
+      // arm a fallback re-check rather than returning silently (the old bug).
+      log(
+        '[%s] async-tool resume: parked op has no pending tools, arming fallback',
+        parentOperationId,
+      );
+      asyncToolResumeCounter.add(1, { outcome: 'no_pending' });
+      await this.maybeScheduleAsyncToolVerify(parentOperationId, state, options);
+      return false;
+    }

    // Barrier: every pending tool must have a fulfilled tool_result message.
-    const allFulfilled = await this.allPendingToolsFulfilled(pending);
+    const allFulfilled = await this.allPendingToolsFulfilled(
+      pending,
+      options?.knownFulfilledMessageId,
+    );
    if (!allFulfilled) {
      log('[%s] async-tool barrier not yet satisfied, holding', parentOperationId);
+      asyncToolResumeCounter.add(1, { outcome: 'barrier_held' });
      await this.maybeScheduleAsyncToolVerify(parentOperationId, state, options);
      return false;
    }
@@ -1649,9 +1730,12 @@ export class AgentRuntimeService {
    );
    if (!won) {
      log('[%s] lost async-tool resume CAS, no-op', parentOperationId);
+      asyncToolResumeCounter.add(1, { outcome: 'lost_cas' });
      return false;
    }

+    asyncToolResumeCounter.add(1, { outcome: 'resumed' });
+
    log('[%s] won async-tool resume CAS, scheduling step %d', parentOperationId, state.stepCount);

    if (this.queueService) {
@@ -1672,36 +1756,60 @@ export class AgentRuntimeService {
  }

  /**
-   * Arm a one-shot delayed `verifyAsyncToolBarrier` re-check for a parent op
-   * whose resume attempt found it not yet resumable. Skipped for terminal
-   * states (nothing left to resume) and when the caller didn't opt in — the
-   * verify execution itself never re-arms, keeping retries bounded to one
-   * per completion event.
+   * Arm the next bounded `verifyAsyncToolBarrier` re-check for a parent op whose
+   * resume attempt found it not yet resumable. Skipped for terminal states
+   * (nothing left to resume) and when the caller didn't opt in.
+   *
+   * Unlike the original single shot, the watchdog re-arms after each unsatisfied
+   * check: the verify handler re-enters here with `verifyAttempt + 1`, backing
+   * off exponentially up to {@link ASYNC_TOOL_VERIFY_MAX_ATTEMPTS}. A transient
+   * miss (read-replica lag, a sibling dying between backfill and resume) is thus
+   * retried instead of permanently stranding the parent. Once attempts are
+   * exhausted the chain stops and the `verify_exhausted` metric fires so the
+   * orphan is observable. See LOBE-10385.
   */
  private async maybeScheduleAsyncToolVerify(
    parentOperationId: string,
    state: AgentState,
-    options?: { scheduleVerifyOnHold?: boolean },
+    options?: { scheduleVerifyOnHold?: boolean; verifyAttempt?: number },
  ): Promise<void> {
    if (!options?.scheduleVerifyOnHold || !this.queueService) return;

    const status = state.status as string;
    if (status === 'done' || status === 'error' || status === 'interrupted') return;

+    const attempt = options.verifyAttempt ?? 1;
+    if (attempt > ASYNC_TOOL_VERIFY_MAX_ATTEMPTS) {
+      // Bounded retries spent and the parent is still not resumable — give up
+      // re-arming and report so the stuck wait can be detected, not silently
+      // accumulated.
+      log(
+        '[%s] async-tool barrier verify exhausted after %d attempts, giving up (status: %s)',
+        parentOperationId,
+        ASYNC_TOOL_VERIFY_MAX_ATTEMPTS,
+        status,
+      );
+      asyncToolResumeCounter.add(1, { outcome: 'verify_exhausted' });
+      return;
+    }
+
+    const delay = asyncToolVerifyDelayMs(attempt);
    log(
-      '[%s] scheduling async-tool barrier verify in %dms (status: %s)',
+      '[%s] scheduling async-tool barrier verify attempt %d/%d in %dms (status: %s)',
      parentOperationId,
-      ASYNC_TOOL_VERIFY_DELAY_MS,
+      attempt,
+      ASYNC_TOOL_VERIFY_MAX_ATTEMPTS,
+      delay,
      status,
    );

    try {
      await this.queueService.scheduleMessage({
        context: undefined,
-        delay: ASYNC_TOOL_VERIFY_DELAY_MS,
+        delay,
        endpoint: `${this.baseURL}/run`,
        operationId: parentOperationId,
-        payload: { verifyAsyncToolBarrier: true },
+        payload: { asyncToolVerifyAttempt: attempt, verifyAsyncToolBarrier: true },
        priority: 'high',
        stepIndex: state.stepCount,
      });
@@ -1782,22 +1890,40 @@ export class AgentRuntimeService {
      );
    }

-    // 2. Barrier + CAS + resume the parent op (infra errors propagate too)
-    return this.tryResumeParentFromAsyncTool({ parentOperationId }, { scheduleVerifyOnHold: true });
+    // 2. Barrier + CAS + resume the parent op (infra errors propagate too).
+    // Pass the just-backfilled message id so the barrier trusts this write
+    // instead of re-reading a possibly-stale replica.
+    return this.tryResumeParentFromAsyncTool(
+      { parentOperationId },
+      { knownFulfilledMessageId: toolMessageId, scheduleVerifyOnHold: true },
+    );
  }

  /**
   * Whether every pending tool call has a fulfilled tool_result message — i.e.
   * a tool message exists for its `tool_call_id` with non-empty content or a
   * terminal pluginState. Looks up by `tool_call_id` (plugin id === message id).
+   *
+   * `knownFulfilledMessageId` short-circuits the per-tool content/state read for
+   * a placeholder the caller just backfilled in the same request: its terminal
+   * write is a local fact, so re-reading it (possibly from a lagging read
+   * replica) would only risk a false negative that strands the parent. The
+   * plugin row itself predates the park, so the `tool_call_id → plugin.id`
+   * lookup still resolves; only the freshly written content/state is trusted.
   */
-  private async allPendingToolsFulfilled(pending: ChatToolPayload[]): Promise<boolean> {
+  private async allPendingToolsFulfilled(
+    pending: ChatToolPayload[],
+    knownFulfilledMessageId?: string,
+  ): Promise<boolean> {
    for (const tc of pending) {
      const plugin = await this.serverDB.query.messagePlugins.findFirst({
        where: (mp, { eq }) => eq(mp.toolCallId, tc.id),
      });
      if (!plugin) return false;

+      // Trust the caller's own just-committed backfill (read-your-writes).
+      if (knownFulfilledMessageId && plugin.id === knownFulfilledMessageId) continue;
+
      const message = await this.messageModel.findById(plugin.id);
      const pluginState = plugin.state as { status?: string } | null;
      const fulfilled =
@@ -1864,10 +1990,7 @@ export class AgentRuntimeService {
          if (!tool || typeof tool !== 'object') continue;

          const toolPayload = tool as { id?: unknown; result_msg_id?: unknown };
-          if (
-            typeof toolPayload.id === 'string' &&
-            typeof toolPayload.result_msg_id === 'string'
-          ) {
+          if (typeof toolPayload.id === 'string' && typeof toolPayload.result_msg_id === 'string') {
            toolResultMessageIds.set(toolPayload.id, toolPayload.result_msg_id);
          }
        }
@@ -1944,6 +2067,7 @@ export class AgentRuntimeService {
      userTimezone: metadata?.userTimezone,
      evalContext: metadata?.evalContext,
      execSubAgent: this.delegate.execSubAgent,
+      execVirtualSubAgent: this.delegate.execVirtualSubAgent,
      hookDispatcher,
      loadAgentState: this.coordinator.loadAgentState.bind(this.coordinator),
      messageModel: this.messageModel,
@@ -1967,34 +2091,6 @@ export class AgentRuntimeService {
    return { agent, runtime };
  }

-  /**
-   * Create default snapshot store based on environment.
-   * - ENABLE_AGENT_S3_TRACING=1 → S3SnapshotStore
-   * - NODE_ENV=development → FileSnapshotStore
-   * - Otherwise → null (no tracing)
-   */
-  private createDefaultSnapshotStore(): ISnapshotStore | null {
-    if (process.env.ENABLE_AGENT_S3_TRACING === '1') {
-      try {
-        const { S3SnapshotStore } = require('@/server/modules/AgentTracing');
-        return new S3SnapshotStore();
-      } catch {
-        // S3SnapshotStore not available
-      }
-    }
-
-    if (process.env.NODE_ENV === 'development') {
-      try {
-        const { FileSnapshotStore } = require('@lobechat/agent-tracing');
-        return new FileSnapshotStore();
-      } catch {
-        // agent-tracing not available
-      }
-    }
-
-    return null;
-  }
-
  /**
   * Compute device context from DB messages at step boundary.
   * Uses findInMessages visitor to scan tool messages for device activation.
@@ -344,11 +344,16 @@ export class CompletionLifecycle {
          metadata?.assistantMessageId,
          metadata?.userId || this.userId,
        );
-        void runVerifyOnCompletion(this.serverDB, metadata?.userId || this.userId, {
-          deliverable: event.lastAssistantContent ?? '',
-          goal,
-          operationId,
-        });
+        void runVerifyOnCompletion(
+          this.serverDB,
+          metadata?.userId || this.userId,
+          {
+            deliverable: event.lastAssistantContent ?? '',
+            goal,
+            operationId,
+          },
+          this.workspaceId,
+        );
      }

      if (reason === 'error') {
@@ -0,0 +1,71 @@
+// @vitest-environment node
+import { afterEach, describe, expect, it, vi } from 'vitest';
+
+import { createDefaultSnapshotStore, shouldUseAgentS3Tracing } from '../snapshotStore';
+
+const s3SnapshotStoreMock = vi.fn(() => ({ kind: 's3' }));
+const fileSnapshotStoreMock = vi.fn(() => ({ kind: 'file' }));
+
+const setEnv = (nodeEnv: string, agentS3Tracing?: string) => {
+  vi.stubEnv('NODE_ENV', nodeEnv);
+  vi.stubEnv('ENABLE_AGENT_S3_TRACING', agentS3Tracing);
+};
+
+const loadModule = vi.fn((moduleName: string) => {
+  if (moduleName === '@/server/modules/AgentTracing') {
+    return { S3SnapshotStore: s3SnapshotStoreMock };
+  }
+
+  if (moduleName === '@lobechat/agent-tracing') {
+    return { FileSnapshotStore: fileSnapshotStoreMock };
+  }
+
+  throw new Error(`Unexpected module: ${moduleName}`);
+});
+
+describe('agent runtime snapshot store defaults', () => {
+  afterEach(() => {
+    vi.unstubAllEnvs();
+    vi.clearAllMocks();
+  });
+
+  it('enables S3 tracing by default in production when env is unset', () => {
+    setEnv('production');
+
+    expect(shouldUseAgentS3Tracing()).toBe(true);
+    expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 's3' });
+    expect(loadModule).toHaveBeenCalledWith('@/server/modules/AgentTracing');
+    expect(s3SnapshotStoreMock).toHaveBeenCalledTimes(1);
+    expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
+  });
+
+  it('uses the local file snapshot store in development when env is unset', () => {
+    setEnv('development');
+
+    expect(shouldUseAgentS3Tracing()).toBe(false);
+    expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 'file' });
+    expect(loadModule).toHaveBeenCalledWith('@lobechat/agent-tracing');
+    expect(s3SnapshotStoreMock).not.toHaveBeenCalled();
+    expect(fileSnapshotStoreMock).toHaveBeenCalledTimes(1);
+  });
+
+  it('lets ENABLE_AGENT_S3_TRACING=1 force S3 tracing outside production', () => {
+    setEnv('development', '1');
+
+    expect(shouldUseAgentS3Tracing()).toBe(true);
+    expect(createDefaultSnapshotStore(loadModule)).toEqual({ kind: 's3' });
+    expect(loadModule).toHaveBeenCalledWith('@/server/modules/AgentTracing');
+    expect(s3SnapshotStoreMock).toHaveBeenCalledTimes(1);
+    expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
+  });
+
+  it('lets an explicit ENABLE_AGENT_S3_TRACING value disable the production default', () => {
+    setEnv('production', '0');
+
+    expect(shouldUseAgentS3Tracing()).toBe(false);
+    expect(createDefaultSnapshotStore(loadModule)).toBeNull();
+    expect(loadModule).not.toHaveBeenCalled();
+    expect(s3SnapshotStoreMock).not.toHaveBeenCalled();
+    expect(fileSnapshotStoreMock).not.toHaveBeenCalled();
+  });
+});
@@ -0,0 +1,59 @@
+import type { ISnapshotStore } from '@lobechat/agent-tracing';
+
+const ENABLE_AGENT_S3_TRACING_VALUE = '1';
+
+type SnapshotStoreConstructor = new () => ISnapshotStore;
+type SnapshotStoreModuleLoader = (moduleName: string) => unknown;
+
+interface FileSnapshotStoreModule {
+  FileSnapshotStore: SnapshotStoreConstructor;
+}
+
+interface S3SnapshotStoreModule {
+  S3SnapshotStore: SnapshotStoreConstructor;
+}
+
+const nodeRequire: SnapshotStoreModuleLoader = (moduleName) => require(moduleName);
+
+export const shouldUseAgentS3Tracing = () => {
+  const explicitValue = process.env.ENABLE_AGENT_S3_TRACING;
+
+  if (explicitValue !== undefined) return explicitValue === ENABLE_AGENT_S3_TRACING_VALUE;
+
+  return process.env.NODE_ENV === 'production';
+};
+
+/**
+ * Create default snapshot store based on environment.
+ * - ENABLE_AGENT_S3_TRACING=1 -> S3SnapshotStore
+ * - NODE_ENV=production with ENABLE_AGENT_S3_TRACING unset -> S3SnapshotStore
+ * - NODE_ENV=development -> FileSnapshotStore
+ * - Otherwise -> null (no tracing)
+ */
+export const createDefaultSnapshotStore = (
+  loadModule: SnapshotStoreModuleLoader = nodeRequire,
+): ISnapshotStore | null => {
+  if (shouldUseAgentS3Tracing()) {
+    try {
+      const { S3SnapshotStore } = loadModule(
+        '@/server/modules/AgentTracing',
+      ) as S3SnapshotStoreModule;
+      return new S3SnapshotStore();
+    } catch {
+      // S3SnapshotStore not available
+    }
+  }
+
+  if (process.env.NODE_ENV === 'development') {
+    try {
+      const { FileSnapshotStore } = loadModule(
+        '@lobechat/agent-tracing',
+      ) as FileSnapshotStoreModule;
+      return new FileSnapshotStore();
+    } catch {
+      // agent-tracing not available
+    }
+  }
+
+  return null;
+};
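
The selection rules in `shouldUseAgentS3Tracing` and `createDefaultSnapshotStore` can be summarized as a pure function. This is an illustrative sketch only: `pickSnapshotStoreKind` and `SnapshotStoreKind` are hypothetical names that do not appear in the diff, and the sketch assumes the selected module loads successfully (the real function falls through to the next rule when a load throws).

```typescript
// Hypothetical helper, not part of the diff: mirrors the env-based selection
// in createDefaultSnapshotStore, assuming every module load succeeds.
type SnapshotStoreKind = 's3' | 'file' | null;

const pickSnapshotStoreKind = (
  nodeEnv: string | undefined,
  enableS3Tracing: string | undefined,
): SnapshotStoreKind => {
  // An explicit value always wins: only '1' enables S3 tracing.
  // When the variable is unset, S3 tracing defaults to on in production only.
  const useS3 =
    enableS3Tracing !== undefined ? enableS3Tracing === '1' : nodeEnv === 'production';

  if (useS3) return 's3';
  if (nodeEnv === 'development') return 'file';
  return null;
};
```

Under these assumptions the four cases covered by the new test file map to `'s3'`, `'file'`, `'s3'`, and `null` respectively; an explicit `'0'` in development still yields `'file'`, because only the S3 branch is gated by the flag.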
@@ -121,6 +121,12 @@ export type StepCompletionReason =

 export interface AgentExecutionParams {
  approvedToolCall?: any;
+  /**
+   * 1-based attempt number carried by a `verifyAsyncToolBarrier` re-check so the
+   * bounded watchdog can back off and stop after a fixed number of tries. The
+   * first re-check armed by a completion bridge carries 1; absent is treated as 1.
+   */
+  asyncToolVerifyAttempt?: number;
  context?: AgentRuntimeContext;
  externalRetryCount?: number;
  humanInput?: any;
@@ -144,10 +150,13 @@ export interface AgentExecutionParams {
  /**
   * Watchdog re-check for a parked `waiting_for_async_tool` op: re-runs the
   * resume barrier + CAS without claiming the step lock or executing a step.
-   * A no-op when the op already resumed or the barrier is still unsatisfied.
-   * Scheduled one-shot by `tryResumeParentFromAsyncTool` when a sub-agent
-   * completion found the parent not yet resumable (covers the
-   * child-finishes-before-parent-parks race and transient barrier failures).
+   * A no-op when the op already resumed. While the barrier is still unsatisfied
+   * it re-arms the next check with exponential backoff (see
+   * `asyncToolVerifyAttempt`) up to a bounded number of attempts, so a transient
+   * miss is retried rather than permanently stranding the parent. First armed by
+   * `tryResumeParentFromAsyncTool` when a sub-agent completion found the parent
+   * not yet resumable (covers the child-finishes-before-parent-parks race and
+   * transient barrier failures).
   */
  verifyAsyncToolBarrier?: boolean;
 }
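
The bounded, exponentially backed-off re-check described in the comment above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this change: the real `asyncToolVerifyDelayMs` and `ASYNC_TOOL_VERIFY_MAX_ATTEMPTS` are defined outside this excerpt, so the base delay, cap, attempt limit, and every `sketch*` name below are hypothetical.

```typescript
// Hypothetical values for illustration; the real constants are not shown in the diff.
const SKETCH_BASE_DELAY_MS = 1000;
const SKETCH_MAX_DELAY_MS = 30000;
const SKETCH_MAX_ATTEMPTS = 5;

// Delay before a 1-based attempt: doubles each time, capped at the maximum.
const sketchVerifyDelayMs = (attempt: number): number =>
  Math.min(SKETCH_BASE_DELAY_MS * 2 ** (attempt - 1), SKETCH_MAX_DELAY_MS);

// Mirrors the arming decision: returns the delay to schedule for this attempt,
// or null once the bounded number of attempts is spent (the chain stops).
const sketchNextVerifyDelay = (attempt: number = 1): number | null =>
  attempt > SKETCH_MAX_ATTEMPTS ? null : sketchVerifyDelayMs(attempt);
```

With these assumed values, attempts 1 through 5 would be scheduled after 1000, 2000, 4000, 8000, and 16000 ms, and attempt 6 would return `null`, which corresponds to the point where the diff emits the `verify_exhausted` outcome instead of re-arming.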
@@ -221,6 +230,9 @@ export interface OperationCreationParams {
  deviceAccessPolicy?: { canUseDevice: boolean; reason: DeviceAccessReason };
  /** Device system info for placeholder variable replacement in Local System systemRole */
  deviceSystemInfo?: Record<string, string>;
+  /** Discord context for injecting channel/guild info into agent system message */
+  discordContext?: any;
+  evalContext?: any;
  /**
   * Resolved execution plan for the run (see `resolveExecutionPlan`).
   * Forwarded into `state.metadata.executionPlan` so step-level layers (the
@@ -228,9 +240,6 @@ export interface OperationCreationParams {
   * device capability from raw config.
   */
  executionPlan?: ExecutionPlan;
-  /** Discord context for injecting channel/guild info into agent system message */
-  discordContext?: any;
-  evalContext?: any;
  /**
   * External lifecycle hooks
   * Registered once, auto-adapt to local (in-memory) or production (webhook) mode
@@ -141,9 +141,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -97,9 +97,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -101,9 +101,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -98,9 +98,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -7,7 +7,7 @@ const {
  mockCreateOperation,
  mockCreateServerAgentToolsEngine,
  mockGetAgentConfig,
-  mockGetKlavisManifests,
+  mockGetComposioManifests,
  mockGetLobehubSkillManifests,
  mockMessageCreate,
  mockPluginQuery,
@@ -18,7 +18,7 @@ const {
    getEnabledPluginManifests: vi.fn().mockReturnValue(new Map()),
  }),
  mockGetAgentConfig: vi.fn(),
-  mockGetKlavisManifests: vi.fn().mockResolvedValue([]),
+  mockGetComposioManifests: vi.fn().mockResolvedValue([]),
  mockGetLobehubSkillManifests: vi.fn().mockResolvedValue([]),
  mockMessageCreate: vi.fn(),
  mockPluginQuery: vi.fn().mockResolvedValue([]),
@@ -97,9 +97,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: mockGetKlavisManifests,
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: mockGetComposioManifests,
  })),
 }));

@@ -176,7 +176,7 @@ describe('AiAgentService.execAgent - disableTools', () => {

    // Manifest fetches should NOT be called
    expect(mockGetLobehubSkillManifests).not.toHaveBeenCalled();
-    expect(mockGetKlavisManifests).not.toHaveBeenCalled();
+    expect(mockGetComposioManifests).not.toHaveBeenCalled();

    // ToolsEngine should NOT be created
    expect(mockCreateServerAgentToolsEngine).not.toHaveBeenCalled();
@@ -196,7 +196,7 @@ describe('AiAgentService.execAgent - disableTools', () => {
    // All tool discovery steps should be called
    expect(mockPluginQuery).toHaveBeenCalledTimes(1);
    expect(mockGetLobehubSkillManifests).toHaveBeenCalledTimes(1);
-    expect(mockGetKlavisManifests).toHaveBeenCalledTimes(1);
+    expect(mockGetComposioManifests).toHaveBeenCalledTimes(1);
    expect(mockCreateServerAgentToolsEngine).toHaveBeenCalledTimes(1);
  });
 });
@@ -100,9 +100,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -68,9 +68,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -68,9 +68,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -91,9 +91,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -101,9 +101,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -100,10 +100,10 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-// Mock KlavisService (for getKlavisManifests)
-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+// Mock ComposioService (for getComposioManifests)
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -90,9 +90,9 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -21,6 +21,12 @@ vi.mock('@/database/models/thread', () => ({
  ThreadModel: vi.fn().mockImplementation(() => mockThreadModel),
 }));

+vi.mock('@/database/models/agentOperation', () => ({
+  AgentOperationModel: vi.fn().mockImplementation(() => ({
+    findById: vi.fn().mockResolvedValue({ trigger: 'cli' }),
+  })),
+}));
+
 // Mock other models
 vi.mock('@/database/models/agent', () => ({
  AgentModel: vi.fn().mockImplementation(() => ({
@@ -81,10 +87,10 @@ vi.mock('@/server/services/market', () => ({
  })),
 }));

-// Mock KlavisService
-vi.mock('@/server/services/klavis', () => ({
-  KlavisService: vi.fn().mockImplementation(() => ({
-    getKlavisManifests: vi.fn().mockResolvedValue([]),
+// Mock ComposioService
+vi.mock('@/server/services/composio', () => ({
+  ComposioService: vi.fn().mockImplementation(() => ({
+    getComposioManifests: vi.fn().mockResolvedValue([]),
  })),
 }));

@@ -115,7 +121,7 @@ describe('AiAgentService.execSubAgent', () => {
    service = new AiAgentService(mockDb, userId);
  });

-  describe('successful task execution', () => {
+  describe('successful isolated execution', () => {
    it('should create Thread with correct parameters', async () => {
      // Mock execAgent to return success
      vi.spyOn(service, 'execAgent').mockResolvedValue({
@@ -208,6 +214,7 @@ describe('AiAgentService.execSubAgent', () => {
        agentId: 'agent-1',
        appContext: {
          groupId: 'group-1',
+          isSubAgent: false,
          threadId: 'thread-123',
          topicId: 'topic-1',
        },
@@ -223,6 +230,46 @@ describe('AiAgentService.execSubAgent', () => {
      });
    });

+    it('should run deferred lobe-agent children through execVirtualSubAgent', async () => {
+      const execAgentSpy = vi.spyOn(service, 'execAgent').mockResolvedValue({
+        agentId: 'agent-1',
+        assistantMessageId: 'assistant-msg-1',
+        autoStarted: true,
+        createdAt: new Date().toISOString(),
+        message: 'Agent operation created successfully',
+        messageId: 'queue-msg-1',
+        operationId: 'op-123',
+        status: 'created',
+        success: true,
+        timestamp: new Date().toISOString(),
+        topicId: 'topic-1',
+        userMessageId: 'user-msg-1',
+      });
+
+      await service.execVirtualSubAgent({
+        agentId: 'agent-1',
+        instruction: 'Nested research task',
+        parentMessageId: 'tool-msg-1',
+        parentOperationId: 'parent-op-1',
+        topicId: 'topic-1',
+      });
+
+      expect(execAgentSpy).toHaveBeenCalledWith(
+        expect.objectContaining({
+          appContext: expect.objectContaining({
+            isSubAgent: true,
+            threadId: 'thread-123',
+            topicId: 'topic-1',
+          }),
+          hooks: expect.arrayContaining([
+            expect.objectContaining({ id: 'sub-agent-bridge', type: 'onComplete' }),
+          ]),
+          parentOperationId: 'parent-op-1',
+          trigger: 'cli',
+        }),
+      );
+    });
+
    it('should store operationId and startedAt in Thread metadata', async () => {
      vi.spyOn(service, 'execAgent').mockResolvedValue({
        agentId: 'agent-1',
@@ -409,7 +456,7 @@ describe('AiAgentService.execSubAgent', () => {
          parentMessageId: 'parent-msg-1',
          topicId: 'topic-1',
        }),
-      ).rejects.toThrow('Failed to create thread for task execution');
+      ).rejects.toThrow('Failed to create thread for agent execution');
    });

    it('should throw error when Thread creation throws', async () => {
@@ -427,7 +474,7 @@ describe('AiAgentService.execSubAgent', () => {
    });
  });

-  describe('task message summary update', () => {
+  describe('source message summary update', () => {
    it('should pass sourceMessageId (parentMessageId) to callbacks for summary update', async () => {
      const execAgentSpy = vi.spyOn(service, 'execAgent').mockResolvedValue({
        agentId: 'agent-1',
@@ -36,6 +36,7 @@ import type {
  ExecGroupAgentResult,
  ExecSubAgentParams,
  ExecSubAgentResult,
+  ExecVirtualSubAgentParams,
  LobeAgentAgencyConfig,
  MessagePluginItem,
  UserInterventionConfig,
@@ -94,6 +95,7 @@ import {
  resolveAgentSelfIterationCapability,
 } from '@/server/services/agentSignal/featureGate';
 import { shouldSuppressSignal } from '@/server/services/agentSignal/suppressSignal';
+import { ComposioService } from '@/server/services/composio';
 import { deviceGateway } from '@/server/services/deviceGateway';
 import { DocumentService } from '@/server/services/document';
 import { FileService } from '@/server/services/file';
@@ -103,7 +105,6 @@ import {
 } from '@/server/services/file/resolveAttachments';
 import { HeterogeneousAgentService } from '@/server/services/heterogeneousAgent';
 import type { ConversationHistoryEntry } from '@/server/services/heterogeneousAgent/cloudHeteroContext';
-import { KlavisService } from '@/server/services/klavis';
 import { MarketService } from '@/server/services/market';
 import { markdownToTxt } from '@/utils/markdownToTxt';

@@ -286,7 +287,7 @@ export class AiAgentService {
  private readonly topicModel: TopicModel;
  private readonly agentRuntimeService: AgentRuntimeService;
  private readonly marketService: MarketService;
-  private readonly klavisService: KlavisService;
+  private readonly composioService: ComposioService;

  private readonly workspaceId?: string;

@@ -318,14 +319,15 @@ export class AiAgentService {
      // high-level pipelines mid-step. See AgentRuntimeDelegate. New high-level
      // capabilities the runtime calls into go in this `delegate` object.
      //
-      // `execSubAgent` is an auto-bound arrow field, so no `.bind(this)`.
+      // Arrow fields are auto-bound, so no `.bind(this)`.
      delegate: {
        execSubAgent: this.execSubAgent,
+        execVirtualSubAgent: this.execVirtualSubAgent,
      },
      workspaceId: wsId,
    });
    this.marketService = new MarketService({ userInfo: { userId } });
-    this.klavisService = new KlavisService({ db, userId, workspaceId: wsId });
+    this.composioService = new ComposioService({ db, userId });
  }

  private async resolveOperationTaskId(
@@ -415,9 +417,10 @@ export class AiAgentService {
   * Execute a single agent step against this service's runtime.
   *
   * Delegates to the internal AgentRuntimeService, which is already wired with
-   * the `execSubAgent` fork callback. The QStash step worker drives stepping
-   * through here so `lobe-agent.callSubAgent` can fork sub-agents — building a
-   * bare runtime there would lose the callback and fail with SUB_AGENT_UNAVAILABLE.
+   * the agent-invocation fork callbacks. The QStash step worker drives stepping
+   * through here so `lobe-agent.callSubAgent` can fork virtual sub-agents —
+   * building a bare runtime there would lose the callback and fail with
+   * SUB_AGENT_UNAVAILABLE.
   */
  executeStep(params: AgentExecutionParams): Promise<AgentExecutionResult> {
    return this.agentRuntimeService.executeStep(params);
@@ -1390,7 +1393,7 @@ export class AiAgentService {

    // These are needed outside the tools block (for agent management context, skill engine, etc.)
    let lobehubSkillManifests: LobeToolManifest[] = [];
-    let klavisManifests: LobeToolManifest[] = [];
+    let composioManifests: LobeToolManifest[] = [];
    let connectorManifests: ReturnType<typeof buildConnectorManifests> = [];
    let agentPlugins: string[] = [...(agentConfig?.plugins ?? []), ...(additionalPluginIds || [])];

@@ -1455,7 +1458,7 @@ export class AiAgentService {
          : [];

      // Only connectors WITH a real MCP endpoint (mcpServerUrl or stdio) can replace plugins in the
-      // manifest. Connectors WITHOUT an endpoint (e.g. Lobehub/Klavis OAuth skills synced via
+      // manifest. Connectors WITHOUT an endpoint (e.g. Lobehub/Composio OAuth skills synced via
      // syncToolsFromClient) must continue using their original plugin executor path — otherwise
      // after humanIntervention approval the runtime tries to call mcpServerUrl='' and returns empty.
      const connectorsMcp = connectors.filter(
@@ -1500,24 +1503,24 @@ export class AiAgentService {
      }
      log('execAgent: got %d lobehub skill manifests', lobehubSkillManifests.length);

-      // 5d. Fetch Klavis tool manifests from database
+      // 5d. Fetch Composio tool manifests from database
      try {
-        klavisManifests = await this.klavisService.getKlavisManifests();
+        composioManifests = await this.composioService.getComposioManifests();
      } catch (error) {
-        log('execAgent: failed to fetch klavis manifests: %O', error);
+        log('execAgent: failed to fetch composio manifests: %O', error);
      }
-      log('execAgent: got %d klavis manifests', klavisManifests.length);
+      log('execAgent: got %d composio manifests', composioManifests.length);

-      // 5d-1. Patch Lobehub/Klavis manifests AND community-MCP plugin manifests
+      // 5d-1. Patch Lobehub/Composio manifests AND community-MCP plugin manifests
      // with connector tool permissions. This enables needs_approval (→
      // humanIntervention: 'required') and disabled (→ blocking description) for
      // any tool managed via the connector system but executed through a
-      // non-connector path (Lobehub/Klavis skills, community MCP plugins).
+      // non-connector path (Lobehub/Composio skills, community MCP plugins).
      // The 'disabled' hard-block is already enforced universally in
      // ToolExecutionService; this surfaces the permission to the model too.
      if (
        lobehubSkillManifests.length > 0 ||
-        klavisManifests.length > 0 ||
+        composioManifests.length > 0 ||
        pluginsWithoutConnectors.length > 0
      ) {
        try {
@@ -1526,7 +1529,7 @@ export class AiAgentService {
          const { ConnectorToolModel } = await import('@/database/models/connectorTool');
          const allIdentifiers = [
            ...lobehubSkillManifests.map((m) => m.identifier),
-            ...klavisManifests.map((m) => m.identifier),
+            ...composioManifests.map((m) => m.identifier),
            ...pluginsWithoutConnectors.map((p) => p.identifier),
          ];
          const connectorEntries =
@@ -1552,7 +1555,7 @@ export class AiAgentService {
                : m;
            });

-            klavisManifests = klavisManifests.map((m) => {
+            composioManifests = composioManifests.map((m) => {
              const perms = connectorToolsMap.get(m.identifier);
              return perms && perms.size > 0
                ? (patchManifestWithPermissions(m as any, perms as any) as any)
@@ -1684,7 +1687,7 @@ export class AiAgentService {
      );

      const toolsEngine = createServerAgentToolsEngine(toolsContext, {
-        additionalManifests: [...lobehubSkillManifests, ...klavisManifests, ...connectorManifests],
+        additionalManifests: [...lobehubSkillManifests, ...composioManifests, ...connectorManifests],
        agentConfig: {
          chatConfig: agentConfig.chatConfig ?? undefined,
          plugins: agentPlugins,
@@ -1714,9 +1717,9 @@ export class AiAgentService {
          ...agentPlugins,
          ...(disableLocalSystem ? [] : [LocalSystemManifest.identifier]),
          RemoteDeviceManifest.identifier,
-          // Include LobeHub Skills and Klavis tools so they are passed to generateToolsDetailed
+          // Include LobeHub Skills and Composio tools so they are passed to generateToolsDetailed
          ...lobehubSkillManifests.map((m) => m.identifier),
-          ...klavisManifests.map((m) => m.identifier),
+          ...composioManifests.map((m) => m.identifier),
          // Connector manifests are also injected as additionalManifests
          ...connectorManifests.map((m) => m.identifier),
        ]),
@@ -1737,7 +1740,7 @@ export class AiAgentService {

      // Single guard for every `toolManifestMap[id] = ...` ingest below.
      // Mirrors the post-merge filter in `createServerToolsEngine`: an
-      // installed plugin, a LobeHub Skill, or a Klavis manifest declaring
+      // installed plugin, a LobeHub Skill, or a Composio manifest declaring
      // `identifier: 'lobe-remote-device'` would otherwise reach the
      // activator-discovery map and let an external bot sender enable it
      // (). Centralising the check at the ingest layer means
@@ -1811,14 +1814,14 @@ export class AiAgentService {
        toolManifestMap[LocalSystemManifest.identifier] = LocalSystemManifest as LobeToolManifest;
      }

-      // Include lobehub skill and klavis manifests for activator discovery
+      // Include lobehub skill and composio manifests for activator discovery
      for (const manifest of lobehubSkillManifests) {
        if (!isManifestIngestAllowed(manifest.identifier)) continue;
        if (!toolManifestMap[manifest.identifier]) {
          toolManifestMap[manifest.identifier] = manifest;
        }
      }
-      for (const manifest of klavisManifests) {
+      for (const manifest of composioManifests) {
        if (!isManifestIngestAllowed(manifest.identifier)) continue;
        if (!toolManifestMap[manifest.identifier]) {
          toolManifestMap[manifest.identifier] = manifest;
@@ -1829,9 +1832,9 @@ export class AiAgentService {
        if (!isManifestIngestAllowed(manifest.identifier)) continue;
        toolSourceMap[manifest.identifier] = 'lobehubSkill';
      }
-      for (const manifest of klavisManifests) {
+      for (const manifest of composioManifests) {
        if (!isManifestIngestAllowed(manifest.identifier)) continue;
-        toolSourceMap[manifest.identifier] = 'klavis';
+        toolSourceMap[manifest.identifier] = 'composio';
      }

      // Mark tools that must run on the user's machine (local-system, stdio
@@ -1863,10 +1866,10 @@ export class AiAgentService {
      }

      log(
-        'execAgent: generated %d tools, %d lobehub skills, %d klavis tools',
+        'execAgent: generated %d tools, %d lobehub skills, %d composio tools',
        tools?.length ?? 0,
        lobehubSkillManifests.length,
-        klavisManifests.length,
+        composioManifests.length,
      );

      const agentSelfIterationEnabled = agentConfig.chatConfig?.selfIteration?.enabled === true;
@@ -2092,12 +2095,12 @@ export class AiAgentService {
          name: manifest.meta?.title || manifest.identifier,
          type: 'lobehub-skill' as const,
        })),
-        // Klavis tools
-        ...klavisManifests.map((manifest) => ({
+        // Composio tools
+        ...composioManifests.map((manifest) => ({
          description: manifest.meta?.description,
          identifier: manifest.identifier,
          name: manifest.meta?.title || manifest.identifier,
-          type: 'klavis' as const,
+          type: 'composio' as const,
        })),
        // Custom connectors (user-added MCP servers)
        ...connectorManifests.map((manifest) => ({
@@ -2296,7 +2299,7 @@ export class AiAgentService {
        : undefined;

    // 13. Create user message in database
-    // Include threadId if provided (for SubAgent task execution in isolated Thread)
+    // Include threadId if provided (for isolated agent execution)
    const userMessageRecord = runFromHistory
      ? undefined
      : await this.messageModel.create({
@@ -2344,7 +2347,7 @@ export class AiAgentService {
    }

    // 14. Create assistant message placeholder in database
-    // Include threadId if provided (for SubAgent task execution in isolated Thread)
+    // Include threadId if provided (for isolated agent execution)
    const assistantMessageRecord = await this.messageModel.create({
      agentId: persistAgentId,
      content: LOADING_FLAT,
@@ -2856,35 +2859,46 @@ export class AiAgentService {
  }

  /**
-   * Execute SubAgent task (supports both Group and Single Agent mode)
+   * Execute an agent in an isolated Thread context.
   *
-   * This method is called by Supervisor (Group mode) or Agent (Single mode)
-   * to delegate tasks to SubAgents. Each task runs in an isolated Thread context.
-   *
-   * - Group mode: pass groupId, Thread will be associated with the Group
-   * - Single Agent mode: omit groupId, Thread will only be associated with the Agent
-   *
-   * Flow:
-   * 1. Create Thread (type='isolation', status='processing')
-   * 2. Delegate to execAgent with threadId in appContext
-   * 3. Store operationId in Thread metadata
+   * Group/callAgent paths use this entry. It does not mark the child as a
+   * virtual sub-agent and it does not install the async completion bridge.
   */
-  // Arrow field (not a method) so it stays bound to this instance when handed to
-  // AgentRuntimeService as the `execSubAgent` fork callback — no `.bind(this)`.
-  execSubAgent = async (params: ExecSubAgentParams): Promise<ExecSubAgentResult> => {
-    const {
-      groupId,
-      topicId,
-      parentMessageId,
-      agentId,
-      instruction,
-      title,
-      parentOperationId,
-      resumeParentOnComplete,
-    } = params;
+  // Arrow field (not a method) so it stays bound when handed to AgentRuntimeService.
+  execSubAgent = async (params: ExecSubAgentParams): Promise<ExecSubAgentResult> =>
+    this.execAgentThreadRun(params, {
+      isSubAgent: false,
+      logScope: 'execSubAgent',
+    });
+
+  /**
+   * Execute a virtual sub-agent created by `lobe-agent.callSubAgent`.
+   *
+   * This path is a child operation of the current agent run. It is marked as a
+   * sub-agent so it cannot recursively spawn more sub-agents, and it registers
+   * the bridge that backfills the parent's placeholder tool message.
+   */
+  execVirtualSubAgent = async (params: ExecVirtualSubAgentParams): Promise<ExecSubAgentResult> =>
+    this.execAgentThreadRun(params, {
+      isSubAgent: true,
+      logScope: 'execVirtualSubAgent',
+      resumeParentOnComplete: true,
+    });
+
+  private async execAgentThreadRun(
+    params: ExecSubAgentParams | ExecVirtualSubAgentParams,
+    options: {
+      isSubAgent: boolean;
+      logScope: 'execSubAgent' | 'execVirtualSubAgent';
+      resumeParentOnComplete?: boolean;
+    },
+  ): Promise<ExecSubAgentResult> {
+    const { groupId, topicId, parentMessageId, agentId, instruction, title, parentOperationId } =
+      params;

    log(
-      'execSubAgent: agentId=%s, groupId=%s, topicId=%s, instruction=%s',
+      '%s: agentId=%s, groupId=%s, topicId=%s, instruction=%s',
+      options.logScope,
      agentId,
      groupId,
      topicId,
@@ -2903,7 +2917,7 @@ export class AiAgentService {
        .catch(() => {});
    }

-    // 1. Create Thread for isolated task execution
+    // 1. Create Thread for isolated agent execution
    const thread = await this.threadModel.create({
      agentId,
      groupId,
@@ -2914,10 +2928,10 @@ export class AiAgentService {
    });

    if (!thread) {
-      throw new Error('Failed to create thread for task execution');
+      throw new Error('Failed to create thread for agent execution');
    }

-    log('execSubAgent: created thread %s', thread.id);
+    log('%s: created thread %s', options.logScope, thread.id);

    // 2. Update Thread status to processing with startedAt timestamp
    const startedAt = new Date().toISOString();
@@ -2926,14 +2940,19 @@ export class AiAgentService {
      status: ThreadStatus.Processing,
    });

-    // 3. Create hooks for updating Thread metadata and task message
-    const threadHooks = this.createThreadHooks(thread.id, startedAt, parentMessageId);
-    // For the deferred-tool path, also register the completion bridge that
+    // 3. Create hooks for updating Thread metadata and source message
+    const threadHooks = this.createThreadHooks(
+      thread.id,
+      startedAt,
+      parentMessageId,
+      options.logScope,
+    );
+    // For the virtual sub-agent path, also register the completion bridge that
    // backfills the parent's placeholder tool message and resumes the parked
-    // parent op once the whole batch is done. Registered last so its
-    // tool-message backfill (content + pluginState) is the final write.
+    // parent op once the child run is done. Registered last so its tool-message
+    // backfill (content + pluginState) is the final write.
    const hooks =
-      resumeParentOnComplete && parentOperationId
+      options.resumeParentOnComplete && parentOperationId
        ? [
            ...threadHooks,
            this.createSubAgentBridgeHook(parentOperationId, parentMessageId, thread.id),
@@ -2953,16 +2972,23 @@ export class AiAgentService {
        ).findById(parentOperationId);
        inheritedTrigger = parentOp?.trigger ?? undefined;
      } catch (error) {
-        log('execSubAgent: failed to read parent operation trigger: %O', error);
+        log('%s: failed to read parent operation trigger: %O', options.logScope, error);
      }
    }

+    const appContext: NonNullable<InternalExecAgentParams['appContext']> = {
+      groupId,
+      isSubAgent: options.isSubAgent,
+      threadId: thread.id,
+      topicId,
+    };
+
    // 4. Delegate to execAgent with threadId in appContext and hooks
    // The instruction will be created as user message in the Thread
-    // Use headless mode to skip human approval in async task execution
+    // Use headless mode to skip human approval in async agent execution
    const result = await this.execAgent({
      agentId,
-      appContext: { groupId, threadId: thread.id, topicId },
+      appContext,
      autoStart: true,
      hooks,
      parentOperationId,
@@ -2972,7 +2998,8 @@ export class AiAgentService {
    });

    log(
-      'execSubAgent: delegated to execAgent, operationId=%s, success=%s',
+      '%s: delegated to execAgent, operationId=%s, success=%s',
+      options.logScope,
      result.operationId,
      result.success,
    );
@@ -3028,7 +3055,7 @@ export class AiAgentService {
      success: result.success ?? false,
      threadId: thread.id,
    };
-  };
+  }

  /**
   * Create step lifecycle callbacks for updating Thread metadata
@@ -3036,12 +3063,13 @@ export class AiAgentService {
   *
   * @param threadId - The Thread ID to update
   * @param startedAt - The start time ISO string
-   * @param sourceMessageId - The task message ID (sourceMessageId from Thread) to update with summary
+   * @param sourceMessageId - The source message ID from Thread to update with summary
   */
  private createThreadMetadataCallbacks(
    threadId: string,
    startedAt: string,
    sourceMessageId: string,
+    logScope: 'execSubAgent' | 'execVirtualSubAgent' = 'execSubAgent',
  ): StepLifecycleCallbacks {
    // Accumulator for tracking metrics across steps
    let accumulatedToolCalls = 0;
@@ -3067,9 +3095,9 @@ export class AiAgentService {
              totalToolCalls: accumulatedToolCalls,
            },
          });
-          log('execSubAgent: updated thread %s metadata after step %d', threadId, state.stepCount);
+          log('%s: updated thread %s metadata after step %d', logScope, threadId, state.stepCount);
        } catch (error) {
-          log('execSubAgent: failed to update thread metadata: %O', error);
+          log('%s: failed to update thread metadata: %O', logScope, error);
        }
      },

@@ -3101,13 +3129,13 @@ export class AiAgentService {
          }
        }

-        // Log error when task fails
+        // Log error when the isolated run fails
        if (reason === 'error' && finalState.error) {
-          console.error('execSubAgent: task failed for thread %s:', threadId, finalState.error);
+          console.error('%s: run failed for thread %s:', logScope, threadId, finalState.error);
        }

        try {
-          // Extract summary from last assistant message and update task message content
+          // Extract summary from last assistant message and update source message content
          const lastAssistantMessage = finalState.messages
            ?.slice()
            .reverse()
@@ -3117,7 +3145,7 @@ export class AiAgentService {
            await this.messageModel.update(sourceMessageId, {
              content: lastAssistantMessage.content,
            });
-            log('execSubAgent: updated task message %s with summary', sourceMessageId);
+            log('%s: updated source message %s with summary', logScope, sourceMessageId);
          }

          // Format error for proper serialization (Error objects don't serialize with JSON.stringify)
@@ -3140,13 +3168,14 @@ export class AiAgentService {
          });

          log(
-            'execSubAgent: thread %s completed with status %s, reason: %s',
+            '%s: thread %s completed with status %s, reason: %s',
+            logScope,
            threadId,
            status,
            reason,
          );
        } catch (error) {
-          console.error('execSubAgent: failed to update thread on completion: %O', error);
+          console.error('%s: failed to update thread on completion: %O', logScope, error);
        }
      },
    };
@@ -3160,6 +3189,7 @@ export class AiAgentService {
    threadId: string,
    startedAt: string,
    sourceMessageId: string,
+    logScope: 'execSubAgent' | 'execVirtualSubAgent',
  ): AgentHook[] {
    let accumulatedToolCalls = 0;

@@ -3186,7 +3216,7 @@ export class AiAgentService {
              },
            });
          } catch (error) {
-            log('Thread hook afterStep: failed to update metadata: %O', error);
+            log('%s: thread hook afterStep failed to update metadata: %O', logScope, error);
          }
        },
        id: 'thread-metadata-update',
@@ -3226,14 +3256,15 @@ export class AiAgentService {

          if (event.reason === 'error' && finalState.error) {
            console.error(
-              'Thread hook onComplete: task failed for thread %s:',
+              '%s: thread hook onComplete run failed for thread %s:',
+              logScope,
              threadId,
              finalState.error,
            );
          }

          try {
-            // Update task message with summary
+            // Update source message with summary
            const lastAssistantMessage = finalState.messages
              ?.slice()
              .reverse()
@@ -3263,13 +3294,14 @@ export class AiAgentService {
            });

            log(
-              'Thread hook onComplete: thread %s status=%s reason=%s',
+              '%s: thread hook onComplete thread %s status=%s reason=%s',
+              logScope,
              threadId,
              status,
              event.reason,
            );
          } catch (error) {
-            console.error('Thread hook onComplete: failed to update: %O', error);
+            console.error('%s: thread hook onComplete failed to update: %O', logScope, error);
          }
        },
        id: 'thread-completion',
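
Note on hook ordering: the comment in `execAgentThreadRun` says the bridge hook is registered last so that its backfill is the final write. A minimal sketch of that idea follows, assuming hooks run one at a time in array order. The `CompletionHook` shape, `runCompletionHooks`, `composeHooks`, the `'sub-agent-bridge'` id, and the message store are hypothetical stand-ins, not the real `AgentHook` API, and the hooks are synchronous here only to keep the sketch short.

```typescript
// Hypothetical shapes for illustration; the real AgentHook type is not shown in this diff.
interface CompletionHook {
  id: string;
  onComplete: (messages: Record<string, string>) => void;
}

// Assumption: the runtime invokes hooks sequentially, in registration order.
function runCompletionHooks(hooks: CompletionHook[], messages: Record<string, string>): void {
  for (const hook of hooks) {
    hook.onComplete(messages);
  }
}

const threadHooks: CompletionHook[] = [
  {
    id: 'thread-completion',
    onComplete: (messages) => {
      messages['parent-message'] = 'summary from thread hook';
    },
  },
];

const bridgeHook: CompletionHook = {
  id: 'sub-agent-bridge', // hypothetical id
  onComplete: (messages) => {
    messages['parent-message'] = 'backfill from bridge hook';
  },
};

// Mirrors the conditional in the diff: the bridge is appended only when the
// caller asked to resume the parent and a parent operation id is present.
function composeHooks(resumeParentOnComplete: boolean, parentOperationId?: string): CompletionHook[] {
  return resumeParentOnComplete && parentOperationId ? [...threadHooks, bridgeHook] : threadHooks;
}

const messages: Record<string, string> = {};
runCompletionHooks(composeHooks(true, 'op-1'), messages);
const finalContent = messages['parent-message'];

const hookIdsWithoutParent = composeHooks(true, undefined).map((hook) => hook.id);
```

Because the bridge hook is last in the array, its write to the shared message wins. Without a `parentOperationId` the bridge is not appended, even when `resumeParentOnComplete` is set.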
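
Note on the arrow-field comment: `execSubAgent` and `execVirtualSubAgent` are declared as arrow fields so they stay bound when handed to `AgentRuntimeService`. The sketch below illustrates the general TypeScript behaviour behind that choice. All class and function names are hypothetical, and `invokeDetached` only stands in for a runtime that stores a callback and calls it later without its owner.

```typescript
// Hypothetical names throughout; this is not the AiAgentService code.
type ForkCallback = (instruction: string) => string;

class MethodService {
  private readonly scope = 'method';

  // Prototype method: `this` depends on how the function is called.
  run(instruction: string): string {
    return `${this.scope}:${instruction}`;
  }
}

class ArrowFieldService {
  private readonly scope = 'arrow';

  // Arrow field: `this` is captured lexically when the instance is constructed.
  run = (instruction: string): string => `${this.scope}:${instruction}`;
}

// Stand-in for a runtime that calls the stored callback detached from its owner.
function invokeDetached(callback: ForkCallback, instruction: string): string {
  try {
    return callback(instruction);
  } catch (error) {
    return `failed: ${(error as Error).name}`;
  }
}

const methodService = new MethodService();
const arrowService = new ArrowFieldService();

// Detached prototype method: `this` is lost, so the instance's scope is not used.
const detachedMethodResult = invokeDetached(methodService.run, 'ping');

// Explicit bind restores the receiver, at the cost of remembering to do it.
const boundMethodResult = invokeDetached(methodService.run.bind(methodService), 'ping');

// Arrow field: no bind needed.
const arrowFieldResult = invokeDetached(arrowService.run, 'ping');
```

The trade-off is that an arrow field is created per instance rather than shared on the prototype, which is usually acceptable for a service object that is constructed once per request.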