mirror of
https://github.com/lobehub/lobe-chat.git
synced 2026-06-14 03:30:19 +00:00
Compare commits
102 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| f47e65d215 | |||
| 6dcbd387f7 | |||
| fa58fd12a0 | |||
| 913ee4210d | |||
| 99411041b9 | |||
| 39bce329fd | |||
| 55a969a3c1 | |||
| f51dd06a36 | |||
| 24e34c7545 | |||
| 81d40b90d4 | |||
| 9cde29fb14 | |||
| ebe8411e7e | |||
| 381e87474c | |||
| 09fd6f3411 | |||
| d9d9f44cb2 | |||
| 1244a40950 | |||
| a48c2badd9 | |||
| 3f3f12dbd2 | |||
| 99023811d8 | |||
| 480a2979e1 | |||
| 531900cf70 | |||
| c9325794e5 | |||
| 4a11ed9887 | |||
| be7b759820 | |||
| fa76928f62 | |||
| f6db1361ee | |||
| 5d6eaf53f3 | |||
| c4e4469083 | |||
| 800b534741 | |||
| 03b9d07d0b | |||
| f60d1fe8dd | |||
| e5a27dc97c | |||
| c7e0c83174 | |||
| ab958a0b98 | |||
| 5362be4078 | |||
| 6887930428 | |||
| da94942d9c | |||
| a9141c8ade | |||
| 8ab5ec5364 | |||
| 222534dbe1 | |||
| f31c94490d | |||
| 52eaf2702e | |||
| ce81ea44bf | |||
| 29974d3ab9 | |||
| f4c431b028 | |||
| 34fbd9ffd3 | |||
| 09b5e926bf | |||
| d3e8e7cb65 | |||
| 60bed5782f | |||
| 35b6bc55b8 | |||
| 365dd1ff64 | |||
| 7633c0e83f | |||
| 87b1f39c0f | |||
| ca91d2d756 | |||
| 61586b9377 | |||
| eca449e4e2 | |||
| 6c8976b641 | |||
| 60d9d3c3c7 | |||
| 2dd4cf7a1d | |||
| 575ef1e8ee | |||
| ba6976c063 | |||
| bfdfd3bca3 | |||
| f6c23e3654 | |||
| 813d756b9c | |||
| 671bc26e0d | |||
| 309c25cb44 | |||
| a810bf3dcd | |||
| 7d6be512b8 | |||
| 1130f7df32 | |||
| e20496e444 | |||
| dbc8d76c8d | |||
| ecfdac5395 | |||
| 5f4bec347b | |||
| 77e4d0492b | |||
| a60d11df48 | |||
| 14501ea69a | |||
| b76992e581 | |||
| 97e4e345d1 | |||
| c609a60f0e | |||
| 06bf82f3e0 | |||
| 3ccc23152c | |||
| 3a780a62f6 | |||
| e98ad7edca | |||
| 686778fe51 | |||
| 914976a52f | |||
| fdd955404d | |||
| 6d47c1d07e | |||
| c65cf8c2a0 | |||
| 981c57d6f9 | |||
| 87eba86514 | |||
| 09e6f02e45 | |||
| a2ea314cd8 | |||
| e2be720726 | |||
| 8b6905ec7e | |||
| e4830943cf | |||
| 5dfb6fc288 | |||
| 94ea3f6a34 | |||
| b8339abc76 | |||
| c037609b8b | |||
| b8b37cffa3 | |||
| e8e4b2e822 | |||
| 248a4dcab5 |
@@ -0,0 +1,376 @@
|
||||
---
|
||||
name: agent-testing
|
||||
description: >
|
||||
Agentic end-to-end testing for LobeHub: backend verification via the CLI,
|
||||
frontend verification via agent-browser (Electron), full-stack verification in
|
||||
the browser, and bot-channel verification via osascript. Local-first today,
|
||||
designed to extend to cloud automation. Triggers on 'cli test', 'test with cli',
|
||||
'verify with cli', 'backend test with cli', 'local test', 'test in electron',
|
||||
'test desktop', 'test bot', 'bot test', 'test in discord', 'test in telegram',
|
||||
'test in slack', 'test in wechat', 'test in weixin', 'test in lark', 'test in feishu',
|
||||
'test in qq', 'manual test', 'osascript', 'test report', or any local
|
||||
end-to-end verification task.
|
||||
---
|
||||
|
||||
# Agent Testing (Agentic End-to-End Verification)
|
||||
|
||||
One skill for all agentic end-to-end testing — local-first today, designed to
|
||||
also run as full cloud automation. Every test session follows the same
|
||||
five-step contract:
|
||||
|
||||
```
|
||||
Step -1: Plan approval → Step 0: Env + Auth → Step 1: Pick surface → Step 2: Run → Step 3: Structured report
|
||||
```
|
||||
|
||||
## Step -1 — Plan approval for non-trivial tests
|
||||
|
||||
Skip directly to Step 0 if: the test is a single re-run after a fix, the plan
|
||||
was already agreed on, or the user gave exact commands.
|
||||
|
||||
Otherwise, propose a test plan (surface, cases, expected evidence, assumptions)
|
||||
and use the runtime structured question tool (`request_user_input` /
|
||||
ask-user-question equivalent) with two fixed choices:
|
||||
|
||||
1. `开始执行 (Recommended)` (Start now): the test plan looks fine, so start executing
2. `先讨论下` (Discuss first): the plan has problems, so discuss it first
|
||||
|
||||
Wait for the user's choice before proceeding.
|
||||
|
||||
## Step 0 — Environment setup + auth check (mandatory)
|
||||
|
||||
Step 0 is about getting the environment ready: **dependencies are healthy**
|
||||
and **auth is green**. A test run that dies halfway on a missing dependency or
|
||||
a login wall wastes the whole session — clear both gates BEFORE writing a
|
||||
single test step.
|
||||
|
||||
### 0.0 Resolve the current test environment
|
||||
|
||||
Before starting a dev server, checking auth, opening agent-browser, or writing
|
||||
test steps, print and confirm the current local test environment:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/test-env.sh
|
||||
```
|
||||
|
||||
This command is the source of truth for local test ports. It reads the current
|
||||
shell plus `.env` files using the same precedence as `scripts/runWithEnv.mts`,
|
||||
then prints:
|
||||
|
||||
- `APP_URL`
|
||||
- `PORT`
|
||||
- `SERVER_URL`
|
||||
- `AUTH_TRUSTED_ORIGINS`
|
||||
- `SPA_PORT`
|
||||
- `MOBILE_SPA_PORT`
|
||||
- `DESKTOP_PORT`
|
||||
|
||||
For commands that need these values, export them from the same resolver:
|
||||
|
||||
```bash
|
||||
eval "$(./.agents/skills/agent-testing/scripts/test-env.sh --exports)"
|
||||
```
|
||||
|
||||
Do not rely on hard-coded port tables. If the printed values do not match the
|
||||
running dev server, fix/export the env first, then continue.
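
A minimal sketch of that mismatch check, assuming `APP_URL` has the form `http://host:port` and that `PORT` is the port the dev server listens on. The helper names `port_from_url` and `env_ports_match` are hypothetical and are not part of the skill's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract the port from a URL like http://localhost:3010.
# Prints nothing when the URL carries no explicit port.
port_from_url() {
  local hostport="${1#*://}"   # drop the scheme
  hostport="${hostport%%/*}"   # drop any path
  case "$hostport" in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *) ;;
  esac
}

# Succeeds only when the port embedded in the URL ($1) equals the port ($2).
env_ports_match() {
  [ -n "$2" ] && [ "$(port_from_url "$1")" = "$2" ]
}
```

After loading the exports, a call such as `env_ports_match "$APP_URL" "$PORT" || echo "APP_URL and PORT disagree" >&2` would flag a mismatch before any test step runs.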
|
||||
|
||||
### 0.1 Dependencies are installed — root AND standalone apps
|
||||
|
||||
The root pnpm workspace does **NOT** cover every app: `pnpm-workspace.yaml`
|
||||
lists `packages/**`, `e2e`, `apps/server`, and only `apps/desktop/src/main` —
|
||||
**`apps/desktop` and `apps/cli` are standalone**, each keeping its own
|
||||
`node_modules` with its own links into `packages/`. A root install does not
|
||||
refresh them, so install in every app the test will touch:
|
||||
|
||||
```bash
|
||||
pnpm install # root workspace
|
||||
(cd apps/desktop && pnpm install) # Electron surface; the subshell keeps cwd at the repo root
(cd apps/cli && pnpm install) # CLI surface
|
||||
```
|
||||
|
||||
Symptom of a stale standalone install: the build/launch fails to resolve a
|
||||
recently added workspace package — `Rolldown failed to resolve import
|
||||
"@lobechat/<pkg>"` (Electron) or `Cannot find module '@lobechat/<pkg>'` (CLI).
|
||||
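
One way to catch a missing standalone install before launching, sketched under the assumption that an absent `node_modules` directory is the signal worth checking (a stale but present install would not be detected this way). `missing_installs` is a hypothetical helper, not one of the skill's scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each app directory under the given repo root
# (default: current directory) that has no node_modules directory yet.
missing_installs() {
  local root="${1:-.}" dir
  for dir in . apps/desktop apps/cli; do
    [ -d "$root/$dir/node_modules" ] || printf '%s\n' "$dir"
  done
}
```

Empty output means all three locations have been installed at least once; any printed path still needs its own `pnpm install`.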
|
||||
### 0.2 Run scripts from the repo root
|
||||
|
||||
All paths in this skill (`./.agents/skills/agent-testing/...`) are
|
||||
repo-root-relative, and background commands inherit the current working
|
||||
directory — a script launched while `cwd` is `apps/desktop` fails with
|
||||
`No such file or directory`. Verify `pwd` is the repo root before launching
|
||||
long-running scripts.
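
The `pwd` check can be turned into a guard. This is a sketch that treats the presence of the skill folder as the repo-root marker, which is an assumption; `at_repo_root` is a hypothetical name.

```bash
#!/usr/bin/env bash
# Hypothetical guard: succeed only when the given directory (default: cwd)
# contains the skill folder that every repo-root-relative path starts from.
at_repo_root() {
  [ -d "${1:-.}/.agents/skills/agent-testing" ]
}
```

A launcher could start with `at_repo_root || { echo "run from the repo root (pwd: $PWD)" >&2; exit 1; }` so a wrong working directory fails fast instead of surfacing later as `No such file or directory`.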
|
||||
|
||||
### 0.3 Init local dev env without `.env`
|
||||
|
||||
For Web smoke against local code, start a **normal local dev environment**.
|
||||
First check the repo root for `.env`:
|
||||
|
||||
- If `.env` exists, use the existing local configuration and start the dev
|
||||
server normally.
|
||||
- If `.env` does not exist, use the agent-testing env bootstrap.
|
||||
|
||||
Do not start the standalone e2e server as the product under test.
|
||||
|
||||
Use `scripts/init-dev-env.sh`. It follows the e2e setup pattern — Postgres,
|
||||
migrations, auth/key-vault/S3 test env, seed user — but it is owned by this
|
||||
skill and starts the repo's dev server (`pnpm run dev:next` / `bun run dev`),
|
||||
not `e2e/scripts/setup.ts --start`. The script hard-blocks when root `.env`
|
||||
exists, so it cannot accidentally override a user's local config. When `.env`
|
||||
exists, do not call any `init-dev-env.sh` subcommand.
|
||||
|
||||
Decision flow:
|
||||
|
||||
```bash
|
||||
if [[ -f .env ]]; then
  bun run dev
else
  ./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
  ./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
  ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
fi
|
||||
```
|
||||
|
||||
Bootstrap flow when no `.env` exists:
|
||||
|
||||
```bash
|
||||
# From repo root. Managed DB flow requires Docker Desktop.
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh setup-db
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
|
||||
```
|
||||
|
||||
If using an existing Postgres instead of the managed Docker DB, set
|
||||
`DATABASE_URL` and skip `setup-db`:
|
||||
|
||||
```bash
|
||||
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate
|
||||
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
DATABASE_URL=postgresql://... ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
|
||||
```
|
||||
|
||||
Web smoke needs the full-stack `dev` command so that Next can proxy the SPA HTML
from Vite. For backend-only checks, `dev-next` is enough:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
|
||||
```
|
||||
|
||||
Useful subcommands:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh env # print exports
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh write # write .records/env/agent-testing-dev.env
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh migrate # migrations only
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user # seed user + CLI API key
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash # local QStash for workflow paths
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh clean-db # remove managed DB container
|
||||
```
|
||||
|
||||
Default script env:
|
||||
|
||||
- `APP_URL=http://localhost:3010`
|
||||
- `DATABASE_URL=postgresql://postgres:postgres@localhost:5433/postgres`
|
||||
- `DATABASE_DRIVER=node`
|
||||
- `FEATURE_FLAGS=-agent_self_iteration` so local smoke does not require QStash
|
||||
- Local QStash defaults (`QSTASH_URL`, `QSTASH_TOKEN`, signing keys) are exported;
|
||||
run `init-dev-env.sh qstash` in a separate terminal when the path under test
|
||||
triggers QStash/Workflow.
|
||||
- `KEY_VAULTS_SECRET`, `AUTH_SECRET`, auth verification off
|
||||
- S3 mock vars
|
||||
- Managed DB container: `lobehub-agent-testing-postgres`
|
||||
|
||||
`seed-user` creates `agent-testing@lobehub.com` / `TestPassword123!` with
|
||||
onboarding already completed, plus a local API key in
|
||||
`.records/env/agent-testing-cli.env` for CLI automation. When running Cucumber
|
||||
against this dev server, pass the same script env into the test process too;
|
||||
Cucumber has its own `BeforeAll` seed path and it must see `DATABASE_URL`
|
||||
instead of silently skipping setup:
|
||||
|
||||
```bash
|
||||
cd e2e
|
||||
# Only in the no-.env branch.
|
||||
eval "$(../.agents/skills/agent-testing/scripts/init-dev-env.sh env)"
|
||||
BASE_URL=http://localhost:3010 HEADLESS=true bun run test:smoke
|
||||
```
|
||||
|
||||
### 0.4 Auth is green for the selected surface
|
||||
|
||||
**Auth is the gate for automated testing, but the gate is surface-scoped.**
|
||||
Pick the intended surface first when it is already clear from the task, then
|
||||
check only that surface. Do not block a Web test on CLI device-code auth or an
|
||||
Electron login state unless the test spans those surfaces.
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh status --surface web
|
||||
```
|
||||
|
||||
Use `status` with no `--surface` only for cross-surface test plans.
|
||||
|
||||
| Surface | Mechanism | One-key path | Standard check |
| -------- | --------------------------------------------- | ------------------------ | ----------------------------------------- |
| CLI | Seeded API key, device-code fallback | `setup-auth.sh cli-seed` | `setup-auth.sh status --surface cli` |
| Web | Seeded better-auth login into `agent-browser` | `setup-auth.sh web-seed` | `setup-auth.sh status --surface web` |
| Electron | App's own persistent login state | Log in once in the app | `setup-auth.sh status --surface electron` |
| Bot | Native apps already logged in | n/a | per-platform screenshot |
|
||||
|
||||
Login-state checks are standardized — do NOT hand-roll `window.__LOBE_STORES`
|
||||
eval snippets; use `scripts/app-probe.sh auth` (returns `{ isSignedIn, userId }`,
|
||||
works for Electron CDP and web sessions via `AB_TARGET`).
|
||||
|
||||
For Web tests, the test surface is always `agent-browser --session lobehub-dev`.
|
||||
Use `setup-auth.sh web-seed` first in the seeded local env. The user's normal
|
||||
Chrome is only a source for copying the Cookie header when seed auth is not
|
||||
available or `status --surface web` still fails. If Chrome is already logged in,
|
||||
do not open a login page; verify agent-browser first, then request the Network
|
||||
`Cookie:` header only if that verification fails. Full background and failure modes:
|
||||
[references/auth.md](./references/auth.md).
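
A sketch of consuming the standardized probe in a script. It assumes the probe prints JSON with a quoted `isSignedIn` key, which the source does not confirm; `is_signed_in` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read probe output on stdin and succeed only when it
# contains "isSignedIn": true. Assumes JSON output with quoted keys.
is_signed_in() {
  grep -Eq '"isSignedIn"[[:space:]]*:[[:space:]]*true'
}
```

Piping the probe into it, for example `./.agents/skills/agent-testing/scripts/app-probe.sh auth | is_signed_in || echo "not signed in" >&2`, keeps the login gate a one-liner without hand-rolled store evaluation.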
|
||||
|
||||
## Step 1 — Pick the surface by change scope
|
||||
|
||||
| Change scope | Default surface | Why | Guide |
| ------------------------------------------------------- | ------------------------------------ | ----------------------------------------------------------------- | ---------------------------------- |
| **Backend** (TRPC router / service / model / migration) | **CLI** | Fastest loop, text-assertable output, zero UI flakiness | [cli/index.md](./cli/index.md) |
| **Pure frontend** (components, store, styles, UX) | **Electron** (agent-browser + CDP) | Primary product shape; `__LOBE_STORES` state introspection | [ui/electron.md](./ui/electron.md) |
| **Full-stack** (new API + UI consuming it) | **Web** (browser + local dev server) | One surface where network requests and UI are observable together | [ui/web.md](./ui/web.md) |
| **Bot channels** (Discord / WeChat / Lark / …) | Native app via osascript / bridge | Only way to exercise the real channel end-to-end | `bot/<platform>/index.md` |
|
||||
|
||||
Escalate, don't duplicate: verify a backend change with the CLI first; only add
|
||||
a UI pass when the change actually affects the UI.
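
The table above can be read as a lookup. The sketch below encodes it as a function; the scope keywords (`backend`, `frontend`, `fullstack`, `bot`) and the name `default_surface` are hypothetical labels chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map a change scope to the default surface from the table.
default_surface() {
  case "$1" in
    backend)   echo cli ;;
    frontend)  echo electron ;;
    fullstack) echo web ;;
    bot)       echo bot ;;
    *)         echo "unknown scope: $1" >&2; return 1 ;;
  esac
}
```

Returning a non-zero status for unknown scopes forces the caller to decide explicitly rather than silently falling back to a UI surface.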
|
||||
|
||||
### Environment support (local macOS vs cloud Linux)
|
||||
|
||||
The decisive constraint per surface is **how evidence (screenshots) is
|
||||
captured**: CDP-based capture (`agent-browser screenshot`) renders from the
|
||||
browser engine and needs no real display; OS-level capture (`screencapture`,
|
||||
osascript) is macOS-only.
|
||||
|
||||
| Surface | macOS (local) | Linux / cloud (headless) | Screenshot mechanism |
| -------- | ------------- | --------------------------------------------------------- | ------------------------------------------------------ |
| CLI | ✅ | ✅ | n/a (text output) |
| Web | ✅ | ✅ headless Chromium works natively | CDP (no display needed) |
| Electron | ✅ | ⚠️ runs, but needs a display server: wrap with `xvfb-run` | CDP works under Xvfb; `capture-app-window.sh` does NOT |
| Bot | ✅ | ❌ osascript + native apps are macOS-only | macOS `screencapture` only |
|
||||
|
||||
When a test must stay cloud-portable, prefer CDP-based evidence over
|
||||
OS-level capture wherever both exist.
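
As an illustration of the Electron row, the sketch below picks a launch prefix from the OS name and the `DISPLAY` value. It assumes that an empty `DISPLAY` on Linux means no display server is available; `electron_launch_prefix` is a hypothetical helper, not part of `electron-dev.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the command prefix for launching Electron.
# Args: OS name (default: `uname -s`) and DISPLAY value (default: $DISPLAY).
electron_launch_prefix() {
  local os="${1:-$(uname -s)}" display="${2-${DISPLAY:-}}"
  if [ "$os" = "Linux" ] && [ -z "$display" ]; then
    echo "xvfb-run -a"   # -a picks a free X server number automatically
  fi
}
```

On macOS, or on Linux with a display already set, the function prints nothing, so the launch command runs unwrapped.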
|
||||
|
||||
### Bot platforms
|
||||
|
||||
| Platform | Guide | Quick switcher |
| ------------- | ------------------------------------------------ | --------------------- |
| Discord | [bot/discord/index.md](./bot/discord/index.md) | `Cmd+K` |
| Slack | [bot/slack/index.md](./bot/slack/index.md) | `Cmd+K` |
| Telegram | [bot/telegram/index.md](./bot/telegram/index.md) | `Cmd+F` |
| WeChat / 微信 | [bot/wechat/index.md](./bot/wechat/index.md) | `Cmd+F` |
| Lark / 飞书 | [bot/lark/index.md](./bot/lark/index.md) | `Cmd+K` |
| QQ | [bot/qq/index.md](./bot/qq/index.md) | `Cmd+F` |
| iMessage | [bot/imessage/index.md](./bot/imessage/index.md) | bridge (no osascript) |
|
||||
|
||||
Each platform folder contains an `index.md` (activation, navigation,
|
||||
send-message, verification snippets) and a `test-<platform>-bot.sh` script
|
||||
sharing the interface:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/bot/<platform>/test-<platform>-bot.sh <channel_or_contact> <message> [wait_seconds] [screenshot_path]
|
||||
```
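
A sketch of how a script might parse that shared positional interface. The default wait time and screenshot path are assumptions made for illustration, and `parse_bot_args` is a hypothetical function; the real scripts may differ.

```bash
#!/usr/bin/env bash
# Sketch of the shared positional interface; default values are assumptions.
parse_bot_args() {
  if [ "$#" -lt 2 ]; then
    echo "usage: <channel_or_contact> <message> [wait_seconds] [screenshot_path]" >&2
    return 2
  fi
  TARGET="$1"
  MESSAGE="$2"
  WAIT="${3:-10}"                              # assumed default wait in seconds
  SCREENSHOT="${4:-/tmp/bot-test-result.png}"  # assumed default output path
}
```

Keeping the two optional arguments last lets every platform script be called with only a target and a message for quick checks.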
|
||||
|
||||
New to osascript automation? Read
|
||||
[references/osascript.md](./references/osascript.md) first — it is a general
|
||||
macOS-automation asset (activate, type, paste, screenshot, accessibility reads,
|
||||
gotchas), not bot-specific.
|
||||
|
||||
## Step 2 — Run
|
||||
|
||||
Surface guides above carry the detailed workflows. Shared infrastructure:
|
||||
|
||||
| Need | Where |
| ------------------------------------ | -------------------------------------------------------------------- |
| Start / restart the local dev server | [references/dev-server.md](./references/dev-server.md) |
| `agent-browser` command reference | [references/agent-browser.md](./references/agent-browser.md) |
| osascript patterns (general macOS) | [references/osascript.md](./references/osascript.md) |
| Agent gateway probing | [references/agent-gateway.md](./references/agent-gateway.md) |
| Screen recording | [references/record-app-screen.md](./references/record-app-screen.md) |
|
||||
|
||||
### Scripts
|
||||
|
||||
All under `.agents/skills/agent-testing/scripts/`:
|
||||
|
||||
| Script | Usage |
| ------------------------- | ---------------------------------------------------------------------------- |
| `test-env.sh` | Print/export the resolved local test env and ports |
| `setup-auth.sh` | One-stop auth setup & status check (`status` / `cli` / `web`) |
| `init-dev-env.sh` | Self-contained local dev env (`setup-db` / `seed-user` / `dev-next` / `dev`) |
| `app-probe.sh` | LobeHub app probes: `auth` / `route` / `ops` / `goto <path>` / `errors` |
| `record-gif.sh` | Frame-sequence → GIF for time-based behavior (streaming, timers, animations) |
| `report-init.sh` | Scaffold a structured test report (Step 3) |
| `electron-dev.sh` | Manage Electron dev env (start/stop/status/restart, CDP 9222) |
| `capture-app-window.sh` | Screenshot a specific app window (general; used by bot tests) |
| `record-app-screen.sh` | Record app screen (video + periodic screenshots) |
| `record-electron-demo.sh` | Record Electron app demo with ffmpeg |
| `agent-gateway/` | Gateway probe / dump / analyze tools |
|
||||
|
||||
`app-probe.sh` is the LobeHub-specific fast path into app state — auth check,
|
||||
current route, running operations, and `goto <path>` quick navigation
|
||||
(`/agent/<agentId>/<topicId>`, `/task/<taskId>`, `/settings`, …) so a test can
|
||||
jump straight to the state under test instead of clicking through the UI. See
|
||||
[ui/electron.md](./ui/electron.md#lobehub-probes--quick-navigation) for usage.
|
||||
|
||||
## Step 3 — Structured report (mandatory deliverable)
|
||||
|
||||
Every automated test session ends with a structured, evidence-backed report —
|
||||
not a chat-only summary. Scaffold it up front and fill it as you test:
|
||||
|
||||
```bash
|
||||
DIR=$(./.agents/skills/agent-testing/scripts/report-init.sh my-feature "Verify my feature")
|
||||
# ... test, saving screenshots / CLI transcripts into $DIR/assets/ ...
|
||||
# fill $DIR/report.md (scope, case table with inline evidence, verdict, score) and $DIR/result.json
|
||||
```
|
||||
|
||||
Reports live in `.records/reports/<timestamp>-<slug>/` (gitignored): `report.md`
|
||||
(human-readable, with screenshots/GIFs embedded directly in the case table),
|
||||
`result.json` (machine-readable pass/fail + score), `assets/` (evidence).
|
||||
Format spec and evidence rules:
|
||||
[references/report.md](./references/report.md).
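
Before closing a session, the three expected entries can be checked mechanically. The sketch below only verifies that they exist and says nothing about their content; `report_missing` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: list the expected report entries that are missing
# from the given report directory. Empty output means all three exist.
report_missing() {
  local dir="$1"
  [ -f "$dir/report.md" ]   || echo "report.md"
  [ -f "$dir/result.json" ] || echo "result.json"
  [ -d "$dir/assets" ]      || echo "assets/"
}
```

With `$DIR` set by `report-init.sh`, running `report_missing "$DIR"` at the end of a run gives a quick completeness check.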
|
||||
|
||||
Five hard rules worth front-loading:
|
||||
|
||||
- **Report language = the user's conversation language.** Write the ENTIRE
|
||||
`report.md` (headings included) in the language the user is conversing in —
|
||||
no mixed English. `result.json` keys/status values stay English.
|
||||
- **The case table is the main reading surface.** Prefer the compact
|
||||
`# | case | result | key observation | evidence` shape and embed the
|
||||
screenshot/GIF in the evidence cell. Use separate evidence sections only for
|
||||
long CLI transcripts, HAR summaries, or supplemental detail.
|
||||
- **Visual evidence must render inline.** Screenshots and GIFs in `report.md`
|
||||
must use Markdown image syntax like `![observed outcome](assets/<file>.png)`. Do not
|
||||
use bare file paths, Markdown links, or local file links as the primary
|
||||
visual evidence; those make the report unreadable without opening each asset.
|
||||
- **Final replies must include visual evidence links.** When a run includes UI
|
||||
screenshots or GIFs, include the report directory and the most important
|
||||
visual artifacts in the final chat response. Each item must include a stable
|
||||
label, an evidence caption describing the observed UI outcome, and a
|
||||
repo-relative path, for example:
|
||||
`[Image #1 - error toast shows provider auth failure](<report-dir>/assets/foo.png)`.
|
||||
Use repo-relative paths, not absolute paths.
|
||||
- **Time-based behavior needs a GIF, not a screenshot.** If a case asserts
|
||||
change over time (streaming output, a ticking timer, loading states,
|
||||
animations), record it with `scripts/record-gif.sh` and embed the GIF —
|
||||
a static screenshot cannot prove the behavior.
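
The inline-evidence rule can be spot-checked by counting Markdown image references in `report.md`. This is a rough sketch: it counts `![...](...)` occurrences and does not verify that the referenced files exist. `inline_image_count` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count Markdown image references (![alt](path)) in a file.
# Plain links such as [label](path) are deliberately not counted.
inline_image_count() {
  grep -o '!\[[^]]*]([^)]*)' "$1" | wc -l | tr -d ' '
}
```

A count of zero for a run that produced screenshots or GIFs suggests the evidence was linked rather than embedded.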
|
||||
|
||||
## Directory map
|
||||
|
||||
```
|
||||
agent-testing/
|
||||
├── SKILL.md # this router
|
||||
├── cli/index.md # backend verification via the LobeHub CLI
|
||||
├── ui/electron.md # pure-frontend verification in the desktop app
|
||||
├── ui/web.md # full-stack verification in the browser
|
||||
├── bot/<platform>/ # bot-channel verification (osascript / bridge)
|
||||
├── references/ # shared knowledge: auth, dev-server, agent-browser, osascript, report
|
||||
└── scripts/ # setup-auth, report-init, electron-dev, capture, recording, gateway
|
||||
```
|
||||
|
||||
## Gotchas
|
||||
|
||||
- agent-browser: see [references/agent-browser.md](./references/agent-browser.md#gotchas)
|
||||
- Electron: see [ui/electron.md](./ui/electron.md#electron-gotchas)
|
||||
- osascript: see [references/osascript.md](./references/osascript.md#gotchas)
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `Discord` | **Process name:** `Discord`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -92,6 +92,6 @@ echo "Screenshot saved to /tmp/discord-test-result.png"
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "!ping"
-./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
+./.agents/skills/agent-testing/bot/discord/test-discord-bot.sh "bot-testing" "!ping"
+./.agents/skills/agent-testing/bot/discord/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
|
||||
```
|
||||
+1
-1
@@ -60,5 +60,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
+3
-3
@@ -21,7 +21,7 @@ So the test surface is three layers:
|
||||
curl -sS -m4 -o /dev/null -w '%{http_code}\n' \
|
||||
"http://127.0.0.1:1234/api/v1/server/info?password=<PW>" # expect 200
|
||||
```
|
||||
-- **Electron dev running with CDP**: `./.agents/skills/local-testing/scripts/electron-dev.sh start`
+- **Electron dev running with CDP**: `./.agents/skills/agent-testing/scripts/electron-dev.sh start`
|
||||
- The **iMessage Desktop branch** checked out (the `imessageBridge` IPC group
|
||||
and `@lobechat/chat-adapter-imessage` must be compiled into the main bundle).
|
||||
Run `pnpm install --ignore-scripts` at the repo root **and** in `apps/desktop/`
|
||||
@@ -31,7 +31,7 @@ So the test surface is three layers:
|
||||
## Fast path: automated script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/imessage/test-imessage-bridge.sh '<bluebubbles_password>' [bb_url] [cdp_port]
+./.agents/skills/agent-testing/bot/imessage/test-imessage-bridge.sh '<bluebubbles_password>' [bb_url] [cdp_port]
|
||||
```
|
||||
|
||||
Asserts the whole flow and self-cleans (unique `applicationId` per run, removes
|
||||
@@ -136,7 +136,7 @@ Verifies the leg the bridge uses to _reply_: `BlueBubblesApiClient.sendText`
|
||||
→ `POST /api/v1/message/text`. Run the helper against your own number:
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/imessage/send-imessage-test.sh '<bb_password>' '+<E164>' # e.g. +15551234567
+./.agents/skills/agent-testing/bot/imessage/send-imessage-test.sh '<bb_password>' '+<E164>' # e.g. +15551234567
|
||||
```
|
||||
|
||||
**Gotcha that bites everyone:** with `method=apple-script` and a _new_
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `Lark` or `飞书` | **Process name:** `Lark` or `飞书`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -56,6 +56,6 @@ screencapture /tmp/lark-bot-response.png
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "@MyBot hello"
-./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "Help me with this" 30
+./.agents/skills/agent-testing/bot/lark/test-lark-bot.sh "bot-testing" "@MyBot hello"
+./.agents/skills/agent-testing/bot/lark/test-lark-bot.sh "bot-testing" "Help me with this" 30
|
||||
```
|
||||
+1
-1
@@ -80,5 +80,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `QQ` | **Process name:** `QQ`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -57,6 +57,6 @@ screencapture /tmp/qq-bot-response.png
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "bot-testing" "Hello bot" 15
-./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "MyBot" "/help" 10
+./.agents/skills/agent-testing/bot/qq/test-qq-bot.sh "bot-testing" "Hello bot" 15
+./.agents/skills/agent-testing/bot/qq/test-qq-bot.sh "MyBot" "/help" 10
|
||||
```
|
||||
+1
-1
@@ -72,5 +72,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `Slack` | **Process name:** `Slack`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -68,6 +68,6 @@ screencapture /tmp/slack-bot-response.png
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "@mybot hello"
-./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
+./.agents/skills/agent-testing/bot/slack/test-slack-bot.sh "bot-testing" "@mybot hello"
+./.agents/skills/agent-testing/bot/slack/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
|
||||
```
|
||||
+1
-1
@@ -60,5 +60,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `Telegram` | **Process name:** `Telegram`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -75,6 +75,6 @@ curl -s "https://api.telegram.org/bot$TELEGRAM_BOT_TOKEN/getUpdates?limit=5" | j
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "MyTestBot" "/start"
-./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "GPTBot" "Hello" 60
+./.agents/skills/agent-testing/bot/telegram/test-telegram-bot.sh "MyTestBot" "/start"
+./.agents/skills/agent-testing/bot/telegram/test-telegram-bot.sh "GPTBot" "Hello" 60
|
||||
```
|
||||
+1
-1
@@ -75,5 +75,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
+3
-3
@@ -2,7 +2,7 @@
|
||||
|
||||
**App name:** `微信` or `WeChat` | **Process name:** `WeChat`
|
||||
|
||||
-See [osascript-common.md](../osascript-common.md) for shared patterns.
+See [references/osascript.md](../../references/osascript.md) for shared patterns.
|
||||
|
||||
## Activate & Navigate
|
||||
|
||||
@@ -76,6 +76,6 @@ screencapture /tmp/wechat-bot-response.png
|
||||
## Script
|
||||
|
||||
```bash
|
||||
-./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "文件传输助手" "test message" 5
-./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
+./.agents/skills/agent-testing/bot/wechat/test-wechat-bot.sh "文件传输助手" "test message" 5
+./.agents/skills/agent-testing/bot/wechat/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
|
||||
```
|
||||
+1
-1
@@ -81,5 +81,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
|
||||
sleep "$WAIT"
|
||||
|
||||
echo "[$APP] Capturing screenshot..."
-"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
+"$SCRIPT_DIR/../../scripts/capture-app-window.sh" "$APP" "$SCREENSHOT"
|
||||
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
|
||||
@@ -0,0 +1,152 @@
|
||||
# CLI Backend Verification
|
||||
|
||||
Default surface for verifying **backend changes** (TRPC routers, services,
|
||||
models, migrations) end-to-end: fastest loop, text-assertable output, zero UI
|
||||
flakiness.
|
||||
|
||||
## When to use
|
||||
|
||||
- Verifying TRPC router / service / model changes end-to-end
|
||||
- Testing new API fields or response structure changes
|
||||
- Validating CLI command output after backend modifications
|
||||
- Debugging data flow issues between server and CLI
|
||||
|
||||
## Prerequisites
|
||||
|
||||
| Requirement | Details |
| ------------ | ---------------------------------------------------------------------------------------------------------------------------------------------- |
| Dev server | `localhost:3010`; see [../references/dev-server.md](../references/dev-server.md) |
| CLI source | `apps/cli/` runs from source, no rebuild; it has a standalone `node_modules`, so run `pnpm install` inside `apps/cli/` (root install does not cover it) |
| CLI dev mode | `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated settings |
| Auth | Seeded API key first; Device Code Flow only as fallback; see [../references/auth.md](../references/auth.md) |
|
||||
|
||||
All CLI dev commands run from `apps/cli/`. Subsequent examples use `$CLI`:
|
||||
|
||||
```bash
|
||||
source ../../.records/env/agent-testing-cli.env
|
||||
CLI="bun src/index.ts"
|
||||
```
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1 — Server up?
|
||||
|
||||
See [../references/dev-server.md](../references/dev-server.md) for the health
|
||||
check, start, and restart commands. Server-side code changes require a restart.
|
||||
|
||||
### Step 2 — Auth ready?
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh status
|
||||
```
|
||||
|
||||
If the CLI is not ready in the seeded local environment:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
source .records/env/agent-testing-cli.env
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
|
||||
```
|
||||
|
||||
If the target environment is not seeded, use the interactive fallback:
|
||||
|
||||
```bash
|
||||
cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
|
||||
```
|
||||
|
||||
Seeded API-key auth does not store credentials. It writes local settings under
|
||||
`$HOME/.lobehub-dev` and requires the generated env file to be sourced before
|
||||
CLI commands. Details:
|
||||
[../references/auth.md](../references/auth.md).
|
||||
|
||||
### Step 3 — Test with CLI commands
|
||||
|
||||
CLI runs from source, so CLI-side code changes take effect immediately without
|
||||
rebuilding:
|
||||
|
||||
```bash
|
||||
cd apps/cli
|
||||
$CLI <command>
|
||||
```
|
||||
|
||||
Capture output for the report as you go (e.g. `$CLI task list | tee "$DIR/assets/task-list.txt"`).
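
A minimal sketch of that capture habit, assuming a hypothetical `capture` helper and a report directory already exported as `$DIR` (neither is provided by the CLI or the skill scripts). It runs any command, prints the output, and keeps a copy under `$DIR/assets/`.

```bash
# Hypothetical helper: run a command, show its output, and keep a copy
# under "$DIR/assets/<name>.txt" for the report.
capture() {
  name="$1"
  shift
  mkdir -p "$DIR/assets"
  # 2>&1 keeps error output in the evidence file as well.
  "$@" 2>&1 | tee "$DIR/assets/$name.txt"
}

# Usage (assumes $CLI and $DIR are already set):
#   capture task-list $CLI task list
```

Leaving `$CLI` unquoted in the usage line is deliberate: it has to split into `bun` and `src/index.ts`.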
|
||||
|
||||
### Step 4 — Clean up test data
|
||||
|
||||
```bash
|
||||
$CLI task delete <id> -y
$CLI agent delete <id> -y
|
||||
```
|
||||
|
||||
### Step 5 — Report
|
||||
|
||||
Finish with a structured report —
|
||||
[../references/report.md](../references/report.md). CLI evidence = exact
|
||||
command + trimmed output.
|
||||
|
||||
## Common testing patterns
|
||||
|
||||
### Task system
|
||||
|
||||
```bash
|
||||
$CLI task list
|
||||
$CLI task create -n "Root Task" -i "Test instruction"
|
||||
$CLI task create -n "Child Task" -i "Sub instruction" --parent T-1
|
||||
$CLI task view T-1
|
||||
$CLI task tree T-1
|
||||
$CLI task edit T-1 --status running
|
||||
$CLI task comment T-1 -m "Test comment"
|
||||
$CLI task delete T-1 -y
|
||||
```
|
||||
|
||||
### Agent system
|
||||
|
||||
```bash
|
||||
$CLI agent list
|
||||
$CLI agent view <agent-id>
|
||||
$CLI agent run <agent-id> -m "Test prompt"
|
||||
```
|
||||
|
||||
### Document & knowledge base
|
||||
|
||||
```bash
|
||||
$CLI doc list
|
||||
$CLI doc create -t "Test Doc" -c "Content here"
|
||||
$CLI doc view <doc-id>
|
||||
$CLI kb list
|
||||
$CLI kb tree <kb-id>
|
||||
```
|
||||
|
||||
### Model & provider
|
||||
|
||||
```bash
|
||||
$CLI model list
|
||||
$CLI provider list
|
||||
$CLI provider test <provider-id>
|
||||
```
|
||||
|
||||
## Dev-test cycle
|
||||
|
||||
```
|
||||
1. Make code changes (service/model/router/type)
|
||||
|
|
||||
2. Run unit tests (fast feedback)
|
||||
bunx vitest run --silent='passed-only' '<test-file>'
|
||||
|
|
||||
3. Restart dev server (if server-side changes — see dev-server.md)
|
||||
|
|
||||
4. CLI verification (end-to-end)
|
||||
$CLI <command>
|
||||
|
|
||||
5. Clean up test data + write the report
|
||||
```
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Issue                       | Solution                                                                                               |
| --------------------------- | ------------------------------------------------------------------------------------------------------ |
| `No authentication found`   | Source `.records/env/agent-testing-cli.env`, or run device-code `login --server http://localhost:3010` |
| `UNAUTHORIZED` on API calls | Re-run `init-dev-env.sh seed-user` and re-source the env file; for device-code fallback, re-run login  |
| `ECONNREFUSED`              | Dev server not running — see dev-server.md                                                             |
| CLI shows old data/behavior | Server needs restart to pick up code changes                                                           |
| Login opens wrong server    | Must use `--server` flag (env var doesn't work)                                                        |
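
The error-string rows of the table could be folded into a small triage helper. This is only a sketch: `triage` is a hypothetical name, and the matched strings are the ones listed in the table, not a complete list of CLI errors.

```bash
# Hypothetical triage helper: map a CLI error message to the hint from the
# troubleshooting table. Only the three literal error strings are covered.
triage() {
  case "$1" in
    *"No authentication found"*)
      echo "source .records/env/agent-testing-cli.env or run login --server" ;;
    *UNAUTHORIZED*)
      echo "re-run init-dev-env.sh seed-user and re-source the env file" ;;
    *ECONNREFUSED*)
      echo "dev server not running" ;;
    *)
      echo "no known hint" ;;
  esac
}

# Usage:
#   triage "$($CLI task list 2>&1)"
```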
|
||||
@@ -0,0 +1,257 @@
|
||||
# agent-browser CLI Reference
|
||||
|
||||
Generic reference for the `agent-browser` CLI — automate Chromium-based apps (Electron, Chrome, web) via Chrome DevTools Protocol. LobeHub-specific patterns live in [../ui/electron.md](../ui/electron.md) and [../ui/web.md](../ui/web.md); authentication recipes live in [auth.md](./auth.md).
|
||||
|
||||
Use `agent-browser` to automate Chromium-based apps via Chrome DevTools Protocol.
|
||||
|
||||
Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Run `agent-browser upgrade` to update.
|
||||
|
||||
## Core Workflow
|
||||
|
||||
Every browser automation follows this pattern:
|
||||
|
||||
1. **Navigate**: `agent-browser open <url>`
|
||||
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
|
||||
3. **Interact**: Use refs to click, fill, select
|
||||
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
|
||||
|
||||
```bash
|
||||
agent-browser open https://example.com/form
|
||||
agent-browser snapshot -i
|
||||
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
|
||||
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser snapshot -i # Check result
|
||||
```
|
||||
|
||||
## Command Chaining
|
||||
|
||||
```bash
|
||||
# Chain open + wait + snapshot in one call
|
||||
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
|
||||
```
|
||||
|
||||
Use `&&` when you don't need to read intermediate output. Run commands separately when you need to parse output first (e.g., snapshot to discover refs, then interact).
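
The "run separately" case can be sketched as follows. `ref_for` is a hypothetical helper, and it assumes the snapshot prints one element per line with refs as `@e<number>` tokens, as in the sample output above; the real output layout may differ.

```bash
# Hypothetical helper: read snapshot text on stdin and print the first ref
# (@eN) found on a line that mentions the given label.
ref_for() {
  label="$1"
  grep -F -- "$label" | grep -o '@e[0-9][0-9]*' | head -n 1
}

# Usage (two separate calls, because the ref must be read before clicking):
#   ref="$(agent-browser snapshot -i | ref_for 'Submit')"
#   agent-browser click "$ref"
```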
|
||||
|
||||
## Essential Commands
|
||||
|
||||
```bash
|
||||
# Navigation
|
||||
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
||||
agent-browser close # Close browser
|
||||
agent-browser close --all # Close all active sessions
|
||||
|
||||
# Snapshot
|
||||
agent-browser snapshot -i # Interactive elements with refs (recommended)
|
||||
agent-browser snapshot -s "#selector" # Scope to CSS selector
|
||||
|
||||
# Interaction (use @refs from snapshot)
|
||||
agent-browser click @e1 # Click element
|
||||
agent-browser click @e1 --new-tab # Click and open in new tab
|
||||
agent-browser fill @e2 "text" # Clear and type text
|
||||
agent-browser type @e2 "text" # Type without clearing
|
||||
agent-browser select @e1 "option" # Select dropdown option
|
||||
agent-browser check @e1 # Check checkbox
|
||||
agent-browser press Enter # Press key
|
||||
agent-browser keyboard type "text" # Type at current focus (no selector)
|
||||
agent-browser keyboard inserttext "text" # Insert without key events
|
||||
agent-browser scroll down 500 # Scroll page
|
||||
agent-browser scroll down 500 --selector "div.content" # Scroll within container
|
||||
|
||||
# Get information
|
||||
agent-browser get text @e1 # Get element text
|
||||
agent-browser get url # Get current URL
|
||||
agent-browser get title # Get page title
|
||||
agent-browser get cdp-url # Get CDP WebSocket URL
|
||||
|
||||
# Wait
|
||||
agent-browser wait @e1 # Wait for element
|
||||
agent-browser wait --load networkidle # Wait for network idle
|
||||
agent-browser wait --url "**/page" # Wait for URL pattern
|
||||
agent-browser wait 2000 # Wait milliseconds
|
||||
agent-browser wait --text "Welcome" # Wait for text to appear
|
||||
agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear
|
||||
agent-browser wait "#spinner" --state hidden # Wait for element to disappear
|
||||
|
||||
# Downloads
|
||||
agent-browser download @e1 ./file.pdf # Click element to trigger download
|
||||
agent-browser wait --download ./output.zip # Wait for any download to complete
|
||||
|
||||
# Network
|
||||
agent-browser network requests # Inspect tracked requests
|
||||
agent-browser network requests --type xhr,fetch # Filter by resource type
|
||||
agent-browser network requests --method POST # Filter by HTTP method
|
||||
agent-browser network route "**/api/*" --abort # Block matching requests
|
||||
agent-browser network har start # Start HAR recording
|
||||
agent-browser network har stop ./capture.har # Stop and save HAR file
|
||||
|
||||
# Viewport & Device Emulation
|
||||
agent-browser set viewport 1920 1080 # Set viewport size (default: 1280x720)
|
||||
agent-browser set viewport 1920 1080 2 # 2x retina
|
||||
agent-browser set device "iPhone 14" # Emulate device (viewport + user agent)
|
||||
|
||||
# Capture
|
||||
agent-browser screenshot # Screenshot to temp dir
|
||||
agent-browser screenshot --full # Full page screenshot
|
||||
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
|
||||
agent-browser pdf output.pdf # Save as PDF
|
||||
|
||||
# Clipboard
|
||||
agent-browser clipboard read # Read text from clipboard
|
||||
agent-browser clipboard write "text" # Write text to clipboard
|
||||
agent-browser clipboard copy # Copy current selection
|
||||
agent-browser clipboard paste # Paste from clipboard
|
||||
|
||||
# Dialogs (alert, confirm, prompt, beforeunload)
|
||||
agent-browser dialog accept # Accept dialog
|
||||
agent-browser dialog accept "input" # Accept prompt dialog with text
|
||||
agent-browser dialog dismiss # Dismiss/cancel dialog
|
||||
agent-browser dialog status # Check if dialog is open
|
||||
|
||||
# Diff (compare page states)
|
||||
agent-browser diff snapshot # Compare current vs last snapshot
|
||||
agent-browser diff screenshot --baseline before.png # Visual pixel diff
|
||||
agent-browser diff url <url1> <url2> # Compare two pages
|
||||
|
||||
# Streaming
|
||||
agent-browser stream enable # Start WebSocket streaming
|
||||
agent-browser stream status # Inspect streaming state
|
||||
agent-browser stream disable # Stop streaming
|
||||
```
|
||||
|
||||
## Batch Execution
|
||||
|
||||
```bash
|
||||
echo '[
|
||||
["open", "https://example.com"],
|
||||
["snapshot", "-i"],
|
||||
["click", "@e1"],
|
||||
["screenshot", "result.png"]
|
||||
]' | agent-browser batch --json
|
||||
```
|
||||
|
||||
## Authentication
|
||||
|
||||
```bash
|
||||
# Option 1: Auth vault (credentials stored encrypted)
|
||||
echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin
|
||||
agent-browser auth login myapp
|
||||
|
||||
# Option 2: Session name (auto-save/restore cookies + localStorage)
|
||||
agent-browser --session-name myapp open https://app.example.com/login
|
||||
agent-browser close # State auto-saved
|
||||
agent-browser --session-name myapp open https://app.example.com/dashboard # Auto-restored
|
||||
|
||||
# Option 3: Persistent profile
|
||||
agent-browser --profile ~/.myapp open https://app.example.com/login
|
||||
|
||||
# Option 4: State file
|
||||
agent-browser state save auth.json
|
||||
agent-browser state load auth.json
|
||||
```
|
||||
|
||||
### LobeHub dev server — inject better-auth cookie
|
||||
|
||||
`agent-browser --headed` on macOS can create an off-screen Chromium window, blocking manual login. For a local LobeHub dev server (e.g. `localhost:3010`), copy the `better-auth.session_token` cookie out of a **Network request** in the user's own Chrome DevTools and load it via `state load`. See [auth.md](./auth.md) for the full recipe.
|
||||
|
||||
## Semantic Locators (Alternative to Refs)
|
||||
|
||||
```bash
|
||||
agent-browser find text "Sign In" click
|
||||
agent-browser find label "Email" fill "user@test.com"
|
||||
agent-browser find role button click --name "Submit"
|
||||
agent-browser find placeholder "Search" type "query"
|
||||
agent-browser find testid "submit-btn" click
|
||||
```
|
||||
|
||||
## JavaScript Evaluation (eval)
|
||||
|
||||
```bash
|
||||
# Simple expressions
|
||||
agent-browser eval 'document.title'
|
||||
|
||||
# Complex JS: use --stdin with heredoc (RECOMMENDED)
|
||||
agent-browser eval --stdin << 'EVALEOF'
|
||||
JSON.stringify(
|
||||
Array.from(document.querySelectorAll("img"))
|
||||
.filter(i => !i.alt)
|
||||
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
|
||||
)
|
||||
EVALEOF
|
||||
|
||||
# Base64 encoding (avoids all shell escaping issues)
|
||||
agent-browser eval -b "$(printf '%s' 'document.title' | base64)"
|
||||
```
|
||||
|
||||
## Ref Lifecycle
|
||||
|
||||
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after clicking links/buttons that navigate, form submissions, or dynamic content loading.
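
One way to make the re-snapshot rule hard to forget is a wrapper that always refreshes refs after a click. This is a sketch: `click_and_resnap` and the `AB` override are hypothetical conveniences, while the three subcommands are the ones listed in this reference.

```bash
# Assumption: AB names the agent-browser binary. Overriding it is only for
# dry runs and is not an agent-browser feature.
AB="${AB:-agent-browser}"

# Hypothetical wrapper: click a ref, wait for the page to settle, then take
# a fresh snapshot so later commands never reuse refs from before the click.
click_and_resnap() {
  $AB click "$1" && $AB wait --load networkidle && $AB snapshot -i
}

# Usage:
#   click_and_resnap @e3
```

Setting `AB=echo` prints the three commands instead of running them, which is a cheap way to check the sequence.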
|
||||
|
||||
## Annotated Screenshots (Vision Mode)
|
||||
|
||||
```bash
|
||||
agent-browser screenshot --annotate
|
||||
# Output includes the image path and a legend:
|
||||
# [1] @e1 button "Submit"
|
||||
# [2] @e2 link "Home"
|
||||
agent-browser click @e2 # Click using ref from annotated screenshot
|
||||
```
|
||||
|
||||
## Parallel Sessions
|
||||
|
||||
```bash
|
||||
agent-browser --session site1 open https://site-a.com
|
||||
agent-browser --session site2 open https://site-b.com
|
||||
agent-browser session list
|
||||
```
|
||||
|
||||
## Connect to Existing Chrome
|
||||
|
||||
```bash
|
||||
agent-browser --auto-connect snapshot # Auto-discover running Chrome
|
||||
agent-browser --cdp 9222 snapshot # Explicit CDP port
|
||||
```
|
||||
|
||||
## iOS Simulator (Mobile Safari)
|
||||
|
||||
```bash
|
||||
agent-browser device list
|
||||
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
|
||||
agent-browser -p ios snapshot -i
|
||||
agent-browser -p ios tap @e1
|
||||
agent-browser -p ios swipe up
|
||||
agent-browser -p ios screenshot mobile.png
|
||||
agent-browser -p ios close
|
||||
```
|
||||
|
||||
## Observability Dashboard
|
||||
|
||||
```bash
|
||||
agent-browser dashboard install
|
||||
agent-browser dashboard start # Background server on port 4848
|
||||
agent-browser dashboard stop
|
||||
```
|
||||
|
||||
## Cloud Providers
|
||||
|
||||
Use `-p <provider>` to run against cloud browsers: `agentcore`, `browserbase`, `browserless`, `browseruse`, `kernel`.
|
||||
|
||||
## Browser Engine Selection
|
||||
|
||||
```bash
|
||||
agent-browser --engine lightpanda open example.com # Lightpanda engine: reportedly ~10x faster with ~10x lower memory use
|
||||
```
|
||||
|
||||
## Gotchas
|
||||
|
||||
- **Daemon can get stuck** — if commands hang, `agent-browser close --all` or `pkill -f agent-browser` to reset
|
||||
- **HMR invalidates everything** — after code changes, refs break. Re-snapshot or restart
|
||||
- **`snapshot -i` doesn't find contenteditable** — use `snapshot -i -C` for rich text editors
|
||||
- **`fill` doesn't work on contenteditable** — use `type` for chat inputs
|
||||
- **Screenshots go to `~/.agent-browser/tmp/screenshots/`** — read them with the `Read` tool
|
||||
- **Dialogs block all commands** — if commands time out, check `agent-browser dialog status`
|
||||
- **Default timeout is 25s** — override with `AGENT_BROWSER_DEFAULT_TIMEOUT` (ms) or use explicit waits
|
||||
- **Shell quoting corrupts eval** — use `eval --stdin <<'EVALEOF'` for complex JS
|
||||
+5
-5
@@ -19,13 +19,13 @@ works for any LobeHub streaming session.
|
||||
|
||||
```bash
|
||||
# 1. Start Electron with CDP
|
||||
-./.agents/skills/local-testing/scripts/electron-dev.sh start
+./.agents/skills/agent-testing/scripts/electron-dev.sh start
|
||||
|
||||
# 2. Navigate to a chat, switch runtime to Cloud Sandbox (gateway mode)
|
||||
|
||||
# 3. Install the probe + helpers
|
||||
agent-browser --cdp 9222 eval --stdin \
-  < .agents/skills/local-testing/scripts/agent-gateway/probe.js
+  < .agents/skills/agent-testing/scripts/agent-gateway/probe.js
|
||||
|
||||
# 4. Send a tool-call message — manually or via type+press
|
||||
agent-browser --cdp 9222 eval "window.__PROBE_EVENT('SENT')"
|
||||
@@ -34,15 +34,15 @@ agent-browser --cdp 9222 eval "window.__PROBE_EVENT('SENT')"
|
||||
# rightmost inactive tab as AWAY — edit ROUND_TRIPS / DWELL_MS in the
|
||||
# file if you want different timing)
|
||||
agent-browser --cdp 9222 eval --stdin \
-  < .agents/skills/local-testing/scripts/agent-gateway/tab-switch.js
+  < .agents/skills/agent-testing/scripts/agent-gateway/tab-switch.js
|
||||
|
||||
# 6. Wait for streaming to finish, then dump
|
||||
agent-browser --cdp 9222 eval --stdin \
-  < .agents/skills/local-testing/scripts/agent-gateway/probe-dump.js \
+  < .agents/skills/agent-testing/scripts/agent-gateway/probe-dump.js \
  > /tmp/probe.json
|
||||
|
||||
# 7. Analyze
|
||||
-node .agents/skills/local-testing/scripts/agent-gateway/analyze.mjs /tmp/probe.json
+node .agents/skills/agent-testing/scripts/agent-gateway/analyze.mjs /tmp/probe.json
|
||||
```
|
||||
|
||||
The analyzer prints three sections: EVENTS, TIMELINE, REGRESSIONS. If
|
||||
@@ -0,0 +1,166 @@
|
||||
# Auth Setup for Local Agent Testing
|
||||
|
||||
**Auth is the gate for all automated testing.** Complete
|
||||
[Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) first so
|
||||
`SERVER_URL` and ports are resolved, then verify auth before writing any test
|
||||
step.
|
||||
|
||||
Initialize helpers first:
|
||||
|
||||
```bash
|
||||
SCRIPT="./.agents/skills/agent-testing/scripts/setup-auth.sh"
|
||||
TEST_ENV="./.agents/skills/agent-testing/scripts/test-env.sh"
|
||||
eval "$($TEST_ENV --exports)"
|
||||
```
|
||||
|
||||
Quick reference after initialization:
|
||||
|
||||
| Command                        | Purpose                                            |
| ------------------------------ | -------------------------------------------------- |
| `$SCRIPT status`               | Check all surfaces (server + CLI + web + Electron) |
| `$SCRIPT status --surface web` | Check only the Web surface gate                    |
| `$SCRIPT cli-seed`             | Configure CLI API-key auth from the seeded key     |
| `$SCRIPT cli`                  | Interactive CLI device-code login (user must run)  |
| `$SCRIPT open-chrome`          | Open Chrome at `SERVER_URL` with DevTools          |
| `$SCRIPT web-seed`             | Sign in the seeded user and inject cookies         |
| `pbpaste \| $SCRIPT web`       | Inject a copied Cookie header into agent-browser   |
| `$SCRIPT web-verify`           | Live-check agent-browser session auth              |
|
||||
|
||||
Use `localhost` for Web auth; better-auth cookies are stored for `localhost`,
|
||||
not `127.0.0.1`.
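
A small guard can keep `127.0.0.1` URLs out of Web auth commands. This is a sketch only: `to_localhost` is a hypothetical helper, not part of `setup-auth.sh`.

```bash
# Hypothetical helper: rewrite a 127.0.0.1 URL to its localhost equivalent
# so it matches the host the better-auth cookies were stored for.
to_localhost() {
  printf '%s\n' "$1" | sed 's#^\(https\{0,1\}://\)127\.0\.0\.1#\1localhost#'
}

# Usage:
#   SERVER_URL="$(to_localhost "$SERVER_URL")"
```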
|
||||
|
||||
## Per-surface overview
|
||||
|
||||
| Surface  | Mechanism                                | Persistence                                                       | Human interaction                              |
| -------- | ---------------------------------------- | ----------------------------------------------------------------- | ---------------------------------------------- |
| CLI      | Seeded API key or OIDC Device Code Flow  | `.records/env/agent-testing-cli.env` + `$HOME/.lobehub-dev`       | No for seed path; yes for device-code fallback |
| Web      | Seeded better-auth login or cookie copy  | `~/.lobehub-agent-testing/web-state.json` + agent-browser session | No for seed path; copy cookie only as fallback |
| Electron | App's own login state                    | Electron user-data dir                                            | Log in once manually in the app                |
| Bot      | Native apps (Discord/WeChat/…) logged in | Each app's own session                                            | Once per app                                   |
|
||||
|
||||
## CLI — Seeded API key
|
||||
|
||||
For the self-contained no-root-`.env` dev environment, seed the baseline user
|
||||
and API key once:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
source .records/env/agent-testing-cli.env
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh cli-seed
|
||||
```
|
||||
|
||||
The seed step writes `LOBE_API_KEY` for humans and maps it to the CLI's current
|
||||
auth variable, `LOBEHUB_CLI_API_KEY`. It also sets `LOBEHUB_SERVER` so CLI
|
||||
commands hit the local server without needing a stored device-code token.
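
A pre-flight check along these lines can catch a forgotten `source` step before any CLI command runs. It is a sketch: `require_cli_env` is a hypothetical helper, and it only checks the two variables named above.

```bash
# Hypothetical pre-flight check: fail early when the seeded env file has
# not been sourced into the current shell.
require_cli_env() {
  for var in LOBEHUB_CLI_API_KEY LOBEHUB_SERVER; do
    # Indirect lookup of the variable whose name is stored in $var.
    eval "val=\${$var:-}"
    if [ -z "$val" ]; then
      echo "missing $var: source .records/env/agent-testing-cli.env" >&2
      return 1
    fi
  done
}

# Usage:
#   require_cli_env && bun src/index.ts task list
```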
|
||||
|
||||
Use this for automated CLI verification:
|
||||
|
||||
```bash
|
||||
cd apps/cli
|
||||
source ../../.records/env/agent-testing-cli.env
|
||||
bun src/index.ts <command>
|
||||
```
|
||||
|
||||
## CLI — Device Code Flow fallback
|
||||
|
||||
Use device-code login only when testing against a non-seeded environment.
|
||||
Credentials are isolated from the user's real CLI config via
`LOBEHUB_CLI_HOME=.lobehub-dev`; the current CLI resolves that value to
`$HOME/.lobehub-dev` and stores the credentials there.
|
||||
|
||||
```bash
|
||||
cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3010
|
||||
```
|
||||
|
||||
- The `--server` flag is required — an env var does NOT work and login will hit
|
||||
the wrong server without it.
|
||||
- Check state without logging in: `setup-auth.sh status` (verifies
|
||||
`LOBEHUB_CLI_API_KEY` when present, otherwise checks the stored server URL).
|
||||
- `UNAUTHORIZED` on API calls means the token expired — re-run login.
|
||||
|
||||
## Web — seeded better-auth login
|
||||
|
||||
The Web test surface is `agent-browser --session lobehub-dev`. The user's
|
||||
ordinary Chrome is only a cookie source; Chrome screenshots, Chrome Network
|
||||
records, and Chrome logged-in state do not prove the agent-browser test session
|
||||
is authenticated.
|
||||
|
||||
For the seeded local dev environment, use the automatic path:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
|
||||
```
|
||||
|
||||
`web-seed` posts the seeded email/password to
|
||||
`/api/auth/sign-in/email`, stores the returned cookie jar under
|
||||
`~/.lobehub-agent-testing/`, converts it to Playwright `storageState`, loads it
|
||||
into the `agent-browser` session, and verifies the session does not land on
|
||||
`/signin`.
|
||||
|
||||
## Web — manual cookie injection fallback
|
||||
|
||||
`agent-browser --headed` on macOS often creates the Chromium window off-screen —
|
||||
the user can't see or interact with it, so manual login inside the agent-browser
|
||||
session fails. Instead, copy the **better-auth session cookie** out of the
|
||||
user's own logged-in Chrome and inject it as a Playwright-style state file.
|
||||
|
||||
Do **not** use this on production URLs — only local dev. Treat the cookie as a
|
||||
secret: don't paste it into shared logs, PRs, or commit it anywhere.
|
||||
|
||||
### Web — decision flow
|
||||
|
||||
1. `$SCRIPT status --surface web` — green? Start testing. Do not ask for a Cookie header.
|
||||
2. Not green and using the seeded local env → `$SCRIPT web-seed`.
|
||||
3. Still not green or not using the seed env → `$SCRIPT open-chrome` opens Chrome at `SERVER_URL` with DevTools.
|
||||
4. User copies the `Cookie:` header from Network tab → any same-origin request → Request Headers → right-click `Cookie:` → **Copy value**. Must be from Network, NOT `document.cookie` (HttpOnly cookies are invisible to `document.cookie`).
|
||||
5. `pbpaste | $SCRIPT web` — filters to better-auth cookies (`session_token`, `session_data`, `state`), builds Playwright `storageState`, loads it into the `agent-browser` session (`lobehub-dev`), opens `SERVER_URL`, and asserts the URL is not `/signin`.
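
The filtering in step 5 can be pictured roughly like this. It is an illustration, not the parser in `setup-auth.sh`: `filter_better_auth` is a hypothetical name, and the `better-auth.` cookie-name prefix is assumed from the `better-auth.session_token` cookie mentioned in this document.

```bash
# Hypothetical filter: read a raw Cookie header value on stdin and print
# only the better-auth cookies, one name=value pair per line.
filter_better_auth() {
  tr ';' '\n' \
    | sed 's/^[[:space:]]*//' \
    | grep -E 'better-auth\.(session_token|session_data|state)='
}

# Usage (macOS clipboard):
#   pbpaste | filter_better_auth
```

Treat the output as a secret, the same as the original header.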
|
||||
|
||||
### Using the authenticated session
|
||||
|
||||
```bash
|
||||
agent-browser --session lobehub-dev open "$SERVER_URL/"
|
||||
agent-browser --session lobehub-dev snapshot -i | head -20
|
||||
```
|
||||
|
||||
### Notes
|
||||
|
||||
- `storageState` doesn't enforce the HttpOnly flag on load — the script stores
|
||||
cookies with `httpOnly: false`, which is fine for local dev and sidesteps a
|
||||
CDP-context quirk where HttpOnly cookies sometimes fail to attach.
|
||||
- The state file is kept at `~/.lobehub-agent-testing/web-state.json` so
|
||||
`setup-auth.sh status` can report web-auth readiness across sessions.
|
||||
|
||||
### Common failure modes
|
||||
|
||||
| Symptom                                       | Cause                                                                     | Fix                                                                                            |
| --------------------------------------------- | ------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| Still redirects to `/signin` after injection  | User pasted from `document.cookie` → missed HttpOnly session              | Re-pull from Network request Headers, not console                                              |
| Script reports `no better-auth cookies found` | User pasted the wrong value, or the cookie parser regressed               | Keep the raw `Cookie:` header as-is; run `scripts/setup-auth.test.sh` if the input looks valid |
| Login works briefly then expires              | `better-auth.session_token` rotated (user logged out / signed in again)   | Re-copy and re-inject                                                                          |
| Domain mismatch                               | Cookie domain must be `localhost` literally, no leading dot for local dev | —                                                                                              |
|
||||
|
||||
## Electron
|
||||
|
||||
The desktop app keeps its own persistent login state in its user-data
|
||||
directory — log in once manually inside the app and it survives restarts of
|
||||
`electron-dev.sh`. No injection needed. The standard check (do NOT hand-roll a
|
||||
store eval) once Electron is up with CDP:
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/app-probe.sh auth
|
||||
# → {"ok":true,"isSignedIn":true,"userId":"user_xxx"}
|
||||
```
|
||||
|
||||
`setup-auth.sh status` runs this probe automatically when CDP 9222 is
|
||||
reachable.
|
||||
|
||||
## Scope
|
||||
|
||||
These recipes only cover **local dev** authentication. They do not:
|
||||
|
||||
- Work for production — production cookies are `Secure; HttpOnly; Domain=.lobehub.com`
|
||||
and must be delivered over HTTPS.
|
||||
- Replace real OAuth flows — tests that must exercise the login UI itself need a
|
||||
real Chromium with `--remote-debugging-port` or a bot account.
|
||||
- Flow cookies back to the user's Chrome — injection is one-way.
|
||||
@@ -0,0 +1,98 @@
|
||||
# Local Dev Server
|
||||
|
||||
Single source of truth for starting / restarting the backend that all test
|
||||
surfaces (CLI, Electron, Web) hit.
|
||||
|
||||
## Resolve ports first
|
||||
|
||||
Run `test-env.sh` as described in
|
||||
[SKILL.md Step 0.0](../SKILL.md#00-resolve-the-current-test-environment)
|
||||
before starting or probing any local test surface.
|
||||
|
||||
## Ports & modes
|
||||
|
||||
| Command             | What it runs                                              | Port source         |
| ------------------- | --------------------------------------------------------- | ------------------- |
| `pnpm run dev:next` | Next.js backend (API + auth)                              | `PORT`              |
| `bun run dev`       | Full-stack (Next.js + Vite SPA, via `devStartupSequence`) | `PORT` + `SPA_PORT` |
| `bun run dev:spa`   | Vite SPA only, proxies API to `PORT`                      | `SPA_PORT`          |
|
||||
|
||||
In the **cloud repo** (where this repo is the `lobehub/` submodule), local
|
||||
worktree names map to fallback defaults only when `.env` and shell env do not
|
||||
provide values:
|
||||
|
||||
| Workspace directory | Default `SERVER_URL`             |
| ------------------- | -------------------------------- |
| `lobehub`           | `http://localhost:3010`          |
| `lobehub-cloud`     | `http://localhost:3020`          |
| `lobehub-cloud-1`   | `http://localhost:3021`          |
| `lobehub-cloud-N`   | `http://localhost:$((3020 + N))` |
|
||||
|
||||
`test-env.sh` and `setup-auth.sh` both use the resolved env first and these
|
||||
worktree defaults only as fallback. Treat the dev-server terminal output as the
|
||||
final source of truth when testing a non-standard port, then export it for every
|
||||
agent-testing command:
|
||||
|
||||
```bash
|
||||
export SERVER_URL=http://localhost:<port-from-dev-output>
|
||||
```
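
The fallback mapping from workspace directory to default `SERVER_URL` could be expressed as below. This is a sketch of the table only: `default_server_url` is a hypothetical helper and is not the logic inside `test-env.sh`.

```bash
# Hypothetical helper: map a workspace directory name to its default
# SERVER_URL, following the worktree fallback table.
default_server_url() {
  case "$1" in
    lobehub)       echo "http://localhost:3010" ;;
    lobehub-cloud) echo "http://localhost:3020" ;;
    lobehub-cloud-*)
      n="${1#lobehub-cloud-}"
      # Reject anything that is not a plain decimal suffix.
      case "$n" in
        ''|*[!0-9]*) return 1 ;;
      esac
      echo "http://localhost:$((3020 + n))" ;;
    *) return 1 ;;
  esac
}

# Usage:
#   default_server_url "$(basename "$PWD")"
```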
|
||||
|
||||
## Health check
|
||||
|
||||
```bash
|
||||
curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/"
|
||||
```
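
The status code printed by that command still has to be interpreted. A minimal sketch, assuming that any 2xx or 3xx response means the server is up (a redirect to a sign-in page would count as up); `server_up` is a hypothetical helper.

```bash
# Hypothetical helper: decide whether an HTTP status code from the health
# check means the server is up. curl prints 000 when it cannot connect.
server_up() {
  case "$1" in
    2??|3??) return 0 ;;
    *)       return 1 ;;
  esac
}

# Usage:
#   code="$(curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/")"
#   server_up "$code" || echo "dev server not reachable at $SERVER_URL" >&2
```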
|
||||
|
||||
## Start / restart
|
||||
|
||||
```bash
|
||||
# Start backend only.
|
||||
# With root .env: use the existing local config.
|
||||
pnpm run dev:next
|
||||
|
||||
# Without root .env: use the self-contained agent-testing env.
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
|
||||
|
||||
# Full-stack SPA + backend. Required for Web smoke.
|
||||
# With root .env:
|
||||
bun run dev
|
||||
|
||||
# Without root .env:
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh dev
|
||||
|
||||
# Local QStash. Run in a separate terminal only when testing workflow paths.
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh qstash
|
||||
|
||||
# Restart — required to pick up server-side code changes
|
||||
lsof -ti:"$PORT" | xargs kill
|
||||
pnpm run dev:next
|
||||
# or, when no root .env exists:
|
||||
# ./.agents/skills/agent-testing/scripts/init-dev-env.sh dev-next
|
||||
```
|
||||
|
||||
## When a server restart is needed
|
||||
|
||||
Next.js hot-reload may not pick up changes in workspace packages — restart when
|
||||
in doubt.
|
||||
|
||||
| Change location                                 | Restart? |
| ----------------------------------------------- | -------- |
| `apps/server/src/` (routers, services, modules) | Yes      |
| `src/server/` (agent-hono, workflows-hono)      | Yes      |
| `packages/database/` (models)                   | Yes      |
| `packages/types/`                               | Yes      |
| `packages/prompts/`                             | Yes      |
| `apps/cli/` (CLI runs from source)              | No       |
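
The restart table can be sketched as a lookup on a changed file path. `needs_restart` is a hypothetical helper; paths the table does not list come back as `unknown`, in which case the "restart when in doubt" rule applies.

```bash
# Hypothetical helper: answer yes/no/unknown for a repo-relative path,
# using only the locations listed in the restart table.
needs_restart() {
  case "$1" in
    apps/cli/*)
      echo no ;;
    apps/server/src/*|src/server/*|packages/database/*|packages/types/*|packages/prompts/*)
      echo yes ;;
    *)
      echo unknown ;;
  esac
}

# Usage:
#   git diff --name-only | while read -r f; do needs_restart "$f"; done
```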
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Issue                     | Solution                                                                                      |
| ------------------------- | --------------------------------------------------------------------------------------------- |
| `ECONNREFUSED`            | Server not running — start it                                                                 |
| `EADDRINUSE` on the port  | Already running — `lsof -ti:<port> \| xargs kill` first                                       |
| Stale data / old behavior | Server needs a restart to pick up code changes                                                |
| QStash workflow failures  | Start `init-dev-env.sh qstash` and make sure dev server inherited the script's `QSTASH_*` env |
|
||||
|
||||
Marketplace/community endpoints are not part of the local agent-testing auth
|
||||
gate. Do not block local product-chain verification on marketplace API auth
|
||||
unless the change explicitly targets marketplace behavior.
|
||||
+10
-10
@@ -12,13 +12,13 @@ General-purpose screen recording tool for the Electron app. Captures CDP screens
|
||||
|
||||
```bash
|
||||
# Start recording (Electron must be running with CDP)
|
||||
-.agents/skills/local-testing/scripts/record-app-screen.sh start [output_name]
+.agents/skills/agent-testing/scripts/record-app-screen.sh start [output_name]
|
||||
|
||||
# Stop recording and assemble video
|
||||
-.agents/skills/local-testing/scripts/record-app-screen.sh stop
+.agents/skills/agent-testing/scripts/record-app-screen.sh stop
|
||||
|
||||
# Check if recording is active
|
||||
-.agents/skills/local-testing/scripts/record-app-screen.sh status
+.agents/skills/agent-testing/scripts/record-app-screen.sh status
|
||||
```
|
||||
|
||||
### Arguments
|
||||
@@ -74,10 +74,10 @@ The `.records/` directory is at the project root and is gitignored.
|
||||
|
||||
```bash
|
||||
# Start Electron
|
||||
-.agents/skills/local-testing/scripts/electron-dev.sh start
+.agents/skills/agent-testing/scripts/electron-dev.sh start
|
||||
|
||||
# Start recording
|
||||
-.agents/skills/local-testing/scripts/record-app-screen.sh start my-test
+.agents/skills/agent-testing/scripts/record-app-screen.sh start my-test
|
||||
|
||||
# Run automation
|
||||
agent-browser --cdp 9222 click @e61
|
||||
@@ -86,14 +86,14 @@ agent-browser --cdp 9222 press Enter
|
||||
sleep 10
|
||||
|
||||
# Stop and get results
|
||||
.agents/skills/local-testing/scripts/record-app-screen.sh stop
|
||||
.agents/skills/agent-testing/scripts/record-app-screen.sh stop
|
||||
# → .records/my-test.mp4 + .records/my-test/*.png
|
||||
```
|
||||
|
||||
### Gateway Streaming Demo
|
||||
|
||||
```bash
|
||||
.agents/skills/local-testing/scripts/electron-dev.sh start
|
||||
.agents/skills/agent-testing/scripts/electron-dev.sh start
|
||||
|
||||
# Inject gateway URL
|
||||
agent-browser --cdp 9222 eval --stdin << 'EOF'
|
||||
@@ -106,19 +106,19 @@ agent-browser --cdp 9222 eval --stdin << 'EOF'
|
||||
EOF
|
||||
|
||||
# Record
|
||||
.agents/skills/local-testing/scripts/record-app-screen.sh start gateway-demo
|
||||
.agents/skills/agent-testing/scripts/record-app-screen.sh start gateway-demo
|
||||
|
||||
# Navigate to agent, send message, wait for completion...
|
||||
# (automation commands here)
|
||||
|
||||
.agents/skills/local-testing/scripts/record-app-screen.sh stop
|
||||
.agents/skills/agent-testing/scripts/record-app-screen.sh stop
|
||||
open .records/gateway-demo.mp4
|
||||
```
|
||||
|
||||
### Check Active Recording
|
||||
|
||||
```bash
|
||||
.agents/skills/local-testing/scripts/record-app-screen.sh status
|
||||
.agents/skills/agent-testing/scripts/record-app-screen.sh status
|
||||
# [record] Active recording
|
||||
# Frames: 42 captured (running: yes)
|
||||
# Screenshots: 14 captured (running: yes)
|
||||
@@ -0,0 +1,186 @@
|
||||
# Structured Test Reports
|
||||
|
||||
Every automated test session ends with a structured, evidence-backed report.
|
||||
A chat-only summary is not an acceptable deliverable: the report is what the
|
||||
user (or a reviewer, or a later agent) audits without replaying the session.
|
||||
|
||||
## Location & layout
|
||||
|
||||
Reports live under `.records/reports/` (gitignored, like all `.records/`
|
||||
output):
|
||||
|
||||
```
.records/reports/<YYYYMMDD-HHMMSS>-<slug>/
├── report.md    # human-readable report (case table with inline screenshots, verdict)
├── result.json  # machine-readable results (pass/fail counts, score)
└── assets/      # evidence: screenshots, HAR files, CLI transcripts
```
|
||||
|
||||
## Workflow
|
||||
|
||||
1. **Scaffold up front** — before running the first test step:
|
||||
|
||||
   ```bash
   DIR=$(./.agents/skills/agent-testing/scripts/report-init.sh "<slug>" "<title>")
   ```
|
||||
|
||||
The script creates the directory, pre-fills branch / commit / date in both
|
||||
files, and prints the directory path. The scaffold uses the compact report
|
||||
shape below; translate its headings and table labels to the user's language
|
||||
before delivery if needed.
|
||||
|
||||
2. **Collect evidence as you test** — every asserted behavior gets one evidence
|
||||
item in `$DIR/assets/`:
|
||||
- UI (static state): `agent-browser screenshot` or `capture-app-window.sh`,
|
||||
then **verify the screenshot with the Read tool before citing it** —
|
||||
never cite an image you haven't looked at.
|
||||
|
||||
- UI (time-based behavior): **screenshot vs GIF is a judgment you must
|
||||
make per case.** If the assertion is about change over time — streaming
|
||||
output, a ticking timer, loading/progress states, animations,
|
||||
appear/disappear transitions — a static screenshot cannot prove it.
|
||||
Record a frame sequence and synthesize a GIF:
|
||||
|
||||
     ```bash
     # start recording (background), trigger the behavior, wait for it to finish
     ../scripts/record-gif.sh "$DIR/assets/case2-streaming.gif" 12 2 &
     GIF_PID=$!
     # ... drive the scenario ...
     wait $GIF_PID
     ```

     Embed it like an image: `![case 2 streaming](assets/case2-streaming.gif)`. Verify
     at least the first/last frames visually (Read the GIF) before citing.
|
||||
|
||||
- CLI: exact command + trimmed output (`$CLI task list | tee "$DIR/assets/task-list.txt"`).
|
||||
|
||||
- Network: `agent-browser network requests` dumps or HAR files.
|
||||
|
||||
3. **Fill `report.md` as you go** — don't reconstruct from memory at the end.
|
||||
The primary evidence belongs in the case table itself: each row should pair
|
||||
the assertion with the screenshot/GIF or non-visual artifact that proves it,
|
||||
so readers can scan the result without jumping between sections. UI evidence
|
||||
must render inline with Markdown image syntax; a plain link or file path is
|
||||
not acceptable as primary visual evidence.
|
||||
|
||||
4. **Set the verdict** in both `report.md` and `result.json`, then link the
|
||||
report directory in your final answer to the user. If UI evidence exists,
|
||||
list the key screenshot/GIF links in the final chat response. Use Markdown
|
||||
link text as the evidence caption, for example:
|
||||
`[Image #1 - observed outcome](<report-dir>/assets/case1.png)`.
|
||||
|
||||
## Report language (hard rule)
|
||||
|
||||
**`report.md` MUST be written in the language the user is conversing in** —
|
||||
the whole file, headings included. If the conversation is in Chinese, the
|
||||
report is in Chinese; do not mix English prose into it. The scaffold headings
|
||||
are placeholders — translate them when filling if the user is not conversing in
|
||||
the scaffold language. Exceptions that stay as-is: code/commands, identifiers,
|
||||
log excerpts, and `result.json` (its keys and status values are machine-read
|
||||
and stay English; the `title` and case `name` fields follow the user's
|
||||
language).
|
||||
|
||||
## report.md sections
|
||||
|
||||
Default report shape:
|
||||
|
||||
| Section          | Content                                                                                      |
| ---------------- | -------------------------------------------------------------------------------------------- |
| **Scope**        | What changed / what is being verified; branch, commit, date, surface, entry URL/page, focus  |
| **Cases**        | Compact table: `# \| Case \| Result \| Key observation \| Evidence`                          |
| **Verdict**      | Overall verdict first (`pass` / `partial` / `fail`), then the concise reasons and follow-ups |
| **Verification** | Commands or automated checks run in this session, with trimmed results                       |
| **Score**        | Pass/fail/blocked counts, optional 0–100 score                                               |
|
||||
|
||||
The case table is the main reading surface. Prefer one clear row per user
|
||||
scenario or regression assertion, and put the screenshot/GIF directly in the
|
||||
`Evidence` cell:
|
||||
|
||||
```markdown
| #   | Case                     | Result | Key observation                                                   | Evidence                     |
| --- | ------------------------ | ------ | ----------------------------------------------------------------- | ---------------------------- |
| 1   | Create a new page        | pass   | Title and body persisted after refresh                            | ![case 1](assets/case1.png)  |
| 2   | Respect requested length | fail   | Requested about 600 Chinese characters; final body was about 1286 | ![case 2](assets/case2.png)  |
```
|
||||
|
||||
## Inline visual evidence
|
||||
|
||||
Screenshots and GIFs must be embedded so the report shows the image inline:
|
||||
|
||||
```markdown
![case 1 result](assets/case1-result.png)
![case 2 streaming](assets/case2-streaming.gif)
```
|
||||
|
||||
Do **not** use these as the primary evidence for UI cases:
|
||||
|
||||
```markdown
[case 1 result](assets/case1-result.png)
assets/case1-result.png
file:///tmp/case1-result.png
```
|
||||
|
||||
Links are acceptable for non-visual artifacts such as CLI transcripts, HAR
|
||||
files, or long logs. For videos, embed a representative screenshot/GIF inline in
|
||||
the case row and link the full video as supplemental evidence.
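The inline-versus-link distinction above can be checked mechanically before a report is delivered. The following is a minimal sketch, not part of the skill's scripts: `evidence_kind` is a hypothetical helper name, and it assumes the Evidence cell text has already been extracted and trimmed of surrounding whitespace.

```shell
# Hypothetical helper (not shipped with the skill): classify an Evidence cell.
# Prints "inline" for Markdown image syntax, "link" for a plain Markdown link,
# and "path" for anything else (bare path, file:// URL, empty cell).
# Assumes the cell text has no leading or trailing whitespace.
evidence_kind() {
  case "$1" in
    '!['*']('*')'*) echo inline ;;
    '['*']('*')'*) echo link ;;
    *) echo path ;;
  esac
}
```

For a UI case, only `inline` would satisfy the rule; `link` is acceptable for non-visual artifacts such as transcripts or HAR files.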
|
||||
|
||||
Avoid the old wide table with separate `steps`, `expected`, and `actual`
|
||||
columns unless the test is purely non-visual and truly needs that breakdown.
|
||||
For UI reports, those columns make screenshot-backed reading harder. Put
|
||||
procedural detail in the row's key observation only when it changes the
|
||||
interpretation of the result.
|
||||
|
||||
Use an extra evidence/detail section only when the inline table cannot carry
|
||||
the material cleanly, such as long CLI transcripts, HAR summaries, or multiple
|
||||
screenshots for one case. In that situation, keep the table evidence cell as an
|
||||
inline visual proof for UI cases or a concise link for non-visual artifacts,
|
||||
then put the longer material under `Verification` or a brief
|
||||
`Additional Evidence` section.
|
||||
|
||||
Status values: `pass` / `fail` / `blocked` (couldn't run — e.g. auth or env
|
||||
missing; a blocked case is not a pass).
|
||||
|
||||
## result.json schema
|
||||
|
||||
```json
{
  "branch": "feat/task-tree",
  "cases": [
    {
      "id": "1",
      "name": "task tree returns nested children",
      "surface": "cli",
      "status": "pass",
      "evidence": ["assets/task-tree.txt"]
    }
  ],
  "commit": "abc1234",
  "createdAt": "2026-06-11T15:30:00+08:00",
  "summary": {
    "total": 1,
    "passed": 1,
    "failed": 0,
    "blocked": 0,
    "score": 100,
    "verdict": "pass"
  },
  "surfaces": ["cli"],
  "title": "Verify task tree API"
}
```
|
||||
|
||||
`score` is optional — use it when the verdict has a subjective component (UI
|
||||
polish, copy quality); omit it for purely binary runs. `verdict` is the single
|
||||
word the user reads first: `pass`, `fail`, or `partial`.
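The `summary` counts can be derived from the per-case statuses rather than typed by hand. The sketch below is illustrative only: `summarize_cases` is a hypothetical helper, and the verdict mapping it uses (all cases pass gives `pass`, no case passes gives `fail`, any other mix gives `partial`) is an assumption, since this document does not define how `partial` is decided. It omits the optional `score`.

```shell
# Hypothetical helper: build the summary object from per-case statuses.
# Assumed verdict mapping (not defined by the spec above):
#   every case passed -> pass; no case passed -> fail; otherwise -> partial.
summarize_cases() {
  total=0 passed=0 failed=0 blocked=0
  for status in "$@"; do
    total=$((total + 1))
    case "$status" in
      pass) passed=$((passed + 1)) ;;
      fail) failed=$((failed + 1)) ;;
      blocked) blocked=$((blocked + 1)) ;;
      *) echo "unknown status: $status" >&2; return 2 ;;
    esac
  done
  if [ "$total" -gt 0 ] && [ "$passed" -eq "$total" ]; then
    verdict=pass
  elif [ "$passed" -eq 0 ]; then
    verdict=fail
  else
    verdict=partial
  fi
  printf '{"total":%d,"passed":%d,"failed":%d,"blocked":%d,"verdict":"%s"}\n' \
    "$total" "$passed" "$failed" "$blocked" "$verdict"
}
```

Calling `summarize_cases pass fail blocked` yields a `partial` verdict with one case in each bucket. A blocked case never counts toward `passed`, which matches the rule that a blocked case is not a pass.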
|
||||
|
||||
## Rules
|
||||
|
||||
- **No evidence, no claim** — every `pass`/`fail` in the case table must link
|
||||
at least one asset. UI cases must inline-embed their primary screenshot/GIF;
|
||||
non-visual CLI/network cases may link transcripts, HAR files, or logs.
|
||||
- **Screenshots must be visually verified** with the Read tool before being
|
||||
cited.
|
||||
- **Report failures faithfully** — a failing case with clear evidence is a good
|
||||
report; a vague green one is not.
|
||||
- If coverage was cut (cases skipped, surfaces not exercised), say so in the
|
||||
Verdict section — silent truncation reads as "covered everything".
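The "No evidence, no claim" rule can be partly enforced with a file check before the verdict is set. This is a rough sketch under stated assumptions: `check_evidence` is a hypothetical helper, it takes the report directory followed by the relative asset paths cited in the case table, and it only confirms that each file exists and is non-empty. It does not replace visually verifying screenshots.

```shell
# Hypothetical helper: confirm every cited asset exists and is non-empty.
# Usage: check_evidence <report-dir> <relative-path>...
# Returns 0 only when all listed files are present; names each missing file on stderr.
check_evidence() {
  report_dir=$1
  shift
  missing=0
  for rel in "$@"; do
    if [ ! -s "$report_dir/$rel" ]; then
      echo "missing or empty evidence: $rel" >&2
      missing=$((missing + 1))
    fi
  done
  [ "$missing" -eq 0 ]
}
```

A typical call would be `check_evidence "$DIR" assets/task-tree.txt assets/case1.png`, run just before filling in the verdict.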
|
||||
+1
-1
@@ -11,7 +11,7 @@
|
||||
// 6. ROLLBACKS — msgN / childN / role drops in the active-topic timeline
|
||||
//
|
||||
// Usage:
|
||||
// bun run .agents/skills/local-testing/scripts/agent-gateway/analyze-events.ts <dump.json>
|
||||
// bun run .agents/skills/agent-testing/scripts/agent-gateway/analyze-events.ts <dump.json>
|
||||
|
||||
import { readFileSync } from 'node:fs';
|
||||
|
||||
+4
-4
@@ -5,16 +5,16 @@
|
||||
// streaming-replay test fixtures.
|
||||
//
|
||||
// Commands:
|
||||
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts install
|
||||
// bun run .agents/skills/agent-testing/scripts/agent-gateway/run.ts install
|
||||
// Bundle probe-events.ts and inject into the CDP-attached browser.
|
||||
// Re-installing clears all buffers and re-patches WebSocket / fetch.
|
||||
//
|
||||
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts dump [name]
|
||||
// bun run .agents/skills/agent-testing/scripts/agent-gateway/run.ts dump [name]
|
||||
// Stop the timeline timer, fetch the capture as JSON, write it to
|
||||
// `.agent-gateway/<name>-<YYYYMMDD-HHmmss>.json`. `name` defaults to
|
||||
// `dump`. Prints the absolute path written.
|
||||
//
|
||||
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts analyze [path]
|
||||
// bun run .agents/skills/agent-testing/scripts/agent-gateway/run.ts analyze [path]
|
||||
// Run analyze-events.ts on the dump. `path` defaults to the most
|
||||
// recently modified file in `.agent-gateway/`.
|
||||
//
|
||||
@@ -28,7 +28,7 @@ import path from 'node:path';
|
||||
import { fileURLToPath } from 'node:url';
|
||||
|
||||
const SCRIPT_DIR = path.dirname(fileURLToPath(import.meta.url));
|
||||
// .agents/skills/local-testing/scripts/agent-gateway/ → 5 levels up
|
||||
// .agents/skills/agent-testing/scripts/agent-gateway/ → 5 levels up
|
||||
const PROJECT_ROOT = path.resolve(SCRIPT_DIR, '../../../../..');
|
||||
const DUMP_DIR = path.join(PROJECT_ROOT, '.agent-gateway');
|
||||
|
||||
+95
@@ -0,0 +1,95 @@
|
||||
#!/usr/bin/env bash
|
||||
# app-probe.sh — standardized probes for a running LobeHub app (Electron via
|
||||
# CDP, or a web agent-browser session). Use these instead of hand-rolling
|
||||
# `window.__LOBE_STORES` eval snippets — especially the auth check.
|
||||
#
|
||||
# Usage:
|
||||
# app-probe.sh auth # { isSignedIn, userId } from the user store
|
||||
# app-probe.sh route # current SPA route
|
||||
# app-probe.sh ops # running chat operations (type / startTime, plus running and total counts)
|
||||
# app-probe.sh goto <path> # navigate the SPA to a route (full reload), e.g. goto /agent/agt_xxx
|
||||
# app-probe.sh errors-install # install a console.error interceptor
|
||||
# app-probe.sh errors # dump errors captured since errors-install
|
||||
#
|
||||
# Target selection (default: Electron over CDP 9222):
|
||||
# AB_TARGET="--cdp 9222" # Electron (default; CDP_PORT also honored)
|
||||
# AB_TARGET="--session lobehub-dev" # web agent-browser session
|
||||
#
|
||||
# Common routes (desktop SPA): / /agent/<agentId> /agent/<agentId>/<topicId>
|
||||
# /task /task/<taskId> /page /settings /community
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
AB_TARGET="${AB_TARGET:---cdp ${CDP_PORT:-9222}}"
|
||||
|
||||
run_eval() {
|
||||
# shellcheck disable=SC2086
|
||||
agent-browser $AB_TARGET eval --stdin
|
||||
}
|
||||
|
||||
case "${1:-}" in
|
||||
auth)
|
||||
run_eval << 'EVALEOF'
|
||||
(function () {
|
||||
var stores = window.__LOBE_STORES;
|
||||
if (!stores || !stores.user) return JSON.stringify({ ok: false, reason: 'no user store — app not loaded yet?' });
|
||||
var u = stores.user();
|
||||
return JSON.stringify({ ok: !!u.isSignedIn, isSignedIn: !!u.isSignedIn, userId: (u.user && u.user.id) || null });
|
||||
})()
|
||||
EVALEOF
|
||||
;;
|
||||
route)
|
||||
run_eval << 'EVALEOF'
|
||||
location.pathname + location.search + location.hash
|
||||
EVALEOF
|
||||
;;
|
||||
ops)
|
||||
run_eval << 'EVALEOF'
|
||||
(function () {
|
||||
var stores = window.__LOBE_STORES;
|
||||
if (!stores || !stores.chat) return JSON.stringify({ ok: false, reason: 'no chat store — open a conversation first' });
|
||||
var ops = Object.values(stores.chat().operations || {});
|
||||
var running = ops.filter(function (o) { return o.status === 'running'; });
|
||||
return JSON.stringify({
|
||||
ok: true,
|
||||
running: running.map(function (o) { return { startTime: o.metadata && o.metadata.startTime, type: o.type }; }),
|
||||
runningCount: running.length,
|
||||
total: ops.length,
|
||||
});
|
||||
})()
|
||||
EVALEOF
|
||||
;;
|
||||
goto)
|
||||
TARGET_PATH="${2:?Usage: app-probe.sh goto <path>}"
|
||||
# shellcheck disable=SC2086
|
||||
agent-browser $AB_TARGET eval "location.href = '$TARGET_PATH'" > /dev/null
|
||||
sleep 2
|
||||
bash "${BASH_SOURCE[0]}" route
|
||||
;;
|
||||
errors-install)
|
||||
run_eval << 'EVALEOF'
|
||||
(function () {
|
||||
window.__CAPTURED_ERRORS = [];
|
||||
var orig = console.error;
|
||||
console.error = function () {
|
||||
var msg = Array.from(arguments).map(function (a) {
|
||||
if (a instanceof Error) return a.message;
|
||||
return typeof a === 'object' ? JSON.stringify(a) : String(a);
|
||||
}).join(' ');
|
||||
window.__CAPTURED_ERRORS.push(msg);
|
||||
orig.apply(console, arguments);
|
||||
};
|
||||
return 'installed';
|
||||
})()
|
||||
EVALEOF
|
||||
;;
|
||||
errors)
|
||||
run_eval << 'EVALEOF'
|
||||
JSON.stringify(window.__CAPTURED_ERRORS || 'interceptor not installed — run errors-install first')
|
||||
EVALEOF
|
||||
;;
|
||||
*)
|
||||
echo "Usage: $0 {auth|route|ops|goto <path>|errors-install|errors}" >&2
|
||||
exit 2
|
||||
;;
|
||||
esac
|
||||
+407
@@ -0,0 +1,407 @@
|
||||
#!/usr/bin/env bash
|
||||
# init-dev-env.sh — self-contained local dev env for agent testing.
|
||||
#
|
||||
# This script initializes the env needed to run LobeHub's normal local dev
|
||||
# server without depending on a root .env file. It follows the same shape as
|
||||
# the e2e bootstrap (Postgres + migrations + auth/key-vault/S3 test env), but
|
||||
# starts the repo's dev server, not the standalone e2e server.
|
||||
#
|
||||
# Guardrail: if repo-root .env exists, every non-help command exits immediately.
|
||||
# Existing local config always wins.
|
||||
#
|
||||
# Usage:
|
||||
# init-dev-env.sh env # print shell exports
|
||||
# init-dev-env.sh write [file] # write a source-able env file
|
||||
# init-dev-env.sh setup-db # start local Postgres and run migrations
|
||||
# init-dev-env.sh migrate # run DB migrations against the configured DB
|
||||
# init-dev-env.sh seed-user # seed the baseline test user + CLI API key
|
||||
# init-dev-env.sh qstash # run local Upstash QStash dev server
|
||||
# init-dev-env.sh dev-next # exec `pnpm run dev:next` with this env
|
||||
# init-dev-env.sh dev # exec `bun run dev` with this env
|
||||
# init-dev-env.sh clean-db # remove the managed Postgres container
|
||||
#
|
||||
# Overrides:
|
||||
# SERVER_PORT=3010 DB_PORT=5433 DB_CONTAINER=lobehub-agent-testing-postgres QSTASH_DEV_PORT=8080
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
|
||||
ROOT_ENV_FILE="$REPO_ROOT/.env"
|
||||
|
||||
SERVER_PORT="${SERVER_PORT:-3010}"
|
||||
DB_PORT="${DB_PORT:-5433}"
|
||||
DB_CONTAINER="${DB_CONTAINER:-lobehub-agent-testing-postgres}"
|
||||
DATABASE_URL="${DATABASE_URL:-postgresql://postgres:postgres@localhost:${DB_PORT}/postgres}"
|
||||
ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-dev.env"
|
||||
CLI_ENV_FILE_DEFAULT="$REPO_ROOT/.records/env/agent-testing-cli.env"
|
||||
AGENT_TESTING_API_KEY="${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}"
|
||||
QSTASH_DEV_PORT="${QSTASH_DEV_PORT:-8080}"
|
||||
QSTASH_LOCAL_TOKEN="${QSTASH_LOCAL_TOKEN:-eyJVc2VySUQiOiJkZWZhdWx0VXNlciIsIlBhc3N3b3JkIjoiZGVmYXVsdFBhc3N3b3JkIn0=}"
|
||||
QSTASH_LOCAL_CURRENT_SIGNING_KEY="${QSTASH_LOCAL_CURRENT_SIGNING_KEY:-sig_7kYjw48mhY7kAjqNGcy6cr29RJ6r}"
|
||||
QSTASH_LOCAL_NEXT_SIGNING_KEY="${QSTASH_LOCAL_NEXT_SIGNING_KEY:-sig_5ZB6DVzB1wjE8S6rZ7eenA8Pdnhs}"
|
||||
|
||||
ok() { printf ' \033[32m✔\033[0m %s\n' "$1"; }
|
||||
bad() { printf ' \033[31m✘\033[0m %s\n' "$1"; }
|
||||
note() { printf ' %s\n' "$1"; }
|
||||
|
||||
guard_no_root_env() {
|
||||
if [[ -f "$ROOT_ENV_FILE" ]]; then
|
||||
bad "root .env exists: $ROOT_ENV_FILE"
|
||||
note "Use the existing local configuration instead of init-dev-env.sh."
|
||||
note "Start normally from repo root, e.g. pnpm run dev:next or bun run dev."
|
||||
exit 1
|
||||
fi
|
||||
}
|
||||
|
||||
apply_env() {
|
||||
export APP_URL="${APP_URL:-http://localhost:${SERVER_PORT}}"
|
||||
export AUTH_EMAIL_VERIFICATION="${AUTH_EMAIL_VERIFICATION:-0}"
|
||||
export AUTH_SECRET="${AUTH_SECRET:-agent-testing-local-auth-secret-32chars}"
|
||||
export DATABASE_DRIVER="${DATABASE_DRIVER:-node}"
|
||||
export DATABASE_URL
|
||||
export FEATURE_FLAGS="${FEATURE_FLAGS:--agent_self_iteration}"
|
||||
export KEY_VAULTS_SECRET="${KEY_VAULTS_SECRET:-r2gbBPKyJ8ZRKCLKt+I3DImfcL+wGxaQyRC56xtm9Uk=}"
|
||||
export NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION="${NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION:-0}"
|
||||
export NODE_OPTIONS="${NODE_OPTIONS:---max-old-space-size=6144}"
|
||||
export PORT="${PORT:-$SERVER_PORT}"
|
||||
export QSTASH_CURRENT_SIGNING_KEY="${QSTASH_CURRENT_SIGNING_KEY:-$QSTASH_LOCAL_CURRENT_SIGNING_KEY}"
|
||||
export QSTASH_DEV_PORT
|
||||
export QSTASH_NEXT_SIGNING_KEY="${QSTASH_NEXT_SIGNING_KEY:-$QSTASH_LOCAL_NEXT_SIGNING_KEY}"
|
||||
export QSTASH_TOKEN="${QSTASH_TOKEN:-$QSTASH_LOCAL_TOKEN}"
|
||||
export QSTASH_URL="${QSTASH_URL:-http://127.0.0.1:${QSTASH_DEV_PORT}}"
|
||||
export S3_ACCESS_KEY_ID="${S3_ACCESS_KEY_ID:-agent-testing-access-key}"
|
||||
export S3_BUCKET="${S3_BUCKET:-agent-testing-bucket}"
|
||||
export S3_ENDPOINT="${S3_ENDPOINT:-https://agent-testing-s3.localhost}"
|
||||
export S3_SECRET_ACCESS_KEY="${S3_SECRET_ACCESS_KEY:-agent-testing-secret-key}"
|
||||
}
|
||||
|
||||
env_keys() {
|
||||
printf '%s\n' \
|
||||
APP_URL \
|
||||
AUTH_EMAIL_VERIFICATION \
|
||||
AUTH_SECRET \
|
||||
DATABASE_DRIVER \
|
||||
DATABASE_URL \
|
||||
FEATURE_FLAGS \
|
||||
KEY_VAULTS_SECRET \
|
||||
NEXT_PUBLIC_AUTH_EMAIL_VERIFICATION \
|
||||
NODE_OPTIONS \
|
||||
PORT \
|
||||
QSTASH_CURRENT_SIGNING_KEY \
|
||||
QSTASH_DEV_PORT \
|
||||
QSTASH_NEXT_SIGNING_KEY \
|
||||
QSTASH_TOKEN \
|
||||
QSTASH_URL \
|
||||
S3_ACCESS_KEY_ID \
|
||||
S3_BUCKET \
|
||||
S3_ENDPOINT \
|
||||
S3_SECRET_ACCESS_KEY
|
||||
}
|
||||
|
||||
print_env() {
|
||||
apply_env
|
||||
while IFS= read -r key; do
|
||||
printf 'export %s=%q\n' "$key" "${!key}"
|
||||
done < <(env_keys)
|
||||
}
|
||||
|
||||
write_env() {
|
||||
local file="${1:-$ENV_FILE_DEFAULT}"
|
||||
apply_env
|
||||
mkdir -p "$(dirname "$file")"
|
||||
{
|
||||
printf '# Source this file before starting LobeHub local dev server.\n'
|
||||
printf '# Generated by %s\n' "$0"
|
||||
while IFS= read -r key; do
|
||||
printf 'export %s=%q\n' "$key" "${!key}"
|
||||
done < <(env_keys)
|
||||
} > "$file"
|
||||
ok "wrote env file: $file"
|
||||
note "source it with: source $file"
|
||||
}
|
||||
|
||||
require_docker() {
|
||||
if ! command -v docker > /dev/null 2>&1; then
|
||||
bad "docker CLI is not available"
|
||||
note "Install/start Docker Desktop, or provide DATABASE_URL for an existing Postgres."
|
||||
return 1
|
||||
fi
|
||||
}
|
||||
|
||||
wait_for_db() {
|
||||
printf ' waiting for Postgres'
|
||||
until docker exec "$DB_CONTAINER" pg_isready -U postgres > /dev/null 2>&1; do
|
||||
printf '.'
|
||||
sleep 2
|
||||
done
|
||||
printf '\n'
|
||||
}
|
||||
|
||||
start_db() {
|
||||
require_docker
|
||||
|
||||
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
|
||||
ok "Postgres container already running: $DB_CONTAINER"
|
||||
elif docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
|
||||
docker start "$DB_CONTAINER" > /dev/null
|
||||
ok "started existing Postgres container: $DB_CONTAINER"
|
||||
else
|
||||
docker run -d \
|
||||
--name "$DB_CONTAINER" \
|
||||
-e POSTGRES_PASSWORD=postgres \
|
||||
-p "${DB_PORT}:5432" \
|
||||
paradedb/paradedb:latest > /dev/null
|
||||
ok "created Postgres container: $DB_CONTAINER"
|
||||
fi
|
||||
|
||||
wait_for_db
|
||||
}
|
||||
|
||||
migrate_db() {
|
||||
apply_env
|
||||
cd "$REPO_ROOT"
|
||||
bun run db:migrate
|
||||
}
|
||||
|
||||
seed_user() {
|
||||
apply_env
|
||||
export AGENT_TESTING_API_KEY
|
||||
export AGENT_TESTING_CLI_ENV_FILE="${AGENT_TESTING_CLI_ENV_FILE:-$CLI_ENV_FILE_DEFAULT}"
|
||||
cd "$REPO_ROOT"
|
||||
node <<'NODE'
|
||||
const bcrypt = require('bcryptjs');
|
||||
const crypto = require('node:crypto');
|
||||
const fs = require('node:fs');
|
||||
const path = require('node:path');
|
||||
const pg = require('pg');
|
||||
|
||||
const databaseUrl = process.env.DATABASE_URL;
|
||||
if (!databaseUrl) {
|
||||
throw new Error('DATABASE_URL is required to seed the baseline test user.');
|
||||
}
|
||||
|
||||
const TEST_USER = {
|
||||
email: 'agent-testing@lobehub.com',
|
||||
fullName: 'Agent Testing User',
|
||||
id: 'user_agent_testing_001',
|
||||
password: 'TestPassword123!',
|
||||
username: 'agent_testing_user',
|
||||
};
|
||||
|
||||
const TEST_API_KEY = {
|
||||
id: 'api_key_agent_testing_001',
|
||||
key: process.env.AGENT_TESTING_API_KEY || 'sk-lh-agenttesting0001',
|
||||
name: 'Agent Testing CLI API Key',
|
||||
};
|
||||
|
||||
const validateApiKeyFormat = (apiKey) => /^sk-lh-[\da-z]{16}$/.test(apiKey);
|
||||
|
||||
const hashApiKey = (apiKey) => {
|
||||
const secret = process.env.KEY_VAULTS_SECRET;
|
||||
if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
|
||||
|
||||
return crypto.createHmac('sha256', secret).update(apiKey).digest('hex');
|
||||
};
|
||||
|
||||
const encryptWithKeyVaultsSecret = (plaintext) => {
|
||||
const secret = process.env.KEY_VAULTS_SECRET;
|
||||
if (!secret) throw new Error('KEY_VAULTS_SECRET is required to seed the baseline API key.');
|
||||
|
||||
const rawKey = Buffer.from(secret, 'base64');
|
||||
if (![16, 24, 32].includes(rawKey.length)) {
|
||||
throw new Error(
|
||||
`KEY_VAULTS_SECRET must decode to 16, 24, or 32 bytes, got ${rawKey.length} bytes.`,
|
||||
);
|
||||
}
|
||||
|
||||
const iv = crypto.randomBytes(12);
|
||||
const cipher = crypto.createCipheriv(`aes-${rawKey.length * 8}-gcm`, rawKey, iv);
|
||||
const encrypted = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
|
||||
const authTag = cipher.getAuthTag();
|
||||
|
||||
return `${iv.toString('hex')}:${authTag.toString('hex')}:${encrypted.toString('hex')}`;
|
||||
};
|
||||
|
||||
const writeCliEnvFile = () => {
|
||||
const file = process.env.AGENT_TESTING_CLI_ENV_FILE || '.records/env/agent-testing-cli.env';
|
||||
fs.mkdirSync(path.dirname(file), { recursive: true });
|
||||
fs.writeFileSync(
|
||||
file,
|
||||
[
|
||||
'# Source this file before running LobeHub CLI agent tests.',
|
||||
'# Generated by init-dev-env.sh seed-user',
|
||||
`export LOBE_API_KEY=${TEST_API_KEY.key}`,
|
||||
`export LOBEHUB_CLI_API_KEY="${'${LOBE_API_KEY}'}"`,
|
||||
`export LOBEHUB_SERVER=${process.env.APP_URL}`,
|
||||
'export LOBEHUB_CLI_HOME=.lobehub-dev',
|
||||
'',
|
||||
].join('\n'),
|
||||
);
|
||||
|
||||
return file;
|
||||
};
|
||||
|
||||
const client = new pg.Client({ connectionString: databaseUrl });
|
||||
|
||||
(async () => {
|
||||
if (!validateApiKeyFormat(TEST_API_KEY.key)) {
|
||||
throw new Error(`Invalid AGENT_TESTING_API_KEY format: ${TEST_API_KEY.key}`);
|
||||
}
|
||||
|
||||
await client.connect();
|
||||
const now = new Date().toISOString();
|
||||
const onboarding = JSON.stringify({ finishedAt: now, version: 1 });
|
||||
const passwordHash = await bcrypt.hash(TEST_USER.password, 10);
|
||||
const encryptedApiKey = encryptWithKeyVaultsSecret(TEST_API_KEY.key);
|
||||
const apiKeyHash = hashApiKey(TEST_API_KEY.key);
|
||||
|
||||
await client.query(
|
||||
`INSERT INTO users (id, email, normalized_email, username, full_name, email_verified, onboarding, created_at, updated_at, last_active_at)
|
||||
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $8, $8)
|
||||
ON CONFLICT (id) DO UPDATE SET onboarding = $7, updated_at = $8`,
|
||||
[
|
||||
TEST_USER.id,
|
||||
TEST_USER.email,
|
||||
TEST_USER.email.toLowerCase(),
|
||||
TEST_USER.username,
|
||||
TEST_USER.fullName,
|
||||
true,
|
||||
onboarding,
|
||||
now,
|
||||
],
|
||||
);
|
||||
|
||||
await client.query(
|
||||
`INSERT INTO accounts (id, user_id, account_id, provider_id, password, created_at, updated_at)
|
||||
VALUES ($1, $2, $3, $4, $5, $6, $6)
|
||||
ON CONFLICT DO NOTHING`,
|
||||
[
|
||||
'agent_testing_account_001',
|
||||
TEST_USER.id,
|
||||
TEST_USER.email,
|
||||
'credential',
|
||||
passwordHash,
|
||||
now,
|
||||
],
|
||||
);
|
||||
|
||||
await client.query(
|
||||
`INSERT INTO api_keys (id, name, key, key_hash, enabled, expires_at, user_id, workspace_id, created_at, updated_at)
|
||||
VALUES ($1, $2, $3, $4, $5, NULL, $6, NULL, $7, $7)
|
||||
ON CONFLICT (id) DO UPDATE
|
||||
SET name = EXCLUDED.name,
|
||||
key = EXCLUDED.key,
|
||||
key_hash = EXCLUDED.key_hash,
|
||||
enabled = EXCLUDED.enabled,
|
||||
expires_at = NULL,
|
||||
updated_at = EXCLUDED.updated_at`,
|
||||
[
|
||||
TEST_API_KEY.id,
|
||||
TEST_API_KEY.name,
|
||||
encryptedApiKey,
|
||||
apiKeyHash,
|
||||
true,
|
||||
TEST_USER.id,
|
||||
now,
|
||||
],
|
||||
);
|
||||
|
||||
const cliEnvFile = writeCliEnvFile();
|
||||
|
||||
console.log('seeded baseline user:');
|
||||
console.log(` email: ${TEST_USER.email}`);
|
||||
console.log(` password: ${TEST_USER.password}`);
|
||||
console.log('seeded baseline API key:');
|
||||
console.log(` LOBE_API_KEY: ${TEST_API_KEY.key}`);
|
||||
console.log(` CLI env: ${cliEnvFile}`);
|
||||
})()
|
||||
.finally(() => client.end())
|
||||
.catch((error) => {
|
||||
console.error(error);
|
||||
process.exit(1);
|
||||
});
|
||||
NODE
|
||||
}
|
||||
|
||||
cmd_status() {
|
||||
apply_env
|
||||
echo "agent-testing local dev env:"
|
||||
note "APP_URL=$APP_URL"
|
||||
note "DATABASE_URL=$DATABASE_URL"
|
||||
note "PORT=$PORT"
|
||||
note "QSTASH_URL=$QSTASH_URL"
|
||||
if command -v docker > /dev/null 2>&1; then
|
||||
ok "docker CLI available"
|
||||
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
|
||||
ok "managed Postgres running: $DB_CONTAINER"
|
||||
else
|
||||
note "managed Postgres is not running: $DB_CONTAINER"
|
||||
fi
|
||||
else
|
||||
bad "docker CLI is not available"
|
||||
fi
|
||||
}
|
||||
|
||||
cmd_qstash() {
|
||||
apply_env
|
||||
cd "$REPO_ROOT"
|
||||
note "starting local QStash dev server at $QSTASH_URL"
|
||||
note "keep this process running while testing workflow paths"
|
||||
exec pnpm run qstash -- -port "$QSTASH_DEV_PORT"
|
||||
}
|
||||
|
||||
cmd_dev_next() {
|
||||
apply_env
|
||||
cd "$REPO_ROOT"
|
||||
exec pnpm run dev:next
|
||||
}
|
||||
|
||||
cmd_dev() {
|
||||
apply_env
|
||||
cd "$REPO_ROOT"
|
||||
exec bun run dev
|
||||
}
|
||||
|
||||
cmd_clean_db() {
|
||||
require_docker
|
||||
if docker ps --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
|
||||
docker stop "$DB_CONTAINER" > /dev/null
|
||||
fi
|
||||
if docker ps -a --format '{{.Names}}' | grep -Fxq "$DB_CONTAINER"; then
|
||||
docker rm "$DB_CONTAINER" > /dev/null
|
||||
ok "removed Postgres container: $DB_CONTAINER"
|
||||
else
|
||||
note "Postgres container not found: $DB_CONTAINER"
|
||||
fi
|
||||
}
|
||||
|
||||
usage() {
|
||||
sed -n '3,24p' "$0" >&2
|
||||
}
|
||||
|
||||
COMMAND="${1:-status}"
|
||||
|
||||
case "$COMMAND" in
|
||||
help|-h|--help) usage; exit 0 ;;
|
||||
*) guard_no_root_env ;;
|
||||
esac
|
||||
|
||||
case "$COMMAND" in
|
||||
env) print_env ;;
|
||||
write) shift; write_env "${1:-}" ;;
|
||||
setup-db)
|
||||
start_db
|
||||
migrate_db
|
||||
;;
|
||||
migrate) migrate_db ;;
|
||||
seed-user) seed_user ;;
|
||||
qstash) cmd_qstash ;;
|
||||
dev-next) cmd_dev_next ;;
|
||||
dev) cmd_dev ;;
|
||||
clean-db) cmd_clean_db ;;
|
||||
status) cmd_status ;;
|
||||
*)
|
||||
usage
|
||||
exit 2
|
||||
;;
|
||||
esac
|
||||
+61
@@ -0,0 +1,61 @@
|
||||
#!/usr/bin/env bash
|
||||
# record-gif.sh — capture a frame sequence via agent-browser (CDP) and
|
||||
# synthesize a GIF for embedding in a test report.
|
||||
#
|
||||
# Use this whenever the asserted behavior is about CHANGE OVER TIME —
|
||||
# streaming output, a ticking timer, loading states, animations. A static
|
||||
# screenshot cannot prove those; a GIF can. Cloud-portable: frames come from
|
||||
# CDP rendering, no OS-level screen capture.
|
||||
#
|
||||
# Usage:
|
||||
# record-gif.sh <output.gif> <duration_seconds> [fps]
|
||||
#
|
||||
# AB_TARGET="--cdp 9222" # Electron (default; CDP_PORT honored)
|
||||
# AB_TARGET="--session lobehub-dev" # web agent-browser session
|
||||
# GIF_WIDTH=960 # output width (px), default 960
|
||||
#
|
||||
# Requires ffmpeg (`brew install ffmpeg`). Effective fps is capped by
|
||||
# screenshot latency (~0.3-0.5s per frame); 1-2 fps is the realistic range.
|
||||
#
|
||||
# Example — record a 12s run and embed it in the report:
|
||||
# ./record-gif.sh "$DIR/assets/case2-tray-running.gif" 12 2 &
|
||||
# GIF_PID=$!
|
||||
# # ... trigger the streaming behavior ...
|
||||
# wait $GIF_PID
|
||||
|
||||
set -euo pipefail

OUT="${1:?Usage: record-gif.sh <output.gif> <duration_seconds> [fps]}"
DUR="${2:?Usage: record-gif.sh <output.gif> <duration_seconds> [fps]}"
FPS="${3:-2}"
AB_TARGET="${AB_TARGET:---cdp ${CDP_PORT:-9222}}"
GIF_WIDTH="${GIF_WIDTH:-960}"

command -v ffmpeg > /dev/null || {
  echo "ffmpeg not found - install with: brew install ffmpeg" >&2
  exit 1
}

TMP=$(mktemp -d)
trap 'rm -rf "$TMP"' EXIT

FRAMES=$((DUR * FPS))
INTERVAL=$(python3 -c "print(1 / $FPS)")

for i in $(seq -f '%04g' 1 "$FRAMES"); do
  # shellcheck disable=SC2086
  agent-browser $AB_TARGET screenshot "$TMP/frame-$i.png" > /dev/null 2>&1 || true
  sleep "$INTERVAL"
done

CAPTURED=$(find "$TMP" -name 'frame-*.png' | wc -l | tr -d ' ')
[ "$CAPTURED" -gt 0 ] || {
  echo "no frames captured - is the app reachable via $AB_TARGET?" >&2
  exit 1
}

ffmpeg -y -loglevel error -framerate "$FPS" -pattern_type glob -i "$TMP/frame-*.png" \
  -vf "fps=$FPS,scale=$GIF_WIDTH:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" \
  "$OUT"

echo "$OUT ($CAPTURED frames @ ${FPS}fps)"
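The frame count and per-frame sleep interval that record-gif.sh derives from its duration and fps arguments could also be computed without python3. The following is only a minimal sketch under stated assumptions, not the script's implementation: `frame_plan` is a hypothetical helper name, and an `awk` is assumed to be on PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of record-gif.sh): derive the frame count and
# the per-frame sleep interval from a duration in seconds and a target fps.
frame_plan() {
  local dur="$1" fps="$2"
  # LC_ALL=C keeps "." as the decimal separator regardless of the locale.
  LC_ALL=C awk -v d="$dur" -v f="$fps" 'BEGIN {
    if (f <= 0 || d <= 0) exit 1
    printf "%d %.3f\n", d * f, 1 / f
  }'
}

# Usage: split the "<frames> <interval>" pair into two variables.
plan="$(frame_plan 12 2)"
frames="${plan% *}"
interval="${plan#* }"
echo "frames=$frames interval=$interval"
```

As in the script, the interval is the nominal gap between frames; it does not subtract the time each screenshot takes, so a real recording runs longer than the requested duration.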
+88
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# report-init.sh - scaffold a structured test report under .records/reports/.
#
# Format spec and evidence rules: ../references/report.md
#
# Usage:
#   report-init.sh <slug> [title]
#
# Prints the report directory path (capture it: DIR=$(report-init.sh my-test)).

set -euo pipefail

SLUG="${1:?Usage: report-init.sh <slug> [title]}"
TITLE="${2:-$SLUG}"

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"
TS="$(date +%Y%m%d-%H%M%S)"
DIR="$REPO_ROOT/.records/reports/$TS-$SLUG"
mkdir -p "$DIR/assets"

BRANCH=$(git -C "$REPO_ROOT" branch --show-current 2> /dev/null || echo "unknown")
COMMIT=$(git -C "$REPO_ROOT" rev-parse --short HEAD 2> /dev/null || echo "unknown")
DATE_HUMAN=$(date '+%Y-%m-%d %H:%M')
DATE_ISO=$(date '+%Y-%m-%dT%H:%M:%S%z')

cat > "$DIR/report.md" << EOF
# Test Report: $TITLE

## Scope

<!-- Test goals / scope of change / key risks -->

- Branch: \`$BRANCH\`
- Current commit: \`$COMMIT\`
- Date: $DATE_HUMAN
- Surface: <!-- CLI / Electron + CDP / Web / Bot:<platform> -->
- Test page / entry point: <!-- e.g. /settings or http://localhost:3010 -->
- Focus: <!-- the experience, feature, or regression this round cares about most -->

## Cases

| # | Case | Result | Key observations | Evidence |
| - | ---- | ---- | -------- | ---- |
| 1 | | Pending | | ![](assets/case1.png) |

## Conclusion

Overall verdict: \`pending\`.

<!-- Summarize in 1-2 paragraphs what the user most needs to know; failures and blockers must state their impact explicitly. -->

Still to handle / follow up:

- <!-- TODO -->

## Verification this round

<!-- If there was automated or command-line verification, keep the condensed commands and results; otherwise write "No additional automated verification was run". -->

\`\`\`bash
# command
\`\`\`

Results:

- <!-- TODO -->

## Score

- Passed: 0
- Failed: 0
- Blocked: 0
- Score: - / 100
EOF

cat > "$DIR/result.json" << EOF
{
  "title": "$TITLE",
  "createdAt": "$DATE_ISO",
  "branch": "$BRANCH",
  "commit": "$COMMIT",
  "surfaces": [],
  "cases": [],
  "summary": { "total": 0, "passed": 0, "failed": 0, "blocked": 0, "verdict": "pending" }
}
EOF

echo "$DIR"
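report-init.sh interpolates the title straight into result.json, so a title containing a double quote or a backslash would yield invalid JSON. One way to guard against that is sketched below. `json_escape` is a hypothetical helper, not something the script defines, and it only covers backslashes and double quotes; control characters such as newlines would need extra handling.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of report-init.sh): escape the two characters
# that most commonly break a JSON string literal. Backslashes are handled
# first so the backslashes added for quotes are not doubled again.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Usage: build a one-field JSON object from an arbitrary title.
title='Tray "running" state'
printf '{ "title": "%s" }\n' "$(json_escape "$title")"
```

Escaping at the point of interpolation keeps the heredoc readable; the scripts in this change already use python3 in places, so `json.dumps` would be an equally reasonable route.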
+553
@@ -0,0 +1,553 @@
#!/usr/bin/env bash
# setup-auth.sh - one-stop auth setup & check for local agent testing.
#
# Auth is the gate for all automated testing: prepare it BEFORE writing any
# test step. Background and failure modes: ../references/auth.md
#
# Usage:
#   setup-auth.sh status                # check server + CLI + web + Electron readiness
#   setup-auth.sh status --surface web  # check only the Web surface gate
#   setup-auth.sh cli-seed              # configure CLI API-key auth from seeded local env
#   setup-auth.sh cli                   # interactive CLI device-code login (run by a human)
#   setup-auth.sh open-chrome           # open SERVER_URL in Chrome and show DevTools
#   setup-auth.sh web-seed              # sign in seeded user and inject cookies automatically
#   setup-auth.sh web                   # stdin = Cookie header -> inject into agent-browser session
#   setup-auth.sh web-verify            # live-check the agent-browser session is authenticated
#
# Env:
#   SERVER_URL  (default from test-env.sh)          dev server under test
#   SESSION     (default lobehub-dev)               agent-browser session name
#   AUTH_DIR    (default ~/.lobehub-agent-testing)  where web state is persisted
#   SEED_EMAIL / SEED_PASSWORD                      seeded better-auth login

set -euo pipefail

REPO_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../../.." && pwd)"

workspace_root_for_port() {
  local root="$REPO_ROOT"
  local name
  name="$(basename "$root")"

  if [[ "$name" == "lobehub" ]]; then
    local parent
    parent="$(cd "$root/.." && pwd)"
    local parent_name
    parent_name="$(basename "$parent")"
    if [[ "$parent_name" == lobehub-cloud* ]]; then
      root="$parent"
    fi
  fi

  printf '%s\n' "$root"
}

default_server_url() {
  local env_resolver resolved
  env_resolver="$(dirname "${BASH_SOURCE[0]}")/test-env.sh"
  if [[ -x "$env_resolver" ]]; then
    resolved="$("$env_resolver" --value SERVER_URL 2> /dev/null || true)"
    if [[ -n "$resolved" ]]; then
      printf '%s\n' "$resolved"
      return 0
    fi
  fi

  local root name suffix port
  root="$(workspace_root_for_port)"
  name="$(basename "$root")"

  case "$name" in
    lobehub-cloud)
      port=3020
      ;;
    lobehub-cloud-*)
      suffix="${name#lobehub-cloud-}"
      if [[ "$suffix" =~ ^[0-9]+$ ]]; then
        port=$((3020 + 10#$suffix))
      else
        port=3010
      fi
      ;;
    *)
      port=3010
      ;;
  esac

  printf 'http://localhost:%s\n' "$port"
}

SERVER_URL="${SERVER_URL:-$(default_server_url)}"
SESSION="${SESSION:-lobehub-dev}"
AUTH_DIR="${AUTH_DIR:-$HOME/.lobehub-agent-testing}"
STATE_FILE="$AUTH_DIR/web-state.json"
CLI_HOME_NAME="${LOBEHUB_CLI_HOME:-.lobehub-dev}"
CLI_HOME="$HOME/${CLI_HOME_NAME#/}"
CLI_CREDENTIALS_FILE="$CLI_HOME/credentials.json"
SEED_EMAIL="${SEED_EMAIL:-agent-testing@lobehub.com}"
SEED_PASSWORD="${SEED_PASSWORD:-TestPassword123!}"
SEED_API_KEY="${SEED_API_KEY:-${AGENT_TESTING_API_KEY:-sk-lh-agenttesting0001}}"
CLI_ENV_FILE="${CLI_ENV_FILE:-$REPO_ROOT/.records/env/agent-testing-cli.env}"

ok() { printf ' \033[32m✔\033[0m %s\n' "$1"; }
bad() { printf ' \033[31m✘\033[0m %s\n' "$1"; }
note() { printf ' %s\n' "$1"; }

usage() {
  cat << EOF
Usage:
  $0 status [--surface all|cli|web|electron]
  $0 cli-seed
  $0 cli
  $0 open-chrome [--dry-run]
  $0 web-seed
  $0 web
  $0 web-verify

Env:
  SERVER_URL=$SERVER_URL
  SESSION=$SESSION
  AUTH_DIR=$AUTH_DIR
  SEED_EMAIL=$SEED_EMAIL
  CLI_HOME=$CLI_HOME
EOF
}

check_server() {
  local code
  code=$(curl -s -o /dev/null -w '%{http_code}' "$SERVER_URL/" 2> /dev/null || true)
  if [[ "$code" =~ ^[23] ]]; then
    ok "dev server reachable at $SERVER_URL"
  else
    bad "dev server NOT reachable at $SERVER_URL (http_code='$code')"
    note "start it: pnpm run dev:next (see references/dev-server.md)"
    return 1
  fi
}

check_cli() {
  local api_key="${LOBEHUB_CLI_API_KEY:-${LOBE_API_KEY:-}}"
  if [[ -n "$api_key" ]]; then
    local body_file code
    body_file="$(mktemp)"
    code=$(curl -sS -o "$body_file" -w '%{http_code}' \
      -H "Authorization: Bearer $api_key" \
      "$SERVER_URL/api/v1/users/me?includeCount=0" 2> /dev/null || true)

    if [[ "$code" =~ ^[23] ]]; then
      rm -f "$body_file"
      ok "CLI API-key auth valid for $SERVER_URL"
      return 0
    fi

    bad "CLI API-key auth failed for $SERVER_URL (http_code='$code')"
    note "seed the local API key first:"
    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
    note "source $CLI_ENV_FILE"
    rm -f "$body_file"
    return 1
  fi

  # grep -F matches SERVER_URL literally, so its dots are not regex wildcards.
  if [[ -f "$CLI_HOME/settings.json" ]] && grep -Fq "$SERVER_URL" "$CLI_HOME/settings.json" && [[ -f "$CLI_CREDENTIALS_FILE" ]]; then
    ok "CLI device-code credentials configured for $SERVER_URL (creds: $CLI_HOME)"
  else
    bad "CLI not logged in to $SERVER_URL"
    note "automated path:"
    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user && source $CLI_ENV_FILE && $0 cli-seed"
    note "interactive fallback:"
    note "cd apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server $SERVER_URL"
    return 1
  fi
}

check_web() {
  if [[ -f "$STATE_FILE" ]]; then
    ok "web auth state saved ($STATE_FILE)"
  else
    bad "no web auth state for agent-browser"
    note "for the seeded local user, run: $0 web-seed"
    note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
    note "pbpaste | $0 web (see references/auth.md)"
    return 1
  fi
  cmd_web_verify --skip-server-check
}

check_agent_browser() {
  if command -v agent-browser > /dev/null 2>&1; then
    ok "agent-browser available"
  else
    bad "agent-browser command not found"
    note "install or expose agent-browser before Web/Electron UI testing"
    return 1
  fi
}

check_electron() {
  local cdp_port="${CDP_PORT:-9222}"
  if ! curl -s -o /dev/null --max-time 2 "http://localhost:$cdp_port/json/version" 2> /dev/null; then
    note "electron: not running (CDP $cdp_port unreachable) - start with electron-dev.sh; check skipped"
    return 0
  fi
  local probe result
  probe="$(dirname "${BASH_SOURCE[0]}")/app-probe.sh"
  result=$(bash "$probe" auth 2> /dev/null || true)
  # agent-browser eval returns the JSON string with escaped quotes; normalize.
  result="${result//\\/}"
  if [[ "$result" == *'"isSignedIn":true'* ]]; then
    ok "electron app signed in ($result)"
  else
    bad "electron app NOT signed in ($result)"
    note "log in once manually inside the app (state persists across restarts)"
    return 1
  fi
}

cmd_status() {
  local surface="all"
  while [[ $# -gt 0 ]]; do
    case "$1" in
      --surface)
        if [[ $# -lt 2 ]]; then
          echo "--surface requires one of: all, cli, web, electron" >&2
          return 2
        fi
        surface="${2:-}"
        shift 2
        ;;
      --surface=*)
        surface="${1#*=}"
        shift
        ;;
      all|cli|web|electron)
        surface="$1"
        shift
        ;;
      -h|--help)
        usage
        return 0
        ;;
      *)
        echo "unknown status option: $1" >&2
        usage >&2
        return 2
        ;;
    esac
  done

  case "$surface" in
    all|cli|web|electron) ;;
    "")
      echo "--surface requires one of: all, cli, web, electron" >&2
      return 2
      ;;
    *)
      echo "unknown surface: $surface" >&2
      usage >&2
      return 2
      ;;
  esac

  echo "agent-testing auth status (surface=$surface, SERVER_URL=$SERVER_URL):"
  local rc=0
  case "$surface" in
    all)
      check_server || rc=1
      check_cli || rc=1
      check_web || rc=1
      check_electron || rc=1
      ;;
    cli)
      check_server || rc=1
      check_cli || rc=1
      ;;
    web)
      check_server || rc=1
      check_web || rc=1
      ;;
    electron)
      check_electron || rc=1
      ;;
  esac
  if [[ $rc -eq 0 ]]; then
    echo "$surface auth green - safe to start automated testing on this surface."
  else
    echo "$surface auth NOT ready - fix the ✘ items before writing any test step."
  fi
  return $rc
}

cmd_cli() {
  echo "Starting CLI device-code login against $SERVER_URL ..."
  echo "(opens a browser authorization - must be run by a human in a terminal)"
  cd "$REPO_ROOT/apps/cli"
  LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server "$SERVER_URL"
}

write_cli_seed_env() {
  mkdir -p "$(dirname "$CLI_ENV_FILE")"
  cat > "$CLI_ENV_FILE" << EOF
# Source this file before running LobeHub CLI agent tests.
# Generated by setup-auth.sh cli-seed
export LOBE_API_KEY=$SEED_API_KEY
export LOBEHUB_CLI_API_KEY="\${LOBE_API_KEY}"
export LOBEHUB_SERVER=$SERVER_URL
export LOBEHUB_CLI_HOME=.lobehub-dev
EOF
}

write_cli_settings() {
  mkdir -p "$CLI_HOME"
  python3 - "$CLI_HOME/settings.json" "$SERVER_URL" << 'PY'
import json
import os
import sys

path, server_url = sys.argv[1], sys.argv[2]
os.makedirs(os.path.dirname(path), exist_ok=True)
with open(path, "w") as f:
    json.dump({"serverUrl": server_url}, f, indent=2)
    f.write("\n")
os.chmod(path, 0o600)
PY
}

cmd_cli_seed() {
  check_server || return 1
  write_cli_seed_env
  write_cli_settings
  ok "wrote CLI seed env: $CLI_ENV_FILE"
  note "source it before CLI commands: source $CLI_ENV_FILE"
  note "settings saved at: $CLI_HOME/settings.json"
  LOBE_API_KEY="$SEED_API_KEY" LOBEHUB_CLI_API_KEY="$SEED_API_KEY" check_cli
}

cmd_open_chrome() {
  local mode="${1:-}"
  if [[ "$mode" != "" && "$mode" != "--dry-run" ]]; then
    echo "unknown open-chrome option: $mode" >&2
    usage >&2
    return 2
  fi

  if [[ "$mode" == "--dry-run" ]]; then
    echo "would open Google Chrome at $SERVER_URL/"
    echo "would press Cmd+Option+I to open DevTools"
    echo "would open DevTools command menu and run 'Show Network'"
    return 0
  fi

  if [[ "$(uname -s)" != "Darwin" ]]; then
    bad "open-chrome is macOS-only"
    note "open $SERVER_URL/ in your browser and open DevTools manually"
    return 1
  fi

  if ! command -v osascript > /dev/null 2>&1; then
    bad "osascript not found"
    note "open $SERVER_URL/ in Chrome and press Cmd+Option+I manually"
    return 1
  fi

  SERVER_URL="$SERVER_URL" osascript << 'OSA'
set targetUrl to (system attribute "SERVER_URL") & "/"

tell application "Google Chrome"
  activate
  if (count of windows) = 0 then
    make new window
  end if
  tell front window to make new tab with properties {URL:targetUrl}
end tell

delay 1

tell application "System Events"
  tell process "Google Chrome"
    set frontmost to true
    keystroke "i" using {command down, option down}
    delay 1
    keystroke "p" using {command down, shift down}
    delay 0.2
    keystroke "Show Network"
    key code 36
  end tell
end tell
OSA
  ok "opened Chrome at $SERVER_URL/ and requested DevTools Network panel"
}

cookie_header_from_jar() {
  local jar="$1"
  awk '
    BEGIN { first = 1 }
    /^$/ { next }
    /^#/ {
      if ($0 !~ /^#HttpOnly_/) next
      sub(/^#HttpOnly_/, "")
    }
    NF >= 7 {
      if (!first) printf "; "
      printf "%s=%s", $6, $7
      first = 0
    }
    END {
      if (!first) printf "\n"
    }
  ' "$jar"
}

# Build a Playwright storageState file from a raw Cookie header on stdin,
# keeping only the better-auth cookies. See references/auth.md for why the
# header must come from a Network request (HttpOnly) and why httpOnly=false.
cmd_web() {
  mkdir -p "$AUTH_DIR"
  local raw
  raw="$(cat)"
  COOKIE_INPUT="$raw" python3 - "$STATE_FILE" << 'PY'
import json, os, sys, time

raw = os.environ.get("COOKIE_INPUT", "").strip()
cookie_lines = []
for line in raw.splitlines():
    stripped = line.strip()
    if not stripped:
        continue
    if stripped.lower().startswith("cookie:"):
        cookie_lines.append(stripped.split(":", 1)[1].strip())
    else:
        cookie_lines.append(stripped)

raw = "; ".join(cookie_lines)

WANTED = {"better-auth.session_token", "better-auth.session_data", "better-auth.state"}
exp = int(time.time()) + 30 * 24 * 3600  # 30 days

cookies = []
for pair in raw.split(";"):
    pair = pair.strip()
    if "=" not in pair:
        continue
    name, _, value = pair.partition("=")
    if name not in WANTED:
        continue
    cookies.append({
        "name": name,
        "value": value,
        "domain": "localhost",
        "path": "/",
        "expires": exp,
        "httpOnly": False,
        "secure": False,
        "sameSite": "Lax",
    })

if not cookies:
    sys.stderr.write("no better-auth cookies found in input - paste the raw Cookie header from a Network request\n")
    sys.exit(1)

with open(sys.argv[1], "w") as f:
    json.dump({"cookies": cookies, "origins": []}, f, indent=2)
print(f"wrote {len(cookies)} cookie(s) to {sys.argv[1]}")
PY
  cmd_web_verify
}

cmd_web_seed() {
  check_server || return 1
  mkdir -p "$AUTH_DIR"

  local cookie_jar="$AUTH_DIR/web-seed-cookie.jar"
  local response_body="$AUTH_DIR/web-seed-response.json"
  local payload code
  payload="$(
    SEED_EMAIL="$SEED_EMAIL" SEED_PASSWORD="$SEED_PASSWORD" python3 - << 'PY'
import json
import os

print(json.dumps({
    "callbackURL": "/",
    "email": os.environ["SEED_EMAIL"],
    "password": os.environ["SEED_PASSWORD"],
}))
PY
  )"

  code=$(curl -sS -o "$response_body" -w '%{http_code}' \
    -c "$cookie_jar" \
    -H 'Content-Type: application/json' \
    -X POST "$SERVER_URL/api/auth/sign-in/email" \
    --data "$payload" 2> /dev/null || true)

  if [[ ! "$code" =~ ^[23] ]]; then
    bad "seed user sign-in failed at $SERVER_URL/api/auth/sign-in/email (http_code='$code')"
    note "make sure the seed user exists:"
    note "./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user"
    return 1
  fi

  local cookie_header
  cookie_header="$(cookie_header_from_jar "$cookie_jar")"
  if [[ -z "$cookie_header" ]]; then
    bad "seed sign-in succeeded but no cookies were written to $cookie_jar"
    return 1
  fi

  printf '%s\n' "$cookie_header" | cmd_web
}

cmd_web_verify() {
  local skip_server_check="${1:-}"
  if [[ "$skip_server_check" != "--skip-server-check" ]]; then
    check_server || return 1
  fi
  if [[ ! -f "$STATE_FILE" ]]; then
    bad "no web auth state for agent-browser"
    note "for the seeded local user, run: $0 web-seed"
    note "or copy the Cookie header from Chrome DevTools (Network tab), then:"
    note "pbpaste | $0 web"
    return 1
  fi
  check_agent_browser || return 1
  if ! agent-browser --session "$SESSION" state load "$STATE_FILE" > /dev/null; then
    bad "failed to load web auth state into agent-browser session '$SESSION'"
    return 1
  fi
  if ! agent-browser --session "$SESSION" open "$SERVER_URL/" > /dev/null; then
    bad "failed to open $SERVER_URL in agent-browser session '$SESSION'"
    return 1
  fi
  local url
  url=$(agent-browser --session "$SESSION" get url 2> /dev/null || true)
  if [[ -z "$url" ]]; then
    bad "agent-browser session '$SESSION' did not report a current URL"
    return 1
  fi
  if [[ "$url" == *"/signin"* || "$url" == *"/login"* ]]; then
    bad "agent-browser session '$SESSION' NOT authenticated (landed on $url)"
    note "re-copy the Cookie header and re-run: pbpaste | $0 web"
    return 1
  fi
  ok "agent-browser session '$SESSION' authenticated (at $url)"
}

case "${1:-status}" in
  status)
    shift || true
    cmd_status "$@"
    ;;
  cli-seed) cmd_cli_seed ;;
  cli) cmd_cli ;;
  open-chrome)
    shift || true
    cmd_open_chrome "$@"
    ;;
  web-seed) cmd_web_seed ;;
  web) cmd_web ;;
  web-verify) cmd_web_verify ;;
  -h|--help) usage ;;
  *)
    echo "Usage: $0 {status|cli-seed|cli|open-chrome|web-seed|web|web-verify}" >&2
    exit 2
    ;;
esac
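The `web-seed` path depends on turning the cookie jar that `curl -c` writes into a single Cookie header. The sketch below repeats the script's awk logic so it can run on its own and feeds it a small jar; the jar contents are made-up illustration values, and the temp file stands in for `$AUTH_DIR/web-seed-cookie.jar`.

```shell
#!/usr/bin/env bash
# Same awk logic as cookie_header_from_jar in setup-auth.sh, repeated here so
# the example is self-contained.
cookie_header_from_jar() {
  awk '
    BEGIN { first = 1 }
    /^$/ { next }
    /^#/ {
      # Comment lines are skipped, except HttpOnly cookies, which curl marks
      # with a "#HttpOnly_" prefix on the domain field.
      if ($0 !~ /^#HttpOnly_/) next
      sub(/^#HttpOnly_/, "")
    }
    NF >= 7 {
      if (!first) printf "; "
      printf "%s=%s", $6, $7
      first = 0
    }
    END { if (!first) printf "\n" }
  ' "$1"
}

jar="$(mktemp)"
# Illustrative jar (made-up values), tab-separated like curl's output.
printf '%s\n' '# Netscape HTTP Cookie File' '' > "$jar"
printf '#HttpOnly_localhost\tFALSE\t/\tFALSE\t0\tbetter-auth.session_token\tseed.token\n' >> "$jar"
printf 'localhost\tFALSE\t/\tFALSE\t0\ttheme\tdark\n' >> "$jar"

header="$(cookie_header_from_jar "$jar")"
rm -f "$jar"
echo "$header"
```

The helper emits every cookie in the jar, including unrelated ones such as `theme` here; the filtering down to the better-auth cookies happens later, in the Python step of the `web` subcommand.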
+197
@@ -0,0 +1,197 @@
#!/usr/bin/env bash
# Smoke tests for setup-auth.sh. Uses a temporary agent-browser stub and local
# HTTP server, so it does not need real browser auth.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SCRIPT="$SCRIPT_DIR/setup-auth.sh"

fail() {
  echo "FAIL: $*" >&2
  exit 1
}

assert_contains() {
  local file="$1"
  local text="$2"
  grep -Fq "$text" "$file" || fail "expected '$text' in $file"
}

tmp_dir="$(mktemp -d)"
server_pid=""

cleanup() {
  if [[ -n "$server_pid" ]]; then
    kill "$server_pid" > /dev/null 2>&1 || true
    wait "$server_pid" > /dev/null 2>&1 || true
  fi
  rm -rf "$tmp_dir"
}
trap cleanup EXIT
export HOME="$tmp_dir/home"

port="$(python3 - << 'PY'
import socket

sock = socket.socket()
sock.bind(("127.0.0.1", 0))
print(sock.getsockname()[1])
sock.close()
PY
)"

python3 - "$port" << 'PY' > "$tmp_dir/http.log" 2>&1 &
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
import sys


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path.startswith("/api/v1/users/me"):
            if self.headers.get("authorization") != "Bearer sk-lh-agenttesting0001":
                self.send_response(401)
                self.end_headers()
                self.wfile.write(b'{"success":false}')
                return

            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(b'{"success":true,"data":{"id":"user_agent_testing_001"}}')
            return

        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def do_POST(self):
        length = int(self.headers.get("content-length") or "0")
        if length:
            self.rfile.read(length)

        if self.path != "/api/auth/sign-in/email":
            self.send_response(404)
            self.end_headers()
            return

        self.send_response(200)
        self.send_header(
            "Set-Cookie",
            "better-auth.session_token=seed.token; Path=/; HttpOnly; SameSite=Lax",
        )
        self.send_header(
            "Set-Cookie",
            "better-auth.session_data=seed.data; Path=/; HttpOnly; SameSite=Lax",
        )
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(b'{"ok":true}')

    def log_message(self, format, *args):
        return


ThreadingHTTPServer(("localhost", int(sys.argv[1])), Handler).serve_forever()
PY
server_pid="$!"

server_url="http://localhost:$port"
for _ in {1..50}; do
  if curl -s -o /dev/null "$server_url/"; then
    break
  fi
  sleep 0.1
done
curl -s -o /dev/null "$server_url/" || fail "test HTTP server did not start"

mkdir -p "$tmp_dir/bin" "$tmp_dir/auth"
cat > "$tmp_dir/bin/agent-browser" << 'SH'
#!/usr/bin/env bash
set -euo pipefail

if [[ "${1:-}" == "--session" ]]; then
  shift 2
fi

case "${1:-}" in
  state)
    [[ "${2:-}" == "load" ]] || exit 2
    [[ -f "${3:-}" ]] || exit 1
    ;;
  open)
    printf '%s\n' "${2:-}" > "${AGENT_BROWSER_URL_FILE:?}"
    ;;
  get)
    [[ "${2:-}" == "url" ]] || exit 2
    cat "${AGENT_BROWSER_URL_FILE:?}"
    ;;
  *)
    echo "unexpected agent-browser command: $*" >&2
    exit 2
    ;;
esac
SH
chmod +x "$tmp_dir/bin/agent-browser"

export PATH="$tmp_dir/bin:$PATH"
export AUTH_DIR="$tmp_dir/auth"
export SESSION="setup-auth-test"
export SERVER_URL="$server_url"
export AGENT_BROWSER_URL_FILE="$tmp_dir/current-url"

cookie_header="Cookie: foo=bar; better-auth.session_token=test.token; better-auth.session_data=encoded%3D; theme=dark"
printf '%s\n' "$cookie_header" | "$SCRIPT" web > "$tmp_dir/web.out"

python3 - "$AUTH_DIR/web-state.json" << 'PY'
import json, sys

with open(sys.argv[1]) as f:
    state = json.load(f)

names = {cookie["name"] for cookie in state["cookies"]}
expected = {"better-auth.session_token", "better-auth.session_data"}
if names != expected:
    raise SystemExit(f"unexpected cookies: {sorted(names)}")
PY

"$SCRIPT" web-seed > "$tmp_dir/web-seed.out"

python3 - "$AUTH_DIR/web-state.json" << 'PY'
import json, sys

with open(sys.argv[1]) as f:
    state = json.load(f)

values = {cookie["name"]: cookie["value"] for cookie in state["cookies"]}
expected = {
    "better-auth.session_token": "seed.token",
    "better-auth.session_data": "seed.data",
}
if values != expected:
    raise SystemExit(f"unexpected seeded cookies: {values}")
PY

"$SCRIPT" status --surface web > "$tmp_dir/status.out"
assert_contains "$tmp_dir/status.out" "surface=web"
assert_contains "$tmp_dir/status.out" "web auth green"

"$SCRIPT" cli-seed > "$tmp_dir/cli-seed.out"
assert_contains "$tmp_dir/cli-seed.out" "CLI API-key auth valid"
assert_contains "$tmp_dir/cli-seed.out" "settings saved at: $HOME/.lobehub-dev/settings.json"

if "$SCRIPT" status --surface cli > "$tmp_dir/cli-no-env.out"; then
  fail "cli status without API key unexpectedly passed"
fi
assert_contains "$tmp_dir/cli-no-env.out" "CLI not logged in"

LOBEHUB_CLI_API_KEY=sk-lh-agenttesting0001 "$SCRIPT" status --surface cli > "$tmp_dir/cli-status.out"
assert_contains "$tmp_dir/cli-status.out" "CLI API-key auth valid"
assert_contains "$tmp_dir/cli-status.out" "cli auth green"

if printf 'foo=bar\n' | "$SCRIPT" web > "$tmp_dir/invalid.out" 2> "$tmp_dir/invalid.err"; then
  fail "invalid cookie unexpectedly passed"
fi
assert_contains "$tmp_dir/invalid.err" "no better-auth cookies found"

echo "setup-auth tests passed"
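The smoke test above avoids a real browser by shadowing `agent-browser` with a stub placed earlier on PATH. The technique on its own can be sketched as follows; `fake-tool` is a hypothetical command name used only for illustration, and the temp directory is assumed to allow executing files.

```shell
#!/usr/bin/env bash
# Sketch of the PATH-stub technique: a fake executable in a temp directory
# shadows the real command for the rest of this shell session.
stub_dir="$(mktemp -d)"

# "fake-tool" is a hypothetical command name, not a real CLI.
cat > "$stub_dir/fake-tool" << 'SH'
#!/bin/sh
echo "stub called with: $*"
SH
chmod +x "$stub_dir/fake-tool"

# Prepending the directory makes the stub win the PATH lookup.
PATH="$stub_dir:$PATH"
result="$(fake-tool get url)"
rm -rf "$stub_dir"
echo "$result"
```

Having the stub echo or record its arguments, as the test's stub does through `AGENT_BROWSER_URL_FILE`, lets the test assert on how the script under test invoked the tool.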
+377
@@ -0,0 +1,377 @@
#!/usr/bin/env bash
# Print the resolved local test environment for agent-testing.
#
# This is intentionally read-only. It mirrors scripts/runWithEnv.mts precedence:
# .env -> .env.$NODE_ENV -> .env.local -> .env.$NODE_ENV.local, then shell env.

set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)"
NODE_ENV="${NODE_ENV:-development}"

VALUE_APP_URL=""
VALUE_PORT=""
VALUE_SERVER_URL=""
VALUE_AUTH_TRUSTED_ORIGINS=""
VALUE_SPA_PORT=""
VALUE_MOBILE_SPA_PORT=""
VALUE_DESKTOP_PORT=""

SOURCE_APP_URL=""
SOURCE_PORT=""
SOURCE_SERVER_URL=""
SOURCE_AUTH_TRUSTED_ORIGINS=""
SOURCE_SPA_PORT=""
SOURCE_MOBILE_SPA_PORT=""
SOURCE_DESKTOP_PORT=""

LOADED_ENV_FILES=""

keys() {
  printf '%s\n' \
    APP_URL \
    PORT \
    SERVER_URL \
    AUTH_TRUSTED_ORIGINS \
    SPA_PORT \
    MOBILE_SPA_PORT \
    DESKTOP_PORT
}

trim() {
  local value="$1"
  value="${value#"${value%%[![:space:]]*}"}"
  value="${value%"${value##*[![:space:]]}"}"
  printf '%s' "$value"
}

workspace_root() {
  local root="$REPO_ROOT"
  local name
  name="$(basename "$root")"

  if [[ "$name" == "lobehub" ]]; then
    local parent parent_name
    parent="$(cd "$root/.." && pwd)"
    parent_name="$(basename "$parent")"
    if [[ "$parent_name" == lobehub-cloud* ]]; then
      root="$parent"
    fi
  fi

  printf '%s\n' "$root"
}

workspace_offset() {
  local name="$1"

  case "$name" in
    lobehub-cloud)
      printf '0\n'
      ;;
    lobehub-cloud-*)
      local suffix="${name#lobehub-cloud-}"
      if [[ "$suffix" =~ ^[0-9]+$ ]]; then
        printf '%s\n' "$((10#$suffix))"
      else
        printf '\n'
      fi
      ;;
    *)
      printf '\n'
      ;;
  esac
}

default_port() {
  local base="$1"
  local fallback="$2"
  local root name offset
  root="$(workspace_root)"
  name="$(basename "$root")"
  offset="$(workspace_offset "$name")"

  if [[ -n "$offset" ]]; then
    printf '%s\n' "$((base + offset))"
  else
    printf '%s\n' "$fallback"
  fi
}

url_port() {
  local url="$1"
  local hostport
  hostport="${url#*://}"
  hostport="${hostport%%/*}"

  if [[ "$hostport" == *:* ]]; then
    local port="${hostport##*:}"
    if [[ "$port" =~ ^[0-9]+$ ]]; then
      printf '%s\n' "$port"
      return 0
    fi
  fi

  return 1
}

url_origin() {
  local url="$1"
  local scheme rest hostport
  if [[ "$url" == *"://"* ]]; then
    scheme="${url%%://*}"
    rest="${url#*://}"
    hostport="${rest%%/*}"
    printf '%s://%s\n' "$scheme" "$hostport"
  else
    printf '%s\n' "$url"
  fi
}

set_value() {
  local key="$1"
  local value="$2"
  local source="$3"

  case "$key" in
    APP_URL) VALUE_APP_URL="$value"; SOURCE_APP_URL="$source" ;;
    PORT) VALUE_PORT="$value"; SOURCE_PORT="$source" ;;
    SERVER_URL) VALUE_SERVER_URL="$value"; SOURCE_SERVER_URL="$source" ;;
    AUTH_TRUSTED_ORIGINS) VALUE_AUTH_TRUSTED_ORIGINS="$value"; SOURCE_AUTH_TRUSTED_ORIGINS="$source" ;;
    SPA_PORT) VALUE_SPA_PORT="$value"; SOURCE_SPA_PORT="$source" ;;
    MOBILE_SPA_PORT) VALUE_MOBILE_SPA_PORT="$value"; SOURCE_MOBILE_SPA_PORT="$source" ;;
    DESKTOP_PORT) VALUE_DESKTOP_PORT="$value"; SOURCE_DESKTOP_PORT="$source" ;;
  esac
}

value_for() {
  case "$1" in
    APP_URL) printf '%s\n' "$VALUE_APP_URL" ;;
    PORT) printf '%s\n' "$VALUE_PORT" ;;
    SERVER_URL) printf '%s\n' "$VALUE_SERVER_URL" ;;
    AUTH_TRUSTED_ORIGINS) printf '%s\n' "$VALUE_AUTH_TRUSTED_ORIGINS" ;;
    SPA_PORT) printf '%s\n' "$VALUE_SPA_PORT" ;;
    MOBILE_SPA_PORT) printf '%s\n' "$VALUE_MOBILE_SPA_PORT" ;;
    DESKTOP_PORT) printf '%s\n' "$VALUE_DESKTOP_PORT" ;;
  esac
}

source_for() {
  case "$1" in
    APP_URL) printf '%s\n' "$SOURCE_APP_URL" ;;
    PORT) printf '%s\n' "$SOURCE_PORT" ;;
    SERVER_URL) printf '%s\n' "$SOURCE_SERVER_URL" ;;
    AUTH_TRUSTED_ORIGINS) printf '%s\n' "$SOURCE_AUTH_TRUSTED_ORIGINS" ;;
    SPA_PORT) printf '%s\n' "$SOURCE_SPA_PORT" ;;
    MOBILE_SPA_PORT) printf '%s\n' "$SOURCE_MOBILE_SPA_PORT" ;;
    DESKTOP_PORT) printf '%s\n' "$SOURCE_DESKTOP_PORT" ;;
  esac
}

is_tracked_key() {
  case "$1" in
    APP_URL|PORT|SERVER_URL|AUTH_TRUSTED_ORIGINS|SPA_PORT|MOBILE_SPA_PORT|DESKTOP_PORT) return 0 ;;
    *) return 1 ;;
  esac
}

parse_env_file() {
|
||||
local file="$1"
|
||||
local root="$2"
|
||||
local label="${file#$root/}"
|
||||
local line key value
|
||||
|
||||
[[ -f "$file" ]] || return 0
|
||||
if [[ -z "$LOADED_ENV_FILES" ]]; then
|
||||
LOADED_ENV_FILES="$label"
|
||||
else
|
||||
LOADED_ENV_FILES="$LOADED_ENV_FILES, $label"
|
||||
fi
|
||||
|
||||
while IFS= read -r line || [[ -n "$line" ]]; do
|
||||
line="$(trim "$line")"
|
||||
[[ -z "$line" || "$line" == \#* ]] && continue
|
||||
|
||||
if [[ "$line" == export[[:space:]]* ]]; then
|
||||
line="$(trim "${line#export}")"
|
||||
fi
|
||||
|
||||
[[ "$line" == *=* ]] || continue
|
||||
key="$(trim "${line%%=*}")"
|
||||
value="$(trim "${line#*=}")"
|
||||
is_tracked_key "$key" || continue
|
||||
|
||||
if [[ "$value" == \"*\" && "$value" == *\" && ${#value} -ge 2 ]]; then
|
||||
value="${value:1:${#value}-2}"
|
||||
elif [[ "$value" == \'* && "$value" == *\' && ${#value} -ge 2 ]]; then
|
||||
value="${value:1:${#value}-2}"
|
||||
fi
|
||||
|
||||
set_value "$key" "$value" "$label"
|
||||
done < "$file"
|
||||
}
|
||||
|
||||
apply_env_files() {
|
||||
local root="$1"
|
||||
parse_env_file "$root/.env" "$root"
|
||||
parse_env_file "$root/.env.$NODE_ENV" "$root"
|
||||
parse_env_file "$root/.env.local" "$root"
|
||||
parse_env_file "$root/.env.$NODE_ENV.local" "$root"
|
||||
}
|
||||
|
||||
apply_shell_overrides() {
|
||||
local key value
|
||||
while IFS= read -r key; do
|
||||
if [[ -n "${!key+x}" ]]; then
|
||||
value="${!key}"
|
||||
set_value "$key" "$value" "shell"
|
||||
fi
|
||||
done < <(keys)
|
||||
}
|
||||
|
||||
resolve_defaults() {
|
||||
local app_port spa_port mobile_spa_port desktop_port
|
||||
app_port="$(default_port 3020 3010)"
|
||||
spa_port="$(default_port 9800 9876)"
|
||||
mobile_spa_port="$(default_port 3810 3012)"
|
||||
desktop_port="$(default_port 3030 3015)"
|
||||
|
||||
if [[ -z "$VALUE_APP_URL" ]]; then
|
||||
set_value APP_URL "http://localhost:$app_port" "inferred"
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_PORT" ]]; then
|
||||
if app_port="$(url_port "$VALUE_APP_URL")"; then
|
||||
set_value PORT "$app_port" "inferred from APP_URL"
|
||||
else
|
||||
set_value PORT "$(default_port 3020 3010)" "inferred"
|
||||
fi
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_SERVER_URL" ]]; then
|
||||
set_value SERVER_URL "$VALUE_APP_URL" "from APP_URL"
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_SPA_PORT" ]]; then
|
||||
set_value SPA_PORT "$spa_port" "inferred"
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_MOBILE_SPA_PORT" ]]; then
|
||||
set_value MOBILE_SPA_PORT "$mobile_spa_port" "inferred"
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_DESKTOP_PORT" ]]; then
|
||||
set_value DESKTOP_PORT "$desktop_port" "inferred"
|
||||
fi
|
||||
|
||||
if [[ -z "$VALUE_AUTH_TRUSTED_ORIGINS" ]]; then
|
||||
set_value AUTH_TRUSTED_ORIGINS "$(url_origin "$VALUE_APP_URL"),http://localhost:$VALUE_SPA_PORT" "inferred"
|
||||
fi
|
||||
}
|
||||
|
||||
contains_origin() {
|
||||
local list="$1"
|
||||
local expected="$2"
|
||||
local item items
|
||||
IFS=',' read -r -a items <<< "$list"
|
||||
for item in "${items[@]}"; do
|
||||
item="$(trim "$item")"
|
||||
[[ "$item" == "$expected" ]] && return 0
|
||||
done
|
||||
return 1
|
||||
}
|
||||
|
||||
print_exports() {
|
||||
local key value
|
||||
while IFS= read -r key; do
|
||||
value="$(value_for "$key")"
|
||||
printf 'export %s=%q\n' "$key" "$value"
|
||||
done < <(keys)
|
||||
}
|
||||
|
||||
print_value() {
|
||||
local key="$1"
|
||||
if ! is_tracked_key "$key"; then
|
||||
echo "unknown key: $key" >&2
|
||||
exit 2
|
||||
fi
|
||||
value_for "$key"
|
||||
}
|
||||
|
||||
print_human() {
|
||||
local root="$1"
|
||||
local key value source
|
||||
|
||||
echo "agent-testing test env:"
|
||||
printf ' workspace: %s\n' "$root"
|
||||
printf ' NODE_ENV: %s\n' "$NODE_ENV"
|
||||
printf ' env files: %s\n' "${LOADED_ENV_FILES:-none}"
|
||||
echo
|
||||
echo "resolved values:"
|
||||
while IFS= read -r key; do
|
||||
value="$(value_for "$key")"
|
||||
source="$(source_for "$key")"
|
||||
printf ' %-22s %s (%s)\n' "$key=$value" "" "$source"
|
||||
done < <(keys)
|
||||
echo
|
||||
echo "checks:"
|
||||
|
||||
local app_origin spa_origin app_port
|
||||
app_origin="$(url_origin "$VALUE_APP_URL")"
|
||||
spa_origin="http://localhost:$VALUE_SPA_PORT"
|
||||
if app_port="$(url_port "$VALUE_APP_URL")" && [[ "$app_port" == "$VALUE_PORT" ]]; then
|
||||
printf ' OK PORT matches APP_URL (%s)\n' "$VALUE_PORT"
|
||||
else
|
||||
printf ' WARN PORT (%s) does not match APP_URL (%s)\n' "$VALUE_PORT" "$VALUE_APP_URL"
|
||||
fi
|
||||
|
||||
if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$app_origin"; then
|
||||
printf ' OK AUTH_TRUSTED_ORIGINS includes %s\n' "$app_origin"
|
||||
else
|
||||
printf ' WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$app_origin"
|
||||
fi
|
||||
|
||||
if contains_origin "$VALUE_AUTH_TRUSTED_ORIGINS" "$spa_origin"; then
|
||||
printf ' OK AUTH_TRUSTED_ORIGINS includes %s\n' "$spa_origin"
|
||||
else
|
||||
printf ' WARN AUTH_TRUSTED_ORIGINS is missing %s\n' "$spa_origin"
|
||||
fi
|
||||
}
|
||||
|
||||
usage() {
|
||||
cat << EOF
|
||||
Usage:
|
||||
$0 # print resolved test environment
|
||||
$0 --exports # print source-able export lines
|
||||
$0 --value KEY # print one resolved value
|
||||
|
||||
Tracked keys:
|
||||
APP_URL PORT SERVER_URL AUTH_TRUSTED_ORIGINS SPA_PORT MOBILE_SPA_PORT DESKTOP_PORT
|
||||
EOF
|
||||
}
|
||||
|
||||
ROOT="$(workspace_root)"
|
||||
apply_env_files "$ROOT"
|
||||
apply_shell_overrides
|
||||
resolve_defaults
|
||||
|
||||
case "${1:-}" in
|
||||
"")
|
||||
print_human "$ROOT"
|
||||
;;
|
||||
--exports)
|
||||
print_exports
|
||||
;;
|
||||
--value)
|
||||
print_value "${2:-}"
|
||||
;;
|
||||
-h|--help)
|
||||
usage
|
||||
;;
|
||||
*)
|
||||
echo "unknown option: $1" >&2
|
||||
usage >&2
|
||||
exit 2
|
||||
;;
|
||||
esac
|
||||
@@ -0,0 +1,57 @@
|
||||
#!/usr/bin/env bash
|
||||
# Smoke tests for test-env.sh.
|
||||
|
||||
set -euo pipefail
|
||||
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
|
||||
fail() {
|
||||
echo "FAIL: $*" >&2
|
||||
exit 1
|
||||
}
|
||||
|
||||
assert_eq() {
|
||||
local actual="$1"
|
||||
local expected="$2"
|
||||
[[ "$actual" == "$expected" ]] || fail "expected '$expected', got '$actual'"
|
||||
}
|
||||
|
||||
assert_contains() {
|
||||
local file="$1"
|
||||
local text="$2"
|
||||
grep -Fq "$text" "$file" || fail "expected '$text' in $file"
|
||||
}
|
||||
|
||||
tmp_dir="$(mktemp -d)"
|
||||
trap 'rm -rf "$tmp_dir"' EXIT
|
||||
|
||||
mkdir -p "$tmp_dir/lobehub-cloud-1/.agents/skills" "$tmp_dir/lobehub/.agents/skills"
|
||||
ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing"
|
||||
ln -s "$SCRIPT_DIR/.." "$tmp_dir/lobehub/.agents/skills/agent-testing"
|
||||
|
||||
cloud_script="$tmp_dir/lobehub-cloud-1/.agents/skills/agent-testing/scripts/test-env.sh"
|
||||
oss_script="$tmp_dir/lobehub/.agents/skills/agent-testing/scripts/test-env.sh"
|
||||
|
||||
assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:3021"
|
||||
assert_eq "$("$cloud_script" --value SPA_PORT)" "9801"
|
||||
assert_eq "$("$cloud_script" --value MOBILE_SPA_PORT)" "3811"
|
||||
assert_eq "$("$cloud_script" --value DESKTOP_PORT)" "3031"
|
||||
assert_eq "$("$oss_script" --value SERVER_URL)" "http://localhost:3010"
|
||||
|
||||
cat > "$tmp_dir/lobehub-cloud-1/.env" << 'EOF'
|
||||
APP_URL=http://localhost:4123
|
||||
PORT=4123
|
||||
AUTH_TRUSTED_ORIGINS=http://localhost:4123,http://localhost:9823
|
||||
SPA_PORT=9823
|
||||
MOBILE_SPA_PORT=3823
|
||||
DESKTOP_PORT=3043
|
||||
EOF
|
||||
|
||||
assert_eq "$("$cloud_script" --value SERVER_URL)" "http://localhost:4123"
|
||||
assert_eq "$("$cloud_script" --value SPA_PORT)" "9823"
|
||||
"$cloud_script" --exports > "$tmp_dir/exports.out"
|
||||
assert_contains "$tmp_dir/exports.out" "export APP_URL=http://localhost:4123"
|
||||
assert_contains "$tmp_dir/exports.out" "export SERVER_URL=http://localhost:4123"
|
||||
assert_contains "$tmp_dir/exports.out" "export AUTH_TRUSTED_ORIGINS=http://localhost:4123\\,http://localhost:9823"
|
||||
|
||||
echo "test-env tests passed"
|
||||
@@ -0,0 +1,154 @@
|
||||
# Electron (LobeHub Desktop) UI Testing
|
||||
|
||||
Default surface for verifying **pure frontend changes** (components, store logic, styles, interactions) in the primary product shape. Drives the Electron renderer over CDP with `agent-browser` — see [../references/agent-browser.md](../references/agent-browser.md) for the full command reference.
|
||||
|
||||
**Auth**: the Electron app keeps its own persistent login state — log in once manually in the app; sessions survive restarts. Run `../scripts/setup-auth.sh status` before testing (see [../references/auth.md](../references/auth.md)).
|
||||
|
||||
**Linux / headless (cloud)**: Electron itself runs on Linux, but it has no true headless mode — it needs a display server. In a headless environment wrap the launch with `xvfb-run` (virtual framebuffer). Everything CDP-based keeps working under Xvfb: the `agent-browser --cdp 9222` connection, snapshots, eval, and `agent-browser screenshot` (captured from the renderer via CDP, not the OS screen). What does NOT work on Linux: `capture-app-window.sh` (macOS `screencapture`), osascript, and the ffmpeg recording scripts in their current form.
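
A minimal sketch of the headless wrapping described above, assuming `xvfb-run` is installed and that `electron-dev.sh start` can be launched under it unchanged. The `needs_xvfb` and `launch_electron` helpers and the screen geometry are illustrative and not part of the skill's scripts.

```bash
# Hypothetical helper: decide whether the launch needs a virtual display.
# $1 = OS name as printed by `uname -s`, $2 = current value of $DISPLAY.
needs_xvfb() {
  [ "$1" = "Linux" ] && [ -z "$2" ]
}

# Hypothetical wrapper around the documented start command.
launch_electron() {
  local script=".agents/skills/agent-testing/scripts/electron-dev.sh"
  if needs_xvfb "$(uname -s)" "${DISPLAY:-}"; then
    # -a lets xvfb-run pick a free display number; the geometry is an assumption.
    xvfb-run -a --server-args="-screen 0 1280x800x24" "$script" start
  else
    "$script" start
  fi
}
```

`xvfb-run` shuts the virtual display down when the wrapped command returns. If `start` detaches Electron and exits, a long-lived `Xvfb :99 &` with `export DISPLAY=:99` may be needed instead.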
|
||||
|
||||
### Setup / Teardown
|
||||
|
||||
Use the `electron-dev.sh` script to manage the Electron dev environment. It handles process lifecycle, waits for SPA readiness, and reliably kills all child processes (main + helpers + vite).
|
||||
|
||||
```bash
|
||||
SCRIPT=".agents/skills/agent-testing/scripts/electron-dev.sh"
|
||||
|
||||
# Start Electron dev with CDP (idempotent — skips if already running)
|
||||
$SCRIPT start
|
||||
|
||||
# Check if Electron is running and CDP is reachable
|
||||
$SCRIPT status
|
||||
|
||||
# Kill all Electron-related processes (main + helper + vite)
|
||||
$SCRIPT stop
|
||||
|
||||
# Force fresh restart
|
||||
$SCRIPT restart
|
||||
```
|
||||
|
||||
After `start` succeeds, connect with: `agent-browser --cdp 9222 snapshot -i`
|
||||
|
||||
**Always run `$SCRIPT stop` when done testing** — `pkill -f "Electron"` alone won't catch all helper processes.
|
||||
|
||||
#### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
| ----------------- | ----------------------- | ---------------------------------------- |
| `CDP_PORT` | `9222` | Chrome DevTools Protocol port |
| `ELECTRON_LOG` | `/tmp/electron-dev.log` | Electron process log |
| `ELECTRON_WAIT_S` | `60` | Max seconds to wait for Electron process |
| `RENDERER_WAIT_S` | `60` | Max seconds to wait for SPA to load |
|
||||
|
||||
### LobeHub Probes & Quick Navigation
|
||||
|
||||
`scripts/app-probe.sh` is the standard fast path into app state — **use it
|
||||
instead of hand-rolling `__LOBE_STORES` eval snippets** for these common needs:
|
||||
|
||||
```bash
|
||||
PROBE=".agents/skills/agent-testing/scripts/app-probe.sh"
|
||||
|
||||
$PROBE auth # login check (Step 0.3) → { isSignedIn, userId }
|
||||
$PROBE route # current SPA route
|
||||
$PROBE ops # running chat operations (type / startTime)
|
||||
$PROBE goto /settings # jump the SPA straight to a route (full reload)
|
||||
$PROBE errors-install # install console.error interceptor
|
||||
$PROBE errors # dump captured errors
|
||||
```
|
||||
|
||||
`goto` lets a test enter the state under test directly instead of clicking
|
||||
through the UI. Common desktop routes:
|
||||
|
||||
| Route | Where it lands |
| ----------------------------- | ------------------------------------ |
| `/` | Home (has a chat input) |
| `/agent/<agentId>` | Agent conversation (latest topic) |
| `/agent/<agentId>/<topicId>` | Specific topic in a conversation |
| `/task` · `/task/<taskId>` | Task list / task detail |
| `/page` | Documents (文稿) |
| `/settings` | Settings |
| `/community` | Discover / community |
|
||||
|
||||
Targets default to Electron (`--cdp 9222`); set `AB_TARGET="--session <name>"`
|
||||
for web sessions. For deeper or one-off state inspection, fall back to raw
|
||||
eval below.
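
A small sketch of the target override described above, assuming the `lobehub-dev` session name used in the web guide. The `probe_target` helper is hypothetical: it only illustrates the documented default-versus-override behavior, not how `app-probe.sh` is implemented.

```bash
# Hypothetical illustration of the documented default:
# Electron over CDP unless AB_TARGET is set.
probe_target() {
  printf '%s\n' "${AB_TARGET:---cdp 9222}"
}

PROBE=".agents/skills/agent-testing/scripts/app-probe.sh"

# Electron (default target):
#   $PROBE route
# Web session (override for a single invocation):
#   AB_TARGET="--session lobehub-dev" $PROBE route
```

Setting `AB_TARGET` as a per-command prefix keeps the override from leaking into later probe calls in the same shell.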
|
||||
|
||||
### LobeHub-Specific Patterns
|
||||
|
||||
#### Access Zustand Store State
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
var chat = window.__LOBE_STORES.chat();
|
||||
var ops = Object.values(chat.operations);
|
||||
return JSON.stringify({
|
||||
ops: ops.map(function(o) { return { type: o.type, status: o.status }; }),
|
||||
activeAgent: chat.activeAgentId,
|
||||
activeTopic: chat.activeTopicId,
|
||||
});
|
||||
})()
|
||||
EVALEOF
|
||||
```
|
||||
|
||||
#### Find and Use the Chat Input
|
||||
|
||||
```bash
|
||||
# The chat input is contenteditable — must use -C flag
|
||||
agent-browser --cdp 9222 snapshot -i -C 2>&1 | grep "editable"
|
||||
|
||||
agent-browser --cdp 9222 click @e48
|
||||
agent-browser --cdp 9222 type @e48 "Hello world"
|
||||
agent-browser --cdp 9222 press Enter
|
||||
```
|
||||
|
||||
#### Wait for Agent to Complete
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
var chat = window.__LOBE_STORES.chat();
|
||||
var ops = Object.values(chat.operations);
|
||||
var running = ops.filter(function(o) { return o.status === 'running'; });
|
||||
return running.length === 0 ? 'done' : 'running: ' + running.length;
|
||||
})()
|
||||
EVALEOF
|
||||
```
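
The check above reports a single point in time. A minimal polling sketch follows, assuming the check is wrapped in a command that prints `done` once no operation is running. `wait_until_done`, the attempt count, and the interval are illustrative and not part of the skill's scripts.

```bash
# Hypothetical helper: re-run a check command until its output contains "done"
# or the attempt budget is used up.
# Usage: wait_until_done <max_attempts> <interval_s> <command> [args...]
wait_until_done() {
  local attempts="$1"
  local interval="$2"
  shift 2
  local out i=0
  while [ "$i" -lt "$attempts" ]; do
    # A failing check counts as "not done yet" rather than aborting the loop.
    out="$("$@" 2>/dev/null)" || true
    case "$out" in
      *done*) return 0 ;;
    esac
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}
```

One way to use it is to put the eval call in a function, for example `check_ops() { agent-browser --cdp 9222 eval --stdin < ops-check.js; }` (where `ops-check.js` is a hypothetical file holding the snippet), and then run `wait_until_done 60 2 check_ops`. Redirecting the file inside the function matters, because a redirect on `wait_until_done` itself would be consumed by the first attempt.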
|
||||
|
||||
#### Install Error Interceptor
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
window.__CAPTURED_ERRORS = [];
|
||||
var orig = console.error;
|
||||
console.error = function() {
|
||||
var msg = Array.from(arguments).map(function(a) {
|
||||
if (a instanceof Error) return a.message;
|
||||
return typeof a === 'object' ? JSON.stringify(a) : String(a);
|
||||
}).join(' ');
|
||||
window.__CAPTURED_ERRORS.push(msg);
|
||||
orig.apply(console, arguments);
|
||||
};
|
||||
return 'installed';
|
||||
})()
|
||||
EVALEOF
|
||||
|
||||
# Later, check captured errors:
|
||||
agent-browser --cdp 9222 eval "JSON.stringify(window.__CAPTURED_ERRORS)"
|
||||
```
|
||||
|
||||
## Electron Gotchas
|
||||
|
||||
- **Always use `electron-dev.sh stop` to clean up** — `pkill -f "Electron"` only kills the main process; helper processes (GPU, renderer, network) survive. The script finds and kills all of them via PID matching against the project's electron binary path.
|
||||
- **`npx electron-vite dev` must run from `apps/desktop/`** — running from project root fails silently. The `electron-dev.sh` script handles this automatically.
|
||||
- **Dev build auto-opens DevTools, which hijacks the CDP target** — `agent-browser --cdp 9222` may attach to the DevTools page (`devtools://…`) instead of the app (`app://renderer/`). Symptom: `get url` returns a `devtools://` URL. Fix: close the DevTools target and reconnect:
|
||||
|
||||
```bash
|
||||
DT_ID=$(curl -s http://localhost:9222/json/list | python3 -c "import json,sys; ts=json.load(sys.stdin); print(next(t['id'] for t in ts if t['type']=='page' and t['url'].startswith('devtools://')))")
|
||||
curl -s "http://localhost:9222/json/close/$DT_ID" > /dev/null
|
||||
agent-browser close --all && agent-browser --cdp 9222 get url # expect app://renderer/
|
||||
```
|
||||
|
||||
- **Don't resize the Electron window after load** — resizing triggers full SPA reload
|
||||
- **Store is at `window.__LOBE_STORES`**, not `window.__ZUSTAND_STORES__`
|
||||
- **Streaming / ticking UI needs GIF evidence** — see `scripts/record-gif.sh`; a static screenshot cannot prove time-based behavior.
|
||||
@@ -0,0 +1,78 @@
|
||||
# Web (Full-Stack) Testing
|
||||
|
||||
Default surface for **full-stack changes** — a new/changed API plus the UI that
|
||||
consumes it. The browser is the one surface where network requests and UI state
|
||||
are observable together, so you can assert both sides of the contract in a
|
||||
single run.
|
||||
|
||||
For pure-frontend changes prefer [electron.md](./electron.md); for
|
||||
backend-only changes prefer [../cli/index.md](../cli/index.md).
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Complete [Step 0.0](../SKILL.md#00-resolve-the-current-test-environment) (resolve ports) and [Step -1](../SKILL.md#step--1--plan-approval-for-non-trivial-tests) (plan approval) first.
|
||||
- Local dev server running — [../references/dev-server.md](../references/dev-server.md)
|
||||
- Web auth verified in agent-browser — prefer `setup-auth.sh web-seed`, see [auth decision flow](../references/auth.md#web--decision-flow).
|
||||
|
||||
## Option A — agent-browser with seeded auth (recommended)
|
||||
|
||||
```bash
|
||||
./.agents/skills/agent-testing/scripts/init-dev-env.sh seed-user
|
||||
./.agents/skills/agent-testing/scripts/setup-auth.sh web-seed
|
||||
```
|
||||
|
||||
Then drive the verified session:
|
||||
|
||||
```bash
|
||||
SESSION=lobehub-dev
|
||||
|
||||
agent-browser --session $SESSION open "$SERVER_URL/"
|
||||
agent-browser --session $SESSION snapshot -i
|
||||
# interact via refs — full command reference: ../references/agent-browser.md
|
||||
```
|
||||
|
||||
Use this session as the evidence source. Do not use ordinary Chrome screenshots
|
||||
or Chrome Network records as proof for Web tests; ordinary Chrome is only a
|
||||
fallback source for copying cookies into agent-browser when the seeded login is
|
||||
not available.
|
||||
|
||||
### Watch the API while driving the UI
|
||||
|
||||
```bash
|
||||
# After triggering the UI action under test:
|
||||
agent-browser --session $SESSION network requests --type xhr,fetch
|
||||
agent-browser --session $SESSION network requests --method POST
|
||||
|
||||
# Record a full HAR for the report
|
||||
agent-browser --session $SESSION network har start
|
||||
# ... drive the scenario ...
|
||||
agent-browser --session $SESSION network har stop ./capture.har
|
||||
```
|
||||
|
||||
Assert both layers: the request/response shape (network) and the rendered
|
||||
result (snapshot/screenshot). Both belong in the report as evidence.
|
||||
|
||||
## Option B — real Chrome with remote debugging
|
||||
|
||||
For flows that need a real, visible browser (e.g. exercising the login UI
|
||||
itself):
|
||||
|
||||
```bash
|
||||
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
|
||||
--remote-debugging-port=9222 \
|
||||
--user-data-dir=/tmp/chrome-test-profile \
|
||||
"<URL>" &
|
||||
sleep 5
|
||||
agent-browser --cdp 9222 snapshot -i
|
||||
|
||||
# Or auto-discover running Chrome with remote debugging
|
||||
agent-browser --auto-connect snapshot -i
|
||||
```
|
||||
|
||||
## Option C — Debug Proxy (local frontend, production backend)
|
||||
|
||||
`bun run dev:spa` prints a **Debug Proxy** URL
|
||||
(`https://app.lobehub.com/_dangerous_local_dev_proxy?debug-host=…`) that loads
|
||||
your local Vite SPA inside the online environment — HMR against real server
|
||||
config. Useful for verifying frontend behavior against production data, **not**
|
||||
for testing backend changes (the backend is production, not your branch).
|
||||
@@ -1,172 +0,0 @@
|
||||
---
|
||||
name: cli-backend-testing
|
||||
description: >
|
||||
CLI + Backend integration testing workflow. Use when verifying backend API changes
|
||||
(TRPC routers, services, models) via the LobeHub CLI against a local dev server.
|
||||
Triggers on 'cli test', 'test with cli', 'verify with cli', 'local cli test',
|
||||
'backend test with cli', or when needing to validate server-side changes end-to-end.
|
||||
---
|
||||
|
||||
# CLI + Backend Integration Testing
|
||||
|
||||
Standard workflow for verifying backend changes using the LobeHub CLI (`lh`) against a local dev server.
|
||||
|
||||
## When to Use
|
||||
|
||||
- Verifying TRPC router / service / model changes end-to-end
|
||||
- Testing new API fields or response structure changes
|
||||
- Validating CLI command output after backend modifications
|
||||
- Debugging data flow issues between server and CLI
|
||||
|
||||
## Prerequisites
|
||||
|
||||
| Requirement | Details |
| ------------ | ------------------------------------------------------------- |
| Dev server | `localhost:3011` (Next.js) |
| CLI source | `lobehub/apps/cli/` |
| CLI dev mode | Uses `LOBEHUB_CLI_HOME=.lobehub-dev` for isolated credentials |
| Auth | Device Code Flow login to local server |
|
||||
|
||||
## Quick Reference
|
||||
|
||||
All CLI dev commands run from `lobehub/apps/cli/`. Subsequent examples use `$CLI`:
|
||||
|
||||
```bash
|
||||
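# `env` is required: a VAR=value prefix stored in a variable is not treated
# as an assignment when $CLI is expanded, so the shell would try to run it as a command.
CLI="env LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts"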
|
||||
```
|
||||
|
||||
## Workflow
|
||||
|
||||
### Step 1: Ensure Dev Server is Running
|
||||
|
||||
```bash
|
||||
curl -s -o /dev/null -w '%{http_code}' http://localhost:3011/ 2> /dev/null
|
||||
```
|
||||
|
||||
- **If reachable**: skip to Step 2.
|
||||
- **If unreachable**: start from cloud repo root:
|
||||
|
||||
```bash
|
||||
pnpm run dev:next
|
||||
```
|
||||
|
||||
To **restart** (pick up server-side code changes):
|
||||
|
||||
```bash
|
||||
lsof -ti:3011 | xargs kill
|
||||
pnpm run dev:next
|
||||
```
|
||||
|
||||
**Important:** Server-side code changes in the submodule (`lobehub/apps/server/src/`, `lobehub/src/server/`, `lobehub/packages/`) require a server restart. Next.js hot-reload may not pick up changes in submodule packages.
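
The reachability check from Step 1 can be wrapped in a small helper. This is a sketch, relying on `curl -w '%{http_code}'` printing `000` when no HTTP response arrives. The function names are illustrative; the URL and start command are the ones used in this workflow.

```bash
# Hypothetical helper: treat an empty code or curl's 000 (no HTTP response:
# connection refused, timeout, DNS failure) as unreachable.
is_reachable_code() {
  case "$1" in
    "" | 000) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical wrapper: report whether the dev server answers at all.
ensure_dev_server() {
  local url="${1:-http://localhost:3011/}"
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' "$url" 2>/dev/null)" || true
  if is_reachable_code "$code"; then
    echo "dev server reachable (HTTP $code)"
  else
    echo "dev server unreachable; start it with: pnpm run dev:next" >&2
    return 1
  fi
}
```

Any HTTP status, including a redirect or an error page, counts as reachable here. The check only answers whether something is listening, not whether the app is healthy.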
|
||||
|
||||
### Step 2: Check CLI Authentication
|
||||
|
||||
```bash
|
||||
cat lobehub/apps/cli/.lobehub-dev/settings.json 2> /dev/null
|
||||
```
|
||||
|
||||
- **If file exists and contains `"serverUrl": "http://localhost:3011"`**: skip to Step 3.
|
||||
- **If missing or wrong server**: ask the user to run:
|
||||
|
||||
```bash
|
||||
! cd lobehub/apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3011
|
||||
```
|
||||
|
||||
> Login requires interactive browser authorization (OIDC Device Code Flow), so the user must run it themselves via `!` prefix. Credentials persist in `lobehub/apps/cli/.lobehub-dev/`.
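
The Step 2 decision can be scripted as a plain substring check. This is a sketch that assumes the settings file stores the server URL as a quoted string, as shown above. `cli_points_at` is a hypothetical helper and does not parse the JSON.

```bash
# Hypothetical helper: succeeds when the settings file exists and mentions
# the expected server URL as a quoted string.
cli_points_at() {
  local settings="$1"
  local server_url="$2"
  [ -f "$settings" ] && grep -Fq "\"$server_url\"" "$settings"
}

# Example (paths and URL taken from this workflow):
#   cli_points_at lobehub/apps/cli/.lobehub-dev/settings.json http://localhost:3011 \
#     || echo "ask the user to run the login command"
```

A match only shows that the CLI was last logged in against that server. An expired token still surfaces later as `UNAUTHORIZED`.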
|
||||
|
||||
### Step 3: Test with CLI Commands
|
||||
|
||||
CLI runs from source, so CLI-side code changes take effect immediately without rebuilding.
|
||||
|
||||
```bash
|
||||
cd lobehub/apps/cli
|
||||
$CLI <command>
|
||||
```
|
||||
|
||||
### Step 4: Clean Up Test Data
|
||||
|
||||
```bash
|
||||
$CLI task delete <id> -y
$CLI agent delete <id> -y
|
||||
```
|
||||
|
||||
## Common Testing Patterns
|
||||
|
||||
### Task System
|
||||
|
||||
```bash
|
||||
$CLI task list
|
||||
$CLI task create -n "Root Task" -i "Test instruction"
|
||||
$CLI task create -n "Child Task" -i "Sub instruction" --parent T-1
|
||||
$CLI task view T-1
|
||||
$CLI task tree T-1
|
||||
$CLI task edit T-1 --status running
|
||||
$CLI task comment T-1 -m "Test comment"
|
||||
$CLI task delete T-1 -y
|
||||
```
|
||||
|
||||
### Agent System
|
||||
|
||||
```bash
|
||||
$CLI agent list
|
||||
$CLI agent view <agent-id>
|
||||
$CLI agent run <agent-id> -m "Test prompt"
|
||||
```
|
||||
|
||||
### Document & Knowledge Base
|
||||
|
||||
```bash
|
||||
$CLI doc list
|
||||
$CLI doc create -t "Test Doc" -c "Content here"
|
||||
$CLI doc view <doc-id>
|
||||
$CLI kb list
|
||||
$CLI kb tree <kb-id>
|
||||
```
|
||||
|
||||
### Model & Provider
|
||||
|
||||
```bash
|
||||
$CLI model list
|
||||
$CLI provider list
|
||||
$CLI provider test <provider-id>
|
||||
```
|
||||
|
||||
## Dev-Test Cycle
|
||||
|
||||
```
|
||||
1. Make code changes (service/model/router/type)
|
||||
|
|
||||
2. Run unit tests (fast feedback)
|
||||
bunx vitest run --silent='passed-only' '<test-file>'
|
||||
|
|
||||
3. Restart dev server (if server-side changes)
|
||||
lsof -ti:3011 | xargs kill && pnpm run dev:next
|
||||
|
|
||||
4. CLI verification (end-to-end)
|
||||
$CLI <command>
|
||||
|
|
||||
5. Clean up test data
|
||||
```
|
||||
|
||||
### When Server Restart is Needed
|
||||
|
||||
| Change Location | Restart? |
| ------------------------------------------------------- | -------- |
| `lobehub/apps/server/src/` (routers, services, modules) | Yes |
| `lobehub/src/server/` (agent-hono, workflows-hono) | Yes |
| `lobehub/packages/database/` (models) | Yes |
| `lobehub/packages/types/` | Yes |
| `lobehub/packages/prompts/` | Yes |
| `lobehub/apps/cli/` (CLI code) | No |
| `src/` (cloud overrides) | Yes |
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
| Issue | Solution |
| --------------------------- | --------------------------------------------------------------------- |
| `No authentication found` | Run `login --server http://localhost:3011` |
| `UNAUTHORIZED` on API calls | Token expired; re-run login |
| `ECONNREFUSED` | Dev server not running; start with `pnpm run dev:next` |
| CLI shows old data/behavior | Server needs restart to pick up code changes |
| `EADDRINUSE` on port 3011 | Server already running; kill with `lsof -ti:3011 \| xargs kill` |
| Login opens wrong server | Must use `--server http://localhost:3011` flag (env var doesn't work) |
|
||||
@@ -241,6 +241,6 @@ When the bug comes from a real trace, distill it into the closest existing test
|
||||
3. Add or update the narrowest failing test near the broken layer.
|
||||
4. Fix the smallest layer that can explain the symptom.
|
||||
5. Re-run focused tests.
|
||||
6. Only then do an Electron smoke test with the `local-testing` skill if UI confirmation is still needed.
|
||||
6. Only then do an Electron smoke test with the `agent-testing` skill if UI confirmation is still needed.
|
||||
|
||||
Do not start with a broad Electron repro if a raw trace or adapter test can prove the fault zone faster.
|
||||
|
||||
@@ -1,561 +0,0 @@
|
||||
---
|
||||
name: local-testing
|
||||
description: >
|
||||
Local app and bot testing. Uses agent-browser CLI for Electron/web app UI testing,
|
||||
and osascript (AppleScript) for controlling native macOS apps (WeChat, Discord, Telegram, Slack, Lark/飞书, QQ)
|
||||
to test bots. Triggers on 'local test', 'test in electron', 'test desktop', 'test bot',
|
||||
'bot test', 'test in discord', 'test in telegram', 'test in slack', 'test in weixin',
|
||||
'test in wechat', 'test in lark', 'test in feishu', 'test in qq',
|
||||
'manual test', 'osascript', or UI/bot verification tasks.
|
||||
---
|
||||
|
||||
# Local App & Bot Testing
|
||||
|
||||
Two approaches for local testing on macOS:
|
||||
|
||||
| Approach | Tool | Best For |
| --------------------------- | ------------------- | ---------------------------------------------------- |
| **agent-browser + CDP** | `agent-browser` CLI | Electron apps, web apps (DOM access, JS eval) |
| **osascript (AppleScript)** | `osascript -e` | Native macOS apps (WeChat, Discord, Telegram, Slack) |
|
||||
|
||||
---
|
||||
|
||||
# Part 1: agent-browser (Electron / Web Apps)
|
||||
|
||||
Use `agent-browser` to automate Chromium-based apps via Chrome DevTools Protocol.
|
||||
|
||||
Install via `npm i -g agent-browser`, `brew install agent-browser`, or `cargo install agent-browser`. Run `agent-browser install` to download Chrome. Run `agent-browser upgrade` to update.
|
||||
|
||||
## Core Workflow
|
||||
|
||||
Every browser automation follows this pattern:
|
||||
|
||||
1. **Navigate**: `agent-browser open <url>`
|
||||
2. **Snapshot**: `agent-browser snapshot -i` (get element refs like `@e1`, `@e2`)
|
||||
3. **Interact**: Use refs to click, fill, select
|
||||
4. **Re-snapshot**: After navigation or DOM changes, get fresh refs
|
||||
|
||||
```bash
|
||||
agent-browser open https://example.com/form
|
||||
agent-browser snapshot -i
|
||||
# Output: @e1 [input type="email"], @e2 [input type="password"], @e3 [button] "Submit"
|
||||
|
||||
agent-browser fill @e1 "user@example.com"
|
||||
agent-browser fill @e2 "password123"
|
||||
agent-browser click @e3
|
||||
agent-browser wait --load networkidle
|
||||
agent-browser snapshot -i # Check result
|
||||
```
|
||||
|
||||
## Command Chaining
|
||||
|
||||
```bash
|
||||
# Chain open + wait + snapshot in one call
|
||||
agent-browser open https://example.com && agent-browser wait --load networkidle && agent-browser snapshot -i
|
||||
```
|
||||
|
||||
Use `&&` when you don't need to read intermediate output. Run commands separately when you need to parse output first (e.g., snapshot to discover refs, then interact).
|
||||
|
||||
## Essential Commands
|
||||
|
||||
```bash
|
||||
# Navigation
|
||||
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
||||
agent-browser close # Close browser
|
||||
agent-browser close --all # Close all active sessions
|
||||
|
||||
# Snapshot
|
||||
agent-browser snapshot -i # Interactive elements with refs (recommended)
|
||||
agent-browser snapshot -s "#selector" # Scope to CSS selector
|
||||
|
||||
# Interaction (use @refs from snapshot)
|
||||
agent-browser click @e1 # Click element
|
||||
agent-browser click @e1 --new-tab # Click and open in new tab
|
||||
agent-browser fill @e2 "text" # Clear and type text
|
||||
agent-browser type @e2 "text" # Type without clearing
|
||||
agent-browser select @e1 "option" # Select dropdown option
|
||||
agent-browser check @e1 # Check checkbox
|
||||
agent-browser press Enter # Press key
|
||||
agent-browser keyboard type "text" # Type at current focus (no selector)
|
||||
agent-browser keyboard inserttext "text" # Insert without key events
|
||||
agent-browser scroll down 500 # Scroll page
|
||||
agent-browser scroll down 500 --selector "div.content" # Scroll within container
|
||||
|
||||
# Get information
|
||||
agent-browser get text @e1 # Get element text
|
||||
agent-browser get url # Get current URL
|
||||
agent-browser get title # Get page title
|
||||
agent-browser get cdp-url # Get CDP WebSocket URL
|
||||
|
||||
# Wait
|
||||
agent-browser wait @e1 # Wait for element
|
||||
agent-browser wait --load networkidle # Wait for network idle
|
||||
agent-browser wait --url "**/page" # Wait for URL pattern
|
||||
agent-browser wait 2000 # Wait milliseconds
|
||||
agent-browser wait --text "Welcome" # Wait for text to appear
|
||||
agent-browser wait --fn "!document.body.innerText.includes('Loading...')" # Wait for text to disappear
|
||||
agent-browser wait "#spinner" --state hidden # Wait for element to disappear
|
||||
|
||||
# Downloads
|
||||
agent-browser download @e1 ./file.pdf # Click element to trigger download
|
||||
agent-browser wait --download ./output.zip # Wait for any download to complete
|
||||
|
||||
# Network
|
||||
agent-browser network requests # Inspect tracked requests
|
||||
agent-browser network requests --type xhr,fetch # Filter by resource type
|
||||
agent-browser network requests --method POST # Filter by HTTP method
|
||||
agent-browser network route "**/api/*" --abort # Block matching requests
|
||||
agent-browser network har start # Start HAR recording
|
||||
agent-browser network har stop ./capture.har # Stop and save HAR file
|
||||
|
||||
# Viewport & Device Emulation
|
||||
agent-browser set viewport 1920 1080 # Set viewport size (default: 1280x720)
|
||||
agent-browser set viewport 1920 1080 2 # 2x retina
|
||||
agent-browser set device "iPhone 14" # Emulate device (viewport + user agent)
|
||||
|
||||
# Capture
|
||||
agent-browser screenshot # Screenshot to temp dir
|
||||
agent-browser screenshot --full # Full page screenshot
|
||||
agent-browser screenshot --annotate # Annotated screenshot with numbered element labels
|
||||
agent-browser pdf output.pdf # Save as PDF
|
||||
|
||||
# Clipboard
|
||||
agent-browser clipboard read # Read text from clipboard
|
||||
agent-browser clipboard write "text" # Write text to clipboard
|
||||
agent-browser clipboard copy # Copy current selection
|
||||
agent-browser clipboard paste # Paste from clipboard
|
||||
|
||||
# Dialogs (alert, confirm, prompt, beforeunload)
|
||||
agent-browser dialog accept # Accept dialog
|
||||
agent-browser dialog accept "input" # Accept prompt dialog with text
|
||||
agent-browser dialog dismiss # Dismiss/cancel dialog
|
||||
agent-browser dialog status # Check if dialog is open
|
||||
|
||||
# Diff (compare page states)
|
||||
agent-browser diff snapshot # Compare current vs last snapshot
|
||||
agent-browser diff screenshot --baseline before.png # Visual pixel diff
|
||||
agent-browser diff url <url1> <url2> # Compare two pages
|
||||
|
||||
# Streaming
|
||||
agent-browser stream enable # Start WebSocket streaming
|
||||
agent-browser stream status # Inspect streaming state
|
||||
agent-browser stream disable # Stop streaming
|
||||
```
|
||||
|
||||
## Batch Execution
|
||||
|
||||
```bash
|
||||
echo '[
|
||||
["open", "https://example.com"],
|
||||
["snapshot", "-i"],
|
||||
["click", "@e1"],
|
||||
["screenshot", "result.png"]
|
||||
]' | agent-browser batch --json
|
||||
```
|
||||
|
||||
## Authentication
|
||||
|
||||
```bash
|
||||
# Option 1: Auth vault (credentials stored encrypted)
|
||||
echo "$PASSWORD" | agent-browser auth save myapp --url https://app.example.com/login --username user --password-stdin
|
||||
agent-browser auth login myapp
|
||||
|
||||
# Option 2: Session name (auto-save/restore cookies + localStorage)
|
||||
agent-browser --session-name myapp open https://app.example.com/login
|
||||
agent-browser close # State auto-saved
|
||||
agent-browser --session-name myapp open https://app.example.com/dashboard # Auto-restored
|
||||
|
||||
# Option 3: Persistent profile
|
||||
agent-browser --profile ~/.myapp open https://app.example.com/login
|
||||
|
||||
# Option 4: State file
|
||||
agent-browser state save auth.json
|
||||
agent-browser state load auth.json
|
||||
```
|
||||
|
||||
### LobeHub dev server — inject better-auth cookie
|
||||
|
||||
`agent-browser --headed` on macOS can create an off-screen Chromium window, blocking manual login. For a local LobeHub dev server (e.g. `localhost:3011`), copy the `better-auth.session_token` cookie out of a **Network request** in the user's own Chrome DevTools and load it via `state load`. See [references/agent-browser-login.md](./references/agent-browser-login.md) for the full recipe.
|
||||
|
||||
## Semantic Locators (Alternative to Refs)
|
||||
|
||||
```bash
|
||||
agent-browser find text "Sign In" click
|
||||
agent-browser find label "Email" fill "user@test.com"
|
||||
agent-browser find role button click --name "Submit"
|
||||
agent-browser find placeholder "Search" type "query"
|
||||
agent-browser find testid "submit-btn" click
|
||||
```
|
||||
|
||||
## JavaScript Evaluation (eval)
|
||||
|
||||
```bash
|
||||
# Simple expressions
|
||||
agent-browser eval 'document.title'
|
||||
|
||||
# Complex JS: use --stdin with heredoc (RECOMMENDED)
|
||||
agent-browser eval --stdin << 'EVALEOF'
|
||||
JSON.stringify(
|
||||
Array.from(document.querySelectorAll("img"))
|
||||
.filter(i => !i.alt)
|
||||
.map(i => ({ src: i.src.split("/").pop(), width: i.width }))
|
||||
)
|
||||
EVALEOF
|
||||
|
||||
# Base64 encoding (avoids all shell escaping issues)
|
||||
agent-browser eval -b "$(echo -n 'document.title' | base64)"
|
||||
```
|
||||
|
||||
## Ref Lifecycle
|
||||
|
||||
Refs (`@e1`, `@e2`, etc.) are invalidated when the page changes. Always re-snapshot after clicking links/buttons that navigate, form submissions, or dynamic content loading.
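
The re-snapshot habit can be wrapped in a small helper. This is a minimal sketch, not part of agent-browser: the function name `click_and_resnapshot` and the `AB` override variable are assumptions made for illustration, and it uses only the `click`, `wait --load networkidle`, and `snapshot -i` commands shown above.

```bash
# Assumption: AB may be overridden (e.g. AB=echo) to dry-run the sequence.
AB="${AB:-agent-browser}"

# click_and_resnapshot REF
# Clicks a ref, waits for the network to settle, then takes a fresh snapshot
# so the next command works from current refs instead of stale ones.
click_and_resnapshot() {
  local ref="$1"
  "$AB" click "$ref" || return 1
  "$AB" wait --load networkidle || return 1
  "$AB" snapshot -i
}
```

Calling `click_and_resnapshot @e1` prints the new snapshot, so refs for the following step should be read from that output rather than from an earlier one.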
|
||||
|
||||
## Annotated Screenshots (Vision Mode)
|
||||
|
||||
```bash
|
||||
agent-browser screenshot --annotate
|
||||
# Output includes the image path and a legend:
|
||||
# [1] @e1 button "Submit"
|
||||
# [2] @e2 link "Home"
|
||||
agent-browser click @e2 # Click using ref from annotated screenshot
|
||||
```
|
||||
|
||||
## Parallel Sessions
|
||||
|
||||
```bash
|
||||
agent-browser --session site1 open https://site-a.com
|
||||
agent-browser --session site2 open https://site-b.com
|
||||
agent-browser session list
|
||||
```
|
||||
|
||||
## Connect to Existing Chrome
|
||||
|
||||
```bash
|
||||
agent-browser --auto-connect snapshot # Auto-discover running Chrome
|
||||
agent-browser --cdp 9222 snapshot # Explicit CDP port
|
||||
```
|
||||
|
||||
## iOS Simulator (Mobile Safari)
|
||||
|
||||
```bash
|
||||
agent-browser device list
|
||||
agent-browser -p ios --device "iPhone 16 Pro" open https://example.com
|
||||
agent-browser -p ios snapshot -i
|
||||
agent-browser -p ios tap @e1
|
||||
agent-browser -p ios swipe up
|
||||
agent-browser -p ios screenshot mobile.png
|
||||
agent-browser -p ios close
|
||||
```
|
||||
|
||||
## Observability Dashboard
|
||||
|
||||
```bash
|
||||
agent-browser dashboard install
|
||||
agent-browser dashboard start # Background server on port 4848
|
||||
agent-browser dashboard stop
|
||||
```
|
||||
|
||||
## Cloud Providers
|
||||
|
||||
Use `-p <provider>` to run against cloud browsers: `agentcore`, `browserbase`, `browserless`, `browseruse`, `kernel`.
|
||||
|
||||
## Browser Engine Selection
|
||||
|
||||
```bash
|
||||
agent-browser --engine lightpanda open example.com # 10x faster, 10x less memory
|
||||
```
|
||||
|
||||
## Electron (LobeHub Desktop)
|
||||
|
||||
### Setup / Teardown
|
||||
|
||||
Use the `electron-dev.sh` script to manage the Electron dev environment. It handles process lifecycle, waits for SPA readiness, and reliably kills all child processes (main + helpers + vite).
|
||||
|
||||
```bash
|
||||
SCRIPT=".agents/skills/local-testing/scripts/electron-dev.sh"
|
||||
|
||||
# Start Electron dev with CDP (idempotent — skips if already running)
|
||||
$SCRIPT start
|
||||
|
||||
# Check if Electron is running and CDP is reachable
|
||||
$SCRIPT status
|
||||
|
||||
# Kill all Electron-related processes (main + helper + vite)
|
||||
$SCRIPT stop
|
||||
|
||||
# Force fresh restart
|
||||
$SCRIPT restart
|
||||
```
|
||||
|
||||
After `start` succeeds, connect with: `agent-browser --cdp 9222 snapshot -i`
|
||||
|
||||
**Always run `$SCRIPT stop` when done testing** — `pkill -f "Electron"` alone won't catch all helper processes.
|
||||
|
||||
#### Environment Variables
|
||||
|
||||
| Variable          | Default                 | Description                              |
| ----------------- | ----------------------- | ---------------------------------------- |
| `CDP_PORT`        | `9222`                  | Chrome DevTools Protocol port            |
| `ELECTRON_LOG`    | `/tmp/electron-dev.log` | Electron process log                     |
| `ELECTRON_WAIT_S` | `60`                    | Max seconds to wait for Electron process |
| `RENDERER_WAIT_S` | `60`                    | Max seconds to wait for SPA to load      |
|
||||
|
||||
### LobeHub-Specific Patterns
|
||||
|
||||
#### Access Zustand Store State
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
var chat = window.__LOBE_STORES.chat();
|
||||
var ops = Object.values(chat.operations);
|
||||
return JSON.stringify({
|
||||
ops: ops.map(function(o) { return { type: o.type, status: o.status }; }),
|
||||
activeAgent: chat.activeAgentId,
|
||||
activeTopic: chat.activeTopicId,
|
||||
});
|
||||
})()
|
||||
EVALEOF
|
||||
```
|
||||
|
||||
#### Find and Use the Chat Input
|
||||
|
||||
```bash
|
||||
# The chat input is contenteditable — must use -C flag
|
||||
agent-browser --cdp 9222 snapshot -i -C 2>&1 | grep "editable"
|
||||
|
||||
agent-browser --cdp 9222 click @e48
|
||||
agent-browser --cdp 9222 type @e48 "Hello world"
|
||||
agent-browser --cdp 9222 press Enter
|
||||
```
|
||||
|
||||
#### Wait for Agent to Complete
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
var chat = window.__LOBE_STORES.chat();
|
||||
var ops = Object.values(chat.operations);
|
||||
var running = ops.filter(function(o) { return o.status === 'running'; });
|
||||
return running.length === 0 ? 'done' : 'running: ' + running.length;
|
||||
})()
|
||||
EVALEOF
|
||||
```
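
The check above reports the state once. To block until the agent finishes, it can be polled in a loop. The sketch below is an illustration under assumptions: `poll_until_done` and `check_agent_ops` are hypothetical helper names, and the match on `*done*` assumes the CLI prints the returned string somewhere in its output (possibly quoted).

```bash
# check_agent_ops: runs the same store query as above and prints its result.
check_agent_ops() {
  agent-browser --cdp 9222 eval --stdin <<'EVALEOF'
(function() {
  var ops = Object.values(window.__LOBE_STORES.chat().operations);
  var running = ops.filter(function(o) { return o.status === 'running'; });
  return running.length === 0 ? 'done' : 'running: ' + running.length;
})()
EVALEOF
}

# poll_until_done CHECK_CMD [TIMEOUT_S] [INTERVAL_S]
# Re-runs CHECK_CMD until its output contains "done".
# Returns 0 on success, 1 if TIMEOUT_S elapses first.
poll_until_done() {
  local check_cmd="$1"
  local timeout_s="${2:-60}"
  local interval_s="${3:-2}"
  local elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    case "$("$check_cmd" 2>/dev/null)" in
      *done*) return 0 ;;
    esac
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  return 1
}
```

A typical call would be `poll_until_done check_agent_ops 120 2`, followed by a screenshot or a store read once it returns 0.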
|
||||
|
||||
#### Install Error Interceptor
|
||||
|
||||
```bash
|
||||
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
|
||||
(function() {
|
||||
window.__CAPTURED_ERRORS = [];
|
||||
var orig = console.error;
|
||||
console.error = function() {
|
||||
var msg = Array.from(arguments).map(function(a) {
|
||||
if (a instanceof Error) return a.message;
|
||||
return typeof a === 'object' ? JSON.stringify(a) : String(a);
|
||||
}).join(' ');
|
||||
window.__CAPTURED_ERRORS.push(msg);
|
||||
orig.apply(console, arguments);
|
||||
};
|
||||
return 'installed';
|
||||
})()
|
||||
EVALEOF
|
||||
|
||||
# Later, check captured errors:
|
||||
agent-browser --cdp 9222 eval "JSON.stringify(window.__CAPTURED_ERRORS)"
|
||||
```
|
||||
|
||||
## Chrome / Web Apps
|
||||
|
||||
```bash
|
||||
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
|
||||
--remote-debugging-port=9222 \
|
||||
--user-data-dir=/tmp/chrome-test-profile \
|
||||
"<URL>" &
|
||||
sleep 5
|
||||
agent-browser --cdp 9222 snapshot -i
|
||||
|
||||
# Or auto-discover running Chrome with remote debugging
|
||||
agent-browser --auto-connect snapshot -i
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
# Part 2: osascript (Native macOS App Bot Testing)
|
||||
|
||||
Use AppleScript via `osascript` to control native macOS desktop apps for bot testing. Works with any app that supports macOS Accessibility, no CDP or Chromium needed.
|
||||
|
||||
The pattern is the same for every platform:
|
||||
|
||||
1. **Activate** the app (`tell application "X" to activate`)
|
||||
2. **Navigate** to a channel/chat (Quick Switcher `Cmd+K` or Search `Cmd+F`)
|
||||
3. **Send** a message (clipboard paste `Cmd+V` + Enter)
|
||||
4. **Wait** for the bot response
|
||||
5. **Screenshot** for verification (`screencapture` + `Read` tool)
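
The five steps above can be sketched as one shell helper. This is a rough illustration, not one of the repo's test scripts: `build_send_script` and `send_and_capture` are hypothetical names, the `delay` values are guesses, and the channel name is typed with `keystroke`, so it is assumed to be plain ASCII without double quotes. The message itself travels through the clipboard.

```bash
# build_send_script APP CHANNEL [SWITCHER_KEY]
# Prints AppleScript for steps 1-3: activate, navigate, paste + Enter.
# Expects the message to already be on the clipboard.
build_send_script() {
  local app="$1"
  local channel="$2"
  local key="${3:-k}"
  cat <<EOF
tell application "$app" to activate
delay 1
tell application "System Events"
  keystroke "$key" using command down
  delay 1
  keystroke "$channel"
  delay 1
  key code 36
  delay 2
  keystroke "v" using command down
  delay 1
  key code 36
end tell
EOF
}

# send_and_capture APP CHANNEL MESSAGE [WAIT_S] [SHOT] [SWITCHER_KEY]
# Runs all five steps; macOS only (pbcopy, osascript, screencapture).
send_and_capture() {
  local app="$1"
  local channel="$2"
  local message="$3"
  local wait_s="${4:-10}"
  local shot="${5:-/tmp/bot-test.png}"
  local key="${6:-k}"
  printf '%s' "$message" | pbcopy                           # message via clipboard
  build_send_script "$app" "$channel" "$key" | osascript   # steps 1-3
  sleep "$wait_s"                                           # step 4
  screencapture -x "$shot"                                  # step 5
}
```

`key code 36` is the Return key. For apps that use Search instead of a quick switcher, pass `f` as the switcher key.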
|
||||
|
||||
## Per-Platform References
|
||||
|
||||
Each channel has its own folder under `bot/<channel>/` containing an `index.md` (activation, navigation, send-message, and verification snippets specific to that app) and its test script. Pick the reference for your target platform:
|
||||
|
||||
| Platform      | Reference                                        | Quick switcher |
| ------------- | ------------------------------------------------ | -------------- |
| Discord       | [bot/discord/index.md](./bot/discord/index.md)   | `Cmd+K`        |
| Slack         | [bot/slack/index.md](./bot/slack/index.md)       | `Cmd+K`        |
| Telegram      | [bot/telegram/index.md](./bot/telegram/index.md) | `Cmd+F`        |
| WeChat / 微信 | [bot/wechat/index.md](./bot/wechat/index.md)     | `Cmd+F`        |
| Lark / 飞书   | [bot/lark/index.md](./bot/lark/index.md)         | `Cmd+K`        |
| QQ            | [bot/qq/index.md](./bot/qq/index.md)             | `Cmd+F`        |
|
||||
|
||||
For **shared osascript patterns** (activate, type, paste, screenshot, read accessibility, common workflow template, gotchas), see [bot/osascript-common.md](./bot/osascript-common.md). Read this first if you're new to osascript automation.
|
||||
|
||||
## Bridge-based channels (no native app)
|
||||
|
||||
Some channels have no native app to drive with osascript; they connect through a local bridge inside the Desktop app. These are tested with agent-browser (IPC + UI) plus the bridge's own HTTP/REST endpoints, not osascript:
|
||||
|
||||
| Channel  | Reference                                        | What it drives                                           |
| -------- | ------------------------------------------------ | -------------------------------------------------------- |
| iMessage | [bot/imessage/index.md](./bot/imessage/index.md) | `imessageBridge.*` IPC + local bridge + BlueBubbles REST |
|
||||
|
||||
For iMessage there is a one-shot regression script — see `test-imessage-bridge.sh` below.
|
||||
|
||||
---
|
||||
|
||||
# Scripts
|
||||
|
||||
**App / recording scripts** in `.agents/skills/local-testing/scripts/`:
|
||||
|
||||
| Script                    | Usage                                               |
| ------------------------- | --------------------------------------------------- |
| `electron-dev.sh`         | Manage Electron dev env (start/stop/status/restart) |
| `record-electron-demo.sh` | Record Electron app demo with ffmpeg                |
| `record-app-screen.sh`    | Record app screen (video + screenshots, start/stop) |
|
||||
|
||||
**Bot scripts** live under `.agents/skills/local-testing/bot/`, one folder per channel (alongside that channel's `index.md`). The shared `capture-app-window.sh` sits at the `bot/` root:
|
||||
|
||||
| Script                             | Usage                                                               |
| ---------------------------------- | ------------------------------------------------------------------- |
| `capture-app-window.sh`            | Capture screenshot of a specific app window (used by bot tests)     |
| `discord/test-discord-bot.sh`      | Send message to Discord bot via osascript                           |
| `slack/test-slack-bot.sh`          | Send message to Slack bot via osascript                             |
| `telegram/test-telegram-bot.sh`    | Send message to Telegram bot via osascript                          |
| `wechat/test-wechat-bot.sh`        | Send message to WeChat bot via osascript                            |
| `lark/test-lark-bot.sh`            | Send message to Lark / 飞书 bot via osascript                       |
| `qq/test-qq-bot.sh`                | Send message to QQ bot via osascript                                |
| `imessage/test-imessage-bridge.sh` | Regression-test the iMessage BlueBubbles bridge (IPC + HTTP)        |
| `imessage/send-imessage-test.sh`   | Send one real iMessage (desktop → BB → iMessage) and verify it sent |
|
||||
|
||||
### Window Screenshot Utility
|
||||
|
||||
`capture-app-window.sh` captures a screenshot of a specific app window using `screencapture -l <windowID>`. It uses Swift + CGWindowList to find the window by process name, so screenshots work correctly even when the window is on an external monitor or behind other windows.
|
||||
|
||||
```bash
|
||||
# Standalone usage
|
||||
./.agents/skills/local-testing/bot/capture-app-window.sh "Discord" /tmp/discord.png
|
||||
./.agents/skills/local-testing/bot/capture-app-window.sh "Slack" /tmp/slack.png
|
||||
./.agents/skills/local-testing/bot/capture-app-window.sh "WeChat" /tmp/wechat.png
|
||||
```
|
||||
|
||||
All bot test scripts use this utility automatically for their screenshots.
|
||||
|
||||
### Bot Test Scripts
|
||||
|
||||
All bot test scripts share the same interface:
|
||||
|
||||
```bash
|
||||
./.agents/skills/local-testing/bot/<platform>/test-<platform>-bot.sh <channel_or_contact> <message> [wait_seconds] [screenshot_path]
|
||||
```
|
||||
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
# Discord — test a bot in #bot-testing channel
|
||||
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "!ping"
|
||||
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
|
||||
|
||||
# Slack — test a bot in #bot-testing channel
|
||||
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "@mybot hello"
|
||||
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
|
||||
|
||||
# Telegram — test a bot by username
|
||||
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "MyTestBot" "/start"
|
||||
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "GPTBot" "Hello" 60
|
||||
|
||||
# WeChat — test a bot or send to a contact
|
||||
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "文件传输助手" "test message" 5
|
||||
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
|
||||
|
||||
# Lark/飞书 — test a bot in a group chat
|
||||
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "@MyBot hello"
|
||||
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "Help me with this" 30
|
||||
|
||||
# QQ — test a bot in a group or direct chat
|
||||
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "bot-testing" "Hello bot" 15
|
||||
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "MyBot" "/help" 10
|
||||
```
|
||||
|
||||
Each script: activates the app, navigates to the channel/contact, pastes the message via clipboard, sends, waits, and takes a screenshot. Use the `Read` tool on the screenshot for visual verification.
|
||||
|
||||
### iMessage bridge regression script
|
||||
|
||||
`test-imessage-bridge.sh` does **not** follow the osascript bot interface: it drives the Desktop bridge's IPC + HTTP layers, asserts the result, then self-cleans. It needs BlueBubbles running and Electron up with CDP.
|
||||
|
||||
```bash
|
||||
./.agents/skills/local-testing/bot/imessage/test-imessage-bridge.sh '<bluebubbles_password>' [bb_url] [cdp_port]
|
||||
# defaults: bb_url=http://127.0.0.1:1234 cdp_port=9222 — exit 0 = all green
|
||||
```
|
||||
|
||||
It guards the connect/configure flow (testConfig happy + reject paths, first-time `upsertConfig` save, bridge running + webhook registered, local-server secret enforcement). See [bot/imessage/index.md](./bot/imessage/index.md) for the full manual UI flow and known bugs.
|
||||
|
||||
---
|
||||
|
||||
# Screen Recording
|
||||
|
||||
Record automated demos using `record-app-screen.sh` (start/stop lifecycle, CDP screenshots + ffmpeg assembly). See [references/record-app-screen.md](references/record-app-screen.md) for full documentation.
|
||||
|
||||
```bash
|
||||
./.agents/skills/local-testing/scripts/electron-dev.sh start
|
||||
./.agents/skills/local-testing/scripts/record-app-screen.sh start my-demo
|
||||
# ... run automation ...
|
||||
./.agents/skills/local-testing/scripts/record-app-screen.sh stop
|
||||
```
|
||||
|
||||
Outputs to `.records/` directory (gitignored): `<name>.mp4` (video) + `<name>/` (screenshots every 3s).
|
||||
|
||||
---
|
||||
|
||||
# Gotchas
|
||||
|
||||
### agent-browser
|
||||
|
||||
- **Daemon can get stuck** — if commands hang, `agent-browser close --all` or `pkill -f agent-browser` to reset
|
||||
- **HMR invalidates everything** — after code changes, refs break. Re-snapshot or restart
|
||||
- **`snapshot -i` doesn't find contenteditable** — use `snapshot -i -C` for rich text editors
|
||||
- **`fill` doesn't work on contenteditable** — use `type` for chat inputs
|
||||
- **Screenshots go to `~/.agent-browser/tmp/screenshots/`** — read them with the `Read` tool
|
||||
- **Dialogs block all commands** — if commands time out, check `agent-browser dialog status`
|
||||
- **Default timeout is 25s** — override with `AGENT_BROWSER_DEFAULT_TIMEOUT` (ms) or use explicit waits
|
||||
- **Shell quoting corrupts eval** — use `eval --stdin <<'EVALEOF'` for complex JS
|
||||
|
||||
### Electron-specific
|
||||
|
||||
- **Always use `electron-dev.sh stop` to clean up** — `pkill -f "Electron"` only kills the main process; helper processes (GPU, renderer, network) survive. The script finds and kills all of them via PID matching against the project's electron binary path.
|
||||
- **`npx electron-vite dev` must run from `apps/desktop/`** — running from project root fails silently. The `electron-dev.sh` script handles this automatically.
|
||||
- **Don't resize the Electron window after load** — resizing triggers full SPA reload
|
||||
- **Store is at `window.__LOBE_STORES`** not `window.__ZUSTAND_STORES__`
|
||||
|
||||
### osascript
|
||||
|
||||
See [bot/osascript-common.md](./bot/osascript-common.md#gotchas) for the full osascript gotchas list (accessibility permissions, `keystroke` non-ASCII issues, locale-specific app names, rate limiting, etc.).
|
||||
@@ -1,110 +0,0 @@
|
||||
# Log `agent-browser` into a local LobeHub dev server
|
||||
|
||||
`agent-browser --headed` on macOS often creates the Chromium window off-screen — the user can't see or interact with it, so manual login inside the agent-browser session fails. Instead of sharing the user's real Chrome profile, copy the **better-auth session cookie** out of a request in DevTools and inject it into the agent-browser session as a Playwright-style state file.
|
||||
|
||||
## When to use
|
||||
|
||||
- You need `agent-browser` to reach an authenticated page on `http://localhost:<port>` (e.g. `localhost:3011`).
|
||||
- The user already has a logged-in tab of the same dev server in their own Chrome.
|
||||
- Spawning a headed Chromium to let the user log in manually is unreliable (window off-screen, no interaction).
|
||||
|
||||
Do **not** use this on production URLs — only local dev. Treat the cookie as a secret: don't paste it into shared logs, PRs, or commit it anywhere.
|
||||
|
||||
## Step 1 — Ask the user to copy the cookie from a Network request, NOT `document.cookie`
|
||||
|
||||
`document.cookie` will not return HttpOnly cookies, which is exactly where better-auth puts its session. Instruct the user:
|
||||
|
||||
1. Open the logged-in tab (`http://localhost:<port>/…`) in their own Chrome.
|
||||
2. `Cmd+Option+I` → **Network** tab.
|
||||
3. Refresh, click any same-origin request (e.g. the top-level document request).
|
||||
4. In the right pane under **Request Headers**, right-click the `Cookie:` line → **Copy value** (or copy the entire header).
|
||||
5. Paste the string into chat.
|
||||
|
||||
You only need the better-auth pieces. Everything else (Clerk, `LOBE_LOCALE`, HMR hash, theme vars) is noise and can stay. The minimum viable set is:
|
||||
|
||||
```
|
||||
better-auth.session_token=<value>; better-auth.state=<value>
|
||||
```
|
||||
|
||||
## Step 2 — Build a Playwright-style state file
|
||||
|
||||
`agent-browser state load` expects Playwright's `storageState` format: a JSON with a `cookies` array and an `origins` array.
|
||||
|
||||
```bash
|
||||
cat > /tmp/mkstate.py << 'PY'
|
||||
import json, sys, time

# Read the Cookie header from stdin (allows optional "Cookie: " prefix).
raw = sys.stdin.read().strip()
if raw.lower().startswith("cookie:"):
    raw = raw.split(":", 1)[1].strip()

# Keep only better-auth cookies. Extend this set if the app genuinely needs more.
WANTED = {"better-auth.session_token", "better-auth.state"}

cookies = []
exp = int(time.time()) + 30 * 24 * 3600  # 30 days
for pair in raw.split("; "):
    if "=" not in pair:
        continue
    name, _, value = pair.partition("=")
    if name not in WANTED:
        continue
    cookies.append({
        "name": name,
        "value": value,
        "domain": "localhost",
        "path": "/",
        "expires": exp,
        "httpOnly": False,
        "secure": False,
        "sameSite": "Lax",
    })

if not cookies:
    sys.stderr.write("no better-auth cookies found in input\n")
    sys.exit(1)

print(json.dumps({"cookies": cookies, "origins": []}, indent=2))
|
||||
PY
|
||||
|
||||
# Feed the copied Cookie header in via env var or heredoc.
|
||||
printf '%s' "$COOKIE_HEADER" | python3 /tmp/mkstate.py > /tmp/state.json
|
||||
```
|
||||
|
||||
**Note on `httpOnly`**: the real cookie in the user's browser is HttpOnly, but `storageState` doesn't enforce the flag on load — it just attaches the value. Storing with `httpOnly: false` is fine for local dev and sidesteps a CDP-context quirk where HttpOnly cookies sometimes fail to attach.
|
||||
|
||||
## Step 3 — Load state and navigate
|
||||
|
||||
```bash
|
||||
SESSION="my-test" # any stable session name
|
||||
|
||||
agent-browser --session "$SESSION" state load /tmp/state.json
|
||||
agent-browser --session "$SESSION" open "http://localhost:3011/"
|
||||
agent-browser --session "$SESSION" get url
|
||||
# Expect NOT /signin?callbackUrl=… — if you still see signin, cookie didn't apply.
|
||||
```
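
The "expect NOT `/signin`" check can be turned into an exit code so a script stops early when the cookie did not apply. A minimal sketch, assuming the helper name `assert_not_signin` (hypothetical) and that `get url` prints the current URL on stdout:

```bash
# assert_not_signin URL
# Returns 0 if URL is not a signin redirect; otherwise prints a hint
# to stderr and returns 1.
assert_not_signin() {
  case "$1" in
    */signin*)
      echo "still on signin page, cookie did not apply: $1" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```

Usage would look like `assert_not_signin "$(agent-browser --session "$SESSION" get url)" || exit 1` right after the `open` call.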
|
||||
|
||||
## Step 4 — Verify
|
||||
|
||||
```bash
|
||||
agent-browser --session "$SESSION" snapshot -i | head -20
|
||||
# Look for the user's avatar/name in the sidebar, or absence of the signin form.
|
||||
```
|
||||
|
||||
## Common failure modes
|
||||
|
||||
| Symptom                                         | Cause                                                                   | Fix                                                               |
| ----------------------------------------------- | ----------------------------------------------------------------------- | ----------------------------------------------------------------- |
| Still redirects to `/signin` after `state load` | User pasted from `document.cookie` → missed HttpOnly session            | Re-pull from Network request Headers, not console                 |
| `state load` reports 0 cookies                  | Separator wrong, or user pasted URL-decoded value                       | Keep the raw `Cookie:` header as-is; split on `"; "`              |
| Login works briefly then expires                | `better-auth.session_token` rotated (user logged out / signed in again) | Re-copy and re-load                                               |
| Domain mismatch                                 | Cookie `domain` written with a leading dot or a different host          | Use `domain: "localhost"` literally, no leading dot for local dev |
|
||||
|
||||
## Scope
|
||||
|
||||
Only covers authenticating an **agent-browser** session into a **local** LobeHub dev server. It does not:
|
||||
|
||||
- Work for production — production cookies are `Secure; HttpOnly; Domain=.lobehub.com` and must be delivered over HTTPS.
|
||||
- Replace real OAuth flows — tests that must exercise the login UI need a real Chromium with `--remote-debugging-port` or a bot account.
|
||||
- Flow cookies back to the user's Chrome — injection is one-way (into agent-browser only).
|
||||
@@ -0,0 +1,69 @@
|
||||
---
|
||||
name: model-bank-metadata
|
||||
description: 'Backfill and maintain model-bank metadata (knowledgeCutoff, family, generation). Use when adding models, fixing cutoff/family data, running a metadata sweep across aiModels providers, or researching official knowledge cutoffs.'
|
||||
user-invocable: false
|
||||
---
|
||||
|
||||
# Model-Bank Metadata (knowledgeCutoff / family / generation)
|
||||
|
||||
How to populate and maintain the three structured metadata fields on `packages/model-bank/src/aiModels/*.ts` model cards, at single-model scale (new model PR) or repo-wide scale (sweep across \~80 provider files / \~1900 entries).
|
||||
|
||||
## Field semantics
|
||||
|
||||
| Field             | Format                                                                              | Meaning                                                                                                                                                                                 |
| ----------------- | ----------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `knowledgeCutoff` | `'YYYY-MM'` (or `'YYYY'` if only the year is published)                             | World-knowledge cutoff. When a vendor distinguishes a **"reliable knowledge cutoff"** from the broader training-data cutoff (Anthropic does), always use the **reliable** one.          |
| `family`          | lowercase slug (`claude`, `gpt`, `o-series`, `qwen`, `deepseek`, `llama`, `glm`, …) | Model lineage, finer than `organization`. Lets the UI group models and match the same model across aggregator providers.                                                                |
| `generation`      | family slug + version (`claude-4.6`, `gpt-5.2`, `qwen3.5`, `llama-3.1`)             | Generation within the family. Only set when confidently derivable from the model line's naming. Rolling aliases (`qwen-max`, `deepseek-chat`, `gemini-flash-latest`) get `family` only. |
|
||||
|
||||
All three are optional. **The cardinal rule: only fill what an authoritative source states or naming rules derive — never guess.** An empty field is correct for vendors that publish nothing.
|
||||
|
||||
No DB migration is ever needed for these: builtin models are merged from model-bank at read time (`repositories/aiInfra/index.ts` spreads the whole card), so new card fields flow to the client automatically.
|
||||
|
||||
## Sourcing rules for knowledgeCutoff
|
||||
|
||||
Accept only:
|
||||
|
||||
- Vendor official docs (platform.openai.com / developers.openai.com, docs.x.ai, ai.google.dev, docs.anthropic.com / platform.claude.com)
|
||||
- Official Hugging Face org model cards (huggingface.co/meta-llama/..., etc.)
|
||||
- Official tech reports / system cards / launch blog posts
|
||||
|
||||
Reject:
|
||||
|
||||
- **Third-party aggregator sites** (aiknowledgecutoff.com and similar) — proven to copy one model's value across a whole family. A Cohere sweep once claimed `2024-06` for four distinct base models; none of the cited Cohere pages said that, and the only cutoff Cohere actually publishes is Feb 2023 for the 08-2024 Command R/R+ refresh.
|
||||
- **AWS Bedrock model cards as sole source** — proven to conflate launch date with knowledge cutoff (DeepSeek R1's card lists both as "Jan 2025"). If Bedrock is the only place a value appears, leave the field empty.
|
||||
- Inference from `releasedAt` — a release date is not a cutoff.
|
||||
|
||||
Variant inheritance: dated snapshots (`-2024-08-06`), speed/price tiers of the same checkpoint, quantizations (`-fp8`, `-awq`), context-length variants (`-32k`), ollama `:NNb` tags, and cloud-prefixed ids (`anthropic.`/`us.`/`global.` Bedrock ids) share their base model's cutoff. **Distills do not inherit** from teacher or base — use the distill's own published value or leave empty. **Sizes within one generation can genuinely differ**: Llama 3 8B is Mar 2023 while 70B is Dec 2023 (per Meta's own card) — don't "fix" that to one family-wide value.
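
To make the inheritance rule concrete, the sketch below strips the variant markers listed above from an id so that variants collapse onto their base id. It is an illustration only, not the repo's codemod: `base_model_id` is a hypothetical helper, the `example-model` ids are made up, and it deliberately ignores distills and speed/price tiers, which need a per-model decision.

```bash
# base_model_id ID
# Prints ID reduced to its last path segment, lowercased, with these
# markers removed: Bedrock prefixes, ollama :NNb tags, quantization
# suffixes, context-length suffixes, and dated-snapshot suffixes.
base_model_id() {
  printf '%s\n' "${1##*/}" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E \
        -e 's/^(us|global)\.//' \
        -e 's/^anthropic\.//' \
        -e 's/:[0-9]+b$//' \
        -e 's/-(fp8|awq)$//' \
        -e 's/-[0-9]+k$//' \
        -e 's/-[0-9]{4}-[0-9]{2}-[0-9]{2}$//'
}
```

Two ids that print the same base id are candidates for sharing a cutoff; ids that differ in size within one generation still need their own sourced values.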
|
||||
|
||||
Vendors that publish no cutoffs (leave empty, don't chase): Qwen, DeepSeek, GLM/Zhipu, ERNIE, Doubao, Hunyuan, SenseNova, Spark, MiniMax, StepFun, Yi (mostly), Moonshot.
|
||||
|
||||
Known per-vendor footguns:
|
||||
|
||||
- **Anthropic**: Opus 4.6 reliable cutoff is `2025-05`, Sonnet 4.6 is `2025-08` — easy to swap. Claude 3.7 is `2024-10` (system card: trained through Nov 2024, knowledge cutoff end of Oct 2024). Cite system cards / the models overview, not the Help Center article (a living page that drops retired models — citation rot).
|
||||
- **xAI**: docs.x.ai has one blanket sentence covering grok-3/grok-4; mini variants are not named there. Grok 4.20/4.3 have no official cutoff anywhere.
|
||||
- **OpenAI**: per-model docs pages (developers.openai.com/api/docs/models/<id>) state cutoffs explicitly, including snapshot differences (gpt-4-1106-preview `2023-04` vs gpt-4-0125-preview `2023-12`).
|
||||
|
||||
## family/generation derivation
|
||||
|
||||
Rule-based, no research needed: `scripts/derive-family.ts` holds the per-family regex rules. Traps already encoded there — keep them when extending:
|
||||
|
||||
- Date suffixes are not versions: `claude-sonnet-4-20250514` is generation `claude-4`, not `claude-4.2`.
|
||||
- Size suffixes are not versions: `llama-3-8b` → `llama-3` (not `llama-3.8`); `gemma-7b-it` is **gemma-1** (not gemma-7).
|
||||
- Vendor spelling variants: `qwen2p5` = qwen2.5, `llama-v3p1` = llama-3.1, ollama `:NNb` tags, Bedrock `us.`/`global.`/`anthropic.` prefixes.
|
||||
- `claude-X.0` normalizes to `claude-X`.
|
||||
- Fable/Mythos-class ids (`claude-fable-5`) don't match the opus/sonnet/haiku regex. They belong to the Mythos class, so set `family: 'claude-mythos'` and `generation: 'mythos-5'` manually (the launch page calls Fable 5 "the generally available Mythos-class model").
|
||||
|
||||
## Repo-wide sweep workflow

1. **Extract ids**: `bun .agents/skills/model-bank-metadata/scripts/extract-model-ids.ts` → unique normalized chat-model ids (normalization = last path segment, lowercased). Non-chat types (image/video/embedding/tts) have no knowledge cutoff, so skip them.
2. **Research (multi-agent)**: chunk ids by family (≤50 per chunk) and fan out one research agent per chunk (Workflow tool), each returning `{id, cutoff, source}` with the sourcing rules above baked into the prompt, **plus** one adversarial verify agent per chunk that re-fetches cited sources and refutes unsupported claims. The verify pass is load-bearing: it caught the Cohere aggregator copy-paste and the AWS launch-date conflation.
3. **Policy filter**: before applying, drop entries whose only source is a rejected category (check the returned `sources` map, e.g. drop everything sourced to aws.amazon.com).
4. **Apply**: `bun scripts/apply-cutoffs.ts <map.json>` and `bun scripts/apply-family.ts <map.json>` (run from repo root). Both are idempotent codemods keyed on normalized id, so aggregator providers get the same values automatically; entries that already have the field are skipped. They rely on the uniform prettier formatting of the data files (entries start `  {` / end `  },`, fields at 4-space indent).
5. **Verify**: `cd packages/model-bank && bunx vitest run src/aiModels/__tests__/index.test.ts && bunx tsc --noEmit`.

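Steps 2 to 4 can be sketched as plain data transformations. Everything named here is an assumption made for illustration: the `Finding` shape mirrors the `{id, cutoff, source}` triple from step 2 with `source` taken to be a URL, `REJECTED_HOSTS` contains only the aws.amazon.com example from step 3, and `chunkByFamily`, `policyFilter`, and `toCutoffMap` are hypothetical helpers rather than part of the skill's scripts.

```typescript
type Finding = { id: string; cutoff: string; source: string };

// Hypothetical rejected-source list; aws.amazon.com is the example named in step 3.
const REJECTED_HOSTS = ['aws.amazon.com'];

// Group ids by a caller-supplied family key, then split each group into chunks of at most `size`.
function chunkByFamily(ids: string[], familyOf: (id: string) => string, size = 50): string[][] {
  const groups = new Map<string, string[]>();
  for (const id of ids) {
    const key = familyOf(id);
    groups.set(key, [...(groups.get(key) ?? []), id]);
  }
  const chunks: string[][] = [];
  for (const group of Array.from(groups.values())) {
    for (let i = 0; i < group.length; i += size) chunks.push(group.slice(i, i + size));
  }
  return chunks;
}

// Parse the host of a source URL; unparseable sources yield undefined.
const hostOf = (source: string): string | undefined => {
  try {
    return new URL(source).hostname;
  } catch {
    return undefined;
  }
};

// Drop findings whose source is rejected. Findings with an unparseable source are
// dropped too, which is an assumption of this sketch, not a rule from the text.
const policyFilter = (findings: Finding[]): Finding[] =>
  findings.filter((f) => {
    const host = hostOf(f.source);
    if (host === undefined) return false;
    return !REJECTED_HOSTS.some((r) => host === r || host.endsWith(`.${r}`));
  });

// Build the { normalizedId: cutoff } map that the apply step consumes.
const toCutoffMap = (findings: Finding[]): Record<string, string> =>
  Object.fromEntries(findings.map((f) => [f.id, f.cutoff]));

const findings: Finding[] = [
  { id: 'model-a', cutoff: '2024-06', source: 'https://docs.example.com/models/model-a' },
  { id: 'model-b', cutoff: '2024-10', source: 'https://aws.amazon.com/blogs/launch-post' },
];
console.log(toCutoffMap(policyFilter(findings)));
```

Filtering before building the map keeps the codemod input free of rejected sources, so a rerun of the apply step cannot reintroduce them.
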
## Maintenance rules

- **New model PRs** should fill all three fields inline, citing the official source in the PR body (see the Anthropic entries in `anthropic.ts` for reference values).
- **After resolving merge conflicts** in model-bank data files, sanity-check that metadata didn't vanish: `git grep -c knowledgeCutoff -- 'packages/model-bank/src/aiModels/*.ts'` before vs after. A three-way stack of model PRs once silently dropped all 10 Anthropic cutoffs during conflict resolution.
- Dirty ids exist in aggregator data (a sambanova id once carried a trailing tab). The codemods lowercase ids and keep only the last path segment, but they do not trim whitespace. If a map key won't apply, check for invisible characters before assuming the model is missing.
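
A quick way to spot such dirty ids is to rewrite every whitespace, control, or zero-width character as a visible escape. The sketch below is an assumption-laden illustration: `revealInvisible` and `findDirtyIds` are hypothetical helpers, and the character class is a guess at what counts as "invisible" (it assumes model ids never legitimately contain whitespace).

```typescript
// Whitespace, C0 control characters, DEL, zero-width spaces/joiners, and the BOM.
const INVISIBLE = /[\s\u0000-\u001f\u007f\u200b-\u200d\ufeff]/g;

// Replace each invisible character with a visible \uXXXX escape.
const revealInvisible = (id: string): string =>
  id.replace(INVISIBLE, (ch) => `\\u${ch.charCodeAt(0).toString(16).padStart(4, '0')}`);

// Return only the ids that contain at least one invisible character, with the escaped form.
const findDirtyIds = (ids: string[]): { id: string; revealed: string }[] =>
  ids
    .map((id) => ({ id, revealed: revealInvisible(id) }))
    .filter(({ id, revealed }) => revealed !== id);

console.log(findDirtyIds(['clean-model', 'dirty-model\t']));
```

Running this over the output of the extract step before building a map makes a trailing tab show up as `\u0009` instead of silently producing an unused map key.
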
@@ -0,0 +1,73 @@
/**
 * One-off codemod: apply a canonical { normalizedModelId: 'YYYY-MM' } map onto
 * packages/model-bank/src/aiModels/*.ts, inserting `knowledgeCutoff` after the
 * `id:` line of every chat-model entry that matches and doesn't already have one.
 *
 * Relies on the uniform prettier formatting of these files:
 *  - each model entry starts with `  {` and ends with `  },` (2-space indent)
 *  - fields are at 4-space indent: `    id: '...'`, `    type: 'chat'`
 *
 * Usage: bun /tmp/apply-cutoffs.ts /tmp/cutoff-map.json
 */
import { readdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

const mapPath = process.argv[2];
if (!mapPath) throw new Error('usage: bun apply-cutoffs.ts <map.json>');
const map: Record<string, string> = JSON.parse(readFileSync(mapPath, 'utf8'));

const dir = 'packages/model-bank/src/aiModels';
const normalize = (id: string) => id.split('/').pop()!.toLowerCase();

let touchedFiles = 0;
let inserted = 0;
const matchedIds = new Set<string>();

for (const file of readdirSync(dir).filter((f) => f.endsWith('.ts'))) {
  const path = join(dir, file);
  const lines = readFileSync(path, 'utf8').split('\n');
  const out: string[] = [];
  let changed = false;

  let i = 0;
  while (i < lines.length) {
    if (lines[i] !== '  {') {
      out.push(lines[i]);
      i++;
      continue;
    }
    // collect one model entry block
    const start = i;
    let end = i;
    while (end < lines.length && lines[end] !== '  },') end++;
    const block = lines.slice(start, end + 1);

    const idLineIdx = block.findIndex((l) => /^ {4}id: '/.test(l));
    const isChat = block.some((l) => /^ {4}type: 'chat',?$/.test(l));
    const hasCutoff = block.some((l) => /^ {4}knowledgeCutoff:/.test(l));

    if (idLineIdx >= 0 && isChat && !hasCutoff) {
      const rawId = block[idLineIdx].match(/^ {4}id: '(.+)',$/)?.[1];
      const norm = rawId ? normalize(rawId) : undefined;
      const cutoff = norm ? map[norm] : undefined;
      if (cutoff && /^\d{4}(?:-\d{2})?$/.test(cutoff)) {
        block.splice(idLineIdx + 1, 0, `    knowledgeCutoff: '${cutoff}',`);
        inserted++;
        changed = true;
        matchedIds.add(norm!);
      }
    }
    out.push(...block);
    i = end + 1;
  }

  if (changed) {
    writeFileSync(path, out.join('\n'));
    touchedFiles++;
  }
}

console.log(`inserted ${inserted} knowledgeCutoff fields across ${touchedFiles} files`);
console.log(`map ids used: ${matchedIds.size}/${Object.keys(map).length}`);
const unused = Object.keys(map).filter((k) => !matchedIds.has(k));
if (unused.length) console.log('unused map keys (first 20):', unused.slice(0, 20));
@@ -0,0 +1,49 @@
/**
 * Codemod: insert `family` / `generation` before the `id:` line of every chat-model
 * entry whose normalized id is in the map and that has no `family` field yet.
 *
 * Usage: bun apply-family.ts [map.json] (defaults to /tmp/family-map.json)
 */
import { readdirSync, readFileSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

// Accept the map path as an argument, matching the documented `<map.json>` usage.
const mapPath = process.argv[2] || '/tmp/family-map.json';
const map: Record<string, { family: string; generation?: string }> = JSON.parse(
  readFileSync(mapPath, 'utf8'),
);
const dir = 'packages/model-bank/src/aiModels';
const normalize = (id: string) => id.split('/').pop()!.toLowerCase();

let inserted = 0;
let touchedFiles = 0;
for (const file of readdirSync(dir).filter((f) => f.endsWith('.ts'))) {
  const path = join(dir, file);
  const lines = readFileSync(path, 'utf8').split('\n');
  const out: string[] = [];
  let changed = false;
  let i = 0;
  while (i < lines.length) {
    if (lines[i] !== '  {') {
      out.push(lines[i]);
      i++;
      continue;
    }
    // collect one model entry block
    let end = i;
    while (end < lines.length && lines[end] !== '  },') end++;
    const block = lines.slice(i, end + 1);
    const idLineIdx = block.findIndex((l) => /^ {4}id: '/.test(l));
    const isChat = block.some((l) => /^ {4}type: 'chat',?$/.test(l));
    const hasFamily = block.some((l) => /^ {4}family:/.test(l));
    if (idLineIdx >= 0 && isChat && !hasFamily) {
      const rawId = block[idLineIdx].match(/^ {4}id: '(.+)',$/)?.[1];
      const r = rawId ? map[normalize(rawId)] : undefined;
      if (r) {
        const add = [`    family: '${r.family}',`];
        if (r.generation) add.push(`    generation: '${r.generation}',`);
        block.splice(idLineIdx, 0, ...add);
        inserted++;
        changed = true;
      }
    }
    out.push(...block);
    i = end + 1;
  }
  if (changed) {
    writeFileSync(path, out.join('\n'));
    touchedFiles++;
  }
}
console.log(`annotated ${inserted} model entries across ${touchedFiles} files`);
@@ -0,0 +1,237 @@
/* eslint-disable regexp/no-unused-capturing-group */
/**
 * Rule-based derivation of { family, generation } from normalized model ids.
 * Principle: only fill what is confidently derivable; otherwise omit.
 *
 * Usage: bun /tmp/derive-family.ts          # print distinct pairs for review
 *        bun /tmp/derive-family.ts --emit   # write /tmp/family-map.json
 */
import { readFileSync, writeFileSync } from 'node:fs';

const ids: string[] = JSON.parse(readFileSync('/tmp/model-ids.json', 'utf8'));

type R = { family: string; generation?: string };

const derive = (id: string): R | undefined => {
  // strip cloud/bedrock prefixes for matching
  const m = id.replace(/^(us\.|global\.|eu\.|apac\.)?(anthropic\.|meta\.|cohere\.|azure-)/, '');

  // ---- anthropic ----
  if (m.startsWith('claude')) {
    // family = product-line tier (claude-opus/sonnet/haiku/instant); bare claude-2.x has no tier
    const tier = m.match(/(opus|sonnet|haiku|instant)/)?.[1];
    const family = tier ? `claude-${tier}` : 'claude';
    let g = m.match(/^claude-(?:opus|sonnet|haiku)-(\d)[.-](\d)(?!\d)/); // claude-opus-4-8 / claude-haiku-4.5
    if (g) return { family, generation: `claude-${g[1]}.${g[2]}` };
    g = m.match(/^claude-(?:opus|sonnet|haiku)-(\d)(?!\d)/); // claude-opus-4
    if (g) return { family, generation: `claude-${g[1]}` };
    g = m.match(/^claude-(\d)[.-](\d)(?!\d)/); // claude-3-5-haiku / claude-3.7-sonnet / claude-2.1
    if (g) return { family, generation: g[2] === '0' ? `claude-${g[1]}` : `claude-${g[1]}.${g[2]}` };
    g = m.match(/^claude-(\d)(?!\d)/); // claude-3-haiku
    if (g) return { family, generation: `claude-${g[1]}` };
    if (m.startsWith('claude-instant')) return { family: 'claude-instant' };
    if (/^claude-v?2/.test(m)) return { family: 'claude', generation: 'claude-2' };
    return { family };
  }

  // ---- openai ----
  if (/^(gpt-oss|gpt_oss)/.test(m) || m.startsWith('gpt-oss:'))
    return { family: 'gpt-oss', generation: 'gpt-oss' };
  if (/^(chatgpt-4o|gpt-4o)/.test(m)) return { family: 'gpt', generation: 'gpt-4o' };
  if (/^gpt-(3\.5|35)/.test(m)) return { family: 'gpt', generation: 'gpt-3.5' };
  if (m.startsWith('gpt-audio')) return { family: 'gpt', generation: 'gpt-audio' };
  {
    const g = m.match(/^gpt-(\d)\.(\d)/); // gpt-4.1 / gpt-5.2
    if (g) return { family: 'gpt', generation: `gpt-${g[1]}.${g[2]}` };
    const g2 = m.match(/^gpt-(\d)(?!\d)/); // gpt-4 / gpt-5
    if (g2) return { family: 'gpt', generation: `gpt-${g2[1]}` };
  }
  {
    const g = m.match(/^o([134])(-|$)/); // o1 / o3 / o4
    if (g) return { family: 'o-series', generation: `o${g[1]}` };
  }
  if (/^(codex|computer-use-preview)/.test(m)) return { family: 'gpt' };

  // ---- google ----
  {
    const g = m.match(/^gemini-(\d+(?:\.\d+)?)/);
    if (g) return { family: 'gemini', generation: `gemini-${g[1]}` };
    if (/^gemini-(pro|flash)/.test(m)) return { family: 'gemini' }; // rolling aliases
    if (m.startsWith('gemma')) {
      if (/^gemma-?\db/.test(m)) return { family: 'gemma', generation: 'gemma-1' };
      const v = m.match(/^gemma-?(\d)(?!b)/);
      return { family: 'gemma', generation: v ? `gemma-${v[1]}` : undefined };
    }
    if (/^(codegemma|learnlm|palm)/.test(m)) return { family: m.match(/^[a-z]+/)![0] };
  }

  // ---- qwen ----
  if (m.startsWith('qwq')) return { family: 'qwen', generation: 'qwq' };
  if (m.startsWith('qvq')) return { family: 'qwen', generation: 'qvq' };
  if (m.startsWith('codeqwen')) return { family: 'qwen' };
  if (m.startsWith('qwen')) {
    const g =
      m.match(/^qwen-?([123](?:\.\d+)?)(?![0-9b])/) || // qwen3.5-plus / qwen-3-14b / qwen2-7b / qwen1.5
      m.match(/^qwen([23](?:\.\d+)?):/) || // qwen2.5:72b
      m.match(/^qwen([23])p(\d)/); // qwen2p5 -> handled below
    if (/^qwen(\d)p(\d)/.test(m)) {
      const p = m.match(/^qwen(\d)p(\d)/)!;
      return { family: 'qwen', generation: `qwen${p[1]}.${p[2]}` };
    }
    if (g) return { family: 'qwen', generation: `qwen${g[1]}` };
    return { family: 'qwen' }; // qwen-max/plus/turbo/vl rolling aliases
  }

  // ---- deepseek ----
  if (/^(deepseek|azure-deepseek|pro-deepseek)/.test(m) || m.startsWith('deepseek_')) {
    const s = m.replace(/^pro-/, '').replaceAll('_', '-');
    if (s.startsWith('deepseek-r1-distill'))
      return { family: 'deepseek', generation: 'deepseek-r1-distill' };
    if (s.startsWith('deepseek-r1')) return { family: 'deepseek', generation: 'deepseek-r1' };
    const g = s.match(/^deepseek-(?:chat-)?v(\d(?:\.\d)?)/);
    if (g) return { family: 'deepseek', generation: `deepseek-v${g[1]}` };
    if (/^deepseek-(coder-v2|coder)/.test(s))
      return { family: 'deepseek', generation: 'deepseek-coder' };
    return { family: 'deepseek' }; // deepseek-chat / reasoner rolling aliases
  }

  // ---- meta llama ----
  if (m.startsWith('codellama')) return { family: 'llama', generation: 'codellama' };
  if (/^(meta-)?llama|^l3(\d)?-|^llava/.test(m)) {
    if (m.startsWith('llava')) return { family: 'llava' };
    const s = m.replace(/^meta-/, '');
    const g =
      s.match(/^llama-?([234])(?:[.-](\d))?(?![0-9b])/) || // llama-3.1 / llama3.3 / llama-4
      s.match(/^llama-?v([234])p?(\d)?/) || // llama-v3p1
      s.match(/^llama([234])[.:-](\d)?/);
    if (g) {
      const gen = g[2] ? `llama-${g[1]}.${g[2]}` : `llama-${g[1]}`;
      return { family: 'llama', generation: gen };
    }
    if (m.startsWith('l3-')) return { family: 'llama', generation: 'llama-3' };
    if (m.startsWith('l31-')) return { family: 'llama', generation: 'llama-3.1' };
    return { family: 'llama' };
  }

  // ---- zhipu ----
  if (/^(zai-)?glm/.test(m)) {
    const s = m.replace(/^zai-/, '');
    if (s.startsWith('glm-z1')) return { family: 'glm', generation: 'glm-z1' };
    if (s.startsWith('glm-zero')) return { family: 'glm', generation: 'glm-zero' };
    const g = s.match(/^glm-(\d(?:\.\d)?)/);
    if (g) return { family: 'glm', generation: `glm-${g[1]}` };
    return { family: 'glm' };
  }
  if (/^(charglm|codegeex|emohaa)/.test(m)) return { family: m.match(/^[a-z]+/)![0] };

  // ---- mistral ----
  if (
    /^(open-)?(mistral|mixtral|ministral|codestral|devstral|magistral|pixtral|mathstral|labs-devstral|labs-leanstral|open-codestral)/.test(
      m,
    )
  ) {
    const fam = m.replace(/^(open-|labs-)/, '').match(/^[a-z]+/)![0];
    return { family: fam };
  }

  // ---- xai ----
  if (m.startsWith('grok')) {
    const g = m.match(/^grok-(\d(?:\.\d+)?)/);
    return { family: 'grok', generation: g ? `grok-${g[1]}` : undefined };
  }

  // ---- moonshot ----
  if (m.startsWith('kimi')) {
    const g = m.match(/^kimi-k(\d(?:\.\d)?)/);
    return { family: 'kimi', generation: g ? `kimi-k${g[1]}` : undefined };
  }
  if (m.startsWith('moonshot-kimi-k2')) return { family: 'kimi', generation: 'kimi-k2' };
  if (m.startsWith('moonshot-v1')) return { family: 'kimi', generation: 'moonshot-v1' };

  // ---- minimax ----
  if (m.startsWith('minimax')) {
    if (m.startsWith('minimax-text')) return { family: 'minimax', generation: 'minimax-text-01' };
    const g = m.match(/^minimax-m(\d(?:\.\d)?)/);
    return { family: 'minimax', generation: g ? `minimax-m${g[1]}` : undefined };
  }
  if (m.startsWith('abab')) return { family: 'minimax', generation: 'abab' };

  // ---- baidu ----
  if (m.startsWith('ernie')) {
    if (m.startsWith('ernie-x1')) return { family: 'ernie', generation: 'ernie-x1' };
    const g = m.match(/^ernie-(\d\.\d)/);
    return { family: 'ernie', generation: g ? `ernie-${g[1]}` : undefined };
  }
  if (m.startsWith('qianfan')) return { family: 'qianfan' };

  // ---- bytedance ----
  if (m.startsWith('doubao')) {
    const g = m.match(/^doubao-seed-(\d[.-]\d|\d)/) || m.match(/^doubao-(\d\.\d)/);
    return { family: 'doubao', generation: g ? `doubao-${g[1].replace('-', '.')}` : undefined };
  }
  if (/^(seed-oss|skylark)/.test(m)) return { family: m.startsWith('seed') ? 'doubao' : 'skylark' };

  // ---- tencent ----
  if (m.startsWith('hunyuan')) {
    const g = m.match(/^hunyuan-(\d\.\d)/);
    return { family: 'hunyuan', generation: g ? `hunyuan-${g[1]}` : undefined };
  }
  if (m.startsWith('hy3')) return { family: 'hunyuan', generation: 'hunyuan-3' };

  // ---- others (family only / simple version) ----
  if (m.startsWith('yi-')) return { family: 'yi' };
  if (/^(command|c4ai-command)/.test(m)) return { family: 'command' };
  if (/^(aya|c4ai-aya)/.test(m)) return { family: 'aya' };
  if (/^phi-?(\d)?/.test(m) && m.startsWith('phi')) {
    const g = m.match(/^phi-?(\d(?:\.\d)?)/);
    return { family: 'phi', generation: g ? `phi-${g[1]}` : undefined };
  }
  if (m.startsWith('wizardlm')) return { family: 'wizardlm' };
  if (m.startsWith('step-')) {
    const g = m.match(/^step-(?:r1|(\d(?:\.\d)?))/);
    return { family: 'step', generation: g?.[1] ? `step-${g[1]}` : undefined };
  }
  if (/^(internlm|intern-)/.test(m)) return { family: 'intern' };
  if (m.startsWith('internvl')) return { family: 'internvl' };
  if (m.startsWith('baichuan')) {
    const g = m.match(/^baichuan-?(m?\d)/);
    return { family: 'baichuan', generation: g ? `baichuan-${g[1]}` : undefined };
  }
  if (/^(sensechat|sensenova)/.test(m)) return { family: 'sensenova' };
  if (/^(spark|generalv|4\.0ultra)/.test(m)) return { family: 'spark' };
  if (/^(360gpt|360zhinao)/.test(m)) return { family: '360zhinao' };
  if (/^(jamba|ai21-jamba)/.test(m)) return { family: 'jamba' };
  if (m.startsWith('sonar')) return { family: 'sonar' };
  if (/^(nova-lite|nova-micro|nova-pro)/.test(m)) return { family: 'nova' };
  if (/^(ling|ring)-/.test(m)) return { family: m.match(/^[a-z]+/)![0] };
  if (m.startsWith('longcat')) return { family: 'longcat' };
  if (m.startsWith('mimo')) return { family: 'mimo' };
  if (m.startsWith('taichu')) return { family: 'taichu' };
  if (/^(hermes|nous-hermes)/.test(m)) return { family: 'hermes' };
  if (m.startsWith('solar')) return { family: 'solar' };
  if (m.startsWith('kat-coder')) return { family: 'kat-coder' };
  if (m.startsWith('dbrx')) return { family: 'dbrx' };
  if (m.startsWith('morph')) return { family: 'morph' };

  return undefined;
};

const map: Record<string, R> = {};
const pairs = new Map<string, number>();
let derived = 0;
for (const id of ids) {
  const r = derive(id);
  if (!r) continue;
  derived++;
  map[id] = r;
  const key = `${r.family} :: ${r.generation ?? '—'}`;
  pairs.set(key, (pairs.get(key) || 0) + 1);
}

console.log(`derived ${derived}/${ids.length}`);
for (const [k, n] of [...pairs.entries()].sort()) console.log(String(n).padStart(4), k);

if (process.argv.includes('--emit')) {
  writeFileSync('/tmp/family-map.json', JSON.stringify(map, null, 1));
  console.log('\nwritten /tmp/family-map.json');
}
@@ -0,0 +1,23 @@
/**
 * Extract unique normalized chat-model ids from packages/model-bank/src/aiModels/*.ts.
 * Normalization: last path segment, lowercased (matches the apply codemods).
 *
 * Usage (repo root): bun .agents/skills/model-bank-metadata/scripts/extract-model-ids.ts [out.json]
 * Default output: /tmp/model-ids.json
 */
import { readdirSync, writeFileSync } from 'node:fs';
import { join, resolve } from 'node:path';

const dir = resolve('packages/model-bank/src/aiModels');
const out = process.argv[2] || '/tmp/model-ids.json';

const ids = new Set<string>();
for (const f of readdirSync(dir).filter((f) => f.endsWith('.ts'))) {
  const mod = await import(join(dir, f));
  for (const m of mod.default || []) {
    if (!m?.id || m.type !== 'chat') continue;
    ids.add(m.id.split('/').pop()!.toLowerCase());
  }
}
writeFileSync(out, JSON.stringify([...ids].sort(), null, 1));
console.log(`${ids.size} unique normalized chat ids -> ${out}`);
@@ -50,7 +50,7 @@ Common false positives (do NOT merge):
|
||||
- `db-migrations` vs `drizzle` — distinct workflows (migration files vs schema authoring).
|
||||
- `microcopy` vs `i18n` — content vs mechanics.
|
||||
- `agent-runtime-hooks` vs `agent-tracing` vs `agent-signal` — different surfaces of the agent system.
|
||||
- `testing` vs `local-testing` vs `cli-backend-testing` — different test types.
|
||||
- `testing` vs `agent-testing` — different test types.
|
||||
|
||||
### 4 — Description format consistency
|
||||
|
||||
|
||||
@@ -5,6 +5,18 @@ inputs:
|
||||
node-version:
|
||||
description: Node.js version
|
||||
required: true
|
||||
cloud-repository:
|
||||
description: Cloud repository to overlay for commercial desktop builds
|
||||
required: false
|
||||
default: lobehub/lobehub-cloud
|
||||
cloud-ref:
|
||||
description: Optional Cloud repository ref
|
||||
required: false
|
||||
default: ''
|
||||
cloud-token:
|
||||
description: GitHub token with permission to read the Cloud repository
|
||||
required: false
|
||||
default: ''
|
||||
|
||||
runs:
|
||||
using: composite
|
||||
@@ -14,9 +26,77 @@ runs:
|
||||
with:
|
||||
node-version: ${{ inputs.node-version }}
|
||||
|
||||
- name: Overlay Cloud repository for desktop build
|
||||
if: inputs.cloud-token != ''
|
||||
shell: bash
|
||||
env:
|
||||
CLOUD_CHECKOUT: ${{ runner.temp }}/lobehub-cloud
|
||||
CLOUD_REF: ${{ inputs.cloud-ref }}
|
||||
CLOUD_REPOSITORY: ${{ inputs.cloud-repository }}
|
||||
CLOUD_ROOT: ${{ github.workspace }}/..
|
||||
CLOUD_TOKEN: ${{ inputs.cloud-token }}
|
||||
run: |
|
||||
set -euo pipefail
|
||||
|
||||
cloud_root="$(cd "$GITHUB_WORKSPACE/.." && pwd)"
|
||||
cloud_checkout="$RUNNER_TEMP/lobehub-cloud"
|
||||
|
||||
rm -rf "$cloud_checkout"
|
||||
|
||||
clone_args=(--depth 1)
|
||||
if [ -n "$CLOUD_REF" ]; then
|
||||
clone_args+=(--branch "$CLOUD_REF")
|
||||
fi
|
||||
|
||||
git clone "${clone_args[@]}" "https://x-access-token:${CLOUD_TOKEN}@github.com/${CLOUD_REPOSITORY}.git" "$cloud_checkout"
|
||||
|
||||
node <<'NODE'
|
||||
const fs = require('node:fs');
|
||||
const path = require('node:path');
|
||||
|
||||
const source = process.env.CLOUD_CHECKOUT;
|
||||
const target = process.env.CLOUD_ROOT;
|
||||
const skip = new Set(['.git', 'lobehub', 'node_modules']);
|
||||
|
||||
const copy = (from, to) => {
|
||||
const stat = fs.lstatSync(from);
|
||||
if (stat.isSymbolicLink()) {
|
||||
const link = fs.readlinkSync(from);
|
||||
fs.rmSync(to, { force: true, recursive: true });
|
||||
fs.symlinkSync(link, to);
|
||||
return;
|
||||
}
|
||||
|
||||
if (stat.isDirectory()) {
|
||||
fs.mkdirSync(to, { recursive: true });
|
||||
for (const entry of fs.readdirSync(from)) {
|
||||
if (skip.has(entry)) continue;
|
||||
copy(path.join(from, entry), path.join(to, entry));
|
||||
}
|
||||
return;
|
||||
}
|
||||
|
||||
fs.mkdirSync(path.dirname(to), { recursive: true });
|
||||
fs.copyFileSync(from, to);
|
||||
};
|
||||
|
||||
for (const entry of fs.readdirSync(source)) {
|
||||
if (skip.has(entry)) continue;
|
||||
copy(path.join(source, entry), path.join(target, entry));
|
||||
}
|
||||
NODE
|
||||
|
||||
echo "CLOUD_DESKTOP=1" >> "$GITHUB_ENV"
|
||||
echo "✅ Cloud repository overlaid at $cloud_root"
|
||||
|
||||
- name: Install dependencies
|
||||
shell: bash
|
||||
run: pnpm install --node-linker=hoisted
|
||||
run: |
|
||||
set -euo pipefail
|
||||
if [ "${CLOUD_DESKTOP:-}" = "1" ]; then
|
||||
cd ..
|
||||
fi
|
||||
pnpm install --node-linker=hoisted
|
||||
|
||||
# Remove the China electron mirror config; GitHub Actions is faster with the official source
|
||||
- name: Remove China electron mirror from .npmrc
|
||||
@@ -31,4 +111,11 @@ runs:
|
||||
|
||||
- name: Install deps on Desktop
|
||||
shell: bash
|
||||
run: npm run install-isolated --prefix=./apps/desktop
|
||||
run: |
|
||||
set -euo pipefail
|
||||
if [ "${CLOUD_DESKTOP:-}" = "1" ]; then
|
||||
cd ..
|
||||
npm run install-isolated --prefix=./lobehub/apps/desktop
|
||||
else
|
||||
npm run install-isolated --prefix=./apps/desktop
|
||||
fi
|
||||
|
||||
@@ -30,7 +30,7 @@ jobs:
|
||||
This issue is closed, If you have any questions, you can comment and reply.
|
||||
- name: Checkout repository
|
||||
if: github.event_name == 'pull_request_target' && github.event.pull_request.merged == true
|
||||
uses: actions/checkout@v4
|
||||
uses: actions/checkout@v6
|
||||
|
||||
- name: Check if PR author is maintainer
|
||||
if: github.event.pull_request.merged == true
|
||||
|
||||
@@ -104,6 +104,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
@@ -172,6 +173,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
@@ -216,6 +218,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
|
||||
@@ -54,7 +54,7 @@ jobs:
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v6
|
||||
with:
|
||||
node-version: 24.11.1
|
||||
node-version: 24.16.0
|
||||
package-manager-cache: false
|
||||
|
||||
# Main logic: determine the build version number
|
||||
@@ -92,6 +92,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: 24.11.1
|
||||
|
||||
# Set the version number in package.json
|
||||
|
||||
@@ -87,6 +87,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
|
||||
@@ -223,6 +223,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
@@ -409,7 +410,7 @@ jobs:
|
||||
- uses: actions/checkout@v6
|
||||
|
||||
- name: Delete old canary GitHub releases
|
||||
uses: actions/github-script@v7
|
||||
uses: actions/github-script@v8
|
||||
with:
|
||||
script: |
|
||||
const { data: releases } = await github.rest.repos.listReleases({
|
||||
|
||||
@@ -180,6 +180,7 @@ jobs:
|
||||
- name: Setup build environment
|
||||
uses: ./.github/actions/desktop-build-setup
|
||||
with:
|
||||
cloud-token: ${{ secrets.LOBEHUB_CLOUD_TOKEN }}
|
||||
node-version: ${{ env.NODE_VERSION }}
|
||||
|
||||
- name: Set package version
|
||||
|
||||
@@ -28,7 +28,7 @@ jobs:
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v6
|
||||
with:
|
||||
node-version: 24.11.1
|
||||
node-version: 24.16.0
|
||||
|
||||
- name: Setup pnpm
|
||||
uses: pnpm/action-setup@v4
|
||||
@@ -51,7 +51,7 @@ jobs:
|
||||
- name: Setup Node.js
|
||||
uses: actions/setup-node@v6
|
||||
with:
|
||||
node-version: 24.11.1
|
||||
node-version: 24.16.0
|
||||
registry-url: https://registry.npmjs.org
|
||||
|
||||
- name: Setup pnpm
|
||||
|
||||
@@ -19,12 +19,6 @@ jobs:
|
||||
steps:
|
||||
- uses: actions/checkout@v6
|
||||
|
||||
- name: Clean issue notice
|
||||
uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
|
||||
with:
|
||||
actions: 'close-issues'
|
||||
labels: '🚨 Sync Fail'
|
||||
|
||||
- name: Sync upstream changes
|
||||
id: sync
|
||||
uses: aormsby/Fork-Sync-With-Upstream-action@v3.4
|
||||
@@ -33,22 +27,4 @@ jobs:
|
||||
upstream_sync_branch: main
|
||||
target_sync_branch: main
|
||||
target_repo_token: ${{ secrets.GITHUB_TOKEN }} # automatically generated, no need to set
|
||||
test_mode: false
|
||||
|
||||
- name: Sync check
|
||||
if: failure()
|
||||
uses: actions-cool/issues-helper@e361abf610221f09495ad510cb1e69328d839e1c # v3.7.6
|
||||
with:
|
||||
actions: 'create-issue'
|
||||
title: '🚨 同步失败 | Sync Fail'
|
||||
labels: '🚨 Sync Fail'
|
||||
body: |
|
||||
Due to a change in the workflow file of the [LobeChat][lobechat] upstream repository, GitHub has automatically suspended the scheduled automatic update. You need to manually sync your fork. Please refer to the detailed [Tutorial][tutorial-en-US] for instructions.
|
||||
|
||||
由于 [LobeChat][lobechat] 上游仓库的 workflow 文件变更,导致 GitHub 自动暂停了本次自动更新,你需要手动 Sync Fork 一次,请查看 [详细教程][tutorial-zh-CN]
|
||||
|
||||

|
||||
|
||||
[lobechat]: https://github.com/lobehub/lobe-chat
|
||||
[tutorial-zh-CN]: https://lobehub.com/zh/docs/self-hosting/advanced/upstream-sync
|
||||
[tutorial-en-US]: https://lobehub.com/docs/self-hosting/advanced/upstream-sync
|
||||
test_mode: false
|
||||
@@ -32,7 +32,7 @@ jobs:
|
||||
runs-on: ubuntu-latest
|
||||
name: Test Packages
|
||||
env:
|
||||
PACKAGES: '@lobechat/file-loaders @lobechat/prompts @lobechat/model-runtime @lobechat/web-crawler @lobechat/electron-server-ipc @lobechat/utils @lobechat/python-interpreter @lobechat/context-engine @lobechat/agent-runtime @lobechat/conversation-flow @lobechat/ssrf-safe-fetch @lobechat/memory-user-memory @lobechat/types @lobechat/trpc @lobechat/app-config @lobechat/locales @lobechat/env @lobechat/builtin-tool-lobe-agent model-bank @lobechat/agent-gateway-client @lobechat/agent-manager-runtime @lobechat/device-gateway-client @lobechat/device-identity @lobechat/eval-dataset-parser @lobechat/eval-rubric @lobechat/fetch-sse @lobechat/heterogeneous-agents'
|
||||
PACKAGES: '@lobechat/file-loaders @lobechat/prompts @lobechat/model-runtime @lobechat/web-crawler @lobechat/electron-server-ipc @lobechat/utils @lobechat/context-engine @lobechat/agent-runtime @lobechat/conversation-flow @lobechat/ssrf-safe-fetch @lobechat/memory-user-memory @lobechat/types @lobechat/trpc @lobechat/app-config @lobechat/locales @lobechat/env @lobechat/builtin-tool-lobe-agent model-bank @lobechat/agent-gateway-client @lobechat/agent-manager-runtime @lobechat/device-gateway-client @lobechat/device-identity @lobechat/eval-dataset-parser @lobechat/eval-rubric @lobechat/fetch-sse @lobechat/heterogeneous-agents'
|
||||
|
||||
steps:
|
||||
- name: Checkout
|
||||
@@ -90,11 +90,23 @@ jobs:
|
||||
for package in $PACKAGES; do
|
||||
dir="${package#@lobechat/}"
|
||||
if [ -f "./packages/$dir/coverage/lcov.info" ]; then
|
||||
echo "Uploading coverage for $dir..."
|
||||
flag="packages/$dir"
|
||||
|
||||
case "$dir" in
|
||||
builtin-tool-*)
|
||||
flag="builtin-tools"
|
||||
;;
|
||||
locales|env|device-gateway-client)
|
||||
echo "Skipping Codecov upload for $dir."
|
||||
continue
|
||||
;;
|
||||
esac
|
||||
|
||||
echo "Uploading coverage for $dir as $flag..."
|
||||
./codecov upload-coverage \
|
||||
$COMMON_ARGS \
|
||||
--file ./packages/$dir/coverage/lcov.info \
|
||||
--flag packages/$dir \
|
||||
--flag "$flag" \
|
||||
--disable-search
|
||||
fi
|
||||
done
|
||||
@@ -105,8 +117,8 @@ jobs:
|
||||
if: needs.check-duplicate-run.outputs.should_skip != 'true'
|
||||
strategy:
|
||||
matrix:
|
||||
shard: [1, 2, 3]
|
||||
name: Test App (shard ${{ matrix.shard }}/3)
|
||||
shard: [1, 2]
|
||||
name: Test App (shard ${{ matrix.shard }}/2)
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- name: Checkout
|
||||
@@ -126,7 +138,7 @@ jobs:
|
||||
run: pnpm install
|
||||
|
||||
- name: Run tests
|
||||
run: bunx vitest --coverage --silent='passed-only' --reporter=default --reporter=blob --shard=${{ matrix.shard }}/3
|
||||
run: bunx vitest --coverage --silent='passed-only' --reporter=default --reporter=blob --shard=${{ matrix.shard }}/2 --exclude '**/apps/server/**'
|
||||
|
||||
- name: Upload blob report
|
||||
if: ${{ !cancelled() }}
|
||||
@@ -219,6 +231,40 @@ jobs:
|
||||
files: ./apps/desktop/coverage/lcov.info
|
||||
flags: desktop
|
||||
|
||||
test-server:
|
||||
needs: check-duplicate-run
|
||||
if: needs.check-duplicate-run.outputs.should_skip != 'true'
|
||||
name: Test Server
|
||||
|
||||
runs-on: ubuntu-latest
|
||||
|
||||
steps:
|
||||
- name: Checkout
|
||||
env:
|
||||
REF_SHA: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.sha || github.sha }}
|
||||
REPOSITORY: ${{ github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name || github.repository }}
|
||||
run: |
|
||||
git init .
|
||||
git remote add origin "https://github.com/${REPOSITORY}.git"
|
||||
git fetch --no-tags --depth=1 origin "${REF_SHA}"
|
||||
git checkout --force FETCH_HEAD
|
||||
|
||||
- name: Setup environment
|
||||
uses: ./.github/actions/setup-env
|
||||
|
||||
- name: Install deps
|
||||
run: pnpm install
|
||||
|
||||
- name: Test Server Coverage
|
||||
run: bunx vitest --coverage --silent='passed-only' --reporter=default --coverage.reportsDirectory=./apps/server/coverage --dir apps/server
|
||||
|
||||
- name: Upload Server coverage to Codecov
|
||||
uses: codecov/codecov-action@v5
|
||||
with:
|
||||
token: ${{ secrets.CODECOV_TOKEN }}
|
||||
files: ./apps/server/coverage/lcov.info
|
||||
flags: server
|
||||
|
||||
test-databsae:
|
||||
needs: check-duplicate-run
|
||||
if: needs.check-duplicate-run.outputs.should_skip != 'true'
|
||||
|
||||
@@ -59,6 +59,7 @@ bun.lockb
|
||||
# Build outputs
|
||||
dist/
|
||||
public/_spa/
|
||||
public/_spa-auth/
|
||||
public/spa/
|
||||
es/
|
||||
lib/
|
||||
@@ -92,10 +93,8 @@ public/swe-worker*
|
||||
|
||||
# Generated files
|
||||
src/app/spa/[variants]/[[...path]]/spaHtmlTemplates.ts
|
||||
src/app/spa-auth/authHtmlTemplate.ts
|
||||
public/*.js
|
||||
public/sitemap.xml
|
||||
public/sitemap-index.xml
|
||||
sitemap*.xml
|
||||
robots.txt
|
||||
|
||||
# Git hooks
|
||||
|
||||
@@ -29,13 +29,14 @@
|
||||
},
|
||||
"devDependencies": {
|
||||
"@lobechat/agent-gateway-client": "workspace:*",
|
||||
"@lobechat/device-control": "workspace:*",
|
||||
"@lobechat/device-gateway-client": "workspace:*",
|
||||
"@lobechat/device-identity": "workspace:*",
|
||||
"@lobechat/heterogeneous-agents": "workspace:*",
|
||||
"@lobechat/local-file-shell": "workspace:*",
|
||||
"@lobechat/tool-runtime": "workspace:*",
|
||||
"@trpc/client": "^11.8.1",
|
||||
"@types/node": "^22.13.5",
|
||||
"@types/node": "^24.13.2",
|
||||
"@types/ws": "^8.18.1",
|
||||
"commander": "^13.1.0",
|
||||
"dayjs": "^1.11.19",
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
packages:
|
||||
- '../../packages/agent-gateway-client'
|
||||
- '../../packages/device-control'
|
||||
- '../../packages/device-gateway-client'
|
||||
- '../../packages/device-identity'
|
||||
- '../../packages/heterogeneous-agents'
|
||||
|
||||
@@ -2,9 +2,16 @@ import fs from 'node:fs';
|
||||
import os from 'node:os';
|
||||
import path from 'node:path';
|
||||
|
||||
import {
|
||||
defaultGetLocalFilePreview,
|
||||
defaultGetProjectFileIndex,
|
||||
type DeviceControlDeps,
|
||||
executeDeviceRpc,
|
||||
} from '@lobechat/device-control';
|
||||
import type {
|
||||
AgentRunRequestMessage,
|
||||
DeviceSystemInfo,
|
||||
RpcRequestMessage,
|
||||
SystemInfoRequestMessage,
|
||||
ToolCallRequestMessage,
|
||||
} from '@lobechat/device-gateway-client';
|
||||
@@ -262,19 +269,23 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
|
||||
|
||||
// Handle tool call requests
|
||||
client.on('tool_call_request', async (request: ToolCallRequestMessage) => {
|
||||
const { requestId, timeout, toolCall } = request;
|
||||
const { operationId, requestId, timeout, toolCall } = request;
|
||||
if (isDaemonChild) {
|
||||
appendLog(`[TOOL] ${toolCall.apiName} (${requestId})`);
|
||||
appendLog(
|
||||
`[TOOL] ${toolCall.apiName}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
|
||||
);
|
||||
} else {
|
||||
log.toolCall(toolCall.apiName, requestId, toolCall.arguments);
|
||||
log.toolCall(toolCall.apiName, requestId, toolCall.arguments, operationId);
|
||||
}
|
||||
|
||||
const result = await executeToolCall(toolCall.apiName, toolCall.arguments, timeout);
|
||||
|
||||
if (isDaemonChild) {
|
||||
appendLog(`[RESULT] ${result.success ? 'OK' : 'FAIL'} (${requestId})`);
|
||||
appendLog(
|
||||
`[RESULT] ${result.success ? 'OK' : 'FAIL'}${operationId ? ` op=${operationId}` : ''} (${requestId})`,
|
||||
);
|
||||
} else {
|
||||
log.toolResult(requestId, result.success, result.content);
|
||||
log.toolResult(requestId, result.success, result.content, operationId);
|
||||
}
|
||||
|
||||
client.sendToolCallResponse({
|
||||
@@ -288,6 +299,31 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
|
||||
});
|
||||
});
|
||||
|
||||
// Handle generic server-internal device RPCs (git / workspace / file ops).
|
||||
// Shares the `@lobechat/device-control` dispatcher with the desktop app so the
|
||||
// CLI exposes the same remote-device control surface. File preview / index use
|
||||
// the package's portable defaults (no preview-protocol approval on the CLI).
|
||||
const deviceControlDeps: DeviceControlDeps = {
|
||||
getLocalFilePreview: defaultGetLocalFilePreview,
|
||||
getProjectFileIndex: defaultGetProjectFileIndex,
|
||||
};
|
||||
|
||||
client.on('rpc_request', async (request: RpcRequestMessage) => {
|
||||
const { method, params, requestId } = request;
|
||||
if (isDaemonChild) appendLog(`[RPC] ${method} (${requestId})`);
|
||||
else info(`Received rpc_request: method=${method} (${requestId})`);
|
||||
|
||||
try {
|
||||
const data = await executeDeviceRpc(method, params, deviceControlDeps);
|
||||
client.sendRpcResponse({ requestId, result: { data, success: true } });
|
||||
} catch (err) {
|
||||
const message = err instanceof Error ? err.message : String(err);
|
||||
if (isDaemonChild) appendLog(`[RPC ERROR] ${method}: ${message} (${requestId})`);
|
||||
else error(`rpc_request method=${method} failed: ${message}`);
|
||||
client.sendRpcResponse({ requestId, result: { error: message, success: false } });
|
||||
}
|
||||
});
|
||||
|
||||
// Handle gateway-dispatched agent runs (heterogeneous agents, e.g. Claude
|
||||
// Code). Mirrors the desktop app: spawn `lh hetero exec`, which owns the full
|
||||
// execution + server-ingest pipeline. Ack with the spawn outcome — `accepted`
|
||||
@@ -302,6 +338,7 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
|
||||
{
|
||||
agentType: request.agentType,
|
||||
cwd: request.cwd,
|
||||
imageList: request.imageList,
|
||||
jwt: request.jwt,
|
||||
operationId: request.operationId,
|
||||
prompt: request.prompt,
|
||||
|
||||
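The `rpc_request` handler above wraps every dispatch in a success or error envelope. A minimal sketch of that pattern follows, assuming a hypothetical `handleRpc` helper and a plain `Map` of handlers; the real dispatch lives in `executeDeviceRpc` from `@lobechat/device-control`, whose internals are not shown in this diff.

```typescript
// Hypothetical envelope type mirroring the `{ data, success }` and
// `{ error, success }` shapes passed to `sendRpcResponse` in the diff.
type RpcResult = { data: unknown; success: true } | { error: string; success: false };

// Hypothetical handler signature; not the device-control API.
type RpcHandler = (params: unknown) => Promise<unknown> | unknown;

async function handleRpc(
  handlers: Map<string, RpcHandler>,
  method: string,
  params: unknown,
): Promise<RpcResult> {
  try {
    const handler = handlers.get(method);
    // Unknown methods take the same error path as a failing handler.
    if (!handler) throw new Error(`Unknown device RPC method: ${method}`);
    return { data: await handler(params), success: true };
  } catch (err) {
    // Same normalization as the diff: prefer Error.message, else stringify.
    const message = err instanceof Error ? err.message : String(err);
    return { error: message, success: false };
  }
}
```

Keeping the try/catch in one place means the caller always sends exactly one response per `requestId`, whether the handler throws or not.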
@@ -650,7 +650,7 @@ describe('hetero exec command', () => {
|
||||
});
|
||||
|
||||
it('resets the per-message text accumulator at message boundaries (no cross-message duplication)', async () => {
|
||||
// LOBE-10157 Bug 3: the `replace` snapshot accumulator must not span
|
||||
// The `replace` snapshot accumulator must not span
|
||||
// message boundaries. Two assistant messages separated by a
|
||||
// stream_end/stream_start boundary must each snapshot only their OWN
|
||||
// text — otherwise the second message re-emits the first's text verbatim.
|
||||
|
||||
@@ -261,7 +261,7 @@ class SerialServerIngester {
|
||||
// adapter's `openMainMessage`) must reset it — otherwise it spans the
|
||||
// whole run and every later message's snapshot re-emits all prior
|
||||
// messages' text verbatim, which the server then persists into the new
|
||||
// DB message (LOBE-10157 Bug 3: cross-message text duplication). Reset
|
||||
// DB message: cross-message text duplication. Reset
|
||||
// AFTER flushing the just-ended message's pending snapshot above.
|
||||
if (event.type === 'stream_start' || event.type === 'stream_end') {
|
||||
this.accumulatedText = '';
|
||||
|
||||
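The reset described in the comment above can be illustrated in isolation. This is a simplified sketch, not the `SerialServerIngester` itself: the `text` event type and the `SnapshotAccumulator` class are assumptions for illustration, and the flush of the pending snapshot that the real code performs before the reset is omitted.

```typescript
// Hypothetical event shapes; only `stream_start` / `stream_end` appear in the diff.
type StreamEvent =
  | { type: 'stream_start' }
  | { type: 'stream_end' }
  | { type: 'text'; text: string };

class SnapshotAccumulator {
  private accumulatedText = '';

  /** Returns the running snapshot for a text event, or undefined at a boundary. */
  push(event: StreamEvent): string | undefined {
    if (event.type === 'stream_start' || event.type === 'stream_end') {
      // Message boundary: drop the previous message's text so the next
      // snapshot contains only the new message's own text.
      this.accumulatedText = '';
      return undefined;
    }
    this.accumulatedText += event.text;
    return this.accumulatedText;
  }
}
```

Without the reset, the snapshot for a second message would begin with the first message's text, which is the cross-message duplication the comment describes.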
@@ -122,4 +122,24 @@ describe('spawnHeteroAgentRun', () => {
|
||||
]),
|
||||
);
|
||||
});
|
||||
|
||||
it('appends image blocks to stdin when imageList is provided', async () => {
|
||||
const child = makeFakeChild();
|
||||
spawnMock.mockReturnValue(child);
|
||||
|
||||
const ackPromise = spawnHeteroAgentRun({
|
||||
...baseParams,
|
||||
imageList: [{ id: 'file-1', url: 'https://signed/a.png' }],
|
||||
prompt: 'look at this',
|
||||
});
|
||||
child.emit('spawn');
|
||||
await ackPromise;
|
||||
|
||||
expect(child.stdin.write).toHaveBeenCalledWith(
|
||||
JSON.stringify([
|
||||
{ text: 'look at this', type: 'text' },
|
||||
{ source: { id: 'file-1', type: 'url', url: 'https://signed/a.png' }, type: 'image' },
|
||||
]),
|
||||
);
|
||||
});
|
||||
});
|
||||
|
||||
@@ -1,8 +1,15 @@
|
||||
import { spawn } from 'node:child_process';
|
||||
|
||||
import {
|
||||
buildHeteroExecStdinPayload,
|
||||
type HeteroExecImageRef,
|
||||
} from '@lobechat/heterogeneous-agents/protocol';
|
||||
|
||||
export interface SpawnHeteroAgentRunParams {
|
||||
agentType: string;
|
||||
cwd?: string;
|
||||
/** Image attachments (signed URLs) appended as image content blocks. */
|
||||
imageList?: HeteroExecImageRef[];
|
||||
jwt: string;
|
||||
operationId: string;
|
||||
prompt: string;
|
||||
@@ -46,6 +53,7 @@ export function spawnHeteroAgentRun(
|
||||
const {
|
||||
agentType,
|
||||
cwd,
|
||||
imageList,
|
||||
jwt,
|
||||
operationId,
|
||||
prompt,
|
||||
@@ -77,15 +85,11 @@ export function spawnHeteroAgentRun(
|
||||
...(resumeSessionId ? ['--resume', resumeSessionId] : []),
|
||||
];
|
||||
|
||||
// With systemContext, send a content-block array so the agent sees the
|
||||
// context block first, then the user's actual prompt — mirrors the desktop
|
||||
// path. `lh hetero exec` coerces both shapes via coerceJsonPrompt.
|
||||
const stdinPayload = systemContext
|
||||
? JSON.stringify([
|
||||
{ text: systemContext, type: 'text' },
|
||||
{ text: prompt, type: 'text' },
|
||||
])
|
||||
: JSON.stringify(prompt);
|
||||
// systemContext / image attachments turn the payload into a content-block
|
||||
// array: context block first, then the user's prompt, then images — mirrors
|
||||
// the desktop path. `lh hetero exec` coerces both shapes via
|
||||
// coerceJsonPrompt.
|
||||
const stdinPayload = buildHeteroExecStdinPayload({ imageList, prompt, systemContext });
|
||||
|
||||
return new Promise<AgentRunAckResult>((resolve) => {
|
||||
let settled = false;
|
||||
|
||||
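The body of `buildHeteroExecStdinPayload` is not part of this diff. The sketch below is one possible reading of the comment and the test above: a bare JSON string when there is only a prompt, otherwise a content-block array ordered context, prompt, images. The function name `buildStdinPayloadSketch` and the local types are assumptions; the image block shape is taken from the `imageList` test expectation.

```typescript
// Local stand-ins for the package types (assumed, not imported).
interface ImageRef {
  id: string;
  url: string;
}

type ContentBlock =
  | { text: string; type: 'text' }
  | { source: { id: string; type: 'url'; url: string }; type: 'image' };

function buildStdinPayloadSketch(input: {
  imageList?: ImageRef[];
  prompt: string;
  systemContext?: string;
}): string {
  const { imageList = [], prompt, systemContext } = input;

  // Plain prompt: keep the simple string shape from the pre-change code.
  if (!systemContext && imageList.length === 0) return JSON.stringify(prompt);

  const blocks: ContentBlock[] = [];
  if (systemContext) blocks.push({ text: systemContext, type: 'text' });
  blocks.push({ text: prompt, type: 'text' });
  for (const { id, url } of imageList) {
    blocks.push({ source: { id, type: 'url', url }, type: 'image' });
  }
  return JSON.stringify(blocks);
}
```

Moving this into one shared helper lets the CLI and the desktop controller produce the same payload, which appears to be the motivation for the import from `@lobechat/heterogeneous-agents/protocol`.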
@@ -1,4 +1,3 @@
|
||||
/* eslint-disable no-console */
|
||||
import pc from 'picocolors';
|
||||
|
||||
let verbose = false;
|
||||
@@ -41,18 +40,20 @@ export const log = {
|
||||
console.log(`${timestamp()} ${pc.bold('[STATUS]')} ${color(status)}`);
|
||||
},
|
||||
|
||||
toolCall: (apiName: string, requestId: string, args?: string) => {
|
||||
toolCall: (apiName: string, requestId: string, args?: string, operationId?: string) => {
|
||||
console.log(
|
||||
`${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)} ${pc.dim(`(${requestId})`)}`,
|
||||
`${timestamp()} ${pc.magenta('[TOOL]')} ${pc.bold(apiName)}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
|
||||
);
|
||||
if (args && verbose) {
|
||||
console.log(` ${pc.dim(args)}`);
|
||||
}
|
||||
},
|
||||
|
||||
toolResult: (requestId: string, success: boolean, content?: string) => {
|
||||
toolResult: (requestId: string, success: boolean, content?: string, operationId?: string) => {
|
||||
const icon = success ? pc.green('OK') : pc.red('FAIL');
|
||||
console.log(`${timestamp()} ${pc.magenta('[RESULT]')} ${icon} ${pc.dim(`(${requestId})`)}`);
|
||||
console.log(
|
||||
`${timestamp()} ${pc.magenta('[RESULT]')} ${icon}${operationId ? ` ${pc.dim(`op=${operationId}`)}` : ''} ${pc.dim(`(${requestId})`)}`,
|
||||
);
|
||||
if (content && verbose) {
|
||||
const preview = content.length > 200 ? content.slice(0, 200) + '...' : content;
|
||||
console.log(` ${pc.dim(preview)}`);
|
||||
|
||||
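Both the daemon log lines and the colored logger insert the `op=` tag only when an `operationId` is present. A color-free sketch of that formatting, using a hypothetical `formatToolLog` helper that is not part of the source:

```typescript
// Hypothetical helper: plain-text equivalent of the `[TOOL]` line,
// without the timestamp or picocolors styling used by the real logger.
const formatToolLog = (apiName: string, requestId: string, operationId?: string): string =>
  `[TOOL] ${apiName}${operationId ? ` op=${operationId}` : ''} (${requestId})`;
```

Because the parameter is optional and appended last, existing two- and three-argument call sites keep working unchanged.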
@@ -6,6 +6,7 @@ import dotenv from 'dotenv';
|
||||
import { defineConfig } from 'electron-vite';
|
||||
import type { PluginOption, ViteDevServer } from 'vite';
|
||||
import { loadEnv } from 'vite';
|
||||
import tsconfigPaths from 'vite-tsconfig-paths';
|
||||
|
||||
import {
|
||||
sharedOptimizeDeps,
|
||||
@@ -88,10 +89,112 @@ function electronDesktopHtmlPlugin(): PluginOption {
|
||||
};
|
||||
}
|
||||
|
||||
const CLOUD_DESKTOP_BUSINESS_FEATURES_FLAG = '__LOBECLOUD_DESKTOP_BUSINESS_FEATURES__';
|
||||
const BUSINESS_CONST_MODULE_ID = '@lobechat/business-const';
|
||||
const CLOUD_BUSINESS_CONST_MODULE_ID = '@cloud/business-const';
|
||||
const DYNAMIC_BUSINESS_CONST_QUERY = '?lobe-cloud-desktop-business-const';
|
||||
|
||||
const createBusinessFeaturesBootstrapScript = () =>
|
||||
`globalThis[${JSON.stringify(CLOUD_DESKTOP_BUSINESS_FEATURES_FLAG)}] = true;`;
|
||||
|
||||
const replaceBusinessFlagExport = (code: string, name: string, initializer: string) => {
|
||||
const pattern = new RegExp(`export\\s+(?:const|let|var)\\s+${name}\\s*=\\s*[\\s\\S]*?;`);
|
||||
|
||||
return {
|
||||
code: code.replace(pattern, `export let ${name} = ${initializer};`),
|
||||
replaced: pattern.test(code),
|
||||
};
|
||||
};
|
||||
|
||||
const injectDynamicBusinessFeatureFlag = (code: string) => {
|
||||
const businessFlag = replaceBusinessFlagExport(
|
||||
code,
|
||||
'ENABLE_BUSINESS_FEATURES',
|
||||
`Boolean(globalThis['${CLOUD_DESKTOP_BUSINESS_FEATURES_FLAG}'])`,
|
||||
);
|
||||
const topicLinkFlag = replaceBusinessFlagExport(
|
||||
businessFlag.code,
|
||||
'ENABLE_TOPIC_LINK_SHARE',
|
||||
'ENABLE_BUSINESS_FEATURES',
|
||||
);
|
||||
|
||||
if (!businessFlag.replaced) {
|
||||
throw new Error('Cannot find ENABLE_BUSINESS_FEATURES export in @cloud/business-const');
|
||||
}
|
||||
|
||||
const topicLinkAssignment = topicLinkFlag.replaced
|
||||
? '\n ENABLE_TOPIC_LINK_SHARE = enabled;'
|
||||
: '';
|
||||
|
||||
return `${topicLinkFlag.code}
|
||||
|
||||
const __lobeCloudDesktopBusinessFeaturesFlagKey = '${CLOUD_DESKTOP_BUSINESS_FEATURES_FLAG}';
|
||||
const __lobeCloudDesktopApplyBusinessFeaturesFlag = (value) => {
|
||||
const enabled = Boolean(value);
|
||||
ENABLE_BUSINESS_FEATURES = enabled;${topicLinkAssignment}
|
||||
return enabled;
|
||||
};
|
||||
|
||||
const __lobeCloudDesktopExistingDescriptor = Object.getOwnPropertyDescriptor(
|
||||
globalThis,
|
||||
__lobeCloudDesktopBusinessFeaturesFlagKey,
|
||||
);
|
||||
const __lobeCloudDesktopInitialValue = __lobeCloudDesktopExistingDescriptor?.get
|
||||
? __lobeCloudDesktopExistingDescriptor.get.call(globalThis)
|
||||
: globalThis[__lobeCloudDesktopBusinessFeaturesFlagKey];
|
||||
|
||||
Object.defineProperty(globalThis, __lobeCloudDesktopBusinessFeaturesFlagKey, {
|
||||
configurable: true,
|
||||
get() {
|
||||
return ENABLE_BUSINESS_FEATURES;
|
||||
},
|
||||
set(value) {
|
||||
__lobeCloudDesktopApplyBusinessFeaturesFlag(value);
|
||||
},
|
||||
});
|
||||
|
||||
__lobeCloudDesktopApplyBusinessFeaturesFlag(__lobeCloudDesktopInitialValue);
|
||||
`;
|
||||
};
|
||||
|
||||
function cloudDesktopBusinessConstPlugin(): PluginOption {
|
||||
return {
|
||||
enforce: 'pre',
|
||||
async resolveId(id, importer) {
|
||||
if (id !== BUSINESS_CONST_MODULE_ID) return;
|
||||
|
||||
const resolved = await this.resolve(CLOUD_BUSINESS_CONST_MODULE_ID, importer, {
|
||||
skipSelf: true,
|
||||
});
|
||||
if (!resolved) throw new Error(`Cannot resolve ${CLOUD_BUSINESS_CONST_MODULE_ID}`);
|
||||
|
||||
return `${resolved.id}${DYNAMIC_BUSINESS_CONST_QUERY}`;
|
||||
},
|
||||
load(id) {
|
||||
if (!id.endsWith(DYNAMIC_BUSINESS_CONST_QUERY)) return;
|
||||
|
||||
const sourcePath = id.slice(0, -DYNAMIC_BUSINESS_CONST_QUERY.length);
|
||||
return injectDynamicBusinessFeatureFlag(readFileSync(sourcePath, 'utf8'));
|
||||
},
|
||||
name: 'lobe-cloud-desktop-business-const',
|
||||
transformIndexHtml() {
|
||||
return [
|
||||
{
|
||||
children: createBusinessFeaturesBootstrapScript(),
|
||||
injectTo: 'head-prepend',
|
||||
tag: 'script',
|
||||
},
|
||||
];
|
||||
},
|
||||
};
|
||||
}
|
||||
|
||||
dotenv.config();
|
||||
|
||||
const isDev = process.env.NODE_ENV === 'development';
|
||||
const ROOT_DIR = path.resolve(__dirname, '../..');
|
||||
const CLOUD_ROOT_DIR = path.resolve(__dirname, '../../..');
|
||||
const isCloudDesktopBuild = process.env.CLOUD_DESKTOP === '1';
|
||||
const mode = process.env.NODE_ENV === 'production' ? 'production' : 'development';
|
||||
|
||||
Object.assign(process.env, loadEnv(mode, ROOT_DIR, ''));
|
||||
@@ -105,8 +208,17 @@ const mainProcessRuntimeExternals = [
|
||||
...externalRuntimeModules,
|
||||
'node-mac-permissions',
|
||||
];
|
||||
const externalNavigationHosts =
|
||||
process.env.DESKTOP_EXTERNAL_NAVIGATION_HOSTS ?? (isCloudDesktopBuild ? 'stripe.com' : '');
|
||||
|
||||
console.info(`[electron-vite.config.ts] Detected UPDATE_CHANNEL: ${updateChannel}`);
|
||||
console.info(`[electron-vite.config.ts] Cloud desktop build: ${isCloudDesktopBuild}`);
|
||||
|
||||
const cloudTsconfigPathsPlugin = () =>
|
||||
({
|
||||
...tsconfigPaths({ projects: [path.resolve(CLOUD_ROOT_DIR, 'tsconfig.json')] }),
|
||||
name: 'lobe-cloud-desktop-tsconfig-paths',
|
||||
}) satisfies PluginOption;
|
||||
|
||||
export default defineConfig({
|
||||
main: {
|
||||
@@ -169,6 +281,7 @@ export default defineConfig({
|
||||
sourcemap: isDev ? 'inline' : false,
|
||||
},
|
||||
define: {
|
||||
'process.env.DESKTOP_EXTERNAL_NAVIGATION_HOSTS': JSON.stringify(externalNavigationHosts),
|
||||
'process.env.UPDATE_CHANNEL': JSON.stringify(process.env.UPDATE_CHANNEL),
|
||||
'process.env.UPDATE_SERVER_URL': JSON.stringify(process.env.UPDATE_SERVER_URL),
|
||||
},
|
||||
@@ -214,6 +327,8 @@ export default defineConfig({
|
||||
},
|
||||
optimizeDeps: sharedOptimizeDeps,
|
||||
plugins: [
|
||||
isCloudDesktopBuild && cloudTsconfigPathsPlugin(),
|
||||
isCloudDesktopBuild && cloudDesktopBusinessConstPlugin(),
|
||||
forceAbsoluteBasePlugin(),
|
||||
electronDesktopHtmlPlugin(),
|
||||
vanillaExtractPlugin(),
|
||||
@@ -221,7 +336,7 @@ export default defineConfig({
|
||||
],
|
||||
resolve: {
|
||||
dedupe: ['react', 'react-dom'],
|
||||
tsconfigPaths: true,
|
||||
tsconfigPaths: !isCloudDesktopBuild,
|
||||
},
|
||||
// In dev the BrowserWindow loads `app://renderer/` and the Electron main process
|
||||
// proxies non-backend requests to this Vite dev server via `net.fetch`. The HMR
|
||||
|
||||
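The export rewrite in `replaceBusinessFlagExport` can be tried on a small input. The sketch below copies the regular expression from the diff into a standalone function named `replaceExport`; the sample module text and the `globalThis.flag` initializer are made up for illustration.

```typescript
// Same pattern as the diff: match `export const|let|var NAME = ...;`
// lazily up to the first semicolon and swap in a mutable `let` binding.
const replaceExport = (code: string, name: string, initializer: string) => {
  const pattern = new RegExp(`export\\s+(?:const|let|var)\\s+${name}\\s*=\\s*[\\s\\S]*?;`);
  return {
    code: code.replace(pattern, `export let ${name} = ${initializer};`),
    replaced: pattern.test(code),
  };
};

// Made-up module source standing in for the business-const file.
const sampleModule = [
  'export const ENABLE_BUSINESS_FEATURES = false;',
  'export const ENABLE_TOPIC_LINK_SHARE = false;',
].join('\n');

const rewritten = replaceExport(
  sampleModule,
  'ENABLE_BUSINESS_FEATURES',
  'Boolean(globalThis.flag)',
);
```

Two properties of this approach are worth keeping in mind: the lazy match stops at the first `;`, so an initializer that itself contains a semicolon would be cut short, and `String.prototype.replace` interprets `$` sequences in the replacement string. Neither appears to matter for the boolean flags handled here.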
@@ -56,6 +56,7 @@
|
||||
"@electron-toolkit/utils": "^4.0.0",
|
||||
"@lobechat/chat-adapter-imessage": "workspace:*",
|
||||
"@lobechat/desktop-bridge": "workspace:*",
|
||||
"@lobechat/device-control": "workspace:*",
|
||||
"@lobechat/device-gateway-client": "workspace:*",
|
||||
"@lobechat/device-identity": "workspace:*",
|
||||
"@lobechat/electron-client-ipc": "workspace:*",
|
||||
@@ -111,7 +112,7 @@
|
||||
"undici": "^7.16.0",
|
||||
"uuid": "^14.0.0",
|
||||
"vite": "8.0.14",
|
||||
"vitest": "3.2.4",
|
||||
"vitest": "3.2.6",
|
||||
"zod": "^3.25.76"
|
||||
},
|
||||
"optionalDependencies": {
|
||||
@@ -128,7 +129,7 @@
|
||||
"node-gyp": "^12.4.0",
|
||||
"react": "19.2.4",
|
||||
"react-dom": "19.2.4",
|
||||
"vitest": "3.2.4"
|
||||
"vitest": "3.2.6"
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -8,6 +8,7 @@ packages:
|
||||
- '../../packages/electron-client-ipc'
|
||||
- '../../packages/file-loaders'
|
||||
- '../../packages/desktop-bridge'
|
||||
- '../../packages/device-control'
|
||||
- '../../packages/device-gateway-client'
|
||||
- '../../packages/device-identity'
|
||||
- '../../packages/local-file-shell'
|
||||
|
||||
@@ -7,6 +7,7 @@ import { getDesktopEnv } from '@/env';
|
||||
export const isDev = electronIs.dev();
|
||||
|
||||
export const OFFICIAL_CLOUD_SERVER = getDesktopEnv().OFFICIAL_CLOUD_SERVER;
|
||||
export const DESKTOP_EXTERNAL_NAVIGATION_HOSTS = getDesktopEnv().DESKTOP_EXTERNAL_NAVIGATION_HOSTS;
|
||||
|
||||
export const isMac = electronIs.macOS();
|
||||
export const isWindows = electronIs.windows();
|
||||
|
||||
@@ -91,6 +91,13 @@ export default class BrowserWindowsCtr extends ControllerModule {
|
||||
});
|
||||
}
|
||||
|
||||
@IpcMethod()
|
||||
isWindowFullScreen() {
|
||||
return this.withSenderIdentifier((identifier) => {
|
||||
return this.app.browserManager.isWindowFullScreen(identifier);
|
||||
});
|
||||
}
|
||||
|
||||
@IpcMethod()
|
||||
setWindowAlwaysOnTop(flag: boolean) {
|
||||
this.withSenderIdentifier((identifier) => {
|
||||
|
||||
@@ -3,6 +3,7 @@ import fs from 'node:fs';
|
||||
import os from 'node:os';
|
||||
import path from 'node:path';
|
||||
|
||||
import { type DeviceControlDeps, executeDeviceRpc as runDeviceRpc } from '@lobechat/device-control';
|
||||
import type {
|
||||
AgentRunRequestMessage,
|
||||
GatewayMcpStdioParams,
|
||||
@@ -13,10 +14,8 @@ import type {
|
||||
GetCommandOutputParams,
|
||||
GlobFilesParams,
|
||||
GrepContentParams,
|
||||
InitWorkspaceParams,
|
||||
KillCommandParams,
|
||||
ListLocalFileParams,
|
||||
ListProjectSkillsParams,
|
||||
LocalReadFileParams,
|
||||
LocalReadFilesParams,
|
||||
LocalSearchFilesParams,
|
||||
@@ -29,15 +28,16 @@ import { type ILocalSystemService, LocalSystemExecutionRuntime } from '@lobechat
|
||||
|
||||
import GatewayConnectionService from '@/services/gatewayConnectionSrv';
|
||||
import ImessageBridgeService from '@/services/imessageBridgeSrv';
|
||||
import { createLogger } from '@/utils/logger';
|
||||
|
||||
import GitCtr from './GitCtr';
|
||||
import HeterogeneousAgentCtr from './HeterogeneousAgentCtr';
|
||||
import { ControllerModule, IpcMethod } from './index';
|
||||
import LocalFileCtr from './LocalFileCtr';
|
||||
import McpCtr from './McpCtr';
|
||||
import RemoteServerConfigCtr from './RemoteServerConfigCtr';
|
||||
import ShellCommandCtr from './ShellCommandCtr';
|
||||
import WorkspaceCtr from './WorkspaceCtr';
|
||||
|
||||
const logger = createLogger('controllers:GatewayConnectionCtr');
|
||||
|
||||
/**
|
||||
* Inject the lh-notify protocol into the first turn of a new hetero-agent session.
|
||||
@@ -166,14 +166,6 @@ export default class GatewayConnectionCtr extends ControllerModule {
|
||||
return this.app.getController(LocalFileCtr);
|
||||
}
|
||||
|
||||
private get workspaceCtr() {
|
||||
return this.app.getController(WorkspaceCtr);
|
||||
}
|
||||
|
||||
private get gitCtr() {
|
||||
return this.app.getController(GitCtr);
|
||||
}
|
||||
|
||||
private get shellCommandCtr() {
|
||||
return this.app.getController(ShellCommandCtr);
|
||||
}
|
||||
@@ -300,6 +292,7 @@ export default class GatewayConnectionCtr extends ControllerModule {
|
||||
this.heterogeneousAgentCtr.spawnLhHeteroExec({
|
||||
agentType: request.agentType,
|
||||
cwd: request.cwd,
|
||||
imageList: request.imageList,
|
||||
jwt: request.jwt,
|
||||
operationId: request.operationId,
|
||||
prompt: request.prompt,
|
||||
@@ -351,87 +344,33 @@ export default class GatewayConnectionCtr extends ControllerModule {
|
||||
return this.localSystemRuntime;
|
||||
}
|
||||
|
||||
/**
|
||||
* Platform-specific handlers the shared `@lobechat/device-control` dispatcher
|
||||
* delegates to. Git + workspace-scan methods run inside device-control over
|
||||
* `@lobechat/local-file-shell`; only file preview / index (and preview
|
||||
* approval) are desktop-specific and routed back to the controllers here.
|
||||
*/
|
||||
private get deviceControlDeps(): DeviceControlDeps {
|
||||
return {
|
||||
approveProjectRoot: async (root) => {
|
||||
try {
|
||||
await this.app.localFileProtocolManager.approveIndexedProjectRoot(root);
|
||||
} catch (error) {
|
||||
logger.error(`Failed to approve project preview root ${root}:`, error);
|
||||
}
|
||||
},
|
||||
getLocalFilePreview: (params) => this.localFileCtr.getLocalFilePreview(params),
|
||||
getProjectFileIndex: (params) => this.localFileCtr.getProjectFileIndex(params),
|
||||
};
|
||||
}
|
||||
|
||||
/**
|
||||
* Dispatch a generic server-internal device RPC (not an agent tool call) by
|
||||
* method name. Currently only `initWorkspace` (scan the bound project root for
|
||||
* skills + AGENTS.md); add new server-only device methods here.
|
||||
* method name. The dispatch logic lives in `@lobechat/device-control` so the
|
||||
* desktop main process and the CLI daemon share one device RPC surface.
|
||||
*/
|
||||
private async executeDeviceRpc(method: string, params: unknown): Promise<unknown> {
|
||||
switch (method) {
|
||||
case 'initWorkspace': {
|
||||
return this.workspaceCtr.initWorkspace(params as InitWorkspaceParams);
|
||||
}
|
||||
|
||||
case 'getGitBranch': {
|
||||
return this.gitCtr.getGitBranch((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'getLinkedPullRequest': {
|
||||
return this.gitCtr.getLinkedPullRequest(params as { branch: string; path: string });
|
||||
}
|
||||
|
||||
case 'getGitWorkingTreeStatus': {
|
||||
return this.gitCtr.getGitWorkingTreeStatus((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'getGitAheadBehind': {
|
||||
return this.gitCtr.getGitAheadBehind((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'listGitBranches': {
|
||||
return this.gitCtr.listGitBranches((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'checkoutGitBranch': {
|
||||
return this.gitCtr.checkoutGitBranch(
|
||||
params as { branch: string; create?: boolean; path: string },
|
||||
);
|
||||
}
|
||||
|
||||
case 'pullGitBranch': {
|
||||
return this.gitCtr.pullGitBranch(params as { path: string });
|
||||
}
|
||||
|
||||
case 'pushGitBranch': {
|
||||
return this.gitCtr.pushGitBranch(params as { path: string });
|
||||
}
|
||||
|
||||
case 'getGitWorkingTreePatches': {
|
||||
return this.gitCtr.getGitWorkingTreePatches((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'getGitWorkingTreeFiles': {
|
||||
return this.gitCtr.getGitWorkingTreeFiles((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'getProjectFileIndex': {
|
||||
return this.localFileCtr.getProjectFileIndex(params as { scope?: string });
|
||||
}
|
||||
|
||||
case 'listProjectSkills': {
|
||||
return this.workspaceCtr.listProjectSkills(params as ListProjectSkillsParams);
|
||||
}
|
||||
|
||||
case 'getGitBranchDiff': {
|
||||
return this.gitCtr.getGitBranchDiff(params as { baseRef?: string; path: string });
|
||||
}
|
||||
|
||||
case 'listGitRemoteBranches': {
|
||||
return this.gitCtr.listGitRemoteBranches((params as { path: string }).path);
|
||||
}
|
||||
|
||||
case 'revertGitFile': {
|
||||
return this.gitCtr.revertGitFile(params as { filePath: string; path: string });
|
||||
}
|
||||
|
||||
case 'statPath': {
|
||||
return this.workspaceCtr.statPath(params as { path: string });
|
||||
}
|
||||
|
||||
default: {
|
||||
throw new Error(`Unknown device RPC method: ${method}`);
|
||||
}
|
||||
}
|
||||
return runDeviceRpc(method, params, this.deviceControlDeps);
|
||||
}
|
||||
|
||||
private async executeToolCall(
|
||||
|
||||
File diff suppressed because it is too large
@@ -18,13 +18,20 @@ import {
|
||||
} from '@lobechat/electron-client-ipc';
|
||||
import type { AskUserBridge } from '@lobechat/heterogeneous-agents/askUser';
|
||||
import { AskUserMcpServer } from '@lobechat/heterogeneous-agents/askUser';
|
||||
import type { AgentContentBlock } from '@lobechat/heterogeneous-agents/spawn';
|
||||
import type {
|
||||
AgentContentBlock,
|
||||
HeteroExecImageRef,
|
||||
} from '@lobechat/heterogeneous-agents/protocol';
|
||||
import { buildHeteroExecStdinPayload } from '@lobechat/heterogeneous-agents/protocol';
|
||||
import type { AgentStreamEvent, UsageData } from '@lobechat/heterogeneous-agents/spawn';
|
||||
import {
|
||||
AgentStreamPipeline,
|
||||
buildAgentInput,
|
||||
materializeImageToPath,
|
||||
normalizeImage,
|
||||
readCodexSessionModel,
|
||||
resolveCliSpawnPlan,
|
||||
resolveCodexInitialModel,
|
||||
} from '@lobechat/heterogeneous-agents/spawn';
|
||||
import { app as electronApp, BrowserWindow } from 'electron';
|
||||
|
||||
@@ -176,9 +183,33 @@ interface AgentSession {
|
||||
command: string;
|
||||
cwd?: string;
|
||||
env?: Record<string, string>;
|
||||
model?: string;
|
||||
modelSource?: string;
|
||||
modelVerificationLastAttemptAt?: number;
|
||||
modelVerificationLastAttemptSessionId?: string;
|
||||
process?: ChildProcess;
|
||||
/**
|
||||
* Absolute CLI path resolved by spawn preflight detection. Used for spawn()
|
||||
* when the configured command is bare: detection can find the CLI through
|
||||
* the login-shell PATH or a well-known install location (e.g. the Codex.app
|
||||
* bundled CLI) that plain spawn() with the inherited env can't resolve.
|
||||
*/
|
||||
resolvedCommandPath?: string;
|
||||
/**
|
||||
* PATH the preflight detector used to resolve `resolvedCommandPath`, set only
|
||||
* when it fell back to the login-shell PATH. Merged into the child PATH at
|
||||
* spawn so a `#!/usr/bin/env node` shim still finds its interpreter — the
|
||||
* shim resolving in preflight doesn't guarantee `node` is on the leaner
|
||||
* inherited PATH (Finder-launched Electron).
|
||||
*/
|
||||
resolvedCommandSearchPath?: string;
|
||||
resumeSessionId?: string;
|
||||
sessionId: string;
|
||||
verifiedModel?: string;
|
||||
verifiedModelContextWindow?: number;
|
||||
verifiedModelProvider?: string;
|
||||
verifiedModelSessionId?: string;
|
||||
verifiedModelSourceFile?: string;
|
||||
}
|
||||
|
||||
type SessionErrorPayload = HeterogeneousAgentSessionError | string;
|
||||
@@ -454,11 +485,20 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
session.agentType === 'claude-code' ? 'claude-code' : 'codex',
|
||||
command,
|
||||
);
|
||||
const cliMissingError = this.buildCliMissingError(session);
|
||||
|
||||
if (!status || status.available || !cliMissingError) return;
|
||||
if (!status || status.available) {
|
||||
// Spawn through the detector-resolved absolute path when the configured
|
||||
// command is bare — detection may have located the CLI somewhere plain
|
||||
// spawn() can't (login-shell PATH, Codex.app bundled CLI, …).
|
||||
const useResolvedPath = Boolean(status?.path) && !command.includes(path.sep);
|
||||
session.resolvedCommandPath = useResolvedPath ? status!.path : undefined;
|
||||
// Carry the login-shell PATH the detector resolved through, so a
|
||||
// `#!/usr/bin/env node` shim spawned by absolute path still finds `node`.
|
||||
session.resolvedCommandSearchPath = useResolvedPath ? status!.resolvedPathEnv : undefined;
|
||||
return;
|
||||
}
|
||||
|
||||
return cliMissingError;
|
||||
return this.buildCliMissingError(session);
|
||||
}
|
||||
|
||||
private get shouldTraceCliOutput(): boolean {
|
||||
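The preflight decision above only swaps in the detector's absolute path when the configured command is bare. A standalone sketch of that rule, assuming a hypothetical `pickSpawnCommand` helper and a simplified `CliStatus` shape; the separator is passed in as a parameter here, whereas the real code uses `path.sep`.

```typescript
// Simplified stand-in for the detector result used in the diff.
interface CliStatus {
  available: boolean;
  path?: string;
  resolvedPathEnv?: string;
}

function pickSpawnCommand(command: string, status?: CliStatus, sep = '/') {
  // A command containing a separator was configured as an explicit path,
  // so it is left alone; only bare names are replaced.
  const resolvedPath = status?.path;
  const useResolvedPath = Boolean(resolvedPath) && !command.includes(sep);
  return {
    command: useResolvedPath && resolvedPath ? resolvedPath : command,
    // PATH to merge into the child env, set only alongside a resolved path.
    searchPath: useResolvedPath ? status?.resolvedPathEnv : undefined,
  };
}
```

The second field matters for `#!/usr/bin/env node` shims: spawning the shim by absolute path still requires `node` to be discoverable on the child's PATH.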
@@ -581,12 +621,19 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
createdAt: createdAt.toISOString(),
|
||||
cwd,
|
||||
envKeys: session.env ? Object.keys(session.env).sort() : [],
|
||||
model: session.model,
|
||||
modelSource: session.modelSource,
|
||||
resumeSessionId: session.resumeSessionId,
|
||||
sessionId: session.sessionId,
|
||||
stdinBytes: stdinPayload === undefined ? 0 : Buffer.byteLength(stdinPayload),
|
||||
stdinFile: stdinPayload === undefined ? undefined : 'stdin.txt',
|
||||
stderrFile: 'stderr.log',
|
||||
stdoutFile: 'stdout.jsonl',
|
||||
verifiedModel: session.verifiedModel,
|
||||
verifiedModelContextWindow: session.verifiedModelContextWindow,
|
||||
verifiedModelProvider: session.verifiedModelProvider,
|
||||
verifiedModelSessionId: session.verifiedModelSessionId,
|
||||
verifiedModelSourceFile: session.verifiedModelSourceFile,
|
||||
},
|
||||
null,
|
||||
2,
|
||||
@@ -888,6 +935,8 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
let spawnPlan;
|
||||
let traceSession;
|
||||
let cwd: string;
|
||||
let initialCumulativeUsage: UsageData | undefined;
|
||||
let spawnEnv: NodeJS.ProcessEnv;
|
||||
try {
|
||||
const driver = getHeterogeneousAgentDriver(session.agentType);
|
||||
spawnPlan = await driver.buildSpawnPlan({
|
||||
@@ -906,6 +955,34 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
// Fall back to the user's Desktop so the process never inherits
|
||||
// the Electron parent's cwd (which is `/` when launched from Finder).
|
||||
cwd = session.cwd || electronApp.getPath('desktop');
|
||||
|
||||
// Forward the user's proxy settings to the CLI. The main-process undici
|
||||
// dispatcher doesn't reach child processes — they need env vars.
|
||||
const proxyEnv = buildProxyEnv(this.app.storeManager.get('networkProxy'));
|
||||
const inheritedEnv = buildInheritedSpawnEnv();
|
||||
// When preflight resolved the CLI via the login-shell PATH, spawn with
|
||||
// that PATH (a superset of the inherited one) so a `#!/usr/bin/env node`
|
||||
// shim finds its interpreter. `session.env` still wins if it sets PATH.
|
||||
if (session.resolvedCommandSearchPath) inheritedEnv.PATH = session.resolvedCommandSearchPath;
|
||||
spawnEnv = { ...inheritedEnv, ...proxyEnv, ...session.env };
|
||||
|
||||
if (session.agentType === 'codex') {
|
||||
const initialModel = await resolveCodexInitialModel({
|
||||
args: spawnPlan.args,
|
||||
env: spawnEnv,
|
||||
});
|
||||
if (initialModel?.model) {
|
||||
session.model = initialModel.model;
|
||||
session.modelSource = initialModel.source;
|
||||
}
|
||||
|
||||
if (session.agentSessionId) {
|
||||
initialCumulativeUsage = (
|
||||
await readCodexSessionModel(session.agentSessionId, { env: spawnEnv })
|
||||
)?.cumulativeUsage;
|
||||
}
|
||||
}
|
||||
|
||||
traceSession = await this.createCliTraceSession({
|
||||
cliArgs: spawnPlan.args,
|
||||
cwd,
|
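Review note on the hunk above: the spawn env is now built once, before the trace session is created, and the precedence is inherited env, then proxy vars, then `session.env`, with the detector's login-shell PATH replacing the inherited PATH first. A minimal sketch of that layering follows. `composeSpawnEnv`, its option names, and the example values (including `HTTPS_PROXY`) are hypothetical and are not part of the diff.

```typescript
// Sketch only: `composeSpawnEnv` and its option names are hypothetical.
type Env = Record<string, string | undefined>;

interface ComposeSpawnEnvOptions {
  inherited: Env;
  proxy: Env;
  resolvedSearchPath?: string;
  sessionEnv?: Env;
}

function composeSpawnEnv({
  inherited,
  proxy,
  resolvedSearchPath,
  sessionEnv,
}: ComposeSpawnEnvOptions): Env {
  // Copy first so the caller's inherited env object is not mutated.
  const base: Env = { ...inherited };

  // The login-shell PATH (when preflight produced one) replaces the inherited PATH.
  if (resolvedSearchPath) base.PATH = resolvedSearchPath;

  // Later spreads win: proxy vars override inherited ones, session env overrides both.
  return { ...base, ...proxy, ...sessionEnv };
}

const spawnEnvExample = composeSpawnEnv({
  inherited: { HOME: '/Users/h', PATH: '/usr/bin:/bin' },
  proxy: { HTTPS_PROXY: 'http://127.0.0.1:7890' },
  resolvedSearchPath: '/Users/h/.local/share/mise/shims:/usr/bin:/bin',
  sessionEnv: { OPENAI_API_KEY: 'sk-example' },
});

const spawnEnvWithSessionPath = composeSpawnEnv({
  inherited: { PATH: '/usr/bin' },
  proxy: {},
  resolvedSearchPath: '/opt/shims:/usr/bin',
  sessionEnv: { PATH: '/custom/bin' },
});
```

Because `session.env` is spread last, a session that sets its own `PATH` still overrides the detector's search path, which matches the comment in the hunk.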
||||
@@ -925,7 +1002,10 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
}
|
||||
const useStdin = spawnPlan.stdinPayload !== undefined;
|
||||
const cliArgs = spawnPlan.args;
|
||||
const resolvedCliSpawnPlan = await resolveCliSpawnPlan(session.command, cliArgs);
|
||||
const resolvedCliSpawnPlan = await resolveCliSpawnPlan(
|
||||
session.resolvedCommandPath ?? session.command,
|
||||
cliArgs,
|
||||
);
|
||||
|
||||
logger.info(
|
||||
'Spawning agent:',
|
||||
@@ -940,29 +1020,28 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
// the claude binary can leave bash/grep/etc. tool children running and
|
||||
// the CLI hung waiting on them. Windows has different semantics — use
|
||||
// taskkill /T /F there; no detached flag needed.
|
||||
// Forward the user's proxy settings to the CLI. The main-process undici
|
||||
// dispatcher doesn't reach child processes — they need env vars.
|
||||
const proxyEnv = buildProxyEnv(this.app.storeManager.get('networkProxy'));
|
||||
|
||||
const spawnOptions = {
|
||||
cwd,
|
||||
detached: process.platform !== 'win32',
|
||||
// Strip host Anthropic creds from the inherited env so a developer's
|
||||
// shell `ANTHROPIC_API_KEY` can't hijack the CLI's own auth. `session.env`
|
||||
// is spread last, so an agent that explicitly configures a key still wins.
|
||||
env: { ...buildInheritedSpawnEnv(), ...proxyEnv, ...session.env },
|
||||
env: spawnEnv,
|
||||
stdio: [useStdin ? 'pipe' : 'ignore', 'pipe', 'pipe'] as ['pipe' | 'ignore', 'pipe', 'pipe'],
|
||||
};
|
||||
|
||||
return new Promise<void>((resolve, reject) => {
|
||||
const proc = spawn(resolvedCliSpawnPlan.command, resolvedCliSpawnPlan.args, spawnOptions);
|
||||
this.handleSpawnedAgentProcess({
|
||||
cwd,
|
||||
intervention,
|
||||
params,
|
||||
proc,
|
||||
reject,
|
||||
resolve,
|
||||
session,
|
||||
initialCumulativeUsage,
|
||||
spawnEnv,
|
||||
traceSession,
|
||||
useStdin,
|
||||
spawnPlan,
|
||||
@@ -970,23 +1049,88 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
});
|
||||
}
|
||||
|
||||
private async verifyCodexSessionModel({
|
||||
env,
|
||||
pipeline,
|
||||
session,
|
||||
traceSession,
|
||||
}: {
|
||||
env: NodeJS.ProcessEnv;
|
||||
pipeline: AgentStreamPipeline;
|
||||
session: AgentSession;
|
||||
traceSession: CliTraceSession | undefined;
|
||||
}): Promise<AgentStreamEvent[]> {
|
||||
if (
|
||||
session.agentType !== 'codex' ||
|
||||
!pipeline.sessionId ||
|
||||
session.verifiedModelSessionId === pipeline.sessionId
|
||||
) {
|
||||
return [];
|
||||
}
|
||||
|
||||
const now = Date.now();
|
||||
if (
|
||||
session.modelVerificationLastAttemptSessionId === pipeline.sessionId &&
|
||||
session.modelVerificationLastAttemptAt &&
|
||||
now - session.modelVerificationLastAttemptAt < 1000
|
||||
) {
|
||||
return [];
|
||||
}
|
||||
session.modelVerificationLastAttemptSessionId = pipeline.sessionId;
|
||||
session.modelVerificationLastAttemptAt = now;
|
||||
|
||||
const sessionModel = await readCodexSessionModel(pipeline.sessionId, { env });
|
||||
if (!sessionModel?.model) return [];
|
||||
|
||||
const previousModel = session.model;
|
||||
session.verifiedModel = sessionModel.model;
|
||||
session.verifiedModelContextWindow = sessionModel.contextWindow;
|
||||
session.verifiedModelProvider = sessionModel.provider;
|
||||
session.verifiedModelSessionId = pipeline.sessionId;
|
||||
session.verifiedModelSourceFile = sessionModel.sourceFile;
|
||||
|
||||
void this.writeCliTraceJson(traceSession, 'model.json', {
|
||||
initialModel: previousModel,
|
||||
initialModelSource: session.modelSource,
|
||||
sessionId: pipeline.sessionId,
|
||||
verifiedAt: new Date().toISOString(),
|
||||
verifiedContextWindow: sessionModel.contextWindow,
|
||||
verifiedLine: sessionModel.line,
|
||||
verifiedModel: sessionModel.model,
|
||||
verifiedModelProvider: sessionModel.provider,
|
||||
verifiedSourceFile: sessionModel.sourceFile,
|
||||
});
|
||||
|
||||
if (previousModel === sessionModel.model) return [];
|
||||
|
||||
session.model = sessionModel.model;
|
||||
session.modelSource = 'codex-session';
|
||||
return pipeline.configureSession({ model: sessionModel.model });
|
||||
}
|
||||
|
||||
private handleSpawnedAgentProcess({
|
||||
cwd,
|
||||
initialCumulativeUsage,
|
||||
intervention,
|
||||
params,
|
||||
proc,
|
||||
reject,
|
||||
resolve,
|
||||
session,
|
||||
spawnEnv,
|
||||
spawnPlan,
|
||||
traceSession,
|
||||
useStdin,
|
||||
}: {
|
||||
cwd: string;
|
||||
intervention?: Awaited<ReturnType<HeterogeneousAgentCtr['setupInterventionForOp']>>;
|
||||
params: SendPromptParams;
|
||||
proc: ChildProcess;
|
||||
reject: (reason?: unknown) => void;
|
||||
resolve: () => void;
|
||||
session: AgentSession;
|
||||
initialCumulativeUsage?: UsageData | undefined;
|
||||
spawnEnv: NodeJS.ProcessEnv;
|
||||
spawnPlan: HeterogeneousAgentBuildPlan;
|
||||
traceSession: CliTraceSession | undefined;
|
||||
useStdin: boolean;
|
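Review note on `verifyCodexSessionModel`: the method skips work when the session is already verified and otherwise retries at most once per second per session id. The guard can be read in isolation as the sketch below. The `VerificationState` shape and the function name are hypothetical; the diff stores the equivalent fields directly on `session`. The sketch also compares the timestamp against `undefined` rather than relying on truthiness, which is a small deviation from the diff.

```typescript
// Sketch only: names are hypothetical; the diff keeps these fields on `session`.
interface VerificationState {
  lastAttemptAt?: number;
  lastAttemptSessionId?: string;
  verifiedSessionId?: string;
}

const MIN_RETRY_INTERVAL_MS = 1000;

function shouldAttemptVerification(
  state: VerificationState,
  sessionId: string | undefined,
  now: number,
): boolean {
  // Nothing to verify yet, or this session id was already verified.
  if (!sessionId || state.verifiedSessionId === sessionId) return false;

  // Throttle repeated attempts for the same session id.
  const recentlyTried =
    state.lastAttemptSessionId === sessionId &&
    state.lastAttemptAt !== undefined &&
    now - state.lastAttemptAt < MIN_RETRY_INTERVAL_MS;
  if (recentlyTried) return false;

  // Record the attempt before the (async) session-file read would start.
  state.lastAttemptSessionId = sessionId;
  state.lastAttemptAt = now;
  return true;
}

const throttleState: VerificationState = {};
const firstAttempt = shouldAttemptVerification(throttleState, 's1', 5000);
const tooSoon = shouldAttemptVerification(throttleState, 's1', 5500);
const afterInterval = shouldAttemptVerification(throttleState, 's1', 6000);
```

Recording the attempt before reading the session file means overlapping stdout chunks that arrive within the same second do not each trigger a read.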
||||
@@ -1021,10 +1165,13 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
// toStreamEvent all run inside the shared pipeline, so renderer + future
|
||||
// server `heteroIngest` see the same `AgentStreamEvent` wire shape with
|
||||
// no per-consumer adapter. The pipeline auto-wires the Codex
|
||||
// file-change line-stat tracker when `agentType === 'codex'`, so this
|
||||
// file-change diff/stat tracker when `agentType === 'codex'`, so this
|
||||
// controller stays agent-agnostic.
|
||||
const pipeline = new AgentStreamPipeline({
|
||||
agentType: session.agentType,
|
||||
cwd,
|
||||
initialCumulativeUsage,
|
||||
initialModel: session.model,
|
||||
operationId: params.operationId,
|
||||
});
|
||||
let stdoutBroadcastQueue: Promise<void> = Promise.resolve();
|
||||
@@ -1039,6 +1186,14 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
if (pipeline.sessionId && pipeline.sessionId !== session.agentSessionId) {
|
||||
session.agentSessionId = pipeline.sessionId;
|
||||
}
|
||||
events.push(
|
||||
...(await this.verifyCodexSessionModel({
|
||||
env: spawnEnv,
|
||||
pipeline,
|
||||
session,
|
||||
traceSession,
|
||||
})),
|
||||
);
|
||||
for (const event of events) {
|
||||
this.broadcast('heteroAgentEvent', {
|
||||
event,
|
||||
@@ -1317,6 +1472,8 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
spawnLhHeteroExec(params: {
|
||||
agentType: string;
|
||||
cwd?: string;
|
||||
/** Image attachments (signed URLs) appended as image content blocks. */
|
||||
imageList?: HeteroExecImageRef[];
|
||||
jwt: string;
|
||||
operationId: string;
|
||||
prompt: string;
|
||||
@@ -1328,6 +1485,7 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
const {
|
||||
agentType,
|
||||
cwd,
|
||||
imageList,
|
||||
jwt,
|
||||
operationId,
|
||||
prompt,
|
||||
@@ -1380,16 +1538,11 @@ export default class HeterogeneousAgentCtr extends ControllerModule {
|
||||
stdio: ['pipe', 'inherit', 'inherit'],
|
||||
});
|
||||
|
||||
// When systemContext is provided, send a content-block array so CC sees the
|
||||
// context block first, then the user's actual message — mirrors
|
||||
// spawnHeteroSandbox. lh handles JSON arrays via coerceJsonPrompt, so no lh
|
||||
// changes are required.
|
||||
const stdinPayload = systemContext
|
||||
? JSON.stringify([
|
||||
{ text: systemContext, type: 'text' },
|
||||
{ text: prompt, type: 'text' },
|
||||
])
|
||||
: JSON.stringify(prompt);
|
||||
// systemContext / image attachments turn the payload into a content-block
|
||||
// array so CC sees the context block first, then the user's message, then
|
||||
// the images — mirrors spawnHeteroSandbox. lh handles both shapes via
|
||||
// coerceJsonPrompt, so no lh changes are required.
|
||||
const stdinPayload = buildHeteroExecStdinPayload({ imageList, prompt, systemContext });
|
||||
child.stdin.write(stdinPayload);
|
||||
child.stdin.end();
|
||||
|
||||
|
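Review note on `buildHeteroExecStdinPayload`: its body is not part of this diff, so the sketch below is only a guess at the behaviour described by the new comment (context block first, then the user's message, then the images) combined with the removed inline code. `sketchStdinPayload`, `ImageRefSketch`, and in particular the `{ type: 'image', url }` block shape are assumptions; the real `HeteroExecImageRef` fields are not shown here.

```typescript
// Sketch only: the image block shape and `ImageRefSketch` are assumptions.
interface ImageRefSketch {
  url: string;
}

type ContentBlockSketch = { text: string; type: 'text' } | { type: 'image'; url: string };

function sketchStdinPayload(opts: {
  imageList?: ImageRefSketch[];
  prompt: string;
  systemContext?: string;
}): string {
  const images = opts.imageList ?? [];

  // Plain prompts stay a JSON string, as in the removed `JSON.stringify(prompt)` branch.
  if (!opts.systemContext && images.length === 0) return JSON.stringify(opts.prompt);

  const blocks: ContentBlockSketch[] = [];
  // Order matters: context first, then the user's message, then the images.
  if (opts.systemContext) blocks.push({ text: opts.systemContext, type: 'text' });
  blocks.push({ text: opts.prompt, type: 'text' });
  for (const image of images) blocks.push({ type: 'image', url: image.url });

  return JSON.stringify(blocks);
}

const plainPayload = sketchStdinPayload({ prompt: 'hi' });
const contextPayload = sketchStdinPayload({ prompt: 'hi', systemContext: 'ctx' });
const imagePayload = sketchStdinPayload({
  imageList: [{ url: 'https://example.invalid/a.png' }],
  prompt: 'hi',
});
```

Either output shape (a JSON string or a JSON array) is what the comment says `coerceJsonPrompt` already accepts, so the sketch keeps both.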
||||
@@ -12,6 +12,7 @@ import {
|
||||
type GrepContentParams,
|
||||
type GrepContentResult,
|
||||
type ListLocalFileParams,
|
||||
type LocalFilePreviewResult,
|
||||
type LocalFilePreviewUrlParams,
|
||||
type LocalFilePreviewUrlResult,
|
||||
type LocalMoveFilesResultItem,
|
||||
@@ -65,6 +66,19 @@ const logger = createLogger('controllers:LocalFileCtr');
|
||||
|
||||
const SAFE_PATH_PREFIXES = ['/tmp', '/var/tmp'] as const;
|
||||
|
||||
const TEXT_PREVIEW_MIME_TYPES = new Set([
|
||||
'application/graphql',
|
||||
'application/javascript',
|
||||
'application/json',
|
||||
'application/markdown',
|
||||
'application/toml',
|
||||
'application/xml',
|
||||
'application/yaml',
|
||||
'text/markdown',
|
||||
'text/mdx',
|
||||
'text/x-markdown',
|
||||
]);
|
||||
|
||||
const normalizeAbsolutePath = (inputPath: string): string =>
|
||||
path.normalize(path.isAbsolute(inputPath) ? inputPath : `/${inputPath}`);
|
||||
|
||||
@@ -91,6 +105,48 @@ const resolveNearestExistingRealPath = async (targetPath: string): Promise<strin
|
||||
|
||||
const toPosixRelativePath = (filePath: string) => filePath.split(path.sep).join('/');
|
||||
|
||||
const normalizeContentType = (contentType: string): string =>
|
||||
contentType.split(';')[0].trim().toLowerCase();
|
||||
|
||||
const isTextPreviewMimeType = (mimeType: string): boolean =>
|
||||
mimeType.startsWith('text/') || TEXT_PREVIEW_MIME_TYPES.has(mimeType);
|
||||
|
||||
const serializePreviewFile = ({
|
||||
buffer,
|
||||
contentType,
|
||||
}: {
|
||||
buffer: Buffer;
|
||||
contentType: string;
|
||||
}): NonNullable<LocalFilePreviewResult['preview']> => {
|
||||
const normalizedContentType = normalizeContentType(contentType);
|
||||
|
||||
if (normalizedContentType.startsWith('image/')) {
|
||||
return {
|
||||
base64: buffer.toString('base64'),
|
||||
contentType: normalizedContentType,
|
||||
type: 'image',
|
||||
};
|
||||
}
|
||||
|
||||
if (isTextPreviewMimeType(normalizedContentType)) {
|
||||
return {
|
||||
content: buffer.toString('utf8'),
|
||||
contentType: normalizedContentType,
|
||||
type: 'text',
|
||||
};
|
||||
}
|
||||
|
||||
if (normalizedContentType === 'application/pdf') {
|
||||
return { contentType: normalizedContentType, type: 'pdf' };
|
||||
}
|
||||
|
||||
if (normalizedContentType.startsWith('video/')) {
|
||||
return { contentType: normalizedContentType, type: 'video' };
|
||||
}
|
||||
|
||||
return { contentType: normalizedContentType, type: 'binary' };
|
||||
};
|
||||
|
||||
const createProjectFileEntry = (
|
||||
root: string,
|
||||
absolutePath: string,
|
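Review note on `serializePreviewFile`: the branch order decides the preview type, so it may help to see the classification on its own. The sketch below follows the order used in the hunk (image, text, pdf, video, binary) but uses a shortened allow-list; `classifyPreview`, `PreviewKind`, and `TEXT_LIKE_TYPES` are hypothetical names.

```typescript
// Sketch only: mirrors the branch order in the hunk with a shortened allow-list.
type PreviewKind = 'binary' | 'image' | 'pdf' | 'text' | 'video';

const TEXT_LIKE_TYPES = new Set(['application/json', 'application/xml', 'application/yaml']);

function classifyPreview(contentType: string): PreviewKind {
  // Drop parameters such as `; charset=utf-8` and normalise case.
  const [head = ''] = contentType.split(';');
  const mime = head.trim().toLowerCase();

  if (mime.startsWith('image/')) return 'image';
  if (mime.startsWith('text/') || TEXT_LIKE_TYPES.has(mime)) return 'text';
  if (mime === 'application/pdf') return 'pdf';
  if (mime.startsWith('video/')) return 'video';
  return 'binary';
}

const previewKinds = [
  'Text/Plain; charset=utf-8',
  'application/json',
  'image/svg+xml',
  'application/pdf',
  'application/zip',
].map((contentType) => classifyPreview(contentType));
```

Since the image check runs first, `image/svg+xml` is returned as a base64 image rather than as text, and anything under `text/` is treated as text regardless of the allow-list.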
||||
@@ -381,11 +437,13 @@ export default class LocalFileCtr extends ControllerModule {
|
||||
|
||||
@IpcMethod()
|
||||
async getLocalFilePreviewUrl({
|
||||
accept,
|
||||
path: filePath,
|
||||
workingDirectory,
|
||||
}: LocalFilePreviewUrlParams): Promise<LocalFilePreviewUrlResult> {
|
||||
try {
|
||||
const url = await this.app.localFileProtocolManager.createPreviewUrl({
|
||||
accept,
|
||||
filePath,
|
||||
workspaceRoot: workingDirectory,
|
||||
});
|
||||
@@ -401,6 +459,33 @@ export default class LocalFileCtr extends ControllerModule {
|
||||
}
|
||||
}
|
||||
|
||||
@IpcMethod()
|
||||
async getLocalFilePreview({
|
||||
accept,
|
||||
path: filePath,
|
||||
workingDirectory,
|
||||
}: LocalFilePreviewUrlParams): Promise<LocalFilePreviewResult> {
|
||||
try {
|
||||
const preview = await this.app.localFileProtocolManager.readPreviewFile({
|
||||
accept,
|
||||
filePath,
|
||||
workspaceRoot: workingDirectory,
|
||||
});
|
||||
|
||||
if (!preview) {
|
||||
return { error: 'File is outside the approved workspace', success: false };
|
||||
}
|
||||
|
||||
return {
|
||||
preview: serializePreviewFile(preview),
|
||||
success: true,
|
||||
};
|
||||
} catch (error) {
|
||||
logger.error('Failed to read local file preview:', error);
|
||||
return { error: (error as Error).message, success: false };
|
||||
}
|
||||
}
|
||||
|
||||
@IpcMethod()
|
||||
async handlePrepareSkillDirectory({
|
||||
forceRefresh,
|
||||
|
||||
@@ -1,244 +1,53 @@
|
||||
import { readdir, readFile, stat } from 'node:fs/promises';
|
||||
import path from 'node:path';
|
||||
|
||||
import {
|
||||
initWorkspace as runInitWorkspace,
|
||||
listProjectSkills as runListProjectSkills,
|
||||
statPath as runStatPath,
|
||||
type WorkspaceScanDeps,
|
||||
} from '@lobechat/device-control';
|
||||
import {
|
||||
type InitWorkspaceParams,
|
||||
type InitWorkspaceResult,
|
||||
type ListProjectSkillsParams,
|
||||
type ListProjectSkillsResult,
|
||||
type ProjectSkillItem,
|
||||
} from '@lobechat/electron-client-ipc';
|
||||
|
||||
import { detectRepoType } from '@/utils/git';
|
||||
import { createLogger } from '@/utils/logger';
|
||||
|
||||
import { ControllerModule, IpcMethod } from './index';
|
||||
|
||||
const logger = createLogger('controllers:WorkspaceCtr');
|
||||
|
||||
const SKILL_FRONTMATTER_RE = /^---\r?\n([\s\S]*?)\r?\n---/;
|
||||
|
||||
// Cap recursion to guard against pathological directory trees.
|
||||
const MAX_SKILL_FILE_COUNT = 1000;
|
||||
|
||||
const toPosixRelativePath = (filePath: string) => filePath.split(path.sep).join('/');
|
||||
|
||||
const listSkillFilesRecursive = async (dir: string): Promise<string[]> => {
|
||||
const results: string[] = [];
|
||||
const stack: string[] = [dir];
|
||||
|
||||
while (stack.length > 0 && results.length < MAX_SKILL_FILE_COUNT) {
|
||||
const current = stack.pop()!;
|
||||
let entries;
|
||||
try {
|
||||
entries = await readdir(current, { withFileTypes: true });
|
||||
} catch {
|
||||
continue;
|
||||
}
|
||||
for (const entry of entries) {
|
||||
if (entry.name.startsWith('.')) continue;
|
||||
const full = path.join(current, entry.name);
|
||||
if (entry.isDirectory()) {
|
||||
stack.push(full);
|
||||
} else if (entry.isFile()) {
|
||||
results.push(toPosixRelativePath(path.relative(dir, full)));
|
||||
if (results.length >= MAX_SKILL_FILE_COUNT) break;
|
||||
}
|
||||
}
|
||||
}
|
||||
return results.sort();
|
||||
};
|
||||
|
||||
// Parse a minimal YAML frontmatter block for SKILL.md files.
|
||||
// Only handles `key: value` lines; multi-line block scalars fall back to the first line.
|
||||
const parseSkillFrontmatter = (raw: string): Record<string, string> => {
|
||||
const match = raw.match(SKILL_FRONTMATTER_RE);
|
||||
if (!match) return {};
|
||||
|
||||
const fields: Record<string, string> = {};
|
||||
for (const line of match[1].split(/\r?\n/)) {
|
||||
const colonIdx = line.indexOf(':');
|
||||
if (colonIdx === -1) continue;
|
||||
const key = line.slice(0, colonIdx).trim();
|
||||
if (!key || key.startsWith('#')) continue;
|
||||
let value = line.slice(colonIdx + 1).trim();
|
||||
if (value.startsWith('|') || value.startsWith('>')) continue;
|
||||
if (
|
||||
(value.startsWith('"') && value.endsWith('"')) ||
|
||||
(value.startsWith("'") && value.endsWith("'"))
|
||||
) {
|
||||
value = value.slice(1, -1);
|
||||
}
|
||||
fields[key] = value;
|
||||
}
|
||||
return fields;
|
||||
};
|
||||
|
||||
/**
|
||||
* WorkspaceCtr
|
||||
*
|
||||
* Owns "project workspace" scanning: discovering agent skills (`.agents/skills`
|
||||
* / `.claude/skills`) and project-root instructions (`AGENTS.md` / `CLAUDE.md`)
|
||||
* under a bound project directory. Split out of LocalFileCtr so the
|
||||
* workspace/agent-config concern is distinct from generic local file ops.
|
||||
* Thin IPC layer over `@lobechat/device-control`'s workspace-scan helpers
|
||||
* (skills discovery under `.agents/skills` / `.claude/skills` + project-root
|
||||
* instructions). The scan logic is shared with the device-control RPC dispatch
|
||||
* so the local desktop IPC path, the remote device RPC, and the CLI all run
|
||||
* identical scans; the desktop-only preview-protocol approval is injected here.
|
||||
*/
|
||||
export default class WorkspaceCtr extends ControllerModule {
|
||||
static override readonly groupName = 'workspace';
|
||||
|
||||
/**
|
||||
* Scan one skill source directory (e.g. `.agents/skills`) under `root` and
|
||||
* return parsed frontmatter for each `SKILL.md`. Returns `[]` when the source
|
||||
* directory is absent or unreadable. Unsorted — callers sort/merge.
|
||||
*/
|
||||
private async scanSkillsInSource(
|
||||
root: string,
|
||||
source: ProjectSkillItem['source'],
|
||||
): Promise<ProjectSkillItem[]> {
|
||||
const dir = path.join(root, source);
|
||||
let entries;
|
||||
try {
|
||||
entries = await readdir(dir, { withFileTypes: true });
|
||||
} catch {
|
||||
// Directory does not exist or is not readable.
|
||||
return [];
|
||||
}
|
||||
|
||||
const skills = await Promise.all(
|
||||
entries
|
||||
.filter((entry) => entry.isDirectory() || entry.isSymbolicLink())
|
||||
.map(async (entry) => {
|
||||
const skillDir = path.join(dir, entry.name);
|
||||
const skillFile = path.join(skillDir, 'SKILL.md');
|
||||
try {
|
||||
const raw = await readFile(skillFile, 'utf8');
|
||||
const fields = parseSkillFrontmatter(raw);
|
||||
const files = await listSkillFilesRecursive(skillDir);
|
||||
return {
|
||||
description: fields.description || undefined,
|
||||
fileCount: files.length,
|
||||
files,
|
||||
name: fields.name || entry.name,
|
||||
path: skillFile,
|
||||
skillDir,
|
||||
source,
|
||||
};
|
||||
} catch {
|
||||
return null;
|
||||
}
|
||||
}),
|
||||
);
|
||||
|
||||
return skills.filter((skill): skill is ProjectSkillItem => skill !== null);
|
||||
private get scanDeps(): WorkspaceScanDeps {
|
||||
return { approveProjectRoot: (root) => this.approveProjectRootForPreview(root) };
|
||||
}
|
||||
|
||||
/**
|
||||
* Scan agent skill directories under the project root and return parsed
|
||||
* frontmatter for each SKILL.md. Used by the hetero agent's working sidebar
|
||||
* to surface skills available in the current project. Returns the first
|
||||
* source directory that yields any skills (`.agents/skills` wins).
|
||||
*/
|
||||
@IpcMethod()
|
||||
async listProjectSkills(params: ListProjectSkillsParams): Promise<ListProjectSkillsResult> {
|
||||
const root = params.scope;
|
||||
const sources = ['.agents/skills', '.claude/skills'] as const;
|
||||
|
||||
for (const source of sources) {
|
||||
const skills = (await this.scanSkillsInSource(root, source)).sort((a, b) =>
|
||||
a.name.localeCompare(b.name),
|
||||
);
|
||||
|
||||
if (skills.length > 0) {
|
||||
await this.approveProjectRootForPreview(root);
|
||||
return { root, skills, source };
|
||||
}
|
||||
}
|
||||
|
||||
return { root, skills: [], source: null };
|
||||
return runListProjectSkills(params, this.scanDeps);
|
||||
}
|
||||
|
||||
/**
|
||||
* One-call "workspace init" scan of a bound project directory: merge the
|
||||
* project skills from BOTH `.agents/skills` and `.claude/skills` (deduped by
|
||||
* name, `.agents/skills` winning) and read the project-root agent
|
||||
* instructions file (`AGENTS.md`, else `CLAUDE.md`). Driven server-side at run
|
||||
* start via the generic device RPC (not an LLM-visible tool) and cached onto
|
||||
* `devices.workingDirs[].workspace`.
|
||||
*
|
||||
* Approves the root for the `lobe-file://` preview protocol (same as
|
||||
* `listProjectSkills`) so the user can later click through to the scanned
|
||||
* skills / instructions in the UI.
|
||||
*/
|
||||
@IpcMethod()
|
||||
async initWorkspace(params: InitWorkspaceParams): Promise<InitWorkspaceResult> {
|
||||
const root = params.scope;
|
||||
const sources = ['.agents/skills', '.claude/skills'] as const;
|
||||
|
||||
const seen = new Set<string>();
|
||||
const skills: ProjectSkillItem[] = [];
|
||||
for (const source of sources) {
|
||||
for (const skill of await this.scanSkillsInSource(root, source)) {
|
||||
if (seen.has(skill.name)) continue;
|
||||
seen.add(skill.name);
|
||||
skills.push(skill);
|
||||
}
|
||||
}
|
||||
skills.sort((a, b) => a.name.localeCompare(b.name));
|
||||
|
||||
const instructions = await this.readWorkspaceInstructions(root);
|
||||
|
||||
// Approve regardless of what was found — the run is now bound to this root,
|
||||
// so any later click-through to it should resolve through the preview
|
||||
// protocol even if the project carries neither skills nor instructions.
|
||||
await this.approveProjectRootForPreview(root);
|
||||
|
||||
return { instructions, root, skills };
|
||||
return runInitWorkspace(params, this.scanDeps);
|
||||
}
|
||||
|
||||
/**
|
||||
* Check whether a path exists on this device and is a directory, plus its git
|
||||
* repo type (`git` / `github` / none). Used to validate a manually-entered
|
||||
* working directory from a web / remote client (which can't browse this
|
||||
* device's filesystem) before binding it, and to render the right dir icon.
|
||||
*/
|
||||
@IpcMethod()
|
||||
async statPath(params: {
|
||||
path: string;
|
||||
}): Promise<{ exists: boolean; isDirectory: boolean; repoType?: 'git' | 'github' }> {
|
||||
try {
|
||||
const stats = await stat(params.path);
|
||||
if (!stats.isDirectory()) return { exists: true, isDirectory: false };
|
||||
const repoType = await detectRepoType(params.path);
|
||||
return { exists: true, isDirectory: true, repoType };
|
||||
} catch {
|
||||
return { exists: false, isDirectory: false };
|
||||
}
|
||||
}
|
||||
|
||||
/**
|
||||
* Read the project-root agent instructions files. Collects every present
|
||||
* candidate (`AGENTS.md`, then `CLAUDE.md`) rather than first-match, since both
|
||||
* can coexist. Each body is capped so a pathologically large file can't bloat
|
||||
* the cached `workingDirs` payload or the injected system role.
|
||||
*/
|
||||
private async readWorkspaceInstructions(
|
||||
root: string,
|
||||
): Promise<InitWorkspaceResult['instructions']> {
|
||||
const MAX_INSTRUCTIONS_BYTES = 64 * 1024;
|
||||
const candidates = ['AGENTS.md', 'CLAUDE.md'] as const;
|
||||
|
||||
const instructions: InitWorkspaceResult['instructions'] = [];
|
||||
for (const source of candidates) {
|
||||
try {
|
||||
const raw = await readFile(path.join(root, source), 'utf8');
|
||||
const content =
|
||||
raw.length > MAX_INSTRUCTIONS_BYTES ? raw.slice(0, MAX_INSTRUCTIONS_BYTES) : raw;
|
||||
instructions.push({ content, source });
|
||||
} catch {
|
||||
// File absent or unreadable; skip it.
|
||||
}
|
||||
}
|
||||
|
||||
return instructions;
|
||||
return runStatPath(params);
|
||||
}
|
||||
|
||||
private async approveProjectRootForPreview(root: string) {
|
||||
|
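Review note on the WorkspaceCtr refactor: the removed inline `initWorkspace` merged skills from both source directories, kept the first occurrence of each name (so `.agents/skills` wins), and sorted by name. The sketch below restates that merge as a pure function. It is based on the removed controller code only; whether the shared helper in `@lobechat/device-control` behaves identically is not visible in this diff, and `SkillSketch` / `mergeSkillsByName` are hypothetical names.

```typescript
// Sketch only: restates the merge from the removed inline implementation.
interface SkillSketch {
  name: string;
  source: '.agents/skills' | '.claude/skills';
}

function mergeSkillsByName(groups: SkillSketch[][]): SkillSketch[] {
  const seen = new Set<string>();
  const merged: SkillSketch[] = [];

  // Groups are visited in priority order, so the first source to define a name wins.
  for (const group of groups) {
    for (const skill of group) {
      if (seen.has(skill.name)) continue;
      seen.add(skill.name);
      merged.push(skill);
    }
  }

  return merged.sort((a, b) => a.name.localeCompare(b.name));
}

const mergedSkills = mergeSkillsByName([
  [
    { name: 'review', source: '.agents/skills' },
    { name: 'deploy', source: '.agents/skills' },
  ],
  [
    { name: 'review', source: '.claude/skills' },
    { name: 'audit', source: '.claude/skills' },
  ],
]);
```

Note the contrast with the removed `listProjectSkills`, which returned only the first source directory that yielded any skills instead of merging both.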
||||
@@ -29,6 +29,7 @@ const mockCloseWindow = vi.fn();
|
||||
const mockMinimizeWindow = vi.fn();
|
||||
const mockMaximizeWindow = vi.fn();
|
||||
const mockIsWindowMaximized = vi.fn();
|
||||
const mockIsWindowFullScreen = vi.fn();
|
||||
const mockRetrieveByIdentifier = vi.fn();
|
||||
const mockStartSession = vi.fn();
|
||||
const testSenderIdentifierString: string = 'test-window-event-id';
|
||||
@@ -58,6 +59,7 @@ const mockApp = {
|
||||
minimizeWindow: mockMinimizeWindow,
|
||||
maximizeWindow: mockMaximizeWindow,
|
||||
isWindowMaximized: mockIsWindowMaximized,
|
||||
isWindowFullScreen: mockIsWindowFullScreen,
|
||||
retrieveByIdentifier: mockRetrieveByIdentifier.mockImplementation(
|
||||
(identifier: AppBrowsersIdentifiers | string) => {
|
||||
if (identifier === 'some-other-window') {
|
||||
@@ -166,6 +168,20 @@ describe('BrowserWindowsCtr', () => {
|
||||
});
|
||||
});
|
||||
|
||||
describe('isWindowFullScreen', () => {
|
||||
it('should return fullscreen state for the sender window', () => {
|
||||
mockIsWindowFullScreen.mockReturnValueOnce(true);
|
||||
|
||||
const sender = {} as any;
|
||||
const context = { sender, event: { sender } as any } as IpcContext;
|
||||
const result = runWithIpcContext(context, () => browserWindowsCtr.isWindowFullScreen());
|
||||
|
||||
expect(mockGetIdentifierByWebContents).toHaveBeenCalledWith(context.sender);
|
||||
expect(mockIsWindowFullScreen).toHaveBeenCalledWith(testSenderIdentifierString);
|
||||
expect(result).toBe(true);
|
||||
});
|
||||
});
|
||||
|
||||
describe('interceptRoute', () => {
|
||||
const baseParams = { source: 'link-click' as const };
|
||||
|
||||
|
||||
@@ -480,6 +480,87 @@ describe('HeterogeneousAgentCtr', () => {
|
||||
expect(spawnCalls).toHaveLength(0);
|
||||
});
|
||||
|
||||
it('spawns through the detector-resolved absolute path when the bare command is off PATH', async () => {
|
||||
// Codex desktop app case: `codex` is not on PATH, but the preflight
|
||||
// detector finds the CLI bundled inside Codex.app. Spawning the bare
|
||||
// command would ENOENT — spawn must use the resolved absolute path.
|
||||
const resolvedPath = '/Applications/Codex.app/Contents/Resources/codex';
|
||||
const detect = vi.fn().mockResolvedValue({ available: true, path: resolvedPath });
|
||||
const { proc } = createFakeProc();
|
||||
nextFakeProc = proc;
|
||||
|
||||
const ctr = new HeterogeneousAgentCtr({
|
||||
appStoragePath,
|
||||
storeManager: { get: vi.fn() },
|
||||
toolDetectorManager: { detect },
|
||||
} as any);
|
||||
const { sessionId } = await ctr.startSession({
|
||||
agentType: 'codex',
|
||||
command: 'codex',
|
||||
});
|
||||
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
|
||||
|
||||
expect(spawnCalls[0].command).toBe(resolvedPath);
|
||||
});
|
||||
|
||||
it('carries the detector login-shell PATH into the spawn env for `env node` shims', async () => {
|
||||
// `codex` resolved via the login-shell PATH (mise/nvm). Spawning the
|
||||
// absolute shim under the leaner inherited PATH would fail at its
|
||||
// `#!/usr/bin/env node` shebang — the resolved PATH must reach the child.
|
||||
const resolvedPath = '/Users/h/.local/share/mise/shims/codex';
|
||||
const searchPath = '/Users/h/.local/share/mise/shims:/usr/bin:/bin';
|
||||
const detect = vi
|
||||
.fn()
|
||||
.mockResolvedValue({ available: true, path: resolvedPath, resolvedPathEnv: searchPath });
|
||||
const { proc } = createFakeProc();
|
||||
nextFakeProc = proc;
|
||||
|
||||
const ctr = new HeterogeneousAgentCtr({
|
||||
appStoragePath,
|
||||
storeManager: { get: vi.fn() },
|
||||
toolDetectorManager: { detect },
|
||||
} as any);
|
||||
const { sessionId } = await ctr.startSession({ agentType: 'codex', command: 'codex' });
|
||||
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
|
||||
|
||||
expect(spawnCalls[0].command).toBe(resolvedPath);
|
||||
expect(spawnCalls[0].options.env.PATH).toBe(searchPath);
|
||||
});
|
||||
|
||||
it('keeps an explicit path-like command for spawn instead of the detector result', async () => {
|
||||
// detectHeterogeneousCliCommand validates the custom path via --version.
|
||||
execFileMock.mockImplementation(
|
||||
(
|
||||
_file: string,
|
||||
_args: string[],
|
||||
optionsOrCallback: unknown,
|
||||
callback?: (error: Error | null, result: { stderr: string; stdout: string }) => void,
|
||||
) => {
|
||||
const resolvedCallback =
|
||||
typeof optionsOrCallback === 'function' ? optionsOrCallback : callback;
|
||||
(resolvedCallback as any)?.(null, { stderr: '', stdout: 'codex-cli 0.99.0' });
|
||||
},
|
||||
);
|
||||
|
||||
const detect = vi.fn();
|
||||
const { proc } = createFakeProc();
|
||||
nextFakeProc = proc;
|
||||
|
||||
const ctr = new HeterogeneousAgentCtr({
|
||||
appStoragePath,
|
||||
storeManager: { get: vi.fn() },
|
||||
toolDetectorManager: { detect },
|
||||
} as any);
|
||||
const { sessionId } = await ctr.startSession({
|
||||
agentType: 'codex',
|
||||
command: '/custom/bin/codex',
|
||||
});
|
||||
await ctr.sendPrompt({ operationId: 'op-test', prompt: 'hello', sessionId });
|
||||
|
||||
expect(detect).not.toHaveBeenCalled();
|
||||
expect(spawnCalls[0].command).toBe('/custom/bin/codex');
|
||||
});
|
||||
|
||||
it('passes prompt via stdin to codex exec instead of argv', async () => {
|
||||
const prompt = '--run a shell-like prompt safely';
|
||||
const { cliArgs, command, writes } = await runSendPrompt(prompt);
|
||||
|
||||
@@ -1,3 +1,5 @@
|
||||
import path from 'node:path';
|
||||
|
||||
import { zipSync } from 'fflate';
|
||||
import { beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
@@ -88,6 +90,7 @@ const mockLocalFileProtocolManager = {
|
||||
approveIndexedProjectRoot: vi.fn(),
|
||||
approveProjectRootFromScope: vi.fn(),
|
||||
createPreviewUrl: vi.fn(),
|
||||
readPreviewFile: vi.fn(),
|
||||
};
|
||||
|
||||
// Mock makeSureDirExist
|
||||
@@ -146,7 +149,6 @@ describe('LocalFileCtr', () => {
|
||||
|
||||
it('should expand a leading ~ to the user home directory', async () => {
|
||||
const os = await import('node:os');
|
||||
const path = await import('node:path');
|
||||
vi.mocked(mockShell.openPath).mockResolvedValue('');
|
||||
|
||||
const result = await localFileCtr.handleOpenLocalFile({ path: '~/git/work/file.txt' });
|
||||
@@ -171,7 +173,6 @@ describe('LocalFileCtr', () => {
|
||||
|
||||
it('should expand a leading ~ when opening a directory', async () => {
|
||||
const os = await import('node:os');
|
||||
const path = await import('node:path');
|
||||
vi.mocked(mockShell.openPath).mockResolvedValue('');
|
||||
|
||||
const result = await localFileCtr.handleOpenLocalFolder({
|
||||
@@ -224,6 +225,7 @@ describe('LocalFileCtr', () => {
|
||||
});
|
||||
|
||||
expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
|
||||
accept: undefined,
|
||||
filePath: '/workspace/app.ts',
|
||||
workspaceRoot: '/workspace',
|
||||
});
|
||||
@@ -246,6 +248,99 @@ describe('LocalFileCtr', () => {
|
||||
success: false,
|
||||
});
|
||||
});
|
||||
|
||||
it('should forward image-only preview URL constraints', async () => {
|
||||
mockLocalFileProtocolManager.createPreviewUrl.mockResolvedValue(
|
||||
'localfile://file/workspace/image.png?token=abc',
|
||||
);
|
||||
|
||||
const result = await localFileCtr.getLocalFilePreviewUrl({
|
||||
accept: 'image',
|
||||
path: '/workspace/image.png',
|
||||
workingDirectory: '/workspace',
|
||||
});
|
||||
|
||||
expect(mockLocalFileProtocolManager.createPreviewUrl).toHaveBeenCalledWith({
|
||||
accept: 'image',
|
||||
filePath: '/workspace/image.png',
|
||||
workspaceRoot: '/workspace',
|
||||
});
|
||||
expect(result).toEqual({
|
||||
success: true,
|
||||
url: 'localfile://file/workspace/image.png?token=abc',
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('getLocalFilePreview', () => {
|
||||
it('should return text preview content for an approved workspace file', async () => {
|
||||
mockLocalFileProtocolManager.readPreviewFile.mockResolvedValue({
|
||||
buffer: Buffer.from('const value = 1;'),
|
||||
contentType: 'text/plain; charset=utf-8',
|
||||
realPath: '/workspace/app.ts',
|
||||
});
|
||||
|
||||
const result = await localFileCtr.getLocalFilePreview({
|
||||
path: '/workspace/app.ts',
|
||||
workingDirectory: '/workspace',
|
||||
});
|
||||
|
||||
expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
|
||||
accept: undefined,
|
||||
filePath: '/workspace/app.ts',
|
||||
workspaceRoot: '/workspace',
|
||||
});
|
||||
expect(result).toEqual({
|
||||
preview: {
|
||||
content: 'const value = 1;',
|
||||
contentType: 'text/plain',
|
||||
type: 'text',
|
||||
},
|
||||
success: true,
|
||||
});
|
||||
});
|
||||
|
||||
it('should reject preview payload creation outside an approved workspace', async () => {
|
||||
mockLocalFileProtocolManager.readPreviewFile.mockResolvedValue(null);
|
||||
|
||||
const result = await localFileCtr.getLocalFilePreview({
|
||||
path: '/Users/alice/.ssh/id_rsa',
|
||||
workingDirectory: '/workspace',
|
||||
});
|
||||
|
||||
expect(result).toEqual({
|
||||
error: 'File is outside the approved workspace',
|
||||
success: false,
|
||||
});
|
||||
});
|
||||
|
||||
it('should forward image-only preview read constraints', async () => {
|
||||
mockLocalFileProtocolManager.readPreviewFile.mockResolvedValue({
|
||||
buffer: Buffer.from('image-bytes'),
|
||||
contentType: 'image/png',
|
||||
realPath: '/workspace/image.png',
|
||||
});
|
||||
|
||||
const result = await localFileCtr.getLocalFilePreview({
|
||||
accept: 'image',
|
||||
path: '/workspace/image.png',
|
||||
workingDirectory: '/workspace',
|
||||
});
|
||||
|
||||
expect(mockLocalFileProtocolManager.readPreviewFile).toHaveBeenCalledWith({
|
||||
accept: 'image',
|
||||
filePath: '/workspace/image.png',
|
||||
workspaceRoot: '/workspace',
|
||||
});
|
||||
expect(result).toEqual({
|
||||
preview: {
|
||||
base64: Buffer.from('image-bytes').toString('base64'),
|
||||
contentType: 'image/png',
|
||||
type: 'image',
|
||||
},
|
||||
success: true,
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('handleWriteFile', () => {
|
||||
|
||||
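The `getLocalFilePreview` expectations above imply that the controller turns the manager's `{ buffer, contentType }` result into either a text payload or a base64 image payload, with the `charset` parameter dropped from the content type. The controller code is not part of this hunk, so the following is only a sketch inferred from the test expectations; `toPreviewPayload` and `PreviewPayload` are hypothetical names, not identifiers from the change.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical payload shape, mirroring the objects asserted in the tests.
type PreviewPayload =
  | { content: string; contentType: string; type: 'text' }
  | { base64: string; contentType: string; type: 'image' };

// Assumption: anything that is not image/* is treated as UTF-8 text.
const toPreviewPayload = (buffer: Buffer, rawContentType: string): PreviewPayload => {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const contentType = rawContentType.split(';')[0].trim().toLowerCase();

  if (contentType.startsWith('image/')) {
    return { base64: buffer.toString('base64'), contentType, type: 'image' };
  }

  return { content: buffer.toString('utf8'), contentType, type: 'text' };
};

const textPreview = toPreviewPayload(
  Buffer.from('const value = 1;'),
  'text/plain; charset=utf-8',
);
const imagePreview = toPreviewPayload(Buffer.from('image-bytes'), 'image/png');
```

Under these assumptions `textPreview` carries `contentType: 'text/plain'` and `type: 'text'`, while `imagePreview` carries the base64 encoding of the raw bytes, which matches the shapes the tests assert.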
@@ -7,7 +7,7 @@ import type { BrowserWindowConstructorOptions } from 'electron';
 import { app, BrowserWindow, ipcMain, screen, session as electronSession, shell } from 'electron';
 
 import { preloadDir, resourcesDir } from '@/const/dir';
-import { isMac } from '@/const/env';
+import { DESKTOP_EXTERNAL_NAVIGATION_HOSTS, isMac } from '@/const/env';
 import RemoteServerConfigCtr from '@/controllers/RemoteServerConfigCtr';
 import { backendProxyProtocolManager } from '@/core/infrastructure/BackendProxyProtocolManager';
 import { appendVercelCookie, setResponseHeader } from '@/utils/http-headers';
@@ -19,6 +19,31 @@ import { WindowThemeManager } from './WindowThemeManager';
|
||||
|
||||
const logger = createLogger('core:Browser');
|
||||
|
||||
const getExternalNavigationHosts = () =>
|
||||
DESKTOP_EXTERNAL_NAVIGATION_HOSTS.split(',')
|
||||
.map((host) => host.trim().toLowerCase())
|
||||
.filter(Boolean);
|
||||
|
||||
const shouldOpenTopLevelNavigationExternally = (rawUrl: string) => {
|
||||
const externalNavigationHosts = getExternalNavigationHosts();
|
||||
if (externalNavigationHosts.length === 0) return false;
|
||||
|
||||
let url: URL;
|
||||
try {
|
||||
url = new URL(rawUrl);
|
||||
} catch {
|
||||
return false;
|
||||
}
|
||||
|
||||
if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
|
||||
|
||||
const hostname = url.hostname.toLowerCase();
|
||||
|
||||
return externalNavigationHosts.some(
|
||||
(externalHost) => hostname === externalHost || hostname.endsWith(`.${externalHost}`),
|
||||
);
|
||||
};
|
||||
|
||||
// ==================== Types ====================
|
||||
|
||||
export interface BrowserWindowOpts extends BrowserWindowConstructorOptions {
|
||||
@@ -194,10 +219,27 @@ export default class Browser {
|
||||
this.setupReadyToShowListener(browserWindow);
|
||||
this.setupCloseListener(browserWindow);
|
||||
this.setupFocusListener(browserWindow);
|
||||
this.setupFullscreenListener(browserWindow);
|
||||
this.setupTopLevelNavigationListener(browserWindow);
|
||||
this.setupWillPreventUnloadListener(browserWindow);
|
||||
this.setupContextMenu(browserWindow);
|
||||
}
|
||||
|
||||
private setupTopLevelNavigationListener(browserWindow: BrowserWindow): void {
|
||||
logger.debug(`[${this.identifier}] Setting up top-level navigation listener.`);
|
||||
|
||||
browserWindow.webContents.on('will-navigate', (event, url) => {
|
||||
if (!shouldOpenTopLevelNavigationExternally(url)) return;
|
||||
|
||||
logger.info(`[${this.identifier}] Opening top-level navigation externally: ${url}`);
|
||||
event.preventDefault();
|
||||
|
||||
shell.openExternal(url).catch((error) => {
|
||||
logger.error(`[${this.identifier}] Failed to open external navigation URL: ${url}`, error);
|
||||
});
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Setup window open handler to intercept external links
|
||||
* Prevents opening new windows in renderer and uses system browser instead
|
||||
@@ -268,6 +310,18 @@ export default class Browser {
|
||||
});
|
||||
}
|
||||
|
||||
private setupFullscreenListener(browserWindow: BrowserWindow): void {
|
||||
logger.debug(`[${this.identifier}] Setting up fullscreen event listeners.`);
|
||||
|
||||
browserWindow.on('enter-full-screen', () => {
|
||||
this.broadcast('windowFullscreenChanged', { isFullScreen: true });
|
||||
});
|
||||
|
||||
browserWindow.on('leave-full-screen', () => {
|
||||
this.broadcast('windowFullscreenChanged', { isFullScreen: false });
|
||||
});
|
||||
}
|
||||
|
||||
/**
|
||||
* Setup context menu with platform-specific features
|
||||
* Delegates to MenuManager for consistent platform behavior
|
||||
|
||||
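The host check added in this file can be exercised in isolation. The sketch below restates the same rules (comma-separated list, case-insensitive, exact host or true subdomain, http/https only) as standalone functions; `parseHostList` and `matchesExternalHost` are illustrative names and take the host list as an argument instead of reading `DESKTOP_EXTERNAL_NAVIGATION_HOSTS`.

```typescript
// Split a comma-separated host list, dropping blanks and normalizing case.
const parseHostList = (raw: string): string[] =>
  raw
    .split(',')
    .map((host) => host.trim().toLowerCase())
    .filter(Boolean);

const matchesExternalHost = (rawUrl: string, externalHosts: string[]): boolean => {
  if (externalHosts.length === 0) return false;

  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    // Unparsable URLs are left to the app window.
    return false;
  }

  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const hostname = url.hostname.toLowerCase();
  // The leading dot means only real subdomains match, so `notstripe.com`
  // is not treated as `stripe.com`.
  return externalHosts.some(
    (externalHost) => hostname === externalHost || hostname.endsWith(`.${externalHost}`),
  );
};

const externalHosts = parseHostList(' Stripe.com, ,example.org ');
const checkoutIsExternal = matchesExternalHost(
  'https://checkout.stripe.com/c/pay/session_id',
  externalHosts,
);
const lookalikeIsExternal = matchesExternalHost('https://notstripe.com/', externalHosts);
const localIsExternal = matchesExternalHost(
  'http://localhost:3000/payment/upgrade-success',
  externalHosts,
);
```

Only `checkoutIsExternal` is true here. The lookalike domain and the localhost result route both stay in the app window, which is the behavior the `will-navigate` tests further down rely on.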
@@ -368,6 +368,11 @@ export class BrowserManager {
|
||||
return browser?.browserWindow.isMaximized() ?? false;
|
||||
}
|
||||
|
||||
isWindowFullScreen(identifier: string) {
|
||||
const browser = this.browsers.get(identifier);
|
||||
return browser?.browserWindow.isFullScreen() ?? false;
|
||||
}
|
||||
|
||||
setWindowSize(identifier: string, size: { height?: number; width?: number }) {
|
||||
const browser = this.browsers.get(identifier);
|
||||
browser?.setWindowSize(size);
|
||||
|
||||
@@ -9,6 +9,7 @@ const {
|
||||
mockBrowserWindow,
|
||||
mockNativeTheme,
|
||||
mockIpcMain,
|
||||
mockShell,
|
||||
mockScreen,
|
||||
MockBrowserWindow,
|
||||
mockEnv,
|
||||
@@ -64,6 +65,7 @@ const {
|
||||
MockBrowserWindow: vi.fn().mockImplementation(() => mockBrowserWindow),
|
||||
mockBrowserWindow,
|
||||
mockEnv: {
|
||||
externalNavigationHosts: '',
|
||||
isDev: false,
|
||||
isLinux: false,
|
||||
isMac: false,
|
||||
@@ -91,6 +93,9 @@ const {
|
||||
workArea: { height: 1080, width: 1920, x: 0, y: 0 },
|
||||
}),
|
||||
},
|
||||
mockShell: {
|
||||
openExternal: vi.fn().mockResolvedValue(undefined),
|
||||
},
|
||||
};
|
||||
});
|
||||
|
||||
@@ -101,6 +106,7 @@ vi.mock('electron', () => ({
|
||||
ipcMain: mockIpcMain,
|
||||
nativeTheme: mockNativeTheme,
|
||||
screen: mockScreen,
|
||||
shell: mockShell,
|
||||
}));
|
||||
|
||||
// Mock logger
|
||||
@@ -121,6 +127,9 @@ vi.mock('@/const/dir', () => ({
|
||||
}));
|
||||
|
||||
vi.mock('@/const/env', () => ({
|
||||
get DESKTOP_EXTERNAL_NAVIGATION_HOSTS() {
|
||||
return mockEnv.externalNavigationHosts;
|
||||
},
|
||||
get isDev() {
|
||||
return mockEnv.isDev;
|
||||
},
|
||||
@@ -182,6 +191,7 @@ describe('Browser', () => {
|
||||
mockEnv.isMac = false;
|
||||
mockEnv.isMacTahoe = false;
|
||||
mockEnv.isWindows = true;
|
||||
mockEnv.externalNavigationHosts = '';
|
||||
|
||||
// Create mock App
|
||||
mockStoreManagerGet = vi.fn().mockReturnValue(undefined);
|
||||
@@ -531,6 +541,30 @@ describe('Browser', () => {
|
||||
});
|
||||
});
|
||||
|
||||
describe('fullscreen events', () => {
|
||||
it('should broadcast fullscreen state changes', () => {
|
||||
const enterHandler = mockBrowserWindow.on.mock.calls.find(
|
||||
(call) => call[0] === 'enter-full-screen',
|
||||
)?.[1];
|
||||
const leaveHandler = mockBrowserWindow.on.mock.calls.find(
|
||||
(call) => call[0] === 'leave-full-screen',
|
||||
)?.[1];
|
||||
|
||||
expect(enterHandler).toBeDefined();
|
||||
expect(leaveHandler).toBeDefined();
|
||||
|
||||
enterHandler();
|
||||
expect(mockBrowserWindow.webContents.send).toHaveBeenCalledWith('windowFullscreenChanged', {
|
||||
isFullScreen: true,
|
||||
});
|
||||
|
||||
leaveHandler();
|
||||
expect(mockBrowserWindow.webContents.send).toHaveBeenCalledWith('windowFullscreenChanged', {
|
||||
isFullScreen: false,
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
describe('close', () => {
|
||||
it('should close window', () => {
|
||||
browser.close();
|
||||
@@ -730,4 +764,38 @@ describe('Browser', () => {
|
||||
expect(mockEvent.preventDefault).not.toHaveBeenCalled();
|
||||
});
|
||||
});
|
||||
|
||||
describe('top-level navigation handling', () => {
|
||||
let willNavigateHandler: (event: any, url: string) => void;
|
||||
|
||||
beforeEach(() => {
|
||||
willNavigateHandler = mockBrowserWindow.webContents.on.mock.calls.find(
|
||||
(call) => call[0] === 'will-navigate',
|
||||
)?.[1];
|
||||
});
|
||||
|
||||
it('should open configured external navigation hosts in system browser', () => {
|
||||
mockEnv.externalNavigationHosts = 'stripe.com';
|
||||
const mockEvent = { preventDefault: vi.fn() };
|
||||
|
||||
expect(willNavigateHandler).toBeDefined();
|
||||
willNavigateHandler(mockEvent, 'https://checkout.stripe.com/c/pay/session_id');
|
||||
|
||||
expect(mockEvent.preventDefault).toHaveBeenCalled();
|
||||
expect(mockShell.openExternal).toHaveBeenCalledWith(
|
||||
'https://checkout.stripe.com/c/pay/session_id',
|
||||
);
|
||||
});
|
||||
|
||||
it('should allow internal result routes in the app window', () => {
|
||||
mockEnv.externalNavigationHosts = 'stripe.com';
|
||||
const mockEvent = { preventDefault: vi.fn() };
|
||||
|
||||
expect(willNavigateHandler).toBeDefined();
|
||||
willNavigateHandler(mockEvent, 'http://localhost:3000/payment/upgrade-success');
|
||||
|
||||
expect(mockEvent.preventDefault).not.toHaveBeenCalled();
|
||||
expect(mockShell.openExternal).not.toHaveBeenCalled();
|
||||
});
|
||||
});
|
||||
});
|
||||
|
||||
@@ -48,6 +48,27 @@ interface PreviewTokenRecord {
|
||||
realPath: string;
|
||||
}
|
||||
|
||||
export interface PreviewFileReadResult {
|
||||
buffer: Buffer;
|
||||
contentType: string;
|
||||
realPath: string;
|
||||
}
|
||||
|
||||
type PreviewFileAccept = 'image';
|
||||
|
||||
const normalizeContentType = (contentType: string): string =>
|
||||
contentType.split(';')[0].trim().toLowerCase();
|
||||
|
||||
const isAcceptedPreviewContentType = (
|
||||
contentType: string,
|
||||
accept?: PreviewFileAccept,
|
||||
): boolean => {
|
||||
if (!accept) return true;
|
||||
|
||||
const normalizedContentType = normalizeContentType(contentType);
|
||||
return accept === 'image' && normalizedContentType.startsWith('image/');
|
||||
};
|
||||
|
||||
/**
|
||||
* Custom `localfile://` protocol for project file previews.
|
||||
*
|
||||
@@ -207,43 +228,65 @@ export class LocalFileProtocolManager {
   }
 
   async createPreviewUrl({
+    accept,
     filePath,
     workspaceRoot,
   }: {
+    accept?: PreviewFileAccept;
     filePath: string;
     workspaceRoot: string;
   }): Promise<string | null> {
     const normalizedFilePath = normalizeAbsolutePath(filePath);
-    const normalizedWorkspaceRoot = normalizeAbsolutePath(workspaceRoot);
-    if (!normalizedFilePath || !normalizedWorkspaceRoot) return null;
+    if (!normalizedFilePath) return null;
 
-    const [realFilePath, realWorkspaceRoot] = await Promise.all([
-      realpath(normalizedFilePath),
-      realpath(normalizedWorkspaceRoot),
-    ]);
-    const normalizedRealFilePath = normalizeAbsolutePath(realFilePath);
-    const normalizedRealWorkspaceRoot = normalizeAbsolutePath(realWorkspaceRoot);
-
-    if (!normalizedRealFilePath || !normalizedRealWorkspaceRoot) return null;
-    if (
-      !this.approvedWorkspaceRoots.has(normalizedRealWorkspaceRoot) &&
-      !this.indexedProjectRoots.has(normalizedRealWorkspaceRoot)
-    ) {
-      return null;
-    }
-    if (!isPathWithinRoot(normalizedRealFilePath, normalizedRealWorkspaceRoot)) return null;
+    const realFilePath = accept
+      ? (
+          await this.readPreviewFile({
+            accept,
+            filePath,
+            workspaceRoot,
+          })
+        )?.realPath
+      : await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
+    if (!realFilePath) return null;
 
     this.cleanupExpiredTokens();
 
     const token = randomUUID();
     this.previewTokens.set(token, {
       expiresAt: Date.now() + PREVIEW_TOKEN_TTL_MS,
-      realPath: normalizedRealFilePath,
+      realPath: realFilePath,
     });
 
     return buildLocalFileUrl(normalizedFilePath, token);
   }
+
+  async readPreviewFile({
+    accept,
+    filePath,
+    workspaceRoot,
+  }: {
+    accept?: PreviewFileAccept;
+    filePath: string;
+    workspaceRoot: string;
+  }): Promise<PreviewFileReadResult | null> {
+    const realFilePath = await this.resolveApprovedPreviewPath({ filePath, workspaceRoot });
+    if (!realFilePath) return null;
+
+    const fileStat = await stat(realFilePath);
+    if (!fileStat.isFile()) return null;
+
+    const buffer = await readFile(realFilePath);
+    const contentType = resolveLocalFileMimeType(realFilePath, buffer);
+    if (!isAcceptedPreviewContentType(contentType, accept)) return null;
+
+    return {
+      buffer,
+      contentType,
+      realPath: realFilePath,
+    };
+  }
 
   /**
    * Decode the URL pathname back into an absolute filesystem path.
    *
@@ -283,6 +326,36 @@ export class LocalFileProtocolManager {
|
||||
return normalized;
|
||||
}
|
||||
|
||||
private async resolveApprovedPreviewPath({
|
||||
filePath,
|
||||
workspaceRoot,
|
||||
}: {
|
||||
filePath: string;
|
||||
workspaceRoot: string;
|
||||
}): Promise<string | null> {
|
||||
const normalizedFilePath = normalizeAbsolutePath(filePath);
|
||||
const normalizedWorkspaceRoot = normalizeAbsolutePath(workspaceRoot);
|
||||
if (!normalizedFilePath || !normalizedWorkspaceRoot) return null;
|
||||
|
||||
const [realFilePath, realWorkspaceRoot] = await Promise.all([
|
||||
realpath(normalizedFilePath),
|
||||
realpath(normalizedWorkspaceRoot),
|
||||
]);
|
||||
const normalizedRealFilePath = normalizeAbsolutePath(realFilePath);
|
||||
const normalizedRealWorkspaceRoot = normalizeAbsolutePath(realWorkspaceRoot);
|
||||
|
||||
if (!normalizedRealFilePath || !normalizedRealWorkspaceRoot) return null;
|
||||
if (
|
||||
!this.approvedWorkspaceRoots.has(normalizedRealWorkspaceRoot) &&
|
||||
!this.indexedProjectRoots.has(normalizedRealWorkspaceRoot)
|
||||
) {
|
||||
return null;
|
||||
}
|
||||
if (!isPathWithinRoot(normalizedRealFilePath, normalizedRealWorkspaceRoot)) return null;
|
||||
|
||||
return normalizedRealFilePath;
|
||||
}
|
||||
|
||||
private cleanupExpiredTokens() {
|
||||
const now = Date.now();
|
||||
for (const [token, record] of this.previewTokens) {
|
||||
|
||||
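`resolveApprovedPreviewPath` relies on an `isPathWithinRoot` helper whose body is not shown in this diff. One common way to write such a containment check, given here purely as an assumption about how it might work, is to compute the path from the root to the candidate and reject anything that climbs out. `isWithinRoot` is a hypothetical stand-in, and the sketch uses `path.posix` so the results do not depend on the host platform.

```typescript
import * as path from 'node:path';

// Hypothetical stand-in for `isPathWithinRoot`. It expects both inputs to be
// absolute and already resolved through realpath, as in the diff, so symlinks
// cannot be used to step outside the root.
const isWithinRoot = (candidate: string, root: string): boolean => {
  const relative = path.posix.relative(root, candidate);

  // Empty string means the candidate is the root itself.
  if (relative === '') return true;

  // Anything that starts by climbing to the parent lies outside the root.
  if (relative === '..' || relative.startsWith('../')) return false;

  return !path.posix.isAbsolute(relative);
};

const insideRoot = isWithinRoot('/workspace/src/app.ts', '/workspace');
const outsideRoot = isWithinRoot('/Users/alice/.ssh/id_rsa', '/workspace');
const siblingPrefix = isWithinRoot('/workspace-evil/app.ts', '/workspace');
```

Comparing via a relative path rather than a plain `startsWith` keeps a sibling directory such as `/workspace-evil` from passing as part of `/workspace`.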
@@ -15,6 +15,15 @@ export interface ToolStatus {
|
||||
error?: string;
|
||||
lastChecked?: Date;
|
||||
path?: string;
|
||||
/**
|
||||
* PATH value used to resolve/validate the command, surfaced only when it
|
||||
* differs from the detector process's `process.env.PATH` (e.g. resolution
|
||||
* fell back to the login-shell PATH). A caller that spawns the resolved
|
||||
* `path` must carry this into the child's PATH, or a `#!/usr/bin/env node`
|
||||
* shim that resolved here still fails with `env: node: No such file or
|
||||
* directory` under the leaner inherited env.
|
||||
*/
|
||||
resolvedPathEnv?: string;
|
||||
version?: string;
|
||||
}
|
||||
|
||||
|
||||
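The `resolvedPathEnv` doc comment says a caller that spawns the resolved `path` must carry that PATH into the child's environment. A minimal sketch of what that could look like at a spawn site follows; `buildChildEnv`, `ResolvedTool` and `EnvMap` are hypothetical names, and the actual spawn code is not part of this diff.

```typescript
// Hypothetical subset of ToolStatus that matters at the spawn site.
interface ResolvedTool {
  path?: string;
  resolvedPathEnv?: string;
}

type EnvMap = Record<string, string | undefined>;

// Only override PATH when the detector reported a different one; otherwise
// the child simply inherits the parent's environment unchanged.
const buildChildEnv = (tool: ResolvedTool, inherited: EnvMap): EnvMap =>
  tool.resolvedPathEnv === undefined
    ? { ...inherited }
    : { ...inherited, PATH: tool.resolvedPathEnv };

const inheritedEnv: EnvMap = { HOME: '/Users/Hanam', PATH: '/usr/bin:/bin' };

const shimEnv = buildChildEnv(
  {
    path: '/Users/Hanam/.local/share/mise/shims/gemini',
    resolvedPathEnv: '/opt/homebrew/bin:/Users/Hanam/.local/share/mise/shims:/usr/bin:/bin',
  },
  inheritedEnv,
);

const plainEnv = buildChildEnv({ path: '/usr/local/bin/claude' }, inheritedEnv);
```

The resulting object would then be passed as the `env` option of `child_process.spawn`, so that a `#!/usr/bin/env node` shim can find `node` on the same PATH that resolved the shim.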
@@ -119,6 +119,21 @@ describe('LocalFileProtocolManager', () => {
|
||||
expect(response.headers.get('Content-Type')).toBe('text/plain; charset=utf-8');
|
||||
});
|
||||
|
||||
it('does not mint image-only preview URLs for text files', async () => {
|
||||
const manager = new LocalFileProtocolManager();
|
||||
await manager.approveWorkspaceRoot('/Users/alice/project');
|
||||
mockReadFile.mockResolvedValue(Buffer.from('const value = 1;'));
|
||||
|
||||
const url = await manager.createPreviewUrl({
|
||||
accept: 'image',
|
||||
filePath: '/Users/alice/project/App.tsx',
|
||||
workspaceRoot: '/Users/alice/project',
|
||||
});
|
||||
|
||||
expect(url).toBeNull();
|
||||
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
|
||||
});
|
||||
|
||||
it('decodes percent-encoded characters in the path', async () => {
|
||||
const manager = new LocalFileProtocolManager();
|
||||
manager.registerHandler();
|
||||
@@ -278,6 +293,52 @@ describe('LocalFileProtocolManager', () => {
|
||||
expect(url).toContain('token=');
|
||||
});
|
||||
|
||||
it('reads preview payloads only from approved project roots', async () => {
|
||||
const manager = new LocalFileProtocolManager();
|
||||
await manager.approveIndexedProjectRoot('/Users/alice/project');
|
||||
mockReadFile.mockResolvedValue(Buffer.from('const value = 1;'));
|
||||
|
||||
const result = await manager.readPreviewFile({
|
||||
filePath: '/Users/alice/project/App.tsx',
|
||||
workspaceRoot: '/Users/alice/project',
|
||||
});
|
||||
|
||||
expect(result).toEqual({
|
||||
buffer: Buffer.from('const value = 1;'),
|
||||
contentType: 'text/plain; charset=utf-8',
|
||||
realPath: '/Users/alice/project/App.tsx',
|
||||
});
|
||||
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/App.tsx');
|
||||
});
|
||||
|
||||
it('does not return text payloads for image-only preview reads', async () => {
|
||||
const manager = new LocalFileProtocolManager();
|
||||
await manager.approveIndexedProjectRoot('/Users/alice/project');
|
||||
mockReadFile.mockResolvedValue(Buffer.from('SECRET=value'));
|
||||
|
||||
const result = await manager.readPreviewFile({
|
||||
accept: 'image',
|
||||
filePath: '/Users/alice/project/.env',
|
||||
workspaceRoot: '/Users/alice/project',
|
||||
});
|
||||
|
||||
expect(result).toBeNull();
|
||||
expect(mockReadFile).toHaveBeenCalledWith('/Users/alice/project/.env');
|
||||
});
|
||||
|
||||
it('does not read preview payloads outside the approved workspace root', async () => {
|
||||
const manager = new LocalFileProtocolManager();
|
||||
await manager.approveIndexedProjectRoot('/Users/alice/project');
|
||||
|
||||
const result = await manager.readPreviewFile({
|
||||
filePath: '/Users/alice/.ssh/id_rsa',
|
||||
workspaceRoot: '/Users/alice/project',
|
||||
});
|
||||
|
||||
expect(result).toBeNull();
|
||||
expect(mockReadFile).not.toHaveBeenCalled();
|
||||
});
|
||||
|
||||
it('defers registration until app ready when not yet ready', async () => {
|
||||
mockApp.isReady.mockReturnValue(false);
|
||||
let resolveReady: () => void = () => undefined;
|
||||
|
||||
@@ -50,6 +50,13 @@ const envNumber = (defaultValue: number) =>
|
||||
}, z.number().optional())
|
||||
.default(defaultValue);
|
||||
|
||||
const getRuntimeEnv = () => ({
|
||||
...process.env,
|
||||
DESKTOP_EXTERNAL_NAVIGATION_HOSTS: process.env.DESKTOP_EXTERNAL_NAVIGATION_HOSTS,
|
||||
UPDATE_CHANNEL: process.env.UPDATE_CHANNEL,
|
||||
UPDATE_SERVER_URL: process.env.UPDATE_SERVER_URL,
|
||||
});
|
||||
|
||||
/**
|
||||
* Desktop (Electron main process) runtime env access.
|
||||
*
|
||||
@@ -63,13 +70,15 @@ export const getDesktopEnv = memoize(() =>
     clientPrefix: 'PUBLIC_',
     emptyStringAsUndefined: true,
     isServer: true,
-    runtimeEnv: process.env,
+    runtimeEnv: getRuntimeEnv(),
     server: {
       DEBUG_VERBOSE: envBoolean(false),
 
       // escape hatch: allow testing static renderer in dev via env
       DESKTOP_RENDERER_STATIC: envBoolean(false),
 
+      DESKTOP_EXTERNAL_NAVIGATION_HOSTS: z.string().optional().default(''),
+
       // device gateway url override (dev: point at a local `wrangler dev` instance,
       // e.g. http://localhost:8787). Falls back to the stored value, then the
       // production gateway.
|
||||
@@ -1,5 +1,6 @@
|
||||
import * as childProcess from 'node:child_process';
|
||||
import * as os from 'node:os';
|
||||
import path from 'node:path';
|
||||
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
@@ -180,6 +181,76 @@ describe('cliAgentDetectors', () => {
|
||||
expect(status.path).toBe('/usr/local/bin/claude');
|
||||
expect(execMock).not.toHaveBeenCalled();
|
||||
expect(execFileMock).toHaveBeenCalledTimes(2);
|
||||
// Resolved on the inherited PATH — nothing extra to carry into spawn.
|
||||
expect(status.resolvedPathEnv).toBeUndefined();
|
||||
});
|
||||
|
||||
it('falls back to the Codex.app bundled CLI when `codex` is not on any PATH', async () => {
|
||||
const originalPath = process.env.PATH;
|
||||
const originalShell = process.env.SHELL;
|
||||
// Deterministic env: no SHELL → no login-shell lookup, merged PATH
|
||||
// equals process.env.PATH → no second `which` attempt.
|
||||
process.env.PATH = '/usr/bin:/bin';
|
||||
delete process.env.SHELL;
|
||||
|
||||
try {
|
||||
callExecFileError(new Error('not found')); // which codex
|
||||
callExecFile('codex-cli 0.138.0'); // bundled CLI --version
|
||||
|
||||
const { codexDetector } = await import('../cliAgentDetectors');
|
||||
const status = await codexDetector.detect();
|
||||
|
||||
expect(status.available).toBe(true);
|
||||
expect(status.path).toBe('/Applications/Codex.app/Contents/Resources/codex');
|
||||
expect(status.version).toBe('codex-cli 0.138.0');
|
||||
|
||||
expect(execFileMock).toHaveBeenCalledTimes(2);
|
||||
expect(execFileMock.mock.calls[0]![0]).toBe('which');
|
||||
expect(execFileMock.mock.calls[1]![0]).toBe(
|
||||
'/Applications/Codex.app/Contents/Resources/codex',
|
||||
);
|
||||
} finally {
|
||||
process.env.PATH = originalPath;
|
||||
if (originalShell === undefined) delete process.env.SHELL;
|
||||
else process.env.SHELL = originalShell;
|
||||
}
|
||||
});
|
||||
|
||||
it('stays unavailable when neither PATH nor the well-known locations have codex', async () => {
|
||||
const originalPath = process.env.PATH;
|
||||
const originalShell = process.env.SHELL;
|
||||
process.env.PATH = '/usr/bin:/bin';
|
||||
delete process.env.SHELL;
|
||||
|
||||
try {
|
||||
callExecFileError(new Error('not found')); // which codex
|
||||
callExecFileError(new Error('ENOENT')); // /Applications candidate
|
||||
callExecFileError(new Error('ENOENT')); // ~/Applications candidate
|
||||
|
||||
const { codexDetector } = await import('../cliAgentDetectors');
|
||||
const status = await codexDetector.detect();
|
||||
|
||||
expect(status.available).toBe(false);
|
||||
expect(execFileMock).toHaveBeenCalledTimes(3);
|
||||
expect(execFileMock.mock.calls[2]![0]).toBe(
|
||||
path.join(os.homedir(), 'Applications', 'Codex.app', 'Contents', 'Resources', 'codex'),
|
||||
);
|
||||
} finally {
|
||||
process.env.PATH = originalPath;
|
||||
if (originalShell === undefined) delete process.env.SHELL;
|
||||
else process.env.SHELL = originalShell;
|
||||
}
|
||||
});
|
||||
|
||||
it('does not probe well-known locations for an explicit path-like command', async () => {
|
||||
callExecFileError(new Error('ENOENT')); // /custom/bin/codex --version
|
||||
|
||||
const { detectHeterogeneousCliCommand } = await import('../cliAgentDetectors');
|
||||
const status = await detectHeterogeneousCliCommand('codex', '/custom/bin/codex');
|
||||
|
||||
expect(status.available).toBe(false);
|
||||
// Only the explicit path's --version attempt — no fallback probing.
|
||||
expect(execFileMock).toHaveBeenCalledTimes(1);
|
||||
});
|
||||
|
||||
it('falls back to the login shell PATH for tools installed by shell setup', async () => {
|
||||
@@ -200,6 +271,12 @@ describe('cliAgentDetectors', () => {
|
||||
expect(status.available).toBe(true);
|
||||
expect(status.path).toBe('/Users/Hanam/.local/share/mise/shims/gemini');
|
||||
expect(status.version).toBe('gemini 0.2.0');
|
||||
// The login-shell PATH that resolved the shim must be surfaced so the
|
||||
// spawn site can carry it into the child env (mise/nvm `node` lives
|
||||
// there, not on the leaner inherited PATH).
|
||||
expect(status.resolvedPathEnv).toBe(
|
||||
'/opt/homebrew/bin:/Users/Hanam/.local/share/mise/shims:/usr/bin:/bin',
|
||||
);
|
||||
|
||||
expect(execFileMock).toHaveBeenCalledTimes(4);
|
||||
expect(execFileMock.mock.calls[0]![0]).toBe('which');
|
||||
|
||||
@@ -1,5 +1,5 @@
 import { exec, execFile } from 'node:child_process';
-import { platform } from 'node:os';
+import { homedir, platform } from 'node:os';
 import path from 'node:path';
 import { promisify } from 'node:util';
@@ -190,6 +190,11 @@ const detectValidatedCommand = async (
|
||||
return {
|
||||
available: true,
|
||||
path: resolvedPath,
|
||||
// `env` is set only when resolution fell back to the login-shell PATH.
|
||||
// Surface that PATH so the spawn site can carry it into the child env —
|
||||
// otherwise a `#!/usr/bin/env node` shim resolved here can't find `node`
|
||||
// under the leaner inherited PATH (Finder-launched Electron).
|
||||
resolvedPathEnv: env?.PATH,
|
||||
version: output.split(/\r?\n/)[0],
|
||||
};
|
||||
} catch {
|
||||
@@ -209,6 +214,27 @@ const HETEROGENEOUS_CLI_AGENT_OPTIONS = {
|
||||
Pick<ValidatedDetectorOptions, 'validateKeywords'>
|
||||
>;
|
||||
|
||||
// Well-known absolute install locations probed when a bare command isn't on
|
||||
// PATH. The Codex desktop app bundles a fully functional CLI inside Codex.app
|
||||
// (sharing ~/.codex auth/config) but never symlinks it into PATH, so
|
||||
// `which codex` misses an otherwise working install.
|
||||
const getWellKnownCommandPaths = (agentType: HeterogeneousCliAgentType): string[] => {
|
||||
if (platform() !== 'darwin') return [];
|
||||
|
||||
switch (agentType) {
|
||||
case 'codex': {
|
||||
const bundledCli = path.join('Codex.app', 'Contents', 'Resources', 'codex');
|
||||
return [
|
||||
path.join('/Applications', bundledCli),
|
||||
path.join(homedir(), 'Applications', bundledCli),
|
||||
];
|
||||
}
|
||||
default: {
|
||||
return [];
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
export const detectHeterogeneousCliCommand = async (
|
||||
agentType: HeterogeneousCliAgentType,
|
||||
command: string,
|
||||
@@ -216,7 +242,20 @@ export const detectHeterogeneousCliCommand = async (
|
||||
const validator = HETEROGENEOUS_CLI_AGENT_OPTIONS[agentType];
|
||||
if (!validator) return { available: false };
|
||||
|
||||
return detectValidatedCommand(command, validator);
|
||||
const status = await detectValidatedCommand(command, validator);
|
||||
if (status.available) return status;
|
||||
|
||||
// A bare command missing from PATH may still live at a well-known install
|
||||
// location (e.g. the Codex desktop app's bundled CLI). Don't second-guess
|
||||
// an explicit user-configured path.
|
||||
if (!command.trim().includes(path.sep)) {
|
||||
for (const candidate of getWellKnownCommandPaths(agentType)) {
|
||||
const fallbackStatus = await detectValidatedCommand(candidate, validator);
|
||||
if (fallbackStatus.available) return fallbackStatus;
|
||||
}
|
||||
}
|
||||
|
||||
return status;
|
||||
};
|
||||
|
||||
/**
|
||||
@@ -261,14 +300,17 @@ export const claudeCodeDetector: IToolDetector = createValidatedDetector({
 /**
  * OpenAI Codex CLI
  * @see https://github.com/openai/codex
+ *
+ * Goes through `detectHeterogeneousCliCommand` so the Codex.app bundled-CLI
+ * fallback applies here too, keeping the manager path and the custom-command
+ * path in sync.
  */
-export const codexDetector: IToolDetector = createValidatedDetector({
-  candidates: ['codex'],
+export const codexDetector: IToolDetector = {
   description: 'Codex - OpenAI agentic coding CLI',
+  detect: () => detectHeterogeneousCliCommand('codex', 'codex'),
   name: 'codex',
   priority: 2,
-  validateKeywords: ['codex'],
-});
+};
 
 /**
  * Google Gemini CLI
|
||||
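The fallback order in `detectHeterogeneousCliCommand` (try the configured command first, then probe well-known locations only for a bare command) can be sketched without touching the filesystem. In the sketch below `resolveWithFallback` and `isRunnable` are hypothetical: `isRunnable` is a synchronous stand-in for the real `which` plus `--version` validation, and the bare-command test hard-codes `/` where the diff uses `path.sep`.

```typescript
// Assumption: POSIX-style paths, so a command containing "/" is path-like.
const isBareCommand = (command: string): boolean => !command.trim().includes('/');

const resolveWithFallback = (
  command: string,
  wellKnownCandidates: string[],
  isRunnable: (candidate: string) => boolean,
): string | undefined => {
  if (isRunnable(command)) return command;

  // An explicit user-configured path is never second-guessed.
  if (!isBareCommand(command)) return undefined;

  // First well-known location that validates wins.
  return wellKnownCandidates.find((candidate) => isRunnable(candidate));
};

const bundledCli = '/Applications/Codex.app/Contents/Resources/codex';
const userBundledCli = '/Users/alice/Applications/Codex.app/Contents/Resources/codex';
const installed = new Set<string>([bundledCli]);
const probe = (candidate: string): boolean => installed.has(candidate);

const bareResult = resolveWithFallback('codex', [bundledCli, userBundledCli], probe);
const explicitResult = resolveWithFallback(
  '/custom/bin/codex',
  [bundledCli, userBundledCli],
  probe,
);
```

Here `bareResult` is the bundled CLI path, while `explicitResult` stays `undefined` because the explicit path failed and no fallback probing is attempted, mirroring the two detector tests earlier in the diff.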
@@ -17,24 +17,23 @@ const log = debug('lobe-server:agent-runtime:coordinator');
  * decision) starts, but that resume runs under a **new** operationId with
  * its own event stream. For the paused operationId no further events will
  * arrive, so clients should stop waiting the same way they do on done.
+ *
+ * `waiting_for_async_tool` is different: deferred tools such as server
+ * sub-agents resume the SAME operationId after the out-of-band result is
+ * backfilled. Ending the stream at park time makes the client mark the turn
+ * as stopped while the server is still waiting for sub-agents.
  */
 const STREAM_END_STATUSES = new Set<AgentState['status']>([
   'done',
   'error',
   'interrupted',
   'waiting_for_human',
-  'waiting_for_async_tool',
 ]);
 
 const hasEnteredStreamEndState = (
   previousStatus?: AgentState['status'],
   nextStatus?: AgentState['status'],
-): nextStatus is
-  | 'done'
-  | 'error'
-  | 'interrupted'
-  | 'waiting_for_human'
-  | 'waiting_for_async_tool' => {
+): nextStatus is 'done' | 'error' | 'interrupted' | 'waiting_for_human' => {
   const wasStreamEnd = previousStatus ? STREAM_END_STATUSES.has(previousStatus) : false;
   return Boolean(nextStatus && STREAM_END_STATUSES.has(nextStatus) && !wasStreamEnd);
 };
|
||||
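The effect of dropping `waiting_for_async_tool` from the stream-end set, as the added comment describes, can be seen with a few transitions. The sketch below assumes the post-change set; `AgentStatus` is a hypothetical local type, and the `'running'` status is an assumption used only to have a non-terminal starting point.

```typescript
// Hypothetical local status union; 'running' is assumed for illustration.
type AgentStatus =
  | 'running'
  | 'done'
  | 'error'
  | 'interrupted'
  | 'waiting_for_human'
  | 'waiting_for_async_tool';

// Post-change set: parking on an async tool no longer ends the stream.
const streamEndStatuses = new Set<AgentStatus>([
  'done',
  'error',
  'interrupted',
  'waiting_for_human',
]);

// True only on the transition INTO a stream-end status, not while staying in one.
const enteredStreamEnd = (previous?: AgentStatus, next?: AgentStatus): boolean => {
  const wasStreamEnd = previous ? streamEndStatuses.has(previous) : false;
  return Boolean(next && streamEndStatuses.has(next) && !wasStreamEnd);
};

const finishes = enteredStreamEnd('running', 'done');
const parksOnAsyncTool = enteredStreamEnd('running', 'waiting_for_async_tool');
const resumesThenFinishes = enteredStreamEnd('waiting_for_async_tool', 'done');
const staysDone = enteredStreamEnd('done', 'done');
```

With this set, parking on an async tool leaves the stream open, and the later transition from `waiting_for_async_tool` to `done` on the same operationId is what ends it.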
@@ -61,6 +61,7 @@ import { chainCompressContext } from '@lobechat/prompts';
|
||||
import {
|
||||
type ChatToolPayload,
|
||||
type ExecSubAgentParams,
|
||||
type ExecVirtualSubAgentParams,
|
||||
type MessageToolCall,
|
||||
type UIChatMessage,
|
||||
} from '@lobechat/types';
|
||||
@@ -73,6 +74,7 @@ import { TopicModel } from '@/database/models/topic';
|
||||
import { UserModel } from '@/database/models/user';
|
||||
import { type LobeChatDatabase } from '@/database/type';
|
||||
import { fileEnv } from '@/envs/file';
|
||||
import { type ExecutionPlan, isDeviceCapablePlan } from '@/helpers/executionTarget';
|
||||
import { serverMessagesEngine } from '@/server/modules/Mecha/ContextEngineering';
|
||||
import { type EvalContext } from '@/server/modules/Mecha/ContextEngineering/types';
|
||||
import { initModelRuntimeFromDB } from '@/server/modules/ModelRuntime';
|
||||
@@ -202,6 +204,51 @@ const isEmptyModelCompletion = (params: {
|
||||
return true;
|
||||
};
|
||||
|
||||
type ReasoningReplayNode = {
|
||||
children?: ReasoningReplayNode[];
|
||||
members?: ReasoningReplayNode[];
|
||||
reasoning?: unknown;
|
||||
};
|
||||
|
||||
const stripAssistantReasoningForReplay = (messages: UIChatMessage[]): UIChatMessage[] => {
|
||||
const stripMessage = <T extends ReasoningReplayNode>(message: T): T => {
|
||||
let changed = false;
|
||||
|
||||
const children = message.children?.map((child) => {
|
||||
const strippedChild = stripMessage(child);
|
||||
if (strippedChild !== child) changed = true;
|
||||
return strippedChild;
|
||||
});
|
||||
|
||||
const members = message.members?.map((member) => {
|
||||
const strippedMember = stripMessage(member);
|
||||
if (strippedMember !== member) changed = true;
|
||||
return strippedMember;
|
||||
});
|
||||
|
||||
if ('reasoning' in message) changed = true;
|
||||
if (!changed) return message;
|
||||
|
||||
const { reasoning: _reasoning, ...messageWithoutReasoning } = message;
|
||||
|
||||
return {
|
||||
...messageWithoutReasoning,
|
||||
...(children ? { children } : {}),
|
||||
...(members ? { members } : {}),
|
||||
} as T;
|
||||
};
|
||||
|
||||
let changed = false;
|
||||
|
||||
const strippedMessages = messages.map((message) => {
|
||||
const strippedMessage = stripMessage(message);
|
||||
if (strippedMessage !== message) changed = true;
|
||||
return strippedMessage;
|
||||
});
|
||||
|
||||
return changed ? strippedMessages : messages;
|
||||
};
|
||||
|
||||
const GEN_AI_FUNCTION_TOOL_TYPE: ToolType = 'function';
|
||||
|
||||
type ToolFailureKind = 'replan' | 'retry' | 'stop';
|
||||
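`stripAssistantReasoningForReplay` copies a node only when something under it actually changed, so callers can detect "nothing stripped" by reference equality. A reduced sketch of that pattern, limited to `children` (the original also walks `members`), is shown below; `ReplayNode` and `stripReasoning` are hypothetical names for this illustration.

```typescript
// Reduced node shape for the sketch; `id` exists only to make nodes distinguishable.
type ReplayNode = {
  children?: ReplayNode[];
  id: string;
  reasoning?: unknown;
};

const stripReasoning = (node: ReplayNode): ReplayNode => {
  // A node changes if it carries reasoning itself or any child was rewritten.
  let changed = 'reasoning' in node;

  const children = node.children?.map((child) => {
    const strippedChild = stripReasoning(child);
    if (strippedChild !== child) changed = true;
    return strippedChild;
  });

  // Untouched subtrees are returned as-is, preserving reference identity.
  if (!changed) return node;

  const { reasoning: _reasoning, ...nodeWithoutReasoning } = node;
  return { ...nodeWithoutReasoning, ...(children ? { children } : {}) };
};

const cleanTree: ReplayNode = { children: [{ id: 'child' }], id: 'root' };
const dirtyTree: ReplayNode = {
  children: [{ id: 'child', reasoning: { content: 'hidden' } }],
  id: 'root',
};

const cleanResult = stripReasoning(cleanTree);
const dirtyResult = stripReasoning(dirtyTree);
```

`cleanResult` is the very same object as `cleanTree`, while `dirtyResult` is a fresh copy without the `reasoning` key, and `dirtyTree` itself is left unmodified.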
@@ -277,7 +324,7 @@ const buildPostProcessUrl = (
 };

 /**
-* Build the per-tool-call server sub-agent runner injected into the tool
+* Build the per-tool-call server virtual sub-agent runner injected into the tool
 * execution context. Closes over the current tool payload + parent message so
 * the `callSubAgent` server tool can fork a child op without re-deriving the
 * message anchor (which it cannot do correctly from its own context).
@@ -285,17 +332,18 @@ const buildPostProcessUrl = (
 * The runner creates the pending placeholder tool message that anchors the
 * isolation thread (so the UI shows a loading state and the completion bridge
 * has a message to backfill), then kicks off the child op asynchronously and
-* returns immediately. Returns `undefined` when sub-agent execution is not
-* available (no `execSubAgent` callback, or missing agent/topic context).
+* returns immediately. Returns `undefined` when virtual sub-agent execution is
+* not available (no `execVirtualSubAgent` callback, or missing agent/topic
+* context).
 */
-const buildServerSubAgentRunner = (
+const buildServerVirtualSubAgentRunner = (
 ctx: RuntimeExecutorContext,
 state: AgentState,
 chatToolPayload: ChatToolPayload,
 parentMessageId: string,
 ): ServerSubAgentRunner | undefined => {
-const execSubAgent = ctx.execSubAgent;
-if (!execSubAgent) return undefined;
+const execVirtualSubAgent = ctx.execVirtualSubAgent;
+if (!execVirtualSubAgent) return undefined;

 const agentId = state.metadata?.agentId;
 const topicId = ctx.topicId ?? state.metadata?.topicId;
@@ -318,16 +366,15 @@ const buildServerSubAgentRunner = (
 topicId,
 });

-// 2. Fork the child op anchored to the placeholder. `resumeParentOnComplete`
-// tells execSubAgent to register the completion bridge that
-// backfills this tool message and resumes the parent op.
-const result = (await execSubAgent({
+// 2. Fork the virtual child op anchored to the placeholder. The virtual
+// entry marks the child as `isSubAgent` and registers the completion
+// bridge that backfills this tool message and resumes the parent op.
+const result = (await execVirtualSubAgent({
 agentId: targetAgentId ?? agentId,
 groupId: state.metadata?.groupId ?? undefined,
 instruction,
 parentMessageId: placeholder.id,
 parentOperationId: ctx.operationId,
-resumeParentOnComplete: true,
 timeout,
 title: description,
 topicId,
@@ -341,7 +388,7 @@ const buildServerSubAgentRunner = (
 await ctx.messageModel.deleteMessage(placeholder.id);
 } catch (error) {
 log(
-'buildServerSubAgentRunner: failed to clean up placeholder %s: %O',
+'buildServerVirtualSubAgentRunner: failed to clean up placeholder %s: %O',
 placeholder.id,
 error,
 );
@@ -476,11 +523,17 @@ export interface RuntimeExecutorContext {
 discordContext?: any;
 evalContext?: EvalContext;
 /**
-* Callback to spawn a sub-agent task server-side.
+* Callback to run a legacy agent invocation server-side.
 * Injected by AiAgentService so exec_sub_agent / exec_sub_agents executors
-* can dispatch callAgent-triggered tasks without a circular import.
+* can dispatch callAgent-triggered runs without a circular import.
 */
 execSubAgent?: (params: ExecSubAgentParams) => Promise<unknown>;
+/**
+* Callback to fork a `lobe-agent.callSubAgent` virtual child run. Unlike
+* execSubAgent, this path installs the async completion bridge and marks the
+* child operation as a sub-agent.
+*/
+execVirtualSubAgent?: (params: ExecVirtualSubAgentParams) => Promise<unknown>;
 hookDispatcher?: HookDispatcher;
 loadAgentState?: (operationId: string) => Promise<AgentState | null>;
 messageModel: MessageModel;
@@ -532,17 +585,23 @@ export const createRuntimeExecutors = (
 const provider = llmPayload.provider || state.modelRuntimeConfig?.provider;
 // Resolve tools via ToolResolver (unified tool injection).
 //
-// Belt-and-suspenders: even if `aiAgent.execAgent` ever forgets to clear
-// `state.metadata.activeDeviceId` for a non-trusted sender, swallowing
-// it here keeps `buildStepToolDelta` from re-injecting `local-system` —
-// the engine's enabledToolIds exclusion alone is not enough, since the
-// delta builder treats activeDeviceId as an independent activation
-// signal and only dedupes against already-enabled tools.
+// Single-track device gate: `buildStepToolDelta` treats activeDeviceId as
+// an independent activation signal (it only dedupes against already-
+// enabled tools), so any id that reaches it WILL inject local-system. The
+// execution plan is the only authority on whether this session may touch
+// a device — swallow the id for non-device-capable plans (`none`,
+// `sandbox`) and for denied senders, even if `state.metadata.activeDeviceId`
+// was populated by a bug or a mid-run side effect. Plans absent on old /
+// resumed operations fall back to the policy-only gate.
 const devicePolicy = state.metadata?.deviceAccessPolicy as
 | { canUseDevice: boolean; reason: DeviceAccessReason }
 | undefined;
+const executionPlan = state.metadata?.executionPlan as ExecutionPlan | undefined;
+const planAllowsDevice = !executionPlan || isDeviceCapablePlan(executionPlan);
 const activeDeviceId =
-devicePolicy?.canUseDevice === false ? undefined : state.metadata?.activeDeviceId;
+devicePolicy?.canUseDevice === false || !planAllowsDevice
+? undefined
+: state.metadata?.activeDeviceId;
 const operationToolSet: OperationToolSet = state.operationToolSet ?? {
 enabledToolIds: [],
 executorMap: state.toolExecutorMap ?? {},
@@ -660,7 +719,7 @@ export const createRuntimeExecutors = (

 try {
 type ContentPart = { text: string; type: 'text' } | { image: string; type: 'image' };
-let shouldPersistAssistantReasoning = false;
+let shouldReplayAssistantReasoning = false;
 let preserveThinkingForPayload: boolean | undefined;

 // Process messages through serverMessagesEngine to inject system role, knowledge, etc.
@@ -699,19 +758,21 @@ export const createRuntimeExecutors = (
 modelSupportsPreserveThinkingFromCard ||
 (!modelCard && providerSupportsPreserveThinkingFallback);

-shouldPersistAssistantReasoning =
-preserveThinkingRequested && modelSupportsPreserveThinking;
+shouldReplayAssistantReasoning = preserveThinkingRequested && modelSupportsPreserveThinking;
 preserveThinkingForPayload =
 modelSupportsPreserveThinking && typeof preserveThinkingConfigured === 'boolean'
 ? preserveThinkingConfigured
 : undefined;
+const messagesForContext = shouldReplayAssistantReasoning
+? (llmPayload.messages as UIChatMessage[])
+: stripAssistantReasoningForReplay(llmPayload.messages as UIChatMessage[]);

 // Extract <refer_topic> tags from messages and fetch summaries.
 // Skip if messages already contain injected topic_reference_context
 // (e.g., from client-side contextEngineering preprocessing) to avoid double injection.
 let topicReferences;
 const alreadyHasTopicRefs = (
-llmPayload.messages as Array<{ content: string | unknown }>
+messagesForContext as Array<{ content: string | unknown }>
 ).some(
 (m) => typeof m.content === 'string' && m.content.includes('topic_reference_context'),
 );
@@ -720,7 +781,7 @@ export const createRuntimeExecutors = (
 const topicModel = new TopicModel(ctx.serverDB, ctx.userId, ctx.workspaceId);
 const messageModel = new MessageModelClass(ctx.serverDB, ctx.userId, ctx.workspaceId);
 topicReferences = await resolveTopicReferences(
-llmPayload.messages as Array<{ content: string | unknown }>,
+messagesForContext as Array<{ content: string | unknown }>,
 async (topicId) => topicModel.findById(topicId),
 async (topicId) => {
 const topic = await topicModel.findById(topicId);
@@ -762,7 +823,7 @@ export const createRuntimeExecutors = (
 agentConfig?.slug === 'web-onboarding' ||
 resolved.enabledToolIds.includes('lobe-web-onboarding');
 const alreadyHasOnboardingContext = (
-llmPayload.messages as Array<{ content: string | unknown }>
+messagesForContext as Array<{ content: string | unknown }>
 ).some((message) => {
 if (typeof message.content !== 'string') return false;

@@ -1043,7 +1104,7 @@ export const createRuntimeExecutors = (
 name: kb.name ?? '',
 })),
 },
-messages: llmPayload.messages as UIChatMessage[],
+messages: messagesForContext,
 model,
 provider,
 systemRole: agentConfig.systemRole ?? undefined,
@@ -1071,14 +1132,14 @@ export const createRuntimeExecutors = (
 CONTEXT_ENGINEERING_SPAN_NAME,
 {
 attributes: buildContextEngineeringAttributes({
-hasImages: (llmPayload.messages as Array<{ content?: unknown }>).some(
+hasImages: (messagesForContext as Array<{ content?: unknown }>).some(
 (m) =>
 Array.isArray(m.content) &&
 (m.content as Array<{ type?: string }>).some((p) => p?.type === 'image_url'),
 ),
 historyCompressed:
-Array.isArray(llmPayload.messages) &&
-llmPayload.messages.some((m: { role?: string }) => m?.role === 'compressedGroup'),
+Array.isArray(messagesForContext) &&
+messagesForContext.some((m: { role?: string }) => m?.role === 'compressedGroup'),
 knowledgeCount:
 (contextEngineInput.knowledge?.knowledgeBases?.length ?? 0) +
 (contextEngineInput.knowledge?.fileContents?.length ?? 0),
@@ -1086,7 +1147,7 @@ export const createRuntimeExecutors = (
 (contextEngineInput.knowledge?.knowledgeBases?.length ?? 0) > 0 ||
 (contextEngineInput.knowledge?.fileContents?.length ?? 0) > 0,
 memoryInjected: Boolean(contextEngineInput.userMemory?.memories),
-messageCount: llmPayload.messages.length,
+messageCount: messagesForContext.length,
 operationId,
 stepIndex,
 systemRoleLength: contextEngineInput.systemRole?.length,
@@ -1639,9 +1700,10 @@ export const createRuntimeExecutors = (
 };
 }

-const persistedReasoning = shouldPersistAssistantReasoning
-? finalReasoning
-: undefined;
+// preserveThinking only gates whether reasoning is replayed into the
+// next LLM payload (state.messages); the DB copy powers UI display
+// after refresh and must always be saved.
+const replayedReasoning = shouldReplayAssistantReasoning ? finalReasoning : undefined;

 try {
 // Build metadata object
@@ -1675,7 +1737,7 @@ export const createRuntimeExecutors = (
 content: finalContent,
 imageList: imageList.length > 0 ? imageList : undefined,
 metadata: Object.keys(metadata).length > 0 ? metadata : undefined,
-reasoning: persistedReasoning,
+reasoning: finalReasoning,
 search: grounding,
 tools: persistedTools,
 });
@@ -1708,7 +1770,7 @@ export const createRuntimeExecutors = (
 newState.messages.push({
 content,
 id: assistantMessageItem.id,
-reasoning: persistedReasoning,
+reasoning: replayedReasoning,
 role: 'assistant',
 tool_calls: stateToolCalls,
 });
@@ -2421,7 +2483,7 @@ export const createRuntimeExecutors = (
 scope: state.metadata?.scope,
 serverDB: ctx.serverDB,
 skipResultTruncation: true,
-subAgent: buildServerSubAgentRunner(
+subAgent: buildServerVirtualSubAgentRunner(
 ctx,
 state,
 chatToolPayload,
@@ -2663,14 +2725,15 @@ export const createRuntimeExecutors = (

 log('[%s:%d] Tool execution completed', operationId, stepIndex);

-// When the tool result carries an execSubAgent / execSubAgents state the
-// GeneralChatAgent needs `stop: true` in the payload to detect it and
-// emit the matching exec_sub_agent / exec_sub_agents instruction. Without
-// this flag the agent falls through to the normal LLM-call path and the
-// sub-agent is never spawned.
-const execTaskStateType = executionResult.state?.type as string | undefined;
-const isExecTaskState =
-execTaskStateType === 'execSubAgent' || execTaskStateType === 'execSubAgents';
+// When a legacy callAgent task result carries execSubAgent / execSubAgents
+// state, the GeneralChatAgent needs `stop: true` in the payload to detect
+// it and emit the matching exec_sub_agent / exec_sub_agents instruction.
+// Without this flag the agent falls through to the normal LLM-call path
+// and the background agent run is never spawned.
+const legacyAgentInvocationStateType = executionResult.state?.type as string | undefined;
+const isLegacyAgentInvocationState =
+legacyAgentInvocationStateType === 'execSubAgent' ||
+legacyAgentInvocationStateType === 'execSubAgents';

 executeToolSpan.setAttributes(
 buildExecuteToolResultAttributes({ attempts: execution.attempts, success: isSuccess }),
@@ -2686,7 +2749,7 @@ export const createRuntimeExecutors = (
 isSuccess,
 // Pass tool message ID as parentMessageId for the next LLM call
 parentMessageId: toolMessageId,
-...(isExecTaskState && { stop: true }),
+...(isLegacyAgentInvocationState && { stop: true }),
 toolCall: chatToolPayload,
 toolCallId: chatToolPayload.id,
 },
@@ -2993,7 +3056,7 @@ export const createRuntimeExecutors = (
 scope: state.metadata?.scope,
 serverDB: ctx.serverDB,
 skipResultTruncation: true,
-subAgent: buildServerSubAgentRunner(
+subAgent: buildServerVirtualSubAgentRunner(
 ctx,
 state,
 chatToolPayload,
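Review note on `stripAssistantReasoningForReplay`: the helper added in this diff appears to rely on structural sharing, meaning it returns the original object (or array) whenever nothing below it carried a `reasoning` field, and allocates a copy only along the changed path. The following is a minimal standalone sketch of that idea, not the code from the diff. `ReplayNode`, `stripNode`, and `stripReasoning` are hypothetical names, and `ReplayNode` is a simplified stand-in for `UIChatMessage` that keeps only the fields the sketch needs.

```typescript
// Hypothetical, simplified stand-in for UIChatMessage (assumption for this sketch).
type ReplayNode = {
  children?: ReplayNode[];
  content?: string;
  members?: ReplayNode[];
  reasoning?: unknown;
};

// Remove `reasoning` from a node and its descendants. The original object is
// returned untouched when neither it nor any descendant had the field.
const stripNode = <T extends ReplayNode>(node: T): T => {
  let changed = 'reasoning' in node;

  const mapList = (list?: ReplayNode[]) =>
    list?.map((item) => {
      const stripped = stripNode(item);
      if (stripped !== item) changed = true;
      return stripped;
    });

  const children = mapList(node.children);
  const members = mapList(node.members);
  if (!changed) return node;

  // Copy everything except `reasoning`, then attach the rewritten lists.
  const { reasoning: _reasoning, ...rest } = node;
  return {
    ...rest,
    ...(children ? { children } : {}),
    ...(members ? { members } : {}),
  } as T;
};

// Array-level wrapper: keep the input array itself when no element changed.
const stripReasoning = (messages: ReplayNode[]): ReplayNode[] => {
  const stripped = messages.map((message) => stripNode(message));
  return stripped.some((message, i) => message !== messages[i]) ? stripped : messages;
};

// Usage: only the first message carries reasoning, so only it is copied.
const history: ReplayNode[] = [
  { content: 'answer', reasoning: { content: 'intermediate thoughts' } },
  { content: 'follow-up question' },
];
const replay = stripReasoning(history);
console.log('reasoning' in replay[0]); // false
console.log(replay[1] === history[1]); // true
```

Returning the same reference on the no-op path lets callers detect "nothing changed" with a cheap identity check, and the input is never mutated, which matters here because the stored copy of the message is still expected to keep its reasoning.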
Some files were not shown because too many files have changed in this diff