Compare commits

..

2 Commits

Author SHA1 Message Date
Arvin Xu 5fcf2daa65 🐛 fix: require model context for "does not exist" matches
Previously, any message containing "does not exist" was classified as ModelNotFound, which could misclassify API key, deployment, or unrelated errors. Now require the word "model" to appear before "does not exist" within the same sentence. Periods inside version numbers (e.g. "doubao-seed-2.0-pro") are still allowed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-24 01:10:58 +08:00
Arvin Xu d4efbe1ed1 feat: add isModelNotFoundError pattern matching for model-runtime error classification
Add message-based pattern matching for ModelNotFound errors, similar to existing patterns for context window, quota, and account errors. Previously ModelNotFound was only detected via error code 'model_not_found', missing cases where providers return different codes but include recognizable messages like "does not exist" (Volcengine/doubao).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-24 01:10:58 +08:00
2127 changed files with 34563 additions and 233210 deletions
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: agent-runtime-hooks
description: 'Agent runtime lifecycle hooks. Use for before/after tool or step hooks, tool mocks, human intervention, sub-agent calls, context compression, evals, tracing, callAgent, or lifecycle events.'
description: "Agent runtime lifecycle hooks for observing and intercepting agent execution. Use when adding hooks to agent operations, mocking tool calls, logging step events, handling human intervention, sub-agent calls, context compression, or building eval/tracing integrations. Triggers on 'hooks', 'beforeToolCall', 'afterToolCall', 'beforeStep', 'afterStep', 'onComplete', 'onError', 'tool mock', 'agent lifecycle', 'human intervention', 'callAgent', 'compact'."
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: agent-signal
description: 'Build or extend LobeHub Agent Signal pipelines. Use for signal sources, signal/action types, policies, middleware, workflow handoff, dedupe, scope behavior, or observability.'
description: Build or extend LobeHub Agent Signal pipelines for background or quiet agent work driven by event sources, semantic signals, and action handlers. Use when adding a new Agent Signal source, signal or action type, policy, middleware handler, workflow handoff, dedupe or scope behavior, or observability around `src/server/services/agentSignal/**`, `packages/agent-signal`, or `packages/observability-otel/src/modules/agent-signal`.
---
# Agent Signal
+6 -7
View File
@@ -1,6 +1,6 @@
---
name: agent-tracing
description: 'Agent tracing CLI for execution snapshots. Use for agent-tracing, traces, snapshots, LLM call inspection, context engine data, agent step analysis, or execution debugging.'
description: "Agent tracing CLI for inspecting agent execution snapshots. Use when user mentions 'agent-tracing', 'trace', 'snapshot', wants to debug agent execution, inspect LLM calls, view context engine data, or analyze agent steps. Triggers on agent debugging, trace inspection, or execution analysis tasks."
user-invocable: false
---
@@ -14,7 +14,7 @@ In `NODE_ENV=development`, `AgentRuntimeService.executeStep()` automatically rec
**Data flow**: executeStep loop -> build `StepPresentationData` -> write partial snapshot to disk -> on completion, finalize to `.agent-tracing/{timestamp}_{traceId}.json`
**Context engine capture**: In `RuntimeExecutors.ts`, the `call_llm` executor calls `ctx.tracingContextEngine(input, output)` after `serverMessagesEngine()` processes messages. `AgentRuntimeService.executeStep` buffers the call per step and forwards it to `OperationTraceRecorder.appendStep` as the typed `contextEngine` field. CE flows through this side channel rather than the `events` array so its heavy payload (agentDocuments, systemRole, …) never enters the Redis state pipeline (LOBE-9110).
**Context engine capture**: In `RuntimeExecutors.ts`, the `call_llm` executor emits a `context_engine_result` event after `serverMessagesEngine()` processes messages. This event carries the full `contextEngineInput` (DB messages, systemRole, model, knowledge, tools, userMemory, etc.) and the processed `output` messages (the final LLM payload).
## Package Location
@@ -199,10 +199,9 @@ interface StepSnapshot {
messages?: any[]; // DB messages before step
context?: { phase: string; payload?: unknown; stepContext?: unknown };
events?: Array<{ type: string; [key: string]: unknown }>;
contextEngine?: {
input?: unknown; // contextEngineInput minus messages + toolsConfig (reconstructible from baseline)
output?: unknown; // processed messages array (final LLM payload)
};
// context_engine_result event contains:
// input: full contextEngineInput (messages, systemRole, model, knowledge, tools, userMemory, ...)
// output: processed messages array (final LLM payload)
}
```
@@ -217,5 +216,5 @@ When using `--messages`, the output shows three sections (if context engine data
## Integration Points
- **Recording**: `src/server/services/agentRuntime/AgentRuntimeService.ts` — in the `executeStep()` method, after building `stepPresentationData`, writes partial snapshot in dev mode
- **Context engine capture**: `src/server/modules/AgentRuntime/RuntimeExecutors.ts` — in `call_llm` executor, after `serverMessagesEngine()` returns, calls `ctx.tracingContextEngine(input, output)`. `AgentRuntimeService.executeStep` buffers it per step and passes it to `traceRecorder.appendStep` as the typed `contextEngine` field (kept off the `events` array to stay out of Redis state).
- **Context engine event**: `src/server/modules/AgentRuntime/RuntimeExecutors.ts` — in `call_llm` executor, after `serverMessagesEngine()` returns, emits `context_engine_result` event
- **Store**: `FileSnapshotStore` reads/writes to `.agent-tracing/` relative to `process.cwd()`
+2 -2
View File
@@ -1,6 +1,6 @@
---
name: builtin-tool
description: 'Build LobeHub builtin tool packages. Use when adding agent-callable tools, manifests, executors, runtimes, inspectors, renders, placeholders, streaming, interventions, portals, or tool registries.'
description: Build a new builtin tool package under `packages/builtin-tool-<name>/`. Use when adding a new agent-callable toolset, designing its API surface (manifest / ApiName / Params / State), implementing the Executor + ExecutionRuntime, building the Inspector / Render / Placeholder / Streaming / Intervention / Portal UI, or wiring a tool into the central registries (`packages/builtin-tools/src/{index,identifiers,inspectors,renders,placeholders,streamings,interventions,portals}.ts` and `src/store/tool/slices/builtin/executors/index.ts`). Triggers on "new builtin tool", "add a tool", "tool inspector", "tool render", "tool placeholder", "tool streaming", "tool intervention", "BuiltinToolManifest", "BaseExecutor", "ExecutionRuntime".
---
# Builtin Tool Authoring Guide
@@ -23,7 +23,7 @@ A builtin tool is a package the agent runtime can call. It ships **five faces**:
| ------------------------------------------------------------------------------------ | --------------------------------------------- |
| Where do files live? What does each face do? Wiring? | [architecture.md](references/architecture.md) |
| How do I name the tool, design APIs, write the manifest, executor, ExecutionRuntime? | [tool-design.md](references/tool-design.md) |
| How do I build Inspector / Render / Placeholder / Streaming / Intervention / Portal? | [ui/](references/ui/README.md) |
| How do I build Inspector / Render / Placeholder / Streaming / Intervention / Portal? | [ui.md](references/ui.md) |
---
@@ -2,7 +2,7 @@
This doc covers everything that **isn't UI**: the tool's identifier, API surface, manifest, types, system prompt, ExecutionRuntime, and the executor that wires it into the frontend.
For UI surfaces (Inspector / Render / Placeholder / Streaming / Intervention / Portal), see [ui/](ui/README.md).
For UI surfaces (Inspector / Render / Placeholder / Streaming / Intervention / Portal), see [ui.md](ui.md).
For where files live and how registries work, see [architecture.md](architecture.md).
---
@@ -156,7 +156,7 @@ export const TaskManifest: BuiltinToolManifest = {
executors: ['client', 'server'],
/* Default human intervention policy for all APIs that don't specify one.
Pair with an Intervention component (see ui/intervention.md). */
Pair with an Intervention component (see ui.md). */
humanIntervention: 'never' | 'always' | { /* extended config */ },
}
```
@@ -0,0 +1,744 @@
# Tool UI Surfaces
A builtin tool can ship up to **six client-side surfaces**, each with a different role in the chat UI. Only `Inspector` is required; the other five are added on demand and registered in their own central files.
| Surface | Required? | When the chat shows it | Registered in |
| ------------ | --------- | --------------------------------------------------------------------- | --------------------------------------------- |
| Inspector | ✅ Always | Header strip of every tool call (one-line chip) | `packages/builtin-tools/src/inspectors.ts` |
| Render | Optional | Rich result card below the header, after the call returns | `packages/builtin-tools/src/renders.ts` |
| Placeholder | Optional | Skeleton between "args streaming complete" and "result arrives" | `packages/builtin-tools/src/placeholders.ts` |
| Streaming | Optional | Live output during execution (e.g. command stdout) | `packages/builtin-tools/src/streamings.ts` |
| Intervention | Optional | Approval / edit-before-run dialog (when `humanIntervention` triggers) | `packages/builtin-tools/src/interventions.ts` |
| Portal | Optional | Full-screen detail view (right-side or modal) | `packages/builtin-tools/src/portals.ts` |
The two reference tools to read end-to-end:
- **`builtin-tool-web-browsing/src/client/`** — Inspector + Render + Placeholder + Portal (no Intervention/Streaming).
- **`builtin-tool-local-system/src/client/`** — all six surfaces, including `components/` for shared building blocks.
---
## Tool Render 设计原则(中文草案)
这些原则用于判断一个 builtin tool 的 Inspector / Render / Placeholder / Streaming / Intervention / Portal 应该做什么,以及做到什么程度。
1. **先保证折叠态可读。** 每个 API 都必须有 Inspector;用户不展开也应该能看懂 “正在做什么 / 对什么做 / 当前结果是什么”。Inspector 不应该只展示函数名和原始参数。
2. **Inspector 是一句话,不是详情页。** 优先表达动作、关键对象、数量、状态,例如 “分析图片 3 张”“搜索 12 个结果”“读取 config.json”。长文本、列表和结构化结果放到 Render 或 Portal。
3. **Inspector 要覆盖执行生命周期。** `args` 还在 streaming、工具执行中、执行完成、执行失败时都应该有稳定展示;必要时同时读取 `args``partialArgs``pluginState`,避免出现空白、跳变或只显示半截参数。
4. **文案要随状态切换时态。** 同一个动作在 loading 与 completed 两个阶段必须用不同的措辞:执行中用现在进行时(“正在创建任务 / Creating task / 正在搜索”),执行完成后切到完成态(“已创建任务 / Task created / 已找到 N 条”)。Inspector chip 会一直留在聊天记录里 —— 如果一直挂着 “正在 xxx”,几小时后回看历史时会读起来像还在跑。约定的 i18n 形式是 `<api>.loading` / `<api>.completed` 一对键(见 `lobe-agent.apiName.callSubAgent.{loading,completed}``lobe-claude-code.task.{create,list,update,get}.{loading,completed}`),渲染时按 `isArgumentsStreaming || isLoading` 决定取哪一个。只读 / 查询类(“查看任务” 这种本来就是名词性的)可以共用一个键。
5. **只有结构化结果才需要 Render。** 如果工具结果只是自然语言总结,通常不需要 Render;如果结果包含列表、媒体、文件、表格、代码、diff、地图、时间线、权限请求等结构,就应该提供 Render。
6. **Render 要帮助用户检查结果,而不是复述参数。** Render 的主体应该围绕工具产物组织:可预览、可比较、可筛选、可定位。参数只作为上下文辅助出现,不要把 Render 做成一块更大的 args dump。
7. **参数和结果要一起参与渲染。** 好的 Tool UI 通常同时用 `args` 解释意图,用 `pluginState` 展示真实执行结果;但 `pluginState` 只放结果域数据,不要反向塞入可以从 `args` 推导出的内容。
8. **慢操作要有 Placeholder。** 如果工具通常需要等待网络、文件系统、模型或外部进程,Placeholder 应该先占住最终 Render 的版式,让用户知道即将看到什么,而不是只显示一个泛化 loading。
9. **Streaming 只用于连续产物。** 搜索列表、日志、长文本、文件分析、分阶段计划适合 Streaming;一次性小结果不需要强行做 Streaming。Streaming UI 要能渐进追加,并且完成后自然过渡到最终 Render。
10. **有风险的动作必须 Intervention。** 写文件、删除、发送、安装、执行命令、外部可见操作、权限敏感操作,都应该在执行前给出可理解的确认界面;确认文案要说明影响范围,而不是只问 “是否继续”。
11. **错误、空态和截断都是正式状态。** Render 不能在失败、无结果、超长结果时退化成空白。错误要说明发生在哪一步;空态要告诉用户没有产物;超长内容要明确 “展示前 N 项 / 还有 N 项”。
12. **信息密度要克制。** 默认展示最有判断价值的部分:标题、来源、状态、摘要、少量关键字段。大对象、长列表、原文、调试数据放进可展开区域或 Portal,避免把聊天流撑成后台管理页。
13. **视觉上融入聊天流。** Tool UI 应该使用 `@lobehub/ui` / base-ui、`Flexbox``createStaticStyles``cssVar.*`,遵循现有间距、圆角、颜色、字号;不要为单个工具发明一套独立视觉语言。
14. **Devtools fixture 是验收入口。** 新增或修改 Tool UI 时,应在 `/devtools` 里准备覆盖典型态、loading/streaming、空态、错误态、长内容态的 fixture;一个 API 如果在真实聊天里会出现,就不应该在 devtools 中缺席。
15. **先做用户会看的 UI,再做调试 UI。** Raw JSON、trace、schema、内部 id 可以存在,但应默认收起或放到调试区;主界面先回答用户最关心的问题:工具做了什么,结果值不值得信任,下一步能做什么。
---
## 0. Shared Style Rules
These apply across every surface.
### 0.1 Use `'use client'` at the top of every component file
Tool surfaces are leaves in the chat tree and must not block server rendering.
### 0.2 Prefer `createStaticStyles + cssVar.*`
Zero-runtime CSS-in-JS — the styles compile once and read CSS variables at runtime.
```tsx
import { createStaticStyles, cssVar } from 'antd-style';
const styles = createStaticStyles(({ css, cssVar }) => ({
chip: css`
padding-block: 2px;
padding-inline: 8px;
border-radius: 999px;
color: ${cssVar.colorText};
background: ${cssVar.colorFillTertiary};
`,
}));
```
Fall back to `createStyles + token` only when you need runtime token computation (rare). Inline `style={{ color: cssVar.colorTextSecondary }}` is fine for one-off dynamic values.
### 0.3 Use `@lobehub/ui`, not raw `antd`
`Block`, `Text`, `Flexbox`, `Highlighter`, `Alert`, `Tooltip`, `Skeleton` all come from `@lobehub/ui`. Modals come from `@lobehub/ui/base-ui` (`createModal`, `useModalContext`, `confirmModal`) — see the **modal** skill.
Memory note: `@lobehub/ui`'s `<Text type='secondary'>` is a lighter shade than `colorTextSecondary`. If you need that exact token color, write `<Text style={{ color: cssVar.colorTextSecondary }}>`.
### 0.4 Always `memo` and set `displayName`
```tsx
export const SearchInspector = memo<BuiltinInspectorProps<SearchQuery, UniformSearchResponse>>(
({ args /* … */ }) => {
/* … */
},
);
SearchInspector.displayName = 'SearchInspector';
export default SearchInspector;
```
### 0.5 Always type with `BuiltinXProps<Args, State>` generics
Don't widen to `any`. The Args generic is the JSON Schema params, the State generic is the executor's `state` field. The two should match `<Name>Params` and `<Name>State` from `types.ts`.
### 0.6 Pull strings from `t('plugin')`
```tsx
const { t } = useTranslation('plugin');
t('builtins.<identifier>.apiName.<api>');
```
Every Inspector should default to `t('builtins.<identifier>.apiName.<api>')` so it shows something while args stream in.
### 0.7 Read store state from `@/store/chat`, not props
Tool surfaces sometimes need cross-cutting state (loading, streaming buffer). Read it inside the component via Zustand selectors, not from props — props only carry args/state/messageId.
---
## 1. Inspector — Header Chip (required)
**Lifecycle:** Inspector renders for **every phase** of a tool call: while args are streaming in, while the executor is running, and after results come back. It's the only surface that's always visible.
**Goal:** keep it to a single line. Show what's happening with as much context as is currently available.
### Props (`BuiltinInspectorProps<Args, State>`)
```ts
interface BuiltinInspectorProps<Arguments = any, State = any> {
apiName: string;
args: Arguments; // final args (only after the assistant stops streaming)
identifier: string;
isArgumentsStreaming?: boolean; // args still arriving
isLoading?: boolean; // args complete, executor running
partialArgs?: Arguments; // partial JSON during streaming
pluginState?: State; // executor's `state` after success
result?: { content: string | null; error?: any };
}
```
### State machine
| Phase | What's available | What to show |
| ----------------------------------- | ---------------------------------------------------------- | ---------------------------------------------------------- |
| Args streaming, no useful field yet | `isArgumentsStreaming === true`, `partialArgs.X` undefined | Just the API title with `shinyTextStyles.shinyText` |
| Args streaming, key field arrived | `partialArgs.X` populated | Title + key field chip, still pulse-animated |
| Args complete, executor running | `args` populated, `isLoading === true` | Same as above, still pulse-animated |
| Result arrived | `pluginState` populated, `isLoading === false` | Title + chips + result summary (count, identifier, status) |
### Canonical example — Search
`packages/builtin-tool-web-browsing/src/client/Inspector/Search/index.tsx`:
```tsx
'use client';
import type { BuiltinInspectorProps, SearchQuery, UniformSearchResponse } from '@lobechat/types';
import { Text } from '@lobehub/ui';
import { cssVar, cx } from 'antd-style';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { highlightTextStyles, inspectorTextStyles, shinyTextStyles } from '@/styles';
export const SearchInspector = memo<BuiltinInspectorProps<SearchQuery, UniformSearchResponse>>(
({ args, partialArgs, isArgumentsStreaming, isLoading, pluginState }) => {
const { t } = useTranslation('plugin');
const query = args?.query || partialArgs?.query || '';
const resultCount = pluginState?.results?.length ?? 0;
const hasResults = resultCount > 0;
if (isArgumentsStreaming && !query) {
return (
<div className={cx(inspectorTextStyles.root, shinyTextStyles.shinyText)}>
<span>{t('builtins.lobe-web-browsing.apiName.search')}</span>
</div>
);
}
return (
<div
className={cx(
inspectorTextStyles.root,
(isArgumentsStreaming || isLoading) && shinyTextStyles.shinyText,
)}
>
<span>{t('builtins.lobe-web-browsing.apiName.search')}:&nbsp;</span>
{query && <span className={highlightTextStyles.primary}>{query}</span>}
{!isLoading &&
!isArgumentsStreaming &&
pluginState?.results &&
(hasResults ? (
<span style={{ marginInlineStart: 4 }}>({resultCount})</span>
) : (
<Text as="span" color={cssVar.colorTextDescription} fontSize={12}>
({t('builtins.lobe-web-browsing.inspector.noResults')})
</Text>
))}
</div>
);
},
);
SearchInspector.displayName = 'SearchInspector';
export default SearchInspector;
```
### Inspector rules
- Wrap the whole row with `inspectorTextStyles.root` (provides correct flex / line-height baseline).
- Pulse with `shinyTextStyles.shinyText` whenever `isArgumentsStreaming || isLoading`.
- Show the i18n title first so the row is non-empty during the earliest streaming phase.
- Read both `args?.X` and `partialArgs?.X` together — `args` is final, `partialArgs` is in-stream.
- Use chips/tags for distinct facets (identifier, name, parent, status, count). Each chip should clip with `text-overflow: ellipsis` and have a `max-width` so long values don't blow out the chat bubble.
- Append `pluginState`-derived suffixes only **after** loading finishes — count or "(no results)" should not appear while still searching.
- **Switch copy by phase.** If the verb implies an ongoing action ("Creating", "Searching", "Listing"), define `<api>.loading` and `<api>.completed` keys and select via `isArgumentsStreaming || isLoading ? loadingKey : completedKey`. Inspector chips persist in chat history — leaving "Creating task" frozen on a finished call reads as if the tool is still running. Read-only labels that are already noun-form ("View task") can keep a single key. See `CallSubAgentInspector` for the canonical two-key pattern.
### Inspector registry — `client/Inspector/index.ts`
```ts
import type { BuiltinInspector } from '@lobechat/types';
import { TaskApiName } from '../../types';
import { CreateTaskInspector } from './CreateTask';
import { ListTasksInspector } from './ListTasks';
/* … */
export const TaskInspectors: Record<string, BuiltinInspector> = {
[TaskApiName.createTask]: CreateTaskInspector as BuiltinInspector,
[TaskApiName.listTasks]: ListTasksInspector as BuiltinInspector,
/* one entry per ApiName */
};
export { CreateTaskInspector } from './CreateTask';
export { ListTasksInspector } from './ListTasks';
/* re-export each */
```
---
## 2. Render — Rich Result Card (optional)
**Lifecycle:** rendered **once the result arrives** (after Placeholder/Streaming hand off). Sits below the Inspector header.
**Skip if** the API is read-only or the result is just text — the framework already shows the executor's `content` string. Add a Render only when there's a structured artifact worth seeing: a card, a chart, a diff, a list of files.
### Props (`BuiltinRenderProps<Args, State, Content>`)
```ts
interface BuiltinRenderProps<Arguments = any, State = any, Content = any> {
apiName?: string;
args: Arguments; // final params from the LLM
content: Content; // executor's content string (or parsed)
identifier?: string;
messageId: string; // for store lookups
pluginError?: any; // from BuiltinToolResult.error
pluginState?: State; // executor's state
toolCallId?: string;
}
```
### Two patterns
**Pattern A — Single-file Render** (web-browsing CrawlSinglePage):
```tsx
// client/Render/CrawlSinglePage.tsx
import type { BuiltinRenderProps, CrawlPluginState, CrawlSinglePageQuery } from '@lobechat/types';
import { memo } from 'react';
import PageContent from './PageContent';
const CrawlSinglePage = memo<BuiltinRenderProps<CrawlSinglePageQuery, CrawlPluginState>>(
({ messageId, pluginState, args }) => (
<PageContent messageId={messageId} results={pluginState?.results} urls={[args?.url]} />
),
);
export default CrawlSinglePage;
```
**Pattern B — Folder with subcomponents** (web-browsing Search):
```
client/Render/Search/
├── index.tsx # composes the subcomponents, handles error states
├── ConfigForm.tsx # appears when pluginError.type === 'PluginSettingsInvalid'
├── SearchQuery.tsx # editable query header
└── SearchResult.tsx # result list
```
Use Pattern B when the Render has internal state (editing mode, expanded items), error variants, or is large enough to benefit from splitting.
### Error handling in Render
Renders are the canonical place to surface `pluginError` because the chat doesn't auto-render typed errors:
```tsx
if (pluginError) {
if (pluginError?.type === 'PluginSettingsInvalid') {
return <ConfigForm id={messageId} provider={pluginError.body?.provider} />;
}
return (
<Alert
title={pluginError?.message}
type="error"
extra={<Highlighter language="json">{JSON.stringify(pluginError.body, null, 2)}</Highlighter>}
/>
);
}
```
### Render rules
- **Return `null`** if there's nothing useful to draw yet (avoids empty cards during stream).
- Use `pluginState` for server-truth (ids, counts, server-assigned status) and `args` for what the LLM asked. **Combine — neither alone is enough.**
- For lists, summarize with a header line and show top N items with a "+N more" tail rather than rendering everything.
- For modals from a Render, use `@lobehub/ui/base-ui` (`createModal`, `useModalContext`, `confirmModal`) — see the **modal** skill.
### Render registry — `client/Render/index.ts`
```ts
import type { BuiltinRender } from '@lobechat/types';
import { TaskApiName } from '../../types';
import CreateTaskRender from './CreateTask';
import RunTasksRender from './RunTasks';
export const TaskRenders: Record<string, BuiltinRender> = {
[TaskApiName.createTask]: CreateTaskRender as BuiltinRender,
[TaskApiName.runTasks]: RunTasksRender as BuiltinRender,
/* only the APIs with rich result UI — others fall back to text content */
};
export { default as CreateTaskRender } from './CreateTask';
export { default as RunTasksRender } from './RunTasks';
```
### Render display control (rare)
If the Render should hide for certain results (e.g. ClaudeCode's TodoWrite hides when the agent is mid-stream), add a `RenderDisplayControl` to `packages/builtin-tools/src/displayControls.ts`. See `ClaudeCodeRenderDisplayControls` for the pattern.
---
## 3. Placeholder — Skeleton Between Args and Result (optional)
**Lifecycle:** rendered when the args have finished streaming but the executor hasn't returned yet. Disappears when `pluginState` arrives. Bridges the moment of perceived lag.
**Add for** APIs with noticeable execution time: web search, network crawl, file list, large grep. **Skip for** instant ops (status flips, calculator).
### Props (`BuiltinPlaceholderProps<Args>`)
```ts
interface BuiltinPlaceholderProps<T extends Record<string, any> = any> {
apiName: string;
args?: T;
identifier: string;
}
```
No `pluginState` — Placeholder lives entirely in the "executing" gap.
### Canonical example — Search Placeholder
`packages/builtin-tool-web-browsing/src/client/Placeholder/Search.tsx`:
```tsx
import type { BuiltinPlaceholderProps, SearchQuery } from '@lobechat/types';
import { Flexbox, Icon, Skeleton } from '@lobehub/ui';
import { createStaticStyles, cx } from 'antd-style';
import { SearchIcon } from 'lucide-react';
import { memo } from 'react';
import { useIsMobile } from '@/hooks/useIsMobile';
import { shinyTextStyles } from '@/styles';
const styles = createStaticStyles(({ css, cssVar }) => ({
query: cx(
css`
padding: 4px 8px;
border-radius: 8px;
font-size: 12px;
color: ${cssVar.colorTextSecondary};
&:hover {
background: ${cssVar.colorFillTertiary};
}
`,
shinyTextStyles.shinyText,
),
}));
export const Search = memo<BuiltinPlaceholderProps<SearchQuery>>(({ args }) => {
const { query } = args || {};
const isMobile = useIsMobile();
return (
<Flexbox gap={8}>
<Flexbox horizontal={!isMobile} gap={isMobile ? 8 : 40}>
<Flexbox horizontal align="center" className={styles.query} gap={8}>
<Icon icon={SearchIcon} />
{query ? query : <Skeleton.Block active style={{ height: 20, width: 40 }} />}
</Flexbox>
<Skeleton.Block active style={{ height: 20, width: 40 }} />
</Flexbox>
<Flexbox horizontal gap={12}>
{[1, 2, 3, 4, 5].map((id) => (
<Skeleton.Button active key={id} style={{ borderRadius: 8, height: 80, width: 160 }} />
))}
</Flexbox>
</Flexbox>
);
});
```
### Placeholder rules
- **Mirror the eventual Render's layout.** When the result arrives the Placeholder unmounts and the Render mounts; if they share dimensions, the chat doesn't jump.
- Use `Skeleton.Block` / `Skeleton.Button` from `@lobehub/ui` for placeholder shapes.
- Embed any args you have (e.g. the query text) — context helps the user know what's loading.
- Pulse with `shinyTextStyles.shinyText` if the Placeholder includes literal text.
### Placeholder registry — `client/Placeholder/index.ts`
```ts
import { WebBrowsingApiName } from '../../types';
import CrawlMultiPages from './CrawlMultiPages';
import CrawlSinglePage from './CrawlSinglePage';
import { Search } from './Search';
export const WebBrowsingPlaceholders = {
[WebBrowsingApiName.crawlMultiPages]: CrawlMultiPages,
[WebBrowsingApiName.crawlSinglePage]: CrawlSinglePage,
[WebBrowsingApiName.search]: Search,
};
export { CrawlMultiPages, CrawlSinglePage, Search };
```
---
## 4. Streaming — Live Output During Execution (optional)
**Lifecycle:** rendered **while the executor is still running** for APIs that emit incremental output. The component is responsible for fetching the in-flight stream from the chat store and rendering it.
**Add for** long-running ops with continuous output: shell command execution (stdout/stderr), file write progress, code interpreter cells.
### Props (`BuiltinStreamingProps<Args>`)
```ts
interface BuiltinStreamingProps<Arguments = any> {
apiName: string;
args: Arguments;
identifier: string;
messageId: string; // use to fetch the streaming buffer from store
toolCallId: string;
}
```
Note there's **no `state` or `result` prop** — the Streaming component is for the in-flight phase. It pulls the live buffer from the store itself (typically via `chatToolSelectors.streamingContent(messageId)` or similar).
### Canonical example — RunCommandStreaming
`packages/builtin-tool-local-system/src/client/Streaming/RunCommand/index.tsx`:
```tsx
'use client';
import type { BuiltinStreamingProps } from '@lobechat/types';
import { Highlighter } from '@lobehub/ui';
import { memo } from 'react';
interface RunCommandParams {
command?: string;
description?: string;
timeout?: number;
}
export const RunCommandStreaming = memo<BuiltinStreamingProps<RunCommandParams>>(({ args }) => {
const { command } = args || {};
if (!command) return null;
return (
<Highlighter
animated
wrap
language="sh"
showLanguage={false}
style={{ padding: '4px 8px' }}
variant="outlined"
>
{command}
</Highlighter>
);
});
RunCommandStreaming.displayName = 'RunCommandStreaming';
```
For real-time output beyond just the command (stderr/stdout streaming), pull from the chat store:
```tsx
const buffer = useChatStore((state) =>
chatToolSelectors.streamingBuffer(messageId, toolCallId)(state),
);
```
### Streaming rules
- Render `null` until you have something to display (avoids flash).
- For terminal-style output, use `Highlighter` with `animated` to show typing-like effect.
- The Streaming component must **unmount cleanly** when execution ends — typically the framework swaps it out for the Render automatically.
### Streaming registry — `client/Streaming/index.ts`
```ts
import { LocalSystemApiName } from '../..';
import { RunCommandStreaming } from './RunCommand';
import { WriteFileStreaming } from './WriteFile';
export const LocalSystemStreamings = {
[LocalSystemApiName.runCommand]: RunCommandStreaming,
[LocalSystemApiName.writeLocalFile]: WriteFileStreaming,
};
```
---
## 5. Intervention — Approval / Edit-Before-Run (optional)
**Lifecycle:** rendered **before the executor runs** for APIs whose manifest sets `humanIntervention`. The user sees a preview of the args, can edit them, then approves or skips/cancels.
**Add for** destructive or sensitive ops: shell commands, file writes, file moves, payments, message broadcasts.
### Props (`BuiltinInterventionProps<Args>`)
```ts
interface BuiltinInterventionProps<Arguments = any> {
apiName?: string;
args: Arguments;
identifier?: string;
interactionMode?: 'approval' | 'custom';
messageId: string;
/** Called when the user edits the args; the approve action awaits this. */
onArgsChange?: (args: Arguments) => void | Promise<void>;
/** Called on approve / skip / cancel. */
onInteractionAction?: (
action:
| { type: 'submit'; payload: Record<string, unknown> }
| { type: 'skip'; payload?: Record<string, unknown>; reason?: string }
| { type: 'cancel'; payload?: Record<string, unknown> },
) => Promise<void>;
/** Register a callback to flush pending saves before approval. Returns cleanup. */
registerBeforeApprove?: (id: string, callback: () => void | Promise<void>) => () => void;
}
```
### Canonical example — RunCommand Intervention
`packages/builtin-tool-local-system/src/client/Intervention/RunCommand/index.tsx`:
```tsx
import type { RunCommandParams } from '@lobechat/electron-client-ipc';
import type { BuiltinInterventionProps } from '@lobechat/types';
import { Flexbox, Highlighter, Text } from '@lobehub/ui';
import { memo } from 'react';
const RunCommand = memo<BuiltinInterventionProps<RunCommandParams>>(({ args }) => {
const { description, command, timeout } = args;
return (
<Flexbox gap={8}>
<Flexbox horizontal justify="space-between">
{description && <Text>{description}</Text>}
{timeout && (
<Text style={{ fontSize: 12 }} type="secondary">
timeout: {formatTimeout(timeout)}
</Text>
)}
</Flexbox>
{command && (
<Highlighter wrap language="sh" showLanguage={false} variant="outlined">
{command}
</Highlighter>
)}
</Flexbox>
);
});
export default RunCommand;
```
### Intervention rules
- **Show a preview, not a form by default.** Editing UI is opt-in via `onArgsChange` and is usually inline (click to edit a code block, etc.).
- For args with debounced edit state (text fields), use `registerBeforeApprove(id, flushFn)` so the approve action waits for the debounce to flush. Always return the cleanup function.
- Call `onInteractionAction({ type: 'submit', payload })` when the user approves; `'skip'` if they skip with a reason; `'cancel'` if they cancel the whole turn.
- Add a corresponding `interventionAudit.ts` in the package root if the tool needs scope/path validation before approval (see `local-system/src/interventionAudit.ts`).
### Intervention registry — `client/Intervention/index.ts`
```ts
import { LocalSystemApiName } from '../..';
import EditLocalFile from './EditLocalFile';
import RunCommand from './RunCommand';
import WriteFile from './WriteFile';
/* … */
export const LocalSystemInterventions = {
[LocalSystemApiName.editLocalFile]: EditLocalFile,
[LocalSystemApiName.runCommand]: RunCommand,
[LocalSystemApiName.writeLocalFile]: WriteFile,
/* one entry per API that needs approval */
};
```
---
## 6. Portal — Full-Screen Detail View (optional)
**Lifecycle:** rendered when the user opens the tool message in a side panel or full-screen modal. One Portal per **tool**, not per API — the Portal switches on `apiName` internally.
**Add for** tools whose results deserve a deep-dive view: search results with editable filters, page content with reader mode, code interpreter sessions.
### Props (`BuiltinPortalProps<Args, State>`)
```ts
interface BuiltinPortalProps<Arguments = Record<string, any>, State = any> {
apiName?: string;
arguments: Arguments;
identifier: string;
messageId: string;
state: State;
}
```
### Canonical example — Web-Browsing Portal
`packages/builtin-tool-web-browsing/src/client/Portal/index.tsx`:
```tsx
import type { BuiltinPortalProps, CrawlPluginState, SearchQuery } from '@lobechat/types';
import { memo } from 'react';
import { WebBrowsingApiName } from '../../types';
import PageContent from './PageContent';
import PageContents from './PageContents';
import Search from './Search';
const Portal = memo<BuiltinPortalProps>(({ arguments: args, messageId, state, apiName }) => {
switch (apiName) {
case WebBrowsingApiName.search:
return <Search messageId={messageId} query={args as SearchQuery} response={state} />;
case WebBrowsingApiName.crawlSinglePage: {
const result = (state as CrawlPluginState).results.find((r) => r.originalUrl === args.url);
return <PageContent messageId={messageId} result={result} />;
}
case WebBrowsingApiName.crawlMultiPages:
return (
<PageContents
messageId={messageId}
results={(state as CrawlPluginState).results}
urls={args.urls}
/>
);
}
return null;
});
export default Portal;
```
### Portal rules
- One Portal per tool — the file is the routing layer, subcomponents implement each API's view.
- Portals can read the chat store directly to detect "still streaming" and render a Skeleton internally (see `Search/index.tsx:20-46`).
- Layout assumes more space than the Render — use `Flexbox` with `height={'100%'}` and structure for a side panel viewport.
### Portal registry — `packages/builtin-tools/src/portals.ts`
```ts
import { WebBrowsingManifest, WebBrowsingPortal } from '@lobechat/builtin-tool-web-browsing/client';
import { type BuiltinPortal } from '@lobechat/types';
export const BuiltinToolsPortals: Record<string, BuiltinPortal> = {
[WebBrowsingManifest.identifier]: WebBrowsingPortal as BuiltinPortal,
};
```
---
## 7. `client/components/` — Shared Subcomponents
Cross-cutting building blocks used by multiple surfaces live here, not duplicated in each surface folder.
Examples from `web-browsing/src/client/components/`:
- `CategoryAvatar.tsx` — search category icon
- `EngineAvatar.tsx` — search engine logo (used in Inspector chip + Render list + Portal header)
- `SearchBar.tsx` — editable query bar (used in Render and Portal)
Examples from `local-system/src/client/components/`:
- `FileItem.tsx` — single file row (used in ListFiles Render, SearchFiles Render, MoveLocalFiles Render)
- `FilePathDisplay.tsx` — path with truncation (used everywhere)
### Rules
- Live under `client/components/`, exported via `client/components/index.ts`.
- Re-export from `client/index.ts` only if other packages need them; otherwise keep internal.
- Keep them dumb — props in, JSX out, no store reads. The store reads belong in the surface that composes them.
---
## 8. `client/index.ts` — Package Public API
Re-exports everything the registries need plus useful types/manifest:
```ts
// Inspector — required
export { TaskInspectors } from './Inspector';
// Render — only if any API has one
export { TaskRenders, CreateTaskRender, RunTasksRender } from './Render';
// Placeholder / Streaming / Intervention — only if used
export { LocalSystemListFilesPlaceholder, LocalSystemSearchFilesPlaceholder } from './Placeholder';
export { LocalSystemStreamings } from './Streaming';
export { LocalSystemInterventions } from './Intervention';
// Portal — single export per tool
export { default as WebBrowsingPortal } from './Portal';
// Reusable components if other packages need them
export { CategoryAvatar, EngineAvatar, SearchBar } from './components';
// Re-export manifest, identifier, types for convenience
export { TaskManifest, TaskIdentifier } from '../manifest';
export * from '../types';
```
---
## 9. Diagnostic Quick-Lookup
| Symptom | Surface to check | | |
| ----------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | --- | ------------------------- |
| No header at all on the tool call | Inspector missing from `client/Inspector/index.ts` registry | | |
| Header shows the API name but no chips | Inspector missing \`args?.X | | partialArgs?.X\` fallback |
| Header doesn't pulse during loading | Missing `shinyTextStyles.shinyText` on `isArgumentsStreaming \|\| isLoading` | | |
| Empty result card under header | Render returned `<div />` instead of `null` when no data | | |
| Layout jump when result arrives | Placeholder dimensions don't match Render dimensions | | |
| Approval dialog never appears | Manifest missing `humanIntervention`, or Intervention not in registry | | |
| Approval click doesn't wait for inline edit | Missing `registerBeforeApprove(id, flushFn)` | | |
| Portal opens but blank | Switch in `Portal/index.tsx` doesn't cover the apiName | | |
| Strings show as `builtins.lobe-foo.apiName.bar` | Missing i18n key in `src/locales/default/plugin.ts` (or not seeded in dev locale files) | | |
| Wrong color shade on `<Text type="secondary">` | `type='secondary'` is lighter than `colorTextSecondary` — pass via `style={{ color: cssVar.colorTextSecondary }}` | | |
@@ -1,36 +0,0 @@
# Tool UI Surfaces
A builtin tool can ship up to **six client-side surfaces**, each with a different role in the chat UI. Only `Inspector` is required; the other five are added on demand and registered in their own central files.
| Surface | Required? | When the chat shows it | Registered in |
| ------------ | --------- | --------------------------------------------------------------------- | --------------------------------------------- |
| Inspector | ✅ Always | Header strip of every tool call (one-line chip) | `packages/builtin-tools/src/inspectors.ts` |
| Render | Optional | Rich result card below the header, after the call returns | `packages/builtin-tools/src/renders.ts` |
| Placeholder | Optional | Skeleton between "args streaming complete" and "result arrives" | `packages/builtin-tools/src/placeholders.ts` |
| Streaming | Optional | Live output during execution (e.g. command stdout) | `packages/builtin-tools/src/streamings.ts` |
| Intervention | Optional | Approval / edit-before-run dialog (when `humanIntervention` triggers) | `packages/builtin-tools/src/interventions.ts` |
| Portal | Optional | Full-screen detail view (right-side or modal) | `packages/builtin-tools/src/portals.ts` |
The two reference tools to read end-to-end:
- **`builtin-tool-web-browsing/src/client/`** — Inspector + Render + Placeholder + Portal (no Intervention/Streaming).
- **`builtin-tool-local-system/src/client/`** — all six surfaces, including `components/` for shared building blocks.
---
## Files in this folder
Read **principles** and **shared-rules** first — they apply to every surface. Then jump to the surface you're building.
| File | What it covers |
| ---------------------------------- | ----------------------------------------------------------------------- |
| [principles.md](principles.md) | Design principles — when each surface exists and how far to take it |
| [shared-rules.md](shared-rules.md) | Cross-surface rules: component skeleton, styling, single-layer surfaces |
| [inspector.md](inspector.md) | Inspector — header chip (required) |
| [render.md](render.md) | Render — rich result card |
| [placeholder.md](placeholder.md) | Placeholder — skeleton between args and result |
| [streaming.md](streaming.md) | Streaming — live output during execution |
| [intervention.md](intervention.md) | Intervention — approval / edit-before-run |
| [portal.md](portal.md) | Portal — full-screen detail view |
| [composition.md](composition.md) | Shared subcomponents (`client/components/`) + package public API |
| [diagnostics.md](diagnostics.md) | Symptom → surface quick-lookup |
@@ -1,51 +0,0 @@
# Composition — Shared Components & Package API
## `client/components/` — Shared Subcomponents
Cross-cutting building blocks used by multiple surfaces live here, not duplicated in each surface folder.
Examples from `web-browsing/src/client/components/`:
- `CategoryAvatar.tsx` — search category icon
- `EngineAvatar.tsx` — search engine logo (used in Inspector chip + Render list + Portal header)
- `SearchBar.tsx` — editable query bar (used in Render and Portal)
Examples from `local-system/src/client/components/`:
- `FileItem.tsx` — single file row (used in ListFiles Render, SearchFiles Render, MoveLocalFiles Render)
- `FilePathDisplay.tsx` — path with truncation (used everywhere)
### Rules
- Live under `client/components/`, exported via `client/components/index.ts`.
- Re-export from `client/index.ts` only if other packages need them; otherwise keep internal.
- Keep them dumb — props in, JSX out, no store reads. The store reads belong in the surface that composes them.
---
## `client/index.ts` — Package Public API
Re-exports everything the registries need plus useful types/manifest:
```ts
// Inspector — required
export { TaskInspectors } from './Inspector';
// Render — only if any API has one
export { TaskRenders, CreateTaskRender, RunTasksRender } from './Render';
// Placeholder / Streaming / Intervention — only if used
export { LocalSystemListFilesPlaceholder, LocalSystemSearchFilesPlaceholder } from './Placeholder';
export { LocalSystemStreamings } from './Streaming';
export { LocalSystemInterventions } from './Intervention';
// Portal — single export per tool
export { default as WebBrowsingPortal } from './Portal';
// Reusable components if other packages need them
export { CategoryAvatar, EngineAvatar, SearchBar } from './components';
// Re-export manifest, identifier, types for convenience
export { TaskManifest, TaskIdentifier } from '../manifest';
export * from '../types';
```
@@ -1,15 +0,0 @@
# Diagnostic Quick-Lookup
| Symptom | Surface to check |
| ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------- |
| No header at all on the tool call | Inspector missing from `client/Inspector/index.ts` registry |
| Header shows the API name but no chips | Inspector missing `args?.X \|\| partialArgs?.X` fallback |
| Header doesn't pulse during loading | Missing `shinyTextStyles.shinyText` on `isArgumentsStreaming \|\| isLoading` |
| Empty result card under header | Render returned `<div />` instead of `null` when no data |
| Render looks "complex" / card-in-card | Filled container (`colorFillQuaternary`) wrapping more filled boxes — flatten to single-layer, see [shared-rules.md](shared-rules.md) |
| Layout jump when result arrives | Placeholder dimensions don't match Render dimensions |
| Approval dialog never appears | Manifest missing `humanIntervention`, or Intervention not in registry |
| Approval click doesn't wait for inline edit | Missing `registerBeforeApprove(id, flushFn)` |
| Portal opens but blank | Switch in `Portal/index.tsx` doesn't cover the apiName |
| Strings show as `builtins.lobe-foo.apiName.bar` | Missing i18n key in `src/locales/default/plugin.ts` (or not seeded in dev locale files) |
| Wrong color shade on `<Text type="secondary">` | `type='secondary'` is lighter than `colorTextSecondary` — pass via `style={{ color: cssVar.colorTextSecondary }}` |
@@ -1,118 +0,0 @@
# Inspector — Header Chip (required)
**Lifecycle:** Inspector renders for **every phase** of a tool call: while args are streaming in, while the executor is running, and after results come back. It's the only surface that's always visible.
**Goal:** keep it to a single line. Show what's happening with as much context as is currently available.
## Props (`BuiltinInspectorProps<Args, State>`)
```ts
interface BuiltinInspectorProps<Arguments = any, State = any> {
apiName: string;
args: Arguments; // final args (only after the assistant stops streaming)
identifier: string;
isArgumentsStreaming?: boolean; // args still arriving
isLoading?: boolean; // args complete, executor running
partialArgs?: Arguments; // partial JSON during streaming
pluginState?: State; // executor's `state` after success
result?: { content: string | null; error?: any };
}
```
## State machine
| Phase | What's available | What to show |
| ----------------------------------- | ---------------------------------------------------------- | ---------------------------------------------------------- |
| Args streaming, no useful field yet | `isArgumentsStreaming === true`, `partialArgs.X` undefined | Just the API title with `shinyTextStyles.shinyText` |
| Args streaming, key field arrived | `partialArgs.X` populated | Title + key field chip, still pulse-animated |
| Args complete, executor running | `args` populated, `isLoading === true` | Same as above, still pulse-animated |
| Result arrived | `pluginState` populated, `isLoading === false` | Title + chips + result summary (count, identifier, status) |
## Canonical example — Search
`packages/builtin-tool-web-browsing/src/client/Inspector/Search/index.tsx`:
```tsx
'use client';
import type { BuiltinInspectorProps, SearchQuery, UniformSearchResponse } from '@lobechat/types';
import { Text } from '@lobehub/ui';
import { cssVar, cx } from 'antd-style';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
import { highlightTextStyles, inspectorTextStyles, shinyTextStyles } from '@/styles';
export const SearchInspector = memo<BuiltinInspectorProps<SearchQuery, UniformSearchResponse>>(
({ args, partialArgs, isArgumentsStreaming, isLoading, pluginState }) => {
const { t } = useTranslation('plugin');
const query = args?.query || partialArgs?.query || '';
const resultCount = pluginState?.results?.length ?? 0;
const hasResults = resultCount > 0;
if (isArgumentsStreaming && !query) {
return (
<div className={cx(inspectorTextStyles.root, shinyTextStyles.shinyText)}>
<span>{t('builtins.lobe-web-browsing.apiName.search')}</span>
</div>
);
}
return (
<div
className={cx(
inspectorTextStyles.root,
(isArgumentsStreaming || isLoading) && shinyTextStyles.shinyText,
)}
>
<span>{t('builtins.lobe-web-browsing.apiName.search')}:&nbsp;</span>
{query && <span className={highlightTextStyles.primary}>{query}</span>}
{!isLoading &&
!isArgumentsStreaming &&
pluginState?.results &&
(hasResults ? (
<span style={{ marginInlineStart: 4 }}>({resultCount})</span>
) : (
<Text as="span" color={cssVar.colorTextDescription} fontSize={12}>
({t('builtins.lobe-web-browsing.inspector.noResults')})
</Text>
))}
</div>
);
},
);
SearchInspector.displayName = 'SearchInspector';
export default SearchInspector;
```
## Inspector rules
- Wrap the whole row with `inspectorTextStyles.root` (provides correct flex / line-height baseline).
- Pulse with `shinyTextStyles.shinyText` whenever `isArgumentsStreaming || isLoading`.
- Show the i18n title first so the row is non-empty during the earliest streaming phase.
- Read both `args?.X` and `partialArgs?.X` together — `args` is final, `partialArgs` is in-stream.
- Use chips/tags for distinct facets (identifier, name, parent, status, count). Each chip should clip with `text-overflow: ellipsis` and have a `max-width` so long values don't blow out the chat bubble.
- Append `pluginState`-derived suffixes only **after** loading finishes — count or "(no results)" should not appear while still searching.
- **Switch copy by phase.** If the verb implies an ongoing action ("Creating", "Searching", "Listing"), define `<api>.loading` and `<api>.completed` keys and select via `isArgumentsStreaming || isLoading ? loadingKey : completedKey`. Inspector chips persist in chat history — leaving "Creating task" frozen on a finished call reads as if the tool is still running. Read-only labels that are already noun-form ("View task") can keep a single key. See `CallSubAgentInspector` for the canonical two-key pattern.
## Inspector registry — `client/Inspector/index.ts`
```ts
import type { BuiltinInspector } from '@lobechat/types';
import { TaskApiName } from '../../types';
import { CreateTaskInspector } from './CreateTask';
import { ListTasksInspector } from './ListTasks';
/* … */
export const TaskInspectors: Record<string, BuiltinInspector> = {
[TaskApiName.createTask]: CreateTaskInspector as BuiltinInspector,
[TaskApiName.listTasks]: ListTasksInspector as BuiltinInspector,
/* one entry per ApiName */
};
export { CreateTaskInspector } from './CreateTask';
export { ListTasksInspector } from './ListTasks';
/* re-export each */
```
@@ -1,88 +0,0 @@
# Intervention — Approval / Edit-Before-Run (optional)
**Lifecycle:** rendered **before the executor runs** for APIs whose manifest sets `humanIntervention`. The user sees a preview of the args, can edit them, then approves or skips/cancels.
**Add for** destructive or sensitive ops: shell commands, file writes, file moves, payments, message broadcasts.
## Props (`BuiltinInterventionProps<Args>`)
```ts
interface BuiltinInterventionProps<Arguments = any> {
apiName?: string;
args: Arguments;
identifier?: string;
interactionMode?: 'approval' | 'custom';
messageId: string;
/** Called when the user edits the args; the approve action awaits this. */
onArgsChange?: (args: Arguments) => void | Promise<void>;
/** Called on approve / skip / cancel. */
onInteractionAction?: (
action:
| { type: 'submit'; payload: Record<string, unknown> }
| { type: 'skip'; payload?: Record<string, unknown>; reason?: string }
| { type: 'cancel'; payload?: Record<string, unknown> },
) => Promise<void>;
/** Register a callback to flush pending saves before approval. Returns cleanup. */
registerBeforeApprove?: (id: string, callback: () => void | Promise<void>) => () => void;
}
```
## Canonical example — RunCommand Intervention
`packages/builtin-tool-local-system/src/client/Intervention/RunCommand/index.tsx`:
```tsx
import type { RunCommandParams } from '@lobechat/electron-client-ipc';
import type { BuiltinInterventionProps } from '@lobechat/types';
import { Flexbox, Highlighter, Text } from '@lobehub/ui';
import { memo } from 'react';
const RunCommand = memo<BuiltinInterventionProps<RunCommandParams>>(({ args }) => {
const { description, command, timeout } = args;
return (
<Flexbox gap={8}>
<Flexbox horizontal justify="space-between">
{description && <Text>{description}</Text>}
{timeout && (
<Text style={{ fontSize: 12 }} type="secondary">
timeout: {formatTimeout(timeout)}
</Text>
)}
</Flexbox>
{command && (
<Highlighter wrap language="sh" showLanguage={false} variant="outlined">
{command}
</Highlighter>
)}
</Flexbox>
);
});
export default RunCommand;
```
## Intervention rules
- **Show a preview, not a form by default.** Editing UI is opt-in via `onArgsChange` and is usually inline (click to edit a code block, etc.).
- For args with debounced edit state (text fields), use `registerBeforeApprove(id, flushFn)` so the approve action waits for the debounce to flush. Always return the cleanup function.
- Call `onInteractionAction({ type: 'submit', payload })` when the user approves; `'skip'` if they skip with a reason; `'cancel'` if they cancel the whole turn.
- Add a corresponding `interventionAudit.ts` in the package root if the tool needs scope/path validation before approval (see `local-system/src/interventionAudit.ts`).
## Intervention registry — `client/Intervention/index.ts`
```ts
import { LocalSystemApiName } from '../..';
import EditLocalFile from './EditLocalFile';
import RunCommand from './RunCommand';
import WriteFile from './WriteFile';
/* … */
export const LocalSystemInterventions = {
[LocalSystemApiName.editLocalFile]: EditLocalFile,
[LocalSystemApiName.runCommand]: RunCommand,
[LocalSystemApiName.writeLocalFile]: WriteFile,
/* one entry per API that needs approval */
};
```
@@ -1,93 +0,0 @@
# Placeholder — Skeleton Between Args and Result (optional)
**Lifecycle:** rendered when the args have finished streaming but the executor hasn't returned yet. Disappears when `pluginState` arrives. Bridges the moment of perceived lag.
**Add for** APIs with noticeable execution time: web search, network crawl, file list, large grep. **Skip for** instant ops (status flips, calculator).
## Props (`BuiltinPlaceholderProps<Args>`)
```ts
interface BuiltinPlaceholderProps<T extends Record<string, any> = any> {
apiName: string;
args?: T;
identifier: string;
}
```
No `pluginState` — Placeholder lives entirely in the "executing" gap.
## Canonical example — Search Placeholder
`packages/builtin-tool-web-browsing/src/client/Placeholder/Search.tsx`:
```tsx
import type { BuiltinPlaceholderProps, SearchQuery } from '@lobechat/types';
import { Flexbox, Icon, Skeleton } from '@lobehub/ui';
import { createStaticStyles, cx } from 'antd-style';
import { SearchIcon } from 'lucide-react';
import { memo } from 'react';
import { useIsMobile } from '@/hooks/useIsMobile';
import { shinyTextStyles } from '@/styles';
const styles = createStaticStyles(({ css, cssVar }) => ({
query: cx(
css`
padding: 4px 8px;
border-radius: 8px;
font-size: 12px;
color: ${cssVar.colorTextSecondary};
&:hover {
background: ${cssVar.colorFillTertiary};
}
`,
shinyTextStyles.shinyText,
),
}));
export const Search = memo<BuiltinPlaceholderProps<SearchQuery>>(({ args }) => {
const { query } = args || {};
const isMobile = useIsMobile();
return (
<Flexbox gap={8}>
<Flexbox horizontal={!isMobile} gap={isMobile ? 8 : 40}>
<Flexbox horizontal align="center" className={styles.query} gap={8}>
<Icon icon={SearchIcon} />
{query ? query : <Skeleton.Block active style={{ height: 20, width: 40 }} />}
</Flexbox>
<Skeleton.Block active style={{ height: 20, width: 40 }} />
</Flexbox>
<Flexbox horizontal gap={12}>
{[1, 2, 3, 4, 5].map((id) => (
<Skeleton.Button active key={id} style={{ borderRadius: 8, height: 80, width: 160 }} />
))}
</Flexbox>
</Flexbox>
);
});
```
## Placeholder rules
- **Mirror the eventual Render's layout.** When the result arrives the Placeholder unmounts and the Render mounts; if they share dimensions, the chat doesn't jump.
- Use `Skeleton.Block` / `Skeleton.Button` from `@lobehub/ui` for placeholder shapes.
- Embed any args you have (e.g. the query text) — context helps the user know what's loading.
- Pulse with `shinyTextStyles.shinyText` if the Placeholder includes literal text.
## Placeholder registry — `client/Placeholder/index.ts`
```ts
import { WebBrowsingApiName } from '../../types';
import CrawlMultiPages from './CrawlMultiPages';
import CrawlSinglePage from './CrawlSinglePage';
import { Search } from './Search';
export const WebBrowsingPlaceholders = {
[WebBrowsingApiName.crawlMultiPages]: CrawlMultiPages,
[WebBrowsingApiName.crawlSinglePage]: CrawlSinglePage,
[WebBrowsingApiName.search]: Search,
};
export { CrawlMultiPages, CrawlSinglePage, Search };
```
@@ -1,71 +0,0 @@
# Portal — Full-Screen Detail View (optional)
**Lifecycle:** rendered when the user opens the tool message in a side panel or full-screen modal. One Portal per **tool**, not per API — the Portal switches on `apiName` internally.
**Add for** tools whose results deserve a deep-dive view: search results with editable filters, page content with reader mode, code interpreter sessions.
## Props (`BuiltinPortalProps<Args, State>`)
```ts
interface BuiltinPortalProps<Arguments = Record<string, any>, State = any> {
apiName?: string;
arguments: Arguments;
identifier: string;
messageId: string;
state: State;
}
```
## Canonical example — Web-Browsing Portal
`packages/builtin-tool-web-browsing/src/client/Portal/index.tsx`:
```tsx
import type { BuiltinPortalProps, CrawlPluginState, SearchQuery } from '@lobechat/types';
import { memo } from 'react';
import { WebBrowsingApiName } from '../../types';
import PageContent from './PageContent';
import PageContents from './PageContents';
import Search from './Search';
const Portal = memo<BuiltinPortalProps>(({ arguments: args, messageId, state, apiName }) => {
switch (apiName) {
case WebBrowsingApiName.search:
return <Search messageId={messageId} query={args as SearchQuery} response={state} />;
case WebBrowsingApiName.crawlSinglePage: {
const result = (state as CrawlPluginState).results.find((r) => r.originalUrl === args.url);
return <PageContent messageId={messageId} result={result} />;
}
case WebBrowsingApiName.crawlMultiPages:
return (
<PageContents
messageId={messageId}
results={(state as CrawlPluginState).results}
urls={args.urls}
/>
);
}
return null;
});
export default Portal;
```
## Portal rules
- One Portal per tool — the file is the routing layer, subcomponents implement each API's view.
- Portals can read the chat store directly to detect "still streaming" and render a Skeleton internally (see `Search/index.tsx:20-46`).
- Layout assumes more space than the Render — use `Flexbox` with `height={'100%'}` and structure for a side panel viewport.
## Portal registry — `packages/builtin-tools/src/portals.ts`
```ts
import { WebBrowsingManifest, WebBrowsingPortal } from '@lobechat/builtin-tool-web-browsing/client';
import { type BuiltinPortal } from '@lobechat/types';
export const BuiltinToolsPortals: Record<string, BuiltinPortal> = {
[WebBrowsingManifest.identifier]: WebBrowsingPortal as BuiltinPortal,
};
```
@@ -1,19 +0,0 @@
# Tool Render 设计原则(中文草案)
这些原则用于判断一个 builtin tool 的 Inspector / Render / Placeholder / Streaming / Intervention / Portal 应该做什么,以及做到什么程度。
1. **先保证折叠态可读。** 每个 API 都必须有 Inspector;用户不展开也应该能看懂 “正在做什么 / 对什么做 / 当前结果是什么”。Inspector 不应该只展示函数名和原始参数。
2. **Inspector 是一句话,不是详情页。** 优先表达动作、关键对象、数量、状态,例如 “分析图片 3 张”“搜索 12 个结果”“读取 config.json”。长文本、列表和结构化结果放到 Render 或 Portal。
3. **Inspector 要覆盖执行生命周期。** `args` 还在 streaming、工具执行中、执行完成、执行失败时都应该有稳定展示;必要时同时读取 `args``partialArgs``pluginState`,避免出现空白、跳变或只显示半截参数。
4. **文案要随状态切换时态。** 同一个动作在 loading 与 completed 两个阶段必须用不同的措辞:执行中用现在进行时(“正在创建任务 / Creating task / 正在搜索”),执行完成后切到完成态(“已创建任务 / Task created / 已找到 N 条”)。Inspector chip 会一直留在聊天记录里 —— 如果一直挂着 “正在 xxx”,几小时后回看历史时会读起来像还在跑。约定的 i18n 形式是 `<api>.loading` / `<api>.completed` 一对键(见 `lobe-agent.apiName.callSubAgent.{loading,completed}``lobe-claude-code.task.{create,list,update,get}.{loading,completed}`),渲染时按 `isArgumentsStreaming || isLoading` 决定取哪一个。只读 / 查询类(“查看任务” 这种本来就是名词性的)可以共用一个键。
5. **只有结构化结果才需要 Render。** 如果工具结果只是自然语言总结,通常不需要 Render;如果结果包含列表、媒体、文件、表格、代码、diff、地图、时间线、权限请求等结构,就应该提供 Render。
6. **Render 要帮助用户检查结果,而不是复述参数。** Render 的主体应该围绕工具产物组织:可预览、可比较、可筛选、可定位。参数只作为上下文辅助出现,不要把 Render 做成一块更大的 args dump。
7. **参数和结果要一起参与渲染。** 好的 Tool UI 通常同时用 `args` 解释意图,用 `pluginState` 展示真实执行结果;但 `pluginState` 只放结果域数据,不要反向塞入可以从 `args` 推导出的内容。
8. **慢操作要有 Placeholder。** 如果工具通常需要等待网络、文件系统、模型或外部进程,Placeholder 应该先占住最终 Render 的版式,让用户知道即将看到什么,而不是只显示一个泛化 loading。
9. **Streaming 只用于连续产物。** 搜索列表、日志、长文本、文件分析、分阶段计划适合 Streaming;一次性小结果不需要强行做 Streaming。Streaming UI 要能渐进追加,并且完成后自然过渡到最终 Render。
10. **有风险的动作必须 Intervention。** 写文件、删除、发送、安装、执行命令、外部可见操作、权限敏感操作,都应该在执行前给出可理解的确认界面;确认文案要说明影响范围,而不是只问 “是否继续”。
11. **错误、空态和截断都是正式状态。** Render 不能在失败、无结果、超长结果时退化成空白。错误要说明发生在哪一步;空态要告诉用户没有产物;超长内容要明确 “展示前 N 项 / 还有 N 项”。
12. **信息密度要克制。** 默认展示最有判断价值的部分:标题、来源、状态、摘要、少量关键字段。大对象、长列表、原文、调试数据放进可展开区域或 Portal,避免把聊天流撑成后台管理页。
13. **视觉上融入聊天流。** Tool UI 应该使用 `@lobehub/ui` / base-ui、`Flexbox``createStaticStyles``cssVar.*`,遵循现有间距、圆角、颜色、字号;不要为单个工具发明一套独立视觉语言。具体的样式约定见 [shared-rules.md](shared-rules.md)。
14. **Devtools fixture 是验收入口。** 新增或修改 Tool UI 时,应在 `/devtools` 里准备覆盖典型态、loading/streaming、空态、错误态、长内容态的 fixture;一个 API 如果在真实聊天里会出现,就不应该在 devtools 中缺席。
15. **先做用户会看的 UI,再做调试 UI。** Raw JSON、trace、schema、内部 id 可以存在,但应默认收起或放到调试区;主界面先回答用户最关心的问题:工具做了什么,结果值不值得信任,下一步能做什么。
@@ -1,101 +0,0 @@
# Render — Rich Result Card (optional)
**Lifecycle:** rendered **once the result arrives** (after Placeholder/Streaming hand off). Sits below the Inspector header.
**Skip if** the API is read-only or the result is just text — the framework already shows the executor's `content` string. Add a Render only when there's a structured artifact worth seeing: a card, a chart, a diff, a list of files.
## Props (`BuiltinRenderProps<Args, State, Content>`)
```ts
interface BuiltinRenderProps<Arguments = any, State = any, Content = any> {
apiName?: string;
args: Arguments; // final params from the LLM
content: Content; // executor's content string (or parsed)
identifier?: string;
messageId: string; // for store lookups
pluginError?: any; // from BuiltinToolResult.error
pluginState?: State; // executor's state
toolCallId?: string;
}
```
## Two patterns
**Pattern A — Single-file Render** (web-browsing CrawlSinglePage):
```tsx
// client/Render/CrawlSinglePage.tsx
import type { BuiltinRenderProps, CrawlPluginState, CrawlSinglePageQuery } from '@lobechat/types';
import { memo } from 'react';
import PageContent from './PageContent';
const CrawlSinglePage = memo<BuiltinRenderProps<CrawlSinglePageQuery, CrawlPluginState>>(
({ messageId, pluginState, args }) => (
<PageContent messageId={messageId} results={pluginState?.results} urls={[args?.url]} />
),
);
export default CrawlSinglePage;
```
**Pattern B — Folder with subcomponents** (web-browsing Search):
```
client/Render/Search/
├── index.tsx # composes the subcomponents, handles error states
├── ConfigForm.tsx # appears when pluginError.type === 'PluginSettingsInvalid'
├── SearchQuery.tsx # editable query header
└── SearchResult.tsx # result list
```
Use Pattern B when the Render has internal state (editing mode, expanded items), error variants, or is large enough to benefit from splitting.
## Error handling in Render
Renders are the canonical place to surface `pluginError` because the chat doesn't auto-render typed errors:
```tsx
if (pluginError) {
if (pluginError?.type === 'PluginSettingsInvalid') {
return <ConfigForm id={messageId} provider={pluginError.body?.provider} />;
}
return (
<Alert
title={pluginError?.message}
type="error"
extra={<Highlighter language="json">{JSON.stringify(pluginError.body, null, 2)}</Highlighter>}
/>
);
}
```
## Render rules
- **Return `null`** if there's nothing useful to draw yet (avoids empty cards during stream).
- Use `pluginState` for server-truth (ids, counts, server-assigned status) and `args` for what the LLM asked. **Combine — neither alone is enough.**
- For lists, summarize with a header line and show top N items with a "+N more" tail rather than rendering everything.
- **Keep the Render single-layer** — the tool card is already your surface, so don't open with your own filled container and then nest more filled boxes inside it. See [shared-rules.md](shared-rules.md) → "Stay single-layer".
- For modals from a Render, use `@lobehub/ui/base-ui` (`createModal`, `useModalContext`, `confirmModal`) — see the **modal** skill.
## Render registry — `client/Render/index.ts`
```ts
import type { BuiltinRender } from '@lobechat/types';
import { TaskApiName } from '../../types';
import CreateTaskRender from './CreateTask';
import RunTasksRender from './RunTasks';
export const TaskRenders: Record<string, BuiltinRender> = {
[TaskApiName.createTask]: CreateTaskRender as BuiltinRender,
[TaskApiName.runTasks]: RunTasksRender as BuiltinRender,
/* only the APIs with rich result UI — others fall back to text content */
};
export { default as CreateTaskRender } from './CreateTask';
export { default as RunTasksRender } from './RunTasks';
```
## Render display control (rare)
If the Render should hide for certain results (e.g. ClaudeCode's TodoWrite hides when the agent is mid-stream), add a `RenderDisplayControl` to `packages/builtin-tools/src/displayControls.ts`. See `ClaudeCodeRenderDisplayControls` for the pattern.
@@ -1,89 +0,0 @@
# Shared Style Rules
These apply across every surface.
## The component skeleton
Every surface file is the same shape, so internalize it once instead of re-deriving it per rule. The skeleton below bakes in five mechanical conventions — copy it and fill the body:
```tsx
'use client'; // (a) leaves of the chat tree must not block server rendering
import type { BuiltinInspectorProps, SearchQuery, UniformSearchResponse } from '@lobechat/types';
import { memo } from 'react';
import { useTranslation } from 'react-i18next';
// (b) type with BuiltinXProps<Args, State> — never widen to `any`.
// Args = the JSON Schema params, State = the executor's `state` field;
// they should match <Name>Params / <Name>State from types.ts.
export const SearchInspector = memo<BuiltinInspectorProps<SearchQuery, UniformSearchResponse>>(
({ args, pluginState }) => {
const { t } = useTranslation('plugin'); // (c) all strings from the `plugin` namespace
// (d) cross-cutting state (loading, streaming buffer) comes from the store,
// not props — props only carry args/state/messageId.
// const buffer = useChatStore((s) => chatToolSelectors.streamingBuffer(messageId)(s));
return <span>{t('builtins.<identifier>.apiName.search')}</span>;
},
);
SearchInspector.displayName = 'SearchInspector'; // (e) always memo + displayName
export default SearchInspector;
```
- **(c)** Default an Inspector to `t('builtins.<identifier>.apiName.<api>')` so the row is non-empty while args stream in.
- **(d)** Read the store via Zustand selectors inside the component; see [streaming.md](streaming.md) for the buffer selector.
## Styling: `createStaticStyles + cssVar.*`, `@lobehub/ui` over `antd`
Zero-runtime CSS-in-JS — styles compile once and read CSS variables at runtime:
```tsx
import { createStaticStyles, cssVar } from 'antd-style';
const styles = createStaticStyles(({ css, cssVar }) => ({
chip: css`
padding-block: 2px;
padding-inline: 8px;
border-radius: 999px;
color: ${cssVar.colorText};
background: ${cssVar.colorFillTertiary};
`,
}));
```
- Fall back to `createStyles + token` only when you need runtime token computation (rare). Inline `style={{ color: cssVar.colorTextSecondary }}` is fine for one-off dynamic values.
- Components come from `@lobehub/ui` (`Block`, `Text`, `Flexbox`, `Highlighter`, `Alert`, `Tooltip`, `Skeleton`), not raw `antd`. Modals come from `@lobehub/ui/base-ui` (`createModal`, `useModalContext`, `confirmModal`) — see the **modal** skill.
- Note: `<Text type='secondary'>` is a lighter shade than `colorTextSecondary`. For that exact token color, write `<Text style={{ color: cssVar.colorTextSecondary }}>`.
## Stay single-layer — don't nest filled cards
The framework already wraps every Render / Intervention in a tool card, so that card **is** your surface. A Render that opens with its own `background: ${cssVar.colorFillQuaternary}` container is already one card deep; put another filled box inside it (`colorBgContainer` / `colorFillTertiary`) and you get the card-in-card look that reads as "complex" — two or three stacked fills for what is really a flat list of fields.
- **The outermost wrapper carries no fill.** Use a flat container with only `padding-block: 4px` for breathing room; let the tool card provide the card. (See `Agent/index.tsx`'s `container`.)
- **At most one filled box, and only to delineate real content** — a Markdown preview, a diff, a code/result block. Labels, keyvalue fields, question/answer text, chips: render flat on the surface, separated by spacing or a hairline divider (`height: 1px; background: ${cssVar.colorFillSecondary}`), not by wrapping each in its own box.
- **A box on a flat surface needs a visible fill.** Once the outer fill is gone, an inner `colorBgContainer` box can vanish against the tool card (same color). Use `colorFillTertiary` for the one content box so it still reads as delineated.
- Don't wrap a single value in a box just to give it padding — that's the redundant-nesting smell (a `detailCard` around a `value` box around one string).
```tsx
// ❌ card-in-card: filled container wrapping a filled preview box
container: css`
padding: 12px;
background: ${cssVar.colorFillQuaternary};
`,
previewBox: css`
background: ${cssVar.colorBgContainer};
`,
// ✅ single-layer: flat container, one visible content box
container: css`
padding-block: 4px;
`,
previewBox: css`
background: ${cssVar.colorFillTertiary};
`,
```
For the common "icon + file/title header, then one content box" shape, reuse `ToolResultCard` from `@lobechat/shared-tool-ui/components` instead of rebuilding it — it's already single-layer (flat wrapper, one `colorFillTertiary` content box) and is what CC `Read` / `Grep` / `Glob` / `Write` / `WebSearch` / `WebFetch` render through.
The exception is a deliberate **panel** pattern — an `<Block variant="outlined">` with a header bar + list rows (CC `TodoWrite` / `Task`). There the single outlined block is the panel and the header fill is a header bar, not a nested card. One structured panel is fine; stacked decorative fills are not.
@@ -1,83 +0,0 @@
# Streaming — Live Output During Execution (optional)
**Lifecycle:** rendered **while the executor is still running** for APIs that emit incremental output. The component is responsible for fetching the in-flight stream from the chat store and rendering it.
**Add for** long-running ops with continuous output: shell command execution (stdout/stderr), file write progress, code interpreter cells.
## Props (`BuiltinStreamingProps<Args>`)
```ts
interface BuiltinStreamingProps<Arguments = any> {
apiName: string;
args: Arguments;
identifier: string;
messageId: string; // use to fetch the streaming buffer from store
toolCallId: string;
}
```
Note there's **no `state` or `result` prop** — the Streaming component is for the in-flight phase. It pulls the live buffer from the store itself (typically via `chatToolSelectors.streamingContent(messageId)` or similar).
## Canonical example — RunCommandStreaming
`packages/builtin-tool-local-system/src/client/Streaming/RunCommand/index.tsx`:
```tsx
'use client';
import type { BuiltinStreamingProps } from '@lobechat/types';
import { Highlighter } from '@lobehub/ui';
import { memo } from 'react';
interface RunCommandParams {
command?: string;
description?: string;
timeout?: number;
}
export const RunCommandStreaming = memo<BuiltinStreamingProps<RunCommandParams>>(({ args }) => {
const { command } = args || {};
if (!command) return null;
return (
<Highlighter
animated
wrap
language="sh"
showLanguage={false}
style={{ padding: '4px 8px' }}
variant="outlined"
>
{command}
</Highlighter>
);
});
RunCommandStreaming.displayName = 'RunCommandStreaming';
```
For real-time output beyond just the command (stderr/stdout streaming), pull from the chat store:
```tsx
const buffer = useChatStore((state) =>
chatToolSelectors.streamingBuffer(messageId, toolCallId)(state),
);
```
## Streaming rules
- Render `null` until you have something to display (avoids flash).
- For terminal-style output, use `Highlighter` with `animated` to show typing-like effect.
- The Streaming component must **unmount cleanly** when execution ends — typically the framework swaps it out for the Render automatically.
## Streaming registry — `client/Streaming/index.ts`
```ts
import { LocalSystemApiName } from '../..';
import { RunCommandStreaming } from './RunCommand';
import { WriteFileStreaming } from './WriteFile';
export const LocalSystemStreamings = {
[LocalSystemApiName.runCommand]: RunCommandStreaming,
[LocalSystemApiName.writeLocalFile]: WriteFileStreaming,
};
```
+8 -1
View File
@@ -1,6 +1,13 @@
---
name: chat-sdk
description: 'Build multi-platform chat bots with the chat SDK. Use for Slack, Teams, Google Chat, Discord, GitHub, Linear bots, webhooks, mentions, slash commands, cards, modals, or streaming responses.'
description: >
Build multi-platform chat bots with Chat SDK (`chat` npm package). Use when developers want to
(1) Build a Slack, Teams, Google Chat, Discord, GitHub, or Linear bot,
(2) Use the Chat SDK to handle mentions, messages, reactions, slash commands, cards, modals, or streaming,
(3) Set up webhook handlers for chat platforms,
(4) Send interactive cards or stream AI responses to chat platforms.
Triggers on "chat sdk", "chat bot", "slack bot", "teams bot", "discord bot", "@chat-adapter",
building bots that work across multiple chat platforms.
user-invocable: false
---
+58 -11
View File
@@ -29,9 +29,10 @@ Standard workflow for verifying backend changes using the LobeHub CLI (`lh`) aga
## Quick Reference
All CLI dev commands run from `lobehub/apps/cli/`. Subsequent examples use `$CLI`:
All CLI dev commands run from `lobehub/apps/cli/`:
```bash
# Shorthand for all commands below
CLI="LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts"
```
@@ -39,14 +40,17 @@ CLI="LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts"
### Step 1: Ensure Dev Server is Running
Check if the dev server is already running:
```bash
curl -s -o /dev/null -w '%{http_code}' http://localhost:3011/ 2> /dev/null
```
- **If reachable**: skip to Step 2.
- **If unreachable**: start from cloud repo root:
- **If reachable** (returns any HTTP status): server is running. Skip to Step 2.
- **If unreachable**: start the server:
```bash
# From cloud repo root
pnpm run dev:next
```
@@ -61,33 +65,37 @@ pnpm run dev:next
### Step 2: Check CLI Authentication
Check if dev credentials already exist:
```bash
cat lobehub/apps/cli/.lobehub-dev/settings.json 2> /dev/null
```
- **If file exists and contains `"serverUrl": "http://localhost:3011"`**: skip to Step 3.
- **If missing or wrong server**: ask the user to run:
- **If file exists and contains `"serverUrl": "http://localhost:3011"`**: already authenticated. Skip to Step 3.
- **If file missing or points to wrong server**: login is needed. Ask the user to run:
```bash
! cd lobehub/apps/cli && LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts login --server http://localhost:3011
```
> Login requires interactive browser authorization (OIDC Device Code Flow), so the user must run it themselves via `!` prefix. Credentials persist in `lobehub/apps/cli/.lobehub-dev/`.
> Login requires interactive browser authorization (OIDC Device Code Flow), so the user must run it themselves via `!` prefix. After login, credentials are saved to `lobehub/apps/cli/.lobehub-dev/` and persist across sessions.
### Step 3: Test with CLI Commands
CLI runs from source, so CLI-side code changes take effect immediately without rebuilding.
CLI runs from source (`bun src/index.ts`), so CLI-side code changes take effect immediately without rebuilding.
```bash
cd lobehub/apps/cli
$CLI <command>
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts <command>
```
### Step 4: Clean Up Test Data
Delete any test data created during verification:
```bash
$CLI task delete < id > -y
$CLI agent delete < id > -y
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts task delete < id > -y
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts agent delete < id > -y
```
## Common Testing Patterns
@@ -95,30 +103,51 @@ $CLI agent delete < id > -y
### Task System
```bash
# List tasks
$CLI task list
# Create test data with nesting
$CLI task create -n "Root Task" -i "Test instruction"
$CLI task create -n "Child Task" -i "Sub instruction" --parent T-1
# View task detail (tests getTaskDetail service)
$CLI task view T-1
# View task tree
$CLI task tree T-1
# Test lifecycle
$CLI task edit T-1 --status running
$CLI task comment T-1 -m "Test comment"
# Clean up
$CLI task delete T-1 -y
```
### Agent System
```bash
# List agents
$CLI agent list
# View agent detail
$CLI agent view <agent-id>
# Run agent (tests agent execution pipeline)
$CLI agent run <agent-id> -m "Test prompt"
```
### Document & Knowledge Base
```bash
# List documents
$CLI doc list
# Create and view
$CLI doc create -t "Test Doc" -c "Content here"
$CLI doc view <doc-id>
# Knowledge base
$CLI kb list
$CLI kb tree <kb-id>
```
@@ -126,13 +155,18 @@ $CLI kb tree <kb-id>
### Model & Provider
```bash
# List models and providers
$CLI model list
$CLI provider list
# Test provider connectivity
$CLI provider test <provider-id>
```
## Dev-Test Cycle
The standard cycle for backend development:
```
1. Make code changes (service/model/router/type)
|
@@ -143,7 +177,7 @@ $CLI provider test <provider-id>
lsof -ti:3011 | xargs kill && pnpm run dev:next
|
4. CLI verification (end-to-end)
$CLI <command>
LOBEHUB_CLI_HOME=.lobehub-dev bun src/index.ts <command>
|
5. Clean up test data
```
@@ -159,6 +193,10 @@ $CLI provider test <provider-id>
| `lobehub/apps/cli/` (CLI code) | No |
| `src/` (cloud overrides) | Yes |
### When Server Restart is NOT Needed
CLI runs from source via `bun src/index.ts`, so any changes to `lobehub/apps/cli/src/` take effect immediately on next command invocation.
## Troubleshooting
| Issue | Solution |
@@ -169,3 +207,12 @@ $CLI provider test <provider-id>
| CLI shows old data/behavior | Server needs restart to pick up code changes |
| `EADDRINUSE` on port 3011 | Server already running; kill with `lsof -ti:3011 \| xargs kill` |
| Login opens wrong server | Must use `--server http://localhost:3011` flag (env var doesn't work) |
## Credential Isolation
| Mode | Credential Dir | Server |
| ---------- | -------------------------------- | ----------------- |
| Dev | `lobehub/apps/cli/.lobehub-dev/` | `localhost:3011` |
| Production | `~/.lobehub/` | `app.lobehub.com` |
The two environments are completely isolated. Dev mode credentials are gitignored.
@@ -1,6 +1,6 @@
---
name: data-fetching-architecture
description: 'LobeHub data-fetching pipeline guide. Use for service layer, Zustand store, SWR, lambdaClient, useClientDataSWR, useFetchXxx hooks, or migrating useEffect fetches.'
name: data-fetching
description: Data fetching architecture guide using Service layer + Zustand Store + SWR. Use when implementing data fetching, creating services, working with store hooks, or migrating from useEffect. Triggers on data loading, API calls, service creation, or store data fetching tasks.
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: db-migrations
description: 'Use for Drizzle migrations: schema/table/column changes, migration generation or regeneration, sequence conflicts after rebase, idempotent SQL review, or migration renames.'
description: 'Use when generating or regenerating Drizzle migration files, changing database schema tables or columns, resolving migration sequence conflicts after rebase, reviewing migration SQL for idempotent patterns, or renaming migration files.'
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: debug-package
description: 'LobeHub debug package and log namespace guide. Use when adding debug() logging, choosing lobe-* namespaces, troubleshooting DEBUG output, localStorage.debug, or log format specifiers.'
description: "Guide for the `debug` npm package and LobeHub log namespaces (lobe-server:*, lobe-desktop:*, lobe-client:*, lobe-*-router:*). Use whenever adding a `debug(...)` logger, picking a namespace for new server/desktop/client/router code, troubleshooting why DEBUG=lobe-* logs don't show up, or when the user asks to 'add logging', 'add a logger', 'instrument this', 'trace this call', 'why isn't my log printing', or mentions `debug(`, `DEBUG=`, `localStorage.debug`, or log format specifiers like %O / %o / %s / %d in a LobeHub codebase."
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: docs-changelog
description: 'Write website changelog pages under docs/changelog/*.mdx. Use for EN/ZH product update posts, changelog posts, update-log copy, or docs changelog edits; not GitHub Release notes.'
description: 'Writing guide for website changelog pages under docs/changelog/*.mdx. Use when creating or editing product update posts in EN/ZH. Not for GitHub Release notes.'
---
# Docs Changelog Writing Guide
+4 -92
View File
@@ -1,6 +1,6 @@
---
name: drizzle
description: 'LobeHub Drizzle ORM schema and query style. Use for pgTable schemas, indexes, joins, inferred types, db.select/db.query, schema fields, foreign keys, junction tables, or postgres query patterns.'
description: "Drizzle ORM schema authoring and query style for LobeHub (postgres, strict mode). Use when editing anything under `src/database/schemas/`, defining `pgTable` columns/indexes/junction tables, spreading `...timestamps`, generating `createInsertSchema`/`$inferSelect`/`$inferInsert` types, writing `db.select().from(...).leftJoin(...)` queries, or deciding when to split a relational `with:` into two queries. Triggers on `pgTable`, `db.select`, `db.query`, `eq()`/`and()`/`inArray()`, `uniqueIndex`, `primaryKey`, `references({ onDelete })`, 'add a column', 'new table', 'foreign key', 'junction table', 'schema field'. For migration files specifically, see the `db-migrations` skill."
user-invocable: false
---
@@ -9,13 +9,13 @@ user-invocable: false
## Configuration
- Config: `drizzle.config.ts`
- Schemas: `packages/database/src/schemas/`
- Migrations: `packages/database/migrations/`
- Schemas: `src/database/schemas/`
- Migrations: `src/database/migrations/`
- Dialect: `postgresql` with `strict: true`
## Helper Functions
Location: `packages/database/src/schemas/_helpers.ts`
Location: `src/database/schemas/_helpers.ts`
- `timestamptz(name)`: Timestamp with timezone
- `createdAt()`, `updatedAt()`, `accessedAt()`: Standard timestamp columns
@@ -174,94 +174,6 @@ const rows = await this.db
.groupBy(agentEvalDatasets.id);
```
### Raw SQL and Advanced Queries
Prefer Drizzle builders whenever the query can be expressed clearly with `select`,
`insert().select()`, `update().from()`, joins, CTEs, `groupBy`, and typed selected
columns. This keeps table and column references tied to schema definitions, so
schema changes are more likely to surface as TypeScript errors.
Expression-level `sql<T>` is fine inside a Drizzle builder for PostgreSQL features
that do not have a dedicated helper, such as JSON path extraction, casts, aggregate
expressions, `CASE`, `NOW()`, or advisory locks. Row locks are query clauses, not
expressions; use the select builder's `.for('update')` instead of raw
`FOR UPDATE` SQL fragments.
When refactoring raw SQL:
- Preserve the original query shape for latency-sensitive paths. If raw SQL is one
database roundtrip, do not replace it with multiple depth-based queries just to
remove `execute`.
- Use `$with(...)` plus `insert().select()` / `update().from()` for multi-step
single-roundtrip writes when Drizzle can express the data flow.
- Avoid generic `execute<MyRow>(sql...)` as the main safety mechanism. It types the
returned rows, but it does not keep selected columns in sync with schema changes.
- If the only clean implementation is a PostgreSQL feature that Drizzle cannot
express well, keep the raw SQL and tighten it instead: use schema references in
interpolations, explicit user scope, a narrow row interface, and regression tests.
Recursive CTEs are a special case: current Drizzle usage in this repo does not have
a clean `WITH RECURSIVE` builder pattern. Keep recursive CTE raw SQL when replacing
it would add extra database roundtrips or materially worsen performance.
Example: convert an aggregate query when Drizzle can preserve one roundtrip:
```typescript
// ✅ Good: builder owns table and column references; sql<T> stays expression-level.
const rows = await trx
.select({
model: messages.model,
provider: messages.provider,
totalCost: sql<string | null>`sum((${messages.metadata}->'usage'->>'cost')::numeric)`.as(
'totalCost',
),
})
.from(messages)
.where(
and(
eq(messages.topicId, topicId),
eq(messages.userId, userId),
eq(messages.role, 'assistant'),
sql`${messages.metadata} ? 'usage'`,
),
)
.groupBy(messages.provider, messages.model);
```
Example: use the select lock builder for row locks:
```typescript
const [user] = await trx.select().from(users).where(eq(users.id, userId)).for('update');
```
Example: keep a recursive CTE raw when replacing it would add depth-based DB
roundtrips:
```typescript
interface TaskTreeRow {
id: string;
parent_task_id: string | null;
}
// execute<T> is acceptable here only because Drizzle has no clean WITH RECURSIVE
// builder; a builder rewrite would add depth-based roundtrips. Keep schema refs in
// the interpolations and scope every leg to the user.
const { rows } = await db.execute<TaskTreeRow>(sql`
WITH RECURSIVE task_tree AS (
SELECT ${tasks.id}, ${tasks.parentTaskId}
FROM ${tasks}
WHERE ${tasks.id} = ${rootTaskId}
AND ${tasks.createdByUserId} = ${userId}
UNION ALL
SELECT ${tasks.id}, ${tasks.parentTaskId}
FROM ${tasks}
JOIN task_tree ON ${tasks.parentTaskId} = task_tree.id
WHERE ${tasks.createdByUserId} = ${userId}
)
SELECT * FROM task_tree
`);
```
### One-to-Many (Separate Queries)
When you need a parent record with its children, use two queries instead of relational `with:`:
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: heterogeneous-agent
description: 'Implement or debug LobeHub heterogeneous agents. Use for Claude Code/Codex adapters, external CLI agents, event mapping, IPC, persistence, tool-call chains, sessions, traces, or adapter bugs.'
description: Guide for implementing and debugging LobeHub heterogeneous agent integrations such as Claude Code, Codex, and future external CLI agents. Use when working on adapter event mapping, Electron IPC transport, renderer persistence, tool-call chaining, subagent threads, resume/session handling, or regressions like mixed multi-tool messages, broken step boundaries, stuck tool loading, and orphan tool messages. Triggers on 'heterogeneous agent', 'hetero agent', '异构 agent', 'claude code adapter', 'codex adapter', 'external agent CLI', '孤立 tool 消息', 'raw Codex trace', or adapter/executor bugs.
---
# Heterogeneous Agent Development
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: hotkey
description: 'Add or edit LobeHub keyboard shortcuts. Use for HotkeyEnum, HOTKEYS_REGISTRATION, combineKeys, useHotkeyById, tooltip hotkeys, shortcut scope, conflicts, or Cmd/Ctrl key combos.'
description: "Adding or editing keyboard shortcuts in LobeHub. Use when registering a new hotkey, changing a key combo, scoping a shortcut to chat vs global, or wiring a hotkey hook + tooltip. Covers the 5-step flow: add to `HotkeyEnum` in `src/types/hotkey.ts`, register in `HOTKEYS_REGISTRATION` (`src/const/hotkeys.ts`) with `combineKeys([Key.Mod, …])`, add i18n in `src/locales/default/hotkey.ts`, expose via `useHotkeyById` in `src/hooks/useHotkeys/`, and render `<Tooltip hotkey={…}>`. Triggers on `HotkeyEnum`, `HOTKEYS_REGISTRATION`, `useHotkeyById`, `combineKeys`, `Key.Mod`/`Key.Shift`, 'add a hotkey', 'add a shortcut', '加快捷键', '快捷键', 'Cmd+K', 'keyboard shortcut', 'hotkey scope', 'hotkey conflict'."
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: i18n
description: 'LobeHub i18n with react-i18next. Use for user-facing strings, locale keys, namespaces, useTranslation, t(), interpolation, zh-CN/en-US previews, hardcoded UI copy, or pnpm i18n.'
description: "LobeHub internationalization with react-i18next. Use when adding any user-facing string in `.tsx`/`.ts` files, creating or renaming a key under `src/locales/default/{namespace}.ts`, deciding the `{feature}.{context}.{action}` flat-key pattern, wiring a new namespace into `src/locales/default/index.ts`, or translating zh-CN/en-US JSON for dev preview. Triggers on `useTranslation`, `t('foo.bar')`, `i18next.t`, `{{variable}}` interpolation, hardcoded UI strings (zh or en) that should be extracted, 'add i18n', '加 i18n key', '翻译', 'locale key', 'namespace', 'pnpm i18n'."
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: linear
description: 'Linear issue management. Use for LOBE-xxx issues, Linear links, PRs referencing Linear, retrieving issues, updating status, completion comments, or sub-issue trees.'
description: "Linear issue management. Use when the user mentions LOBE-xxx issue IDs (e.g. LOBE-4540), says 'linear' / 'linear issue' / 'link linear', or when creating PRs that reference Linear issues. Covers retrieving issues, updating status, adding completion comments, and creating sub-issue trees."
user-invocable: false
---
+33 -74
View File
@@ -397,60 +397,35 @@ The pattern is the same for every platform:
Pick the file for your target platform — each contains activation, navigation, send-message, and verification snippets specific to that app:
Each channel has its own folder under `bot/<channel>/` containing an `index.md`
(activation, navigation, send-message, and verification snippets specific to
that app) and its test script:
| Platform | Reference | Quick switcher |
| ------------- | -------------------------------------------------- | -------------- |
| Discord | [references/discord.md](./references/discord.md) | `Cmd+K` |
| Slack | [references/slack.md](./references/slack.md) | `Cmd+K` |
| Telegram | [references/telegram.md](./references/telegram.md) | `Cmd+F` |
| WeChat / 微信 | [references/wechat.md](./references/wechat.md) | `Cmd+F` |
| Lark / 飞书 | [references/lark.md](./references/lark.md) | `Cmd+K` |
| QQ | [references/qq.md](./references/qq.md) | `Cmd+F` |
| Platform | Reference | Quick switcher |
| ------------- | ------------------------------------------------ | -------------- |
| Discord | [bot/discord/index.md](./bot/discord/index.md) | `Cmd+K` |
| Slack | [bot/slack/index.md](./bot/slack/index.md) | `Cmd+K` |
| Telegram | [bot/telegram/index.md](./bot/telegram/index.md) | `Cmd+F` |
| WeChat / 微信 | [bot/wechat/index.md](./bot/wechat/index.md) | `Cmd+F` |
| Lark / 飞书 | [bot/lark/index.md](./bot/lark/index.md) | `Cmd+K` |
| QQ | [bot/qq/index.md](./bot/qq/index.md) | `Cmd+F` |
For **shared osascript patterns** (activate, type, paste, screenshot, read accessibility, common workflow template, gotchas), see [bot/osascript-common.md](./bot/osascript-common.md). Read this first if you're new to osascript automation.
## Bridge-based channels (no native app)
Some channels have no native app to drive with osascript — they connect through
a local bridge inside the Desktop app. These are tested with agent-browser
(IPC + UI) plus the bridge's own HTTP/REST endpoints, not osascript:
| Channel | Reference | What it drives |
| -------- | ------------------------------------------------ | -------------------------------------------------------- |
| iMessage | [bot/imessage/index.md](./bot/imessage/index.md) | `imessageBridge.*` IPC + local bridge + BlueBubbles REST |
For iMessage there is a one-shot regression script — see `test-imessage-bridge.sh` below.
For **shared osascript patterns** (activate, type, paste, screenshot, read accessibility, common workflow template, gotchas), see [references/osascript-common.md](./references/osascript-common.md). Read this first if you're new to osascript automation.
---
# Scripts
**App / recording scripts** in `.agents/skills/local-testing/scripts/`:
Ready-to-use scripts in `.agents/skills/local-testing/scripts/`:
| Script | Usage |
| ------------------------- | --------------------------------------------------- |
| `electron-dev.sh` | Manage Electron dev env (start/stop/status/restart) |
| `capture-app-window.sh` | Capture screenshot of a specific app window |
| `record-electron-demo.sh` | Record Electron app demo with ffmpeg |
| `record-app-screen.sh` | Record app screen (video + screenshots, start/stop) |
**Bot scripts** live under `.agents/skills/local-testing/bot/`, one folder per
channel (alongside that channel's `index.md`). The shared
`capture-app-window.sh` sits at the `bot/` root:
| Script | Usage |
| ---------------------------------- | ------------------------------------------------------------------- |
| `capture-app-window.sh` | Capture screenshot of a specific app window (used by bot tests) |
| `discord/test-discord-bot.sh` | Send message to Discord bot via osascript |
| `slack/test-slack-bot.sh` | Send message to Slack bot via osascript |
| `telegram/test-telegram-bot.sh` | Send message to Telegram bot via osascript |
| `wechat/test-wechat-bot.sh` | Send message to WeChat bot via osascript |
| `lark/test-lark-bot.sh` | Send message to Lark / 飞书 bot via osascript |
| `qq/test-qq-bot.sh` | Send message to QQ bot via osascript |
| `imessage/test-imessage-bridge.sh` | Regression-test the iMessage BlueBubbles bridge (IPC + HTTP) |
| `imessage/send-imessage-test.sh` | Send one real iMessage (desktop → BB → iMessage) and verify it sent |
| `test-discord-bot.sh` | Send message to Discord bot via osascript |
| `test-slack-bot.sh` | Send message to Slack bot via osascript |
| `test-telegram-bot.sh` | Send message to Telegram bot via osascript |
| `test-wechat-bot.sh` | Send message to WeChat bot via osascript |
| `test-lark-bot.sh` | Send message to Lark / 飞书 bot via osascript |
| `test-qq-bot.sh` | Send message to QQ bot via osascript |
### Window Screenshot Utility
@@ -458,9 +433,9 @@ channel (alongside that channel's `index.md`). The shared
```bash
# Standalone usage
./.agents/skills/local-testing/bot/capture-app-window.sh "Discord" /tmp/discord.png
./.agents/skills/local-testing/bot/capture-app-window.sh "Slack" /tmp/slack.png
./.agents/skills/local-testing/bot/capture-app-window.sh "WeChat" /tmp/wechat.png
./.agents/skills/local-testing/scripts/capture-app-window.sh "Discord" /tmp/discord.png
./.agents/skills/local-testing/scripts/capture-app-window.sh "Slack" /tmp/slack.png
./.agents/skills/local-testing/scripts/capture-app-window.sh "WeChat" /tmp/wechat.png
```
All bot test scripts use this utility automatically for their screenshots.
@@ -477,48 +452,32 @@ Examples:
```bash
# Discord — test a bot in #bot-testing channel
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "!ping"
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
./.agents/skills/local-testing/scripts/test-discord-bot.sh "bot-testing" "!ping"
./.agents/skills/local-testing/scripts/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
# Slack — test a bot in #bot-testing channel
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "@mybot hello"
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
./.agents/skills/local-testing/scripts/test-slack-bot.sh "bot-testing" "@mybot hello"
./.agents/skills/local-testing/scripts/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
# Telegram — test a bot by username
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "MyTestBot" "/start"
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "GPTBot" "Hello" 60
./.agents/skills/local-testing/scripts/test-telegram-bot.sh "MyTestBot" "/start"
./.agents/skills/local-testing/scripts/test-telegram-bot.sh "GPTBot" "Hello" 60
# WeChat — test a bot or send to a contact
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "文件传输助手" "test message" 5
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
./.agents/skills/local-testing/scripts/test-wechat-bot.sh "文件传输助手" "test message" 5
./.agents/skills/local-testing/scripts/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
# Lark/飞书 — test a bot in a group chat
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "@MyBot hello"
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "Help me with this" 30
./.agents/skills/local-testing/scripts/test-lark-bot.sh "bot-testing" "@MyBot hello"
./.agents/skills/local-testing/scripts/test-lark-bot.sh "bot-testing" "Help me with this" 30
# QQ — test a bot in a group or direct chat
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "bot-testing" "Hello bot" 15
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "MyBot" "/help" 10
./.agents/skills/local-testing/scripts/test-qq-bot.sh "bot-testing" "Hello bot" 15
./.agents/skills/local-testing/scripts/test-qq-bot.sh "MyBot" "/help" 10
```
Each script: activates the app, navigates to the channel/contact, pastes the message via clipboard, sends, waits, and takes a screenshot. Use the `Read` tool on the screenshot for visual verification.
### iMessage bridge regression script
`test-imessage-bridge.sh` does **not** follow the osascript bot interface — it
drives the Desktop bridge's IPC + HTTP layers and asserts the result, then
self-cleans. Needs BlueBubbles running and Electron up with CDP.
```bash
./.agents/skills/local-testing/bot/imessage/test-imessage-bridge.sh '<bluebubbles_password>' [bb_url] [cdp_port]
# defaults: bb_url=http://127.0.0.1:1234 cdp_port=9222 — exit 0 = all green
```
It guards the connect/configure flow (testConfig happy + reject paths, first-time
`upsertConfig` save, bridge running + webhook registered, local-server secret
enforcement). See [bot/imessage/index.md](./bot/imessage/index.md)
for the full manual UI flow and known bugs.
---
# Screen Recording
@@ -558,4 +517,4 @@ Outputs to `.records/` directory (gitignored): `<name>.mp4` (video) + `<name>/`
### osascript
See [bot/osascript-common.md](./bot/osascript-common.md#gotchas) for the full osascript gotchas list (accessibility permissions, `keystroke` non-ASCII issues, locale-specific app names, rate limiting, etc.).
See [references/osascript-common.md](./references/osascript-common.md#gotchas) for the full osascript gotchas list (accessibility permissions, `keystroke` non-ASCII issues, locale-specific app names, rate limiting, etc.).
@@ -1,232 +0,0 @@
# iMessage Desktop bridge regression test
The iMessage channel is different from the other bot platforms: there is **no
native app to drive with osascript**. Instead the Desktop app runs a local
**BlueBubbles bridge** — a small HTTP server in the Electron main process that
registers a webhook on a local [BlueBubbles](https://bluebubbles.app/) server,
receives iMessage events, and forwards them to LobeHub Cloud.
So the test surface is three layers:
1. **Electron main IPC**`imessageBridge.*` handlers (`getStatus`,
`testConfig`, `upsertConfig`, `removeConfig`, `start`, `stop`)
2. **Local bridge HTTP server**`http://127.0.0.1:<port>/webhooks/bluebubbles/<appId>?secret=<secret>`
3. **BlueBubbles REST API**`http://127.0.0.1:1234/api/v1/*` (webhook + server/info)
## Prerequisites
- A running **BlueBubbles server** (macOS, default `http://127.0.0.1:1234`) with
a known password. Sanity check:
```bash
curl -sS -m4 -o /dev/null -w '%{http_code}\n' \
"http://127.0.0.1:1234/api/v1/server/info?password=<PW>" # expect 200
```
- **Electron dev running with CDP**: `./.agents/skills/local-testing/scripts/electron-dev.sh start`
- The **iMessage Desktop branch** checked out (the `imessageBridge` IPC group
and `@lobechat/chat-adapter-imessage` must be compiled into the main bundle).
Run `pnpm install --ignore-scripts` at the repo root **and** in `apps/desktop/`
after switching branches — the new workspace package must be linked or the
main build fails to resolve `@lobechat/chat-adapter-imessage`.
## Fast path: automated script
```bash
./.agents/skills/local-testing/bot/imessage/test-imessage-bridge.sh '<bluebubbles_password>' [bb_url] [cdp_port]
```
Asserts the whole flow and self-cleans (unique `applicationId` per run, removes
its bridge config + BlueBubbles webhook on exit). Exit 0 = all green. It covers:
- BlueBubbles reachable + password valid; Electron CDP reachable; IPC available
- `testConfig` happy path → success
- `testConfig` wrong password → rejected; unreachable URL → rejected
- `upsertConfig` **first-time save → success** (Bug #1 regression guard, below)
- `getStatus` → `running:true`, config persisted, password redacted (`blueBubblesPasswordSet`)
- BlueBubbles webhook actually registered for the appId
- Local bridge HTTP server: wrong secret → 401; valid secret → past auth
The password is passed as argv (visible in `ps`) — local dev only, don't use a
real secret on a shared machine.
## Layer 1 — IPC probes (no UI)
The renderer exposes the main-process handlers via `window.electronAPI.invoke`.
This is the quickest way to exercise the bridge without clicking:
```bash
# baseline
agent-browser --cdp 9222 eval \
"(async()=>JSON.stringify(await window.electronAPI.invoke('imessageBridge.getStatus',{})))()"
# test a connection (note: password as a JS string)
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
(async function () {
try {
var r = await window.electronAPI.invoke('imessageBridge.testConfig', {
applicationId: 'probe',
blueBubblesServerUrl: 'http://127.0.0.1:1234',
blueBubblesPassword: 'PASTE_PW',
enabled: true,
webhookSecret: 'probe-secret',
});
return JSON.stringify(r); // { success: true }
} catch (e) { return 'ERR: ' + (e.message || e); }
})()
EVALEOF
```
`upsertConfig` persists to the Electron store, starts the local HTTP server, and
registers the BlueBubbles webhook. `removeConfig` + `stop` reverse it.
## Layer 2 — full UI flow (agent-browser)
The bridge settings only render in Desktop (`isDesktop` guard) under the agent's
**Channel → iMessage** screen. The platform tile only appears as a real (non
"Coming Soon") entry once the server registers `imessage` **and** the frontend
drops it from `COMING_SOON_PLATFORMS` (`src/routes/(main)/agent/channel/const.ts`).
```bash
agent-browser --cdp 9222 open "http://localhost:5173/agent/<aid>/channel"
agent-browser --cdp 9222 wait --load networkidle && agent-browser --cdp 9222 wait 1500
# confirm the remote backend lists imessage (it must be registered + deployed)
agent-browser --cdp 9222 eval --stdin << 'EVALEOF'
(async function(){
var url='lobe-backend://lobe/trpc/lambda/agentBotProvider.listPlatforms?input='+encodeURIComponent('{"json":null,"meta":{"values":["undefined"],"v":1}}');
var d=await (await fetch(url,{credentials:'include'})).json();
var p=d.result?.data?.json||d;
return JSON.stringify(p.map(function(x){return x.id;}));
})()
EVALEOF
# click the iMessage tile, then fill the form by ref
agent-browser --cdp 9222 eval "(()=>{var b=[...document.querySelectorAll('aside button')].find(x=>/imessage/i.test(x.textContent));b&&b.click();})()"
agent-browser --cdp 9222 wait 1500
agent-browser --cdp 9222 snapshot -i | grep -iE "127.0.0.1:1234|Application ID|Webhook Secret|Test BlueBubbles|Save Bridge"
```
Field refs (from the snapshot): Application ID, Webhook Secret, BlueBubbles
Server URL (`placeholder="http://127.0.0.1:1234"`), and a **nested** textbox right
under the URL one is the BlueBubbles Password. Fill with `fill` (real input
events — `eval`-setting React inputs won't fire onChange), click **Test
BlueBubbles**, then **Save Bridge**. Read the antd toast immediately (it
auto-dismisses):
```bash
agent-browser --cdp 9222 eval \
"JSON.stringify([...new Set([...document.querySelectorAll('.ant-message-custom-content')].map(n=>n.textContent.trim()))])"
# Test → "BlueBubbles connection passed"
# Save → "iMessage Desktop bridge saved"
```
Verify the end state via BlueBubbles + IPC:
```bash
curl -sS "http://127.0.0.1:1234/api/v1/webhook?password=<PW>" # webhook for the appId present
agent-browser --cdp 9222 eval "(async()=>JSON.stringify(await window.electronAPI.invoke('imessageBridge.getStatus',{})))()"
# running:true, serverUrl: http://127.0.0.1:33270, configs[].blueBubblesPasswordSet:true
```
Cleanup: `removeConfig` + `stop` via IPC, then `DELETE /api/v1/webhook/<id>` on
BlueBubbles.
## Outbound send test (desktop → BlueBubbles → iMessage)
Verifies the leg the bridge uses to _reply_: `BlueBubblesApiClient.sendText`
→ `POST /api/v1/message/text`. Run the helper against your own number:
```bash
./.agents/skills/local-testing/bot/imessage/send-imessage-test.sh '<bb_password>' '+<E164>' # e.g. +15551234567
```
**Gotcha that bites everyone:** with `method=apple-script` and a _new_
conversation, the HTTP POST often **times out** even though the message is
sent. Never judge success by the HTTP response. Instead poll
`POST /api/v1/message/query` and read the matching `isFromMe:true` row's
`error` field:
- `error: 0` (or null) → sent OK
- non-zero `error` → real send failure
The script does exactly this: fires the send, ignores the timeout, then matches
its marker text in the message store and asserts `error == 0`.
Two more notes:
- Use a full E.164 handle (`iMessage;-;+<countrycode><number>`) or an Apple ID
email. Looking the chat up by guid afterwards may 404 if BB filed the message
under a differently-formatted guid — that's a lookup quirk, not a send failure.
- Sending to _your own_ number round-trips: BB records both the outgoing
(`fromMe:true`) and an incoming copy (`fromMe:false`).
## Inbound e2e test (iMessage → cloud agent → reply)
Full inbound chain: a message arrives → BlueBubbles fires its `new-message`
webhook → local bridge (`:33270`) → `forwardWebhook` POSTs to
`<remote>/api/agent/webhooks/imessage/<appId>?secret=…` → cloud agent → reply
flows back via Device Gateway → BB `sendText`.
Prerequisites:
- A cloud bot provider for the same `applicationId` exists and is **connected**
(Save Configuration + the device gateway connected — a _disconnected_ gateway
yields `DEVICE_NOT_FOUND` on connect and blocks the reply leg).
- The `imessage` Labs toggle is on (otherwise the channel is gated to "Coming
Soon"), and `webhookSecret` matches on both ends (auto-generated on save).
Two ways to drive it:
1. **Second device / Apple ID (recommended).** Have _another_ Apple ID message
the BB-hosted number (e.g. "please reply pong"). The bot replies; you see it
on the other device. **No loop risk** — the reply goes to the other party,
not back to itself.
2. **Send to your own number (quick, loop-aware).** `sendText` to the hosted
number; the loopback _incoming_ copy (`isFromMe:false`) triggers the bot.
Watch the reply land in `message/query` as a `fromMe:true` row.
**Loop guard — why a self-send doesn't spin forever:** the Chat SDK adapter
drops any `isFromMe` message before dispatch
(`packages/chat-adapter-imessage/src/adapter.ts`: `if (message.isFromMe) return`).
The bot's own reply (`isFromMe:true`) is never re-processed, so in the normal
case (someone else → bot → reply to them) there is no loop. The self-send case
is a **test-only edge**: the bot's reply also round-trips to your number, and
only the adapter's `isFromMe` check stops a second pass. Keep the prompt
conversational (so the bot doesn't keep finding something to answer), and
**turn the `imessage` lab off / remove the config when done** — never leave a
self-send bot running unattended.
Watch the chain live:
```bash
tail -f /tmp/electron-dev.log | grep -iE "imessage|bridge|forward|Message API"
# the agent reply shows up as a fromMe:true row with the bot's text:
curl -sS -X POST "http://127.0.0.1:1234/api/v1/message/query?password=<PW>" \
-H 'Content-Type: application/json' -d '{"limit":5,"sort":"DESC"}'
```
`startTyping` will log a Private-API error unless BlueBubbles has the Private
API helper set up (needs a jailbroken / SIP-disabled Mac) — it's logged and
ignored; text replies still work.
## Known bugs / gotchas
- **Bug #1 — first-time save (fixed; guarded by the script).** BlueBubbles'
`GET /api/v1/webhook?url=<unregistered>` returns **HTTP 500**
(`Cannot read properties of null (reading 'events')`). The bridge must list
**all** webhooks and match client-side, never pass the `?url=` filter. If you
see `upsertConfig` fail with "An unhandled error has occurred!" originating in
`listWebhooks`, this regressed.
- **Save leaves a half-state on webhook failure.** `upsertConfig` writes the
config + starts the HTTP server _before_ registering the webhook, so a webhook
failure still reports `running:true` with the config persisted but no
BlueBubbles webhook. Always assert the BlueBubbles webhook list, not just IPC
status.
- **Unknown appId / forward failure → 500.** Posting to the local bridge for an
unknown appId, or when no cloud bot is bound, returns 500 (BlueBubbles retries
on 5xx). Auth (wrong secret → 401) is enforced before that.
- **Backend deploy lag.** Desktop dev proxies tRPC through `lobe-backend://` to
the _remote_ server. iMessage only appears in `listPlatforms` once the server
registration is deployed there, regardless of local branch.
- **Restart to load main-process fixes.** Editing `imessageBridgeSrv.ts` /
`@lobechat/chat-adapter-imessage` needs `electron-dev.sh restart` — main isn't
hot-replaced. On restart, enabled configs auto-register their webhook again.
@@ -1,81 +0,0 @@
#!/usr/bin/env bash
#
# send-imessage-test.sh — Verify the outbound leg: desktop → BlueBubbles → iMessage
#
# Sends one real iMessage via the same REST call the Desktop bridge uses
# (`POST /api/v1/message/text`, which BlueBubblesApiClient.sendText wraps) and
# confirms it actually went out.
#
# KEY GOTCHA: with method=apple-script and a NEW conversation, the HTTP request
# often TIMES OUT even though the message is sent. Do NOT treat the timeout as a
# failure — instead poll `POST /api/v1/message/query` and check the message's
# `error` field (0 = sent OK). This script does that for you.
#
# This sends a REAL message, so it has side effects. Target your own number.
#
# Usage:
# ./send-imessage-test.sh <bb_password> <target_e164> [message] [bb_url]
#
# Example (send to your own phone, E.164 with country code):
# ./send-imessage-test.sh 'my-bb-pass' '+15551234567'
#
set -euo pipefail
BB_PASS="${1:?Usage: $0 <bb_password> <target_e164(+countrycode)> [message] [bb_url]}"
TARGET="${2:?Need a target handle in E.164, e.g. +15551234567 (or an Apple ID email)}"
MARKER="lobe-imsg-test-$(date +%s)"
MESSAGE="${3:-[${MARKER}] desktop bridge → BlueBubbles → iMessage outbound check}"
BB_URL="${4:-http://127.0.0.1:1234}"
CHAT_GUID="iMessage;-;${TARGET}"
echo "[send-test] target=${TARGET} marker=${MARKER}"
# 1) Fire the send. apple-script on a new chat may hang the HTTP response, so we
# cap it short and ignore a timeout — step 2 is the source of truth.
python3 - "$BB_PASS" "$BB_URL" "$CHAT_GUID" "$MESSAGE" <<'PY' || true
import json,sys,urllib.request,urllib.parse,uuid
pw,base,guid,msg=sys.argv[1:5]
url=base+"/api/v1/message/text?password="+urllib.parse.quote(pw)
body={"chatGuid":guid,"message":msg,"method":"apple-script","tempGuid":str(uuid.uuid4())}
req=urllib.request.Request(url,data=json.dumps(body).encode("utf-8"),
headers={"Content-Type":"application/json"},method="POST")
try:
r=urllib.request.urlopen(req,timeout=8)
print("[send-test] HTTP",r.status,"(immediate response)")
except urllib.error.HTTPError as e:
print("[send-test] HTTP",e.code,e.read().decode()[:200])
except Exception as e:
print("[send-test] HTTP request returned no body (likely apple-script delay):",type(e).__name__)
PY
# 2) Source of truth: find our marker in the message store and read its error.
echo "[send-test] verifying via message/query (the HTTP timeout above is expected)…"
sleep 3
python3 - "$BB_PASS" "$BB_URL" "$MARKER" <<'PY'
import json,sys,time,urllib.request,urllib.parse
pw,base,marker=sys.argv[1:4]
url=base+"/api/v1/message/query?password="+urllib.parse.quote(pw)
def query():
body={"limit":15,"offset":0,"with":["chats"],"sort":"DESC"}
req=urllib.request.Request(url,data=json.dumps(body).encode(),
headers={"Content-Type":"application/json"},method="POST")
return json.load(urllib.request.urlopen(req,timeout=12)).get("data") or []
hit=None
for _ in range(5):
for m in query():
if marker in (m.get("text") or "") and m.get("isFromMe"):
hit=m; break
if hit: break
time.sleep(2)
if not hit:
print("[send-test] ✗ outbound message not found in BB store — send likely failed")
sys.exit(1)
err=hit.get("error")
if err in (0,None):
print("[send-test] ✓ outbound message sent (fromMe=True, error=%s)"%err)
print("[send-test] → confirm it arrived in the Messages app on the target device")
else:
print("[send-test] ✗ BlueBubbles reported send error=%s"%err)
sys.exit(1)
PY
@@ -1,187 +0,0 @@
#!/usr/bin/env bash
#
# test-imessage-bridge.sh — Regression test for the iMessage Desktop bridge
#
# Drives the Electron main-process `imessageBridge.*` IPC handlers plus the
# local bridge HTTP server and the BlueBubbles server, asserting the full
# connect/configure flow. Use it to regression-test PR work on the iMessage
# channel (BlueBubbles bridge) without clicking through the UI every time.
#
# Prerequisites:
# 1. BlueBubbles server running and reachable (default http://127.0.0.1:1234)
# 2. Electron dev running with CDP — `electron-dev.sh start`
# 3. `agent-browser` on PATH, connected to the same CDP port
#
# Usage:
# ./test-imessage-bridge.sh <bluebubbles_password> [bb_url] [cdp_port]
#
# Example:
# ./test-imessage-bridge.sh 'my-bb-password'
# ./test-imessage-bridge.sh 'my-bb-password' http://127.0.0.1:1234 9222
#
# Notes:
# - The password is passed as an argv, so it is visible in `ps`. This is a
# local dev tool; do not run it on shared machines with a real secret.
# - It uses a unique applicationId per run (imsg-regression-$$) and cleans up
# its own bridge config + BlueBubbles webhook on exit, so it is safe to
# re-run and does not disturb real configs.
set -euo pipefail
BB_PASS="${1:?Usage: $0 <bluebubbles_password> [bb_url] [cdp_port]}"
BB_URL="${2:-http://127.0.0.1:1234}"
CDP_PORT="${3:-9222}"
APP_ID="imsg-regression-$$"
SECRET="regression-secret-$$"
PASS=0
FAIL=0
# ── Output helpers ───────────────────────────────────────────────────
ok() { echo "$1"; PASS=$((PASS + 1)); }
bad() { echo "$1$2"; FAIL=$((FAIL + 1)); }
note() { echo "[imsg-test] $1"; }
# ── BlueBubbles REST helpers ─────────────────────────────────────────
bb_get_webhooks() {
curl -sS -m 8 "${BB_URL}/api/v1/webhook?password=${BB_PASS}"
}
# Delete every webhook whose URL mentions our APP_ID (cleanup is idempotent).
bb_cleanup_webhooks() {
local ids
ids=$(bb_get_webhooks | python3 -c '
import json,sys
try: d=json.load(sys.stdin)
except Exception: sys.exit(0)
for w in (d.get("data") or []):
if "'"$APP_ID"'" in (w.get("url") or ""): print(w["id"])
' 2>/dev/null || true)
for id in $ids; do
curl -sS -m 8 -X DELETE "${BB_URL}/api/v1/webhook/${id}?password=${BB_PASS}" >/dev/null 2>&1 || true
done
}
# ── IPC helper (drives the Electron renderer's electronAPI bridge) ───
# Runs a JS snippet that returns a string token; prints the raw token.
# The BlueBubbles password is base64-injected (atob) so special chars in the
# secret never need shell/JS quoting.
ipc_eval() {
local js="$1"
agent-browser --cdp "$CDP_PORT" eval -b "$(printf '%s' "$js" | base64)" 2>/dev/null
}
PASS_B64=$(printf '%s' "$BB_PASS" | base64)
# Emit an inline JS object literal for the bridge config. $1 overrides the
# password expression (defaults to atob of the real password); pass a JS string
# literal like "'wrong'" to test the rejection path.
ipc_config_js() {
local pwexpr="${1:-atob('${PASS_B64}')}"
printf "{applicationId:'%s',blueBubblesServerUrl:'%s',blueBubblesPassword:%s,enabled:true,webhookSecret:'%s'}" \
"$APP_ID" "$BB_URL" "$pwexpr" "$SECRET"
}
# ── Preflight ────────────────────────────────────────────────────────
note "BlueBubbles: ${BB_URL} CDP: ${CDP_PORT} appId: ${APP_ID}"
code=$(curl -sS -m 6 -o /dev/null -w '%{http_code}' \
"${BB_URL}/api/v1/server/info?password=${BB_PASS}" || echo 000)
if [ "$code" = "200" ]; then ok "BlueBubbles reachable + password valid"; else
bad "BlueBubbles preflight" "HTTP $code (is BlueBubbles running on ${BB_URL}?)"
echo "Aborting — fix BlueBubbles first."; exit 1
fi
if ! curl -sf --max-time 3 "http://localhost:${CDP_PORT}/json/version" >/dev/null 2>&1; then
bad "Electron CDP preflight" "CDP ${CDP_PORT} unreachable — run electron-dev.sh start"
echo "Aborting."; exit 1
fi
ok "Electron CDP reachable"
# Bridge must expose the IPC group (built from this branch's code).
probe=$(ipc_eval "(async()=>{try{var s=await window.electronAPI.invoke('imessageBridge.getStatus',{});return 'OK:'+JSON.stringify(s);}catch(e){return 'ERR:'+(e.message||e);}})()")
case "$probe" in
*OK:*) ok "imessageBridge IPC available" ;;
*) bad "imessageBridge IPC" "got: $probe (is the iMessage Desktop branch checked out?)"; echo "Aborting."; exit 1 ;;
esac
# Start clean: remove any leftover config for this appId + BB webhooks.
ipc_eval "(async()=>{try{await window.electronAPI.invoke('imessageBridge.removeConfig',{applicationId:'${APP_ID}'});}catch(e){}return 'done';})()" >/dev/null
bb_cleanup_webhooks
# ── testConfig: happy path ───────────────────────────────────────────
r=$(ipc_eval "(async()=>{try{var c=$(ipc_config_js);var x=await window.electronAPI.invoke('imessageBridge.testConfig',c);return 'OK:'+JSON.stringify(x);}catch(e){return 'ERR:'+(e.message||e);}})()")
case "$r" in
*OK:*success*true*) ok "testConfig with valid password → success" ;;
*) bad "testConfig (valid)" "got: $r" ;;
esac
# ── testConfig: wrong password rejects ───────────────────────────────
r=$(ipc_eval "(async()=>{try{var c=$(ipc_config_js "'definitely-wrong-password'");var x=await window.electronAPI.invoke('imessageBridge.testConfig',c);return 'OK:'+JSON.stringify(x);}catch(e){return 'ERR:'+(e.message||e);}})()")
case "$r" in
*ERR:*) ok "testConfig with wrong password → rejected" ;;
*) bad "testConfig (wrong password)" "expected rejection, got: $r" ;;
esac
# ── testConfig: unreachable URL rejects ──────────────────────────────
r=$(ipc_eval "(async()=>{try{var x=await window.electronAPI.invoke('imessageBridge.testConfig',{applicationId:'${APP_ID}',blueBubblesServerUrl:'http://127.0.0.1:65530',blueBubblesPassword:atob('${PASS_B64}'),enabled:true,webhookSecret:'${SECRET}'});return 'OK:'+JSON.stringify(x);}catch(e){return 'ERR:'+(e.message||e);}})()")
case "$r" in
*ERR:*) ok "testConfig with unreachable URL → rejected" ;;
*) bad "testConfig (unreachable)" "expected rejection, got: $r" ;;
esac
# ── upsertConfig: FIRST-TIME registration (Bug #1 regression guard) ──
# BlueBubbles' GET /webhook?url=<unregistered> returns HTTP 500. The bridge
# must list ALL webhooks and match client-side, otherwise this first save
# fails. This assertion guards that fix.
r=$(ipc_eval "(async()=>{try{var c=$(ipc_config_js);var x=await window.electronAPI.invoke('imessageBridge.upsertConfig',c);return 'OK:'+JSON.stringify(x);}catch(e){return 'ERR:'+(e.message||e);}})()")
case "$r" in
*OK:*success*true*) ok "upsertConfig first-time save → success (Bug #1 guard)" ;;
*) bad "upsertConfig (first-time)" "got: $r" ;;
esac
# ── getStatus: bridge running + config persisted ─────────────────────
# Return a quote-free token so grep isn't tripped up by agent-browser's
# JSON-string escaping of the eval result.
r=$(ipc_eval "(async()=>{var s=await window.electronAPI.invoke('imessageBridge.getStatus',{});var c=(s.configs||[]).find(function(x){return x.applicationId==='${APP_ID}';});return 'RUN='+(s.running?'Y':'N')+' CFG='+(c?'Y':'N')+' PW='+((c&&c.blueBubblesPasswordSet)?'Y':'N');})()")
echo "$r" | grep -q 'RUN=Y' && ok "bridge running" || bad "bridge running" "got: $r"
echo "$r" | grep -q 'CFG=Y' && ok "config persisted" || bad "config persisted" "got: $r"
echo "$r" | grep -q 'PW=Y' && ok "password stored (redacted in status)" || bad "password stored" "got: $r"
# ── BlueBubbles webhook actually registered ──────────────────────────
if bb_get_webhooks | grep -q "${APP_ID}"; then
ok "BlueBubbles webhook registered for appId"
else
bad "BlueBubbles webhook" "no webhook URL containing ${APP_ID}"
fi
# ── Local bridge HTTP server: secret enforcement ─────────────────────
BRIDGE_URL=$(ipc_eval "(async()=>{var s=await window.electronAPI.invoke('imessageBridge.getStatus',{});return s.serverUrl||'';})()" | tr -d '"')
if [ -n "$BRIDGE_URL" ]; then
# wrong secret → 401
code=$(curl -sS -m 6 -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/json' \
"${BRIDGE_URL}/webhooks/bluebubbles/${APP_ID}?secret=WRONG" \
-d '{"type":"new-message","data":{"guid":"x"}}' || echo 000)
[ "$code" = "401" ] && ok "local bridge rejects wrong secret (401)" || bad "local bridge wrong secret" "expected 401, got $code"
# right secret → passes auth (reaches forward; without a bound cloud bot it
# returns 5xx — that's fine, we're only asserting auth + routing here)
code=$(curl -sS -m 6 -o /dev/null -w '%{http_code}' -X POST \
-H 'Content-Type: application/json' \
"${BRIDGE_URL}/webhooks/bluebubbles/${APP_ID}?secret=${SECRET}" \
-d '{"type":"new-message","data":{"guid":"x","text":"hi"}}' || echo 000)
[ "$code" != "401" ] && ok "local bridge accepts valid secret (HTTP $code, past auth)" || bad "local bridge valid secret" "got 401 with correct secret"
else
bad "local bridge URL" "getStatus returned no serverUrl"
fi
# ── Cleanup ──────────────────────────────────────────────────────────
ipc_eval "(async()=>{try{await window.electronAPI.invoke('imessageBridge.removeConfig',{applicationId:'${APP_ID}'});await window.electronAPI.invoke('imessageBridge.stop',{});}catch(e){}return 'cleaned';})()" >/dev/null
bb_cleanup_webhooks
note "cleaned up config + BlueBubbles webhook for ${APP_ID}"
# ── Summary ──────────────────────────────────────────────────────────
echo ""
echo "[imsg-test] PASS=${PASS} FAIL=${FAIL}"
[ "$FAIL" -eq 0 ] || exit 1
@@ -1,93 +0,0 @@
# LobeHub gateway streaming + tab-switch test harness
Captures store + DOM state at 200ms intervals so we can prove or disprove
claims like "切回 tab 后消息回到了很早以前". Built for gateway-mode chat but
works for any LobeHub streaming session.
## Files
`scripts/agent-gateway/`
| File | Role |
| --------------- | ---------------------------------------------------------------- |
| `probe.js` | Injects a 200ms sampler + `__PROBE_EVENT` marker + `__switchTab` |
| `probe-dump.js` | Stops the sampler and returns `{events, samples}` as JSON string |
| `tab-switch.js` | Runs N round-trip switches between two tabs, marks each step |
| `analyze.mjs` | Node post-processor: timeline + regression detection |
## Standard workflow
```bash
# 1. Start Electron with CDP
./.agents/skills/local-testing/scripts/electron-dev.sh start
# 2. Navigate to a chat, switch runtime to Cloud Sandbox (gateway mode)
# 3. Install the probe + helpers
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/local-testing/scripts/agent-gateway/probe.js
# 4. Send a tool-call message — manually or via type+press
agent-browser --cdp 9222 eval "window.__PROBE_EVENT('SENT')"
# 5. Run the multi-switch driver (auto-picks active tab as BACK and the
# rightmost inactive tab as AWAY — edit ROUND_TRIPS / DWELL_MS in the
# file if you want different timing)
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/local-testing/scripts/agent-gateway/tab-switch.js
# 6. Wait for streaming to finish, then dump
agent-browser --cdp 9222 eval --stdin \
< .agents/skills/local-testing/scripts/agent-gateway/probe-dump.js \
> /tmp/probe.json
# 7. Analyze
node .agents/skills/local-testing/scripts/agent-gateway/analyze.mjs /tmp/probe.json
```
The analyzer prints three sections: EVENTS, TIMELINE, REGRESSIONS. If
REGRESSIONS is non-empty it means content/reasoning/childN dropped on the
same topic — the symptom users describe.
## What the probe tracks (and why)
`chat.messagesMap` only stores the top-level `assistantGroup` shell. The
actual streamed content, reasoning, and tool calls live in
`assistantGroup.children: AssistantContentBlock[]`. Any probe that only
reads `m.content` / `m.reasoning` will see zeros throughout streaming and
miss everything that matters. probe.js walks both levels and sums:
- `cT` total content length
- `rT` total reasoning length
- `toolT` total tool-call count
- `childN` number of content blocks
Plus DOM-side signals (`domLen`, search/crawl indicator counts) so you can
tell store-side regressions apart from render-side regressions.
## Gotchas
- **Optimistic new-topic state.** Before the first chunk lands, messages
live under the `<scope>_new` key with `tmp_*` ids and no `topicId` field.
probe.js falls back to those when `activeTopicId` is null.
- **Reasoning resets to 0 are not bugs.** When the assistant finishes
thinking and starts tool-use or text, the streaming reasoning buffer
empties and the finalised reasoning gets sealed into a completed block.
Filter these out manually if needed.
- **DOM length jitters by a handful of chars** because counters like "(10)"
in tool-call labels change as results arrive. analyze.mjs only flags
`domLen` drops greater than 100 chars to ignore that noise.
- **Never identify tabs by innerText.** The active tab's text embeds a
` · <agent name>` suffix, so a search like `'LobeHub Growth'` matches the
active tab when the active agent happens to be LobeHub Growth — and you
end up clicking the tab you're already on. probe.js uses the stable
`data-contextmenu-trigger` attribute (a React `useId()` value that's set
per-tab and survives focus changes) plus `data-active="true"` to mark
the active one. Helpers exposed:
`__listTabs()` / `__clickTabByKey(key)` / `__clickTabByIndex(i)` /
`__activeTabKey()`.
- **`tab-switch.js` fires-and-forgets.** The IIFE kicks off an async loop
and returns immediately so the agent-browser CLI eval doesn't blow past
its default 25 s timeout. Wait on the `SWITCH_LOOP_DONE` event marker
before dumping. Re-running while a loop is in flight is refused — the
chaotic data from overlapping runs is not worth debugging.
@@ -2,7 +2,7 @@
**App name:** `Discord` | **Process name:** `Discord`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -92,6 +92,6 @@ echo "Screenshot saved to /tmp/discord-test-result.png"
## Script
```bash
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "!ping"
./.agents/skills/local-testing/bot/discord/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
./.agents/skills/local-testing/scripts/test-discord-bot.sh "bot-testing" "!ping"
./.agents/skills/local-testing/scripts/test-discord-bot.sh "bot-testing" "/ask Tell me a joke" 30
```
@@ -2,7 +2,7 @@
**App name:** `Lark` or `飞书` | **Process name:** `Lark` or `飞书`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -56,6 +56,6 @@ screencapture /tmp/lark-bot-response.png
## Script
```bash
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "@MyBot hello"
./.agents/skills/local-testing/bot/lark/test-lark-bot.sh "bot-testing" "Help me with this" 30
./.agents/skills/local-testing/scripts/test-lark-bot.sh "bot-testing" "@MyBot hello"
./.agents/skills/local-testing/scripts/test-lark-bot.sh "bot-testing" "Help me with this" 30
```
@@ -2,7 +2,7 @@
**App name:** `QQ` | **Process name:** `QQ`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -57,6 +57,6 @@ screencapture /tmp/qq-bot-response.png
## Script
```bash
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "bot-testing" "Hello bot" 15
./.agents/skills/local-testing/bot/qq/test-qq-bot.sh "MyBot" "/help" 10
./.agents/skills/local-testing/scripts/test-qq-bot.sh "bot-testing" "Hello bot" 15
./.agents/skills/local-testing/scripts/test-qq-bot.sh "MyBot" "/help" 10
```
@@ -2,7 +2,7 @@
**App name:** `Slack` | **Process name:** `Slack`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -68,6 +68,6 @@ screencapture /tmp/slack-bot-response.png
## Script
```bash
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "@mybot hello"
./.agents/skills/local-testing/bot/slack/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
./.agents/skills/local-testing/scripts/test-slack-bot.sh "bot-testing" "@mybot hello"
./.agents/skills/local-testing/scripts/test-slack-bot.sh "bot-testing" "/ask What is 2+2?" 20
```
@@ -2,7 +2,7 @@
**App name:** `Telegram` | **Process name:** `Telegram`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -75,6 +75,6 @@ curl -s "https://api.telegram.org/bot$TELEGRAM_BOT_TOKEN/getUpdates?limit=5" | j
## Script
```bash
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "MyTestBot" "/start"
./.agents/skills/local-testing/bot/telegram/test-telegram-bot.sh "GPTBot" "Hello" 60
./.agents/skills/local-testing/scripts/test-telegram-bot.sh "MyTestBot" "/start"
./.agents/skills/local-testing/scripts/test-telegram-bot.sh "GPTBot" "Hello" 60
```
@@ -2,7 +2,7 @@
**App name:** `微信` or `WeChat` | **Process name:** `WeChat`
See [osascript-common.md](../osascript-common.md) for shared patterns.
See [osascript-common.md](./osascript-common.md) for shared patterns.
## Activate & Navigate
@@ -76,6 +76,6 @@ screencapture /tmp/wechat-bot-response.png
## Script
```bash
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "文件传输助手" "test message" 5
./.agents/skills/local-testing/bot/wechat/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
./.agents/skills/local-testing/scripts/test-wechat-bot.sh "文件传输助手" "test message" 5
./.agents/skills/local-testing/scripts/test-wechat-bot.sh "MyBot" "Tell me a joke" 30
```
@@ -1,243 +0,0 @@
// Analyzer for probe-events dumps. Reads a JSON file produced by `run.ts dump`
// and prints a layered breakdown:
//
// 1. STREAM EVENTS — every non-chunk WS/SSE event in receipt order
// 2. CHUNKS SUMMARY — collapsed per-step chunk counts (otherwise floods)
// 3. ACTION CALLS — replaceMessages / refreshMessages / MARK:* with stack
// 4. CORRELATION — calls ↔ nearest stream event within ±300ms
// 5. PER-KEY ASSISTANT GROWTH — for each messagesMap key, when the leading
// assistant message's cLen / rLen actually moves (this is what reveals
// "chunks arrived but the message never grew" regressions)
// 6. ROLLBACKS — msgN / childN / role drops in the active-topic timeline
//
// Usage:
// bun run .agents/skills/local-testing/scripts/agent-gateway/analyze-events.ts <dump.json>
import { readFileSync } from 'node:fs';
import type {
ProbeActionCall,
ProbeDump,
ProbeMessageSummary,
ProbeStreamEvent,
ProbeTimelineSample,
} from './types';
const file = process.argv[2];
if (!file) {
console.error('usage: bun run analyze-events.ts <dump.json>');
process.exit(1);
}
const raw = readFileSync(file, 'utf8');
// agent-browser eval --stdin wraps return values in quotes when the value is
// a string — so the JSON file may be double-encoded depending on how it was
// captured. Handle both.
const parsedOnce = JSON.parse(raw) as ProbeDump | string;
const dump: ProbeDump = typeof parsedOnce === 'string' ? JSON.parse(parsedOnce) : parsedOnce;
const { streamEvents = [], actionCalls = [], timeline = [] } = dump;
const pad = (v: unknown, n: number) => String(v).padStart(n);
// ── META ───────────────────────────────────────────────────────────
console.log('=== META ===');
console.log(` events: ${streamEvents.length}`);
console.log(` calls: ${actionCalls.length}`);
console.log(` timeline: ${timeline.length}`);
// ── 1. STREAM EVENTS (non-chunk) ───────────────────────────────────
const nonChunkEvents = streamEvents.filter((e) => e.type !== 'stream_chunk');
const chunkEvents = streamEvents.filter((e) => e.type === 'stream_chunk');
console.log(
`\n=== STREAM EVENTS (${nonChunkEvents.length} non-chunk + ${chunkEvents.length} chunks elided) ===`,
);
for (const e of nonChunkEvents) {
const dataStr = e.dataKeys?.length ? ` [${e.dataKeys.join(',')}]` : '';
const data = e.data as Record<string, unknown> | undefined;
const uiHint = data?.uiMessagesPreview
? ` uiPreview=${JSON.stringify(data.uiMessagesPreview)}`
: data?.uiMessagesTotal
? ` uiTotal=${data.uiMessagesTotal}`
: '';
const phaseHint = data?.phase ? ` phase=${data.phase}` : '';
const extra = e.serverType ? ` serverType=${e.serverType}` : '';
console.log(
` t=${pad(e.t, 7)} [${(e.transport ?? '?').padEnd(3)}] step=${pad(e.stepIndex ?? '-', 2)} ` +
`type=${(e.type ?? '').padEnd(22)} op=${e.opIdTail ?? '-'}${phaseHint}${uiHint}${extra}${dataStr}`,
);
}
// ── 2. CHUNK SUMMARY ───────────────────────────────────────────────
console.log('\n=== CHUNKS SUMMARY (per step / chunkType) ===');
const chunkBuckets = new Map<string, { count: number; firstT: number; lastT: number }>();
for (const c of chunkEvents) {
const data = c.data as Record<string, unknown> | undefined;
const ct = (data?.chunkType as string | undefined) ?? '?';
const key = `step=${c.stepIndex ?? '-'} chunkType=${ct.padEnd(8)} op=${c.opIdTail}`;
const slot = chunkBuckets.get(key);
if (slot) {
slot.count += 1;
slot.lastT = c.t;
} else {
chunkBuckets.set(key, { count: 1, firstT: c.t, lastT: c.t });
}
}
for (const [k, v] of chunkBuckets) {
console.log(` ${k} count=${pad(v.count, 4)} t=${pad(v.firstT, 7)}..${pad(v.lastT, 7)}`);
}
// ── 3. ACTION CALLS ───────────────────────────────────────────────
console.log('\n=== ACTION CALLS (replace/refresh/MARK) ===');
for (const c of actionCalls) {
if (c.name?.startsWith('MARK:')) {
console.log(` t=${pad(c.t, 7)} ${c.name}`);
continue;
}
const snapshot = (c.args as any)?.snapshot as
| Array<{ id: string; role: string; cLen: number; rLen: number }>
| undefined;
const snapStr = snapshot?.length
? ' snapshot=' + snapshot.map((m) => `${m.id}:${m.role}/c${m.cLen}/r${m.rLen}`).join(' | ')
: '';
const summary =
c.name === 'replaceMessages'
? `count=${c.args?.count} action=${(c.args?.params as any)?.action ?? '-'}${snapStr}`
: c.name === 'refreshMessages'
? `ctx=${JSON.stringify(c.args?.context)}`
: c.error
? `error=${c.error}`
: '';
console.log(` t=${pad(c.t, 7)} ${c.name.padEnd(20)} ${summary}`);
if (c.stack) {
const frames = c.stack
.split(' ← ')
.filter((f) => !!f && !f.includes('Object.<anonymous>'))
.slice(0, 3);
for (const f of frames) console.log(`${f}`);
}
}
// ── 4. CORRELATION ────────────────────────────────────────────────
function nearestEventForCall(
call: ProbeActionCall,
windowMs = 300,
): { event: ProbeStreamEvent; delta: number } | null {
let best: ProbeStreamEvent | null = null;
let bestDelta = Infinity;
for (const e of streamEvents) {
const d = Math.abs(e.t - call.t);
if (d < bestDelta && d <= windowMs) {
bestDelta = d;
best = e;
}
}
return best ? { event: best, delta: bestDelta } : null;
}
console.log('\n=== CORRELATION (replace/refresh ↔ nearest event within ±300ms) ===');
for (const c of actionCalls) {
if (c.name !== 'refreshMessages' && c.name !== 'replaceMessages') continue;
const hit = nearestEventForCall(c);
if (hit) {
const phase = (hit.event.data as Record<string, unknown> | undefined)?.phase;
console.log(
` t=${pad(c.t, 7)} ${c.name.padEnd(16)} ← Δ${pad(hit.delta, 4)}ms ${hit.event.type}` +
(phase ? ` phase=${phase}` : ''),
);
} else {
console.log(` t=${pad(c.t, 7)} ${c.name.padEnd(16)} ← (no event nearby — external trigger)`);
}
}
// ── 5. PER-KEY ASSISTANT GROWTH ───────────────────────────────────
// For each messagesMap key, find the trailing assistant message and report
// the points in time where its cLen / rLen actually changed. If the timeline
// shows chunks arriving but the assistant cLen never moves, that's the
// signature of "dispatch queue blocked / messageId mismatch".
console.log('\n=== PER-KEY ASSISTANT GROWTH ===');
const keysEverSeen = new Set<string>();
for (const s of timeline) for (const k of Object.keys(s.byKey ?? {})) keysEverSeen.add(k);
for (const key of keysEverSeen) {
console.log(`\n key=${key}`);
let lastSig: string | null = null;
for (const s of timeline) {
const slot = s.byKey?.[key];
if (!slot) continue;
const last = slot.msgs.at(-1) as ProbeMessageSummary | undefined;
if (!last) continue;
const sig = `${last.id}|c${last.cLen}|r${last.rLen}|n${slot.n}`;
if (sig === lastSig) continue;
lastSig = sig;
console.log(
` t=${pad(s.t, 7)} msgN=${pad(slot.n, 3)} ` +
`lastAssistant=${last.id} cLen=${pad(last.cLen, 5)} rLen=${pad(last.rLen, 5)}` +
` runOps=${s.runOps}`,
);
}
}
// ── 6. ROLLBACKS (active-topic msgN / childN / role drops) ─────────
console.log('\n=== ROLLBACKS (active-topic msgN / childN / role drops) ===');
let prev: ProbeTimelineSample | null = null;
const rollbacks: Array<{ t: number; topic: string | null; drops: string[] }> = [];
const flatten = (s: ProbeTimelineSample) => {
if (!s.activeTopic) return [];
return Object.entries(s.byKey ?? {})
.filter(([k]) => k.includes(s.activeTopic!))
.flatMap(([, v]) => v.msgs);
};
for (const s of timeline) {
if (s.err) {
prev = null;
continue;
}
if (!prev || prev.activeTopic !== s.activeTopic) {
prev = s;
continue;
}
const prevMsgs = flatten(prev);
const curMsgs = flatten(s);
const drops: string[] = [];
if (curMsgs.length < prevMsgs.length) drops.push(`msgN ${prevMsgs.length}${curMsgs.length}`);
let prevChild = 0;
let curChild = 0;
for (const m of prevMsgs) prevChild += m.chN ?? 0;
for (const m of curMsgs) curChild += m.chN ?? 0;
if (curChild < prevChild) drops.push(`childN ${prevChild}${curChild}`);
const prevById = new Map(prevMsgs.map((m) => [m.id, m]));
for (const m of curMsgs) {
const pr = prevById.get(m.id);
if (!pr) continue;
if (m.cLen < pr.cLen) drops.push(`cLen[${m.id}] ${pr.cLen}${m.cLen}`);
if (m.rLen < pr.rLen) drops.push(`rLen[${m.id}] ${pr.rLen}${m.rLen}`);
}
if (drops.length) rollbacks.push({ t: s.t, topic: s.activeTopic, drops });
prev = s;
}
if (rollbacks.length === 0) {
console.log(' (none)');
} else {
for (const r of rollbacks) {
const nearEvent = streamEvents
.filter((e) => Math.abs(e.t - r.t) <= 300)
.map((e) => `${e.type}${(e.data as any)?.phase ? ':' + (e.data as any).phase : ''}`);
const nearCall = actionCalls
.filter((c) => Math.abs(c.t - r.t) <= 300 && !c.name?.startsWith('MARK:'))
.map((c) => c.name);
console.log(
` t=${pad(r.t, 7)} topic=${r.topic} ${r.drops.join(' | ')}` +
(nearEvent.length ? ` near-event:[${nearEvent.join(',')}]` : '') +
(nearCall.length ? ` near-call:[${nearCall.join(',')}]` : ''),
);
}
}
@@ -1,119 +0,0 @@
#!/usr/bin/env node
// Analyze a probe dump captured by probe.js + probe-dump.js.
//
// node analyze.mjs /tmp/probe.json
//
// Prints:
// 1. EVENTS — user-action markers with their relative timestamps
// 2. TIMELINE — periodic samples (~1 per second + event-adjacent samples)
// showing every interesting field; columns:
// t(ms) | runOps | msgN | childN | content | reasoning | tools | domLen | search | crawl | topic | event
// 3. REGRESSIONS — every place a tracked counter *dropped* on the same
// topic between adjacent samples. A "true" UI rollback shows up as a
// drop in content/reasoning/tools/childN/domLen without a topic change.
//
// Whitelisted transitions (not flagged):
// - topic change → all drops expected (focus moved away)
// - reasoning length 0 after content starts → reasoning gets sealed into a
// completed sub-block; the parent's running reasoning resets to ''.
// - msgN drop when topic transitions from `_new` placeholder to a real id.
import fs from 'node:fs';
const file = process.argv[2];
if (!file) {
console.error('usage: node analyze.mjs <probe.json>');
process.exit(1);
}
const raw = JSON.parse(fs.readFileSync(file, 'utf8'));
// probe-dump.js wraps the payload in JSON.stringify so agent-browser returns
// it as a single quoted string. Unwrap.
const data = typeof raw === 'string' ? JSON.parse(raw) : raw;
const { events, samples } = data;
const fmt = {
pad(v, n) {
return String(v).padStart(n);
},
};
console.log('=== EVENTS ===');
for (const e of events) console.log(` t=${fmt.pad(e.t, 7)} ${e.name}`);
console.log(
'\n=== TIMELINE (~1s cadence, plus event-adjacent samples) ===\n' +
' t(ms) runOps msgN childN content reasoning tools domLen search crawl topic event',
);
let lastSampledAt = -1e9;
const eventBuckets = events.map((e) => e.t);
for (let i = 0; i < samples.length; i++) {
const s = samples[i];
const nearEvent = eventBuckets.some((et) => Math.abs(et - s.t) < 110);
if (!nearEvent && s.t - lastSampledAt < 1000) continue;
lastSampledAt = s.t;
const ev = events.find((e) => Math.abs(e.t - s.t) < 110);
const evMarker = ev ? `${ev.name}` : '';
const topicSuffix = s.topicId ? s.topicId.slice(-6) : '(none)';
const search = s.ind?.search ?? 0;
const crawl = s.ind?.crawl ?? 0;
console.log(
` ${fmt.pad(s.t, 6)} ` +
`${fmt.pad(s.runOps, 6)} ` +
`${fmt.pad(s.msgN, 4)} ` +
`${fmt.pad(s.childN ?? 0, 5)} ` +
`${fmt.pad(s.cT ?? 0, 8)} ` +
`${fmt.pad(s.rT ?? 0, 9)} ` +
`${fmt.pad(s.toolT ?? 0, 5)} ` +
`${fmt.pad(s.domLen ?? 0, 7)} ` +
`${fmt.pad(search, 6)} ` +
`${fmt.pad(crawl, 5)} ` +
`${topicSuffix.padEnd(8)}${evMarker}`,
);
}
console.log('\n=== REGRESSIONS (same topic, value dropped) ===');
const regressions = [];
for (let i = 1; i < samples.length; i++) {
const prev = samples[i - 1];
const cur = samples[i];
if (!cur.topicId || prev.topicId !== cur.topicId) continue;
const drops = [];
if (cur.msgN < prev.msgN) drops.push(`msgN: ${prev.msgN}${cur.msgN}`);
if ((cur.childN ?? 0) < (prev.childN ?? 0)) drops.push(`childN: ${prev.childN}${cur.childN}`);
if ((cur.cT ?? 0) < (prev.cT ?? 0)) drops.push(`content: ${prev.cT}${cur.cT}`);
if ((cur.rT ?? 0) < (prev.rT ?? 0)) drops.push(`reasoning: ${prev.rT}${cur.rT}`);
if ((cur.toolT ?? 0) < (prev.toolT ?? 0)) drops.push(`tools: ${prev.toolT}${cur.toolT}`);
// domLen jitters by a few chars from counter labels — only flag big drops.
if ((cur.domLen ?? 0) < (prev.domLen ?? 0) - 100) {
drops.push(`domLen: ${prev.domLen}${cur.domLen}`);
}
if (drops.length === 0) continue;
const nearbyEv = events.filter((e) => Math.abs(e.t - cur.t) < 600).map((e) => e.name);
regressions.push({ t: cur.t, topic: cur.topicId.slice(-6), drops, nearbyEv });
}
if (regressions.length === 0) {
console.log(' (none)');
} else {
for (const r of regressions) {
const evStr = r.nearbyEv.length ? ` near:[${r.nearbyEv.join(',')}]` : '';
console.log(` t=${fmt.pad(r.t, 7)} topic=${r.topic} ${r.drops.join(' | ')}${evStr}`);
}
}
console.log(`\n=== SUMMARY ===`);
console.log(` samples: ${samples.length}`);
console.log(` events: ${events.length}`);
console.log(` regressions: ${regressions.length}`);
if (samples.length) {
const last = samples.at(-1);
console.log(
` final: msgN=${last.msgN} childN=${last.childN ?? 0} content=${last.cT ?? 0} ` +
`reasoning=${last.rT ?? 0} tools=${last.toolT ?? 0} runOps=${last.runOps}`,
);
}
@@ -1,17 +0,0 @@
// Stop the probe and serialize collected data.
//
// agent-browser --cdp 9222 eval --stdin < probe-dump.js > /tmp/probe.json
//
// The whole thing is wrapped in a JSON.stringify so agent-browser returns it
// as a single quoted string — the analyzer double-parses to handle that.
(function () {
if (window.__PROBE_TIMER) {
clearInterval(window.__PROBE_TIMER);
window.__PROBE_TIMER = null;
}
return JSON.stringify({
events: window.__PROBE_EVENTS || [],
samples: window.__PROBE_SAMPLES || [],
});
})();
@@ -1,37 +0,0 @@
// Stops the events-probe timeline timer and stashes the full capture as a
// JSON string on `window.__PROBE_LAST_DUMP_JSON`. `run.ts` wraps the bundle
// in an IIFE that returns that global, which `agent-browser eval` prints to
// stdout — the runner then persists it under `.agent-gateway/`.
import type { ProbeDump } from './types';
declare global {
interface Window {
__PROBE_LAST_DUMP_JSON?: string;
}
}
const w = window;
if (w.__PROBE_TIMELINE_TIMER) {
clearInterval(w.__PROBE_TIMELINE_TIMER);
w.__PROBE_TIMELINE_TIMER = null;
}
const mutations = w.__PROBE_MUTATIONS ?? [];
const dump: ProbeDump & { mutations: typeof mutations } = {
meta: {
t0: w.__PROBE_T0 ?? 0,
collectedAt: Date.now(),
sampleCount: (w.__PROBE_MSG_TIMELINE ?? []).length,
eventCount: (w.__PROBE_STREAM_EVENTS ?? []).length,
callCount: (w.__PROBE_ACTION_CALLS ?? []).length,
},
streamEvents: w.__PROBE_STREAM_EVENTS ?? [],
actionCalls: w.__PROBE_ACTION_CALLS ?? [],
timeline: w.__PROBE_MSG_TIMELINE ?? [],
mutations,
};
w.__PROBE_LAST_DUMP_JSON = JSON.stringify(dump);
@@ -1,637 +0,0 @@
// LobeHub gateway raw-event-stream probe.
//
// Gateway-mode chats subscribe via WebSocket — NOT via the `/api/agent/stream`
// SSE endpoint (that one belongs to the direct/client durable-agent runtime).
// `AgentStreamClient` (`packages/agent-gateway-client/src/client.ts`) opens
// `new WebSocket('wss://.../ws?operationId=...')`, then parses JSON frames in
// its `onmessage` handler and re-emits `agent_event.event` objects to the
// chat store.
//
// To capture the RAW gateway events before the store touches them, we wrap
// `window.WebSocket` so that for any socket whose URL contains `operationId=`
// we intercept the `onmessage` handler / `addEventListener('message')` and
// log every `agent_event` frame.
//
// We *also* keep the `window.fetch` hook for `/api/agent/stream` so this
// probe still works for direct-mode runs — but gateway-mode events come
// through the WebSocket path.
//
// Buffers (read via `dump`):
// __PROBE_STREAM_EVENTS — raw events parsed off the wire
// __PROBE_ACTION_CALLS — replaceMessages / refreshMessages calls (best-effort)
// __PROBE_MSG_TIMELINE — 200ms snapshots of every messagesMap key
import type {
ProbeActionCall,
ProbeMessageSummary,
ProbeStreamEvent,
ProbeTimelineSample,
} from './types';
// Bundled by esbuild as an IIFE. Top-level code runs once on injection.
const w = window;
// ── Buffers ─────────────────────────────────────────────────────────
declare global {
interface Window {
__PROBE_MUTATIONS?: Array<{
t: number;
key: string;
n: number;
last?: { id: string; role: string; cLen: number; rLen: number; updatedAt?: unknown };
prevLast?: { id: string; role: string; cLen: number; rLen: number };
delta?: string;
}>;
__PROBE_STORE_UNSUB?: () => void;
}
}
const events: ProbeStreamEvent[] = (w.__PROBE_STREAM_EVENTS ??= []);
const calls: ProbeActionCall[] = (w.__PROBE_ACTION_CALLS ??= []);
const timeline: ProbeTimelineSample[] = (w.__PROBE_MSG_TIMELINE ??= []);
const mutations = (w.__PROBE_MUTATIONS ??= []);
events.length = 0;
calls.length = 0;
timeline.length = 0;
mutations.length = 0;
const t0 = Date.now();
w.__PROBE_T0 = t0;
const now = (): number => Date.now() - t0;
// ── Helpers ─────────────────────────────────────────────────────────
function summarizeData(data: unknown): Record<string, unknown> | unknown {
if (!data || typeof data !== 'object') return data;
const src = data as Record<string, unknown>;
const out: Record<string, unknown> = {};
for (const k of Object.keys(src)) {
const v = src[k];
if (v == null) {
out[k] = v;
} else if (Array.isArray(v)) {
out[k] = `Array(${v.length})`;
if (k === 'uiMessages') {
out.uiMessagesPreview = v.slice(0, 5).map((m: any) => ({
id: (m.id ?? '').slice(-8),
role: m.role,
cLen: (m.content ?? '').length,
children: (m.children ?? []).length,
tools: (m.tools ?? []).length,
reasoning: (m.reasoning?.content ?? '').length,
}));
out.uiMessagesTotal = v.length;
}
} else if (typeof v === 'object') {
const obj = v as Record<string, unknown>;
out[k] =
'Object{' +
Object.keys(obj)
.slice(0, 6)
.map((kk) => kk + (typeof obj[kk] === 'string' ? `=${(obj[kk] as string).length}ch` : ''))
.join(',') +
'}';
} else if (typeof v === 'string') {
out[k] = v.length > 100 ? v.slice(0, 100) + `…(${v.length})` : v;
} else {
out[k] = v;
}
}
return out;
}
function summarizeMessages(msgs: any[]): ProbeMessageSummary[] {
return (msgs ?? []).slice(0, 80).map((m) => ({
id: (m.id ?? '').slice(-8),
role: m.role,
cLen: (m.content ?? '').length,
rLen: (m.reasoning?.content ?? '').length,
tools: (m.tools ?? []).length,
chN: (m.children ?? []).length,
}));
}
function shortStack(): string {
const raw = new Error('probe-stack').stack ?? '';
return raw
.split('\n')
.slice(3)
.filter((l) => !l.includes('probe-events') && !l.includes('node_modules'))
.map((l) => l.trim().replace(/^at\s+/, ''))
.slice(0, 6)
.join(' ← ');
}
function recordAgentEvent(args: {
transport: 'ws' | 'sse';
opId: string | null;
agentEvent: any;
eventId?: string | null;
rawLen?: number;
}): void {
const { transport, opId, agentEvent, eventId, rawLen } = args;
if (!agentEvent || typeof agentEvent !== 'object') return;
events.push({
t: now(),
transport,
opIdTail: (opId ?? '').slice(-10),
eventId: eventId ?? null,
type: agentEvent.type,
stepIndex: agentEvent.stepIndex,
dataKeys: agentEvent.data ? Object.keys(agentEvent.data) : [],
data: summarizeData(agentEvent.data) as Record<string, unknown>,
rawLen,
});
}
// ── 1. Patch window.WebSocket for gateway WS events ────────────────
if (!w.__PROBE_ORIG_WEBSOCKET) w.__PROBE_ORIG_WEBSOCKET = w.WebSocket;
const OrigWS = w.__PROBE_ORIG_WEBSOCKET;
function extractOpIdFromWsUrl(url: string | URL): string | null {
const m = String(url ?? '').match(/operationId=([^&]+)/);
return m ? decodeURIComponent(m[1]) : null;
}
function isGatewayWs(url: string | URL): boolean {
return String(url ?? '').includes('operationId=');
}
function handleWsFrame(rawData: unknown, opId: string | null): void {
const rawLen = typeof rawData === 'string' ? rawData.length : -1;
let parsed: any;
try {
parsed = typeof rawData === 'string' ? JSON.parse(rawData) : null;
} catch {
events.push({
t: now(),
transport: 'ws',
opIdTail: (opId ?? '').slice(-10),
type: '_PARSE_ERROR_',
raw: typeof rawData === 'string' && rawData.length < 400 ? rawData : '(non-string or large)',
});
return;
}
if (!parsed) return;
if (parsed.type === 'agent_event') {
recordAgentEvent({
transport: 'ws',
opId,
agentEvent: parsed.event,
eventId: parsed.id,
rawLen,
});
} else {
events.push({
t: now(),
transport: 'ws',
opIdTail: (opId ?? '').slice(-10),
type: '_SERVER_MSG_',
serverType: parsed.type,
rawLen,
});
}
}
// Wrap the constructor. Instance `constructor` will still reflect OrigWS
// (we share prototypes), so use the `_WS_OPEN_` sentinel events to confirm
// the patch is firing.
function PatchedWebSocket(this: WebSocket, url: string | URL, protocols?: string | string[]) {
const ws: WebSocket = protocols == null ? new OrigWS(url) : new OrigWS(url, protocols);
const opId = extractOpIdFromWsUrl(url);
if (!isGatewayWs(url)) return ws;
events.push({
t: now(),
transport: 'ws',
opIdTail: (opId ?? '').slice(-10),
type: '_WS_OPEN_',
url: String(url),
});
// One observer listener that always fires, regardless of how the consumer
// (AgentStreamClient uses `ws.onmessage = …`) subscribes.
ws.addEventListener('message', (e) => {
try {
handleWsFrame((e as MessageEvent).data, opId);
} catch {
/* swallow */
}
});
ws.addEventListener('close', () => {
events.push({
t: now(),
transport: 'ws',
opIdTail: (opId ?? '').slice(-10),
type: '_WS_CLOSE_',
});
});
return ws;
}
// Preserve prototype + static fields so `instanceof WebSocket` and
// `WebSocket.OPEN` constants still work.
(PatchedWebSocket as unknown as { prototype: WebSocket }).prototype = OrigWS.prototype;
for (const k of Object.keys(OrigWS) as Array<keyof typeof OrigWS>) {
try {
(PatchedWebSocket as any)[k] = (OrigWS as any)[k];
} catch {
/* readonly */
}
}
(['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'] as const).forEach((k) => {
(PatchedWebSocket as any)[k] = (OrigWS as any)[k];
});
w.WebSocket = PatchedWebSocket as unknown as typeof WebSocket;
// ── 2. Patch window.fetch for `/api/agent/stream` (direct-mode SSE) ─
if (!w.__PROBE_ORIG_FETCH) w.__PROBE_ORIG_FETCH = w.fetch.bind(w);
const origFetch = w.__PROBE_ORIG_FETCH;
function isAgentStreamUrl(input: RequestInfo | URL): boolean {
let url = '';
if (typeof input === 'string') url = input;
else if (input instanceof URL) url = input.toString();
else if (input && typeof (input as Request).url === 'string') url = (input as Request).url;
return url.includes('/api/agent/stream');
}
function extractOpIdFromHttpUrl(input: RequestInfo | URL): string | null {
const url = typeof input === 'string' ? input : (input as Request | URL).toString();
const m = url.match(/operationId=([^&]+)/);
return m ? decodeURIComponent(m[1]) : null;
}
function pushFromSSEFrame(rawFrame: string, opId: string | null): void {
const lines = rawFrame.split('\n');
let dataJson = '';
let evtName = 'message';
for (const line of lines) {
if (line.startsWith('event:')) evtName = line.slice(6).trim();
else if (line.startsWith('data:')) dataJson += line.slice(5).trim();
}
if (!dataJson) return;
let parsed: any;
try {
parsed = JSON.parse(dataJson);
} catch {
events.push({
t: now(),
transport: 'sse',
opIdTail: (opId ?? '').slice(-10),
type: '_PARSE_ERROR_',
sseEvent: evtName,
raw: dataJson.length > 400 ? dataJson.slice(0, 400) + '…' : dataJson,
});
return;
}
recordAgentEvent({
transport: 'sse',
opId,
agentEvent: parsed,
eventId: null,
rawLen: dataJson.length,
});
}
async function teeAndDrain(response: Response, opId: string | null): Promise<Response> {
if (!response.body) return response;
const [a, b] = response.body.tee();
void (async () => {
const reader = b.getReader();
const decoder = new TextDecoder();
let buf = '';
try {
while (true) {
const { value, done } = await reader.read();
if (done) break;
buf += decoder.decode(value, { stream: true });
let idx: number;
while ((idx = buf.indexOf('\n\n')) !== -1) {
const frame = buf.slice(0, idx);
buf = buf.slice(idx + 2);
if (frame.trim()) pushFromSSEFrame(frame, opId);
}
}
if (buf.trim()) pushFromSSEFrame(buf, opId);
} catch (e: any) {
events.push({
t: now(),
transport: 'sse',
opIdTail: (opId ?? '').slice(-10),
type: '_TEE_ERROR_',
message: String(e?.message ?? e),
});
}
})();
return new Response(a, {
headers: response.headers,
status: response.status,
statusText: response.statusText,
});
}
w.fetch = async function patchedFetch(input: RequestInfo | URL, init?: RequestInit) {
const response = await origFetch(input as any, init);
if (!isAgentStreamUrl(input)) return response;
const opId = extractOpIdFromHttpUrl(input);
const url =
typeof input === 'string'
? input.split('?')[0]
: (input as Request | URL).toString().split('?')[0];
events.push({
t: now(),
transport: 'sse',
opIdTail: (opId ?? '').slice(-10),
type: '_CONNECTED_',
url,
status: response.status,
});
return teeAndDrain(response, opId);
} as typeof fetch;
// ── 3. Wrap store actions (best-effort for "who called replace") ────
// Side-global stash for the original chat-store actions. Re-installs ALWAYS
// rewrap from the originals so updates to the probe body take effect
// without a page reload — using only a `__probeWrapped` flag on the chat
// state object would freeze the first-installed wrapper across re-installs.
declare global {
interface Window {
__PROBE_ORIG_REFRESH_MESSAGES?: any;
__PROBE_ORIG_REPLACE_MESSAGES?: any;
}
}
try {
const chat = w.__LOBE_STORES?.chat?.();
if (chat) {
// First-time install: cache the originals. Re-install: restore from
// the cached originals before wrapping again.
if (!w.__PROBE_ORIG_REFRESH_MESSAGES) w.__PROBE_ORIG_REFRESH_MESSAGES = chat.refreshMessages;
if (!w.__PROBE_ORIG_REPLACE_MESSAGES) w.__PROBE_ORIG_REPLACE_MESSAGES = chat.replaceMessages;
const origRefresh = w.__PROBE_ORIG_REFRESH_MESSAGES;
const origReplace = w.__PROBE_ORIG_REPLACE_MESSAGES;
chat.refreshMessages = origRefresh;
chat.replaceMessages = origReplace;
chat.refreshMessages = async function probeRefresh(this: unknown, ...args: any[]) {
calls.push({
t: now(),
name: 'refreshMessages',
args: { context: args[0] ?? null },
stack: shortStack(),
});
return origRefresh.apply(this, args);
};
chat.replaceMessages = function probeReplace(this: unknown, ...args: any[]) {
const msgs = (args[0] as any[]) ?? [];
const snapshot = msgs.slice(-2).map((m) => ({
id: (m.id ?? '').slice(-8),
role: m.role,
cLen: (m.content ?? '').length,
rLen: (m.reasoning?.content ?? '').length,
updatedAt: m.updatedAt,
}));
calls.push({
t: now(),
name: 'replaceMessages',
args: { count: msgs.length, params: args[1] ?? null, snapshot } as any,
stack: shortStack(),
});
// Pair the call with a mutation row so the analyzer can build a
// single ordered timeline across replaceMessages + dispatchMessage.
const stackTop = shortStack().split(' ← ')[0]?.slice(0, 80);
const last = msgs.at(-1);
const lastSum = last
? {
id: (last.id ?? '').slice(-8),
role: last.role,
cLen: (last.content ?? '').length,
rLen: (last.reasoning?.content ?? '').length,
updatedAt: last.updatedAt,
}
: undefined;
const params: any = args[1] ?? {};
const ctxKey = params.context
? `main_${params.context.agentId ?? '?'}_${
params.context.topicId ? 'tpc_' + params.context.topicId : 'new'
}`.replace('main_tpc_', 'main_') // crude key inference
: '(no-ctx)';
mutations.push({
t: now(),
key: ctxKey,
n: msgs.length,
last: lastSum,
delta: `replaceMessages(action=${params.action ?? '-'}) src=${stackTop ?? '-'}`,
});
return origReplace.apply(this, args);
};
}
} catch (e: any) {
calls.push({ t: now(), name: '_WRAP_ERROR_', error: String(e?.message ?? e) });
}
// ── 3.5. Mutation log — wrap the TWO ChatStore writers (replaceMessages,
// internal_dispatchMessage) to record EVERY dbMessagesMap[key] reference
// change with a one-line "before/after last assistant message" delta. This
// reveals dispatchMessage-driven collapses that the replaceMessages wrap
// alone cannot see.
declare global {
interface Window {
__PROBE_ORIG_DISPATCH_MESSAGE?: any;
}
}
try {
const chat = w.__LOBE_STORES?.chat?.();
if (chat?.internal_dispatchMessage) {
if (!w.__PROBE_ORIG_DISPATCH_MESSAGE)
w.__PROBE_ORIG_DISPATCH_MESSAGE = chat.internal_dispatchMessage;
const origDispatch = w.__PROBE_ORIG_DISPATCH_MESSAGE;
chat.internal_dispatchMessage = origDispatch;
chat.internal_dispatchMessage = function probeDispatch(this: unknown, payload: any, ctx?: any) {
// Snapshot BEFORE — read the would-be target key + last message.
const before = (() => {
try {
const state = w.__LOBE_STORES?.chat?.();
if (!state) return null;
// Replicate state.internal_getConversationContext logic enough to
// resolve a key — but most callers pass operationId on ctx, and
// operationId-keyed lookup needs store internals. Easiest: snapshot
// ALL keys' last-assistant cLen and compare BEFORE vs AFTER below.
const map = state.dbMessagesMap ?? {};
const out: Record<string, any> = {};
for (const k of Object.keys(map)) {
const last = (map[k] ?? []).at(-1);
out[k] = last
? {
id: (last.id ?? '').slice(-8),
cLen: (last.content ?? '').length,
rLen: (last.reasoning?.content ?? '').length,
n: map[k].length,
}
: { n: 0 };
}
return out;
} catch {
return null;
}
})();
const result = origDispatch.apply(this, [payload, ctx]);
// Snapshot AFTER — find which key(s) actually changed.
try {
const state = w.__LOBE_STORES?.chat?.();
if (state && before) {
const map = state.dbMessagesMap ?? {};
for (const k of Object.keys(map)) {
const last = (map[k] ?? []).at(-1);
const beforeSnap = before[k];
const afterSnap = last
? {
id: (last.id ?? '').slice(-8),
cLen: (last.content ?? '').length,
rLen: (last.reasoning?.content ?? '').length,
n: map[k].length,
}
: { n: 0 };
const changed =
!beforeSnap ||
beforeSnap.n !== afterSnap.n ||
beforeSnap.id !== (afterSnap as any).id ||
beforeSnap.cLen !== (afterSnap as any).cLen ||
beforeSnap.rLen !== (afterSnap as any).rLen;
if (!changed) continue;
let delta = '';
if (beforeSnap?.id !== undefined && beforeSnap.id !== (afterSnap as any).id)
delta += `id:${beforeSnap.id}${(afterSnap as any).id};`;
if (
beforeSnap?.cLen !== undefined &&
(afterSnap as any).cLen !== undefined &&
(afterSnap as any).cLen < beforeSnap.cLen
)
delta += `cLen↓${beforeSnap.cLen}${(afterSnap as any).cLen};`;
if (
beforeSnap?.rLen !== undefined &&
(afterSnap as any).rLen !== undefined &&
(afterSnap as any).rLen < beforeSnap.rLen
)
delta += `rLen↓${beforeSnap.rLen}${(afterSnap as any).rLen};`;
if (beforeSnap?.n !== undefined && afterSnap.n < beforeSnap.n)
delta += `n↓${beforeSnap.n}${afterSnap.n};`;
mutations.push({
t: now(),
key: k,
n: afterSnap.n,
last: (afterSnap as any).id ? (afterSnap as any) : undefined,
prevLast: beforeSnap?.id ? beforeSnap : undefined,
delta: delta || `dispatch:${payload?.type}`,
});
}
}
} catch (e: any) {
mutations.push({
t: now(),
key: '_DISPATCH_PROBE_ERROR_',
n: -1,
delta: String(e?.message ?? e),
});
}
return result;
};
}
} catch (e: any) {
calls.push({ t: now(), name: '_DISPATCH_WRAP_ERROR_', error: String(e?.message ?? e) });
}
// ── 4. Periodic per-key timeline snapshots ─────────────────────────
function captureTimeline(): void {
try {
const c = w.__LOBE_STORES?.chat?.();
if (!c) return;
const msgsMap = (c.messagesMap ?? {}) as Record<string, any[]>;
const dbMap = (c.dbMessagesMap ?? {}) as Record<string, any[]>;
const byKey: ProbeTimelineSample['byKey'] = {};
for (const k of Object.keys(msgsMap)) {
const display = msgsMap[k] ?? [];
const db = dbMap[k] ?? [];
if (display.length === 0 && db.length === 0) continue;
byKey[k] = {
n: display.length,
dbN: db.length,
msgs: summarizeMessages(display),
};
}
const ops = Object.values((c.operations ?? {}) as Record<string, any>);
timeline.push({
t: now(),
activeTopic: ((c.activeTopicId as string | null) ?? '').slice(-10) || null,
keys: Object.keys(byKey),
byKey,
runOps: ops.filter((o: any) => o.status === 'running').length,
});
} catch (e: any) {
timeline.push({
t: now(),
activeTopic: null,
keys: [],
byKey: {},
runOps: 0,
err: e?.message ?? String(e),
});
}
}
captureTimeline();
if (w.__PROBE_TIMELINE_TIMER) clearInterval(w.__PROBE_TIMELINE_TIMER);
w.__PROBE_TIMELINE_TIMER = setInterval(captureTimeline, 200);
// ── 5. Tab-switch helpers ──────────────────────────────────────────
function listTopBarTabs(): HTMLElement[] {
return Array.from(
document.querySelectorAll<HTMLElement>(
'[data-insp-path*="TabItem.tsx"][data-contextmenu-trigger]',
),
).filter((t) => t.getBoundingClientRect().top < 30);
}
w.__listTabs = () =>
listTopBarTabs().map((t, i) => ({
i,
key: t.getAttribute('data-contextmenu-trigger'),
active: t.getAttribute('data-active') === 'true',
title: (t.innerText ?? '').slice(0, 60),
}));
w.__clickTabByKey = (key: string) => {
const tab = listTopBarTabs().find((t) => t.getAttribute('data-contextmenu-trigger') === key);
if (!tab) return 'not found: ' + key;
if (tab.getAttribute('data-active') === 'true') return 'already active: ' + key;
tab.click();
return 'clicked key=' + key;
};
w.__PROBE_EVENT = (name: string) => {
calls.push({ t: now(), name: 'MARK:' + name });
};
// `run.ts` wraps the bundle in an IIFE and appends a `return <confirmation>`
// after the bundle body — agent-browser then prints the confirmation back to
// the operator. Nothing to do here at the end of the module body.
@@ -1,204 +0,0 @@
// LobeHub chat streaming time-series probe.
//
// Inject into the renderer (via agent-browser eval) to record store + DOM
// snapshots every 200ms during a streaming session. Designed to surface
// "UI rolled back to an earlier state" symptoms — especially around
// gateway-mode tab switches that happen while the assistant is still writing.
//
// Usage:
// agent-browser --cdp 9222 eval --stdin < probe.js
// # ...do test interactions, call window.__PROBE_EVENT('LABEL') to mark moments...
// agent-browser --cdp 9222 eval --stdin < probe-dump.js > /tmp/probe.json
// node analyze.mjs /tmp/probe.json
//
// What it captures per sample:
// - activeTopicId
// - msgN: top-level messages in chat.messagesMap for this topic
// - childN: total assistantGroup.children blocks across all msgs (THIS is
// where streaming content actually lives — top-level assistantGroup stays empty)
// - cT / rT / toolT: totals across messages AND their children
// (content, reasoning, tool-call count)
// - perMsg: per-message breakdown so regressions can be located precisely
// - runOps: number of running operations (execServerAgentRuntime etc.)
// - domLen: total innerText length of the rendered chat list area
// - ind: visible UI indicators (Search pages, Crawled pages, Deeply Thought, Sending)
//
// Event markers: window.__PROBE_EVENT('NAME') records {t, name} into
// __PROBE_EVENTS, used by the analyzer to align state changes with
// user-driven actions (SENT, AWAY_1, BACK_1, ...).
(function () {
if (window.__PROBE_TIMER) clearInterval(window.__PROBE_TIMER);
window.__PROBE_SAMPLES = [];
window.__PROBE_EVENTS = [];
const t0 = Date.now();
function snapshot() {
try {
const chat = window.__LOBE_STORES.chat();
const topicId = chat.activeTopicId;
const idTail = topicId ? topicId.replace('tpc_', '') : null;
const keys = Object.keys(chat.messagesMap || {});
// Collect messages for the active topic. Before a topic is committed,
// optimistic messages live under the `<agentScope>_new` key — fall
// back to those when no topic is active yet.
let msgs = [];
if (idTail) {
keys.forEach((k) => {
if (k.includes(idTail)) msgs = msgs.concat(chat.messagesMap[k] || []);
});
} else {
keys
.filter((k) => k.endsWith('_new'))
.forEach((k) => {
msgs = msgs.concat(chat.messagesMap[k] || []);
});
}
// Walk top-level + assistantGroup.children. children carry the actual
// streamed content / reasoning / tool calls; the parent assistantGroup
// remains a placeholder (cLen=0, rLen=0) for its whole lifetime.
let totalContent = 0;
let totalReason = 0;
let totalTools = 0;
let childCount = 0;
const perMsg = msgs.map((m) => {
const cLen = (m.content || '').length;
const rLen = ((m.reasoning && m.reasoning.content) || '').length;
const tools = (m.tools || []).length;
totalContent += cLen;
totalReason += rLen;
totalTools += tools;
const children = m.children || [];
let chC = 0;
let chR = 0;
let chT = 0;
children.forEach((c) => {
chC += (c.content || '').length;
chR += ((c.reasoning && c.reasoning.content) || '').length;
chT += (c.tools || []).length;
});
totalContent += chC;
totalReason += chR;
totalTools += chT;
childCount += children.length;
return {
id: (m.id || '').slice(-8),
role: m.role,
cLen,
rLen,
tools,
chCount: children.length,
chC,
chR,
chT,
};
});
const ops = Object.values(chat.operations || {});
const runningOps = ops.filter((o) => o.status === 'running');
// DOM probe: total rendered text in the chat scroll area (proxy for
// "how much is actually visible to the user").
const convScroll =
document.querySelector(
'[data-chat-list], [class*="ChatList"], [class*="ConversationList"]',
) ||
document.querySelector('main [class*="scroll"]') ||
document.querySelector('main');
const domTxt = convScroll ? convScroll.innerText || '' : '';
const bodyTxt = document.body.innerText || '';
const searchMatches = (bodyTxt.match(/Search pages?:|Searched the web/g) || []).length;
const crawlMatches = (bodyTxt.match(/Crawl(ed|ing) pages?/g) || []).length;
window.__PROBE_SAMPLES.push({
t: Date.now() - t0,
topicId,
msgN: msgs.length,
childN: childCount,
cT: totalContent,
rT: totalReason,
toolT: totalTools,
perMsg,
runOps: runningOps.length,
runOpTypes: runningOps.map((o) => o.type),
domLen: domTxt.length,
ind: {
search: searchMatches,
crawl: crawlMatches,
sending: bodyTxt.includes('Sending message'),
deeplyThinking: bodyTxt.includes('Deeply Thinking'),
deeplyThought: bodyTxt.includes('Deeply Thought'),
},
});
} catch (e) {
window.__PROBE_SAMPLES.push({ t: Date.now() - t0, err: e.message });
}
}
snapshot();
window.__PROBE_TIMER = setInterval(snapshot, 200);
window.__PROBE_EVENT = function (name) {
window.__PROBE_EVENTS.push({ t: Date.now() - t0, name });
};
// Tab-switch helpers installed alongside the probe.
//
// The Electron tab bar mounts each tab as a div with data-insp-path
// ending in `TabItem.tsx:...`. The active tab is marked with
// data-active="true". DO NOT search by innerText — the active tab's text
// includes a ` · <agent name>` suffix that produces false matches when
// your search string happens to overlap with the agent name.
function listTabs() {
return Array.from(
document.querySelectorAll('[data-insp-path*="TabItem.tsx"][data-contextmenu-trigger]'),
).filter((t) => t.getBoundingClientRect().top < 30);
}
function tabKey(el) {
// Stable for the tab's lifetime; survives focus changes.
return el.getAttribute('data-contextmenu-trigger');
}
function findActiveTab() {
return listTabs().find((t) => t.getAttribute('data-active') === 'true') || null;
}
// Click by stable key captured earlier (preferred for round-trips).
window.__clickTabByKey = function (key) {
const tab = listTabs().find((t) => tabKey(t) === key);
if (!tab) return 'not found: key=' + key;
if (tab.getAttribute('data-active') === 'true') return 'already active: ' + key;
tab.click();
return 'clicked key=' + key;
};
// Click by index in the tab strip (0-based, left-to-right).
window.__clickTabByIndex = function (i) {
const tabs = listTabs();
if (i < 0 || i >= tabs.length) return 'index out of range: ' + i + '/' + tabs.length;
const t = tabs[i];
if (t.getAttribute('data-active') === 'true') return 'already active: i=' + i;
t.click();
return 'clicked i=' + i + ' key=' + tabKey(t);
};
// Snapshot all tabs in order: [{key, active, title (first 60 chars of innerText)}]
window.__listTabs = function () {
return listTabs().map((t, i) => ({
i,
key: tabKey(t),
active: t.getAttribute('data-active') === 'true',
title: (t.innerText || '').slice(0, 60),
}));
};
window.__activeTabKey = function () {
const a = findActiveTab();
return a ? tabKey(a) : null;
};
return 'probe installed';
})();
@@ -1,211 +0,0 @@
// CLI for the agent-gateway probe.
//
// Bundles the TS probes with esbuild, pipes them into `agent-browser eval`,
// and persists dumps under `.agent-gateway/` (gitignored) for later use as
// streaming-replay test fixtures.
//
// Commands:
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts install
// Bundle probe-events.ts and inject into the CDP-attached browser.
// Re-installing clears all buffers and re-patches WebSocket / fetch.
//
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts dump [name]
// Stop the timeline timer, fetch the capture as JSON, write it to
// `.agent-gateway/<name>-<YYYYMMDD-HHmmss>.json`. `name` defaults to
// `dump`. Prints the absolute path written.
//
// bun run .agents/skills/local-testing/scripts/agent-gateway/run.ts analyze [path]
// Run analyze-events.ts on the dump. `path` defaults to the most
// recently modified file in `.agent-gateway/`.
//
// Optional flags:
// --cdp <port> CDP port (default 9222)
// --browser <bin> agent-browser binary (default 'agent-browser')
import { spawn } from 'node:child_process';
import { mkdirSync, readdirSync, statSync, writeFileSync } from 'node:fs';
import path from 'node:path';
import { fileURLToPath } from 'node:url';
const SCRIPT_DIR = path.dirname(fileURLToPath(import.meta.url));
// .agents/skills/local-testing/scripts/agent-gateway/ → 5 levels up
const PROJECT_ROOT = path.resolve(SCRIPT_DIR, '../../../../..');
const DUMP_DIR = path.join(PROJECT_ROOT, '.agent-gateway');
interface Flags {
browser: string;
cdp: string;
positional: string[];
}
function parseFlags(argv: string[]): Flags {
const out: Flags = { cdp: '9222', browser: 'agent-browser', positional: [] };
for (let i = 0; i < argv.length; i++) {
const a = argv[i];
if (a === '--cdp') out.cdp = argv[++i] ?? out.cdp;
else if (a === '--browser') out.browser = argv[++i] ?? out.browser;
else out.positional.push(a);
}
return out;
}
async function bundle(entry: string): Promise<string> {
// Bun.build is built into the Bun runtime — no external dep needed.
const r = await Bun.build({
entrypoints: [path.join(SCRIPT_DIR, entry)],
target: 'browser',
format: 'esm',
minify: false,
});
if (!r.success) {
const msgs = r.logs.map((l) => `${l.level}: ${l.message}`).join('\n');
throw new Error(`bundle failed for ${entry}:\n${msgs}`);
}
return await r.outputs[0].text();
}
function wrapIife(body: string, returnExpr: string): string {
// Wrap as an IIFE that swallows the bundled top-level (top-level `const`
// declarations get scoped to the IIFE, so re-injection doesn't conflict)
// and returns the configured expression — which `agent-browser eval`
// captures and prints to stdout.
return `(() => {\n${body}\n;return ${returnExpr};\n})()`;
}
function runAgentBrowserEval(flags: Flags, script: string): Promise<string> {
return new Promise((resolveP, rejectP) => {
const child = spawn(flags.browser, ['--cdp', flags.cdp, 'eval', '--stdin'], {
stdio: ['pipe', 'pipe', 'inherit'],
});
let stdout = '';
child.stdout.on('data', (chunk: Buffer) => {
stdout += chunk.toString('utf8');
});
child.on('error', rejectP);
child.on('close', (code) => {
if (code === 0) resolveP(stdout);
else rejectP(new Error(`agent-browser exited ${code}`));
});
child.stdin.write(script);
child.stdin.end();
});
}
// agent-browser prints eval results as JSON (string values are quoted).
function unquoteAgentBrowserResult(raw: string): string {
const trimmed = raw.trim();
if (trimmed.startsWith('"') && trimmed.endsWith('"')) {
try {
return JSON.parse(trimmed) as string;
} catch {
/* fall through */
}
}
return trimmed;
}
function isoStamp(): string {
const d = new Date();
const yyyy = d.getFullYear();
const mm = String(d.getMonth() + 1).padStart(2, '0');
const dd = String(d.getDate()).padStart(2, '0');
const hh = String(d.getHours()).padStart(2, '0');
const mi = String(d.getMinutes()).padStart(2, '0');
const ss = String(d.getSeconds()).padStart(2, '0');
return `${yyyy}${mm}${dd}-${hh}${mi}${ss}`;
}
function ensureDumpDir(): void {
mkdirSync(DUMP_DIR, { recursive: true });
}
function latestDump(): string | null {
ensureDumpDir();
const entries = readdirSync(DUMP_DIR)
.filter((f) => f.endsWith('.json'))
.map((f) => ({ f, mtime: statSync(path.join(DUMP_DIR, f)).mtimeMs }))
.sort((a, b) => b.mtime - a.mtime);
return entries[0] ? path.join(DUMP_DIR, entries[0].f) : null;
}
// ── Commands ────────────────────────────────────────────────────────
async function cmdInstall(flags: Flags): Promise<void> {
const body = await bundle('probe-events.ts');
const installMsg = JSON.stringify(
'events probe installed: WebSocket+fetch interception. ' +
'WS captures operationId= sockets (gateway), fetch captures /api/agent/stream (direct).',
);
const script = wrapIife(body, installMsg);
const out = await runAgentBrowserEval(flags, script);
console.log(unquoteAgentBrowserResult(out));
}
async function cmdDump(flags: Flags): Promise<void> {
const name = flags.positional[1] ?? 'dump';
const body = await bundle('probe-dump.ts');
const script = wrapIife(body, 'window.__PROBE_LAST_DUMP_JSON');
const raw = await runAgentBrowserEval(flags, script);
const json = unquoteAgentBrowserResult(raw);
ensureDumpDir();
const filename = `${name}-${isoStamp()}.json`;
const dumpPath = path.join(DUMP_DIR, filename);
writeFileSync(dumpPath, json, 'utf8');
// Validate by parsing the meta header so we error early on bad capture
try {
const parsed = JSON.parse(json) as {
meta?: { eventCount?: number; callCount?: number; sampleCount?: number };
};
const meta = parsed.meta ?? {};
console.log(
`wrote ${dumpPath} (${json.length} bytes events=${meta.eventCount ?? '?'} ` +
`calls=${meta.callCount ?? '?'} samples=${meta.sampleCount ?? '?'})`,
);
} catch {
console.log(`wrote ${dumpPath} (${json.length} bytes — JSON.parse failed; see file)`);
}
}
async function cmdAnalyze(flags: Flags): Promise<void> {
const target = flags.positional[1] ?? latestDump();
if (!target) {
console.error('no dump file found. run `dump` first or pass a path.');
process.exit(1);
}
const child = spawn('bun', ['run', path.join(SCRIPT_DIR, 'analyze-events.ts'), target], {
stdio: 'inherit',
});
await new Promise<void>((resolveP, rejectP) => {
child.on('error', rejectP);
child.on('close', (code) => (code === 0 ? resolveP() : rejectP(new Error(`exit ${code}`))));
});
}
// ── Entry point ─────────────────────────────────────────────────────
const flags = parseFlags(process.argv.slice(2));
const cmd = flags.positional[0];
const usage = `usage:
bun run run.ts install [--cdp 9222]
bun run run.ts dump [name] [--cdp 9222]
bun run run.ts analyze [path]
`;
if (!cmd) {
console.error(usage);
process.exit(1);
}
try {
if (cmd === 'install') await cmdInstall(flags);
else if (cmd === 'dump') await cmdDump(flags);
else if (cmd === 'analyze') await cmdAnalyze(flags);
else {
console.error(`unknown command: ${cmd}\n\n${usage}`);
process.exit(1);
}
} catch (e: any) {
console.error(e?.stack ?? e);
process.exit(1);
}
@@ -1,72 +0,0 @@
// Run N round-trip tab switches with event markers timed against the probe.
//
// agent-browser --cdp 9222 eval --stdin < tab-switch.js
//
// Captures the currently-active tab as the BACK target and the rightmost
// inactive tab as the AWAY target. Both are addressed by their stable
// data-contextmenu-trigger key (NOT by visible title — the active tab's
// innerText embeds a ` · <agent name>` suffix that breaks text matching).
//
// Fires the loop in the background and returns immediately so the
// agent-browser eval doesn't have to await the full ROUND_TRIPS × DWELL_MS
// duration. Wait on the `SWITCH_LOOP_DONE` event before dumping.
//
// Refuses to launch if a previous loop is still in flight.
//
// Requires probe.js to have been installed first (provides
// window.__PROBE_EVENT / __listTabs / __clickTabByKey / __activeTabKey).
(function () {
const ROUND_TRIPS = 4;
const DWELL_MS = 10_000;
if (!window.__PROBE_EVENT || !window.__listTabs || !window.__clickTabByKey) {
return 'probe not installed — eval probe.js first';
}
if (window.__SWITCH_LOOP_RUNNING) {
return 'switch loop already running — wait for SWITCH_LOOP_DONE first';
}
const tabs = window.__listTabs();
const activeTab = tabs.find((t) => t.active);
if (!activeTab) return 'no active tab — abort';
// Pick the first inactive tab as AWAY target. With multiple inactive tabs
// you'll usually want the one that's stable across the test — feel free
// to swap to tabs[tabs.length-1] if you want the rightmost.
const inactives = tabs.filter((t) => !t.active);
if (inactives.length === 0) return 'no inactive tab to switch to — abort';
const awayTab = inactives.at(-1); // rightmost inactive
const BACK_KEY = activeTab.key;
const AWAY_KEY = awayTab.key;
window.__SWITCH_LOOP_RUNNING = true;
window.__PROBE_EVENT('SWITCH_LOOP_CONFIG:back=' + BACK_KEY + ',away=' + AWAY_KEY);
(async function () {
function sleep(ms) {
return new Promise((r) => setTimeout(r, ms));
}
try {
window.__PROBE_EVENT('SWITCH_LOOP_START');
for (let i = 1; i <= ROUND_TRIPS; i++) {
window.__PROBE_EVENT('AWAY_' + i);
const awayResult = window.__clickTabByKey(AWAY_KEY);
window.__PROBE_EVENT('AWAY_' + i + '_RES:' + awayResult.slice(0, 50));
await sleep(DWELL_MS);
window.__PROBE_EVENT('BACK_' + i);
const backResult = window.__clickTabByKey(BACK_KEY);
window.__PROBE_EVENT('BACK_' + i + '_RES:' + backResult.slice(0, 50));
await sleep(DWELL_MS);
}
window.__PROBE_EVENT('SWITCH_LOOP_DONE');
} finally {
window.__SWITCH_LOOP_RUNNING = false;
}
})();
return 'switch loop kicked off (BACK=' + BACK_KEY + ', AWAY=' + AWAY_KEY + ')';
})();
@@ -1,113 +0,0 @@
// Shared types between the in-browser probe and the Node-side analyzer.
// Kept tiny on purpose — anything the analyzer can re-derive is left off.
export interface ProbeStreamEvent {
/** Summarized payload — long strings truncated, arrays printed as Array(N) */
data?: Record<string, unknown>;
/** Keys present on the event's `data` payload — useful at a glance */
dataKeys?: string[];
/** ServerMessage.id — gateway WS frames carry an event-id we may resume from */
eventId?: string | null;
message?: string;
/** Last 10 chars of the operationId (full id is excessively long) */
opIdTail: string;
raw?: string;
/** Raw frame byte length, when applicable */
rawLen?: number;
/** For non-agent_event server frames (auth_success, heartbeat_ack, …) */
serverType?: string;
sseEvent?: string;
status?: number;
stepIndex?: number;
/** Milliseconds since the probe's t0 (install time). */
t: number;
/** 'ws' for gateway WebSocket frames, 'sse' for direct /api/agent/stream */
transport: 'ws' | 'sse';
/** Either the AgentStreamEvent.type, or a probe sentinel like `_WS_OPEN_` */
type: string;
url?: string;
}
export interface ProbeActionCall {
args?: {
count?: number;
context?: unknown;
params?: unknown;
};
error?: string;
/** `replaceMessages` / `refreshMessages` / `MARK:<label>` / `_WRAP_ERROR_` */
name: string;
stack?: string;
t: number;
}
export interface ProbeMessageSummary {
/** children.length */
chN: number;
/** content.length */
cLen: number;
/** Last 8 chars of the message id */
id: string;
/** reasoning.content.length */
rLen: number;
role: string;
/** tools.length */
tools: number;
}
export interface ProbeTimelineSample {
/** Last 10 chars of activeTopicId, or null */
activeTopic: string | null;
/** Per-key breakdown: display count, db count, message summaries */
byKey: Record<
string,
{
n: number;
dbN: number;
msgs: ProbeMessageSummary[];
}
>;
err?: string;
/** All messagesMap keys that have content at this moment */
keys: string[];
/** Number of operations in 'running' status */
runOps: number;
t: number;
}
export interface ProbeDumpMeta {
callCount: number;
/** Date.now() at dump call */
collectedAt: number;
eventCount: number;
sampleCount: number;
/** Date.now() at probe install */
t0: number;
}
export interface ProbeDump {
actionCalls: ProbeActionCall[];
meta: ProbeDumpMeta;
streamEvents: ProbeStreamEvent[];
timeline: ProbeTimelineSample[];
}
/**
* Globals the probe attaches to `window`. Keeps `as any` casts at the boundary
* instead of sprinkling them through the probe body.
*/
declare global {
interface Window {
__clickTabByKey?: (key: string) => string;
__listTabs?: () => Array<{ i: number; key: string | null; active: boolean; title: string }>;
__LOBE_STORES?: Record<string, () => any>;
__PROBE_ACTION_CALLS?: ProbeActionCall[];
__PROBE_EVENT?: (label: string) => void;
__PROBE_MSG_TIMELINE?: ProbeTimelineSample[];
__PROBE_ORIG_FETCH?: typeof fetch;
__PROBE_ORIG_WEBSOCKET?: typeof WebSocket;
__PROBE_STREAM_EVENTS?: ProbeStreamEvent[];
__PROBE_T0?: number;
__PROBE_TIMELINE_TIMER?: ReturnType<typeof setInterval> | null;
}
}
@@ -60,5 +60,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
@@ -80,5 +80,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
@@ -72,5 +72,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
@@ -60,5 +60,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
@@ -75,5 +75,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
@@ -81,5 +81,5 @@ echo "[$APP] Waiting ${WAIT}s for bot response..."
sleep "$WAIT"
echo "[$APP] Capturing screenshot..."
"$SCRIPT_DIR/../capture-app-window.sh" "$APP" "$SCREENSHOT"
"$SCRIPT_DIR/capture-app-window.sh" "$APP" "$SCREENSHOT"
echo "[$APP] Done! Screenshot saved to $SCREENSHOT"
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: microcopy
description: 'UI copy and microcopy guidelines. Use for user-facing copy, buttons, errors, empty states, onboarding, i18n wording, translation, or copy improvements in Chinese or English.'
description: UI copy and microcopy guidelines. Use when writing UI text, buttons, error messages, empty states, onboarding, or any user-facing copy. Triggers on i18n translation, UI text writing, or copy improvement tasks. Supports both Chinese and English.
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: modal
description: 'LobeHub imperative modal conventions. Use when creating or migrating modals, dialogs, popups, confirm flows, ModalHost wiring, createModal, confirmModal, useModalContext, or base-ui modal APIs.'
description: "LobeHub imperative-modal conventions. Use whenever creating, editing, opening, or migrating a modal/dialog/popup — prefer `createModal` / `confirmModal` / `useModalContext` from `@lobehub/ui/base-ui` (headless) over the legacy root `@lobehub/ui` `createModal` (antd Modal props) and over any declarative `open` state + `<Modal />` pattern. Covers required `ModalHost` mounting, the `Content` + `index.tsx` file layout, `content` vs `children` slot, i18n inside `createModal()` (`import { t } from 'i18next'`), and migration notes. Triggers on `createModal`, `confirmModal`, `useModalContext`, `ModalHost`, `antd Modal`, `<Modal open>`, 'open a modal', 'popup', 'dialog', 'confirm dialog', '弹框', '弹窗', '确认框', 'migrate to base-ui'."
user-invocable: false
---
+1 -80
View File
@@ -1,6 +1,6 @@
---
name: pr
description: "Create a PR for the current branch (targets `canary` by default), including splitting one cross-layer branch into ordered stacked PRs so a lower layer (db / shared package / server TRPC) merges before its callers (desktop / CLI / UI). Use when the user asks to create / submit a PR, or to split a branch because clients call a server contract that isn't on the trunk yet. Triggers on 'pr', 'create pr', 'submit pr', 'open a PR', 'pull request', 'split this PR', 'stacked PR', 'backend should merge first', '提 PR', '提个 PR', '新建 PR', '拆 PR', '后端先合', '分层合并'."
description: "Create a PR for the current branch. Use when the user asks to create a pull request, submit PR, or says 'pr'."
user-invocable: true
---
@@ -71,82 +71,3 @@ Use `.github/PULL_REQUEST_TEMPLATE.md` as the body structure. Key sections:
- **Language**: All PR content must be in English
- If a PR already exists for the branch, inform the user instead of creating a duplicate
---
# Stacked PRs (cross-layer feature)
The steps above create **one** PR for the current branch. When a single branch lands across layers — `packages/database` schema/model → a shared `packages/*` lib → `src/server` TRPC → `apps/desktop` + `apps/cli` callers → `src/features` UI — shipping it as one PR can't merge safely: the clients call an endpoint that doesn't exist on the trunk until the same PR merges, so any partial/rollback or independent review breaks. Split it into **ordered PRs**, lower layer first.
## The ordering rule
A PR may only merge **after** every layer it calls is already on the trunk.
- The **server contract** (new TRPC procedure, changed return shape, new table/model) merges first.
- The **callers** (desktop, CLI, UI) merge after — they invoke that contract.
- Tie-break with one question: _"if this merged alone to `canary` right now, would it build and behave?"_ If no, it belongs in a later PR.
## Which file goes in which PR
The non-obvious calls:
- **Frontend that adapts to a contract change goes WITH the server PR.** If you widen a TRPC return shape (e.g. `listDevices` now returns `platform: string | null`), the component consuming it must change in the _same_ PR — otherwise the server PR breaks the build on its own. Contract + its in-repo consumers ship together.
- **A new shared package goes with its consumer**, not the server, unless the server imports it too. A `@lobechat/*` package imported only by desktop/CLI ships in the client PR. Don't carry an unused package in the lower PR.
- **Workspace dep declarations** (`package.json` `workspace:*`, `pnpm-workspace.yaml`) travel with the code that imports the package.
## The git recipe — split an existing full branch
Starting point: one branch (`feat/x`) with a single commit `<FULL>` containing everything, already pushed (so it's also safe on the remote).
```bash
# 1. Safety nets — make the full work unloseable before rewriting anything
git branch backup/x-full <FULL> # local ref to the full commit
git branch feat/x-clients <FULL> # the higher-layer branch starts here
# 2. Rewrite the lower-layer branch to lower-layer files only
git checkout feat/x # this becomes the SERVER PR
git reset --hard origin/canary
git checkout <FULL> -- <server/db files…> # stages just those paths
git commit -m "✨ feat(...): <server half>"
git push --force-with-lease origin feat/x # never --force; never push to canary
# 3. Build the higher-layer branch STACKED on the lower branch
git checkout feat/x-clients
git reset --hard feat/x # base = the just-rewritten server HEAD
git checkout backup/x-full -- <client/ui files…> # only the remaining paths
git commit -m "✨ feat(...): <client half>"
git push -u origin feat/x-clients
```
Then open the higher PR **based on the lower branch**, not the trunk:
```bash
gh pr create --base feat/x --head feat/x-clients --title "…" --body "…"
```
`--base feat/x` keeps the diff client-only (no server files leak in) and makes it physically impossible to merge the clients before the server. **After the server PR merges to `canary`, retarget the client PR's base to `canary`** (GitHub usually auto-retargets when the base branch merges; note it in the PR body so a human confirms).
## Verify the dependency actually holds
The whole point is the higher layer needs the lower one. Prove it: on the stacked higher branch, type-check the caller and confirm the symbol the lower layer introduced resolves.
```bash
cd apps/cli && bun run type-check 2>&1 | grep -iE "connect\.ts|device\.register"
# empty (re: your change) = the stacked base supplies device.register ✓
```
Filter to your touched files — this repo's standalone type-check emits pre-existing env noise (`__ELECTRON__`, `@/types/llm`, unbuilt `@lobechat/types`) that isn't yours.
## PR + Linear bookkeeping
- **Each PR closes only its own layer's issues.** Server PR: `Closes LOBE-<server>`. Client PR: `Closes LOBE-<pkg> / <desktop> / <cli>`. Don't let one PR's body claim another layer's issue.
- Both PRs are `Part of LOBE-<parent>`.
- On PR creation, move each closed sub-issue to **In Review** (not Done) and add a completion comment — see the `linear` skill.
## Gotchas
- **Never push to `canary`.** A split branch cut with `git checkout -b feat/x origin/canary` _tracks_ `origin/canary`, so a bare `git push` targets canary. Always `git push origin feat/x` with the explicit branch name.
- **`--force-with-lease`, not `--force`** when rewriting the lower branch — it aborts if the remote moved under you.
- **Back up before `reset --hard`.** Step 1's `backup/x-full` + the pushed remote branch mean the full commit is referenced by ≥3 refs before you rewrite anything. Verify with `git branch --contains <FULL>`.
- **Lockfiles:** this monorepo commits no root `pnpm-lock.yaml`, so a new `workspace:*` dep needs no lockfile churn. In a repo that _does_ commit one, regenerate it on each branch after the split.
- **Don't over-split.** Two PRs (contract / callers) is usually enough. A UI page that only reads an existing endpoint can be its own later PR, but don't fragment a single layer across PRs for its own sake.
+58 -56
View File
@@ -1,6 +1,6 @@
---
name: project-overview
description: 'LobeHub open-source monorepo architecture map. Use when locating code layers, understanding apps/packages/src layout, business stubs, project structure, or onboarding to the repository.'
description: Complete project architecture and structure guide. Use when exploring the codebase, understanding project organization, finding files, or needing comprehensive architectural context. Triggers on architecture questions, directory navigation, or project overview needs.
user-invocable: false
---
@@ -13,12 +13,11 @@ user-invocable: false
## Project Description
Open-source, modern-design AI Agent Workspace: **LobeHub** (previously LobeChat).
This repo is the **open-source root** (`github.com/lobehub/lobehub`, package `@lobehub/lobehub`).
**Supported platforms:**
- Web desktop/mobile
- Desktop (Electron)`apps/desktop`
- Desktop (Electron)
- Mobile app (React Native) — **separate repo, already launched** (not in this monorepo)
**Logo emoji:** 🤯
@@ -48,28 +47,30 @@ This repo is the **open-source root** (`github.com/lobehub/lobehub`, package `@l
## Monorepo Layout
Flat layout — `apps/`, `packages/`, and `src/` all sit at the repo root. No
git submodules.
This is a monorepo extending the open-source `lobehub` submodule. Two repos:
- **cloud repo root** — `src/` and `packages/business/` (`config`, `const`, `model-runtime`) hold cloud-only SaaS code that overrides/extends the submodule. See `AGENTS.md` for the override mechanism.
- **`lobehub/` submodule** — the open-source product core.
### `lobehub/` submodule — key directories
```
(repo root)
lobehub/
├── apps/
│ ├── cli/ # LobeHub CLI
│ ├── desktop/ # Electron desktop app
│ └── device-gateway/ # Device gateway service
├── docs/ # changelog, development, self-hosting, usage
├── locales/ # en-US, zh-CN, ...
├── packages/ # ~80 @lobechat/* workspace packages — `ls` for the full set. Key ones:
│ ├── agent-runtime/ # Agent runtime core
│ ├── cli/ # LobeHub CLI
│ ├── desktop/ # Electron desktop app
│ └── device-gateway/ # Device gateway service
├── docs/ # changelog, development, self-hosting, usage
├── locales/ # en-US, zh-CN, ...
├── packages/ # ~80 @lobechat/* workspace packages — `ls` for the full set. Key ones:
│ ├── agent-runtime/ # Agent runtime
│ ├── agent-signal/ # Agent Signal pipeline
│ ├── agent-tracing/ # Tracing / snapshots
│ ├── builtin-tool-*/ # Per-tool packages (calculator, web-browsing, claude-code, ...)
│ ├── builtin-tools/ # Central registries that compose builtin-tool-*
│ ├── builtin-tool-*/ # Builtin tool packages
│ ├── builtin-tools/ # Builtin tool registries
│ ├── context-engine/
│ ├── database/ # src/{models,schemas,repositories}
│ ├── model-bank/ # Model definitions & provider cards
│ ├── model-runtime/ # src/{core,providers}
│ ├── business/ # Open-source stubs (config, const, model-bank, model-runtime) — overridden by cloud
│ ├── types/
│ └── utils/
└── src/
@@ -82,54 +83,55 @@ git submodules.
├── spa/ # SPA entries + router config
│ ├── entry.{web,mobile,desktop,popup}.tsx
│ └── router/
├── business/ # Open-source stubs (client/server) — cloud repo provides real impls
├── business/ # Open-source stubs (~50) overridden by cloud src/business/
├── features/ # Domain business components
├── store/ # ~30 zustand stores — `ls` for the full set
├── server/ # featureFlags, globalConfig, modules, routers, services, workflows, agent-hono
├── store/ # ~28 zustand stores — `ls` for the full set
├── server/ # featureFlags, globalConfig, modules, routers, services
└── ... # components, hooks, layout, libs, locales, services, types, utils
```
### cloud repo — key directories
```
(cloud root)
├── packages/business/ # Cloud overrides: config, const, model-runtime
├── src/
│ ├── business/ # Cloud impls of submodule stubs (client/server/locales)
│ ├── routes/ # Cloud-only route groups: (cloud)/, embed/
│ ├── store/ # Cloud-only stores (e.g. subscription/)
│ ├── server/ # Cloud routers & services (billing, budget, risk control...)
│ └── app/(backend)/cron/ # Vercel cron routes (schedules declared in root vercel.ts)
└── vercel.ts # Cron schedule declarations
```
> File search rule: a path like `@/store/x` resolves cloud `src/store/x` first, then
> `lobehub/packages/store/src/x`, then `lobehub/src/store/x`. Cloud override wins.
## Architecture Map
| Layer | Location |
| ---------------- | --------------------------------------------------- |
| UI Components | `src/components`, `src/features` |
| SPA Pages | `src/routes/` |
| React Router | `src/spa/router/` |
| Global Providers | `src/layout` |
| Zustand Stores | `src/store` |
| Client Services | `src/services/` |
| REST API | `src/app/(backend)/webapi` |
| tRPC Routers | `src/server/routers/{async\|lambda\|mobile\|tools}` |
| Server Services | `src/server/services` (can access DB) |
| Server Modules | `src/server/modules` (no DB access) |
| Feature Flags | `src/server/featureFlags` |
| Global Config | `src/server/globalConfig` |
| DB Schema | `packages/database/src/schemas` |
| DB Model | `packages/database/src/models` |
| DB Repository | `packages/database/src/repositories` |
| Third-party | `src/libs` (analytics, oidc, etc.) |
| Builtin Tools | `packages/builtin-tool-*`, `packages/builtin-tools` |
| Open-source stub | `src/business/*`, `packages/business/*` (this repo) |
| Layer | Location |
| ---------------- | ---------------------------------------------------- |
| UI Components | `src/components`, `src/features` |
| SPA Pages | `src/routes/` |
| React Router | `src/spa/router/` |
| Global Providers | `src/layout` |
| Zustand Stores | `src/store` |
| Client Services | `src/services/` |
| REST API | `src/app/(backend)/webapi` |
| tRPC Routers | `src/server/routers/{async\|lambda\|mobile\|tools}` |
| Server Services | `src/server/services` (can access DB) |
| Server Modules | `src/server/modules` (no DB access) |
| Feature Flags | `src/server/featureFlags` |
| Global Config | `src/server/globalConfig` |
| DB Schema | `packages/database/src/schemas` |
| DB Model | `packages/database/src/models` |
| DB Repository | `packages/database/src/repositories` |
| Third-party | `src/libs` (analytics, oidc, etc.) |
| Builtin Tools | `src/tools`, `packages/builtin-tool-*` |
| Cloud-only | `src/business/*`, `packages/business/*` (cloud repo) |
## Data Flow
```
React UI → Store Actions → Client Service → TRPC Lambda → Server Services → DB Model → PostgreSQL
```
## Note: Relationship to the Cloud Repo
This open-source repo is consumed by a **separate, private cloud (SaaS) repo**
as a git submodule mounted at `lobehub/`. The cloud repo provides:
- **`src/business/{client,server}`** and **`packages/business/*`** implementations
that override the stubs shipped here.
- Cloud-only routes (e.g. `(cloud)/`, `embed/`), cloud-only stores (e.g.
`subscription/`), cloud-only TRPC routers (billing, budget, risk control, …),
and Vercel cron routes under `src/app/(backend)/cron/`.
- File-resolution order in cloud: `@/store/x` → cloud `src/store/x` first, then
`lobehub/packages/store/src/x`, then `lobehub/src/store/x`. **Cloud override wins.**
When working in this repo alone, ignore the cloud layer — the stubs in
`src/business/` and `packages/business/` are the source of truth here.
+23 -49
View File
@@ -1,6 +1,6 @@
---
name: react
description: 'LobeHub React component conventions. Use when editing TSX UI, choosing base-ui vs @lobehub/ui vs antd, styling with antd-style, routing, desktop variants, layouts, or component state.'
description: 'Use when writing or editing any `.tsx` under `src/**`. Triggers: createStaticStyles, createStyles, cssVar, antd-style, Flexbox, Center, Select, Modal, Drawer, Button, Tooltip, DropdownMenu, Popover, Switch, ScrollArea, Link, useNavigate, react-router-dom, next/link, desktopRouter, componentMap.desktop, .desktop.tsx, new component, new page, edit layout, add styles, zustand selector, @lobehub/ui, antd import.'
user-invocable: false
---
@@ -17,45 +17,22 @@ user-invocable: false
## Component Priority
1. **`src/components`** — project-specific reusable components
2. **`@lobehub/ui/base-ui`** — headless primitives. **If the component lives here, use it. Do NOT import the same-named root export.**
3. **`@lobehub/ui`** — higher-level / antd-wrapping components (only when no base-ui equivalent)
4. **antd** — only when neither base-ui nor `@lobehub/ui` root provides it
5. **Custom implementation** — true last resort
2. **`@lobehub/ui/base-ui`** — headless primitives (Select, Modal, DropdownMenu, Popover, Switch, ScrollArea…)
3. **`@lobehub/ui`** — higher-level components (ActionIcon, Markdown, DragPage…)
4. **Custom implementation** — last resort; never reach for antd directly
If unsure about available components, search existing code or check `node_modules/@lobehub/ui/es/index.mjs` and `node_modules/@lobehub/ui/es/base-ui/`.
If unsure about available components, search existing code or check `node_modules/@lobehub/ui/es/index.mjs`.
### `@lobehub/ui/base-ui` — always prefer for these
### Common @lobehub/ui Components
| Component | Import |
| ------------------------------------------ | ------------------------------------------------------------------------------------------------------- |
| `Select` (+ `SelectProps`, `SelectOption`) | `import { Select } from '@lobehub/ui/base-ui';` |
| `Modal` (imperative API) | `import { createModal, confirmModal, useModalContext, type ModalInstance } from '@lobehub/ui/base-ui';` |
| `DropdownMenu` | `import { DropdownMenu } from '@lobehub/ui/base-ui';` |
| `ContextMenu` | `import { ContextMenu } from '@lobehub/ui/base-ui';` |
| `Popover` | `import { Popover } from '@lobehub/ui/base-ui';` |
| `ScrollArea` | `import { ScrollArea } from '@lobehub/ui/base-ui';` |
| `Switch` | `import { Switch } from '@lobehub/ui/base-ui';` |
| `Toast` | `import { Toast } from '@lobehub/ui/base-ui';` |
| `FloatingSheet` | `import { FloatingSheet } from '@lobehub/ui/base-ui';` |
For Modal specifically, see the dedicated **modal** skill — use the imperative `createModal({ content: … })` pattern over the legacy `<Modal open … />` declarative pattern. base-ui has its own `ModalHost` already mounted in `SPAGlobalProvider`.
> Common slip: `import { Select } from '@lobehub/ui'` looks fine but it's the antd-backed Select. Use base-ui Select. Same for `Modal`, `DropdownMenu`, etc.
### `@lobehub/ui` root — use when base-ui has no equivalent
| Category | Components |
| ------------ | ------------------------------------------------------------------------------------- |
| General | ActionIcon, ActionIconGroup, Block, Button, Icon |
| Data Display | Avatar, Collapse, Empty, Highlighter, Markdown, Tag, Tooltip |
| Data Entry | CodeEditor, CopyButton, EditableText, Form, Input, InputPassword, SearchBar, TextArea |
| Feedback | Alert, Drawer |
| Layout | Center, DraggablePanel, Flexbox, Grid, Header, MaskShadow |
| Navigation | Burger, Menu, SideNav, Tabs |
## State
When a feature component manages more than 3 pieces of state (`useState`/`useReducer`/derived state), extract the logic into a custom hook (e.g. `useXxx`). Keep the component focused on rendering — the hook holds state and handlers, so logic can be unit-tested without rendering the component.
| Category | Components |
| ------------ | ------------------------------------------------------------------------------- |
| General | ActionIcon, ActionIconGroup, Block, Button, Icon |
| Data Display | Avatar, Collapse, Empty, Highlighter, Markdown, Tag, Tooltip |
| Data Entry | CodeEditor, CopyButton, EditableText, Form, FormModal, Input, SearchBar, Select |
| Feedback | Alert, Drawer, Modal |
| Layout | Center, DraggablePanel, Flexbox, Grid, Header, MaskShadow |
| Navigation | Burger, Dropdown, Menu, SideNav, Tabs |
## Layout
@@ -108,15 +85,12 @@ errorElement: <ErrorBoundary />;
## Common Mistakes
| Mistake | Fix |
| ------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| Using `next/link` in SPA | Use `react-router-dom` `Link` |
| Using antd directly | Use `@lobehub/ui/base-ui` first, then `@lobehub/ui` |
| `import { Select } from '@lobehub/ui'` | `import { Select } from '@lobehub/ui/base-ui'` |
| `import { Modal } from '@lobehub/ui'` + `<Modal open>` declarative | `createModal` / `confirmModal` from `@lobehub/ui/base-ui` (see modal skill) |
| `import { DropdownMenu/Popover/Switch } from '@lobehub/ui'` | Import same name from `@lobehub/ui/base-ui` instead |
| `createStyles` for static styles | Use `createStaticStyles` + `cssVar` |
| Editing only `desktopRouter.config.tsx` | Must edit both `.tsx` and `.desktop.tsx` |
| Using `margin` for flex spacing | Use `gap` prop on Flexbox |
| Accessing zustand store without selector | Use selectors to access store data (see zustand skill) |
| Text or icon-text actions built with `Flexbox`/`Text` + `onClick` | Use `Button type={'text'} size={'small'}` with `icon` when needed |
| Mistake | Fix |
| ----------------------------------------------------------------- | ----------------------------------------------------------------- |
| Using `next/link` in SPA | Use `react-router-dom` `Link` |
| Using antd directly | Use `@lobehub/ui/base-ui` first, then `@lobehub/ui` |
| `createStyles` for static styles | Use `createStaticStyles` + `cssVar` |
| Editing only `desktopRouter.config.tsx` | Must edit both `.tsx` and `.desktop.tsx` |
| Using `margin` for flex spacing | Use `gap` prop on Flexbox |
| Accessing zustand store without selector | Use selectors to access store data (see zustand skill) |
| Text or icon-text actions built with `Flexbox`/`Text` + `onClick` | Use `Button type={'text'} size={'small'}` with `icon` when needed |
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: response-compliance
description: 'OpenResponses API compliance testing. Use for Response API endpoint tests, compliance runs, schema debugging, response api test, or openresponses test tasks.'
description: OpenResponses API compliance testing. Use when testing the Response API endpoint, running compliance tests, or debugging Response API schema issues. Triggers on 'compliance', 'response api test', 'openresponses test'.
---
# OpenResponses Compliance Test
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: review-checklist
description: 'LobeHub code review checklist. Use when reviewing a PR, diff, or branch for console leftovers, return await, secrets, i18n, desktop router drift, UI imports, migrations, or cloud impact.'
description: 'Common recurring mistakes in LobeHub code review — console leftovers, missing return await, hardcoded secrets, hardcoded i18n strings, desktop router pair drift, antd vs @lobehub/ui, non-idempotent migrations, cloud impact red flags. Use as a quick checklist when reviewing PRs, diffs, or branch changes.'
user-invocable: false
---
-142
View File
@@ -1,142 +0,0 @@
---
name: skills-audit
description: 'Audit .agents/skills SKILL.md files. Use for recurring checks of duplicate, overlapping, stale, inconsistent, or broken skills and merge/delete candidates.'
disable-model-invocation: true
argument-hint: '[--verbose | --apply]'
---
# Skills Audit
Periodic review of the project-local skill set under `.agents/skills/`. The goal is to catch drift before the catalog becomes confusing — too many skills, overlapping triggers, descriptions that no longer match the body, references to skills that were renamed/deleted.
**Recommended cadence:** weekly, or after any week where >1 skill was added/renamed.
## Procedure
### 1 — Inventory
Build a fresh census of all SKILL.md files. Do NOT trust any prior cached list.
```bash
find .agents/skills -name SKILL.md | wc -l # total count
find .agents/skills -name SKILL.md -exec wc -l {} \; | sort -rn # by body length
```
Group by domain in a mental table (DB / state / UI / agent / testing / workflow / docs / etc.). Note new arrivals since last audit (`git log --since="1 week ago" -- .agents/skills/`).
### 2 — Pull frontmatter for all skills
```bash
# Extract name + description for each SKILL.md
for f in .agents/skills/*/SKILL.md; do
echo "=== $(basename $(dirname $f)) ==="
awk '/^---$/{c++; next} c==1' "$f" | head -20
done
```
Read the description block of every skill. The body can stay unread unless step 4 flags it.
### 3 — Detect overlap / redundancy
For each pair within the same domain, ask:
- **Same description**? → likely duplicate (one is probably a stale rename leftover, or a global-vs-local collision).
- **Trigger keywords substantially overlap**? → either merge, OR tighten one description so the model can choose unambiguously.
- **One skill's body says "see also: foo"**? → confirm `foo` still exists, AND confirm the cross-reference is still meaningful (the referenced skill may have absorbed the referrer's concerns).
- **Skill duplicates content from `AGENTS.md`**? → fold into AGENTS.md or slim the skill to just the delta.
Common false positives (do NOT merge):
- `db-migrations` vs `drizzle` — distinct workflows (migration files vs schema authoring).
- `microcopy` vs `i18n` — content vs mechanics.
- `agent-runtime-hooks` vs `agent-tracing` vs `agent-signal` — different surfaces of the agent system.
- `testing` vs `local-testing` vs `cli-backend-testing` — different test types.
### 4 — Description format consistency
Apply the **standard template**:
```
{Topic + key conventions or scope}. Use when {scenarios — verbs + nouns}. Triggers on {`code-symbols`, 'natural phrases', '中文'}.
```
Skills with `disable-model-invocation: true` (user-invoked only, slash commands) don't need `Triggers on` — they're never auto-routed.
Flag descriptions that:
- ❌ Have NO `Use when` clause (model can't decide when to load it).
- ❌ Have NO `Triggers on` clause (and aren't `disable-model-invocation`).
- ❌ Use weird formats (numbered lists `(1)(2)(3)`, `Triggers:` colon instead of `Triggers on`, `MUST use when ...` as opening word).
- ❌ Are dramatically terse for a 200+ line body, or dramatically verbose for a 60-line body.
- ❌ Reference deleted/renamed skills.
### 5 — Stale-skill check
For narrow domain skills (e.g. `response-compliance`, one-off CLI workflows):
```bash
# Confirm the referenced code surface still exists
rg -l "response-compliance|openresponses" packages/ src/ # adjust per skill
git log --since="3 months ago" -- .agents/skills/ < skill > /SKILL.md # is it being maintained?
```
If the underlying surface is gone and the skill hasn't been edited in 3+ months → flag for archival.
### 6 — Cross-reference integrity
Any skill body mentioning another skill by name:
```bash
# Scan all skill bodies for skill-name references
rg -o '`[a-z][a-z0-9-]+`' .agents/skills/*/SKILL.md | grep -v ':\s*$' | sort -u
```
For each name extracted, confirm `.agents/skills/<name>/SKILL.md` exists. Broken references happen after renames — fix them in the same audit pass.
### 7 — Output report
Produce a markdown summary back to the user with the same structure as the original audit (this skill was created during one):
```markdown
## 📊 Inventory
{count, domain breakdown}
## 🎯 Recommendations
### 🔴 High confidence
- {action} — {reason}
### 🟡 Medium confidence
- {action} — {reason needs verification}
### 🟢 Low confidence / no-op
- {item considered but skipping because ...}
## 📋 Suggested order
{table of actions with risk + LOC estimate}
```
End by asking the user which actions to apply — do NOT auto-apply unless the user passed `--apply` and even then confirm destructive deletes individually.
## Output rules
- Be specific. "Skill X overlaps with Y" is useless without naming the overlapping triggers.
- Cite line numbers when flagging description / body issues.
- Don't recommend merges unless the call sites would actually load the merged skill in the same context.
- Don't recommend deletes for skills that haven't been touched recently — "unused" can mean "stable", not "dead".
## What NOT to do
- ❌ Don't rename skill directories without checking for cross-references AND user memory entries that name the old slug.
- ❌ Don't normalize a description by removing trigger keywords just to fit the template — the keywords are the routing signal.
- ❌ Don't fold a heavy 200+ line skill into another just because they share a domain — large skills get loaded selectively and merging makes everything load.
- ❌ Don't propose `.agents/skills/INDEX.md` or `<domain>-<skill>` prefix renames unless the user explicitly asks — costs > benefits for cosmetic reorgs.
## Related history
- First audit: `chore/skills-audit` branch (2026-05-25) — deleted `source-command-dedupe`, renamed `data-fetching``data-fetching-architecture`, normalized 9 descriptions, created this skill.
@@ -0,0 +1,44 @@
---
name: 'source-command-dedupe'
description: 'Find duplicate GitHub issues'
---
# source-command-dedupe
Use this skill when the user asks to run the migrated source command `dedupe`.
## Command Template
Find up to 3 likely duplicate issues for a given GitHub issue.
To do this, follow these steps precisely:
1. Use an agent to check if the Github issue (a) is closed, (b) does not need to be deduped (eg. because it is broad product feedback without a specific solution, or positive feedback), or (c) already has a duplicates comment that you made earlier. If so, do not proceed.
2. Use an agent to view a Github issue, and ask the agent to return a summary of the issue
3. Then, launch 5 parallel agents to search Github for duplicates of this issue, using diverse keywords and search approaches, using the summary from #1
4. Next, feed the results from #1 and #2 into another agent, so that it can filter out false positives, that are likely not actually duplicates of the original issue. If there are no duplicates remaining, do not proceed.
5. Finally, comment back on the issue with a list of up to three duplicate issues (or zero, if there are no likely duplicates)
Notes (be sure to tell this to your agents, too):
- Use `gh` to interact with Github, rather than web fetch
- Do not use other tools, beyond `gh` (eg. don't use other MCP servers, file edit, etc.)
- Make a todo list first
- For your comment, follow the following format precisely (assuming for this example that you found 3 suspected duplicates):
---
Found 3 possible duplicate issues:
1. <link to issue>
2. <link to issue>
3. <link to issue>
This issue will be automatically closed as a duplicate in 3 days.
- If your issue is a duplicate, please close it and 👍 the existing issue instead
- To prevent auto-closure, add a comment or 👎 this comment
> 🤖 Generated with Codex
---
+1 -22
View File
@@ -1,6 +1,6 @@
---
name: spa-routes
description: 'LobeHub SPA route architecture. Use when editing src/routes, src/features delegation, desktop/mobile/popup router configs, .desktop variants, route segments, redirects, or new pages.'
description: MUST use when editing src/routes/ segments, src/spa/router/desktopRouter.config.tsx or desktopRouter.config.desktop.tsx (always change both together), mobileRouter.config.tsx, or when moving UI/logic between routes and src/features/.
user-invocable: false
---
@@ -94,27 +94,6 @@ Anything that changes the tree (new segment, renamed `path`, moved layout, new c
---
## 3b. Other `.desktop.{ts,tsx}` variants inside `src/routes/`
The router pair is **not** the only `.desktop` variant pattern in this repo. Some route trees colocate a `<name>.desktop.{ts,tsx}` next to its base `<name>.{ts,tsx}` — Vite's resolver swaps in the `.desktop` file for Electron builds. Same drift risk as the router pair: editing only one side can break Electron silently.
Known variants today:
| Base file (web) | Desktop file (Electron) | Purpose |
| ----------------------------------------------------- | ------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------ |
| `src/routes/(main)/settings/features/componentMap.ts` | `src/routes/(main)/settings/features/componentMap.desktop.ts` | Settings tab → component map. Web uses dynamic `import()`; desktop uses sync imports. `componentMap.sync.test.ts` enforces identical keys. |
| `src/routes/(main)/agent/index.tsx` | `src/routes/(main)/agent/index.desktop.tsx` | Page entry. Desktop variant overrides the web page wholesale (e.g. extra popup guards). |
| `src/routes/(main)/group/index.tsx` | `src/routes/(main)/group/index.desktop.tsx` | Same pattern as agent. |
**Rules:**
1. After editing **any** `.ts`/`.tsx` under `src/routes/`, glob the same directory for a `<filename>.desktop.{ts,tsx}` sibling. If one exists, apply the equivalent change there in the same commit.
2. When adding a new SettingsTab, register it in **both** `componentMap.ts` (with `dynamic(...)`) and `componentMap.desktop.ts` (with a sync `import`). `componentMap.sync.test.ts` will fail the build otherwise.
3. When adding a new desktop-only page wholesale-override, prefer a single base file with platform-aware code over introducing a new `.desktop.tsx` variant — only add a new variant when the two trees genuinely diverge (different store wiring, different popup guards, etc.).
4. When deleting, remove **both** files together.
---
## 4. How to Divide Files (route vs feature)
| Question | Put in `src/routes/` | Put in `src/features/` |
@@ -1,6 +1,6 @@
---
name: store-data-structures
description: 'LobeHub Zustand store data-shape patterns. Use when designing store state, list/detail splits, normalized maps, reducers, messagesMap, topicsMap, or choosing shared type sources.'
description: Zustand store data structure patterns for LobeHub. Covers List vs Detail data structures, Map + Reducer patterns, type definitions, and when to use each pattern. Use when designing store state, choosing data structures, or implementing list/detail pages.
user-invocable: false
---
@@ -310,5 +310,5 @@ export interface BenchmarkListItem {
## Related Skills
- `data-fetching-architecture` — how to fetch and update this data
- `data-fetching` — how to fetch and update this data
- `zustand` — general Zustand patterns
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: testing
description: 'Vitest testing guide. Use when writing or updating tests, fixing failing tests, improving coverage, debugging test issues, or setting up mocks.'
description: Testing guide using Vitest. Use when writing tests (.test.ts, .test.tsx), fixing failing tests, improving test coverage, or debugging test issues. Triggers on test creation, test debugging, mock setup, or test-related questions.
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: trpc-router
description: 'TRPC router development guide. Use when creating or modifying src/server/routers, adding procedures, or implementing server-side API endpoints.'
description: TRPC router development guide. Use when creating or modifying TRPC routers (src/server/routers/**), adding procedures, or working with server-side API endpoints. Triggers on TRPC router creation, procedure implementation, or API endpoint tasks.
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: typescript
description: 'LobeHub TypeScript style and type-safety guide. Use when editing TS/TSX/MTS, fixing types, choosing interface vs type, avoiding any/object, import type, async flow, or ts-expect-error.'
description: "TypeScript code style and type-safety guide for LobeHub. Read before writing or editing any `.ts` / `.tsx` / `.mts` — covers `interface` vs `type`, `Record<PropertyKey, unknown>` over `any`/`object`, `as const satisfies`, `@ts-expect-error` over `@ts-ignore`, `import type` (`separate-type-imports`), `async`/`await` + `Promise.all`, `for…of` over indexed `for`, and the no-silent-`.catch(() => fallback)` rule. Also use when reviewing type quality, deciding module augmentation (`declare module`) over `namespace`, or designing extensible types (e.g. `PipelineContext.metadata`). Triggers on any TypeScript file edit, 'fix the type', 'why is this `any`', 'should this be interface or type', 'eslint type-import', 'ts-expect-error'."
user-invocable: false
---
+1 -1
View File
@@ -1,6 +1,6 @@
---
name: upstash-workflow
description: 'LobeHub Upstash Workflow and QStash guide. Use for async workflows, process/paginate/execute fan-out, serve handlers, context.run/call/sleep, or workflow triggers.'
description: 'Upstash Workflow implementation guide. Use when creating async workflows with QStash, implementing fan-out patterns, or building 3-layer workflow architecture (process → paginate → execute).'
user-invocable: false
---
+41 -5
View File
@@ -1,6 +1,6 @@
---
name: zustand
description: 'LobeHub Zustand store conventions. Use when editing src/store, store slices, public/internal actions, dispatch actions, flattenActions, optimistic updates, selectors, maps, or class action migration.'
description: "LobeHub Zustand store conventions: public/internal/dispatch action layers, optimistic update pattern, slice composition via `flattenActions`, and class-based action migration. Use whenever working under `src/store/**`, adding a `createXxxSlice`, writing `internal_*` or `internal_dispatch*` actions, designing `messagesMap`/`topicsMap` reducers, refactoring a `StateCreator` object slice into a `XxxActionImpl` class, or debugging stale store reads. Triggers on `useChatStore`/`useUserStore`/`useGlobalStore`, `createStore`, `flattenActions`, `StoreSetter`, `internal_dispatch`, 'add an action', 'zustand selector', 'store slice', 'class action', 'optimistic update'."
user-invocable: false
---
@@ -177,12 +177,29 @@ export const chatGroupAction: StateCreator<
### Slices That Don't Currently Need `set`
When a slice doesn't write local state (e.g. it delegates to another store or just runs hooks), drop `#set` and mark the constructor param as `_set` with `void _set` to keep the `(set, get, api)` shape:
When a slice doesn't write local state at the moment — e.g. it reads context
from `#get()` and forwards calls to another store, or just runs hooks — drop
the `#set` field. Otherwise ESLint's `no-unused-vars` flags the unused private
field.
Mark the constructor's `set` param as `_set` and `void _set` it to keep the
`(set, get, api)` shape aligned with `StateCreator`. This is **a snapshot of
the current need, not a permanent contract** — if a later change needs `set`,
restore the `#set` field and use it; do not invent a workaround to keep the
"unused" form.
```ts
type Setter = StoreSetter<ConversationStore>;
export const toolSlice = (set: Setter, get: () => ConversationStore, _api?: unknown) =>
new ToolActionImpl(set, get, _api);
export class ToolActionImpl {
readonly #get: () => ConversationStore;
// Mark unused params with `_` prefix and `void _x` so the constructor still
// matches StateCreator's `(set, get, api)` shape without triggering unused
// diagnostics.
constructor(_set: Setter, get: () => ConversationStore, _api?: unknown) {
void _set;
void _api;
@@ -195,8 +212,27 @@ export class ToolActionImpl {
hooks.onToolCallComplete?.(id, undefined);
};
}
export type ToolAction = Pick<ToolActionImpl, keyof ToolActionImpl>;
```
- Drop `#set` when unused; restore it when a later edit needs `set` — re-adding costs nothing.
- Don't add `setNamespace` for slices that don't write state.
- Don't keep both old slice objects and class actions active at the same time during migration.
Rules of thumb:
- If a slice doesn't currently call `set`, drop `#set` (use `_set` + `void _set`
in the constructor). When a later edit needs `set`, restore `#set` and use it.
- Don't add `setNamespace` for slices that don't write state. Add it when the
slice starts writing state.
- Never leave `#set` declared but unused "for future use" — lint will fail and
re-adding it later costs nothing.
### Do / Don't
- **Do**: keep constructor signature aligned with `StateCreator` params `(set, get, api)`.
- **Do**: use `#private` to avoid `set/get` being exposed.
- **Do**: use `flattenActions` instead of spreading class instances.
- **Do**: drop `#set` (and use `_set` + `void _set` in the constructor) for
delegate-only slices that never write state — keeps lint green without
breaking the `(set, get, api)` shape.
- **Don't**: keep both old slice objects and class actions active at the same time.
- **Don't**: keep an unused `#set` field "for future use" — it fails ESLint and
re-adding it later costs nothing.
+1 -2
View File
@@ -1,7 +1,6 @@
# Add directories or file patterns to ignore during indexing (e.g. foo/ or *.csv)
locales/
apps/desktop/resources/locales/
**/__snapshots__/
**/fixtures/
packages/database/migrations/
src/database/migrations/
-28
View File
@@ -223,29 +223,6 @@ OPENAI_API_KEY=sk-xxxxxxxxx
# The LobeChat agents market index url
# AGENTS_INDEX_URL=https://chat-agents.lobehub.com
# #######################################
# ######### Cloud Sandbox Service #######
# #######################################
# Sandbox provider for built-in code execution, shell, file operations, and export.
# Supported values: market, onlyboxes
# SANDBOX_PROVIDER=market
# Required when SANDBOX_PROVIDER=onlyboxes. Base URL of the Onlyboxes console API, without /api/v1.
# ONLYBOXES_BASE_URL=https://onlyboxes.example.com
# Required when SANDBOX_PROVIDER=onlyboxes. Must match Onlyboxes CONSOLE_JIT_SIGNING_KEY.
# ONLYBOXES_JIT_SIGNING_KEY=onlyboxes-jit-signing-secret
# Optional JIT token issuer. Defaults to APP_URL.
# ONLYBOXES_JIT_ISSUER=https://lobehub.example.com
# Optional JIT token TTL in seconds.
# ONLYBOXES_JIT_TTL_SEC=1800
# Optional terminal session lease in seconds for the Onlyboxes provider.
# ONLYBOXES_LEASE_TTL_SEC=900
# #######################################
# ########### Plugin Service ############
# #######################################
@@ -399,11 +376,6 @@ OPENAI_API_KEY=sk-xxxxxxxxx
# Postgres database URL
# DATABASE_URL=postgres://username:password@host:port/database
# Optional: server-side timeout (in milliseconds) for a single SQL statement.
# When set, Postgres aborts any statement/idle transaction exceeding it, so a stuck
# query can't block indefinitely. Leave unset to keep Postgres' default of no timeout.
# DATABASE_STATEMENT_TIMEOUT=300000
# use `openssl rand -base64 32` to generate a key for the encryption of the database
# we use this key to encrypt the user api key and proxy url
# KEY_VAULTS_SECRET=xxxxx/xxxxxxxxxxxxxx=
@@ -75,7 +75,7 @@ runs:
# 1. 上传安装包到版本目录
echo "📦 Uploading release files to s3://$S3_BUCKET/$CHANNEL/$VERSION/"
for file in release/*.dmg release/*.zip release/*.exe release/*.AppImage release/*.deb release/*.rpm release/*.snap release/*.tar.gz release/*.blockmap; do
for file in release/*.dmg release/*.zip release/*.exe release/*.AppImage release/*.deb release/*.rpm release/*.snap release/*.tar.gz; do
if [ -f "$file" ]; then
filename=$(basename "$file")
echo " ↗️ $filename"
-75
View File
@@ -1,75 +0,0 @@
name: Release CLI
permissions:
contents: write
on:
push:
tags:
- 'v*.*.*'
workflow_dispatch:
inputs:
tag:
description: 'Tag name for the release (e.g. v0.1.0)'
required: true
default: 'v0.0.0'
jobs:
build:
name: Build ${{ matrix.target }}
runs-on: ${{ matrix.os }}
# skip pre-release tags (containing '-') on auto-trigger; always run on workflow_dispatch
if: ${{ github.event_name == 'workflow_dispatch' || !contains(github.ref_name, '-') }}
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-latest
target: lobe-linux-x64
- os: macos-latest
target: lobe-macos-arm64
steps:
- uses: actions/checkout@v4
- name: Setup pnpm
uses: pnpm/action-setup@v4
- name: Setup Bun
uses: oven-sh/setup-bun@v2
- name: Install dependencies
run: pnpm install --frozen-lockfile
- name: Build binary
run: |
mkdir -p dist
bun build ./apps/cli/src/index.ts --compile --minify --outfile ./dist/${{ matrix.target }}
- name: Upload artifact
uses: actions/upload-artifact@v4
with:
name: ${{ matrix.target }}
path: ./dist/${{ matrix.target }}
release:
name: Upload to Release
needs: build
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Download all artifacts
uses: actions/download-artifact@v4
with:
path: ./dist
- name: Upload to GitHub Release
uses: softprops/action-gh-release@v2
with:
tag_name: ${{ github.event_name == 'workflow_dispatch' && inputs.tag || github.ref_name }}
files: |
./dist/lobe-linux-x64/lobe-linux-x64
./dist/lobe-macos-arm64/lobe-macos-arm64
./apps/cli/install.sh
+1 -1
View File
@@ -32,7 +32,7 @@ jobs:
runs-on: ubuntu-latest
name: Test Packages
env:
PACKAGES: '@lobechat/file-loaders @lobechat/prompts @lobechat/model-runtime @lobechat/web-crawler @lobechat/electron-server-ipc @lobechat/utils @lobechat/python-interpreter @lobechat/context-engine @lobechat/agent-runtime @lobechat/conversation-flow @lobechat/ssrf-safe-fetch @lobechat/memory-user-memory @lobechat/types @lobechat/builtin-tool-lobe-agent model-bank @lobechat/agent-gateway-client @lobechat/agent-manager-runtime @lobechat/device-gateway-client @lobechat/device-identity @lobechat/eval-dataset-parser @lobechat/eval-rubric @lobechat/fetch-sse @lobechat/heterogeneous-agents'
PACKAGES: '@lobechat/file-loaders @lobechat/prompts @lobechat/model-runtime @lobechat/web-crawler @lobechat/electron-server-ipc @lobechat/utils @lobechat/python-interpreter @lobechat/context-engine @lobechat/agent-runtime @lobechat/conversation-flow @lobechat/ssrf-safe-fetch @lobechat/memory-user-memory @lobechat/types @lobechat/builtin-tool-lobe-agent model-bank'
steps:
- name: Checkout
-3
View File
@@ -28,9 +28,6 @@ prd
# Recordings
.records/
# Agent-gateway probe captures (local debugging dumps)
.agent-gateway/
# Temporary files
.temp/
temp/
+4 -14
View File
@@ -7,7 +7,6 @@ Guidelines for using AI coding agents in this LobeHub repository.
- Next.js 16 + React 19 + TypeScript
- SPA inside Next.js with `react-router-dom`
- `@lobehub/ui`, antd for components; antd-style for CSS-in-JS — **prefer `createStaticStyles` with `cssVar.*`** (zero-runtime); only fall back to `createStyles` + `token` when styles genuinely need runtime computation. See `.cursor/docs/createStaticStyles_migration_guide.md`.
- **Component priority**: `@lobehub/ui/base-ui` (headless primitives) **first**, then `@lobehub/ui` root, then antd as last resort. When the component exists in base-ui, use it — never reach for the root or antd counterpart. Base-ui covers `Select`, `Modal` / `createModal` / `confirmModal`, `DropdownMenu`, `ContextMenu`, `Popover`, `ScrollArea`, `Switch`, `Toast`, `FloatingSheet`. Prefer `@lobehub/ui/base-ui` for new code and migrate root-package call sites opportunistically.
- react-i18next for i18n; zustand for state management
- SWR for data fetching; TRPC for type-safe backend
- Drizzle ORM with PostgreSQL; Vitest for testing
@@ -115,23 +114,14 @@ cd packages/database && bunx vitest run --silent='passed-only' '[file]'
```
- Prefer `vi.spyOn` over `vi.mock`
### Type Checking
```bash
bun run type-check
```
- Tests must pass type check: `bun run type-check`
- After 2 failed fix attempts, stop and ask for help
### i18n
- Add keys to a namespace file under `src/locales/default/` (e.g. `agent.ts`, `auth.ts`)
- Ship en-US and zh-CN by hand in the same PR: write the English source in `src/locales/default/*.ts` and mirror it to `locales/en-US/`; hand-translate `locales/zh-CN/`. Leave all other locales to CI.
- Don't run `pnpm i18n` manually by default — a daily CI workflow (`auto-i18n.yml`) runs it and opens an automated translation PR for any missing keys.
- Run `pnpm i18n` manually only when your branch needs the translated locales immediately, instead of waiting for the daily job (slow; requires `OPENAI_API_KEY`). Note it only fills keys missing from other locales — value-only edits never need it.
### Code Style
- When a single file grows beyond \~800 lines, consider splitting it into multiple files (extract sub-components, hooks, helpers, or types). Smaller, focused files are friendly to humans and agents.
- For dev preview: translate `locales/zh-CN/` and `locales/en-US/`
- `pnpm i18n` is slow; run it manually when locale keys need updating (e.g. before opening a PR).
### Code Review
-29
View File
@@ -2,35 +2,6 @@
# Changelog
## [Version 2.2.1](https://github.com/lobehub/lobe-chat/compare/v0.0.0-nightly.pr15228.13999...v2.2.1)
<sup>Released on **2026-05-29**</sup>
#### ✨ Features
- **device**: device registry TRPC (register / list / update / remove).
- **bot**: add iMessage Desktop setup and bridge.
- **desktop**: show zoom level HUD on Cmd+/- and Cmd+0.
<br/>
<details>
<summary><kbd>Improvements and Fixes</kbd></summary>
#### What's improved
- **device**: device registry TRPC (register / list / update / remove), closes [#15299](https://github.com/lobehub/lobe-chat/issues/15299) ([671b252](https://github.com/lobehub/lobe-chat/commit/671b252))
- **bot**: add iMessage Desktop setup and bridge, closes [#15228](https://github.com/lobehub/lobe-chat/issues/15228) ([6d94635](https://github.com/lobehub/lobe-chat/commit/6d94635))
- **desktop**: show zoom level HUD on Cmd+/- and Cmd+0, closes [#15294](https://github.com/lobehub/lobe-chat/issues/15294) ([109545c](https://github.com/lobehub/lobe-chat/commit/109545c))
</details>
<div align="right">
[![](https://img.shields.io/badge/-BACK_TO_TOP-151515?style=flat-square)](#readme-top)
</div>
### [Version 2.2.0](https://github.com/lobehub/lobe-chat/compare/v2.1.59-canary.27...v2.2.0)
<sup>Released on **2026-05-18**</sup>
-8
View File
@@ -210,14 +210,6 @@ ENV NEXT_PUBLIC_S3_DOMAIN="" \
S3_ENABLE_PATH_STYLE="" \
S3_SET_ACL=""
# Cloud Sandbox
ENV SANDBOX_PROVIDER="" \
ONLYBOXES_BASE_URL="" \
ONLYBOXES_JIT_ISSUER="" \
ONLYBOXES_JIT_SIGNING_KEY="" \
ONLYBOXES_JIT_TTL_SEC="" \
ONLYBOXES_LEASE_TTL_SEC=""
# Model Variables
ENV \
# AI21
-88
View File
@@ -1,88 +0,0 @@
import { execSync } from 'node:child_process';
import { readFileSync } from 'node:fs';
import { fileURLToPath } from 'node:url';
import { describe, expect, it } from 'vitest';
import {
assertGoldenFinalState,
extractGoldenOutcomes,
} from './fixtures/agent-signal/assertGoldenFinalState';
/**
* E2E tests for `lh agent-signal trigger`.
*
* The "golden fixture" block runs fully offline — it is the structural
* regression baseline that the execAgent migration asserts
* against. The "live trigger" block requires a running server + authenticated
* CLI and is gated behind AGENT_SIGNAL_AGENT_ID (or AGENT_ID).
*
* Prerequisites for the live block:
* - `lh` (or LH_CLI_PATH) points at the built CLI
* - User is authenticated (`lh login`) against a dev server with Agent Signal enabled
* - AGENT_SIGNAL_AGENT_ID=<agentId> identifies a target agent the user owns
*/
const CLI = process.env.LH_CLI_PATH || 'lh';
const AGENT_ID = process.env.AGENT_SIGNAL_AGENT_ID || process.env.AGENT_ID;
const TIMEOUT = 60_000;
const goldenPath = fileURLToPath(
new URL('./fixtures/agent-signal/nightly-review.golden.json', import.meta.url),
);
const golden = JSON.parse(readFileSync(goldenPath, 'utf-8'));
function run(args: string): string {
return execSync(`${CLI} ${args}`, {
encoding: 'utf-8',
env: { ...process.env, PATH: `${process.env.HOME}/.bun/bin:${process.env.PATH}` },
timeout: TIMEOUT,
}).trim();
}
describe('agent-signal golden fixture - structural regression', () => {
it('captures a recognizable nightly-review source payload', () => {
expect(golden.source.sourceType).toBe('agent.nightly_review.requested');
expect(golden.source.payload.agentId).toBeTruthy();
expect(golden.source.payload.userId).toBeTruthy();
expect(golden.source.scopeKey).toContain('agent:');
});
it('extracts ideas / write outcomes / brief from finalState', () => {
const outcomes = extractGoldenOutcomes(golden.finalState);
expect(outcomes.ideas.length).toBeGreaterThanOrEqual(1);
expect(outcomes.writeOutcomes.length).toBeGreaterThanOrEqual(1);
expect(outcomes.brief).toBeDefined();
});
it('passes the shared structural assertion', () => {
expect(() => assertGoldenFinalState(golden.finalState)).not.toThrow();
});
it('rejects an empty finalState', () => {
expect(() => assertGoldenFinalState({ messages: [] })).toThrow(/artifact/i);
});
});
describe.skipIf(!AGENT_ID)('lh agent-signal trigger - live', () => {
it('triggers a nightly review and returns a workflow run id', () => {
const output = run(
`agent-signal trigger --source-type agent.nightly_review.requested --agent ${AGENT_ID} --json`,
);
const result = JSON.parse(output);
expect(result).toHaveProperty('accepted');
expect(result).toHaveProperty('scopeKey');
// When Agent Signal is enabled for the account, a workflow run id is returned.
if (result.accepted) {
expect(typeof result.workflowRunId).toBe('string');
expect(result.workflowRunId.length).toBeGreaterThan(0);
}
});
it('exits non-zero on an invalid source type', () => {
expect(() =>
run(`agent-signal trigger --source-type not.a.real.type --agent ${AGENT_ID}`),
).toThrow();
});
});
@@ -1,127 +0,0 @@
/**
* Standalone structural assertions for self-iteration finalState snapshots.
*
* Dependency-free on purpose: the execAgent migration PRs
* import this from server tests AND the CLI e2e suite, so it must not pull in
* vitest or any server-only module. Mirrors the `kind` discrimination used by
* `src/server/services/agentSignal/services/selfIteration/finalStateExtractor.ts`.
*/
export type ToolResultKind = 'artifact' | 'mutation' | 'read';
export interface ToolResultWithKind {
apiName?: string;
data: Record<string, unknown> | unknown;
kind: ToolResultKind;
toolCallId?: string;
}
export interface GoldenOutcomes {
/** The single brief mutation, if any (apiName matches /brief/i). */
brief?: ToolResultWithKind;
/** Artifact tool results whose apiName mentions an idea. */
ideas: ToolResultWithKind[];
/** Artifact tool results whose apiName mentions an intent. */
intents: ToolResultWithKind[];
/** Durable mutation tool results, excluding the brief. */
writeOutcomes: ToolResultWithKind[];
}
interface FinalStateLike {
messages?: unknown[];
}
const isRecord = (value: unknown): value is Record<string, unknown> =>
typeof value === 'object' && value !== null && !Array.isArray(value);
const parseContent = (content: unknown): unknown => {
if (typeof content !== 'string') return content;
try {
return JSON.parse(content);
} catch {
return content;
}
};
/** Extract every tool result of `kind` from a finalState, in message order. */
export const extractFromFinalState = (
finalState: FinalStateLike,
kind: ToolResultKind,
): ToolResultWithKind[] => {
const results: ToolResultWithKind[] = [];
for (const message of finalState.messages ?? []) {
if (!isRecord(message)) continue;
if (message.role !== 'tool') continue;
const content = parseContent(message.content);
const contentRecord = isRecord(content) ? content : undefined;
const pluginState = isRecord(message.pluginState) ? message.pluginState : undefined;
const resultKind = contentRecord?.kind ?? pluginState?.kind;
if (resultKind !== kind) continue;
results.push({
apiName: typeof message.apiName === 'string' ? message.apiName : undefined,
data: contentRecord ?? content,
kind,
toolCallId: typeof message.tool_call_id === 'string' ? message.tool_call_id : undefined,
});
}
return results;
};
const matchesApiName = (result: ToolResultWithKind, pattern: RegExp): boolean =>
typeof result.apiName === 'string' && pattern.test(result.apiName);
const briefText = (brief?: ToolResultWithKind): string => {
if (!brief || !isRecord(brief.data)) return '';
const summary = typeof brief.data.summary === 'string' ? brief.data.summary : '';
const body = typeof brief.data.body === 'string' ? brief.data.body : '';
return `${summary}${body}`.trim();
};
/** Partition a finalState into ideas / intents / writeOutcomes / brief buckets. */
export const extractGoldenOutcomes = (finalState: FinalStateLike): GoldenOutcomes => {
const artifacts = extractFromFinalState(finalState, 'artifact');
const mutations = extractFromFinalState(finalState, 'mutation');
const brief = mutations.find((m) => matchesApiName(m, /brief/i));
return {
brief,
ideas: artifacts.filter((a) => matchesApiName(a, /idea/i)),
intents: artifacts.filter((a) => matchesApiName(a, /intent/i)),
writeOutcomes: mutations.filter((m) => !matchesApiName(m, /brief/i)),
};
};
/**
* Structural regression assertion for a self-iteration finalState.
*
* Throws (with a descriptive message) when the run produced no structured
* output: it requires at least one artifact (idea or intent), at least one
* durable write outcome, and a non-empty brief. Never compares text verbatim.
*/
export const assertGoldenFinalState = (finalState: FinalStateLike): GoldenOutcomes => {
const outcomes = extractGoldenOutcomes(finalState);
const artifactCount = outcomes.ideas.length + outcomes.intents.length;
if (artifactCount < 1) {
throw new Error(`Expected >= 1 artifact (idea/intent) in finalState, found ${artifactCount}`);
}
if (outcomes.writeOutcomes.length < 1) {
throw new Error(
`Expected >= 1 write outcome (mutation) in finalState, found ${outcomes.writeOutcomes.length}`,
);
}
const text = briefText(outcomes.brief);
if (text.length === 0) {
throw new Error('Expected a non-empty brief in finalState, found none');
}
return outcomes;
};
@@ -1,61 +0,0 @@
{
"description": "Desensitized golden snapshot of one nightly-review self-iteration run. Used as a structural regression baseline by the execAgent migration which converges all agent execution paths (chat, self-iteration, memoryWriter, skillManagement) onto a single execAgent entry point. Assert structure, never byte-for-byte: the LLM output is non-deterministic.",
"finalState": {
"messages": [
{
"content": "Run the nightly self-review for the local window.",
"role": "user"
},
{
"apiName": "getEvidenceDigest",
"content": "{\"kind\":\"read\",\"topicCount\":3,\"messageCount\":42,\"window\":\"2026-05-30/2026-05-31\"}",
"role": "tool",
"tool_call_id": "call_read_1"
},
{
"apiName": "recordSelfReviewIdea",
"content": "{\"kind\":\"artifact\",\"idempotencyKey\":\"idea:pref:tone\",\"title\":\"Prefer concise replies\",\"rationale\":\"User repeatedly asked to shorten answers in topic tpc_demo\",\"risk\":\"low\"}",
"role": "tool",
"tool_call_id": "call_idea_1"
},
{
"apiName": "recordSelfReviewIdea",
"content": "{\"kind\":\"artifact\",\"idempotencyKey\":\"idea:skill:drizzle\",\"title\":\"Document Drizzle join helper\",\"rationale\":\"Recurring question about leftJoin usage\",\"risk\":\"medium\"}",
"role": "tool",
"tool_call_id": "call_idea_2"
},
{
"apiName": "writeMemory",
"content": "{\"kind\":\"mutation\",\"status\":\"applied\",\"resourceId\":\"mem_001\",\"summary\":\"Stored tone preference: prefer concise replies\"}",
"pluginState": { "kind": "mutation" },
"role": "tool",
"tool_call_id": "call_mut_1"
},
{
"apiName": "createSelfReviewBrief",
"content": "{\"kind\":\"mutation\",\"briefId\":\"brief_001\",\"summary\":\"Nightly review captured 2 ideas and wrote 1 memory.\",\"body\":\"## Highlights\\n- Prefer concise replies\\n- Document Drizzle join helper\"}",
"role": "tool",
"tool_call_id": "call_brief_1"
},
{
"content": "Nightly review complete. Captured 2 ideas and wrote 1 memory.",
"role": "assistant"
}
]
},
"source": {
"payload": {
"agentId": "agent_demo",
"localDate": "2026-05-30",
"requestedAt": "2026-05-31T04:00:00.000Z",
"reviewWindowEnd": "2026-05-31T04:00:00.000Z",
"reviewWindowStart": "2026-05-30T04:00:00.000Z",
"timezone": "UTC",
"userId": "user_demo"
},
"scopeKey": "agent:agent_demo:user:user_demo",
"sourceId": "nightly-review:user_demo:agent_demo:2026-05-30",
"sourceType": "agent.nightly_review.requested",
"timestamp": 1748664000000
}
}
-78
View File
@@ -1,78 +0,0 @@
#!/bin/sh
set -e
REPO="lobehub/lobe-chat"
BIN_NAME="lh"
# Detect OS
case "$(uname -s)" in
Linux) OS="linux" ;;
Darwin) OS="macos" ;;
*)
printf 'Error: Unsupported OS: %s\n' "$(uname -s)" >&2
exit 1
;;
esac
# Detect architecture
case "$(uname -m)" in
x86_64) ARCH="x64" ;;
aarch64|arm64) ARCH="arm64" ;;
*)
printf 'Error: Unsupported architecture: %s\n' "$(uname -m)" >&2
exit 1
;;
esac
BINARY="lobe-${OS}-${ARCH}"
URL="https://github.com/${REPO}/releases/latest/download/${BINARY}"
printf 'Detected: %s/%s\n' "$OS" "$ARCH"
printf 'Downloading %s...\n' "$BINARY"
TMP="$(mktemp)"
trap 'rm -f "$TMP"' EXIT
if command -v curl >/dev/null 2>&1; then
curl -fsSL "$URL" -o "$TMP"
elif command -v wget >/dev/null 2>&1; then
wget -qO "$TMP" "$URL"
else
printf 'Error: curl or wget is required\n' >&2
exit 1
fi
chmod +x "$TMP"
# Choose install directory: prefer /usr/local/bin, fall back to ~/.local/bin
USE_SUDO=0
if [ -w "/usr/local/bin" ]; then
INSTALL_DIR="/usr/local/bin"
elif command -v sudo >/dev/null 2>&1 && sudo -n true 2>/dev/null; then
INSTALL_DIR="/usr/local/bin"
USE_SUDO=1
else
INSTALL_DIR="${HOME}/.local/bin"
mkdir -p "$INSTALL_DIR"
printf 'Note: No sudo access. Installing to %s\n' "$INSTALL_DIR"
printf 'Add the following to your shell profile if needed:\n'
printf ' export PATH="%s:$PATH"\n' "$INSTALL_DIR"
fi
# Install binary and create symlinks
if [ "$USE_SUDO" = "1" ]; then
sudo cp "$TMP" "${INSTALL_DIR}/${BIN_NAME}"
sudo chmod +x "${INSTALL_DIR}/${BIN_NAME}"
sudo ln -sf "${INSTALL_DIR}/${BIN_NAME}" "${INSTALL_DIR}/lobe"
sudo ln -sf "${INSTALL_DIR}/${BIN_NAME}" "${INSTALL_DIR}/lobehub"
else
cp "$TMP" "${INSTALL_DIR}/${BIN_NAME}"
chmod +x "${INSTALL_DIR}/${BIN_NAME}"
ln -sf "${INSTALL_DIR}/${BIN_NAME}" "${INSTALL_DIR}/lobe"
ln -sf "${INSTALL_DIR}/${BIN_NAME}" "${INSTALL_DIR}/lobehub"
fi
printf '\nInstalled successfully!\n'
printf ' Binary: %s/%s\n' "$INSTALL_DIR" "$BIN_NAME"
printf ' Symlinks: lobe, lobehub -> lh\n\n'
"${INSTALL_DIR}/${BIN_NAME}" --version
+1 -4
View File
@@ -1,6 +1,6 @@
.\" Code generated by `npm run man:generate`; DO NOT EDIT.
.\" Manual command details come from the Commander command tree.
.TH LH 1 "" "@lobehub/cli 0.0.24" "User Commands"
.TH LH 1 "" "@lobehub/cli 0.0.20" "User Commands"
.SH NAME
lh \- LobeHub CLI \- manage and connect to LobeHub services
.SH SYNOPSIS
@@ -65,9 +65,6 @@ Manage agents
.B agent\-group
Manage agent groups
.TP
.B agent\-signal
Inspect and trigger Agent Signal source events
.TP
.B bot
Manage bot integrations
.TP
+2 -4
View File
@@ -1,6 +1,6 @@
{
"name": "@lobehub/cli",
"version": "0.0.24",
"version": "0.0.20",
"type": "module",
"bin": {
"lh": "./dist/index.js",
@@ -30,10 +30,8 @@
"devDependencies": {
"@lobechat/agent-gateway-client": "workspace:*",
"@lobechat/device-gateway-client": "workspace:*",
"@lobechat/device-identity": "workspace:*",
"@lobechat/heterogeneous-agents": "workspace:*",
"@lobechat/local-file-shell": "workspace:*",
"@lobechat/tool-runtime": "workspace:*",
"@trpc/client": "^11.8.1",
"@types/node": "^22.13.5",
"@types/ws": "^8.18.1",
@@ -46,7 +44,7 @@
"picocolors": "^1.1.1",
"superjson": "^2.2.6",
"tsdown": "^0.21.4",
"typescript": "^6.0.3",
"typescript": "^5.9.3",
"ws": "^8.18.1"
},
"publishConfig": {
-4
View File
@@ -1,12 +1,8 @@
packages:
- '../../packages/agent-gateway-client'
- '../../packages/device-gateway-client'
- '../../packages/device-identity'
- '../../packages/heterogeneous-agents'
- '../../packages/local-file-shell'
- '../../packages/tool-runtime'
- '../../packages/prompts'
- '../../packages/const'
- '../../packages/types'
- '../../packages/model-bank'
- '../../packages/business/const'
-20
View File
@@ -70,26 +70,6 @@ export async function getTrpcClient(): Promise<TrpcClient> {
return _client;
}
/**
* Build a Lambda tRPC client from an already-resolved auth context, without
* re-running credential discovery. Use this when the caller already holds a
* token (e.g. `lh connect --token <jwt>`) — `getTrpcClient` would re-resolve
* via env/stored creds and `process.exit(1)` when none exist, which would
* abort an otherwise-valid explicit-token session.
*/
export function createLambdaClient(auth: {
serverUrl: string;
token: string;
tokenType: 'apiKey' | 'jwt' | 'serviceToken';
}): TrpcClient {
const headers =
auth.tokenType === 'apiKey' ? { 'X-API-Key': auth.token } : { 'Oidc-Auth': auth.token };
return createTRPCClient<LambdaRouter>({
links: [httpLink({ headers, transformer: superjson, url: `${auth.serverUrl}/trpc/lambda` })],
});
}
export async function getToolsTrpcClient(): Promise<ToolsTrpcClient> {
if (_toolsClient) return _toolsClient;
+2 -4
View File
@@ -13,7 +13,7 @@ interface CurrentUserResponse {
export async function getUserIdFromApiKey(apiKey: string, serverUrl?: string): Promise<string> {
const normalizedServerUrl = normalizeUrl(serverUrl) || resolveServerUrl();
const response = await fetch(`${normalizedServerUrl}/api/v1/users/me?includeCount=0`, {
const response = await fetch(`${normalizedServerUrl}/api/v1/users/me`, {
headers: {
Authorization: `Bearer ${apiKey}`,
},
@@ -23,9 +23,7 @@ export async function getUserIdFromApiKey(apiKey: string, serverUrl?: string): P
try {
body = (await response.json()) as CurrentUserResponse;
} catch {
throw new Error(
`Failed to parse response from ${normalizedServerUrl}/api/v1/users/me?includeCount=0.`,
);
throw new Error(`Failed to parse response from ${normalizedServerUrl}/api/v1/users/me.`);
}
if (!response.ok || body?.success === false) {
+1 -1
View File
@@ -20,7 +20,7 @@ interface ResolvedAuth {
/**
* Parse the `sub` claim from a JWT without verifying the signature.
*/
export function parseJwtSub(token: string): string | undefined {
function parseJwtSub(token: string): string | undefined {
try {
const payload = JSON.parse(Buffer.from(token.split('.')[1], 'base64url').toString());
return payload.sub;
-129
View File
@@ -1,129 +0,0 @@
import type { Command } from 'commander';
import pc from 'picocolors';
import { getTrpcClient } from '../../api/client';
import { log } from '../../utils/logger';
/**
* Producer source types a developer may trigger manually for local testing.
* Mirrors `AGENT_SIGNAL_TRIGGER_SOURCE_TYPES` on the server; kept inline so the
* CLI bundle does not pull in server-only modules.
*/
const TRIGGER_SOURCE_TYPES = [
'agent.nightly_review.requested',
'agent.self_reflection.requested',
'agent.self_feedback_intent.declared',
'agent.user.message',
'tool.outcome.completed',
'tool.outcome.failed',
] as const;
type TriggerSourceType = (typeof TRIGGER_SOURCE_TYPES)[number];
export function registerAgentSignalCommand(program: Command) {
const agentSignal = program
.command('agent-signal')
.description('Inspect and trigger Agent Signal source events');
agentSignal
.command('trigger')
.description('Trigger an Agent Signal source event for the authenticated user')
.requiredOption(
'--source-type <type>',
`Source type to emit. One of:\n ${TRIGGER_SOURCE_TYPES.join('\n ')}`,
)
.option('--agent <agentId>', 'Target agent ID (required for agent-scoped source types)')
.option('--topic <topicId>', 'Topic ID to scope the event to')
.option('--payload-json <json>', 'JSON object shallow-merged over the default payload')
.option('--source-id <id>', 'Override the auto-derived dedupe source id')
.option('--scope-key <key>', 'Override the auto-derived scope key')
.option('--timestamp <ms>', 'Event timestamp in milliseconds')
.option('--json', 'Output JSON')
.action(
async (options: {
agent?: string;
json?: boolean;
payloadJson?: string;
scopeKey?: string;
sourceId?: string;
sourceType: string;
timestamp?: string;
topic?: string;
}) => {
const sourceType = options.sourceType as TriggerSourceType;
if (!TRIGGER_SOURCE_TYPES.includes(sourceType)) {
console.error(
`${pc.red('✗')} Invalid --source-type "${options.sourceType}". Expected one of: ${TRIGGER_SOURCE_TYPES.join(', ')}`,
);
process.exit(1);
return;
}
let payloadOverride: Record<string, unknown> | undefined;
if (options.payloadJson) {
try {
const parsed = JSON.parse(options.payloadJson);
if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
throw new Error('payload must be a JSON object');
}
payloadOverride = parsed as Record<string, unknown>;
} catch (error: any) {
console.error(`${pc.red('✗')} Failed to parse --payload-json: ${error.message}`);
process.exit(1);
return;
}
}
let timestamp: number | undefined;
if (options.timestamp !== undefined) {
timestamp = Number(options.timestamp);
if (!Number.isFinite(timestamp)) {
console.error(`${pc.red('✗')} --timestamp must be a number (milliseconds)`);
process.exit(1);
return;
}
}
log.debug(
'agent-signal trigger: sourceType=%s agent=%s topic=%s',
sourceType,
options.agent,
options.topic,
);
const client = await getTrpcClient();
try {
const result = await client.agentSignal.triggerSourceEvent.mutate({
agentId: options.agent,
payloadOverride,
scopeKey: options.scopeKey,
sourceId: options.sourceId,
sourceType,
timestamp,
topicId: options.topic,
});
if (options.json) {
console.log(JSON.stringify(result, null, 2));
return;
}
if (!result.accepted) {
console.log(
`${pc.yellow('!')} Agent Signal is disabled for this account — event was not enqueued (scopeKey: ${pc.bold(result.scopeKey)})`,
);
return;
}
console.log(`${pc.green('✓')} Triggered ${pc.bold(sourceType)}`);
console.log(` Scope key: ${result.scopeKey}`);
console.log(` Workflow run id: ${result.workflowRunId}`);
} catch (error: any) {
console.error(`${pc.red('✗')} Failed to trigger source event: ${error.message}`);
process.exit(1);
}
},
);
}
+16 -80
View File
@@ -347,33 +347,22 @@ export function registerAgentCommand(program: Command) {
const { serverUrl, headers, token, tokenType } = await getAgentStreamAuthInfo();
const agentGatewayUrl = options.sse ? undefined : resolveAgentGatewayUrl();
try {
if (agentGatewayUrl) {
await streamAgentEventsViaWebSocket({
gatewayUrl: agentGatewayUrl,
json: options.json,
operationId,
serverUrl,
token,
tokenType,
verbose: options.verbose,
});
} else {
const streamUrl = `${serverUrl}/api/agent/stream?operationId=${encodeURIComponent(operationId)}`;
await streamAgentEvents(streamUrl, headers, {
json: options.json,
verbose: options.verbose,
});
}
} catch (error) {
// The live stream (gateway WS / SSE) dropped before the run finished —
// the run is still executing server-side. Instead of failing, fall back
// to polling the run status until it reaches a terminal state.
if (options.json) throw error;
log.warn(
`Live stream unavailable (${(error as Error).message}). Polling run status every 10s…`,
);
await pollAgentRunStatus(client, operationId);
if (agentGatewayUrl) {
await streamAgentEventsViaWebSocket({
gatewayUrl: agentGatewayUrl,
json: options.json,
operationId,
serverUrl,
token,
tokenType,
verbose: options.verbose,
});
} else {
const streamUrl = `${serverUrl}/api/agent/stream?operationId=${encodeURIComponent(operationId)}`;
await streamAgentEvents(streamUrl, headers, {
json: options.json,
verbose: options.verbose,
});
}
},
);
@@ -637,56 +626,3 @@ function colorStatus(status: string): string {
}
}
}
const TERMINAL_RUN_STATUSES = new Set([
'completed',
'done',
'success',
'failed',
'error',
'cancelled',
'canceled',
'aborted',
]);
/**
* Fallback when the live stream (gateway WebSocket / SSE) drops before the run
* finishes: the run is still executing server-side, so poll its status every 10s
* until it reaches a terminal state (or is no longer tracked, which also means it
* has finished). Avoids hard-exiting on a transient gateway disconnect.
*/
async function pollAgentRunStatus(
client: Awaited<ReturnType<typeof getTrpcClient>>,
operationId: string,
): Promise<void> {
const POLL_MS = 10_000;
let lastStatus = '';
for (let i = 0; ; i++) {
if (i > 0) await new Promise((resolve) => setTimeout(resolve, POLL_MS));
let r: any;
try {
r = await client.aiAgent.getOperationStatus.query({ operationId } as any);
} catch (error) {
log.error(`Status poll failed: ${(error as Error).message}`);
process.exit(1);
}
if (!r) {
log.info('Run is no longer tracked — finished (or expired).');
return;
}
const status = r.status || r.state || 'unknown';
if (status !== lastStatus) {
lastStatus = status;
const steps = r.stepCount !== undefined ? ` · ${r.stepCount} step(s)` : '';
log.info(`Run status: ${colorStatus(status)}${steps}`);
}
if (TERMINAL_RUN_STATUSES.has(status)) {
if (r.error) log.error(`Run error: ${r.error}`);
return;
}
}
}
-1
View File
@@ -15,7 +15,6 @@ vi.mock('../auth/resolveToken', () => ({
}),
}));
vi.mock('../settings', () => ({
loadOrCreateConnectionId: vi.fn().mockReturnValue('test-connection-id'),
loadSettings: vi.fn().mockReturnValue(null),
normalizeUrl: vi.fn((url?: string) => (url ? url.replace(/\/$/, '') : undefined)),
saveSettings: vi.fn(),
+4 -32
View File
@@ -25,8 +25,7 @@ import {
stopDaemon,
writeStatus,
} from '../daemon/manager';
import { registerDevice, resolveDeviceIdentity } from '../device/register';
import { loadOrCreateConnectionId, loadSettings, normalizeUrl, saveSettings } from '../settings';
import { loadSettings, normalizeUrl, saveSettings } from '../settings';
import { executeToolCall } from '../tools';
import { cleanupAllProcesses } from '../tools/shell';
import { log, setVerbose } from '../utils/logger';
@@ -193,19 +192,8 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
const resolvedGatewayUrl = gatewayUrl || OFFICIAL_GATEWAY_URL;
// Resolve a stable device identity. An explicit `--device-id` wins (lets a
// user pin a VM to a fixed identity); otherwise derive from the machine id so
// the same machine + user maps to one device across reconnects.
const identity = resolveDeviceIdentity(auth.userId, options.deviceId);
// Freeform channel label (`cli` by default); `LOBEHUB_CLI_CHANNEL` lets a
// dev build tag itself `cli-dev` so the gateway can prioritise / display it.
const channel = process.env.LOBEHUB_CLI_CHANNEL || 'cli';
const client = new GatewayClient({
channel,
connectionId: loadOrCreateConnectionId(),
deviceId: identity?.deviceId ?? options.deviceId,
deviceId: options.deviceId,
gatewayUrl: resolvedGatewayUrl,
logger: isDaemonChild ? createDaemonLogger() : log,
serverUrl: auth.serverUrl,
@@ -260,14 +248,14 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
// Handle tool call requests
client.on('tool_call_request', async (request: ToolCallRequestMessage) => {
const { requestId, timeout, toolCall } = request;
const { requestId, toolCall } = request;
if (isDaemonChild) {
appendLog(`[TOOL] ${toolCall.apiName} (${requestId})`);
} else {
log.toolCall(toolCall.apiName, requestId, toolCall.arguments);
}
const result = await executeToolCall(toolCall.apiName, toolCall.arguments, timeout);
const result = await executeToolCall(toolCall.apiName, toolCall.arguments);
if (isDaemonChild) {
appendLog(`[RESULT] ${result.success ? 'OK' : 'FAIL'} (${requestId})`);
@@ -280,7 +268,6 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
result: {
content: result.content,
error: result.error,
state: result.state,
success: result.success,
},
});
@@ -399,21 +386,6 @@ async function runConnect(options: ConnectOptions, isDaemonChild: boolean) {
process.exit(0);
});
// Register this device in the server registry before opening the WS, so the
// row exists by the time the gateway reports it online. `lh login` already
// registers, but re-running here is cheap (idempotent upsert) and covers
// `--token` sessions that never went through login. Best-effort: a failure
// must not block the connection.
if (identity) {
try {
// Reuse the already-resolved auth (respects `--token` mode) so we don't
// re-discover creds and exit when none are found.
await registerDevice(auth, identity);
} catch (err) {
error(`Device registration failed (non-fatal): ${(err as Error).message}`);
}
}
// Connect
await client.connect();
}
-109
View File
@@ -8,20 +8,11 @@ import { registerHeteroCommand } from './hetero';
const { mockSpawnAgent } = vi.hoisted(() => ({
mockSpawnAgent: vi.fn(),
}));
const { mockGetTrpcClient, mockHeteroFinishMutate, mockHeteroIngestMutate } = vi.hoisted(() => ({
mockGetTrpcClient: vi.fn(),
mockHeteroFinishMutate: vi.fn(),
mockHeteroIngestMutate: vi.fn(),
}));
vi.mock('@lobechat/heterogeneous-agents/spawn', () => ({
spawnAgent: mockSpawnAgent,
}));
vi.mock('../api/client', () => ({
getTrpcClient: mockGetTrpcClient,
}));
vi.mock('../utils/logger', () => ({
log: { debug: vi.fn(), error: vi.fn(), info: vi.fn(), warn: vi.fn() },
setVerbose: vi.fn(),
@@ -86,17 +77,6 @@ describe('hetero exec command', () => {
}) as any);
stdoutSpy = vi.spyOn(process.stdout, 'write').mockImplementation(() => true);
mockSpawnAgent.mockReset();
mockHeteroIngestMutate.mockReset();
mockHeteroFinishMutate.mockReset();
mockGetTrpcClient.mockReset();
mockHeteroIngestMutate.mockResolvedValue({ ack: true });
mockHeteroFinishMutate.mockResolvedValue({ ack: true });
mockGetTrpcClient.mockResolvedValue({
aiAgent: {
heteroFinish: { mutate: mockHeteroFinishMutate },
heteroIngest: { mutate: mockHeteroIngestMutate },
},
});
});
afterEach(() => {
@@ -556,93 +536,4 @@ describe('hetero exec command', () => {
expect(errorLine).toBeDefined();
});
});
it('sends full text snapshots before tools and waits for finish until all server ingests ack', async () => {
const callOrder: string[] = [];
mockHeteroIngestMutate.mockImplementation(async ({ events }: any) => {
const first = events[0];
callOrder.push(`ingest:${first.type}:${first.data?.chunkType ?? 'terminal'}`);
return { ack: true };
});
mockHeteroFinishMutate.mockImplementation(async () => {
callOrder.push('finish');
return { ack: true };
});
mockSpawnAgent.mockReturnValue(
createFakeHandle({
events: [
{
data: { chunkType: 'text', content: 'hello ' },
operationId: 'op-server',
stepIndex: 0,
timestamp: 1,
type: 'stream_chunk',
},
{
data: { chunkType: 'text', content: 'world' },
operationId: 'op-server',
stepIndex: 0,
timestamp: 2,
type: 'stream_chunk',
},
{
data: {
chunkType: 'tools_calling',
toolsCalling: [
{
apiName: 'Bash',
arguments: '{"cmd":"ls"}',
id: 'tc-1',
identifier: 'bash',
type: 'default',
},
],
},
operationId: 'op-server',
stepIndex: 1,
timestamp: 3,
type: 'stream_chunk',
},
{
data: { reason: 'success' },
operationId: 'op-server',
stepIndex: 1,
timestamp: 4,
type: 'agent_runtime_end',
},
],
exitCode: 0,
}),
);
await runCmd([
'hetero',
'exec',
'--type',
'claude-code',
'--prompt',
'hi',
'--topic',
'topic-1',
'--operation-id',
'op-server',
'--render',
'none',
]);
expect(mockHeteroIngestMutate).toHaveBeenCalledTimes(3);
expect(mockHeteroIngestMutate.mock.calls[0][0].events[0].data).toMatchObject({
chunkType: 'text',
content: 'hello world',
snapshotMode: 'replace',
snapshotSeq: 1,
});
expect(callOrder).toEqual([
'ingest:stream_chunk:text',
'ingest:stream_chunk:tools_calling',
'ingest:agent_runtime_end:terminal',
'finish',
]);
});
});
+16 -103
View File
@@ -1,5 +1,4 @@
import { randomUUID } from 'node:crypto';
import { once } from 'node:events';
import { readFile } from 'node:fs/promises';
import path from 'node:path';
@@ -7,12 +6,12 @@ import type {
AgentContentBlock,
AgentImageSource,
AgentPromptInput,
AgentStreamEvent,
} from '@lobechat/heterogeneous-agents/spawn';
import { spawnAgent } from '@lobechat/heterogeneous-agents/spawn';
import type { Command } from 'commander';
import { getTrpcClient } from '../api/client';
import { BatchIngester, NoopIngestSink } from '../utils/BatchIngester';
import { log } from '../utils/logger';
import { TrpcIngestSink } from '../utils/TrpcIngestSink';
@@ -201,85 +200,6 @@ const resolvePrompt = async (options: ExecOptions): Promise<ResolvedPrompt> => {
return buildPromptFromText(raw, images);
};
class SerialServerIngester {
private accumulatedText = '';
private fatalError: Error | null = null;
private inflight: Promise<void> = Promise.resolve();
private nextSnapshotSeq = 0;
private pendingTextEvent: AgentStreamEvent | undefined;
private timer: ReturnType<typeof setTimeout> | null = null;
constructor(
private readonly sink: TrpcIngestSink,
private readonly snapshotFlushMs = 200,
) {}
push(event: AgentStreamEvent): void {
if (this.fatalError) return;
if (
event.type === 'stream_chunk' &&
event.data?.chunkType === 'text' &&
typeof event.data?.content === 'string'
) {
this.accumulatedText += event.data.content;
this.pendingTextEvent = event;
if (this.timer) clearTimeout(this.timer);
this.timer = setTimeout(() => {
this.timer = null;
this.queuePendingTextSnapshot();
}, this.snapshotFlushMs);
return;
}
this.queuePendingTextSnapshot();
this.enqueue(async () => {
await this.sink.ingest([event]);
});
}
async drain(): Promise<void> {
this.queuePendingTextSnapshot();
try {
await this.inflight;
} catch {
// `fatalError` is re-thrown below.
}
if (this.fatalError) throw this.fatalError;
}
private enqueue(task: () => Promise<void>) {
this.inflight = this.inflight.then(task).catch((err) => {
this.fatalError = err instanceof Error ? err : new Error(String(err));
throw this.fatalError;
});
}
private queuePendingTextSnapshot() {
if (!this.pendingTextEvent || this.fatalError) return;
if (this.timer) {
clearTimeout(this.timer);
this.timer = null;
}
const baseEvent = this.pendingTextEvent;
this.pendingTextEvent = undefined;
const snapshotEvent: AgentStreamEvent = {
...baseEvent,
data: {
...baseEvent.data,
content: this.accumulatedText,
snapshotMode: 'replace',
snapshotSeq: ++this.nextSnapshotSeq,
},
};
this.enqueue(async () => {
await this.sink.ingest([snapshotEvent]);
});
}
}
const exec = async (options: ExecOptions): Promise<void> => {
if (!SUPPORTED_AGENT_TYPES.has(options.type)) {
log.error(
@@ -323,22 +243,17 @@ const exec = async (options: ExecOptions): Promise<void> => {
// server-ingest mode. The tRPC client reads LOBEHUB_JWT (operation-scoped
// JWT injected by the server) for authentication.
const agentType = options.type as 'claude-code' | 'codex';
let sink: TrpcIngestSink | undefined;
let serverIngester: SerialServerIngester | undefined;
let sink: InstanceType<typeof TrpcIngestSink> | InstanceType<typeof NoopIngestSink>;
if (serverIngest) {
const client = await getTrpcClient();
sink = new TrpcIngestSink(
client,
agentType,
operationId,
options.topic!,
process.env.LOBEHUB_ASSISTANT_MESSAGE_ID,
);
serverIngester = new SerialServerIngester(sink);
sink = new TrpcIngestSink(client, agentType, operationId, options.topic!);
} else {
sink = new NoopIngestSink();
}
const ingester = new BatchIngester(sink);
/**
* Spawn one agent process and stream all its events into the server ingester.
* Spawn one agent process and stream all its events into `ingester`.
*
* When `interceptResumeErrors` is true, any `error`-type event whose
* message matches `RESUME_RETRY_PATTERNS` is withheld from the
@@ -382,7 +297,6 @@ const exec = async (options: ExecOptions): Promise<void> => {
// Always pipe to process.stderr too so users see auth prompts / warnings.
const STDERR_CAP = 8 * 1024;
let stderrContent = '';
const stderrEnded = once(handle.stderr, 'end').then(() => undefined);
handle.stderr.on('data', (chunk: Buffer) => {
if (stderrContent.length < STDERR_CAP) {
stderrContent += chunk.toString();
@@ -400,9 +314,9 @@ const exec = async (options: ExecOptions): Promise<void> => {
}
interrupted = true;
handle.kill('SIGINT');
if (serverIngester && sink) {
if (serverIngest) {
try {
await serverIngester.drain();
await ingester.drain();
await sink.finish({ result: 'cancelled' });
} catch {
// best-effort; process is exiting anyway
@@ -411,9 +325,9 @@ const exec = async (options: ExecOptions): Promise<void> => {
};
const onSigterm = async () => {
handle.kill('SIGTERM');
if (serverIngester && sink) {
if (serverIngest) {
try {
await serverIngester.drain();
await ingester.drain();
await sink.finish({ result: 'cancelled' });
} catch {
// best-effort
@@ -442,16 +356,16 @@ const exec = async (options: ExecOptions): Promise<void> => {
}
}
if (emitJsonl) process.stdout.write(`${JSON.stringify(event)}\n`);
serverIngester?.push(event);
ingester.push(event);
}
} catch (err) {
log.error(
'Stream error from agent process:',
err instanceof Error ? err.message : String(err),
);
if (serverIngester && sink) {
if (serverIngest) {
try {
await serverIngester.drain();
await ingester.drain();
await sink.finish({
error: { message: String(err), type: 'stream_error' },
result: 'error',
@@ -467,7 +381,6 @@ const exec = async (options: ExecOptions): Promise<void> => {
}
const { code, signal } = await handle.exit;
await stderrEnded;
// Fallback stderr detection: CC may exit non-zero without emitting a
// result event (e.g. it writes to stderr and quits immediately).
@@ -538,9 +451,9 @@ const exec = async (options: ExecOptions): Promise<void> => {
const { code, signal, sessionId } = result;
if (serverIngester && sink) {
if (serverIngest) {
try {
await serverIngester.drain();
await ingester.drain();
} catch (err) {
log.error(
'Failed to flush events to server:',
-26
View File
@@ -6,10 +6,8 @@ import type { Command } from 'commander';
import { getUserIdFromApiKey } from '../auth/apiKey';
import { saveCredentials } from '../auth/credentials';
import { parseJwtSub } from '../auth/resolveToken';
import { CLI_API_KEY_ENV } from '../constants/auth';
import { OFFICIAL_SERVER_URL } from '../constants/urls';
import { registerDevice, resolveDeviceIdentity } from '../device/register';
import { loadSettings, normalizeUrl, saveSettings } from '../settings';
import { log } from '../utils/logger';
@@ -215,30 +213,6 @@ export function registerLoginCommand(program: Command) {
},
);
// Register this device in the server registry right after auth, so
// the device row exists without waiting for a later `lh connect`
// (which only adds the channel-online step). Mirrors the desktop
// app, which registers on login. Best-effort: a failure here must
// not fail the login.
//
// Skip the `fallback` source: `lh login` has no `--device-id` and
// persists no fallback id, so a machine without a readable
// machine-id would derive a *fresh random* id on every login —
// registering it just spawns orphan device rows that never match
// the id a later `lh connect` resolves. Defer registration to
// `connect` in that case, where the same id is reused for the WS.
const identity = resolveDeviceIdentity(parseJwtSub(body.access_token));
if (identity && identity.identitySource !== 'fallback') {
try {
await registerDevice(
{ serverUrl, token: body.access_token, tokenType: 'jwt' },
identity,
);
} catch (err) {
log.warn(`Device registration failed (non-fatal): ${(err as Error).message}`);
}
}
log.info('Login successful! Credentials saved.');
return;
}
-142
View File
@@ -6,13 +6,9 @@ import { registerTopicCommand } from './topic';
const { mockTrpcClient } = vi.hoisted(() => ({
mockTrpcClient: {
message: {
getMessages: { query: vi.fn() },
},
topic: {
batchDelete: { mutate: vi.fn() },
createTopic: { mutate: vi.fn() },
getTopicDetail: { query: vi.fn() },
getTopics: { query: vi.fn() },
recentTopics: { query: vi.fn() },
removeTopic: { mutate: vi.fn() },
@@ -45,18 +41,6 @@ describe('topic command', () => {
(fn as ReturnType<typeof vi.fn>).mockReset();
}
}
for (const method of Object.values(mockTrpcClient.message)) {
for (const fn of Object.values(method)) {
(fn as ReturnType<typeof vi.fn>).mockReset();
}
}
// Default stub for getTopicDetail
mockTrpcClient.topic.getTopicDetail.query.mockResolvedValue({
favorite: false,
id: 't1',
title: 'Test Topic',
updatedAt: new Date().toISOString(),
});
});
afterEach(() => {
@@ -219,130 +203,4 @@ describe('topic command', () => {
expect(mockTrpcClient.topic.recentTopics.query).toHaveBeenCalledWith({ limit: 10 });
});
});
describe('view', () => {
it('should display topic metadata and messages', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([
{ content: 'Hello world', id: 'm1', role: 'user' },
{ content: 'Hi there', id: 'm2', role: 'assistant' },
]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1']);
expect(mockTrpcClient.topic.getTopicDetail.query).toHaveBeenCalledWith(
expect.objectContaining({ id: 't1' }),
);
expect(mockTrpcClient.message.getMessages.query).toHaveBeenCalledWith(
expect.objectContaining({ topicId: 't1' }),
);
expect(consoleSpy).toHaveBeenCalled();
});
it('should skip message query entirely when --no-messages flag is set', async () => {
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1', '--no-messages']);
// getTopicDetail is still called (for metadata)
expect(mockTrpcClient.topic.getTopicDetail.query).toHaveBeenCalled();
// but getMessages must NOT be called
expect(mockTrpcClient.message.getMessages.query).not.toHaveBeenCalled();
});
it('should output json when --json flag is set', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([
{ content: 'Hello', id: 'm1', role: 'user' },
]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1', '--json']);
const calls = consoleSpy.mock.calls.flat().join('');
const parsed = JSON.parse(calls);
expect(parsed.topic.id).toBe('t1');
expect(parsed.messages).toHaveLength(1);
expect(parsed.messages[0]).toHaveProperty('role', 'user');
expect(parsed.messages[0]).toHaveProperty('content', 'Hello');
});
it('should output json with empty messages for --no-messages --json', async () => {
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1', '--no-messages', '--json']);
expect(mockTrpcClient.message.getMessages.query).not.toHaveBeenCalled();
const calls = consoleSpy.mock.calls.flat().join('');
const parsed = JSON.parse(calls);
expect(parsed.topic.id).toBe('t1');
expect(parsed.messages).toHaveLength(0);
});
it('should respect -L for message page size', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1', '-L', '10']);
expect(mockTrpcClient.message.getMessages.query).toHaveBeenCalledWith(
expect.objectContaining({ pageSize: 10, topicId: 't1' }),
);
});
it('should slice messages with --from and --to', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([
{ content: 'msg1', id: 'm1', role: 'user' },
{ content: 'msg2', id: 'm2', role: 'assistant' },
{ content: 'msg3', id: 'm3', role: 'user' },
]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1', '--from', '2', '--to', '3']);
// Should print only m2 and m3 (index 1 and 2)
const output = consoleSpy.mock.calls.flat().join('\n');
expect(output).toContain('msg2');
expect(output).toContain('msg3');
expect(output).not.toContain('msg1');
});
it('should render tool calls inline', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([
{
content: "I'll search for that.",
id: 'm1',
role: 'assistant',
tools: [
{
function: { arguments: '{"query":"lobehub"}', name: 'web_search' },
id: 'call_1',
type: 'function',
},
],
},
{ content: 'search results...', id: 'm2', role: 'tool' },
]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1']);
const output = consoleSpy.mock.calls.flat().join('\n');
expect(output).toContain('web_search');
expect(output).toContain('lobehub');
});
it('should render threaded messages with indentation', async () => {
mockTrpcClient.message.getMessages.query.mockResolvedValue([
{ content: 'Parent message', id: 'm1', parentId: null, role: 'user' },
{ content: 'Thread reply', id: 'm2', parentId: 'm1', role: 'assistant' },
]);
const program = createProgram();
await program.parseAsync(['node', 'test', 'topic', 'view', 't1']);
const output = consoleSpy.mock.calls.flat().join('\n');
expect(output).toContain('Parent message');
expect(output).toContain('Thread reply');
// thread reply should appear after parent (basic ordering check)
expect(output.indexOf('Thread reply')).toBeGreaterThan(output.indexOf('Parent message'));
});
});
});

Some files were not shown because too many files have changed in this diff Show More