mirror of https://github.com/lobehub/lobe-chat.git (synced 2026-06-14 03:30:19 +00:00)
✨ feat: per-call llm_generation_tracing observability (#15124)
* ✨ feat(database): add llm_generation_tracing schema + tracing package (LOBE-9462)

  Foundation layer for per-call observability of `generateObject` calls.

  - New Drizzle table `llm_generation_tracing` with identity / context / model / result / usage / storage / feedback / audit columns and full single-column index coverage (Postgres bitmap-scan friendly). Migration 0103 is idempotent (CREATE TABLE/INDEX IF NOT EXISTS) for safe re-runs.
  - `LlmGenerationTracingModel` with `record` / `updateFeedback` / `findById` / `listRecent`, all userId-scoped to prevent cross-user leaks.
  - New package `@lobechat/llm-generation-tracing` mirroring agent-tracing's shape: `ITracingStore` interface, `FileTracingStore` (local/dev, scenario subfolders + latest.json symlink), `computePromptHash` (6-char sha256 of systemPrompt + schema), and `TRACING_SCENARIO_REGISTRY` + `resolveScenario` with explicit scenario override.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✨ feat(model-runtime): wire llm_generation_tracing into ModelRuntime.generateObject (LOBE-9462)

  Per-call interception layer: one hook covers all generateObject callers.

  - New `onGenerateObjectComplete` hook on `ModelRuntimeHooks`: always fires (success or failure) with latency, usage, output/error. Fixes the gap where `onGenerateObjectFinal` only fires when the runtime invokes `onUsage`.
  - `S3TracingStore` (zstd level 3, key `llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst`) and `LLMGenerationTracingService` that does DB insert → store.save → patch storage_key. Store failures preserve the row with `metadata.store_error`.
  - `createLLMGenerationTracingHook` + `mergeModelRuntimeHooks` wired into `initModelRuntimeFromDB`; tracing runs alongside business (billing) hooks via `next/server.after()` when available, microtask fallback otherwise. Unknown metadata keys (e.g. `parent_memory_trace_key`) pass through.
  - Memory extractor accepts `parentMemoryTraceKey` option for the job-level backlink.
  - Follow-up-action caller given an explicit `scenario: 'follow_up'` metadata override; it was the only OSS caller missing trigger metadata.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✅ test(llm-generation-tracing): type vi.fn mocks so tsgo accepts mock.calls indexing

  The hook + service tests destructured `mock.calls[0][0]` and accessed nested fields, which tsgo flagged as TS2493 / TS18046 because `vi.fn()` defaults to a zero-arg signature. Add explicit type parameters to the mocks so tsgo can infer the call tuple, and cast `call.payload` at the access point.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ♻️ refactor(model-runtime): move mergeModelRuntimeHooks into the package

  It's a generic utility for composing `ModelRuntimeHooks` instances (same import surface as `ModelRuntime` and the hooks interface), so it belongs alongside them rather than tucked under a server-side consumer.

  - New `packages/model-runtime/src/core/mergeHooks.ts` exports `mergeModelRuntimeHooks` and is re-exported from the package index.
  - Move the unit tests to `packages/model-runtime/src/core/mergeHooks.test.ts`, including a new case covering the "a throws → b is skipped" load-bearing semantics.
  - `src/server/services/llmGenerationTracing/hook.ts` drops the local copy and the consumer (`src/server/modules/ModelRuntime/index.ts`) imports from `@lobechat/model-runtime`.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ♻️ refactor(llm-generation-tracing): version lives with the prompt, not in a central table

  `promptVersion` was baked into `TRACING_SCENARIO_REGISTRY`, far from any prompt definition; editing a prompt + forgetting to bump the entry in a completely different file was an obvious foot-gun.

  - Registry is now `Record<string, string>` mapping trigger → scenario only; it's the stable concern that rarely changes.
  - `resolveScenario` always passes `promptVersion` through from the caller, defaulting to `UNKNOWN_PROMPT_VERSION` ('v0') when absent.
  - Each call site declares its own `*_PROMPT_VERSION` constant next to the prompt it describes. `followUpAction` ships the first one: `FOLLOW_UP_PROMPT_VERSION` in `prompts/index.ts`, threaded through `metadata.promptVersion` at the `generateObject` call.

  Other callers can add the same constant when they next touch their prompts. The 6-char prompt hash on the row still catches forgotten bumps.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✨ feat(input-completion): wire prompt-version metadata at the auto-complete call site

  Aligns input auto-complete with the FOLLOW_UP_PROMPT_VERSION convention so each prompt iteration is recordable as the chat-side tracing lands.

  - `INPUT_COMPLETION_PROMPT_VERSION = 'v1.0'` declared next to `chainInputCompletion`; bump together with the prompt body.
  - `fetchPresetTaskResult` accepts optional `metadata` and forwards it to `getChatCompletion`; the existing chat path already plumbs metadata to `ModelRuntime.chat` options.
  - `InputEditor` call site passes `{ scenario: 'input_completion', promptVersion }`.

  Note: `llm_generation_tracing` currently only fires from `onGenerateObjectComplete`. Input completion is a `chat` call, so this metadata is forward-looking until a chat-side tracing hook lands.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* 🐛 fix(llm-generation-tracing): collapse bucketDir path.join args to silence turbopack glob warning

  Turbopack's static analyzer treats `path.join(root, dyn1, dyn2)` as a multi-segment glob pattern and warned that it could match ~12k files in the project. Compose the relative subdir as a single string first, so `path.join` only sees one dynamic segment. Behavior unchanged: the resulting path is identical.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✨ feat(input-completion): route auto-complete through generateObject for tracing

  Auto-complete is the first preset-task caller migrated to the structured-output path so it lands in `llm_generation_tracing` via the existing `onGenerateObjectComplete` hook. No new server hook, no global chat-side tracing.

  - `chainInputCompletion` now returns `{ messages, schema }` with a minimal `{ completion: string }` schema and a stable `INPUT_COMPLETION_SCHEMA_NAME` constant. JSON wrapping costs ~15-30 tokens against a 100-token completion budget, negligible for the observability win.
  - `StructureOutputSchema` / `StructureOutputParams` accept optional `metadata`; `aiChatRouter.outputJSON` merges caller metadata over the default trigger so `{ scenario, promptVersion, schemaName }` reach `ModelRuntime.generateObject` options unchanged.
  - `IStructureSchema.description` is now optional to match the zod schema; previously the TS type was stricter than runtime validation accepted.
  - `InputEditor` switches from `chatService.fetchPresetTaskResult` to `aiChatService.generateJSON`, reading `response.completion`. Streaming is dropped because auto-complete already buffers the full result before inserting; no UX change.
  - Reverts the unused `metadata` field that was added to `fetchPresetTaskResult` in the previous commit; no current caller needs it now that input completion uses the generateObject path.

  Bumps `INPUT_COMPLETION_PROMPT_VERSION` to v2.0 because the system prompt gained an "output the completion field" instruction.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ♻️ refactor(aiGeneration): extract the runtime-init + generateObject dance into a service

  Every server-side caller that produces structured output was repeating the same two-step ritual: `initModelRuntimeFromDB(...)` → `runtime.generateObject(payload, { metadata })`. `AiGenerationService` collapses it into one call so future cross-cutting concerns (default metadata, retry, observability hooks) have one place to land.

  - New `src/server/services/aiGeneration/index.ts` exposes `generateObject<T>(input, options)` and is unit-tested for provider resolution + payload/metadata pass-through.
  - `aiChatRouter.outputJSON` and `FollowUpActionService.extract` migrated to the service (other callers move organically when next touched).
  - Drops the unused `keyVaultsPayload` field from `StructureOutputParams` and the placeholder at the InputEditor call site; key vaults are server-resolved from DB, the client never supplies them.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ♻️ refactor(tracing): centralize TRACING_SCENARIOS const + inject AiGenerationService via trpc ctx

  - New `packages/const/src/llmGenerationTracing.ts` exports `TRACING_SCENARIOS` + `TracingScenario` type, the single directory where every known scenario name lives. Adds `@lobechat/const` as a workspace dep on llm-generation-tracing so `TRACING_SCENARIO_REGISTRY` can reference the same literals.
  - Callers (FollowUpActionService, InputEditor) replace `'follow_up'` / `'input_completion'` string literals with `TRACING_SCENARIOS.FollowUp` / `.InputCompletion`, so a typo or a rename fails the type-check instead of silently drifting on the row.
  - `AiGenerationService` is now injected into the `aiChatProcedure` ctx middleware alongside `aiChatService`; `outputJSON` consumes it via `ctx.aiGenerationService` instead of new-ing it inside the handler.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ✨ feat(llm-generation-tracing): add lt/llm-tracing CLI + drop local-only storage_key

  - Add `lt` / `llm-tracing` CLI under @lobechat/llm-generation-tracing with `list` (recent records, --scenario filter, --json) and `inspect` (by tracing_id prefix or latest, --full, --json).
  - `FileTracingStore.save` now returns `{ key: null }` so dev DB rows leave `storage_key` empty instead of recording a non-resolvable local path; S3 store remains the source of truth for the real key. Add helpers `findByTracingId` / `getLatest` used by the CLI.
  - Wire `agentId` and `topicId` into `input_completion` tracing metadata from the chat input auto-complete call site.
  - Default `FileTracingStore` whenever NODE_ENV=development (drop the ENABLE_LLM_GENERATION_TRACING_LOCAL opt-in env var).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* 💄 style(llm-generation-tracing): prettier CLI output (tree + colors)

  Mirror the @lobechat/agent-tracing viewer style:

  - Inline ANSI color helpers (dim/bold/cyan/magenta/green/yellow/red).
  - Compact single-line header with id, scenario, version, model, status, time; replaces the multi-line bullet list.
  - Tree structure with `├─`/`└─` connectors instead of `── section ──` banners.
  - input arrays render per-message (role + char count + preview) rather than dumping raw JSON.
  - Small single-key outputs (e.g. `{ completion: "怎么样" }`) collapse to inline `key: "value"`.
  - `lt list` switches to a colored, properly padded table.

  Default view stays compact; --full expands system_prompt / input / schema bodies.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ♻️ refactor(llm-generation-tracing): split `tracing` config out of `metadata`

  `options.metadata` was overloaded: half tracing-specific structured fields (scenario / promptVersion / schemaName / agentId / topicId / ...), half free-form jsonb passthrough. Callers couldn't tell which was which, and the inputHint was always auto-extracted (useless when the prompt wraps the user's text in a template).

  This commit introduces a dedicated `tracing` option:

  - Add `TracingOptions` to @lobechat/llm-generation-tracing, the typed shape callers import (agentId / topicId / inputHint / scenario / promptVersion / schemaName / systemPrompt / parentTracingId / metadata).
  - Add loose `tracing?: Record<string, unknown>` to GenerateObjectOptions and StructureOutputParams / StructureOutputSchema so the field flows through the runtime + TRPC.
  - Tracing hook now reads `context.options.tracing` for structured fields; it still falls back to `metadata.trigger` for the cross-cutting trigger string (ModelRuntime itself uses metadata.trigger for timing logs, so trigger stays on metadata).
  - Service `record()` accepts an explicit `inputHint`; otherwise falls back to auto-extraction from the first user message. Always truncated.
  - Free-form jsonb fields move to `tracing.metadata` (was unknown-key passthrough on `metadata`).
  - Call sites updated:
    - FollowUpAction now passes `tracing: { scenario, promptVersion, schemaName, topicId }` (previously `metadata`).
    - InputCompletion now passes `tracing: { agentId, topicId, inputHint: input, scenario, promptVersion, schemaName }`; `inputHint` is the user's actual typed text, not the wrapper prompt's first user message.
    - `aiChat.outputJSON` router forwards both metadata and tracing.

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* Update inputCompletion.ts

* 🐛 fix(llm-generation-tracing): stop duplicating provider into the row's metadata jsonb

  `provider` is already a first-class column on the `llm_generation_tracing` row, so auto-stamping it into the `metadata` jsonb column on every call was pure noise. The hook now writes the caller-supplied `tracing.metadata` verbatim (empty/undefined when the caller had nothing to add).

  Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
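The commit message gives the S3 object key template as `llm-generation-tracing/{scenario}/{v}-{hash}/{date}/{id}.json.zst` but does not show how it is assembled. The following is a minimal sketch of one way such a key could be composed. `buildTracingKey`, `TracingKeyParts` and `formatDate` are hypothetical names, and rendering `{date}` as a UTC `YYYY-MM-DD` string is an assumption; only the template itself comes from the commit.

```typescript
// Hypothetical helper: only the key template is taken from the commit message.
interface TracingKeyParts {
  date: Date;
  id: string;
  promptHash: string;
  promptVersion: string;
  scenario: string;
}

// Assumption: `{date}` is a UTC calendar day in `YYYY-MM-DD` form.
const formatDate = (date: Date): string => date.toISOString().slice(0, 10);

// Joins the documented segments: prefix / scenario / version-hash / date / id.json.zst
const buildTracingKey = (parts: TracingKeyParts): string =>
  [
    'llm-generation-tracing',
    parts.scenario,
    `${parts.promptVersion}-${parts.promptHash}`,
    formatDate(parts.date),
    `${parts.id}.json.zst`,
  ].join('/');

const exampleKey = buildTracingKey({
  date: new Date('2026-06-14T03:30:19Z'),
  id: 'abc',
  promptHash: 'ab1fc3',
  promptVersion: 'v1.0',
  scenario: 'home_brief',
});
```

Grouping by `{v}-{hash}` under each scenario would keep payloads produced by one prompt iteration in the same prefix, which seems to be the intent of the template.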
@@ -106,6 +106,7 @@ vertex-ai-key.json
# Agent tracing snapshots
.agent-tracing/
.llm-generation-tracing/

# AI coding tools
.local/

@@ -264,6 +264,7 @@
    "@lobechat/fetch-sse": "workspace:*",
    "@lobechat/file-loaders": "workspace:*",
    "@lobechat/heterogeneous-agents": "workspace:*",
    "@lobechat/llm-generation-tracing": "workspace:*",
    "@lobechat/local-file-shell": "workspace:*",
    "@lobechat/markdown-patch": "workspace:*",
    "@lobechat/memory-user-memory": "workspace:*",

@@ -10,6 +10,7 @@ export * from './file';
export * from './interests';
export * from './klavis';
export * from './layoutTokens';
export * from './llmGenerationTracing';
export * from './lobehubSkill';
export * from './message';
export * from './meta';

@@ -0,0 +1,25 @@
/**
 * Canonical directory of every `llm_generation_tracing` scenario value.
 *
 * Add to this map whenever a new caller pipes through the tracing path so
 * there's one place to scan for all known scenarios. Values are the literal
 * strings persisted on the row's `scenario` column; keep them stable, they
 * are dashboard / partition keys.
 */
export const TRACING_SCENARIOS = {
  AgentSignal: 'agent_signal',
  AgentWelcome: 'agent_welcome',
  FollowUp: 'follow_up',
  HomeBrief: 'home_brief',
  InputCompletion: 'input_completion',
  MemoryExtract: 'memory_extract',
  SignalFeedbackDomain: 'signal_feedback_domain',
  SignalFeedbackSatisfaction: 'signal_feedback_satisfaction',
  SignalSkillIntent: 'signal_skill_intent',
  SignalSkillManagement: 'signal_skill_management',
  SignupEmailReview: 'signup_email_review',
  TopicTitle: 'topic_title',
  Unknown: 'unknown',
} as const;

export type TracingScenario = (typeof TRACING_SCENARIOS)[keyof typeof TRACING_SCENARIOS];

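Because `TracingScenario` is a union of the literal values in the map, untyped strings (for example a `--scenario` CLI flag or a value read back from a row) need narrowing before they can be treated as a known scenario. The sketch below shows one possible way to do that. `isTracingScenario` and `normalizeScenario` are hypothetical helpers that are not part of this commit, and the map is abbreviated so the example runs on its own.

```typescript
// Abbreviated copy of the map so the sketch is self-contained.
const TRACING_SCENARIOS = {
  FollowUp: 'follow_up',
  InputCompletion: 'input_completion',
  Unknown: 'unknown',
} as const;

type TracingScenario = (typeof TRACING_SCENARIOS)[keyof typeof TRACING_SCENARIOS];

// Set of every literal value, used for O(1) membership checks.
const KNOWN_SCENARIOS = new Set<string>(Object.values(TRACING_SCENARIOS));

// Hypothetical type guard: narrows an arbitrary string to the union.
const isTracingScenario = (value: string): value is TracingScenario =>
  KNOWN_SCENARIOS.has(value);

// Hypothetical normaliser: unrecognised input collapses to the `unknown` sentinel.
const normalizeScenario = (value: string): TracingScenario =>
  isTracingScenario(value) ? value : TRACING_SCENARIOS.Unknown;
```

Deriving the set from the map keeps a single source of truth, so adding a scenario to the map is enough for the guard to accept it.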
@@ -0,0 +1,162 @@
// @vitest-environment node
import { afterEach, beforeEach, describe, expect, it } from 'vitest';

import { getTestDB } from '../../core/getTestDB';
import { llmGenerationTracing, users } from '../../schemas';
import type { LobeChatDatabase } from '../../type';
import { LlmGenerationTracingModel } from '../llmGenerationTracing';

const serverDB: LobeChatDatabase = await getTestDB();

const userId = 'llm-gen-trace-test-user';
const otherUserId = 'llm-gen-trace-other-user';

beforeEach(async () => {
  await serverDB.delete(users);
  await serverDB.insert(users).values([{ id: userId }, { id: otherUserId }]);
});

afterEach(async () => {
  await serverDB.delete(llmGenerationTracing);
  await serverDB.delete(users);
});

describe('LlmGenerationTracingModel', () => {
  describe('record', () => {
    it('inserts a row and returns the generated uuid', async () => {
      const model = new LlmGenerationTracingModel(serverDB, userId);

      const { id } = await model.record({
        inputHint: 'hello world',
        inputTokens: 120,
        latencyMs: 850,
        model: 'gpt-4o-mini',
        outputTokens: 40,
        promptHash: 'ab1fc3',
        promptVersion: 'v1.0',
        provider: 'openai',
        scenario: 'home_brief',
        schemaName: 'HomeBriefOutputSchema',
        success: true,
        trigger: 'home_brief',
      });

      expect(id).toMatch(/^[0-9a-f-]{36}$/);

      const row = await model.findById(id);
      expect(row).toMatchObject({
        id,
        inputHint: 'hello world',
        inputTokens: 120,
        latencyMs: 850,
        metadata: {},
        model: 'gpt-4o-mini',
        outputTokens: 40,
        promptHash: 'ab1fc3',
        promptVersion: 'v1.0',
        provider: 'openai',
        scenario: 'home_brief',
        schemaName: 'HomeBriefOutputSchema',
        success: true,
        trigger: 'home_brief',
        userId,
        validationFailed: false,
      });
      expect(row?.createdAt).toBeInstanceOf(Date);
    });

    it('records a failure with error fields and validation flag', async () => {
      const model = new LlmGenerationTracingModel(serverDB, userId);

      const { id } = await model.record({
        errorCode: 'validation_failed',
        errorDetail: 'output missing required field "summary"',
        latencyMs: 1200,
        model: 'gpt-4o',
        promptHash: 'cccccc',
        promptVersion: 'v1.0',
        provider: 'openai',
        scenario: 'topic_title',
        success: false,
        validationFailed: true,
      });

      const row = await model.findById(id);
      expect(row).toMatchObject({
        errorCode: 'validation_failed',
        errorDetail: 'output missing required field "summary"',
        success: false,
        validationFailed: true,
      });
    });
  });

  describe('updateFeedback', () => {
    it('writes feedback columns and the updated timestamp', async () => {
      const model = new LlmGenerationTracingModel(serverDB, userId);
      const { id } = await model.record({
        promptHash: 'aaaaaa',
        promptVersion: 'v1.0',
        scenario: 'agent_welcome',
        success: true,
      });

      await model.updateFeedback(id, {
        data: { clicked_question_index: 1 },
        score: 1,
        signal: 'positive',
        source: 'explicit_thumbs',
      });

      const row = await model.findById(id);
      expect(row).toMatchObject({
        feedbackData: { clicked_question_index: 1 },
        feedbackScore: 1,
        feedbackSignal: 'positive',
        feedbackSource: 'explicit_thumbs',
      });
      expect(row?.feedbackUpdatedAt).toBeInstanceOf(Date);
    });

    it("does not touch another user's row", async () => {
      const owner = new LlmGenerationTracingModel(serverDB, userId);
      const intruder = new LlmGenerationTracingModel(serverDB, otherUserId);

      const { id } = await owner.record({
        promptHash: 'aaaaaa',
        promptVersion: 'v1.0',
        scenario: 'follow_up',
        success: true,
      });

      await intruder.updateFeedback(id, {
        signal: 'negative',
        source: 'manual_edit',
      });

      const row = await owner.findById(id);
      expect(row?.feedbackSignal).toBeNull();
    });
  });

  describe('findById / listRecent', () => {
    it('only returns rows owned by the caller', async () => {
      const owner = new LlmGenerationTracingModel(serverDB, userId);
      const stranger = new LlmGenerationTracingModel(serverDB, otherUserId);

      const { id } = await owner.record({
        promptHash: 'aaaaaa',
        promptVersion: 'v1.0',
        scenario: 'memory_extract',
        success: true,
      });

      expect(await stranger.findById(id)).toBeNull();
      expect(await stranger.listRecent()).toHaveLength(0);

      const rows = await owner.listRecent();
      expect(rows).toHaveLength(1);
      expect(rows[0].id).toBe(id);
    });
  });
});

@@ -0,0 +1,121 @@
import { and, desc, eq } from 'drizzle-orm';

import type {
  LlmGenerationFeedbackSignal,
  LlmGenerationFeedbackSource,
  NewLlmGenerationTracing,
} from '../schemas/llmGenerationTracing';
import { llmGenerationTracing } from '../schemas/llmGenerationTracing';
import type { LobeChatDatabase } from '../type';

export interface RecordLlmGenerationParams {
  agentId?: string | null;
  costUsd?: number | null;
  errorCode?: string | null;
  errorDetail?: string | null;
  inputHash?: string | null;
  inputHint?: string | null;
  inputTokens?: number | null;
  latencyMs?: number | null;
  metadata?: Record<string, unknown>;
  model?: string | null;
  outputTokens?: number | null;
  parentTracingId?: string | null;
  promptHash: string;
  promptVersion: string;
  provider?: string | null;
  scenario: string;
  schemaName?: string | null;
  spanId?: string | null;
  storageKey?: string | null;
  success: boolean;
  topicId?: string | null;
  traceId?: string | null;
  trigger?: string | null;
  validationFailed?: boolean;
}

export interface UpdateLlmGenerationFeedbackParams {
  data?: Record<string, unknown>;
  score?: number | null;
  signal: LlmGenerationFeedbackSignal;
  source: LlmGenerationFeedbackSource;
}

export class LlmGenerationTracingModel {
  private readonly db: LobeChatDatabase;
  private readonly userId: string;

  constructor(db: LobeChatDatabase, userId: string) {
    this.db = db;
    this.userId = userId;
  }

  async record(params: RecordLlmGenerationParams): Promise<{ id: string }> {
    const values: NewLlmGenerationTracing = {
      agentId: params.agentId ?? null,
      costUsd: params.costUsd ?? null,
      errorCode: params.errorCode ?? null,
      errorDetail: params.errorDetail ?? null,
      inputHash: params.inputHash ?? null,
      inputHint: params.inputHint ?? null,
      inputTokens: params.inputTokens ?? null,
      latencyMs: params.latencyMs ?? null,
      metadata: params.metadata ?? {},
      model: params.model ?? null,
      outputTokens: params.outputTokens ?? null,
      parentTracingId: params.parentTracingId ?? null,
      promptHash: params.promptHash,
      promptVersion: params.promptVersion,
      provider: params.provider ?? null,
      scenario: params.scenario,
      schemaName: params.schemaName ?? null,
      spanId: params.spanId ?? null,
      storageKey: params.storageKey ?? null,
      success: params.success,
      topicId: params.topicId ?? null,
      traceId: params.traceId ?? null,
      trigger: params.trigger ?? null,
      userId: this.userId,
      validationFailed: params.validationFailed ?? false,
    };

    const [row] = await this.db
      .insert(llmGenerationTracing)
      .values(values)
      .returning({ id: llmGenerationTracing.id });

    return { id: row.id };
  }

  async updateFeedback(id: string, params: UpdateLlmGenerationFeedbackParams): Promise<void> {
    await this.db
      .update(llmGenerationTracing)
      .set({
        feedbackData: params.data,
        feedbackScore: params.score ?? null,
        feedbackSignal: params.signal,
        feedbackSource: params.source,
        feedbackUpdatedAt: new Date(),
      })
      .where(and(eq(llmGenerationTracing.id, id), eq(llmGenerationTracing.userId, this.userId)));
  }

  async findById(id: string) {
    const [row] = await this.db
      .select()
      .from(llmGenerationTracing)
      .where(and(eq(llmGenerationTracing.id, id), eq(llmGenerationTracing.userId, this.userId)))
      .limit(1);
    return row ?? null;
  }

  async listRecent(limit = 50) {
    return this.db
      .select()
      .from(llmGenerationTracing)
      .where(eq(llmGenerationTracing.userId, this.userId))
      .orderBy(desc(llmGenerationTracing.createdAt))
      .limit(limit);
  }
}

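Every read and update in the model above pairs the row id with `this.userId` in its `where` clause, which is what makes a foreign id behave like a missing row. The following in-memory sketch illustrates that scoping rule without a database. `ScopedRows` and `Row` are hypothetical stand-ins, not the Drizzle model, and only the "id AND userId" predicate is taken from the diff.

```typescript
// Hypothetical in-memory analogue of the userId-scoped model (no database involved).
interface Row {
  feedbackSignal: string | null;
  id: string;
  userId: string;
}

class ScopedRows {
  private readonly rows: Row[];
  private readonly userId: string;

  constructor(rows: Row[], userId: string) {
    this.rows = rows;
    this.userId = userId;
  }

  // Mirrors `where(and(eq(id), eq(userId)))`: both conditions must hold.
  findById(id: string): Row | null {
    return this.rows.find((row) => row.id === id && row.userId === this.userId) ?? null;
  }

  // An update against someone else's row matches nothing and is a silent no-op.
  updateFeedback(id: string, signal: string): void {
    const row = this.findById(id);
    if (row) row.feedbackSignal = signal;
  }
}

// Shared backing store with one row owned by `alice`.
const table: Row[] = [{ feedbackSignal: null, id: 'row-1', userId: 'alice' }];
const owner = new ScopedRows(table, 'alice');
const intruder = new ScopedRows(table, 'bob');

intruder.updateFeedback('row-1', 'negative');
const afterIntruder = owner.findById('row-1')?.feedbackSignal ?? null;

owner.updateFeedback('row-1', 'positive');
const afterOwner = owner.findById('row-1')?.feedbackSignal ?? null;
```

Note the trade-off this pattern implies: a cross-user update does not raise an error, so callers cannot distinguish "not yours" from "not found".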
@@ -0,0 +1,18 @@
{
  "name": "@lobechat/llm-generation-tracing",
  "version": "1.0.0",
  "private": true,
  "exports": {
    ".": "./src/index.ts",
    "./store": "./src/store/types.ts"
  },
  "main": "./src/index.ts",
  "bin": {
    "llm-tracing": "./src/cli/index.ts",
    "lt": "./src/cli/index.ts"
  },
  "dependencies": {
    "@lobechat/const": "workspace:*",
    "commander": "^13.1.0"
  }
}

@@ -0,0 +1,18 @@
#!/usr/bin/env bun

import { Command } from 'commander';

import { registerInspectCommand } from './inspect';
import { registerListCommand } from './list';

const program = new Command();

program
  .name('llm-tracing')
  .description('Inspect local llm-generation-tracing records under .llm-generation-tracing/')
  .version('1.0.0');

registerInspectCommand(program);
registerListCommand(program);

program.parse();

@@ -0,0 +1,35 @@
import type { Command } from 'commander';

import { FileTracingStore } from '../store/file-store';
import type { TracingPayload } from '../types';
import { renderPayloadDetail } from '../viewer';

export function registerInspectCommand(program: Command) {
  program
    .command('inspect', { isDefault: true })
    .alias('i')
    .description('Inspect a tracing record by tracing_id prefix (defaults to latest)')
    .argument('[tracingId]', 'tracing_id or prefix; omit to inspect the latest record')
    .option('-j, --json', 'Output the raw JSON payload')
    .option('-f, --full', 'Show full system_prompt / input / output (no truncation)')
    .action(async (tracingId: string | undefined, opts: { full?: boolean; json?: boolean }) => {
      const store = new FileTracingStore();
      let record: TracingPayload | null;
      if (tracingId) {
        record = await store.findByTracingId(tracingId);
      } else {
        record = await store.getLatest();
      }

      if (!record) {
        console.error(
          tracingId
            ? `No tracing record matched id prefix: ${tracingId}`
            : 'No tracing records found. Run a generateObject call first (NODE_ENV=development).',
        );
        process.exit(1);
      }

      console.info(opts.json ? JSON.stringify(record, null, 2) : renderPayloadDetail(record, opts));
    });
}

@@ -0,0 +1,21 @@
import type { Command } from 'commander';

import { FileTracingStore } from '../store/file-store';
import { renderSummaryTable } from '../viewer';

export function registerListCommand(program: Command) {
  program
    .command('list')
    .alias('ls')
    .description('List recent llm-generation-tracing records (newest first)')
    .option('-l, --limit <n>', 'Max number of records to show', '20')
    .option('-s, --scenario <name>', 'Filter by scenario (e.g. input_completion, topic_title)')
    .option('-j, --json', 'Output as JSON instead of a table')
    .action(async (opts: { json?: boolean; limit: string; scenario?: string }) => {
      const store = new FileTracingStore();
      let limit = Number.parseInt(opts.limit, 10);
      if (Number.isNaN(limit) || limit < 1) limit = 20;
      const summaries = await store.list({ limit, scenario: opts.scenario });
      console.info(opts.json ? JSON.stringify(summaries, null, 2) : renderSummaryTable(summaries));
    });
}

@@ -0,0 +1,19 @@
export { computeInputHash, computePromptHash } from './promptHash';
export type { ResolveScenarioInput } from './registry';
export {
  resolveScenario,
  TRACING_SCENARIO_REGISTRY,
  UNKNOWN_PROMPT_VERSION,
  UNKNOWN_SCENARIO,
} from './registry';
export { DEFAULT_DIR, FileTracingStore } from './store/file-store';
export type { ITracingStore, SaveResult } from './store/types';
export type {
  LlmGenerationFeedbackSignal,
  ScenarioDefinition,
  TracingErrorPayload,
  TracingModelMetadata,
  TracingOptions,
  TracingPayload,
  TracingSummary,
} from './types';

@@ -0,0 +1,42 @@
import { describe, expect, it } from 'vitest';

import { computeInputHash, computePromptHash } from './promptHash';

describe('computePromptHash', () => {
  it('returns a 6-char hex digest', () => {
    const hash = computePromptHash('you are a helpful agent', { type: 'object' });
    expect(hash).toHaveLength(6);
    expect(hash).toMatch(/^[0-9a-f]{6}$/);
  });

  it('is stable across calls with the same input', () => {
    const a = computePromptHash('prompt-A', { foo: 1 });
    const b = computePromptHash('prompt-A', { foo: 1 });
    expect(a).toBe(b);
  });

  it('changes when system prompt changes', () => {
    const a = computePromptHash('prompt-A', { foo: 1 });
    const b = computePromptHash('prompt-B', { foo: 1 });
    expect(a).not.toBe(b);
  });

  it('changes when schema changes', () => {
    const a = computePromptHash('prompt', { foo: 1 });
    const b = computePromptHash('prompt', { foo: 2 });
    expect(a).not.toBe(b);
  });

  it('treats missing schema and empty schema differently', () => {
    const undef = computePromptHash('prompt', undefined);
    const empty = computePromptHash('prompt', {});
    expect(undef).not.toBe(empty);
  });
});

describe('computeInputHash', () => {
  it('returns a full-length sha256 hex', () => {
    expect(computeInputHash('hello')).toHaveLength(64);
    expect(computeInputHash({ a: 1 })).toMatch(/^[0-9a-f]{64}$/);
  });
});

@@ -0,0 +1,29 @@
import { createHash } from 'node:crypto';

const SHORT_LENGTH = 6;

/**
 * Compute the 6-char prompt hash used to detect silently-mutated prompts.
 *
 * Hash input: `systemPrompt + '\n---\n' + JSON.stringify(schema)`; schema MUST
 * be a deterministic JSON form (e.g. zod-to-json-schema). Keys in objects are
 * stringified in insertion order; the caller is responsible for normalising.
 */
export const computePromptHash = (systemPrompt: string, schema: unknown): string => {
  const schemaPart = schema === undefined ? '' : JSON.stringify(schema);
  const hash = createHash('sha256');
  hash.update(systemPrompt);
  hash.update('\n---\n');
  hash.update(schemaPart);
  return hash.digest('hex').slice(0, SHORT_LENGTH);
};

/**
 * sha256 of normalized input, for dedup / cache-hit analysis. Returned as the
 * full hex digest; truncate at the caller if storage size matters.
 */
export const computeInputHash = (input: unknown): string => {
  const hash = createHash('sha256');
  hash.update(typeof input === 'string' ? input : JSON.stringify(input));
  return hash.digest('hex');
};

@@ -0,0 +1,52 @@
|
||||
import { describe, expect, it } from 'vitest';
|
||||
|
||||
import {
|
||||
resolveScenario,
|
||||
TRACING_SCENARIO_REGISTRY,
|
||||
UNKNOWN_PROMPT_VERSION,
|
||||
UNKNOWN_SCENARIO,
|
||||
} from './registry';
|
||||
|
||||
describe('TRACING_SCENARIO_REGISTRY', () => {
|
||||
it('maps known triggers to scenario names (no versions)', () => {
|
||||
expect(TRACING_SCENARIO_REGISTRY.topic).toBe('topic_title');
|
||||
expect(TRACING_SCENARIO_REGISTRY.memory).toBe('memory_extract');
|
||||
});
|
||||
});
|
||||
|
||||
describe('resolveScenario', () => {
|
||||
it('looks the scenario up by trigger and uses the caller-supplied promptVersion', () => {
|
||||
expect(resolveScenario({ promptVersion: 'v3.1', trigger: 'topic' })).toEqual({
|
||||
promptVersion: 'v3.1',
|
||||
scenario: 'topic_title',
|
||||
});
|
||||
});
|
||||
|
||||
it('honours an explicit scenario override even when trigger has a registry mapping', () => {
|
||||
expect(
|
||||
resolveScenario({
|
||||
promptVersion: 'v2.1',
|
||||
scenario: 'signal_skill_intent',
|
||||
trigger: 'agent_signal',
|
||||
}),
|
||||
).toEqual({ promptVersion: 'v2.1', scenario: 'signal_skill_intent' });
|
||||
});
|
||||
|
||||
it('falls back to UNKNOWN_PROMPT_VERSION when no version is provided', () => {
|
||||
expect(resolveScenario({ scenario: 'custom_thing' })).toEqual({
|
||||
promptVersion: UNKNOWN_PROMPT_VERSION,
|
||||
scenario: 'custom_thing',
|
||||
});
|
||||
});
|
||||
|
||||
it('falls back to the unknown scenario sentinel when neither matches', () => {
|
||||
expect(resolveScenario({ trigger: 'does_not_exist' })).toEqual({
|
||||
promptVersion: UNKNOWN_PROMPT_VERSION,
|
||||
scenario: UNKNOWN_SCENARIO,
|
||||
});
|
||||
expect(resolveScenario({})).toEqual({
|
||||
promptVersion: UNKNOWN_PROMPT_VERSION,
|
||||
scenario: UNKNOWN_SCENARIO,
|
||||
});
|
||||
});
|
||||
});
@@ -0,0 +1,67 @@
import { TRACING_SCENARIOS, type TracingScenario } from '@lobechat/const';

import type { ScenarioDefinition } from './types';

/**
 * Stable `trigger → scenario` mapping. Maps a `RequestTrigger` value to the
 * default scenario name used for tracing.
 *
 * Triggers that fan out into multiple scenarios (e.g. `agent_signal` →
 * `signal_skill_intent` / `signal_feedback_satisfaction` / ...) only get a
 * generic default entry here; those callers pass an explicit
 * `metadata.scenario`, which takes precedence over the registry.
 *
 * **Note on prompt versions**: version intentionally lives next to the prompt
 * it describes (see `tracing.ts` files / `*_PROMPT_VERSION` constants near the
 * `generateObject` call site). When the prompt or schema changes, bump that
 * local constant — keeping the version next to the thing it versions avoids
 * the drift you'd get from a central table that nobody remembers to update.
 *
 * For the full directory of scenario *names*, see `@lobechat/const`
 * `TRACING_SCENARIOS`.
 */
export const TRACING_SCENARIO_REGISTRY: Record<string, TracingScenario> = {
  agent_signal: TRACING_SCENARIOS.AgentSignal,
  memory: TRACING_SCENARIOS.MemoryExtract,
  signup_email_llm_review: TRACING_SCENARIOS.SignupEmailReview,
  topic: TRACING_SCENARIOS.TopicTitle,
};

export const UNKNOWN_SCENARIO = TRACING_SCENARIOS.Unknown;
export const UNKNOWN_PROMPT_VERSION = 'v0';

export interface ResolveScenarioInput {
  /**
   * Prompt version supplied by the caller. Conventionally a `v<major>.<minor>`
   * constant declared next to the prompt definition. Missing values resolve to
   * `UNKNOWN_PROMPT_VERSION` so tracing still records the row.
   */
  promptVersion?: string;
  /** Override scenario name (e.g. `signal_skill_intent`); takes precedence over registry. */
  scenario?: string;
  /** RequestTrigger value (string form). */
  trigger?: string;
}

/**
 * Pick the `{ scenario, promptVersion }` for a tracing record.
 *
 * Resolution order:
 *   1. `input.scenario` if provided
 *   2. registry lookup by `input.trigger`
 *   3. `UNKNOWN_SCENARIO` sentinel
 *
 * `promptVersion` is always passed through from the caller (or
 * `UNKNOWN_PROMPT_VERSION` if absent). The registry never assigns versions —
 * they live with the prompt.
 */
export const resolveScenario = (input: ResolveScenarioInput): ScenarioDefinition => {
  const scenario =
    input.scenario ??
    (input.trigger ? TRACING_SCENARIO_REGISTRY[input.trigger] : undefined) ??
    UNKNOWN_SCENARIO;
  return {
    promptVersion: input.promptVersion ?? UNKNOWN_PROMPT_VERSION,
    scenario,
  };
};
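The three-step resolution order documented on `resolveScenario` can be exercised with a small standalone sketch. Everything here is a stand-in so the block runs by itself: `REGISTRY` copies two entries from the registry above, the `'unknown'` sentinel value is an assumption (the real value comes from `TRACING_SCENARIOS.Unknown`), and `TOPIC_TITLE_PROMPT_VERSION` is a hypothetical call-site constant of the kind the registry comment describes.

```typescript
// Stand-ins so the sketch is self-contained; real values live in ./registry.
const REGISTRY: Record<string, string> = { memory: 'memory_extract', topic: 'topic_title' };
const UNKNOWN_SCENARIO = 'unknown'; // assumed sentinel value
const UNKNOWN_PROMPT_VERSION = 'v0';

interface ResolveInput {
  promptVersion?: string;
  scenario?: string;
  trigger?: string;
}

// Same precedence as the diff: explicit scenario, then registry, then sentinel.
const resolve = (input: ResolveInput): { promptVersion: string; scenario: string } => ({
  promptVersion: input.promptVersion ?? UNKNOWN_PROMPT_VERSION,
  scenario:
    input.scenario ??
    (input.trigger ? REGISTRY[input.trigger] : undefined) ??
    UNKNOWN_SCENARIO,
});

// Hypothetical constant kept next to the prompt and bumped by hand.
const TOPIC_TITLE_PROMPT_VERSION = 'v1.2';

const fromTrigger = resolve({ promptVersion: TOPIC_TITLE_PROMPT_VERSION, trigger: 'topic' });
const overridden = resolve({ scenario: 'signal_skill_intent', trigger: 'memory' });
const fallback = resolve({ trigger: 'does_not_exist' });

console.log(fromTrigger, overridden, fallback);
```

Because the chain uses `??` rather than `||`, an empty-string `scenario` is kept as-is instead of falling through to the registry, so callers should pass `undefined` when they have no override.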
@@ -0,0 +1,117 @@
import fs from 'node:fs/promises';
import os from 'node:os';
import path from 'node:path';

import { afterEach, beforeEach, describe, expect, it } from 'vitest';

import type { TracingPayload } from '../types';
import { DEFAULT_DIR, FileTracingStore } from './file-store';

let tmpRoot: string;

const makePayload = (overrides: Partial<TracingPayload> = {}): TracingPayload => ({
  created_at: new Date('2026-05-22T11:22:33.444Z').getTime(),
  prompt_hash: 'abcdef',
  prompt_version: 'v1.0',
  scenario: 'home_brief',
  tracing_id: '00000000-0000-0000-0000-000000000001',
  version: '1.0',
  ...overrides,
});

beforeEach(async () => {
  tmpRoot = await fs.mkdtemp(path.join(os.tmpdir(), 'llm-gen-trace-test-'));
});

afterEach(async () => {
  await fs.rm(tmpRoot, { force: true, recursive: true });
});

describe('FileTracingStore', () => {
  it('writes payloads under {scenario}/{promptVersion}-{promptHash}/ and returns a null key (local-only)', async () => {
    const store = new FileTracingStore(tmpRoot);
    const payload = makePayload();

    const { key } = await store.save(payload);
    // Local store is non-shareable — DB should leave `storage_key` empty.
    expect(key).toBeNull();

    const dir = path.join(tmpRoot, DEFAULT_DIR, 'home_brief', 'v1.0-abcdef');
    const entries = await fs.readdir(dir);
    const jsonFiles = entries.filter((f) => f.endsWith('.json'));
    expect(jsonFiles).toHaveLength(1);

    const raw = await fs.readFile(path.join(dir, jsonFiles[0]), 'utf8');
    expect(JSON.parse(raw)).toMatchObject({
      prompt_hash: 'abcdef',
      scenario: 'home_brief',
      tracing_id: payload.tracing_id,
    });
  });

  it('updates the latest.json symlink to point at the freshest record', async () => {
    const store = new FileTracingStore(tmpRoot);
    await store.save(makePayload({ tracing_id: 'aaaa-1' }));
    await store.save(
      makePayload({
        created_at: new Date('2026-05-22T11:30:00.000Z').getTime(),
        scenario: 'topic_title',
        tracing_id: 'bbbb-2',
      }),
    );

    const latestPath = path.join(tmpRoot, DEFAULT_DIR, 'latest.json');
    const target = await fs.realpath(latestPath);
    const content = await fs.readFile(target, 'utf8');
    expect(JSON.parse(content)).toMatchObject({
      scenario: 'topic_title',
      tracing_id: 'bbbb-2',
    });
  });

  it('lists recent records as flat summaries newest-first', async () => {
    const store = new FileTracingStore(tmpRoot);
    await store.save(
      makePayload({
        created_at: new Date('2026-05-22T11:00:00.000Z').getTime(),
        scenario: 'home_brief',
        tracing_id: 'aaaa',
      }),
    );
    await store.save(
      makePayload({
        created_at: new Date('2026-05-22T12:00:00.000Z').getTime(),
        scenario: 'memory_extract',
        tracing_id: 'bbbb',
      }),
    );

    const summaries = await store.list();
    expect(summaries.map((s) => s.tracing_id)).toEqual(['bbbb', 'aaaa']);
  });

  it('round-trips a payload via get() using the on-disk file path', async () => {
    const store = new FileTracingStore(tmpRoot);
    const payload = makePayload({
      input: { messages: [{ content: 'hi', role: 'user' }] },
      output: { topic: 'greeting' },
    });
    await store.save(payload);

    // save() returns a null key, so locate the file on disk and read via its path.
    const dir = path.join(tmpRoot, DEFAULT_DIR, 'home_brief', 'v1.0-abcdef');
    const jsonFile = (await fs.readdir(dir)).find((f) => f.endsWith('.json'));
    if (!jsonFile) throw new Error('expected a saved tracing file to exist');
    const loaded = await store.get(path.join(dir, jsonFile));
    expect(loaded).toMatchObject({
      input: { messages: [{ content: 'hi', role: 'user' }] },
      output: { topic: 'greeting' },
      tracing_id: payload.tracing_id,
    });
  });

  it('returns null when get() targets a missing key', async () => {
    const store = new FileTracingStore(tmpRoot);
    expect(await store.get('not/a/real/key.json')).toBeNull();
  });
});
@@ -0,0 +1,167 @@
import type { Dirent } from 'node:fs';
import fs from 'node:fs/promises';
import path from 'node:path';

import type { TracingPayload, TracingSummary } from '../types';
import type { ITracingStore, SaveResult } from './types';

export const DEFAULT_DIR = '.llm-generation-tracing';

const safeSegment = (value: string): string => value.replaceAll(/[^\w.-]+/g, '_') || 'unknown';

/**
 * Local / dev / desktop store. Writes plain JSON (no compression) so contents
 * can be inspected with `cat`. Layout mirrors the S3 key pattern:
 *
 *   .llm-generation-tracing/{scenario}/{promptVersion}-{promptHash}/{file}.json
 *
 * Keeps a top-level `latest.json` symlink pointing at the most recent record.
 */
export class FileTracingStore implements ITracingStore {
  private readonly root: string;

  constructor(rootDir?: string) {
    this.root = path.resolve(rootDir ?? process.cwd(), DEFAULT_DIR);
  }

  async save(record: TracingPayload): Promise<SaveResult> {
    const dir = this.bucketDir(record);
    await fs.mkdir(dir, { recursive: true });

    const ts = new Date(record.created_at).toISOString().replaceAll(':', '-');
    const shortId = safeSegment(record.tracing_id.slice(0, 12));
    const filename = `${ts}_${shortId}.json`;
    const filePath = path.join(dir, filename);

    await fs.writeFile(filePath, JSON.stringify(record, null, 2), 'utf8');
    await this.updateLatestSymlink(filePath);

    // Local-only path — return null so the DB row's `storage_key` stays empty.
    // The CLI rediscovers files by walking `.llm-generation-tracing/`.
    return { key: null };
  }

  async get(key: string): Promise<TracingPayload | null> {
    const target = path.isAbsolute(key) ? key : path.join(this.root, key);
    try {
      const content = await fs.readFile(target, 'utf8');
      return JSON.parse(content) as TracingPayload;
    } catch {
      return null;
    }
  }

  async list(options?: { limit?: number; scenario?: string }): Promise<TracingSummary[]> {
    const limit = options?.limit ?? 20;
    const files = await this.collectFiles();
    files.sort((a, b) => (a.filename < b.filename ? 1 : -1));

    const summaries: TracingSummary[] = [];
    for (const file of files) {
      if (summaries.length >= limit) break;
      try {
        const content = await fs.readFile(file.fullPath, 'utf8');
        const record = JSON.parse(content) as TracingPayload;
        if (options?.scenario && record.scenario !== options.scenario) continue;
        summaries.push({
          created_at: record.created_at,
          model: record.model_metadata?.model,
          prompt_version: record.prompt_version,
          scenario: record.scenario,
          success: !record.error,
          tracing_id: record.tracing_id,
          validation_failed: record.validation_failed,
        });
      } catch {
        // skip corrupted files
      }
    }
    return summaries;
  }

  /**
   * CLI helper: find a payload by tracing_id prefix. Returns the most-recent
   * match when several rows share the same prefix (e.g. truncated short id).
   */
  async findByTracingId(prefix: string): Promise<TracingPayload | null> {
    const files = await this.collectFiles();
    files.sort((a, b) => (a.filename < b.filename ? 1 : -1));
    for (const file of files) {
      try {
        const content = await fs.readFile(file.fullPath, 'utf8');
        const record = JSON.parse(content) as TracingPayload;
        if (record.tracing_id.startsWith(prefix)) return record;
      } catch {
        // skip corrupted files
      }
    }
    return null;
  }

  /** CLI helper: resolve the `latest.json` symlink (or fall back to the newest file). */
  async getLatest(): Promise<TracingPayload | null> {
    const latestPath = path.join(this.root, 'latest.json');
    try {
      const real = await fs.realpath(latestPath);
      const content = await fs.readFile(real, 'utf8');
      return JSON.parse(content) as TracingPayload;
    } catch {
      // symlink missing or unreadable — fall back to newest by filename order
    }
    const files = await this.collectFiles();
    if (files.length === 0) return null;
    files.sort((a, b) => (a.filename < b.filename ? 1 : -1));
    try {
      const content = await fs.readFile(files[0].fullPath, 'utf8');
      return JSON.parse(content) as TracingPayload;
    } catch {
      return null;
    }
  }

  private bucketDir(record: TracingPayload): string {
    // Compose the relative segment as a single string so Turbopack / Webpack
    // static analyzers don't try to enumerate path.join's multi-arg pattern
    // (which fans out into a glob match against the project).
    const sub = `${safeSegment(record.scenario)}/${safeSegment(record.prompt_version)}-${safeSegment(record.prompt_hash)}`;
    return path.join(this.root, sub);
  }

  private async updateLatestSymlink(filePath: string): Promise<void> {
    const latestPath = path.join(this.root, 'latest.json');
    try {
      await fs.unlink(latestPath);
    } catch {
      // ignore — no previous symlink
    }
    try {
      await fs.symlink(path.relative(this.root, filePath), latestPath);
    } catch {
      // file systems without symlink support (e.g. Windows w/o dev mode) — silently skip
    }
  }

  private async collectFiles(): Promise<{ filename: string; fullPath: string }[]> {
    const results: { filename: string; fullPath: string }[] = [];

    const walk = async (dir: string): Promise<void> => {
      let entries: Dirent[];
      try {
        entries = await fs.readdir(dir, { withFileTypes: true });
      } catch {
        return;
      }
      for (const entry of entries) {
        const full = path.join(dir, entry.name);
        if (entry.isDirectory()) {
          await walk(full);
        } else if (entry.name.endsWith('.json') && entry.name !== 'latest.json') {
          results.push({ filename: entry.name, fullPath: full });
        }
      }
    };

    await walk(this.root);
    return results;
  }
}
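`list`, `findByTracingId` and `getLatest` all rely on sorting by filename to get newest-first order. That works because the filename starts with an ISO 8601 UTC timestamp, whose fixed-width fields sort lexicographically in chronological order. A minimal sketch of the naming and ordering follows; `toFilename` and `newestFirst` are hypothetical names that restate the logic inside `save()` and the sort comparator, using `replace` with a global regex in place of `replaceAll`.

```typescript
// Hypothetical restatement of the filename built in FileTracingStore.save().
const toFilename = (createdAt: number, tracingId: string): string => {
  // ':' is not allowed in filenames on some platforms, so it becomes '-'.
  const ts = new Date(createdAt).toISOString().replace(/:/g, '-');
  const shortId = tracingId.slice(0, 12).replace(/[^\w.-]+/g, '_') || 'unknown';
  return `${ts}_${shortId}.json`;
};

// Newest-first comparator that also returns 0 for equal names.
const newestFirst = (a: string, b: string): number => (a < b ? 1 : a > b ? -1 : 0);

const older = toFilename(Date.parse('2026-05-22T11:00:00.000Z'), 'aaaa');
const newer = toFilename(Date.parse('2026-05-22T12:00:00.000Z'), 'bbbb');
const sorted = [older, newer].sort(newestFirst);

console.log(sorted);
```

Two records created in the same millisecond are ordered by the short id that follows the timestamp, so their relative order is stable but not meaningful.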
@@ -0,0 +1,20 @@
import type { TracingPayload, TracingSummary } from '../types';

export interface SaveResult {
  /**
   * Canonical, globally addressable key for the saved payload (e.g. an S3
   * object key). `null` when the payload was persisted only to a local /
   * non-shareable location — the service should then leave `storage_key`
   * empty in the DB rather than record a path no other process can resolve.
   */
  key: string | null;
}

export interface ITracingStore {
  /** Optional retrieval — used by CLI / debug tooling only. */
  get?: (key: string) => Promise<TracingPayload | null>;
  /** Optional listing — used by CLI / debug tooling only. */
  list?: (options?: { limit?: number }) => Promise<TracingSummary[]>;
  /** Persist a tracing payload; returns the storage key for cross-reference. */
  save: (record: TracingPayload) => Promise<SaveResult>;
}
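Since `ITracingStore` only requires `save`, a throwaway implementation for unit tests can be very small. The sketch below is an assumption-laden illustration, not something from this commit: `MemoryTracingStore` is a hypothetical class, and the interfaces are trimmed local stand-ins for the real ones so the block compiles alone.

```typescript
// Trimmed stand-ins for the real types, kept local so the sketch is self-contained.
interface TracingPayload {
  created_at: number;
  scenario: string;
  tracing_id: string;
}
interface SaveResult {
  key: string | null;
}
interface ITracingStore {
  get?: (key: string) => Promise<TracingPayload | null>;
  save: (record: TracingPayload) => Promise<SaveResult>;
}

// Hypothetical in-memory store, keyed by tracing_id.
class MemoryTracingStore implements ITracingStore {
  private readonly records = new Map<string, TracingPayload>();

  get size(): number {
    return this.records.size;
  }

  async get(key: string): Promise<TracingPayload | null> {
    return this.records.get(key) ?? null;
  }

  async save(record: TracingPayload): Promise<SaveResult> {
    this.records.set(record.tracing_id, record);
    // Nothing outside this process can resolve the entry, so report a null
    // key, following the SaveResult contract for non-shareable locations.
    return { key: null };
  }
}

const store = new MemoryTracingStore();
void store.save({ created_at: 0, scenario: 'topic_title', tracing_id: 't1' });
console.log(store.size);
```

Returning `null` here matches what `FileTracingStore` does: the caller is expected to leave `storage_key` empty rather than persist an identifier that only makes sense locally.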
@@ -0,0 +1,100 @@
export type LlmGenerationFeedbackSignal = 'positive' | 'negative' | 'neutral';

export interface TracingErrorPayload {
  code?: string;
  message?: string;
  stack?: string;
}

export interface TracingModelMetadata {
  [key: string]: unknown;
  finish_reason?: string;
  model?: string;
  provider?: string;
}

/**
 * Blob payload written to the store. Mirrors the design's Blob schema —
 * the DB row stores indexable summary columns; this carries the full prompt /
 * input / output detail for offline analysis.
 *
 * Version field guards future schema evolution.
 */
export interface TracingPayload {
  created_at: number;
  error?: TracingErrorPayload;
  input?: unknown;
  model_metadata?: TracingModelMetadata;
  output?: unknown;
  prompt_hash: string;
  prompt_version: string;
  raw_output?: string;
  scenario: string;
  schema?: unknown;
  system_prompt?: string;
  /** Unique id of the tracing row in the DB. Used by the store to build the key. */
  tracing_id: string;
  validation_failed?: boolean;
  version: '1.0';
}

export interface TracingSummary {
  created_at: number;
  latency_ms?: number;
  model?: string;
  prompt_version: string;
  scenario: string;
  success: boolean;
  tracing_id: string;
  validation_failed?: boolean;
}

export interface ScenarioDefinition {
  /** Human-bumped prompt version (e.g. `v1.0`). */
  promptVersion: string;
  /** Symbolic scenario name, used for grouping and partitioning storage. */
  scenario: string;
}

/**
 * Caller-facing tracing config for a single `generateObject` call. Passed
 * through `GenerateObjectOptions.tracing` and consumed by the tracing hook
 * to populate the `llm_generation_tracing` DB row + off-DB blob.
 *
 * Every field is optional — the hook fills sensible defaults (auto-extracted
 * `inputHint`, registry-resolved `scenario`, `messages[0]` as system prompt).
 * Supply the fields explicitly to keep the DB row scannable.
 */
export interface TracingOptions {
  /** Owning agent ID; persisted to `agent_id`. */
  agentId?: string;
  /**
   * Short snippet stored on `input_hint`. Pass the user's actual typed text
   * when the prompt wraps it in a template — otherwise the auto-extracted
   * hint ends up being the wrapper's first user message (e.g.
   * `Before cursor: "…" After cursor: "…"`) instead of what the user wrote.
   */
  inputHint?: string;
  /**
   * Free-form context written to the row's `metadata` jsonb column. Use this
   * for ad-hoc fields that don't deserve a typed slot (e.g. correlation IDs).
   */
  metadata?: Record<string, unknown>;
  /** Parent tracing row for chained generations. */
  parentTracingId?: string;
  /** Semantic prompt version (e.g. `v1.0`). */
  promptVersion?: string;
  /** Scenario name; falls back to registry lookup by `trigger`. */
  scenario?: string;
  /** Structured-output schema identifier. */
  schemaName?: string;
  /**
   * Override for the prompt-hash system text. Defaults to `messages[0]`
   * when it's a system message.
   */
  systemPrompt?: string;
  /** Topic / conversation ID. */
  topicId?: string;
  /** RequestTrigger string. */
  trigger?: string;
}
@@ -0,0 +1,232 @@
import type { TracingPayload, TracingSummary } from '../types';

// ANSI color helpers — keep parity with @lobechat/agent-tracing's viewer.
const dim = (s: string) => `\x1B[2m${s}\x1B[22m`;
const bold = (s: string) => `\x1B[1m${s}\x1B[22m`;
const green = (s: string) => `\x1B[32m${s}\x1B[39m`;
const red = (s: string) => `\x1B[31m${s}\x1B[39m`;
const yellow = (s: string) => `\x1B[33m${s}\x1B[39m`;
const cyan = (s: string) => `\x1B[36m${s}\x1B[39m`;
const magenta = (s: string) => `\x1B[35m${s}\x1B[39m`;

const PREVIEW_CHARS = 120;
const FULL_PREVIEW_CHARS = 4000;

const padEnd = (text: string, width: number): string =>
  text.length >= width ? text : text + ' '.repeat(width - text.length);

const formatTime = (timestamp: number): string => {
  const d = new Date(timestamp);
  const pad = (n: number) => String(n).padStart(2, '0');
  return `${d.getFullYear()}-${pad(d.getMonth() + 1)}-${pad(d.getDate())} ${pad(d.getHours())}:${pad(d.getMinutes())}:${pad(d.getSeconds())}`;
};

const previewLine = (text: string, maxLen: number): string => {
  const single = text.replaceAll(/\s+/g, ' ').trim();
  if (single.length <= maxLen) return single;
  return `${single.slice(0, maxLen - 1)}…`;
};

const stringify = (value: unknown): string =>
  typeof value === 'string' ? value : JSON.stringify(value, null, 2);

const statusOf = (record: TracingPayload): string => {
  if (record.error) return red('error');
  if (record.validation_failed) return yellow('validation-fail');
  return green('ok');
};

export const renderSummaryTable = (summaries: TracingSummary[]): string => {
  if (summaries.length === 0) return dim('No tracing records found.');

  const rows = summaries.map((s) => ({
    created: formatTime(s.created_at),
    id: s.tracing_id.slice(0, 12),
    model: s.model ?? '-',
    scenario: s.scenario,
    statusRaw: s.success ? (s.validation_failed ? 'validation-fail' : 'ok') : 'error',
    version: s.prompt_version,
  }));

  // Column widths include a 2-space right gutter so the next column never
  // butts up against this one.
  const widths = {
    created: 19,
    id: 14,
    model: Math.max(8, 'MODEL'.length, ...rows.map((r) => r.model.length)) + 2,
    scenario: Math.max(10, 'SCENARIO'.length, ...rows.map((r) => r.scenario.length)) + 2,
    status: Math.max(8, 'STATUS'.length, 'validation-fail'.length) + 2,
    version: Math.max(7, 'VERSION'.length, ...rows.map((r) => r.version.length)) + 2,
  };

  const colorStatus = (status: string): string =>
    status === 'ok' ? green(status) : status === 'error' ? red(status) : yellow(status);

  // Pad first (using raw text length), then colorize — keeps column alignment
  // independent of ANSI escape codes.
  const header =
    bold(padEnd('ID', widths.id)) +
    bold(padEnd('SCENARIO', widths.scenario)) +
    bold(padEnd('VERSION', widths.version)) +
    bold(padEnd('MODEL', widths.model)) +
    bold(padEnd('STATUS', widths.status)) +
    bold('CREATED');

  const padCell = (text: string, width: number): string =>
    ' '.repeat(Math.max(0, width - text.length));

  const body = rows.map(
    (r) =>
      cyan(r.id) +
      padCell(r.id, widths.id) +
      r.scenario +
      padCell(r.scenario, widths.scenario) +
      r.version +
      padCell(r.version, widths.version) +
      magenta(r.model) +
      padCell(r.model, widths.model) +
      colorStatus(r.statusRaw) +
      padCell(r.statusRaw, widths.status) +
      dim(r.created),
  );

  const ruleWidth =
    widths.id + widths.scenario + widths.version + widths.model + widths.status + widths.created;
  return [header, dim('─'.repeat(ruleWidth)), ...body].join('\n');
};

const roleColor = (role: string): ((s: string) => string) => {
  if (role === 'user') return green;
  if (role === 'assistant') return cyan;
  if (role === 'system') return magenta;
  return yellow;
};

const renderInputMessages = (input: unknown, full: boolean): string[] => {
  if (!Array.isArray(input))
    return [` ${dim(previewLine(stringify(input), full ? FULL_PREVIEW_CHARS : PREVIEW_CHARS))}`];

  const lines: string[] = [];
  for (let i = 0; i < input.length; i++) {
    const msg = (input[i] ?? {}) as { content?: unknown; role?: string };
    const role = msg.role ?? 'unknown';
    const rawContent =
      typeof msg.content === 'string' ? msg.content : JSON.stringify(msg.content ?? '');
    const charCount = rawContent.length;
    const charLabel = charCount > 0 ? dim(` ${charCount} chars`) : '';
    const connector = i === input.length - 1 ? '└─' : '├─';
    lines.push(` ${dim(connector)} ${dim(`[${i}]`)} ${roleColor(role)(role)}${charLabel}`);
    if (rawContent) {
      const preview = full ? rawContent : previewLine(rawContent, PREVIEW_CHARS);
      lines.push(` ${dim(preview)}`);
    }
  }
  return lines;
};

const renderOutput = (output: unknown, full: boolean): string => {
  // Inline tiny single-key objects: `{ completion: "怎么样" }` → `completion: "怎么样"`
  if (
    output &&
    typeof output === 'object' &&
    !Array.isArray(output) &&
    Object.keys(output).length === 1
  ) {
    const [key, value] = Object.entries(output)[0];
    const rendered = typeof value === 'string' ? `"${value}"` : JSON.stringify(value);
    if (rendered.length <= PREVIEW_CHARS) return `${cyan(key)}: ${rendered}`;
  }

  const text = stringify(output);
  if (full || text.length <= PREVIEW_CHARS * 2) return text;
  return previewLine(text, PREVIEW_CHARS);
};

export const renderPayloadDetail = (
  record: TracingPayload,
  options: { full?: boolean },
): string => {
  const full = !!options.full;
  const lines: string[] = [];

  // Header — single compact line.
  const modelLabel =
    record.model_metadata?.provider || record.model_metadata?.model
      ? ` ${magenta(`${record.model_metadata?.provider ?? '-'} / ${record.model_metadata?.model ?? '-'}`)}`
      : '';
  lines.push(
    bold('LLM Generation') +
      ` ${cyan(record.tracing_id.slice(0, 12))}` +
      ` scenario:${record.scenario}` +
      ` ${dim(record.prompt_version)}` +
      modelLabel +
      ` ${statusOf(record)}` +
      ` ${dim(formatTime(record.created_at))}`,
  );

  if (record.error) {
    lines.push(`${red('Error:')} ${record.error.code ?? '-'} — ${record.error.message ?? '-'}`);
  }

  // Build sections as a tree. Each section is rendered as `├─ label meta` then optional indented body.
  type Section = { body?: string[]; label: string; meta?: string };
  const sections: Section[] = [];

  if (record.system_prompt) {
    sections.push({
      body: full ? [` ${dim(record.system_prompt)}`] : undefined,
      label: 'system_prompt',
      meta: full
        ? dim(`${record.system_prompt.length} chars`)
        : dim(`${record.system_prompt.length} chars (use --full to expand)`),
    });
  }

  if (record.input !== undefined) {
    const isArr = Array.isArray(record.input);
    const count = isArr ? (record.input as unknown[]).length : 1;
    sections.push({
      body: renderInputMessages(record.input, full),
      label: 'input',
      meta: isArr ? dim(`${count} message${count === 1 ? '' : 's'}`) : undefined,
    });
  }

  if (record.output !== undefined) {
    sections.push({
      body: [` ${renderOutput(record.output, full)}`],
      label: 'output',
    });
  }

  if (record.raw_output) {
    sections.push({
      body: [` ${dim(full ? record.raw_output : previewLine(record.raw_output, PREVIEW_CHARS))}`],
      label: 'raw_output',
      meta: yellow('validation_failed'),
    });
  }

  if (record.schema !== undefined) {
    const schemaText = stringify(record.schema);
    sections.push({
      body: full ? [` ${dim(schemaText)}`] : undefined,
      label: 'schema',
      meta: dim(
        full ? `${schemaText.length} chars` : `${schemaText.length} chars (use --full to expand)`,
      ),
    });
  }

  for (let i = 0; i < sections.length; i++) {
    const s = sections[i];
    const isLast = i === sections.length - 1;
    const connector = isLast ? '└─' : '├─';
    lines.push(`${dim(connector)} ${bold(s.label)}${s.meta ? ` ${s.meta}` : ''}`);
    if (s.body) {
      for (const line of s.body) lines.push(line);
    }
  }

  return lines.join('\n');
};
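The "pad first, then colorize" comment in `renderSummaryTable` is worth a concrete illustration: ANSI escape sequences add characters to the string without adding visible width, so padding a colored string undercounts the padding. The sketch below uses hypothetical helpers (`coloredCell`, `visibleLength`) that are not part of the viewer and compares the two orderings.

```typescript
const green = (s: string): string => `\x1B[32m${s}\x1B[39m`;

// Compute padding from the raw text, and wrap only the visible text in color.
const coloredCell = (text: string, width: number, color: (s: string) => string): string =>
  color(text) + ' '.repeat(Math.max(0, width - text.length));

// Strip SGR sequences to approximate what a terminal would display.
const visibleLength = (s: string): number => s.replace(/\x1B\[[0-9;]*m/g, '').length;

const aligned = coloredCell('ok', 10, green);
// Colorizing first makes the string 12 characters long, so padEnd(10) adds nothing.
const misaligned = green('ok').padEnd(10);

console.log(visibleLength(aligned), visibleLength(misaligned));
```

This approach still measures width with `String.prototype.length`, which counts UTF-16 code units. Cells containing wide characters such as CJK text can therefore remain misaligned in a terminal.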
@@ -0,0 +1,7 @@
{
  "compilerOptions": {
    "outDir": "./dist"
  },
  "extends": "../../tsconfig.json",
  "include": ["src/**/*.ts"]
}
@@ -0,0 +1,11 @@
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8',
      reporter: ['text', 'json', 'lcov', 'text-summary'],
    },
    environment: 'node',
  },
});
@@ -157,7 +157,15 @@ export abstract class BaseMemoryExtractor<

 span.addEvent('gen_ai.request.send');
 const result = await this.runtime.generateObject(payload, {
-  metadata: { trigger: RequestTrigger.Memory },
+  metadata: {
+    // Optional backlink to the job-level memory trace blob; the
+    // llm_generation_tracing hook persists it under metadata so a
+    // per-call row can be traced back to the job-level dump.
+    ...(options?.parentMemoryTraceKey
+      ? { parent_memory_trace_key: options.parentMemoryTraceKey }
+      : {}),
+    trigger: RequestTrigger.Memory,
+  },
 });
 span.addEvent('gen_ai.response.receive');
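The conditional spread in the hunk above adds `parent_memory_trace_key` only when a value is present, instead of writing the key with an `undefined` value. A minimal sketch of the difference follows; `buildMetadata` and `buildMetadataNaive` are hypothetical helpers, and the string `'memory'` is an assumed stand-in for `RequestTrigger.Memory`.

```typescript
interface ExtractOptions {
  parentMemoryTraceKey?: string;
}

// Conditional spread: the key exists only when the option is truthy.
const buildMetadata = (options?: ExtractOptions): Record<string, unknown> => ({
  ...(options?.parentMemoryTraceKey
    ? { parent_memory_trace_key: options.parentMemoryTraceKey }
    : {}),
  trigger: 'memory', // assumed stand-in for RequestTrigger.Memory
});

// Naive alternative: the key is always present, possibly holding undefined.
const buildMetadataNaive = (options?: ExtractOptions): Record<string, unknown> => ({
  parent_memory_trace_key: options?.parentMemoryTraceKey,
  trigger: 'memory',
});

const withoutKey = buildMetadata();
const withKey = buildMetadata({ parentMemoryTraceKey: 'traces/job-1.json' });
const naive = buildMetadataNaive();

console.log(Object.keys(withoutKey), Object.keys(withKey), Object.keys(naive));
```

`JSON.stringify` drops `undefined` values, so both variants serialise the same way. The difference shows up when code iterates keys or uses `in` checks, and the truthiness test also omits an empty-string key.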
@@ -43,6 +43,13 @@ export interface ExtractorOptions extends ExtractorTemplateProps {
     ) => Promise<void> | void;
   };
   messageIds?: string[];
+  /**
+   * S3 key of the parent memory job trace. When provided, propagated into
+   * the per-call `llm_generation_tracing` row as `metadata.parent_memory_trace_key`,
+   * giving offline analysis a backlink from a single generateObject call to the
+   * job-level memory trace that spawned it.
+   */
+  parentMemoryTraceKey?: string;
   sourceId?: string;
   userId?: string;
 }
@@ -716,6 +716,46 @@ describe('ModelRuntime', () => {

       await expect(runtime.generateObject(genObjPayload)).resolves.toEqual({ result: 'ok' });
     });
+
+    it('onGenerateObjectComplete fires on success with output, latency and usage', async () => {
+      const onGenerateObjectComplete = vi.fn();
+      const { runtime, mockRuntimeAI } = createMockRuntime({ onGenerateObjectComplete });
+      const usage = { totalInputTokens: 50, totalOutputTokens: 20, cost: 0.001 };
+      mockRuntimeAI.generateObject.mockImplementation(async (_p: any, opts: any) => {
+        await opts?.onUsage?.(usage);
+        return { result: 'ok' };
+      });
+
+      await runtime.generateObject(genObjPayload);
+
+      expect(onGenerateObjectComplete).toHaveBeenCalledTimes(1);
+      const [data, context] = onGenerateObjectComplete.mock.calls[0];
+      expect(data).toMatchObject({ output: { result: 'ok' }, success: true, usage });
+      expect(data.latencyMs).toBeGreaterThanOrEqual(0);
+      expect(context.payload).toBe(genObjPayload);
+    });
+
+    it('onGenerateObjectComplete fires on failure with structured error and is awaited before throw', async () => {
+      const onGenerateObjectComplete = vi.fn();
+      const { runtime, mockRuntimeAI } = createMockRuntime({ onGenerateObjectComplete });
+      const cause = new Error('boom');
+      mockRuntimeAI.generateObject.mockRejectedValue(cause);
+
+      await expect(runtime.generateObject(genObjPayload)).rejects.toBe(cause);
+      expect(onGenerateObjectComplete).toHaveBeenCalledTimes(1);
+      const [data] = onGenerateObjectComplete.mock.calls[0];
+      expect(data.success).toBe(false);
+      expect(data.error?.message).toBe('boom');
+    });
+
+    it('hook errors thrown from onGenerateObjectComplete are swallowed and do not surface', async () => {
+      const onGenerateObjectComplete = vi.fn().mockRejectedValue(new Error('hook broke'));
+      const { runtime, mockRuntimeAI } = createMockRuntime({ onGenerateObjectComplete });
+      mockRuntimeAI.generateObject.mockResolvedValue({ result: 'ok' });
+
+      await expect(runtime.generateObject(genObjPayload)).resolves.toEqual({ result: 'ok' });
+      expect(onGenerateObjectComplete).toHaveBeenCalledTimes(1);
+    });
   });

   describe('embeddings hooks', () => {
|
||||
|
||||
@@ -84,6 +84,25 @@ export interface ModelRuntimeHooks {
     context: { options?: EmbeddingsOptions; payload: EmbeddingsPayload },
   ) => void | Promise<void>;

+  /**
+   * Always fires after `generateObject` returns or throws — success or failure.
+   * Use this for full-lifecycle observability (per-call tracing, prompt analytics).
+   * Unlike `onGenerateObjectFinal`, this fires regardless of whether the runtime
+   * surfaces a `usage` callback, so the gap of "succeeded but no usage" is covered.
+   *
+   * Hook failures are swallowed and logged — they must not interfere with the response.
+   */
+  onGenerateObjectComplete?: (
+    data: {
+      error?: { code?: string; message?: string; stack?: string };
+      latencyMs: number;
+      output?: unknown;
+      success: boolean;
+      usage?: ModelUsage;
+    },
+    context: { options?: GenerateObjectOptions; payload: GenerateObjectPayload },
+  ) => void | Promise<void>;
+
   onGenerateObjectError?: (
     error: ChatCompletionErrorPayload,
     context: { options?: GenerateObjectOptions; payload: GenerateObjectPayload },
@@ -286,16 +305,46 @@ export class ModelRuntime {
   }

   async generateObject(payload: GenerateObjectPayload, options?: GenerateObjectOptions) {
+    const startedAt = Date.now();
+    let usageCapture: ModelUsage | undefined;
+
+    const fireComplete = async (data: {
+      error?: { code?: string; message?: string; stack?: string };
+      output?: unknown;
+      success: boolean;
+    }) => {
+      if (!this._hooks?.onGenerateObjectComplete) return;
+      try {
+        await this._hooks.onGenerateObjectComplete(
+          {
+            error: data.error,
+            latencyMs: Date.now() - startedAt,
+            output: data.output,
+            success: data.success,
+            usage: usageCapture,
+          },
+          { options, payload },
+        );
+      } catch (e) {
+        // Hook failures must not affect the caller — log and move on.
+        console.error('[ModelRuntime] onGenerateObjectComplete hook error:', e);
+      }
+    };
+
     try {
       await this._hooks?.beforeGenerateObject?.(payload, options);

-      const finalOptions = this._hooks?.onGenerateObjectFinal
+      const needsUsageCapture =
+        this._hooks?.onGenerateObjectFinal || this._hooks?.onGenerateObjectComplete;
+
+      const finalOptions = needsUsageCapture
         ? {
             ...options,
             onUsage: async (usage: ModelUsage) => {
+              usageCapture = usage;
               await options?.onUsage?.(usage);
               try {
-                await this._hooks!.onGenerateObjectFinal!({ usage }, { options, payload });
+                await this._hooks?.onGenerateObjectFinal?.({ usage }, { options, payload });
               } catch (e) {
                 // Hook failures (billing, tracing) must not interfere with response completion
                 console.error('[ModelRuntime] onGenerateObjectFinal hook error:', e);
@@ -304,7 +353,9 @@ export class ModelRuntime {
           }
         : options;

-      return await this._runtime.generateObject!(payload, finalOptions);
+      const output = await this._runtime.generateObject!(payload, finalOptions);
+      await fireComplete({ output, success: true });
+      return output;
     } catch (error) {
       if (this._hooks?.onGenerateObjectError) {
         await this._hooks.onGenerateObjectError(error as ChatCompletionErrorPayload, {
@@ -312,6 +363,11 @@ export class ModelRuntime {
           payload,
         });
       }
+      const err = error as Error & { code?: string };
+      await fireComplete({
+        error: { code: err?.code, message: err?.message, stack: err?.stack },
+        success: false,
+      });
       throw error;
     }
   }
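The `generateObject` hunks above share one shape: time the call, report the outcome to an observer on both the success and the failure path, swallow anything the observer throws, and rethrow the original error unchanged. The sketch below restates that shape as a standalone wrapper, under stated assumptions: `withCompletionHook`, `CompletionData` and `toStructuredError` are illustrative names that do not exist in `ModelRuntime`, and usage capture is left out for brevity.

```typescript
// Illustrative names only; none of these are part of ModelRuntime.
interface CompletionData<T> {
  error?: { code?: string; message?: string; stack?: string };
  latencyMs: number;
  output?: T;
  success: boolean;
}

type CompletionHook<T> = (data: CompletionData<T>) => void | Promise<void>;

// Normalise an unknown thrown value into the structured error shape.
const toStructuredError = (error: unknown): { code?: string; message?: string; stack?: string } => {
  const err = error as (Error & { code?: string }) | undefined;
  return { code: err?.code, message: err?.message, stack: err?.stack };
};

const withCompletionHook = async <T>(
  call: () => Promise<T>,
  hook?: CompletionHook<T>,
): Promise<T> => {
  const startedAt = Date.now();

  const fire = async (data: Omit<CompletionData<T>, 'latencyMs'>) => {
    if (!hook) return;
    try {
      await hook({ ...data, latencyMs: Date.now() - startedAt });
    } catch (e) {
      // Observability must never change the caller-visible outcome.
      console.error('completion hook error:', e);
    }
  };

  try {
    const output = await call();
    await fire({ output, success: true });
    return output;
  } catch (error) {
    await fire({ error: toStructuredError(error), success: false });
    throw error; // rethrow the original error untouched
  }
};
```

Because the hook is awaited before the result is returned or the error rethrown, a slow hook adds to caller latency; that is the trade-off behind deferring the actual persistence work elsewhere, as the commit message describes.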
@@ -0,0 +1,67 @@
+import { describe, expect, it, vi } from 'vitest';
+
+import { mergeModelRuntimeHooks } from './mergeHooks';
+import type { ModelRuntimeHooks } from './ModelRuntime';
+
+describe('mergeModelRuntimeHooks', () => {
+  it('returns undefined when both hooks are empty', () => {
+    expect(mergeModelRuntimeHooks(undefined, undefined)).toBeUndefined();
+  });
+
+  it('returns the only present hook untouched', () => {
+    const fn = vi.fn();
+    const merged = mergeModelRuntimeHooks({ beforeChat: fn }, undefined);
+    expect(merged?.beforeChat).toBe(fn);
+  });
+
+  it('chains hooks of the same name in a → b order', async () => {
+    const order: string[] = [];
+    const a: ModelRuntimeHooks = {
+      onGenerateObjectComplete: vi.fn(async () => {
+        order.push('a');
+      }),
+    };
+    const b: ModelRuntimeHooks = {
+      onGenerateObjectComplete: vi.fn(async () => {
+        order.push('b');
+      }),
+    };
+
+    const merged = mergeModelRuntimeHooks(a, b);
+    await merged?.onGenerateObjectComplete?.(
+      { latencyMs: 0, success: true },
+      {} as Parameters<NonNullable<ModelRuntimeHooks['onGenerateObjectComplete']>>[1],
+    );
+    expect(order).toEqual(['a', 'b']);
+    expect(a.onGenerateObjectComplete).toHaveBeenCalledTimes(1);
+    expect(b.onGenerateObjectComplete).toHaveBeenCalledTimes(1);
+  });
+
+  it('does not run b when a throws (a is load-bearing)', async () => {
+    const bSpy = vi.fn();
+    const merged = mergeModelRuntimeHooks(
+      {
+        onGenerateObjectComplete: async () => {
+          throw new Error('billing failed');
+        },
+      },
+      { onGenerateObjectComplete: bSpy },
+    );
+
+    await expect(
+      merged?.onGenerateObjectComplete?.(
+        { latencyMs: 0, success: true },
+        {} as Parameters<NonNullable<ModelRuntimeHooks['onGenerateObjectComplete']>>[1],
+      ),
+    ).rejects.toThrow('billing failed');
+    expect(bSpy).not.toHaveBeenCalled();
+  });
+
+  it('keeps hooks that exist in only one side without wrapping', () => {
+    const onlyInA = vi.fn();
+    const onlyInB = vi.fn();
+    const merged = mergeModelRuntimeHooks({ beforeChat: onlyInA }, { onChatFinal: onlyInB });
+    expect(merged?.beforeChat).toBe(onlyInA);
+    expect(merged?.onChatFinal).toBe(onlyInB);
+  });
+});
@@ -0,0 +1,37 @@
+import type { ModelRuntimeHooks } from './ModelRuntime';
+
+/**
+ * Merge two `ModelRuntimeHooks` instances, chaining handlers that share a key
+ * so both fire in `a → b` order. Designed for composing layered hooks at the
+ * `ModelRuntime` construction site (e.g. billing hooks + tracing hooks).
+ *
+ * - Returns `undefined` only when both inputs are empty.
+ * - Chained hooks run sequentially (`a` first, then `b`); the second hook only
+ *   runs if the first resolves. Place load-bearing hooks (the ones whose
+ *   failure should abort the call) in `a`.
+ */
+export const mergeModelRuntimeHooks = (
+  a?: ModelRuntimeHooks,
+  b?: ModelRuntimeHooks,
+): ModelRuntimeHooks | undefined => {
+  if (!a && !b) return undefined;
+  if (!a) return b;
+  if (!b) return a;
+
+  const merged: ModelRuntimeHooks = { ...a };
+
+  for (const key of Object.keys(b) as (keyof ModelRuntimeHooks)[]) {
+    const existing = merged[key];
+    const next = b[key];
+    if (!existing) {
+      (merged[key] as unknown) = next;
+      continue;
+    }
+    (merged[key] as unknown) = async (...args: unknown[]) => {
+      await (existing as (...args: unknown[]) => Promise<unknown>)(...args);
+      await (next as (...args: unknown[]) => Promise<unknown>)(...args);
+    };
+  }
+
+  return merged;
+};
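To make the merge rules above easier to experiment with outside the package, here is a reduced, self-contained restatement. It is a sketch under assumptions, not the shipped code: `Hooks` is a three-key stand-in for `ModelRuntimeHooks`, `mergeHooks` is a local copy of the idea, and unlike the diff it also skips keys whose value in `b` is explicitly `undefined`.

```typescript
// Reduced stand-in for ModelRuntimeHooks so the sketch is self-contained.
type Hook = (...args: unknown[]) => void | Promise<void>;
type HookName = 'beforeChat' | 'onChatFinal' | 'onGenerateObjectComplete';
type Hooks = Partial<Record<HookName, Hook>>;

const mergeHooks = (a?: Hooks, b?: Hooks): Hooks | undefined => {
  if (!a && !b) return undefined;
  if (!a) return b;
  if (!b) return a;

  const merged: Hooks = { ...a };
  for (const key of Object.keys(b) as HookName[]) {
    const existing = merged[key];
    const next = b[key];
    if (!next) continue; // deviation from the diff: ignore explicit undefined
    if (!existing) {
      merged[key] = next; // one-sided hook passes through unwrapped
      continue;
    }
    merged[key] = async (...args: unknown[]) => {
      await existing(...args); // a rejection here skips `next`
      await next(...args);
    };
  }
  return merged;
};

// Usage: billing goes first because its failure should abort the chain.
const calls: string[] = [];
const billing: Hooks = {
  onGenerateObjectComplete: () => {
    calls.push('billing');
  },
};
const tracing: Hooks = {
  onChatFinal: () => {
    calls.push('tracing:chat');
  },
  onGenerateObjectComplete: () => {
    calls.push('tracing');
  },
};
const merged = mergeHooks(billing, tracing);
```

The argument order is the design decision here: whichever side is passed as `a` can veto `b` by rejecting, so an observer that must never block billing belongs in `b`.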
@@ -1,6 +1,7 @@
 export * from './const/models';
 export * from './core/BaseAI';
 export { pruneReasoningPayload } from './core/contextBuilders/openai';
+export { mergeModelRuntimeHooks } from './core/mergeHooks';
 export type { ModelRuntimeHooks } from './core/ModelRuntime';
 export { ModelRuntime } from './core/ModelRuntime';
 export { createOpenAICompatibleRuntime } from './core/openaiCompatibleFactory';
@@ -36,12 +36,20 @@ export interface GenerateObjectOptions {
    */
   headers?: Record<string, any>;

-  /** Metadata passed to hooks (billing, tracing, etc.) */
+  /** Free-form context passed to hooks (e.g. billing, routing). */
   metadata?: Record<string, unknown>;

   onUsage?: (usage: ModelUsage) => void | Promise<void>;

   signal?: AbortSignal;
+  /**
+   * Structured tracing config consumed by tracing hooks (e.g.
+   * `llm_generation_tracing`). Loosely typed here so the runtime stays
+   * tracing-agnostic; callers should import `TracingOptions` from
+   * `@lobechat/llm-generation-tracing` for the strongly-typed shape.
+   */
+  tracing?: Record<string, unknown>;
+
   /**
    * userId for the GenerateObject
    */
@@ -0,0 +1,42 @@
+import { describe, expect, it } from 'vitest';
+
+import {
+  chainInputCompletion,
+  INPUT_COMPLETION_PROMPT_VERSION,
+  INPUT_COMPLETION_SCHEMA_NAME,
+} from './inputCompletion';
+
+describe('chainInputCompletion', () => {
+  it('returns a system + user message pair', () => {
+    const { messages } = chainInputCompletion('How can I ', '');
+    expect(messages).toHaveLength(2);
+    expect(messages[0].role).toBe('system');
+    expect(messages[1].role).toBe('user');
+    expect(messages[1].content).toContain('Before cursor: "How can I "');
+    expect(messages[1].content).toContain('After cursor: ""');
+  });
+
+  it('attaches a minimal `{ completion: string }` schema for generateObject', () => {
+    const { schema } = chainInputCompletion('hi', '');
+    expect(schema.name).toBe(INPUT_COMPLETION_SCHEMA_NAME);
+    expect(schema.strict).toBe(true);
+    expect(schema.schema.required).toEqual(['completion']);
+    expect(schema.schema.additionalProperties).toBe(false);
+    expect(schema.schema.properties.completion.type).toBe('string');
+  });
+
+  it('appends conversation context to the system prompt when provided', () => {
+    const { messages } = chainInputCompletion('write ', '', [
+      { content: 'previous response', role: 'assistant' },
+      { content: 'previous question', role: 'user' },
+    ]);
+    const sys = messages[0].content as string;
+    expect(sys).toContain('Current conversation context');
+    expect(sys).toContain('assistant: previous response');
+    expect(sys).toContain('user: previous question');
+  });
+
+  it('exports a version constant the call site can pin to metadata', () => {
+    expect(INPUT_COMPLETION_PROMPT_VERSION).toMatch(/^v\d+\.\d+$/);
+  });
+});
@@ -1,21 +1,56 @@
-import type { ChatStreamPayload, OpenAIChatMessage } from '@lobechat/types';
+import type { OpenAIChatMessage } from '@lobechat/types';

-export const chainInputCompletion = (
-  beforeCursor: string,
-  afterCursor: string,
-  context?: OpenAIChatMessage[],
-): Partial<ChatStreamPayload> => {
-  let contextBlock = '';
-  if (context?.length) {
-    contextBlock = `\n\nCurrent conversation context:
-${context.map((m) => `${m.role}: ${m.content}`).join('\n')}`;
-  }
+/**
+ * Bump when editing the autocomplete system prompt or schema below. Plumbed
+ * through `metadata.promptVersion` at the call site so per-call tracing
+ * groups runs by prompt iteration. The 6-char prompt hash on the row catches
+ * forgotten bumps.
+ */
+export const INPUT_COMPLETION_PROMPT_VERSION = 'v1.0';

-  return {
-    max_tokens: 100,
-    messages: [
-      {
-        content: `You are an autocomplete engine for a chat input box. The user is composing a message to send to an AI assistant. Predict and complete what the USER is typing. Output ONLY the missing text to insert at the cursor.
+/**
+ * Symbolic schema name — also recorded on the tracing row's `schemaName`
+ * column so prompt iterations and schema renames can be reasoned about
+ * together.
+ */
+export const INPUT_COMPLETION_SCHEMA_NAME = 'InputCompletion';
+
+/**
+ * Minimal `generateObject` schema: a single `completion` string. The JSON
+ * wrapping overhead is ~15-30 tokens, which is negligible against the model's
+ * ~100-token completion budget but unlocks per-call tracing via the existing
+ * `ModelRuntime.generateObject` hook.
+ */
+export interface InputCompletionSchema {
+  name: typeof INPUT_COMPLETION_SCHEMA_NAME;
+  schema: {
+    additionalProperties: false;
+    properties: {
+      completion: { description: string; type: 'string' };
+    };
+    required: ['completion'];
+    type: 'object';
+  };
+  strict: true;
+}
+
+const INPUT_COMPLETION_SCHEMA: InputCompletionSchema = {
+  name: INPUT_COMPLETION_SCHEMA_NAME,
+  schema: {
+    additionalProperties: false,
+    properties: {
+      completion: {
+        description: 'The missing text to insert at the cursor. Empty string for no suggestion.',
+        type: 'string',
+      },
+    },
+    required: ['completion'],
+    type: 'object',
+  },
+  strict: true,
+};
+
+const SYSTEM_PROMPT = `You are an autocomplete engine for a chat input box. The user is composing a message to send to an AI assistant. Predict and complete what the USER is typing. Return only the missing text to insert at the cursor in the JSON object's \`completion\` field.

 CRITICAL RULES:
 - You are completing the USER's message, NOT the AI assistant's response
@@ -23,6 +58,7 @@ CRITICAL RULES:
 - NEVER generate text that sounds like an AI assistant responding (e.g., "help you", "assist you", "I can help")
 - Keep it short and natural, under 15 words
 - Match the user's language
+- If no completion would be useful, return an empty string

 GOOD examples (user perspective):
 "How can I " → "optimize my React component's performance?"
@@ -35,13 +71,28 @@ GOOD examples (user perspective):
 BAD examples (assistant perspective — NEVER do this):
 "How can I " → "help you today?" ← WRONG: this is what an AI assistant says
 "Hi" → ", how can I help you?" ← WRONG: assistant greeting
-"Let me " → "explain that for you" ← WRONG: assistant offering to explain${contextBlock}`,
-        role: 'system',
-      },
-      {
-        content: `Before cursor: "${beforeCursor}"\nAfter cursor: "${afterCursor}"`,
-        role: 'user',
-      },
+"Let me " → "explain that for you" ← WRONG: assistant offering to explain`;
+
+export interface InputCompletionChainResult {
+  messages: OpenAIChatMessage[];
+  schema: InputCompletionSchema;
+}
+
+export const chainInputCompletion = (
+  beforeCursor: string,
+  afterCursor: string,
+  context?: OpenAIChatMessage[],
+): InputCompletionChainResult => {
+  let contextBlock = '';
+  if (context?.length) {
+    contextBlock = `\n\nCurrent conversation context:\n${context.map((m) => `${m.role}: ${m.content}`).join('\n')}`;
+  }
+
+  return {
+    messages: [
+      { content: `${SYSTEM_PROMPT}${contextBlock}`, role: 'system' },
+      { content: `Before cursor: "${beforeCursor}"\nAfter cursor: "${afterCursor}"`, role: 'user' },
     ],
+    schema: INPUT_COMPLETION_SCHEMA,
   };
 };
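The schema above is strict: exactly one required `completion` string and no additional properties. If a provider does not enforce strict schemas, a caller may want to check the shape itself before trusting the object. A minimal sketch of such a check, assuming a hypothetical `isInputCompletion` type guard that is not part of this change:

```typescript
// Hypothetical runtime guard mirroring the schema's constraints:
// type "object", required ["completion"], additionalProperties false.
const isInputCompletion = (value: unknown): value is { completion: string } => {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) return false;
  const keys = Object.keys(value);
  return (
    keys.length === 1 &&
    keys[0] === 'completion' &&
    typeof (value as { completion: unknown }).completion === 'string'
  );
};

const accepted = isInputCompletion({ completion: '' }); // empty string is a valid "no suggestion"
const rejectedExtraKey = isInputCompletion({ completion: 'x', extra: 1 });
const rejectedWrongType = isInputCompletion({ completion: 1 });
```

An empty string passes the guard on purpose, since the prompt asks the model to return one when no completion would be useful.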
@@ -194,6 +194,11 @@ export const StructureSchema = z.object({
 });

 export const StructureOutputSchema = z.object({
+  /**
+   * Free-form context forwarded to non-tracing hooks (e.g. billing). Use
+   * `tracing` for `llm_generation_tracing` config.
+   */
+  metadata: z.record(z.string(), z.unknown()).optional(),
   messages: z.array(z.any()),
   model: z.string(),
   provider: z.string(),
@@ -201,10 +206,16 @@ export const StructureOutputSchema = z.object({
   tools: z
     .array(z.object({ function: LobeUniformToolSchema, type: z.literal('function') }))
     .optional(),
+  /**
+   * Structured tracing config (scenario / promptVersion / schemaName /
+   * agentId / topicId / inputHint / ...). See `TracingOptions` from
+   * `@lobechat/llm-generation-tracing` for the typed shape.
+   */
+  tracing: z.record(z.string(), z.unknown()).optional(),
 });

 interface IStructureSchema {
-  description: string;
+  description?: string;
   name: string;
   schema: {
     additionalProperties?: boolean;
@@ -216,8 +227,12 @@ interface IStructureSchema {
 }

 export interface StructureOutputParams {
-  keyVaultsPayload: string;
   messages: OpenAIChatMessage[];
+  /**
+   * Free-form context forwarded to non-tracing hooks (e.g. billing). Use
+   * `tracing` for `llm_generation_tracing` config.
+   */
+  metadata?: Record<string, unknown>;
   model: string;
   provider: string;
   schema?: IStructureSchema;
@@ -226,4 +241,10 @@ export interface StructureOutputParams {
     function: LobeUniformTool;
     type: 'function';
   }[];
+  /**
+   * Structured tracing config (scenario / promptVersion / schemaName /
+   * agentId / topicId / inputHint / ...). See `TracingOptions` from
+   * `@lobechat/llm-generation-tracing` for the typed shape.
+   */
+  tracing?: Record<string, unknown>;
 }
@@ -1,8 +1,13 @@
-import { isDesktop } from '@lobechat/const';
+import { isDesktop, TRACING_SCENARIOS } from '@lobechat/const';
 import { HotkeyEnum, KeyEnum } from '@lobechat/const/hotkeys';
 import { HETEROGENEOUS_TYPE_LABELS } from '@lobechat/heterogeneous-agents';
-import { chainInputCompletion, escapeXmlAttr } from '@lobechat/prompts';
-import { isCommandPressed, merge } from '@lobechat/utils';
+import {
+  chainInputCompletion,
+  escapeXmlAttr,
+  INPUT_COMPLETION_PROMPT_VERSION,
+  INPUT_COMPLETION_SCHEMA_NAME,
+} from '@lobechat/prompts';
+import { isCommandPressed } from '@lobechat/utils';
 import type { IEditor } from '@lobehub/editor';
 import { INSERT_MENTION_COMMAND, ReactAutoCompletePlugin, ReactMathPlugin } from '@lobehub/editor';
 import { Editor, FloatMenu, useEditorState } from '@lobehub/editor/react';
@@ -16,9 +21,10 @@ import { useHotkeysContext } from 'react-hotkeys-hook';
 import { usePasteFile, useUploadFiles } from '@/components/DragUploadZone';
 import { useEnterToSend } from '@/hooks/useEnterToSend';
 import { useIMECompositionEvent } from '@/hooks/useIMECompositionEvent';
-import { chatService } from '@/services/chat';
+import { aiChatService } from '@/services/aiChat';
 import { useAgentStore } from '@/store/agent';
 import { agentByIdSelectors } from '@/store/agent/selectors';
+import { useChatStore } from '@/store/chat';
 import { useUserStore } from '@/store/user';
 import {
   labPreferSelectors,
@@ -213,36 +219,47 @@ const InputEditor = memo<{
       // mid-text causes nested editor updates that freeze the input
       if (afterText.trim()) return null;

-      const { enabled: _, ...config } = systemAgentSelectors.inputCompletion(
-        useUserStore.getState(),
-      );
+      const config = systemAgentSelectors.inputCompletion(useUserStore.getState());
       const context = getMessagesRef.current?.();
-      const chainParams = chainInputCompletion(input, afterText, context);
+      const { messages, schema } = chainInputCompletion(input, afterText, context);

       const abortController = new AbortController();
       abortSignal.addEventListener('abort', () => abortController.abort());

-      let result = '';
+      const currentTopicId = useChatStore.getState().activeTopicId;

+      let response: { completion?: string } | null;
       try {
-        await chatService.fetchPresetTaskResult({
-          abortController,
-          onMessageHandle: (chunk) => {
-            if (chunk.type === 'text') {
-              result += chunk.text;
-            }
+        response = (await aiChatService.generateJSON(
+          {
+            messages,
+            model: config.model,
+            provider: config.provider,
+            schema,
+            tracing: {
+              agentId,
+              // Use the user's actual typed text as the row's `input_hint`
+              // — the wrapped prompt's first user message is templated and
+              // not human-scannable.
+              inputHint: input,
+              promptVersion: INPUT_COMPLETION_PROMPT_VERSION,
+              scenario: TRACING_SCENARIOS.InputCompletion,
+              schemaName: INPUT_COMPLETION_SCHEMA_NAME,
+              topicId: currentTopicId,
+            },
           },
-          params: merge(config, chainParams),
-        });
+          abortController,
+        )) as { completion?: string } | null;
       } catch {
         return null;
       }

       if (abortSignal.aborted) return null;

-      return result.trimEnd() || null;
+      const completion = response?.completion?.trimEnd();
+      return completion || null;
     },
-    [isComposingRef],
+    [isComposingRef, agentId],
   );

   const autoCompletePlugin = useMemo(
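The tail of the callback above collapses three "no suggestion" cases (aborted request, missing field, whitespace-only text) into a single `null`. A small sketch of that normalisation in isolation, assuming a hypothetical `extractCompletion` helper; the editor code does this inline:

```typescript
// Hypothetical helper mirroring the post-processing in the hunk above.
type CompletionResponse = { completion?: string } | null | undefined;

const extractCompletion = (response: CompletionResponse, aborted = false): string | null => {
  if (aborted) return null; // stale request: ignore whatever came back
  const completion = response?.completion?.trimEnd();
  return completion || null; // '' and undefined both mean "no suggestion"
};

const suggestion = extractCompletion({ completion: 'optimize my query?  ' });
const blank = extractCompletion({ completion: '   ' });
const stale = extractCompletion({ completion: 'ignored' }, true);
```

Only trailing whitespace is trimmed, so a completion that intentionally starts with a space still inserts correctly after a word.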
@@ -0,0 +1,90 @@
+// @vitest-environment node
+import { promisify } from 'node:util';
+import { zstdCompress, zstdDecompress } from 'node:zlib';
+
+import type { TracingPayload } from '@lobechat/llm-generation-tracing';
+import { beforeEach, describe, expect, it, vi } from 'vitest';
+
+const compressZstd = promisify(zstdCompress);
+const decompressZstd = promisify(zstdDecompress);
+
+const uploadBuffer = vi.fn();
+const getFileByteArray = vi.fn();
+
+vi.mock('@/server/modules/S3', () => ({
+  FileS3: vi.fn(() => ({ getFileByteArray, uploadBuffer })),
+}));
+
+const { S3TracingStore, buildTracingKey } = await import('./S3TracingStore');
+
+const samplePayload = (overrides: Partial<TracingPayload> = {}): TracingPayload => ({
+  created_at: new Date('2026-05-22T11:22:33.444Z').getTime(),
+  prompt_hash: 'ab1fc3',
+  prompt_version: 'v1.0',
+  scenario: 'home_brief',
+  tracing_id: '00000000-0000-0000-0000-000000000001',
+  version: '1.0',
+  ...overrides,
+});
+
+beforeEach(() => {
+  uploadBuffer.mockReset().mockResolvedValue(undefined);
+  getFileByteArray.mockReset();
+});
+
+describe('buildTracingKey', () => {
+  it('lays out scenario / version-hash / date / id with the .json.zst suffix', () => {
+    const key = buildTracingKey(samplePayload());
+    expect(key).toBe(
+      'llm-generation-tracing/home_brief/v1.0-ab1fc3/2026-05-22/00000000-0000-0000-0000-000000000001.json.zst',
+    );
+  });
+
+  it('sanitises path-unsafe characters in scenario and version segments', () => {
+    const key = buildTracingKey(samplePayload({ prompt_version: 'v 2/0', scenario: 'odd name!' }));
+    expect(key).toMatch(
+      /llm-generation-tracing\/odd_name_\/v_2_0-ab1fc3\/2026-05-22\/00000000-0000-0000-0000-000000000001\.json\.zst/,
+    );
+  });
+});
+
+describe('S3TracingStore.save', () => {
+  it('uploads zstd-compressed JSON with the canonical key and content-type', async () => {
+    const store = new S3TracingStore();
+    const payload = samplePayload({ input: { messages: [{ role: 'user' }] } });
+
+    const { key } = await store.save(payload);
+
+    expect(key).toBe(
+      'llm-generation-tracing/home_brief/v1.0-ab1fc3/2026-05-22/00000000-0000-0000-0000-000000000001.json.zst',
+    );
+    expect(uploadBuffer).toHaveBeenCalledTimes(1);
+
+    const [callKey, body, contentType] = uploadBuffer.mock.calls[0];
+    expect(callKey).toBe(key);
+    expect(contentType).toBe('application/zstd');
+    expect(Buffer.isBuffer(body)).toBe(true);
+    expect([body[0], body[1], body[2], body[3]]).toEqual([0x28, 0xb5, 0x2f, 0xfd]);
+
+    const roundtripped = JSON.parse((await decompressZstd(body)).toString('utf8'));
+    expect(roundtripped).toEqual(payload);
+  });
+});
+
+describe('S3TracingStore.get', () => {
+  it('decompresses a stored payload by key', async () => {
+    const store = new S3TracingStore();
+    const payload = samplePayload();
+    const buf = await compressZstd(Buffer.from(JSON.stringify(payload)));
+    getFileByteArray.mockResolvedValueOnce(new Uint8Array(buf));
+
+    const loaded = await store.get('some/key.json.zst');
+    expect(loaded).toEqual(payload);
+  });
+
+  it('returns null when the key is missing', async () => {
+    const store = new S3TracingStore();
+    getFileByteArray.mockRejectedValueOnce(new Error('NoSuchKey'));
+    expect(await store.get('missing')).toBeNull();
+  });
+});
@@ -0,0 +1,88 @@
+import { promisify } from 'node:util';
+import { zstdCompress, zstdDecompress } from 'node:zlib';
+
+import type {
+  ITracingStore,
+  SaveResult,
+  TracingPayload,
+  TracingSummary,
+} from '@lobechat/llm-generation-tracing';
+import debug from 'debug';
+
+import { FileS3 } from '@/server/modules/S3';
+
+const compressZstd = promisify(zstdCompress);
+const decompressZstd = promisify(zstdDecompress);
+
+const log = debug('lobe-server:llm-generation-tracing:s3');
+
+const TRACE_PREFIX = 'llm-generation-tracing';
+const PAYLOAD_SUFFIX = '.json.zst';
+const ZSTD_CONTENT_TYPE = 'application/zstd';
+
+const sanitize = (value: string): string => value.replaceAll(/[^\w.-]+/g, '_') || 'unknown';
+
+const dateSegment = (createdAt: number): string => new Date(createdAt).toISOString().slice(0, 10);
+
+/**
+ * Canonical S3 key for a tracing payload. Same source of truth used by both
+ * the store's `save()` and the DB row's `storage_key` so the value persisted
+ * in `llm_generation_tracing.storage_key` always matches the object in S3.
+ *
+ * Layout:
+ *   llm-generation-tracing/{scenario}/{promptVersion}-{promptHash}/{yyyy-mm-dd}/{tracingId}.json.zst
+ */
+export const buildTracingKey = (record: {
+  created_at: number;
+  prompt_hash: string;
+  prompt_version: string;
+  scenario: string;
+  tracing_id: string;
+}): string =>
+  [
+    TRACE_PREFIX,
+    sanitize(record.scenario),
+    `${sanitize(record.prompt_version)}-${sanitize(record.prompt_hash)}`,
+    dateSegment(record.created_at),
+    `${sanitize(record.tracing_id)}${PAYLOAD_SUFFIX}`,
+  ].join('/');
+
+/**
+ * S3-backed store for per-call llm_generation_tracing payloads.
+ *
+ * Payload is zstd-compressed (level 3) prior to upload; the `.zst` suffix
+ * advertises the format but Content-Encoding is intentionally omitted to keep
+ * the object opaque to HTTP middleware (callers decompress explicitly).
+ *
+ * Query (`get` / `list`) is left intentionally minimal — analytics queries go
+ * against the DB row; the S3 blob is the cold artefact for offline review.
+ */
+export class S3TracingStore implements ITracingStore {
+  private readonly s3: FileS3;
+
+  constructor() {
+    this.s3 = new FileS3();
+  }
+
+  async save(record: TracingPayload): Promise<SaveResult> {
+    const key = buildTracingKey(record);
+    log('Saving tracing payload to S3: %s', key);
+    const compressed = await compressZstd(Buffer.from(JSON.stringify(record)));
+    await this.s3.uploadBuffer(key, compressed, ZSTD_CONTENT_TYPE);
+    return { key };
+  }
+
+  async get(key: string): Promise<TracingPayload | null> {
+    try {
+      const bytes = await this.s3.getFileByteArray(key);
+      const buf = await decompressZstd(Buffer.from(bytes));
+      return JSON.parse(buf.toString('utf8')) as TracingPayload;
+    } catch {
+      return null;
+    }
+  }
+
+  async list(_options?: { limit?: number }): Promise<TracingSummary[]> {
+    return [];
+  }
+}
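The key layout documented above can be exercised without S3 or zstd. The following is a self-contained restatement for experimentation, not the shipped module: `buildKey` and `KeyInput` are local names, and this copy uses `replace` with a global regex in place of `replaceAll` so it compiles on older TypeScript targets. The sample values are taken from the test file in this change.

```typescript
const TRACE_PREFIX = 'llm-generation-tracing';
const PAYLOAD_SUFFIX = '.json.zst';

// Collapse each run of path-unsafe characters into one underscore; an empty
// input falls back to 'unknown'.
const sanitize = (value: string): string => value.replace(/[^\w.-]+/g, '_') || 'unknown';

// toISOString() is always UTC, so the date folder is a UTC calendar day.
const dateSegment = (createdAt: number): string => new Date(createdAt).toISOString().slice(0, 10);

interface KeyInput {
  created_at: number;
  prompt_hash: string;
  prompt_version: string;
  scenario: string;
  tracing_id: string;
}

const buildKey = (record: KeyInput): string =>
  [
    TRACE_PREFIX,
    sanitize(record.scenario),
    `${sanitize(record.prompt_version)}-${sanitize(record.prompt_hash)}`,
    dateSegment(record.created_at),
    `${sanitize(record.tracing_id)}${PAYLOAD_SUFFIX}`,
  ].join('/');

const exampleKey = buildKey({
  created_at: Date.UTC(2026, 4, 22, 11, 22, 33, 444), // month is zero-based: 4 = May
  prompt_hash: 'ab1fc3',
  prompt_version: 'v1.0',
  scenario: 'home_brief',
  tracing_id: '00000000-0000-0000-0000-000000000001',
});
// exampleKey === 'llm-generation-tracing/home_brief/v1.0-ab1fc3/2026-05-22/00000000-0000-0000-0000-000000000001.json.zst'
```

Two properties follow from this layout: grouping by `{promptVersion}-{promptHash}` keeps a forgotten version bump visible as a new folder, and because the date segment is UTC, a call made late in the evening in a timezone behind UTC lands in the next day's folder.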
@@ -0,0 +1 @@
+export { buildTracingKey, S3TracingStore } from './S3TracingStore';
@@ -1,5 +1,9 @@
 import { type GoogleGenAIOptions } from '@google/genai';
-import { ModelRuntime, type ModelRuntimeHooks } from '@lobechat/model-runtime';
+import {
+  mergeModelRuntimeHooks,
+  ModelRuntime,
+  type ModelRuntimeHooks,
+} from '@lobechat/model-runtime';
 import { LobeVertexAI } from '@lobechat/model-runtime/vertexai';
 import {
   type AWSBedrockKeyVault,
@@ -18,6 +22,7 @@ import { getBusinessModelRuntimeHooks } from '@/business/server/model-runtime';
 import { AiProviderModel } from '@/database/models/aiProvider';
 import { type LobeChatDatabase } from '@/database/type';
 import { getLLMConfig } from '@/envs/llm';
+import { createLLMGenerationTracingHook } from '@/server/services/llmGenerationTracing/hook';

 import { KeyVaultsGateKeeper } from '../KeyVaultsEncrypt';
 import apiKeyManager from './apiKeyManager';
@@ -420,8 +425,13 @@ export const initModelRuntimeFromDB = async (
|
||||
const payload = buildPayloadFromKeyVaults(keyVaults, runtimeProvider);
|
||||
|
||||
// 4. Get business hooks (billing in cloud, undefined in OSS)
|
||||
- const hooks = getBusinessModelRuntimeHooks(userId, provider);
+ const businessHooks = getBusinessModelRuntimeHooks(userId, provider);

- // 5. Initialize ModelRuntime with the payload and hooks
+ // 5. Compose with the per-call llm_generation_tracing hook (no-op when the
+ // service is unconfigured, so OSS / self-hosted setups pay nothing for it).
+ const tracingHooks = createLLMGenerationTracingHook(userId, provider);
+ const hooks = mergeModelRuntimeHooks(businessHooks, tracingHooks);
+
+ // 6. Initialize ModelRuntime with the payload and hooks
|
||||
return initModelRuntimeWithUserPayload(provider, payload, { userId }, hooks);
|
||||
};
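The implementation of `mergeModelRuntimeHooks` is not part of this excerpt. As a rough illustration of the composition step above, the sketch below merges hook slices so that handlers registered under the same name all run, in argument order. The names `Hook`, `HookSlice`, and `mergeHooksSketch` are hypothetical, and the synchronous handler signature is a simplification; the real `ModelRuntimeHooks` interface has more members and may behave differently.

```typescript
// Hypothetical, simplified hook shape. The real ModelRuntimeHooks interface
// has more members; this sketch keeps every handler synchronous.
type Hook = (data: unknown, context: unknown) => void;
type HookSlice = Record<string, Hook | undefined>;

// Compose several hook slices so that every handler registered under the same
// name runs, in the order the slices were passed.
function mergeHooksSketch(...slices: (HookSlice | undefined)[]): HookSlice | undefined {
  const merged: HookSlice = {};
  for (const slice of slices) {
    if (!slice) continue; // e.g. business hooks are undefined in OSS builds
    for (const [name, handler] of Object.entries(slice)) {
      if (!handler) continue;
      const previous = merged[name];
      merged[name] = previous
        ? (data, context) => {
            previous(data, context);
            handler(data, context);
          }
        : handler;
    }
  }
  // Return undefined when nothing was registered, so callers can skip hooks.
  return Object.keys(merged).length > 0 ? merged : undefined;
}

// Usage: billing and tracing both observe the same completion event.
const calls: string[] = [];
const mergedHooks = mergeHooksSketch(
  { onGenerateObjectComplete: () => calls.push('billing') },
  undefined,
  { onGenerateObjectComplete: () => calls.push('tracing') },
);
mergedHooks?.onGenerateObjectComplete?.({}, {});
```

Skipping `undefined` slices is what would let an empty tracing slice (service disabled) or absent business hooks drop out without special cases at the call site.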
|
||||
|
||||
@@ -1013,5 +1013,43 @@ describe('aiChatRouter', () => {
|
||||
{ metadata: { trigger: 'chat' } },
|
||||
);
|
||||
});
|
||||
|
||||
it('merges caller metadata over the default trigger', async () => {
|
||||
const { initModelRuntimeFromDB } = await import('@/server/modules/ModelRuntime');
|
||||
const mockGenerateObject = vi.fn().mockResolvedValue({ completion: 'hi there' });
|
||||
vi.mocked(initModelRuntimeFromDB).mockResolvedValue({
|
||||
generateObject: mockGenerateObject,
|
||||
} as any);
|
||||
|
||||
const caller = aiChatRouter.createCaller({ ...mockCtx, serverDB: {} } as any);
|
||||
await caller.outputJSON({
|
||||
messages: [{ content: 'be helpful', role: 'system' }],
|
||||
metadata: {
|
||||
promptVersion: 'v2.0',
|
||||
scenario: 'input_completion',
|
||||
schemaName: 'InputCompletion',
|
||||
},
|
||||
model: 'gpt-4o-mini',
|
||||
provider: 'openai',
|
||||
schema: {
|
||||
name: 'InputCompletion',
|
||||
schema: {
|
||||
additionalProperties: false,
|
||||
properties: { completion: { type: 'string' } },
|
||||
required: ['completion'],
|
||||
type: 'object' as const,
|
||||
},
|
||||
},
|
||||
});
|
||||
|
||||
expect(mockGenerateObject.mock.calls[0][1]).toEqual({
|
||||
metadata: {
|
||||
promptVersion: 'v2.0',
|
||||
scenario: 'input_completion',
|
||||
schemaName: 'InputCompletion',
|
||||
trigger: 'chat',
|
||||
},
|
||||
});
|
||||
});
|
||||
});
|
||||
});
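The test above relies on object-spread ordering: the router writes the default trigger first and spreads the caller's metadata after it, so caller keys win. A minimal sketch of that behaviour follows; `buildMetadata` and `DEFAULT_TRIGGER` are illustrative names, with `'chat'` standing in for `RequestTrigger.Chat`.

```typescript
// 'chat' stands in for RequestTrigger.Chat from the router.
const DEFAULT_TRIGGER = 'chat';

function buildMetadata(callerMetadata?: Record<string, unknown>): Record<string, unknown> {
  // Later spread entries win, so caller keys override the default trigger.
  return { trigger: DEFAULT_TRIGGER, ...callerMetadata };
}

// Caller omits trigger: the default is stamped alongside the caller's keys.
const stamped = buildMetadata({ promptVersion: 'v2.0', scenario: 'input_completion' });
// Caller sets trigger: the caller's value replaces the default.
const overridden = buildMetadata({ trigger: 'topic' });
// Caller passes an explicit undefined: the default is erased, not kept.
const cleared = buildMetadata({ trigger: undefined });
```

The last case is worth keeping in mind: an explicit `trigger: undefined` in caller metadata overwrites the default, so a caller that wants the fallback should omit the key entirely.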
|
||||
|
||||
@@ -11,9 +11,9 @@ import { ThreadModel } from '@/database/models/thread';
|
||||
import { TopicModel } from '@/database/models/topic';
|
||||
import { authedProcedure, router } from '@/libs/trpc/lambda';
|
||||
import { serverDatabase } from '@/libs/trpc/lambda/middleware';
|
||||
- import { initModelRuntimeFromDB } from '@/server/modules/ModelRuntime';
|
||||
import { resolveContext } from '@/server/routers/lambda/_helpers/resolveContext';
|
||||
import { AiChatService } from '@/server/services/aiChat';
|
||||
+ import { AiGenerationService } from '@/server/services/aiGeneration';
|
||||
import { FileService } from '@/server/services/file';
|
||||
import { archiveToolResultIfNeeded } from '@/server/services/toolExecution/archiveToolResult';
|
||||
|
||||
@@ -29,6 +29,7 @@ const aiChatProcedure = authedProcedure.use(serverDatabase).use(async (opts) =>
|
||||
ctx: {
|
||||
agentModel: new AgentModel(ctx.serverDB, ctx.userId),
|
||||
aiChatService: new AiChatService(ctx.serverDB, ctx.userId),
|
||||
aiGenerationService: new AiGenerationService(ctx.serverDB, ctx.userId),
|
||||
fileService: new FileService(ctx.serverDB, ctx.userId),
|
||||
messageModel: new MessageModel(ctx.serverDB, ctx.userId),
|
||||
threadModel: new ThreadModel(ctx.serverDB, ctx.userId),
|
||||
@@ -43,19 +44,22 @@ export const aiChatRouter = router({
|
||||
log('messages count: %d', input.messages.length);
|
||||
log('schema: %O', input.schema);
|
||||
|
||||
- log('initializing model runtime from DB with provider: %s', input.provider);
- // Read user's provider config from database
- const modelRuntime = await initModelRuntimeFromDB(ctx.serverDB, ctx.userId, input.provider);
-
- log('calling generateObject');
- const result = await modelRuntime.generateObject(
+ // Always stamp a trigger on metadata so cross-cutting hooks (timing,
+ // routing) and the tracing registry have a fallback when the caller
+ // forgets to set one. `tracing` carries the structured tracing config
+ // (scenario / promptVersion / schemaName / inputHint / ...).
+ const result = await ctx.aiGenerationService.generateObject(
|
||||
{
|
||||
messages: input.messages,
|
||||
model: input.model,
|
||||
+ provider: input.provider,
|
||||
schema: input.schema,
|
||||
tools: input.tools,
|
||||
},
|
||||
- { metadata: { trigger: RequestTrigger.Chat } },
+ {
+ metadata: { trigger: RequestTrigger.Chat, ...input.metadata },
+ tracing: input.tracing,
+ },
|
||||
);
|
||||
|
||||
log('generateObject completed, result: %O', result);
|
||||
|
||||
@@ -0,0 +1,94 @@
|
||||
// @vitest-environment node
|
||||
import { beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
import * as ModelRuntimeModule from '@/server/modules/ModelRuntime';
|
||||
|
||||
import { AiGenerationService } from './index';
|
||||
|
||||
describe('AiGenerationService.generateObject', () => {
|
||||
const generateObject = vi.fn();
|
||||
const initSpy = vi.spyOn(ModelRuntimeModule, 'initModelRuntimeFromDB');
|
||||
|
||||
beforeEach(() => {
|
||||
generateObject.mockReset();
|
||||
initSpy.mockReset();
|
||||
initSpy.mockResolvedValue({ generateObject } as any);
|
||||
});
|
||||
|
||||
it('initialises the runtime from DB with the caller-supplied provider', async () => {
|
||||
generateObject.mockResolvedValue({ ok: true });
|
||||
const ai = new AiGenerationService({} as any, 'user-1');
|
||||
await ai.generateObject({
|
||||
messages: [{ content: 'hi', role: 'user' }],
|
||||
model: 'gpt-4o',
|
||||
provider: 'openai',
|
||||
});
|
||||
expect(initSpy).toHaveBeenCalledWith({}, 'user-1', 'openai');
|
||||
});
|
||||
|
||||
it('forwards messages / model / schema / tools verbatim to the runtime', async () => {
|
||||
generateObject.mockResolvedValue({ name: 'Atlas' });
|
||||
const schema = {
|
||||
name: 'Person',
|
||||
schema: {
|
||||
properties: { name: { type: 'string' } },
|
||||
required: ['name'],
|
||||
type: 'object' as const,
|
||||
},
|
||||
};
|
||||
|
||||
const ai = new AiGenerationService({} as any, 'user-1');
|
||||
await ai.generateObject({
|
||||
messages: [{ content: 'pick a name', role: 'user' }],
|
||||
model: 'gpt-4o',
|
||||
provider: 'openai',
|
||||
schema,
|
||||
});
|
||||
|
||||
const [payload] = generateObject.mock.calls[0];
|
||||
expect(payload).toEqual({
|
||||
messages: [{ content: 'pick a name', role: 'user' }],
|
||||
model: 'gpt-4o',
|
||||
schema,
|
||||
tools: undefined,
|
||||
});
|
||||
});
|
||||
|
||||
it('forwards both options.metadata and options.tracing through to ModelRuntime.generateObject', async () => {
|
||||
generateObject.mockResolvedValue({});
|
||||
const ai = new AiGenerationService({} as any, 'user-1');
|
||||
await ai.generateObject(
|
||||
{
|
||||
messages: [],
|
||||
model: 'gpt-4o',
|
||||
provider: 'openai',
|
||||
},
|
||||
{
|
||||
metadata: { trigger: 'chat' },
|
||||
tracing: {
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'input_completion',
|
||||
},
|
||||
},
|
||||
);
|
||||
const [, options] = generateObject.mock.calls[0];
|
||||
expect(options).toMatchObject({
|
||||
metadata: { trigger: 'chat' },
|
||||
tracing: {
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'input_completion',
|
||||
},
|
||||
});
|
||||
});
|
||||
|
||||
it('returns the runtime result with the typed cast applied', async () => {
|
||||
generateObject.mockResolvedValue({ completion: 'hello world' });
|
||||
const ai = new AiGenerationService({} as any, 'user-1');
|
||||
const result = await ai.generateObject<{ completion: string }>({
|
||||
messages: [],
|
||||
model: 'gpt-4o',
|
||||
provider: 'openai',
|
||||
});
|
||||
expect(result.completion).toBe('hello world');
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,71 @@
|
||||
import type {
|
||||
ChatCompletionTool,
|
||||
GenerateObjectPayload,
|
||||
GenerateObjectSchema,
|
||||
} from '@lobechat/model-runtime';
|
||||
import type { OpenAIChatMessage } from '@lobechat/types';
|
||||
|
||||
import type { LobeChatDatabase } from '@/database/type';
|
||||
import { initModelRuntimeFromDB } from '@/server/modules/ModelRuntime';
|
||||
|
||||
export interface AiGenerationObjectInput {
|
||||
messages: OpenAIChatMessage[] | GenerateObjectPayload['messages'];
|
||||
model: string;
|
||||
provider: string;
|
||||
schema?: GenerateObjectSchema;
|
||||
tools?: ChatCompletionTool[];
|
||||
}
|
||||
|
||||
export interface AiGenerationObjectOptions {
|
||||
/**
|
||||
* Free-form context forwarded to non-tracing hooks (billing, routing). Use
|
||||
* `tracing` instead for `llm_generation_tracing` config.
|
||||
*/
|
||||
metadata?: Record<string, unknown>;
|
||||
signal?: AbortSignal;
|
||||
/**
|
||||
* Structured tracing config (scenario / promptVersion / schemaName /
|
||||
* agentId / topicId / inputHint / ...). Forwarded to the
|
||||
* `llm_generation_tracing` hook. Strongly typed by `TracingOptions` from
|
||||
* `@lobechat/llm-generation-tracing` at call sites.
|
||||
*/
|
||||
tracing?: Record<string, unknown>;
|
||||
}
|
||||
|
||||
/**
|
||||
* Thin wrapper around `initModelRuntimeFromDB` + `ModelRuntime.generateObject`.
|
||||
*
|
||||
* Almost every server-side caller that produces structured output goes through
|
||||
* the same two-step dance: resolve the user's provider config from the DB,
|
||||
* then call generateObject with caller-specific metadata. This service exists
|
||||
* so those call sites don't repeat the init wiring, and so adding a future
|
||||
* cross-cutting concern (default metadata, retries, observability defaults)
|
||||
* has one place to land.
|
||||
*
|
||||
* Construct one per request — `db` and `userId` come from the request context.
|
||||
*/
|
||||
export class AiGenerationService {
|
||||
private readonly db: LobeChatDatabase;
|
||||
private readonly userId: string;
|
||||
|
||||
constructor(db: LobeChatDatabase, userId: string) {
|
||||
this.db = db;
|
||||
this.userId = userId;
|
||||
}
|
||||
|
||||
async generateObject<T = unknown>(
|
||||
input: AiGenerationObjectInput,
|
||||
options: AiGenerationObjectOptions = {},
|
||||
): Promise<T> {
|
||||
const runtime = await initModelRuntimeFromDB(this.db, this.userId, input.provider);
|
||||
return (await runtime.generateObject(
|
||||
{
|
||||
messages: input.messages as GenerateObjectPayload['messages'],
|
||||
model: input.model,
|
||||
schema: input.schema,
|
||||
tools: input.tools,
|
||||
},
|
||||
{ metadata: options.metadata, signal: options.signal, tracing: options.tracing },
|
||||
)) as T;
|
||||
}
|
||||
}
|
||||
@@ -92,6 +92,12 @@ describe('FollowUpActionService.extract', () => {
|
||||
expect.objectContaining({
|
||||
model: 'custom-scene-model',
|
||||
}),
|
||||
expect.objectContaining({
|
||||
tracing: expect.objectContaining({
|
||||
scenario: 'follow_up',
|
||||
topicId: TEST_TOPIC,
|
||||
}),
|
||||
}),
|
||||
);
|
||||
});
|
||||
|
||||
|
||||
@@ -1,10 +1,12 @@
|
||||
+ import { TRACING_SCENARIOS } from '@lobechat/const';
+ import type { TracingOptions } from '@lobechat/llm-generation-tracing';
|
||||
import type { FollowUpChip, FollowUpExtractInput, FollowUpExtractResult } from '@lobechat/types';
|
||||
import debug from 'debug';
|
||||
|
||||
import type { LobeChatDatabase } from '@/database/type';
|
||||
- import { initModelRuntimeFromDB } from '@/server/modules/ModelRuntime';
+ import { AiGenerationService } from '@/server/services/aiGeneration';

- import { buildSuggestionPrompt } from './prompts';
+ import { buildSuggestionPrompt, FOLLOW_UP_PROMPT_VERSION } from './prompts';
|
||||
import { RawResponseSchema, SUGGESTION_RESPONSE_JSON_SCHEMA } from './schema';
|
||||
|
||||
const log = debug('lobe-server:follow-up-action-service');
|
||||
@@ -52,17 +54,28 @@ export class FollowUpActionService {
|
||||
const { system, user } = buildSuggestionPrompt({ assistantText: text, hint });
|
||||
const { model, provider } = modelConfig;
|
||||
|
||||
+ const ai = new AiGenerationService(this.db, this.userId);
|
||||
let raw: unknown;
|
||||
try {
|
||||
- const modelRuntime = await initModelRuntimeFromDB(this.db, this.userId, provider);
- raw = await modelRuntime.generateObject({
- messages: [
- { content: system, role: 'system' as const },
- { content: user, role: 'user' as const },
- ],
- model,
- schema: SUGGESTION_RESPONSE_JSON_SCHEMA,
- });
+ raw = await ai.generateObject(
+ {
+ messages: [
+ { content: system, role: 'system' as const },
+ { content: user, role: 'user' as const },
+ ],
+ model,
+ provider,
+ schema: SUGGESTION_RESPONSE_JSON_SCHEMA,
+ },
+ {
+ tracing: {
+ promptVersion: FOLLOW_UP_PROMPT_VERSION,
+ scenario: TRACING_SCENARIOS.FollowUp,
+ schemaName: 'FollowUpSuggestionResponse',
+ topicId,
+ } satisfies TracingOptions,
+ },
+ );
|
||||
} catch (error) {
|
||||
log('LLM call failed: %O', error);
|
||||
return EMPTY_RESULT(row.id);
|
||||
|
||||
@@ -3,6 +3,13 @@ import type { FollowUpHint } from '@lobechat/types';
|
||||
import { BASE_SYSTEM_PROMPT } from './base';
|
||||
import { buildOnboardingAddendum } from './onboarding';
|
||||
|
||||
/**
|
||||
* Bump when editing BASE_SYSTEM_PROMPT, the onboarding addendum, or the
|
||||
* suggestion response schema. The 6-char prompt hash in the tracing row
|
||||
* catches forgotten bumps.
|
||||
*/
|
||||
export const FOLLOW_UP_PROMPT_VERSION = 'v1.0';
|
||||
|
||||
export interface BuiltPrompt {
|
||||
system: string;
|
||||
user: string;
|
||||
|
||||
@@ -0,0 +1,186 @@
|
||||
// @vitest-environment node
|
||||
import { beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
const isEnabled = vi.fn<() => boolean>(() => true);
|
||||
const record = vi.fn<(params: Record<string, unknown>) => Promise<{ tracingId: string }>>(
|
||||
async () => ({ tracingId: 'trace-1' }),
|
||||
);
|
||||
|
||||
vi.mock('./index', () => ({
|
||||
getLLMGenerationTracingService: () => ({ isEnabled, record }),
|
||||
}));
|
||||
|
||||
// next/server is optional at runtime; default to "not available" so the hook
|
||||
// falls back to its microtask path which is straightforward to test.
|
||||
vi.mock('next/server', () => ({}));
|
||||
|
||||
const { createLLMGenerationTracingHook } = await import('./hook');
|
||||
|
||||
const flushMicrotasks = async () => {
|
||||
await new Promise((resolve) => setImmediate(resolve));
|
||||
};
|
||||
|
||||
beforeEach(() => {
|
||||
isEnabled.mockReturnValue(true);
|
||||
record.mockClear();
|
||||
});
|
||||
|
||||
describe('createLLMGenerationTracingHook', () => {
|
||||
it('returns an empty object when the service is disabled', () => {
|
||||
isEnabled.mockReturnValue(false);
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
expect(hooks).toEqual({});
|
||||
});
|
||||
|
||||
it('schedules a service.record call on success, reading structured tracing config from options.tracing', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
expect(hooks.onGenerateObjectComplete).toBeDefined();
|
||||
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{
|
||||
latencyMs: 250,
|
||||
output: { topic: 'greeting' },
|
||||
success: true,
|
||||
usage: { cost: 0.001, totalInputTokens: 100, totalOutputTokens: 30 } as any,
|
||||
},
|
||||
{
|
||||
options: {
|
||||
metadata: { trigger: 'agent_signal' },
|
||||
tracing: {
|
||||
agentId: 'agt-1',
|
||||
promptVersion: 'v2.0',
|
||||
scenario: 'signal_skill_intent',
|
||||
topicId: 'tpc-1',
|
||||
},
|
||||
},
|
||||
payload: {
|
||||
messages: [
|
||||
{ content: 'be helpful', role: 'system' },
|
||||
{ content: 'hi there', role: 'user' },
|
||||
],
|
||||
model: 'gpt-4o',
|
||||
schema: { type: 'object' },
|
||||
} as any,
|
||||
},
|
||||
);
|
||||
|
||||
await flushMicrotasks();
|
||||
expect(record).toHaveBeenCalledTimes(1);
|
||||
const call = record.mock.calls[0][0];
|
||||
expect(call).toMatchObject({
|
||||
agentId: 'agt-1',
|
||||
costUsd: 0.001,
|
||||
inputTokens: 100,
|
||||
latencyMs: 250,
|
||||
model: 'gpt-4o',
|
||||
outputTokens: 30,
|
||||
promptVersion: 'v2.0',
|
||||
provider: 'openai',
|
||||
scenario: 'signal_skill_intent',
|
||||
success: true,
|
||||
topicId: 'tpc-1',
|
||||
trigger: 'agent_signal',
|
||||
userId: 'user-1',
|
||||
});
|
||||
expect((call.payload as { systemPrompt?: string }).systemPrompt).toBe('be helpful');
|
||||
expect(call.promptHash).toHaveLength(6);
|
||||
});
|
||||
|
||||
it('forwards caller-supplied inputHint through to the service', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{ latencyMs: 10, success: true },
|
||||
{
|
||||
options: {
|
||||
tracing: {
|
||||
inputHint: '杭州天气',
|
||||
scenario: 'input_completion',
|
||||
schemaName: 'InputCompletion',
|
||||
},
|
||||
},
|
||||
payload: { messages: [], model: 'gpt-4o', schema: {} } as any,
|
||||
},
|
||||
);
|
||||
await flushMicrotasks();
|
||||
expect(record.mock.calls[0][0]).toMatchObject({
|
||||
inputHint: '杭州天气',
|
||||
scenario: 'input_completion',
|
||||
schemaName: 'InputCompletion',
|
||||
});
|
||||
});
|
||||
|
||||
it('flags validation failures using the error message heuristic and resolves scenario from metadata.trigger fallback', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{
|
||||
error: { message: 'ZodError: required field missing' },
|
||||
latencyMs: 100,
|
||||
success: false,
|
||||
},
|
||||
{
|
||||
options: { metadata: { trigger: 'topic' } },
|
||||
payload: { messages: [], model: 'gpt-4o', schema: { type: 'object' } } as any,
|
||||
},
|
||||
);
|
||||
|
||||
await flushMicrotasks();
|
||||
expect(record.mock.calls[0][0]).toMatchObject({
|
||||
errorDetail: 'ZodError: required field missing',
|
||||
scenario: 'topic_title',
|
||||
success: false,
|
||||
trigger: 'topic',
|
||||
validationFailed: true,
|
||||
});
|
||||
});
|
||||
|
||||
it('writes caller-supplied tracing.metadata verbatim to the DB jsonb column (no auto-stamped provider)', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{ latencyMs: 5, success: true },
|
||||
{
|
||||
options: {
|
||||
metadata: { trigger: 'memory' },
|
||||
tracing: {
|
||||
agentId: 'agt-known',
|
||||
metadata: {
|
||||
parent_memory_trace_key: 'memory-extraction/user-1/topic/abc/trace/2026-05-22.json',
|
||||
},
|
||||
},
|
||||
},
|
||||
payload: { messages: [], model: 'gpt-4o', schema: {} } as any,
|
||||
},
|
||||
);
|
||||
await flushMicrotasks();
|
||||
// `provider` is a first-class column — must NOT be duplicated into metadata.
|
||||
expect(record.mock.calls[0][0].metadata).toEqual({
|
||||
parent_memory_trace_key: 'memory-extraction/user-1/topic/abc/trace/2026-05-22.json',
|
||||
});
|
||||
});
|
||||
|
||||
it('omits the metadata field when the caller passes no tracing.metadata', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{ latencyMs: 5, success: true },
|
||||
{
|
||||
options: { tracing: { scenario: 'input_completion' } },
|
||||
payload: { messages: [], model: 'gpt-4o', schema: {} } as any,
|
||||
},
|
||||
);
|
||||
await flushMicrotasks();
|
||||
expect(record.mock.calls[0][0].metadata).toBeUndefined();
|
||||
});
|
||||
|
||||
it('falls back to the unknown scenario when no trigger / scenario is provided anywhere', async () => {
|
||||
const hooks = createLLMGenerationTracingHook('user-1', 'openai');
|
||||
hooks.onGenerateObjectComplete!(
|
||||
{ latencyMs: 100, success: true },
|
||||
{ options: {}, payload: { messages: [], model: 'gpt-4o' } as any },
|
||||
);
|
||||
|
||||
await flushMicrotasks();
|
||||
expect(record.mock.calls[0][0]).toMatchObject({
|
||||
promptVersion: 'v0',
|
||||
scenario: 'unknown',
|
||||
});
|
||||
});
|
||||
});
|
||||
@@ -0,0 +1,141 @@
|
||||
import {
|
||||
computePromptHash,
|
||||
resolveScenario,
|
||||
type TracingOptions,
|
||||
} from '@lobechat/llm-generation-tracing';
|
||||
import type { ModelRuntimeHooks } from '@lobechat/model-runtime';
|
||||
import debug from 'debug';
|
||||
|
||||
import { getLLMGenerationTracingService } from './index';
|
||||
|
||||
const log = debug('lobe-server:llm-generation-tracing:hook');
|
||||
|
||||
const pickString = (value: unknown): string | undefined =>
|
||||
typeof value === 'string' ? value : undefined;
|
||||
|
||||
/**
* Validate the loose `options.tracing` bag (the runtime declares it as
* `Record<string, unknown>`) into the strongly-typed `TracingOptions` shape
* the hook works with. Only the known keys below are kept, and only when
* they hold a string; other top-level keys are dropped. Free-form context
* belongs under `metadata`, which is forwarded as-is (when it is a plain
* object) to the DB jsonb column.
*/
|
||||
const parseTracingOptions = (raw: Record<string, unknown> | undefined): TracingOptions => {
|
||||
if (!raw) return {};
|
||||
return {
|
||||
agentId: pickString(raw.agentId),
|
||||
inputHint: pickString(raw.inputHint),
|
||||
metadata:
|
||||
raw.metadata && typeof raw.metadata === 'object' && !Array.isArray(raw.metadata)
|
||||
? (raw.metadata as Record<string, unknown>)
|
||||
: undefined,
|
||||
parentTracingId: pickString(raw.parentTracingId),
|
||||
promptVersion: pickString(raw.promptVersion),
|
||||
scenario: pickString(raw.scenario),
|
||||
schemaName: pickString(raw.schemaName),
|
||||
systemPrompt: pickString(raw.systemPrompt),
|
||||
topicId: pickString(raw.topicId),
|
||||
trigger: pickString(raw.trigger),
|
||||
};
|
||||
};
|
||||
|
||||
const extractSystemPrompt = (messages: unknown): string => {
|
||||
if (!Array.isArray(messages)) return '';
|
||||
const first = messages[0] as { content?: unknown; role?: unknown } | undefined;
|
||||
if (first?.role === 'system' && typeof first.content === 'string') return first.content;
|
||||
return '';
|
||||
};
|
||||
|
||||
const tryScheduleAfter = (work: () => Promise<void> | void): void => {
|
||||
let scheduled = false;
|
||||
try {
|
||||
const nextServer = require('next/server') as { after?: (fn: () => unknown) => void };
|
||||
if (typeof nextServer.after === 'function') {
|
||||
nextServer.after(work);
|
||||
scheduled = true;
|
||||
}
|
||||
} catch {
|
||||
// next/server not available — fall through to fire-and-forget
|
||||
}
|
||||
if (!scheduled) {
|
||||
Promise.resolve()
|
||||
.then(work)
|
||||
.catch((err) => log('Deferred tracing work threw: %O', err));
|
||||
}
|
||||
};
|
||||
|
||||
/**
|
||||
* Build a `ModelRuntimeHooks` slice that records every `generateObject` call to
|
||||
* the `llm_generation_tracing` DB table + blob store. Designed to be merged
|
||||
* with any business hooks at the ModelRuntime construction site.
|
||||
*/
|
||||
export const createLLMGenerationTracingHook = (
|
||||
userId: string,
|
||||
provider: string,
|
||||
): Pick<ModelRuntimeHooks, 'onGenerateObjectComplete'> => {
|
||||
const service = getLLMGenerationTracingService();
|
||||
if (!service.isEnabled()) return {};
|
||||
|
||||
return {
|
||||
onGenerateObjectComplete: (data, context) => {
|
||||
const tracing = parseTracingOptions(context.options?.tracing as Record<string, unknown>);
|
||||
// `trigger` is also read by ModelRuntime itself (timing logs) so it
|
||||
// legitimately lives on `metadata`. Honour the explicit `tracing.trigger`
|
||||
// override but fall back to the cross-cutting `metadata.trigger`.
|
||||
const metadataTrigger = pickString(
|
||||
(context.options?.metadata as Record<string, unknown> | undefined)?.trigger,
|
||||
);
|
||||
const trigger = tracing.trigger ?? metadataTrigger;
|
||||
const { scenario, promptVersion } = resolveScenario({
|
||||
promptVersion: tracing.promptVersion,
|
||||
scenario: tracing.scenario,
|
||||
trigger,
|
||||
});
|
||||
|
||||
const systemPrompt = tracing.systemPrompt ?? extractSystemPrompt(context.payload.messages);
|
||||
const promptHash = computePromptHash(systemPrompt, context.payload.schema);
|
||||
|
||||
// Heuristic: treat the failure as a validation error when the message
// mentions "zod" or "validation" anywhere (case-insensitive).
|
||||
const errorMessage = data.error?.message;
|
||||
const validationFailed =
|
||||
!data.success && typeof errorMessage === 'string' && /zod|validation/i.test(errorMessage);
|
||||
|
||||
tryScheduleAfter(async () => {
|
||||
try {
|
||||
await service.record({
|
||||
agentId: tracing.agentId,
|
||||
costUsd: (data.usage as { cost?: number } | undefined)?.cost,
|
||||
errorCode: data.error?.code,
|
||||
errorDetail: data.error?.message ?? data.error?.stack,
|
||||
inputHint: tracing.inputHint,
|
||||
inputTokens: data.usage?.totalInputTokens ?? data.usage?.inputTextTokens,
|
||||
latencyMs: data.latencyMs,
|
||||
// Caller-supplied jsonb context only. `provider` is already a
|
||||
// first-class column on the row — no need to duplicate it here.
|
||||
metadata: tracing.metadata,
|
||||
model: context.payload.model,
|
||||
outputTokens: data.usage?.totalOutputTokens ?? data.usage?.outputTextTokens,
|
||||
parentTracingId: tracing.parentTracingId,
|
||||
payload: {
|
||||
input: context.payload.messages,
|
||||
output: data.output,
|
||||
schema: context.payload.schema,
|
||||
systemPrompt,
|
||||
},
|
||||
promptHash,
|
||||
promptVersion,
|
||||
provider,
|
||||
scenario,
|
||||
schemaName: tracing.schemaName,
|
||||
success: data.success,
|
||||
topicId: tracing.topicId,
|
||||
trigger,
|
||||
userId,
|
||||
validationFailed,
|
||||
});
|
||||
} catch (err) {
|
||||
log('Tracing service threw: %O', err);
|
||||
}
|
||||
});
|
||||
},
|
||||
};
|
||||
};
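Two small decisions in the hook are easy to misread: how a failure gets flagged as a validation failure, and which `trigger` wins when both `tracing.trigger` and `metadata.trigger` are present. The sketch below isolates both, mirroring the expressions in the hook above. The function names `isValidationFailure` and `resolveTrigger` are illustrative and do not exist in the source.

```typescript
const pickString = (value: unknown): string | undefined =>
  typeof value === 'string' ? value : undefined;

// Mirrors the hook's heuristic: only failed calls whose error message mentions
// "zod" or "validation" (case-insensitive, anywhere in the text) are flagged.
function isValidationFailure(success: boolean, errorMessage: unknown): boolean {
  return !success && typeof errorMessage === 'string' && /zod|validation/i.test(errorMessage);
}

// Mirrors the hook's precedence: an explicit tracing.trigger wins, otherwise
// the cross-cutting metadata.trigger is used. Non-string values are ignored.
function resolveTrigger(
  tracing?: Record<string, unknown>,
  metadata?: Record<string, unknown>,
): string | undefined {
  return pickString(tracing?.trigger) ?? pickString(metadata?.trigger);
}

const flagged = isValidationFailure(false, 'ZodError: required field missing');
const notFlagged = isValidationFailure(false, 'network timeout');
const fallbackTrigger = resolveTrigger({}, { trigger: 'topic' });
```

Because the check is a substring match, an unrelated error whose message merely contains the word "validation" would also be flagged; the source labels this a heuristic for that reason.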
|
||||
@@ -0,0 +1,227 @@
|
||||
// @vitest-environment node
|
||||
import type { ITracingStore, TracingPayload } from '@lobechat/llm-generation-tracing';
|
||||
import { eq } from 'drizzle-orm';
|
||||
import { afterEach, beforeEach, describe, expect, it, vi } from 'vitest';
|
||||
|
||||
import { getTestDB } from '@/database/core/getTestDB';
|
||||
import { llmGenerationTracing, users } from '@/database/schemas';
|
||||
import type { LobeChatDatabase } from '@/database/type';
|
||||
|
||||
import { LLMGenerationTracingService } from './index';
|
||||
|
||||
const serverDB: LobeChatDatabase = await getTestDB();
|
||||
|
||||
// The service resolves the DB via getServerDB at call time. Point it at our
|
||||
// test DB so the integration covers the real insert/update path.
|
||||
vi.mock('@/database/server', () => ({ getServerDB: async () => serverDB }));
|
||||
|
||||
const userId = 'llm-gen-trace-svc-user';
|
||||
|
||||
const stubStore: ITracingStore & {
|
||||
save: ReturnType<typeof vi.fn<(record: TracingPayload) => Promise<{ key: string | null }>>>;
|
||||
} = {
|
||||
save: vi.fn<(record: TracingPayload) => Promise<{ key: string | null }>>(async () => ({
|
||||
key: 'memo://saved',
|
||||
})),
|
||||
};
|
||||
|
||||
beforeEach(async () => {
|
||||
await serverDB.delete(users);
|
||||
await serverDB.insert(users).values([{ id: userId }]);
|
||||
stubStore.save.mockClear();
|
||||
stubStore.save.mockResolvedValue({ key: 'memo://saved' });
|
||||
});
|
||||
|
||||
afterEach(async () => {
|
||||
await serverDB.delete(llmGenerationTracing);
|
||||
await serverDB.delete(users);
|
||||
});
|
||||
|
||||
describe('LLMGenerationTracingService.record', () => {
|
||||
it('inserts a row, calls the store, and patches the returned storage_key', async () => {
|
||||
const service = new LLMGenerationTracingService(stubStore);
|
||||
|
||||
const result = await service.record({
|
||||
latencyMs: 420,
|
||||
model: 'gpt-4o',
|
||||
payload: {
|
||||
input: [{ content: 'hello world from the user', role: 'user' }],
|
||||
output: { topic: 'greeting' },
|
||||
schema: { type: 'object' },
|
||||
systemPrompt: 'be helpful',
|
||||
},
|
||||
promptHash: 'aaaaaa',
|
||||
promptVersion: 'v1.0',
|
||||
provider: 'openai',
|
||||
scenario: 'home_brief',
|
||||
success: true,
|
||||
trigger: 'home_brief',
|
||||
userId,
|
||||
});
|
||||
|
||||
expect(result?.tracingId).toMatch(/^[0-9a-f-]{36}$/);
|
||||
expect(stubStore.save).toHaveBeenCalledTimes(1);
|
||||
|
||||
const payload = stubStore.save.mock.calls[0][0];
|
||||
expect(payload).toMatchObject({
|
||||
input: [{ content: 'hello world from the user', role: 'user' }],
|
||||
model_metadata: { model: 'gpt-4o', provider: 'openai' },
|
||||
output: { topic: 'greeting' },
|
||||
prompt_hash: 'aaaaaa',
|
||||
scenario: 'home_brief',
|
||||
tracing_id: result?.tracingId,
|
||||
version: '1.0',
|
||||
});
|
||||
|
||||
const [row] = await serverDB
|
||||
.select()
|
||||
.from(llmGenerationTracing)
|
||||
.where(eq(llmGenerationTracing.id, result!.tracingId));
|
||||
expect(row).toMatchObject({
|
||||
inputHint: 'hello world from the user',
|
||||
latencyMs: 420,
|
||||
model: 'gpt-4o',
|
||||
promptHash: 'aaaaaa',
|
||||
provider: 'openai',
|
||||
scenario: 'home_brief',
|
||||
storageKey: 'memo://saved',
|
||||
success: true,
|
||||
trigger: 'home_brief',
|
||||
userId,
|
||||
});
|
||||
expect(row?.inputHash).toMatch(/^[0-9a-f]{64}$/);
|
||||
});
|
||||
|
||||
it('preserves the row with storage_key=null and metadata.store_error when the store throws', async () => {
|
||||
stubStore.save.mockRejectedValueOnce(new Error('S3 5xx'));
|
||||
const service = new LLMGenerationTracingService(stubStore);
|
||||
|
||||
const result = await service.record({
|
||||
metadata: { caller: 'home_brief_handler' },
|
||||
promptHash: 'bbbbbb',
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'home_brief',
|
||||
success: true,
|
||||
userId,
|
||||
});
|
||||
|
||||
const [row] = await serverDB
|
||||
.select()
|
||||
.from(llmGenerationTracing)
|
||||
.where(eq(llmGenerationTracing.id, result!.tracingId));
|
||||
expect(row?.storageKey).toBeNull();
|
||||
expect(row?.metadata).toMatchObject({
|
||||
caller: 'home_brief_handler',
|
||||
store_error: 'S3 5xx',
|
||||
});
|
||||
});
|
||||
|
||||
it('honours an explicit inputHint override instead of auto-extracting from the first user message', async () => {
|
||||
const service = new LLMGenerationTracingService(stubStore);
|
||||
const result = await service.record({
|
||||
inputHint: '杭州天气',
|
||||
payload: {
|
||||
// Wrapper prompt — first user message is a template, not the real input.
|
||||
input: [
|
||||
{ content: 'be helpful', role: 'system' },
|
||||
{ content: 'Before cursor: "杭州天气" After cursor: ""', role: 'user' },
|
||||
],
|
||||
},
|
||||
promptHash: 'ffffff',
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'input_completion',
|
||||
success: true,
|
||||
userId,
|
||||
});
|
||||
|
||||
const [row] = await serverDB
|
||||
.select()
|
||||
.from(llmGenerationTracing)
|
||||
.where(eq(llmGenerationTracing.id, result!.tracingId));
|
||||
expect(row?.inputHint).toBe('杭州天气');
|
||||
});
|
||||
|
||||
it('truncates an excessively long inputHint override to INPUT_HINT_MAX', async () => {
|
||||
const service = new LLMGenerationTracingService(stubStore);
|
||||
const long = 'x'.repeat(500);
|
||||
const result = await service.record({
|
||||
inputHint: long,
|
||||
promptHash: 'ffffff',
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'input_completion',
|
||||
success: true,
|
||||
userId,
|
||||
});
|
||||
const [row] = await serverDB
|
||||
.select()
|
||||
.from(llmGenerationTracing)
|
||||
.where(eq(llmGenerationTracing.id, result!.tracingId));
|
||||
expect(row?.inputHint?.length).toBe(200);
|
||||
});
|
||||
|
||||
it('leaves storage_key null when the store reports a local-only save (key=null)', async () => {
|
||||
stubStore.save.mockResolvedValueOnce({ key: null });
|
||||
const service = new LLMGenerationTracingService(stubStore);
|
||||
|
||||
const result = await service.record({
|
||||
promptHash: 'eeeeee',
|
||||
promptVersion: 'v1.0',
|
||||
scenario: 'home_brief',
|
||||
success: true,
|
||||
userId,
|
||||
});
|
||||
|
||||
const [row] = await serverDB
|
||||
.select()
|
||||
.from(llmGenerationTracing)
|
||||
.where(eq(llmGenerationTracing.id, result!.tracingId));
|
||||
expect(row?.storageKey).toBeNull();
|
||||
});
|

  it('returns null and skips everything when no store is configured', async () => {
    const service = new LLMGenerationTracingService(null);
    const result = await service.record({
      promptHash: 'cccccc',
      promptVersion: 'v1.0',
      scenario: 'home_brief',
      success: true,
      userId,
    });
    expect(result).toBeNull();
    expect(stubStore.save).not.toHaveBeenCalled();
    const rows = await serverDB.select().from(llmGenerationTracing);
    expect(rows).toHaveLength(0);
  });
});

describe('LLMGenerationTracingService.recordFeedback', () => {
  it('writes feedback columns onto the row owned by the caller', async () => {
    const service = new LLMGenerationTracingService(stubStore);

    const { tracingId } = (await service.record({
      promptHash: 'dddddd',
      promptVersion: 'v1.0',
      scenario: 'agent_welcome',
      success: true,
      userId,
    }))!;

    await service.recordFeedback(userId, tracingId, {
      data: { clicked_question_index: 0 },
      score: 1,
      signal: 'positive',
      source: 'explicit_thumbs',
    });

    const [row] = await serverDB
      .select()
      .from(llmGenerationTracing)
      .where(eq(llmGenerationTracing.id, tracingId));
    expect(row).toMatchObject({
      feedbackData: { clicked_question_index: 0 },
      feedbackScore: 1,
      feedbackSignal: 'positive',
      feedbackSource: 'explicit_thumbs',
    });
  });
});
@@ -0,0 +1,258 @@
import {
  computeInputHash,
  FileTracingStore,
  type ITracingStore,
  type TracingPayload,
} from '@lobechat/llm-generation-tracing';
import debug from 'debug';
import { eq } from 'drizzle-orm';

import {
  LlmGenerationTracingModel,
  type RecordLlmGenerationParams,
  type UpdateLlmGenerationFeedbackParams,
} from '@/database/models/llmGenerationTracing';
import { llmGenerationTracing } from '@/database/schemas/llmGenerationTracing';
import { getServerDB } from '@/database/server';

const log = debug('lobe-server:llm-generation-tracing:service');

const INPUT_HINT_MAX = 200;

export interface GenerationCallPayload {
  input?: unknown;
  output?: unknown;
  rawOutput?: string;
  schema?: unknown;
  systemPrompt?: string;
}

export interface RecordLLMGenerationCallParams {
  agentId?: string | null;
  costUsd?: number | null;
  errorCode?: string | null;
  errorDetail?: string | null;
  /**
   * Caller-supplied snippet stored on `input_hint`. When omitted, the service
   * auto-extracts a hint from the first user message in `payload.input`.
   * Callers wrapping the user's text in a template should pass the raw input
   * here so the DB row stays human-scannable.
   */
  inputHint?: string | null;
  inputTokens?: number | null;
  latencyMs?: number | null;
  metadata?: Record<string, unknown>;
  model?: string | null;
  outputTokens?: number | null;
  parentTracingId?: string | null;
  payload?: GenerationCallPayload;
  promptHash: string;
  promptVersion: string;
  provider?: string | null;
  scenario: string;
  schemaName?: string | null;
  spanId?: string | null;
  success: boolean;
  topicId?: string | null;
  traceId?: string | null;
  trigger?: string | null;
  userId: string;
  validationFailed?: boolean;
}

/**
 * Per-call observability for `generateObject`. Persists a structured summary
 * row to `llm_generation_tracing` and the full prompt/input/output blob to the
 * configured store (S3 in prod, local file in dev, no-op otherwise).
 *
 * Always invoked from `after()` so it never blocks the user response. Both
 * store and DB failures are swallowed and logged — the DB row is the source
 * of truth for analytics, the blob is a cold artefact for offline review.
 */
export class LLMGenerationTracingService {
  private readonly store: ITracingStore | null;

  constructor(store?: ITracingStore | null) {
    this.store = store === undefined ? createDefaultStore() : store;
  }

  isEnabled(): boolean {
    return this.store !== null;
  }

  async record(params: RecordLLMGenerationCallParams): Promise<{ tracingId: string } | null> {
    if (!this.store) return null;

    let db: Awaited<ReturnType<typeof getServerDB>>;
    try {
      db = await getServerDB();
    } catch (err) {
      log('Skipping tracing — getServerDB failed: %O', err);
      return null;
    }

    const model = new LlmGenerationTracingModel(db, params.userId);

    const dbValues: RecordLlmGenerationParams = {
      agentId: params.agentId,
      costUsd: params.costUsd,
      errorCode: params.errorCode,
      errorDetail: params.errorDetail,
      inputHash: params.payload?.input ? computeInputHash(params.payload.input) : null,
      inputHint: resolveInputHint(params.inputHint, params.payload?.input),
      inputTokens: params.inputTokens,
      latencyMs: params.latencyMs,
      metadata: params.metadata,
      model: params.model,
      outputTokens: params.outputTokens,
      parentTracingId: params.parentTracingId,
      promptHash: params.promptHash,
      promptVersion: params.promptVersion,
      provider: params.provider,
      scenario: params.scenario,
      schemaName: params.schemaName,
      spanId: params.spanId,
      success: params.success,
      topicId: params.topicId,
      traceId: params.traceId,
      trigger: params.trigger,
      validationFailed: params.validationFailed,
    };

    // Insert first so the storage key can embed the row's id — every blob then
    // points at exactly one row.
    let tracingId: string;
    try {
      const row = await model.record(dbValues);
      tracingId = row.id;
    } catch (err) {
      log('DB insert failed: %O', err);
      return null;
    }

    const payload: TracingPayload = {
      created_at: Date.now(),
      error: params.success
        ? undefined
        : {
            code: params.errorCode ?? undefined,
            message: params.errorDetail ?? undefined,
          },
      input: params.payload?.input,
      model_metadata: {
        model: params.model ?? undefined,
        provider: params.provider ?? undefined,
      },
      output: params.payload?.output,
      prompt_hash: params.promptHash,
      prompt_version: params.promptVersion,
      raw_output: params.validationFailed ? params.payload?.rawOutput : undefined,
      scenario: params.scenario,
      schema: params.payload?.schema,
      system_prompt: params.payload?.systemPrompt,
      tracing_id: tracingId,
      validation_failed: params.validationFailed,
      version: '1.0',
    };

    let storageKey: string | null = null;
    let storeError: string | undefined;
    try {
      const result = await this.store.save(payload);
      storageKey = result.key;
    } catch (err) {
      storeError = err instanceof Error ? err.message : String(err);
      log('Store save failed (DB row kept): %O', err);
    }

    try {
      await db
        .update(llmGenerationTracing)
        .set({
          metadata: storeError
            ? { ...params.metadata, store_error: storeError }
            : (params.metadata ?? {}),
          storageKey,
        })
        .where(eq(llmGenerationTracing.id, tracingId));
    } catch (err) {
      log('Failed to patch storage_key onto row: %O', err);
    }

    return { tracingId };
  }

  async recordFeedback(
    userId: string,
    tracingId: string,
    params: UpdateLlmGenerationFeedbackParams,
  ): Promise<void> {
    let db: Awaited<ReturnType<typeof getServerDB>>;
    try {
      db = await getServerDB();
    } catch (err) {
      log('Skipping feedback — getServerDB failed: %O', err);
      return;
    }
    const model = new LlmGenerationTracingModel(db, userId);
    try {
      await model.updateFeedback(tracingId, params);
    } catch (err) {
      log('Feedback update failed: %O', err);
    }
  }
}

const createDefaultStore = (): ITracingStore | null => {
  if (process.env.ENABLE_LLM_GENERATION_TRACING_S3 === '1') {
    try {
      // Require at call time so test environments without S3 wiring don't break.

      const { S3TracingStore } = require('@/server/modules/LLMGenerationTracing');
      return new S3TracingStore();
    } catch {
      // S3 wiring not available — fall through to file store / null.
    }
  }

  if (process.env.NODE_ENV === 'development') {
    try {
      return new FileTracingStore();
    } catch {
      // Filesystem unavailable — fall through to null.
    }
  }

  return null;
};

const autoExtractHint = (input: unknown): string | null => {
  if (input == null) return null;
  if (typeof input === 'string') return input;
  if (!Array.isArray(input)) return null;
  const firstUser = input.find(
    (m): m is { content: unknown; role: string } =>
      typeof m === 'object' &&
      m !== null &&
      'role' in m &&
      (m as { role: unknown }).role === 'user',
  );
  return firstUser && typeof firstUser.content === 'string' ? firstUser.content : null;
};

/**
 * Pick the `input_hint` value: caller-supplied override wins; otherwise fall
 * back to a best-effort auto-extraction from the first user message. Always
 * truncated to `INPUT_HINT_MAX` so the column stays scannable.
 */
const resolveInputHint = (override: string | null | undefined, input: unknown): string | null => {
  const raw = override ?? autoExtractHint(input);
  if (raw == null) return null;
  return raw.slice(0, INPUT_HINT_MAX);
};

let cachedInstance: LLMGenerationTracingService | null = null;
export const getLLMGenerationTracingService = (): LLMGenerationTracingService => {
  if (!cachedInstance) cachedInstance = new LLMGenerationTracingService();
  return cachedInstance;
};
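Reviewer note (illustrative, not part of the commit): the input-hint rules described in the `resolveInputHint` doc comment can be exercised without a database. The sketch below restates them in standalone form. The names `HINT_MAX`, `MessageLike`, `extractHintSketch`, and `resolveHintSketch` are hypothetical stand-ins for this example, and the behaviour is assumed to match the service code above rather than verified against the full repository.

```typescript
// Illustrative sketch only; hypothetical names, not the commit's exports.
const HINT_MAX = 200; // assumed to mirror INPUT_HINT_MAX in the service

type MessageLike = { content: unknown; role: string };

// Returns the input itself when it is a string, otherwise the content of the
// first message whose role is 'user' (only when that content is a string).
const extractHintSketch = (input: unknown): string | null => {
  if (typeof input === 'string') return input;
  if (!Array.isArray(input)) return null;
  const firstUser = input.find(
    (m): m is MessageLike =>
      typeof m === 'object' && m !== null && (m as { role?: unknown }).role === 'user',
  );
  return firstUser && typeof firstUser.content === 'string' ? firstUser.content : null;
};

// A caller-supplied override wins; the result is always cut to HINT_MAX chars.
const resolveHintSketch = (
  override: string | null | undefined,
  input: unknown,
): string | null => {
  const raw = override ?? extractHintSketch(input);
  return raw == null ? null : raw.slice(0, HINT_MAX);
};

// A templated prompt like the one in the input_completion test.
const templated = [
  { content: 'be helpful', role: 'system' },
  { content: 'Before cursor: "杭州天气" After cursor: ""', role: 'user' },
];

const withOverride = resolveHintSketch('杭州天气', templated);
const withoutOverride = resolveHintSketch(undefined, templated);
const truncated = resolveHintSketch('x'.repeat(500), undefined);
```

Without the override, the stored hint would be the whole templated message, which is why callers that wrap user text in a template are advised to pass the raw text as `inputHint`.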