Files
lobe-chat/src/server/services/taskRunner
Tsuki 480f6a8e7b feat(task): support file & image attachments (#15141)
*  feat(task): support file & image attachments (LOBE-8967)

Adds attachment / image upload to all four Task input surfaces (Create
Modal, Inline Entry, Task Instruction, Comment Input, Feedback Input)
plus comment edit. Attachments persist in `tasks.editor_data` /
`task_comments.editor_data` as part of the Lexical JSON state and flow
into agent runs via `execAgent.fileIds` — images as multimodal vision
content, documents through `documentService.parseFile` for text
extraction.

Server-side fileId resolution rides on the editor's
`extractMediaFromEditorState` (`@lobehub/editor/headless` 4.15.1), so
no junction tables are needed — editor_data is the single source of
truth. The /f/{fileId} proxy URL contract from the file router stays
the bridge between editor URLs and backend file lookup.

Five UI surfaces share `EditorCanvas` + `editorAttachments` for inline
attachment insertion. Comment display renders the Lexical state via
`@lobehub/editor/renderer`'s `LexicalRenderer` so image sizes round-
trip without the EditorCanvas hydration flash.

DB schema (`tasks.editor_data jsonb` column) landed separately via
#15280.

Fixes LOBE-8967

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(task): correct fileId prefix + accept nodes without status

Real-world editor_data exposed two bugs in the regex-based extract:

1. `fileId` prefix was wrong — the regex looked for `fle_…` but
   `idGenerator('files')` actually produces `file_…`, so every proxy
   URL `/f/file_…` silently failed to match.
2. `@lobehub/editor`'s `extractMediaFromEditorState` requires
   `status === 'uploaded'` strictly. Editor data from the cloud upload
   path and from historical inserts omits the `status` field entirely,
   so the upstream helper silently dropped everything. Walk the tree
   ourselves and treat a missing `status` as uploaded.

Verified against real `tasks.editor_data` rows: T-6 (proxy URL form)
now extracts `file_…` correctly. T-8 (cloud R2 signed URL form) still
returns `[]` — that requires either aligning cloud's `createFile` to
return the proxy URL or adding a DB-fallback resolver, tracked as a
follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(task): resolve fileIds from pre-signed editor URLs via files.url lookup

Root cause: `fileService.getFileAccessUrl()` returns different URL forms
depending on the environment:

- prod / non-dev → `getFileProxyUrl(fileId)` = `${APP_URL}/f/{fileId}`
- dev → `getFullFileUrl(file.url)` = a pre-signed R2/S3 URL

The dev branch is intentional so remote model providers can fetch the
file directly (proxy URLs point to localhost and aren't reachable). But
the pre-signed URL doesn't contain the fileId anywhere, so our regex
extract silently returned [] for every local upload — agent never saw
any attached image.

Same shape happens for historical cloud data where the editor stored
pre-signed URLs.

Fix: make `extractFileIdsFromEditorData` async and take a `{ db, userId }`
context. Fast path stays the proxy-URL regex; URLs that don't match fall
back to a single batched `SELECT id FROM files WHERE user_id = ? AND url
IN (…)` keyed on the storage path extracted from each URL's pathname.

Verified against real local data:

  T-6 (proxy URL form)         → file_2vFD2sdzW9VO   (regex fast path)
  T-8 (pre-signed R2 URL)      → file_cAQ4naT8G8r5   (DB fallback)
  T-9 (pre-signed R2 URL × 2)  → file_…, file_…      (DB fallback)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(task): dedupe fileIds by storage key in DB fallback

Same bytes re-uploaded by the same user produce multiple `files` rows
with identical `url` + `file_hash`. The DB fallback in
`extractFileIdsFromEditorData` was returning every matching row, so a
task with one inline image but three historical upload attempts fed
the agent three copies of the same image — wasteful multimodal tokens
and noisy provider input.

Group results by `files.url` and keep the first row per key. Verified
against real local data:

  T-6  (1 img, 1 upload)              → 1 fileId
  T-8  (1 img, 1 upload)              → 1 fileId
  T-9  (1 img, 2 dup uploads)         → 1 fileId (was 2)
  T-10 (1 img, 3 dup uploads)         → 1 fileId (was 3)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style(editor): render inline file nodes as block-level cards

The default @lobehub/editor `ReactFile` decorator paints file attachments
as a tiny inline pill (icon + filename in monospace, inline-block with
0.4em padding), so a single PDF on its own line looked cramped and
hugged the surrounding text.

Override the upstream styling via the `className` prop the plugin
already exposes: full-width flex row, 10px gap, 14px padding,
`borderRadiusLG` corner, subtle hover, primary tint on `.selected`.
Aligns the editor's file attachment row with the Linear attachment
card look — and with the LexicalRenderer card the comment thread
already uses, so the same file looks consistent across surfaces.

The upstream component still only renders icon + name (no size), but
the layout change is the main UX win.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 💄 style(editor): Linear-style file card with hover download

Replace the upstream inline pill FileNode UI with a full-width card
(icon + name + size + hover-revealed download button) wired in both the
live editor and the read-only LexicalRenderer for saved comments.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* 🐛 fix(editor): use existing editor:file.* keys for file card states

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-01 00:34:18 +08:00
..