diff --git a/docs/usage/agent/claude-code.mdx b/docs/usage/agent/claude-code.mdx new file mode 100644 index 0000000000..a05dd2e9a5 --- /dev/null +++ b/docs/usage/agent/claude-code.mdx @@ -0,0 +1,123 @@ +--- +title: Claude Code +description: >- + Delegate Anthropic's Claude Code inside LobeHub — chat with the Claude Code + CLI from your desktop app, watch tasks, todos, skills, and tool calls stream + in real time, and resume sessions across turns. +tags: + - LobeHub + - Claude Code + - Coding Agent + - Desktop + - CLI + - Anthropic +--- + +# Claude Code + +Claude Code is Anthropic's coding agent that reads, writes, and runs code from your terminal. In LobeHub, you can delegate Claude Code from the desktop app — keep the chat UX you already use, while Claude Code does the work locally with full access to your project. + +Send a prompt and Claude Code reads files, makes edits, runs commands, and reports back. Tasks, todos, skills, and tool calls stream into the chat as the agent moves; sessions resume across turns so a long task can span many messages. + +## What Is Claude Code in LobeHub? + +A bridge between LobeHub's chat UI and the Claude Code CLI running on your machine. LobeHub spawns the `claude` command as a local subprocess, streams its events into a chat conversation, and renders Claude Code's output — partial messages, tasks, todos, skills, sub-agent threads — as first-class chat blocks. You drive the agent in natural language; Claude Code executes locally with your environment, credentials, and project context. + +## Requirements + +- **LobeHub desktop app** — Claude Code agents only work in the desktop build. The web app cannot spawn local processes. +- **Claude Code CLI installed** — the `claude` command must be available on your `PATH`. +- **Signed in** — you must run `claude` once in a terminal to authenticate before LobeHub can drive it. Requires an Anthropic account. 
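If the CLI may already be installed, the second requirement above can be checked from a terminal before opening LobeHub. The snippet below is a minimal sketch that relies only on the standard `command -v` shell builtin. The `check_cli` helper name and its messages are illustrative assumptions, not part of LobeHub or the Claude Code CLI, and the check says nothing about whether you are signed in.

```bash
# Hypothetical helper: report whether a CLI is reachable on PATH (POSIX sh).
check_cli() {
  if command -v "$1" >/dev/null 2>&1; then
    # Print the resolved location so you can see which install is picked up.
    echo "$1 found at: $(command -v "$1")"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# Check for the `claude` command that LobeHub expects to find.
check_cli claude || echo "Install the Claude Code CLI, then retry."
```

A shell finding the command does not guarantee that a desktop app launched outside that shell sees the same `PATH`, so treat a positive result as a hint rather than proof.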
+ +## Install the Claude Code CLI + +Pick one of the install paths: + +**Recommended (install script)** + +```bash +curl -fsSL https://claude.ai/install.sh | bash +``` + +**Homebrew (macOS)** + +```bash +brew install --cask claude-code +``` + +After installing, run `claude` once in a terminal to sign in. See the [Claude Code setup guide](https://docs.anthropic.com/en/docs/claude-code/setup) for details. + +If LobeHub can't find the CLI, it shows an **Install Claude Code CLI** prompt with the same commands and an **Open System Tools** button — click it after installing to re-detect the CLI. + +## Add Claude Code in LobeHub + +When LobeHub detects the Claude Code CLI on your machine, an **Add Claude Code** recommendation card appears on the home page tagged "Coding Agent". Click it to create a Claude Code agent in one step. + +You can also create one manually from the **Create Agent** menu and pick **Claude Code** as the type. + +Each agent is independent, so you can keep multiple Claude Code agents pinned to different projects or workflows. + +## Working Directory + +Every Claude Code session is pinned to a working directory — the folder Claude Code sees as the project root. Set it from the chat input bar before sending your first message. Switching mid-conversation triggers a **Switch working directory?** confirmation: chat messages stay, but the previous session context cannot be resumed and a new session starts for this topic. + +If you change folders and the saved Claude Code session can't be resumed, LobeHub shows: *"Working directory changed. Previous Claude Code session can only be resumed from its original directory, so a new conversation has started."* + +Inside the working directory, Claude Code runs with **Full access** — read and write to anything in the folder. Switching permission modes from inside LobeHub is not yet supported. 
+ +## What Gets Rendered in Chat + +LobeHub renders Claude Code's tool calls and structured output with purpose-built blocks instead of raw JSON: + +**Tasks** — When Claude Code uses its task manager, tasks render as a live progress card. Watch items move through pending → in-progress → completed as Claude Code works. + +**Todos** — `TodoWrite` plans render as a progress card with completion counts and check states. Useful for tracking multi-step work. + +**Skills** — When Claude Code invokes a built-in or user-installed skill, the call appears in a Skill block showing inputs, outputs, and any artifacts. + +**Tool calls** — Reads, edits, shell runs, web fetches, and other tool uses get their own block in the conversation. Streamed partial output appears as Claude Code generates it. + +**Sub-agents** — Claude Code can spawn sub-agents to handle parallel or scoped work. Their threads render in isolation inside the conversation without leaking into the main bubble. + +**Interventions** — When Claude Code needs to ask you something mid-run, it shows a prompt inline so you can answer without leaving the chat. + +## Sessions and Resume + +Claude Code sessions persist across messages in the same topic. LobeHub captures the underlying session ID and reuses it on every follow-up, so you can pick up a long-running task at any point. + +A session can't be resumed if: + +- The working directory changed since the session was created +- The Claude Code CLI returns a resume error (session no longer exists, credentials expired, etc.) + +In either case, LobeHub starts a fresh conversation automatically. + +## Where It Can Run + +The **Execution Device** selector lets you pick where the Claude Code agent runs: + +- **This device** — runs Claude Code as a local process inside the desktop app. Default. +- **Cloud sandbox** — runs Claude Code in an ephemeral cloud sandbox. Useful when you don't want the agent touching your local filesystem. 
+- **Remote device** — drives a remote machine you've connected with `lh connect`. Useful when the project lives on a different machine. + +## Limitations + +- **Desktop only** — the Claude Code agent runs in the LobeHub desktop app. The web app cannot spawn the CLI. +- **One sign-in per machine** — Claude Code shares its authentication with the global CLI. If `claude` works in your terminal, it works in LobeHub. +- **Working-directory-bound** — sessions don't follow you across folders or machines. +- **Full access only** — switching permission modes from inside LobeHub is not yet supported. + +## Tips + +- **Run `claude` once in a terminal first** — sign-in happens at the CLI level, not in LobeHub. +- **Pick the working directory before your first message** — switching it later starts a new session. +- **Use one Claude Code agent per project** — pinning each agent to a specific repo keeps sessions tidy and resumable. +- **Watch the task card** — when Claude Code uses its task manager, the card is the fastest read on what's done, what's running, and what's queued. + + + + + + + + diff --git a/docs/usage/agent/claude-code.zh-CN.mdx b/docs/usage/agent/claude-code.zh-CN.mdx new file mode 100644 index 0000000000..05429a1d5d --- /dev/null +++ b/docs/usage/agent/claude-code.zh-CN.mdx @@ -0,0 +1,120 @@ +--- +title: Claude Code +description: 在 LobeHub 中委派 Anthropic Claude Code —— 通过桌面应用与 Claude Code CLI 对话,实时查看任务、待办、技能与工具调用,并跨轮次恢复会话。 +tags: + - LobeHub + - Claude Code + - 编程助理 + - 桌面端 + - CLI + - Anthropic +--- + +# Claude Code + +Claude Code 是 Anthropic 推出的编程助理,能在终端中读取、修改、运行代码。在 LobeHub 中,你可以通过桌面应用委派 Claude Code —— 保留熟悉的对话界面,让 Claude Code 在本地完成实际工作,并完整访问你的项目。 + +发送一条提示,Claude Code 会读取文件、修改代码、运行命令,并把过程反馈给你。任务、待办、技能与工具调用会随着助理推进实时进入聊天;会话能跨轮次恢复,一项长任务可以分布在多条消息中持续推进。 + +## 什么是 LobeHub 中的 Claude Code? 
+ +它是 LobeHub 对话界面与本地 Claude Code CLI 之间的桥梁。LobeHub 在本地以子进程形式启动 `claude` 命令,把它的事件流接入聊天会话,并将 Claude Code 的输出 —— 增量消息、任务、待办、技能、子助理线程 —— 渲染为一等公民的聊天块。你用自然语言指挥助理,Claude Code 在本地用你的环境、凭据与项目上下文执行。 + +## 使用条件 + +- **LobeHub 桌面应用** —— Claude Code 助理只在桌面版可用,Web 端无法启动本地进程。 +- **已安装 Claude Code CLI** —— `claude` 命令需要在你的 `PATH` 中可用。 +- **已登录** —— 在 LobeHub 调用前,需在终端中先运行一次 `claude` 完成认证,需要 Anthropic 账号。 + +## 安装 Claude Code CLI + +任选一种方式: + +**推荐(安装脚本)** + +```bash +curl -fsSL https://claude.ai/install.sh | bash +``` + +**Homebrew(macOS)** + +```bash +brew install --cask claude-code +``` + +安装完成后,在终端中运行一次 `claude` 完成登录。详情见 [Claude Code 安装指南](https://docs.anthropic.com/en/docs/claude-code/setup)。 + +若 LobeHub 未能检测到 CLI,会弹出**安装 Claude Code CLI** 引导,并提供**打开系统工具**按钮 —— 安装完成后点击即可重新检测。 + +## 在 LobeHub 中添加 Claude Code + +当 LobeHub 检测到本机已安装 Claude Code CLI,首页会出现一张标记为「编程助理」的**添加 Claude Code** 推荐卡片,点击即可一步创建 Claude Code 助理。 + +你也可以手动创建:从**创建助理**菜单中选择 **Claude Code** 类型即可。 + +每个助理彼此独立,可以分别绑定到不同的项目或工作流。 + +## 工作目录 + +每个 Claude Code 会话都绑定一个工作目录 —— 即 Claude Code 视为项目根的文件夹。在发出第一条消息前,先在聊天输入区域设置工作目录。会话进行中切换目录会触发**切换工作目录?**确认:聊天记录会保留,但旧会话的上下文无法恢复,将为该话题开启新的会话。 + +如果切换目录后,已保存的 Claude Code 会话无法恢复,LobeHub 会提示:**「工作目录已更改。之前的 Claude Code 会话只能在原始目录下恢复,已开启新的对话。」** + +在工作目录内,Claude Code 以**完全访问**权限运行 —— 可对文件夹内任何文件进行读写。LobeHub 内部暂不支持切换权限模式。 + +## 聊天中会渲染什么 + +LobeHub 不会把 Claude Code 的工具调用渲染成原始 JSON,而是用专用区块呈现: + +**任务** —— Claude Code 使用任务管理器时,任务会渲染为实时进度卡片。可以看到条目在「待办 → 进行中 → 已完成」之间流转。 + +**待办** —— `TodoWrite` 计划会渲染为进度卡片,展示完成数量与勾选状态。适合追踪多步骤工作。 + +**技能** —— Claude Code 调用内置或用户安装的技能时,调用会呈现为 Skill 区块,展示输入、输出与产物。 + +**工具调用** —— 文件读取、编辑、命令执行、网页抓取等工具使用都会在对话中拥有独立区块,并随 Claude Code 输出实时增量展示。 + +**子助理** —— Claude Code 可以派生子助理处理并行或局部任务。它们的线程在会话中以独立线程呈现,不会污染主对话气泡。 + +**询问** —— 当 Claude Code 需要在过程中向你提问时,会在聊天中内联呈现,让你无需离开对话即可回答。 + +## 会话与恢复 + +Claude Code 会话在同一话题中跨消息持续。LobeHub 会捕获底层 session ID 并在每次追问时复用,因此你可以随时回到长任务的任意进度点继续。 + +下列情况下,会话无法恢复: + +- 自会话创建以来工作目录被更改 +- Claude Code CLI 返回恢复错误(会话已不存在、凭据过期等) + +任一情况发生时,LobeHub 
都会自动开启一段新会话。 + +## 它在哪里运行 + +**执行设备**选择器让你决定 Claude Code 助理在哪里运行: + +- **本机** —— Claude Code 在桌面应用内作为本地进程运行,默认选项。 +- **云沙箱** —— Claude Code 在临时云沙箱中运行。当你不希望助理触碰本地文件时适用。 +- **远程设备** —— 驱动你通过 `lh connect` 接入的另一台机器。当项目位于另一台设备上时适用。 + +## 限制 + +- **仅桌面端** —— Claude Code 助理只在 LobeHub 桌面应用中可用,Web 端无法启动 CLI。 +- **每台机器一次登录** —— Claude Code 与全局 CLI 共享认证。终端里 `claude` 能用,LobeHub 里就能用。 +- **绑定工作目录** —— 会话不会跨文件夹或机器跟随你。 +- **仅支持完全访问** —— LobeHub 内部暂不支持切换权限模式。 + +## 使用技巧 + +- **先在终端中运行一次 `claude`** —— 登录在 CLI 层面完成,不在 LobeHub 里。 +- **第一条消息前先选好工作目录** —— 之后切换会开启新会话。 +- **一个项目用一个 Claude Code 助理** —— 每个助理绑定一个仓库,会话更整洁也更容易恢复。 +- **多关注任务卡片** —— Claude Code 使用任务管理器时,这张卡片是了解「已完成、进行中、待办」的最快方式。 + + + + + + + + diff --git a/docs/usage/agent/codex.mdx b/docs/usage/agent/codex.mdx new file mode 100644 index 0000000000..cc40ee7aa9 --- /dev/null +++ b/docs/usage/agent/codex.mdx @@ -0,0 +1,117 @@ +--- +title: Codex +description: >- + Delegate OpenAI Codex inside LobeHub — chat with the Codex CLI from your + desktop app, watch file changes, todos, and command output stream in real + time, and resume sessions across turns. +tags: + - LobeHub + - Codex + - Coding Agent + - Desktop + - CLI + - OpenAI +--- + +# Codex + +Codex is OpenAI's coding agent that edits files, runs commands, and ships changes from your terminal. In LobeHub, you can delegate Codex from the desktop app — keep the chat UX you already use, while Codex does the work locally with full access to your project. + +Send a prompt and Codex opens files, makes edits, runs tests, and reports back. File changes, todos, and command output stream into the chat as the agent moves; sessions resume across turns so a long task can span many messages. + +## What Is Codex in LobeHub? + +A bridge between LobeHub's chat UI and the Codex CLI running on your machine. 
LobeHub spawns the Codex CLI as a local subprocess, streams its events into a chat conversation, and renders Codex's tool output — file changes, todo lists, command runs — as first-class chat blocks. You drive the agent in natural language; Codex executes locally with your environment, credentials, and project context. + +## Requirements + +- **LobeHub desktop app** — Codex agents only work in the desktop build. The web app cannot spawn local processes. +- **Codex CLI installed** — the `codex` command must be available on your `PATH`. +- **Signed in** — you must run `codex` once in a terminal to authenticate before LobeHub can drive it. + +## Install the Codex CLI + +Pick one of the install paths: + +**Recommended (npm)** + +```bash +npm install -g @openai/codex +``` + +**Homebrew (macOS)** + +```bash +brew install --cask codex +``` + +After installing, run `codex` once in a terminal to sign in. See the [Codex installation guide](https://github.com/openai/codex#installing-and-running-codex-cli) for details. + +If LobeHub can't find the CLI, it shows an **Install Codex CLI** prompt with the same commands and an **Open System Tools** button — click it after installing to re-detect the CLI. + +## Add Codex in LobeHub + +When LobeHub detects the Codex CLI on your machine, an **Add Codex** recommendation card appears on the home page tagged "Coding Agent". Click it to create a Codex agent in one step. + +You can also create one manually from the **Create Agent** menu and pick **Codex** as the type. + +Each agent is independent, so you can keep multiple Codex agents pinned to different projects or workflows. + +## Working Directory + +Every Codex session is pinned to a working directory — the folder Codex sees as the project root. Set it from the chat input bar before sending your first message. Switching the working directory mid-conversation starts a new Codex session for the topic; chat history stays, but the previous session context cannot be resumed. 
+ +If you change folders and the saved Codex thread can't be resumed safely, LobeHub shows: *"The saved Codex thread could not be resumed safely, so a new conversation has started for this topic."* + +## What Gets Rendered in Chat + +LobeHub renders Codex's tool calls with purpose-built blocks instead of raw JSON: + +**File changes** — Codex's edits show up as an expandable list with the operation kind (added, deleted, modified, renamed), the file path, and a per-file line count delta (+/−). Click to see what changed. + +**Todo lists** — When Codex plans a multi-step task, the plan renders as a progress card with completed / in-progress / pending items and a running count (e.g. "3/5 completed"). Watch tasks tick off as Codex finishes them. + +**Command execution** — Shell commands Codex runs show the command, exit code, and stdout / stderr output. Success and failure states are clearly marked. + +**Subagents** — Codex can spawn subagents to work in parallel. Their work appears in isolated threads inside the conversation without leaking into the main bubble. + +## Sessions and Resume + +Codex sessions persist across messages in the same topic. You can send a follow-up like "now also update the tests" and Codex picks up where it left off — same files, same context, same plan. + +A session can't be resumed if: + +- The working directory changed since the saved thread was created +- The original Codex thread no longer exists +- The CLI returns a "no conversation found" or "thread not found" error + +In any of these cases, LobeHub starts a fresh conversation automatically. + +## Where It Can Run + +The **Execution Device** selector lets you pick where the Codex agent runs: + +- **This device** — runs Codex as a local process inside the desktop app. Default. +- **Cloud sandbox** — runs Codex in an ephemeral cloud sandbox. Useful when you don't want the agent touching your local filesystem. +- **Remote device** — drives a remote machine you've connected with `lh connect`. 
Useful when the project lives on a different machine. + +## Limitations + +- **Desktop only** — the Codex agent runs in the LobeHub desktop app. The web app cannot spawn the CLI. +- **One sign-in per machine** — Codex shares its authentication with the global CLI. If `codex` works in your terminal, it works in LobeHub. +- **Working-directory-bound** — sessions don't follow you across folders or machines. + +## Tips + +- **Run `codex` once in a terminal first** — sign-in happens at the CLI level, not in LobeHub. +- **Pick the working directory before your first message** — switching it later starts a new session. +- **Watch the todo card** — it's the fastest read on what Codex thinks it still has to do. +- **Use one Codex agent per project** — pinning each agent to a specific repo keeps sessions tidy and resumable. + + + + + + + + diff --git a/docs/usage/agent/codex.zh-CN.mdx b/docs/usage/agent/codex.zh-CN.mdx new file mode 100644 index 0000000000..f02eb805ce --- /dev/null +++ b/docs/usage/agent/codex.zh-CN.mdx @@ -0,0 +1,114 @@ +--- +title: Codex +description: 在 LobeHub 中委派 OpenAI Codex —— 通过桌面应用与 Codex CLI 对话,实时查看文件变更、待办与命令输出,并跨轮次恢复会话。 +tags: + - LobeHub + - Codex + - 编程助理 + - 桌面端 + - CLI + - OpenAI +--- + +# Codex + +Codex 是 OpenAI 推出的编程助理,能在终端中编辑文件、运行命令、提交改动。在 LobeHub 中,你可以通过桌面应用委派 Codex —— 保留熟悉的对话界面,让 Codex 在本地完成实际工作,并完整访问你的项目。 + +发送一条提示,Codex 会打开文件、修改代码、运行测试,并把过程反馈给你。文件变更、待办列表、命令输出会随着助理推进实时进入聊天;会话能跨轮次恢复,一项长任务可以分布在多条消息中持续推进。 + +## 什么是 LobeHub 中的 Codex? 
+ +它是 LobeHub 对话界面与本地 Codex CLI 之间的桥梁。LobeHub 在本地以子进程形式启动 Codex CLI,把它的事件流接入聊天会话,并将 Codex 的工具输出 —— 文件变更、待办列表、命令执行 —— 渲染为一等公民的聊天块。你用自然语言指挥助理,Codex 在本地用你的环境、凭据与项目上下文执行。 + +## 使用条件 + +- **LobeHub 桌面应用** —— Codex 助理只在桌面版可用,Web 端无法启动本地进程。 +- **已安装 Codex CLI** —— `codex` 命令需要在你的 `PATH` 中可用。 +- **已登录** —— 在 LobeHub 调用前,需在终端中先运行一次 `codex` 完成认证。 + +## 安装 Codex CLI + +任选一种方式: + +**推荐(npm)** + +```bash +npm install -g @openai/codex +``` + +**Homebrew(macOS)** + +```bash +brew install --cask codex +``` + +安装完成后,在终端中运行一次 `codex` 完成登录。详情见 [Codex 安装指南](https://github.com/openai/codex#installing-and-running-codex-cli)。 + +若 LobeHub 未能检测到 CLI,会弹出**安装 Codex CLI** 引导,并提供**打开系统工具**按钮 —— 安装完成后点击即可重新检测。 + +## 在 LobeHub 中添加 Codex + +当 LobeHub 检测到本机已安装 Codex CLI,首页会出现一张标记为「编程助理」的**添加 Codex** 推荐卡片,点击即可一步创建 Codex 助理。 + +你也可以手动创建:从**创建助理**菜单中选择 **Codex** 类型即可。 + +每个助理彼此独立,可以分别绑定到不同的项目或工作流。 + +## 工作目录 + +每个 Codex 会话都绑定一个工作目录 —— 即 Codex 视为项目根的文件夹。在发出第一条消息前,先在聊天输入区域设置工作目录。会话进行中切换目录会为该话题开启一个新的 Codex 会话;聊天记录会保留,但旧会话的上下文无法恢复。 + +如果切换目录后,已保存的 Codex 线程无法安全恢复,LobeHub 会提示:**「已保存的 Codex 线程无法安全恢复,已为该话题开启新的会话。」** + +## 聊天中会渲染什么 + +LobeHub 不会把 Codex 的工具调用渲染成原始 JSON,而是用专用区块呈现: + +**文件变更** —— Codex 对文件的修改会展示为可展开的列表,包含操作类型(新增、删除、修改、重命名)、文件路径,以及每个文件的行数变化(+/−)。点击可查看改动详情。 + +**待办列表** —— Codex 规划多步任务时,待办会渲染为进度卡片,列出已完成、进行中和待办项,并显示完成进度(如「3/5 已完成」)。Codex 完成任务时,待办会自动勾选。 + +**命令执行** —— Codex 运行的 shell 命令会显示命令本身、退出码以及 stdout / stderr 输出。成功与失败状态一目了然。 + +**子助理** —— Codex 可以派生子助理并行工作。它们的输出在会话中以独立线程呈现,不会污染主对话气泡。 + +## 会话与恢复 + +Codex 会话在同一话题中跨消息持续。你可以发出追问,例如「顺便也更新一下测试」,Codex 会接着上一次的进度继续 —— 同样的文件、同样的上下文、同样的计划。 + +下列情况下,会话无法恢复: + +- 自上次保存以来工作目录被更改 +- 原始 Codex 线程已不存在 +- CLI 报错「no conversation found」或「thread not found」 + +任一情况发生时,LobeHub 都会自动开启一段新会话。 + +## 它在哪里运行 + +**执行设备**选择器让你决定 Codex 助理在哪里运行: + +- **本机** —— Codex 在桌面应用内作为本地进程运行,默认选项。 +- **云沙箱** —— Codex 在临时云沙箱中运行。当你不希望助理触碰本地文件时适用。 +- **远程设备** —— 驱动你通过 `lh connect` 接入的另一台机器。当项目位于另一台设备上时适用。 + +## 限制 + +- **仅桌面端** —— Codex 助理只在 LobeHub 桌面应用中可用,Web 端无法启动 CLI。 +- **每台机器一次登录** 
—— Codex 与全局 CLI 共享认证。终端里 `codex` 能用,LobeHub 里就能用。 +- **绑定工作目录** —— 会话不会跨文件夹或机器跟随你。 + +## 使用技巧 + +- **先在终端中运行一次 `codex`** —— 登录在 CLI 层面完成,不在 LobeHub 里。 +- **第一条消息前先选好工作目录** —— 之后切换会开启新会话。 +- **多关注待办卡片** —— 这是了解 Codex 还剩什么任务的最快方式。 +- **一个项目用一个 Codex 助理** —— 每个助理绑定一个仓库,会话更整洁也更容易恢复。 + + + + + + + + diff --git a/docs/usage/getting-started/generation.mdx b/docs/usage/getting-started/generation.mdx new file mode 100644 index 0000000000..8e55d12fb7 --- /dev/null +++ b/docs/usage/getting-started/generation.mdx @@ -0,0 +1,212 @@ +--- +title: Image & Video Generation +description: >- + Create high-quality images and videos from text descriptions using AI models + like DALL-E 3, Flux, Sora, Veo, Kling, and more. Learn how to write effective + prompts, choose the right model, and configure parameters for each medium. +tags: + - LobeHub + - Image Generation + - Video Generation + - AI Drawing + - AI Video + - DALL-E + - Sora + - Veo + - Kling + - Text to Image + - Text to Video + - Prompt Writing +--- + +# Image & Video Generation + +Describe what you want — LobeHub turns text into images and videos. Product prototypes, design inspiration, illustrations, motion concepts, short clips, or creative exploration: choose a model, set your parameters, and generate in seconds. All output lands in your generation feed and can be downloaded or saved to your Resource Library. + +LobeHub ships two parallel workspaces — **Image** and **Video** — built on the same generation pipeline but tuned for each medium. + +## Get Started + +From the LobeHub sidebar: + +- Click **Image** (the picture icon) to open the image generation workspace at `/image`. +- Click **Video** (the video icon) to open the video generation workspace at `/video`. + +Each workspace has the same three-pane layout: prompt input, configuration panel, and a generation feed for past results. + +## Image Generation + +### Enter a Prompt + +Describe the image you want in the input box. 
The more specific your description, the more accurate the result. + +**Effective prompt structure:** + +``` +[Subject] [Style/Medium] [Setting/Background] [Lighting] [Mood] [Technical details] +``` + +Examples: + +``` +"A futuristic city skyline at sunset, digital art, cyberpunk style, neon lights reflecting on wet streets, cinematic lighting, 4K detail" + +"A cozy coffee shop interior, watercolor illustration, warm golden light streaming through windows, potted plants on windowsills, soft and inviting atmosphere" + +"A product photo of a minimalist leather wallet on a clean white background, studio lighting, sharp focus, commercial photography style" +``` + +**Prompt tips:** + +- **Be specific about style** — "oil painting", "watercolor", "digital art", "photorealistic", "anime", "vector illustration" +- **Describe lighting** — "dramatic shadows", "soft diffused light", "golden hour", "studio lighting" +- **Specify composition** — "portrait view", "wide angle", "close-up", "bird's eye view" +- **Add quality modifiers** — "high detail", "4K", "sharp focus", "professional quality" +- **Avoid vagueness** — "beautiful", "nice", "good" add little — describe what you actually want + +### Choose an AI Model + +LobeHub offers multiple AI image generation models. 
Different models have different strengths: + +![Choose a Model](/blog/assetsdd913561927c64d32bd390cee6846f9a.webp) + +| Model | Best For | +| -------------------- | ------------------------------------------------------------- | +| **DALL-E 3** | Realistic photos, illustrations, following prompts accurately | +| **GPT Image** | High-fidelity edits, text rendering inside images | +| **Flux** | Artistic styles, creative images, fast generation | +| **Stable Diffusion** | Highly customizable, community styles and fine-tuned models | +| **Gemini Imagen** | Photoreal scenes, strong global composition | +| **fal.ai models** | Various specialized styles and fast generation | + +Try different models with the same prompt to see which gives the best results for your use case. + +### Reference Images (Optional) + +If you have reference images, upload them to guide the generation process. Click the upload button or drag and drop your reference images directly. You can upload multiple reference images depending on the model. + +![Upload Reference Images for Image Generation](/blog/assets3c160860feef0bd7c653eeb46f683445.webp) + +Reference images help the model understand your desired style, composition, or color palette — and many models also support reference-based **edits** (e.g. swap the background, change the outfit) when you describe the change in the prompt. + +### Configure Generation Parameters + +The right-hand config panel exposes everything the selected model supports. Common controls: + +- **Aspect Ratio** — `1:1`, `16:9`, `9:16`, `4:3`, `3:2`. Lock or unlock to free-form size. +- **Size / Resolution** — pick a preset (`512px`, `1K`, `2K`, `4K`) or set width × height directly. +- **Number of Images** — generate 1–4 variations per run. +- **Quality** — Standard or High Definition (model-dependent). +- **Seed** — leave random for variety, or paste a fixed seed to reproduce a previous result. 
+- **Steps / Guidance Intensity (CFG)** — fine-tune the speed-vs-quality and prompt-adherence tradeoffs. +- **Watermark** — toggle on/off where supported. +- **Web Search** / **Prompt Extend** — let an LLM enrich your prompt with current references before generation. + +**Aspect ratio cheatsheet:** + +- **1:1** — Social media posts, profile pictures +- **16:9** — Widescreen, presentations, banners +- **9:16** — Mobile screens, stories, reels +- **4:3** — General use, older display formats +- **3:2** — Photography standard, prints + +### View and Download Images + +Once generated, images appear in the generation feed. You can: + +- Preview any image at full size by clicking it +- Download, copy the seed, copy the prompt, or reuse the full settings on a new run +- Delete a single image or the whole batch + +![Generated Images in Asset Library](/blog/assets974acc551878f2f395518a3fbb9bd924.webp) + +## Video Generation + +The Video workspace mirrors Image — same prompt-first flow, same config panel, same feed — but with controls tuned for motion. + +### Enter a Prompt + +Describe the **scene, motion, and camera**, not just the subject. Models reward verbs and shot language. 
+ +``` +"A red fox trotting through fresh snow at golden hour, breath visible in the cold air, slow tracking shot, cinematic" + +"An astronaut floating into a colorful nebula, slow dolly-in, dreamy atmosphere, soft volumetric light" + +"A cup of coffee being poured in macro slow motion, steam rising, shallow depth of field, commercial product shot" +``` + +**Prompt tips for video:** + +- **Describe motion explicitly** — "slow tracking shot", "dolly-in", "handheld", "static wide", "pan left" +- **Set a time progression** — "starts misty then clears", "the door slowly opens" +- **Reference cinematography** — "shallow depth of field", "anamorphic lens flare", "golden hour" +- **Keep it focused** — one main action per clip works better than several + +### Choose an AI Model + +LobeHub integrates the major text-to-video and image-to-video providers: + +| Model | Best For | +| ------------------------------ | ------------------------------------------------------------ | +| **OpenAI Sora 2 / Sora 2 Pro** | Coherent multi-second clips, strong scene understanding | +| **Google Veo 3 / 3.1** | Photoreal motion, native audio generation, cinematic look | +| **Kling V3** | High-motion fidelity, image-to-video and omni-video | +| **MiniMax Hailuo 2.3** | Fast text-to-video, expressive characters | +| **Qwen / Wan** | Text-to-video with strong Chinese prompt understanding | +| **fal.ai models** | Specialised models, fast turnaround | + +Different models support different parameter sets — switching models updates the config panel automatically. + +### Start & End Frames (Optional) + +Many video models support image conditioning: + +- **Start Frame** — upload an image to use as the first frame of the clip. Great for animating a still you generated in the Image workspace. +- **End Frame** — upload an image to land on as the final frame. Requires a start frame. + +When a start frame is set, the prompt placeholder shifts to "Describe the scene you want to generate with the image". 
+ +### Configure Generation Parameters + +Controls vary by model, but typically include: + +- **Duration** — clip length in seconds (model-dependent, e.g. 4s / 6s / 8s). +- **Aspect Ratio** — `16:9`, `9:16`, `1:1`, `4:3`, `3:4`, `21:9`. +- **Resolution** — `480p`, `720p`, `1080p`. +- **Fixed Camera** — lock the camera in place instead of letting the model animate it. +- **Generate Audio** — produce a synced soundtrack alongside the video (model-dependent, e.g. Veo). +- **Seed** — random or fixed for reproducibility. +- **Watermark** — toggle on/off where supported. +- **Web Search** / **Prompt Extend** — same LLM-assisted prompt enrichment as the image flow. + +### View and Download Videos + +Generated clips appear in the feed and play inline. You can: + +- Play, pause, and scrub through the clip +- Download the video +- Copy the error message to clipboard if a generation fails +- Delete a single clip or the whole batch + +A "🎁 N free videos today" badge shows your remaining free quota; once it's used up, credits are consumed per generation. + +## Tips for Better Results + +**Iterate on prompts** — If the first result isn't quite right, adjust one element at a time rather than rewriting the whole prompt. Add more detail, change the style descriptor, or specify what you don't want. + +**Use a reference image or start frame** — Uploading a reference helps the model match your intended style, color palette, composition, or — for video — your opening shot. + +**Try multiple variations** — Generate several images per run, or re-generate videos with the same seed and a tweaked prompt. AI generation has inherent randomness — some variations will be significantly better than others. + +**Match model to task** — Photorealistic models (DALL-E 3, Flux, Imagen) for product photos and realistic scenes; style-focused models for artistic illustrations; Veo or Sora for cinematic motion; Kling or Hailuo for character-heavy clips. 
+ +**Bridge image → video** — Generate a strong still in the Image workspace, then feed it into the Video workspace as a start frame to animate it. + + + + + + + + diff --git a/docs/usage/getting-started/generation.zh-CN.mdx b/docs/usage/getting-started/generation.zh-CN.mdx new file mode 100644 index 0000000000..ce7cf26fb6 --- /dev/null +++ b/docs/usage/getting-started/generation.zh-CN.mdx @@ -0,0 +1,209 @@ +--- +title: 图像与视频生成 +description: 使用 DALL-E 3、Flux、Sora、Veo、Kling 等 AI 模型,通过文字描述生成高质量图像和视频。学习如何编写有效的提示词、选择合适的模型,并配置每种媒介的参数。 +tags: + - LobeHub + - 图像生成 + - 视频生成 + - AI 画图 + - AI 视频 + - DALL-E + - Sora + - Veo + - Kling + - 文字生成图像 + - 文字生成视频 + - 提示词写作 +--- + +# 图像与视频生成 + +用文字描述你想要的内容 ——LobeHub 帮你把想法变成图像和视频。产品原型、设计灵感、插图配图、动态概念、短片创作、创意探索:选择模型、设置参数,几秒钟内获得结果。所有生成内容都会出现在生成流中,可以下载或保存到你的资源库。 + +LobeHub 提供两个并行的工作区 ——**图像**与**视频**——基于同一套生成管线,但针对各自的媒介进行了优化。 + +## 开始生成 + +在 LobeHub 侧边栏: + +- 点击**图像**(图片图标)进入 `/image` 的图像生成工作区。 +- 点击**视频**(视频图标)进入 `/video` 的视频生成工作区。 + +两个工作区采用相同的三栏布局:提示词输入、配置面板、历史生成流。 + +## 图像生成 + +### 输入提示词 + +在输入框中描述你想要的图像。描述越具体,结果越符合预期。 + +**有效的提示词结构:** + +``` +[主体] [风格/媒介] [场景/背景] [光线] [氛围] [技术细节] +``` + +示例: + +``` +"赛博朋克风格的未来城市天际线,日落时分,霓虹灯在湿润街道上的倒影,数字艺术,电影级光线,4K 细节" + +"温馨咖啡馆室内,水彩插画风格,阳光透过窗户洒入,窗台上摆放绿植,柔和温暖的氛围" + +"极简皮革钱包产品照,白色干净背景,棚拍灯光,对焦清晰,商业摄影风格" +``` + +**提示词技巧:** + +- **明确指定风格** — "油画"、"水彩"、"数字艺术"、"照片写实"、"动漫"、"矢量插画" +- **描述光线** — "戏剧性阴影"、"柔和漫射光"、"黄金时段"、"棚拍灯光" +- **指定构图** — "竖拍人像"、"广角"、"特写"、"俯拍鸟瞰" +- **加入质量词** — "高细节"、"4K"、"对焦清晰"、"专业品质" +- **避免模糊描述** — "漂亮"、"好看"、"不错" 对结果帮助有限 —— 要具体描述你真正想要的内容 + +### 选择 AI 模型 + +LobeHub 提供多个 AI 画图模型,不同模型各有所长: + +![选择模型](/blog/assetsdd913561927c64d32bd390cee6846f9a.webp) + +| 模型 | 最适合 | +| -------------------- | ------------------ | +| **DALL-E 3** | 写实照片、插画、精准遵循提示词 | +| **GPT Image** | 高保真编辑、图像内文本渲染 | +| **Flux** | 艺术风格、创意图像、快速生成 | +| **Stable Diffusion** | 高度可定制,支持社区风格和微调模型 | +| **Gemini Imagen** | 真实场景,整体构图能力强 | +| **fal.ai 系列模型** | 多种专业风格,生成速度快 | + +用同一个提示词尝试不同模型,找到最适合你使用场景的。 + +### 参考图片(可选) + 
+如果你有参考图片,可以上传作为生成的参考。点击上传按钮或直接拖入参考图片即可。根据模型不同,可以上传多张参考图片。 + +![上传参考图片](/blog/assets3c160860feef0bd7c653eeb46f683445.webp) + +参考图片有助于模型理解你期望的风格、构图或配色方案 —— 配合提示词描述(例如替换背景、更换服饰),许多模型还支持基于参考图的**编辑**。 + +### 配置生成参数 + +右侧配置面板会展示当前模型支持的全部参数。常见控件: + +- **比例(Aspect Ratio)** — `1:1`、`16:9`、`9:16`、`4:3`、`3:2`。可锁定比例或解锁自由调整。 +- **尺寸 / 分辨率** — 选择预设(`512px`、`1K`、`2K`、`4K`),或直接设定宽 × 高。 +- **生成数量** — 一次生成 1–4 张变体。 +- **质量** — 标准 / 高清(取决于模型)。 +- **Seed(随机种子)** — 随机以获得多样性,或粘贴固定 seed 复现之前的结果。 +- **Steps / 引导强度(CFG)** — 调节速度 vs 质量、提示词遵循程度的权衡。 +- **水印** — 在支持的模型上开启或关闭。 +- **联网搜索** / **提示词扩写** — 让 LLM 在生成前为你的提示词补充最新参考信息。 + +**比例速查:** + +- **1:1** — 社交媒体发帖、头像 +- **16:9** — 宽屏、演示文稿、横幅 +- **9:16** — 手机屏幕、动态、竖屏视频 +- **4:3** — 通用用途、旧显示格式 +- **3:2** — 摄影标准、打印 + +### 查看和下载图片 + +图像生成完成后,会显示在生成流中。你可以: + +- 点击任意图片查看全尺寸预览 +- 下载、复制 seed、复制提示词,或在新一轮生成中复用完整参数 +- 删除单张图片或整批 + +![生成的图片在资源库中](/blog/assets974acc551878f2f395518a3fbb9bd924.webp) + +## 视频生成 + +视频工作区与图像工作区结构一致 —— 同样以提示词为先、同样的配置面板、同样的生成流 —— 只是参数针对动态画面做了调整。 + +### 输入提示词 + +描述**场景、运动和镜头**,不只是主体。模型对动词和镜头语言更敏感。 + +``` +"金色时分一只红狐在新鲜雪地上小跑,呼气在冷空气中清晰可见,缓慢跟拍镜头,电影感" + +"宇航员漂入色彩斑斓的星云,缓慢推进镜头,梦幻氛围,柔和的体积光" + +"咖啡杯被慢动作微距倒入,蒸汽升腾,浅景深,商业产品镜头" +``` + +**视频提示词技巧:** + +- **明确描述运动** — "缓慢跟拍"、"推进"、"手持"、"静态远景"、"向左横摇" +- **设置时间推进** — "起初有雾随后散去"、"门缓缓打开" +- **借用电影语言** — "浅景深"、"变形宽银幕镜头眩光"、"黄金时段" +- **保持焦点** — 一个镜头一个核心动作往往比塞进多个动作效果更好 + +### 选择 AI 模型 + +LobeHub 接入了主流的文生视频与图生视频提供商: + +| 模型 | 最适合 | +| ------------------------------ | ------------------------------ | +| **OpenAI Sora 2 / Sora 2 Pro** | 连贯的多秒镜头,强场景理解能力 | +| **Google Veo 3 / 3.1** | 真实运动质感,原生音频生成,电影级画面 | +| **Kling V3** | 高质量运动表现,支持图生视频和 omni-video | +| **MiniMax Hailuo 2.3** | 快速文生视频,表现力强的人物 | +| **Qwen / Wan** | 文生视频,对中文提示词理解强 | +| **fal.ai 系列模型** | 多种专业模型,出片快 | + +不同模型支持的参数不同,切换模型时配置面板会自动更新。 + +### 起始帧与结束帧(可选) + +许多视频模型支持图像条件输入: + +- **起始帧(Start Frame)** —— 上传一张图作为视频的第一帧。非常适合把图像工作区生成的静帧动起来。 +- **结束帧(End Frame)** —— 上传一张图作为视频的最后一帧。必须先设置起始帧。 + +设置起始帧后,提示词占位文案会变为"描述你想要基于该图像生成的场景"。 + +### 配置生成参数 + +参数因模型而异,常见包括: 
+ +- **时长(Duration)** —— 视频长度(秒),取决于模型(如 4s / 6s / 8s)。 +- **比例** —— `16:9`、`9:16`、`1:1`、`4:3`、`3:4`、`21:9`。 +- **分辨率** —— `480p`、`720p`、`1080p`。 +- **固定镜头(Fixed Camera)** —— 锁定镜头不动,而非让模型自由运镜。 +- **生成音频(Generate Audio)** —— 同步生成配音(取决于模型,例如 Veo)。 +- **Seed** —— 随机或固定以复现结果。 +- **水印** —— 在支持的模型上开启或关闭。 +- **联网搜索** / **提示词扩写** —— 与图像流程相同的 LLM 辅助扩写。 + +### 查看和下载视频 + +生成的视频会出现在生成流中并可直接内嵌播放。你可以: + +- 播放、暂停、拖动进度 +- 下载视频 +- 生成失败时复制错误信息到剪贴板 +- 删除单条视频或整批 + +"🎁 今日剩余 N 条免费视频"角标显示你的免费额度;用完后每次生成将按额度扣费。 + +## 获得更好结果的技巧 + +**迭代优化提示词** —— 如果第一次的结果不够理想,每次只调整一个要素,而不是重写整个提示词。可以增加细节、改变风格词,或指定你不想要的内容。 + +**使用参考图或起始帧** —— 上传参考能帮助模型匹配你期望的风格、配色、构图,或者 —— 对视频而言 —— 你想要的起始画面。 + +**多变体对比** —— 一次生成多张图片,或用相同 seed + 微调提示词重生视频。AI 生成本身具有随机性 —— 不同变体的质量可能差异明显。 + +**根据任务选模型** —— 产品照和写实场景选写实系模型(DALL-E 3、Flux、Imagen);艺术插画选风格化模型;电影感运动镜头选 Veo 或 Sora;人物为主的短片选 Kling 或 Hailuo。 + +**串联图像 → 视频** —— 先在图像工作区生成满意的静帧,再把它作为起始帧送入视频工作区,让它动起来。 + + + + + + + + diff --git a/docs/usage/getting-started/image-generation.mdx b/docs/usage/getting-started/image-generation.mdx deleted file mode 100644 index 26d47efd91..0000000000 --- a/docs/usage/getting-started/image-generation.mdx +++ /dev/null @@ -1,117 +0,0 @@ ---- -title: Image Generation -description: >- - Create high-quality images from text descriptions using AI models like DALL-E - 3, Flux, and more. Learn how to write effective prompts and choose the right - model. -tags: - - LobeHub - - Image Generation - - AI Drawing - - DALL-E - - Text to Image - - Prompt Writing ---- - -# Image Generation - -Describe what you want — LobeHub turns text into images. Product prototypes, design inspiration, illustrations, or creative exploration: choose a model, set your parameters, and get high-quality images in seconds. All generated images are automatically saved to your Resource Library. - -## Get Started - -Click **Drawing** on the LobeHub main interface to open the image generation page. - -## Enter a Prompt - -Describe the image you want in the input box. 
The more specific your description, the more accurate the result. - -**Effective prompt structure:** - -``` -[Subject] [Style/Medium] [Setting/Background] [Lighting] [Mood] [Technical details] -``` - -Examples: - -``` -"A futuristic city skyline at sunset, digital art, cyberpunk style, neon lights reflecting on wet streets, cinematic lighting, 4K detail" - -"A cozy coffee shop interior, watercolor illustration, warm golden light streaming through windows, potted plants on windowsills, soft and inviting atmosphere" - -"A product photo of a minimalist leather wallet on a clean white background, studio lighting, sharp focus, commercial photography style" -``` - -**Prompt tips:** - -- **Be specific about style** — "oil painting", "watercolor", "digital art", "photorealistic", "anime", "vector illustration" -- **Describe lighting** — "dramatic shadows", "soft diffused light", "golden hour", "studio lighting" -- **Specify composition** — "portrait view", "wide angle", "close-up", "bird's eye view" -- **Add quality modifiers** — "high detail", "4K", "sharp focus", "professional quality" -- **Avoid vagueness** — "beautiful", "nice", "good" add little — describe what you actually want - -## Choose an AI Model - -LobeHub offers multiple AI image generation models. Different models have different strengths: - -![Choose a Model](/blog/assetsdd913561927c64d32bd390cee6846f9a.webp) - -| Model | Best For | -| -------------------- | ------------------------------------------------------------- | -| **DALL-E 3** | Realistic photos, illustrations, following prompts accurately | -| **Flux** | Artistic styles, creative images, fast generation | -| **Stable Diffusion** | Highly customizable, community styles and fine-tuned models | -| **fal.ai models** | Various specialized styles and fast generation | - -Try different models with the same prompt to see which gives the best results for your use case. 
- -## Select Reference Images (Optional) - -If you have reference images, upload them to guide the generation process. Click the upload button or drag and drop your reference images directly. You can upload multiple reference images. - -![Upload Reference Images for Image Generation](/blog/assets3c160860feef0bd7c653eeb46f683445.webp) - -Reference images help the model understand your desired style, composition, or color palette. - -## Choose Image Aspect Ratio - -Select an aspect ratio based on your intended use: - -- **1:1** — Social media posts, profile pictures -- **16:9** — Widescreen, presentations, banners -- **9:16** — Mobile screens, stories, reels -- **4:3** — General use, older display formats -- **3:2** — Photography standard, prints - -## Set Number of Images - -Choose how many images to generate in one go. Generating multiple images at once gives you variations to choose from. Start with 2–4 to find the best result. - -## View and Download Images - -Once generated, images appear on the drawing page. You can: - -- Preview any image at full size by clicking it -- Select favorites and download them -- Share directly from the image viewer - -All generated images are automatically saved to your Resource Library. - -![Generated Images in Asset Library](/blog/assets974acc551878f2f395518a3fbb9bd924.webp) - -## Tips for Better Results - -**Iterate on prompts** — If the first result isn't quite right, adjust one element at a time rather than rewriting the whole prompt. Add more detail, change the style descriptor, or specify what you don't want. - -**Use a reference image** — Uploading a reference image with your prompt helps the model match your intended style, color palette, or composition. - -**Try multiple variations** — Generate 4+ images at once and pick the best one. AI image generation has inherent randomness — some variations will be significantly better than others. 
- -**Match model to task** — Use photorealistic models (DALL-E 3, Flux) for product photos and realistic scenes; use style-focused models for artistic illustrations. - - - - - - - - diff --git a/docs/usage/getting-started/image-generation.zh-CN.mdx b/docs/usage/getting-started/image-generation.zh-CN.mdx deleted file mode 100644 index 911797e43f..0000000000 --- a/docs/usage/getting-started/image-generation.zh-CN.mdx +++ /dev/null @@ -1,114 +0,0 @@ ---- -title: 图像生成 -description: 使用 DALL-E 3、Flux 等 AI 模型,通过文字描述生成高质量图像。学习如何编写有效的提示词并选择合适的模型。 -tags: - - LobeHub - - 图像生成 - - AI 画图 - - DALL-E - - 文字生成图像 - - 提示词写作 ---- - -# 图像生成 - -用文字描述你想要的内容 ——LobeHub 帮你把想法变成图像。产品原型、设计灵感、插图配图、创意探索:选择模型、设置参数,几秒钟内获得高质量图像。生成的图片会自动保存到你的资源库。 - -## 开始画图 - -在 LobeHub 主界面点击**绘画**板块,进入画图页面。 - -## 输入提示词 - -在输入框中描述你想要的图像。描述越具体,结果越符合预期。 - -**有效的提示词结构:** - -``` -[主体] [风格/媒介] [场景/背景] [光线] [氛围] [技术细节] -``` - -示例: - -``` -"赛博朋克风格的未来城市天际线,日落时分,霓虹灯在湿润街道上的倒影,数字艺术,电影级光线,4K 细节" - -"温馨咖啡馆室内,水彩插画风格,阳光透过窗户洒入,窗台上摆放绿植,柔和温暖的氛围" - -"极简皮革钱包产品照,白色干净背景,棚拍灯光,对焦清晰,商业摄影风格" -``` - -**提示词技巧:** - -- **明确指定风格** — "油画"、"水彩"、"数字艺术"、"照片写实"、"动漫"、"矢量插画" -- **描述光线** — "戏剧性阴影"、"柔和漫射光"、"黄金时段"、"棚拍灯光" -- **指定构图** — "竖拍人像"、"广角"、"特写"、"俯拍鸟瞰" -- **加入质量词** — "高细节"、"4K"、"对焦清晰"、"专业品质" -- **避免模糊描述** — "漂亮"、"好看"、"不错" 对结果帮助有限 —— 要具体描述你真正想要的内容 - -## 选择 AI 模型 - -LobeHub 提供多个 AI 画图模型,不同模型各有所长: - -![选择模型](/blog/assetsdd913561927c64d32bd390cee6846f9a.webp) - -| 模型 | 最适合 | -| -------------------- | ----------------- | -| **DALL-E 3** | 写实照片、插画、精准遵循提示词 | -| **Flux** | 艺术风格、创意图像、快速生成 | -| **Stable Diffusion** | 高度可定制,支持社区风格和微调模型 | -| **fal.ai 系列模型** | 多种专业风格,生成速度快 | - -用同一个提示词尝试不同模型,找到最适合你使用场景的。 - -## 选择参考图片(可选) - -如果你有参考图片,可以上传作为生成的参考。点击上传按钮或直接拖入参考图片即可。可以上传多张参考图片。 - -![上传参考图片](/blog/assets3c160860feef0bd7c653eeb46f683445.webp) - -参考图片有助于模型理解你期望的风格、构图或配色方案。 - -## 选择图片比例 - -根据使用场景选择合适的比例: - -- **1:1** — 社交媒体发帖、头像 -- **16:9** — 宽屏、演示文稿、横幅 -- **9:16** — 手机屏幕、动态、竖屏视频 -- **4:3** — 通用用途、旧显示格式 -- **3:2** — 摄影标准、打印 - -## 设置生成数量 - 
-选择一次生成多少张图片。一次生成多张可以获得不同变体供你选择。建议从 2–4 张开始,从中挑选最佳结果。 - -## 查看和下载图片 - -图像生成完成后,会显示在画图页面。你可以: - -- 点击任意图片查看全尺寸预览 -- 选择满意的图片并下载 -- 在图片查看器中直接分享 - -生成的图片会自动保存到你的资源库。 - -![生成的图片在资源库中](/blog/assets974acc551878f2f395518a3fbb9bd924.webp) - -## 获得更好结果的技巧 - -**迭代优化提示词** — 如果第一次的结果不够理想,每次只调整一个要素,而不是重写整个提示词。可以增加细节、改变风格词,或指定你不想要的内容。 - -**使用参考图片** — 上传参考图配合提示词,帮助模型匹配你期望的风格、配色或构图。 - -**多变体对比** — 一次生成 4 张以上,从中挑选最佳。AI 图像生成本身具有随机性 —— 不同变体的质量可能差异明显。 - -**根据任务选模型** — 产品照和写实场景选写实系模型(DALL-E 3、Flux);艺术插画选风格化模型。 - - - - - - - - diff --git a/docs/usage/getting-started/resource.mdx b/docs/usage/getting-started/resource.mdx index b9046b6f6f..cf6ce253aa 100644 --- a/docs/usage/getting-started/resource.mdx +++ b/docs/usage/getting-started/resource.mdx @@ -281,5 +281,5 @@ When answering product questions: - + diff --git a/docs/usage/getting-started/resource.zh-CN.mdx b/docs/usage/getting-started/resource.zh-CN.mdx index 9a6b6ff7b8..1c4d7b2702 100644 --- a/docs/usage/getting-started/resource.zh-CN.mdx +++ b/docs/usage/getting-started/resource.zh-CN.mdx @@ -106,5 +106,5 @@ tags: - + diff --git a/docs/usage/getting-started/vision.mdx b/docs/usage/getting-started/vision.mdx index 34d6319318..218a0c8ea6 100644 --- a/docs/usage/getting-started/vision.mdx +++ b/docs/usage/getting-started/vision.mdx @@ -210,7 +210,7 @@ Other providers may also offer vision models — check the model's capability ta - + diff --git a/docs/usage/getting-started/vision.zh-CN.mdx b/docs/usage/getting-started/vision.zh-CN.mdx index 72c11377a4..9b7e4ec056 100644 --- a/docs/usage/getting-started/vision.zh-CN.mdx +++ b/docs/usage/getting-started/vision.zh-CN.mdx @@ -147,7 +147,7 @@ LobeHub 支持视觉功能 —— 助理能够 "看见" 并理解你分享的图 - +