💄 style: support for parsing imageOutput (#7140)

Co-authored-by: Arvin Xu <arvinx@foxmail.com>
wzdnzd
2025-03-28 10:21:56 +08:00
committed by GitHub
parent c67af99b56
commit 05bae9d3db
4 changed files with 28 additions and 6 deletions
+5 -3
@@ -17,7 +17,7 @@ LobeChat supports customizing the model list during deployment. This configurati
You can use `+` to add a model, `-` to hide a model, and `model name->deploymentName=display name<extension configuration>` to customize the display name of a model. Separate entries with half-width (ASCII) commas. The basic syntax is as follows:
```text
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>,model2,model3
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file:imageOutput>,model2,model3
```
For example: `+qwen-7b-chat,+glm-6b,-gpt-3.5-turbo,gpt-4-0125-preview=gpt-4-turbo`
@@ -29,7 +29,7 @@ In the above example, it adds `qwen-7b-chat` and `glm-6b` to the model list, rem
Considering the diversity of model capabilities, we started to add extension configuration in version `0.147.8`, with the following rules:
```shell
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file:imageOutput>
```
The first value in angle brackets is designated as the `maxToken` for this model. The second value and beyond are the model's extension capabilities, separated by colons `:`, and the order is not important.
@@ -41,7 +41,8 @@ Examples are as follows:
- `gemini-1.5-flash-latest=Gemini 1.5 Flash<16000:vision>`: Google Vision model, maximum context of 16k, supports image recognition;
- `o3-mini=OpenAI o3-mini<200000:reasoning:fc>`: OpenAI o3-mini model, maximum context of 200k, supports reasoning and Function Call;
- `qwen-max-latest=Qwen Max<32768:search:fc>`: Qwen 2.5 Max model, maximum context of 32k, supports web search and Function Call;
- `gpt-4-all=ChatGPT Plus<128000:fc:vision:file>`, hacked version of ChatGPT Plus web, context of 128k, supports image recognition, Function Call, file upload.
- `gpt-4-all=ChatGPT Plus<128000:fc:vision:file>`, hacked version of ChatGPT Plus web, context of 128k, supports image recognition, Function Call, file upload;
- `gemini-2.0-flash-exp-image-generation=Gemini 2.0 Flash (Image Generation) Experimental<32768:imageOutput:vision>`, Gemini 2.0 Flash Experimental model for image generation, maximum context of 32k, supports image generation and recognition.
Currently supported extension capabilities are:
@@ -49,6 +50,7 @@ Currently supported extension capabilities are:
| ----------- | -------------------------------------------------------- |
| `fc` | Function Calling |
| `vision` | Image Recognition |
| `imageOutput` | Image Generation |
| `reasoning` | Support Reasoning |
| `search` | Support Web Search |
| `file` | File Upload (a bit hacky, not recommended for daily use) |
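The entry syntax documented above (`+`/`-` prefix, `id->deploymentName=displayName`, and the `<maxToken:capability:...>` extension block) can be sketched as a small parser. This is a minimal illustration, not LobeChat's `parseModelString`: the `ModelEntry` shape and the `parseEntry` helper are hypothetical names introduced here, and the sketch assumes display names contain no `,`, `=` or `<` characters.

```typescript
// Hypothetical shape for one parsed entry; not the type LobeChat uses.
interface ModelEntry {
  action: 'add' | 'remove';
  id: string;
  deploymentName?: string;
  displayName?: string;
  maxToken?: number;
  capabilities: string[];
}

// Hypothetical helper: parses a single comma-separated entry.
function parseEntry(raw: string): ModelEntry {
  let text = raw.trim();

  // A leading `-` hides a model; a leading `+` (or no prefix) adds one.
  let action: 'add' | 'remove' = 'add';
  if (text.startsWith('-')) {
    action = 'remove';
    text = text.slice(1);
  } else if (text.startsWith('+')) {
    text = text.slice(1);
  }

  // Optional extension block: the first value is maxToken, the rest are capabilities.
  let maxToken: number | undefined;
  let capabilities: string[] = [];
  const open = text.indexOf('<');
  if (open !== -1 && text.endsWith('>')) {
    const [first, ...rest] = text.slice(open + 1, -1).split(':');
    const parsed = Number.parseInt(first, 10);
    if (!Number.isNaN(parsed)) maxToken = parsed;
    capabilities = rest;
    text = text.slice(0, open);
  }

  // `id->deploymentName=displayName`, where both suffixes are optional.
  const eq = text.indexOf('=');
  const namePart = eq === -1 ? text : text.slice(0, eq);
  const displayName = eq === -1 ? undefined : text.slice(eq + 1);
  const arrow = namePart.indexOf('->');
  const id = arrow === -1 ? namePart : namePart.slice(0, arrow);
  const deploymentName = arrow === -1 ? undefined : namePart.slice(arrow + 2);

  return { action, id, deploymentName, displayName, maxToken, capabilities };
}

// Usage: split the whole list on commas, then parse each entry.
const entries = '+qwen-7b-chat,-gpt-3.5-turbo,o3-mini=OpenAI o3-mini<200000:reasoning:fc>'
  .split(',')
  .map(parseEntry);
console.log(entries);
```

Keeping the capability tokens as plain strings at this stage leaves the decision about which tokens are valid (for example `imageOutput`) to a later step.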
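@@ -16,7 +16,7 @@ LobeChat supports customizing the model list during deployment; for details, see [模型提供
You can use `+` to add a model, `-` to hide a model, and `model name->deployment name=display name<extension configuration>` to customize a model's display name, with entries separated by half-width (ASCII) commas. Use `<>` to add extension configuration. The basic syntax is as follows: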
```text
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>,model2,model3
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file:imageOutput>,model2,model3
```
For example: `+qwen-7b-chat,+glm-6b,-gpt-3.5-turbo,gpt-4-0125-preview=gpt-4-turbo`
@@ -28,7 +28,7 @@ id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>,model2,
Considering the diversity of model capabilities, we started adding extension configuration in version `0.147.8`, with the following rules:
```shell
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>
id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file:imageOutput>
```
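The first value in the angle brackets is, by convention, the model's `maxToken`. The second and subsequent values are the model's extension capabilities, separated by colons `:`; their order does not matter.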
@@ -40,7 +40,8 @@ id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>
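- `gemini-1.5-flash-latest=Gemini 1.5 Flash<16000:vision>`: Google vision model, maximum context of 16k, supports image recognition;
- `o3-mini=OpenAI o3-mini<200000:reasoning:fc>`: OpenAI o3-mini model, maximum context of 200k, supports reasoning and Function Call
- `qwen-max-latest=Qwen Max<32768:search:fc>`: Qwen 2.5 Max model, maximum context of 32k, supports web search and Function Call
- `gpt-4-all=ChatGPT Plus<128000:fc:vision:file>`: hacked version of the ChatGPT Plus web app, context of 128k, supports image recognition, Function Call, and file upload
- `gpt-4-all=ChatGPT Plus<128000:fc:vision:file>`: hacked version of the ChatGPT Plus web app, context of 128k, supports image recognition, Function Call, and file upload
- `gemini-2.0-flash-exp-image-generation=Gemini 2.0 Flash (Image Generation) Experimental<32768:imageOutput:vision>`: Gemini 2.0 Flash experimental model, maximum context of 32k, supports image generation and recognition
Currently supported extension capabilities are: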
@@ -48,6 +49,7 @@ id->deploymentName=displayName<maxToken:vision:reasoning:search:fc:file>
| ----------- | ---------------------- |
| `fc` | Function calling |
| `vision` | Visual recognition |
| `imageOutput` | Image generation |
| `reasoning` | Supports reasoning |
| `search` | Supports web search |
| `file` | File upload (rather hacky, not recommended for daily use) |
+14
@@ -86,6 +86,20 @@ describe('parseModelString', () => {
});
});
it('token and image output', () => {
  const result = parseModelString('gemini-2.0-flash-exp-image-generation=Gemini 2.0 Flash (Image Generation) Experimental<32768:imageOutput>');
  expect(result.add[0]).toEqual({
    displayName: 'Gemini 2.0 Flash (Image Generation) Experimental',
    abilities: {
      imageOutput: true,
    },
    id: 'gemini-2.0-flash-exp-image-generation',
    contextWindowTokens: 32_768,
    type: 'chat',
  });
});
it('multi models', () => {
const result = parseModelString(
'gemini-1.5-flash-latest=Gemini 1.5 Flash<16000:vision>,gpt-4-all=ChatGPT Plus<128000:fc:vision:file>',
+4
@@ -84,6 +84,10 @@ export const parseModelString = (modelString: string = '', withDeploymentName =
    model.abilities!.search = true;
    break;
  }
  case 'imageOutput': {
    model.abilities!.imageOutput = true;
    break;
  }
  default: {
    console.warn(`Unknown capability: ${capability}`);
  }
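The hunk above sets one boolean flag per recognized capability token and warns on anything else. A table-driven sketch of the same idea is shown below. It is illustrative only: `KNOWN_ABILITIES` and `toAbilities` are hypothetical names, only `search` and `imageOutput` are confirmed by this diff, and the other keys in the list are assumptions.

```typescript
// Hypothetical allow-list. Only `search` and `imageOutput` appear in the diff above;
// `reasoning` and `vision` are assumed here to map to flags of the same name.
const KNOWN_ABILITIES: readonly string[] = ['imageOutput', 'reasoning', 'search', 'vision'];

// Hypothetical helper: turns capability tokens into boolean flags and
// collects unrecognized tokens instead of logging them.
function toAbilities(capabilities: string[]): {
  abilities: Record<string, boolean>;
  unknown: string[];
} {
  const abilities: Record<string, boolean> = {};
  const unknown: string[] = [];
  for (const capability of capabilities) {
    if (KNOWN_ABILITIES.includes(capability)) {
      abilities[capability] = true;
    } else {
      unknown.push(capability);
    }
  }
  return { abilities, unknown };
}

// Usage: tokens taken from the `<32768:imageOutput:vision>` example in the docs.
const parsedAbilities = toAbilities(['imageOutput', 'vision']);
console.log(parsedAbilities.abilities, parsedAbilities.unknown);
```

Returning the unknown tokens lets the caller decide whether to warn, as the `default` branch does, or to reject the configuration outright.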