1373 Commits

Author SHA1 Message Date
Classic298 02b2a391e9 fix: block private-IP webhook URLs to close SSRF on caller-controlled URL (#24587)
* fix: block private-IP webhook URLs to close SSRF on caller-controlled URL

post_webhook(url, ...) in utils/webhook.py forwards the URL straight to
aiohttp.ClientSession.post with no SSRF gate. The URL is caller-controlled
on two surfaces:

- User notification settings under ENABLE_USER_WEBHOOKS=true — any
  authenticated user can set the URL their notifications POST to.
- Automation notification triggers (calendar alerts, etc.).

Without a gate, the URL can target cloud metadata (169.254.169.254 /
fd00:ec2::254), localhost-bound services, RFC1918 internal hosts, or any
other private address reachable from the server process. Blind SSRF — no
response body returned to the caller — but enough to enumerate internal
services via response timing / status codes, and on cloud deployments
enough to issue requests against IMDSv1 if available.

Call validate_url() at the top of post_webhook. The function blocks
private/reserved IPs when ENABLE_RAG_LOCAL_WEB_FETCH is False (the
default), is the project's chosen SSRF gate, and is already applied to
the equivalent fetch surfaces (retrieval, image-load, OAuth profile
picture). Operators who legitimately need to webhook to private IPs
(internal monitoring, self-hosted Slack alternatives, etc.) can set
ENABLE_RAG_LOCAL_WEB_FETCH=True — same opt-out as the other gated
surfaces.
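
As an illustration of the kind of check described above, here is a minimal
sketch of a private-address gate. It is not the project's validate_url();
the function name is_blocked_webhook_url and the allow_local parameter
(standing in for ENABLE_RAG_LOCAL_WEB_FETCH) are hypothetical, and the
sketch only inspects literal IP hosts and "localhost". A real gate would
presumably also resolve DNS names and check each resolved address.

```python
import ipaddress
from urllib.parse import urlparse


def is_blocked_webhook_url(url: str, allow_local: bool = False) -> bool:
    """Return True if the URL should be rejected before any request is sent.

    Hypothetical helper for illustration only. `allow_local` stands in for
    an operator opt-out such as ENABLE_RAG_LOCAL_WEB_FETCH=True.
    """
    parsed = urlparse(url)
    # Malformed or non-HTTP URLs are rejected regardless of the opt-out.
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return True
    if allow_local:
        return False
    host = parsed.hostname
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name: this sketch does not resolve it (assumption).
        return False
    # Loopback, link-local, RFC1918, unique-local IPv6, etc. are not global.
    return not addr.is_global
```

With the opt-out left at its default, the cloud metadata addresses named
above are rejected, while a public host passes through to the HTTP client.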

Scope intentionally limited to webhooks. The OAuth discovery and
external reranker paths cwanglab also flagged are admin-configured with
intentional private-IP defaults (reranker defaults to
http://localhost:8080/v1/rerank) and are out of scope per Rule 9 — the
admin owns the URL choice and the operator opt-out exists for them too.

Reported by cwanglab in GHSA-5x9f-85cg-w3hf (cluster canonical with six
closed siblings: g36v-23gj-j69x, 6j8f-h58v-xgmw, xpwv-52pm-p8hj,
v9gp-hv2c-9qv8, fw7w-jrw7-p3v9, x7xq-74rg-m8mf).

Co-authored-by: cwanglab <cwanglab@users.noreply.github.com>

* fix: also pass allow_redirects=False to session.post in post_webhook

Companion to the previous commit. validate_url() only validates the
initial URL; aiohttp's default allow_redirects=True would still follow
a 302 to a private-IP target. This is the same redirect-bypass class as
the rh5x cluster's five call sites, and this is the sixth call site to
receive the same gate.

Co-authored-by: cwanglab <cwanglab@users.noreply.github.com>

---------

Co-authored-by: cwanglab <cwanglab@users.noreply.github.com>
2026-06-01 14:15:51 -07:00
Timothy Jaeryang Baek eebbc48f80 refac
Co-Authored-By: Jacob Leksan <63938553+jmleksan@users.noreply.github.com>
2026-06-01 14:13:28 -07:00
Timothy Jaeryang Baek cff51f05f5 chore: format 2026-06-01 14:10:40 -07:00
Timothy Jaeryang Baek a4735e46b9 refac
Co-Authored-By: Syed Mustafa Quadri <175467872+code-quad3@users.noreply.github.com>
2026-06-01 14:09:54 -07:00
Timothy Jaeryang Baek 6fce92aa12 chore: format 2026-06-01 13:56:55 -07:00
Justin Williams 478bc9e3f1 fix(oauth): use Protected Resource Metadata scopes in static OAuth 2.1 flow (#24690)
The static credentials OAuth flow currently sets scope=None, relying on
the OAuth provider's default scopes. This breaks providers like GitHub
that default to minimal/public-only access when no scope is requested.

This change reads scopes_supported from the Protected Resource Metadata
document (RFC 9728) and uses them in the authorization request. Unlike
the Authorization Server's scopes_supported (a full catalog of every
scope the AS can grant), the PRM scopes_supported represents what the
specific resource requires — making it safe to request without breaking
providers like Entra ID that reject broad scope requests.
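
A rough sketch of the scope selection described above, assuming the
Protected Resource Metadata document has already been fetched and parsed
into a dict. The function name select_static_flow_scope is hypothetical
and is not taken from the commit; only the scopes_supported field comes
from RFC 9728.

```python
from __future__ import annotations


def select_static_flow_scope(prm: dict | None) -> str | None:
    """Build the OAuth `scope` parameter from a parsed PRM document.

    Returns a space-delimited scope string, or None when the document is
    missing or lists no scopes, so the provider default still applies.
    """
    scopes = (prm or {}).get("scopes_supported") or []
    # Keep non-empty strings only, preserve order, drop duplicates.
    unique = list(dict.fromkeys(s for s in scopes if isinstance(s, str) and s))
    return " ".join(unique) or None
```

Falling back to None when no scopes are listed keeps the earlier
behaviour for resources that publish no PRM scopes.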

Fixes the regression introduced in 349ea4ea where all scope handling was
removed from the static flow.
2026-06-01 13:52:18 -07:00
Jacob Leksan 80da840ae5 refactor: move background tasks handler call to ensure consistent execution in chat response handlers (#24717) 2026-06-01 13:50:15 -07:00
Timothy Jaeryang Baek c8eb8edca4 refac 2026-06-01 13:38:40 -07:00
Timothy Jaeryang Baek 778dba1d6b refac 2026-06-01 13:18:44 -07:00
Timothy Jaeryang Baek 01810e32ad refac 2026-06-01 13:02:48 -07:00
Timothy Jaeryang Baek 4297c02b12 refac 2026-06-01 12:44:16 -07:00
Timothy Jaeryang Baek e3ab4bd212 refac
Co-Authored-By: Zixin Yu <183055163+ivvi0927@users.noreply.github.com>
2026-06-01 12:37:34 -07:00
Timothy Jaeryang Baek fd76b51ab2 refac
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2026-06-01 12:27:08 -07:00
Classic298 507b8b213c refac: mirror native FC code_interpreter authz gates onto legacy XML-tag path (#24724)
The native function-calling tool resolver in utils/tools.py applies five
gates before exposing execute_code as a builtin tool: builtin-category
enable, ENABLE_CODE_INTERPRETER global config, model capability,
features.code_interpreter request flag, and the per-user
features.code_interpreter permission.

The legacy XML-tag detection path in streaming_chat_response_handler
applied only the request-flag gate. Brings the legacy path to parity by
running the same five-gate check before activating tag detection.
Behaviour change is limited to deployments that previously relied on
the asymmetry — admins who set ENABLE_CODE_INTERPRETER=False or revoked
the per-user permission, on the legacy tool-calling mode, with the
client supplying features.code_interpreter=true. Any of those three
conditions met now correctly disables tag detection.
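
The five gates listed above can be pictured as a single conjunction. The
sketch below is illustrative only: the CodeInterpreterGates class and its
field names are hypothetical and do not mirror the actual code in
utils/tools.py or the streaming handler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeInterpreterGates:
    """Hypothetical container for the five checks named in the commit."""

    builtin_category_enabled: bool  # builtin-category enable
    global_config_enabled: bool     # ENABLE_CODE_INTERPRETER
    model_capable: bool             # model capability
    request_flag: bool              # features.code_interpreter on the request
    user_permitted: bool            # per-user features.code_interpreter permission

    def allows(self) -> bool:
        # Every gate must be open; any single closed gate disables the feature.
        return all(
            (
                self.builtin_category_enabled,
                self.global_config_enabled,
                self.model_capable,
                self.request_flag,
                self.user_permitted,
            )
        )
```

Under this model, the old legacy path corresponds to consulting
request_flag alone, which is why a client-supplied flag was enough to
activate tag detection.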

Co-authored-by: sfwani <sfwani@users.noreply.github.com>
2026-06-01 12:07:15 -07:00
Algorithm5838 309caa82fb fix: persist outlet filter changes to message output (#24884) 2026-06-01 10:24:40 -07:00
Timothy Jaeryang Baek 07cbc91a8e refac
Co-Authored-By: Boris Rybalkin <ribalkin@gmail.com>
2026-06-01 10:16:01 -07:00
Classic298 83890f18b9 feat: cap profile image data URI size to bound model/avatar bloat (#25476)
* feat: cap profile image data URI size to bound model/avatar bloat

validate_profile_image_url() validated data-URI format (MIME allowlist,
SVG rejection, scheme checks) but never its length, so a valid
data:image/...;base64,<huge> passed for both custom-model icons and user
avatars. Large inline images bloat Postgres and the Redis MODELS hash and
degrade model-list latency.

Add PROFILE_IMAGE_MAX_DATA_URI_SIZE (default 256 KiB, 0 disables) and
reject oversized data URIs in the shared validator, so both model meta
(ModelMeta.profile_image_url) and user avatars (UpdateProfileForm) are
bounded at one chokepoint. ModelMeta already clears invalid values to
None on read, so existing oversized icons stop propagating into the
MODELS hash on the next refresh.

Fixes #25468

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* fix: default PROFILE_IMAGE_MAX_DATA_URI_SIZE to None (no cap)

Per review: opt-in rather than a 256 KiB default. Unset leaves data URIs
uncapped; the validator already skips the check on a falsy value.
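
A minimal sketch of the size check as described in these two commits,
assuming the cap is compared against the length of the whole data URI
string. The function name data_uri_within_cap is hypothetical; the real
check lives inside validate_profile_image_url() and may measure size
differently.

```python
from __future__ import annotations


def data_uri_within_cap(url: str, max_size: int | None) -> bool:
    """Return True if the URL passes the optional data-URI size cap.

    A falsy `max_size` (None or 0) disables the check, matching the
    opt-in behaviour described above.
    """
    if not url.startswith("data:"):
        return True  # the cap only applies to inline data URIs
    if not max_size:
        return True  # uncapped
    return len(url) <= max_size
```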

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 09:57:06 -07:00
Timothy Jaeryang Baek b64fd988f0 refac 2026-06-01 09:30:15 -07:00
Timothy Jaeryang Baek f16b5c4460 refac 2026-05-31 14:59:28 -07:00
Classic298 4719881105 fix: move bypass_system_prompt off query parameter onto request.state (#25156)
bypass_system_prompt is an internal flag used by utils/middleware.py and utils/chat.py to skip applying the model system prompt on recursive base-model calls, but it was still declared as a positional argument on the openai/ollama chat-completion route handlers, so FastAPI bound it from the query string. Move it to request.state so external clients cannot set it, matching how bypass_filter is handled.

Drop the argument from both route signatures and read getattr(request.state, 'bypass_system_prompt', False); utils/chat.py sets request.state.bypass_system_prompt alongside bypass_filter and drops the kwarg from the two route-handler calls (the recursive self-calls keep it). Mirrors c0385f60b.
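
To illustrate why request.state is out of reach for external clients, the
sketch below uses types.SimpleNamespace objects as stand-ins for a
request. The helper name should_bypass_system_prompt and the stand-in
objects are assumptions for illustration; only the getattr call shape
comes from the commit message.

```python
from types import SimpleNamespace


def should_bypass_system_prompt(request) -> bool:
    """Read the internal flag from request.state, never from the query string."""
    return bool(getattr(request.state, "bypass_system_prompt", False))


# An external client can put anything in the query string, but that does
# not populate request.state.
external = SimpleNamespace(
    state=SimpleNamespace(),
    query_params={"bypass_system_prompt": "true"},
)

# Server-side code sets the attribute explicitly before a recursive call.
internal = SimpleNamespace(
    state=SimpleNamespace(bypass_system_prompt=True),
    query_params={},
)
```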

Co-authored-by: anishgirianish <161533316+anishgirianish@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 14:53:53 -07:00
G30 66126f3861 fix(auth): use request.scope["path"] to prevent CVE-2026-48710 (BadHost) (#25123)
Starlette reconstructs request.url.path from the HTTP Host header without
validation. An attacker can inject a path into the Host header to make
request.url.path return a different value than the path Starlette routes on.

The API key endpoint restriction check was using request.url.path to decide
whether to allow or deny access — making it bypassable via a crafted Host
header on any Starlette version prior to 1.0.1.

Fix: replace request.url.path with request.scope["path"], which reads the
raw ASGI scope path that Starlette uses for routing. This value is set by
the ASGI server from the actual request path and cannot be injected via
HTTP headers, making it safe regardless of Starlette version.
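
A small sketch of an endpoint-restriction check that reads the path from
the ASGI scope dict. The function api_key_path_allowed and its
prefix-matching rule are hypothetical; the actual matching logic in
get_current_user_by_api_key() is not shown in this commit message.

```python
def api_key_path_allowed(scope: dict, allowed_endpoints: list[str]) -> bool:
    """Decide access from scope["path"], the path the router matches on.

    Headers in the scope, including Host, are never consulted here.
    """
    path = scope["path"]
    for endpoint in allowed_endpoints:
        base = endpoint.rstrip("/")
        # Exact match, or a sub-path below the allowed endpoint.
        if path == base or path.startswith(base + "/"):
            return True
    return False
```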

Affected code path:
  get_current_user_by_api_key() in backend/open_webui/utils/auth.py
  (only triggered when ENABLE_API_KEYS_ENDPOINT_RESTRICTIONS is enabled)

References:
  CVE-2026-48710 / BadHost
  https://arstechnica.com/information-technology/2026/05/millions-of-ai-agents-imperiled-by-critical-vulnerability-in-open-source-package/
2026-05-28 16:41:56 -05:00
Timothy Jaeryang Baek 79bf3d28d8 refac 2026-05-28 16:33:48 -05:00
Timothy Jaeryang Baek 154679200f refac: clean up Redis sentinel utilities and import grouping 2026-05-21 11:47:25 +04:00
Timothy Jaeryang Baek 2b99945d27 refac
Co-Authored-By: Classic298 <27028174+Classic298@users.noreply.github.com>
2026-05-20 00:22:27 +04:00
Classic298 d07fd7d6d8 fix: disable redirect following in OAuth picture fetch (SSRF) (#24809)
_process_picture_url validated the initial picture URL with validate_url()
but then aiohttp followed 3xx redirects without re-validating the target,
so a validate_url-passing public URL could 302 to an internal address and
the body was base64-stored in the user's profile_image_url. This is the
sixth call site of the CVE-2026-45401 redirect-bypass cohort; the other
five already pass allow_redirects=AIOHTTP_CLIENT_ALLOW_REDIRECTS. Apply
the same.
2026-05-19 23:57:38 +04:00
Timothy Jaeryang Baek cfa6908d57 refac 2026-05-19 22:25:39 +04:00
Classic298 d169f086da fix: respect access_type in shared-chat file authorization branch (#24755)
has_access_to_file granted access whenever the file was attached to a
shared chat the user could read, ignoring the requested access_type. A
read-only shared-chat recipient therefore satisfied write and delete
checks and could delete or mutate the chat owner's attached file. Gate
the shared-chat branch on read access, matching the channels branch
directly above it.
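
The fix amounts to making the shared-chat branch conditional on the
requested access type. A minimal sketch, with a hypothetical function
name and a boolean standing in for the real shared-chat lookup:

```python
def shared_chat_branch_grants(access_type: str, user_can_read_shared_chat: bool) -> bool:
    """Shared-chat membership can only ever satisfy a read check.

    Write and delete requests fall through to the other authorization
    branches (ownership, admin, and so on), which are not modelled here.
    """
    return access_type == "read" and user_can_read_shared_chat
```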

Co-authored-by: oxsignal <oxsignal@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 22:09:56 +04:00
Classic298 a803372805 fix: log expected fetch/transcript/tool-server failures as warnings (#24903) 2026-05-19 21:55:40 +04:00
Timothy Jaeryang Baek ed73ef3d8d refac 2026-05-19 21:35:04 +04:00
Timothy Jaeryang Baek cc94a90b4d refac
Co-Authored-By: Algorithm5838 <108630393+Algorithm5838@users.noreply.github.com>
2026-05-19 21:03:23 +04:00
Timothy Jaeryang Baek c75fe8e74b fix: get_image_base64_from_file_id 2026-05-19 20:33:46 +04:00
Timothy Jaeryang Baek 1ea54c3217 refac 2026-05-14 13:10:22 +09:00
Timothy Jaeryang Baek ecec86dd32 refac 2026-05-13 15:55:21 +09:00
Timothy Jaeryang Baek 5b125c24d4 feat: kb_exec 2026-05-13 15:41:30 +09:00
Timothy Jaeryang Baek 6d0295588e refac: modernize type annotations (PEP 604 / PEP 585) 2026-05-12 17:10:15 +09:00
Timothy Jaeryang Baek 1413ce4a52 refac 2026-05-12 05:30:25 +09:00
Timothy Jaeryang Baek 4856ce48be chore: format 2026-05-11 02:51:59 +09:00
Timothy Jaeryang Baek 0037baeb26 enh: channels streaming agent 2026-05-11 02:50:30 +09:00
Timothy Jaeryang Baek 15e696691c refac 2026-05-11 02:25:11 +09:00
Timothy Jaeryang Baek 2dbf7b6764 refac 2026-05-11 02:12:38 +09:00
Classic298 d11e06f1b7 fix: prevent redirect-based SSRF and enforce collection write access (#24524)
* fix: prevent redirect-based SSRF in get_image_base64_from_url

Cohort follow-up to PR #24491. That PR patched three call sites
(SafeWebBaseLoader._scrape, get_content_from_url, load_url_image) to
pass allow_redirects=False on the underlying HTTP client; this fourth
call site in utils/files.py was missed.

get_image_base64_from_url() is invoked from convert_url_images_to_base64
in utils/middleware.py on every /api/chat/completions request whose
message content includes an image_url part. validate_url() is called on
the originally-submitted URL only; the aiohttp session.get() call had
no allow_redirects argument and the shared session pool does not
override the aiohttp default (allow_redirects=True). An authenticated
user sending a chat message with image_url pointing at an attacker host
that 302-redirects to 169.254.169.254 / 127.0.0.1 / RFC1918 reached the
internal target. This is the most reachable variant in the redirect
cluster: no special endpoint, no admin permission, no feature flag.

Apply the same one-line fix as the other three call sites: pass
allow_redirects=AIOHTTP_CLIENT_ALLOW_REDIRECTS (defaults to False).

Reported by nayakchinmohan in GHSA-88jq-grjp-jx6f; consolidated under
GHSA-rh5x-h6pp-cjj6.

Co-authored-by: nayakchinmohan <nayakchinmohan@users.noreply.github.com>

* fix: enforce collection write access on process_file endpoint

Cohort follow-up to ba83613ff. That commit added _validate_collection_access
to process_text and process_web (the user-supplied collection_name path)
but missed process_file in the same router.

process_file accepts a user-supplied collection_name and writes the file's
embedded content into that collection via save_docs_to_vector_db. The
file_id is gated by file ownership (line 1562) but collection_name was
unchecked, so an authenticated user could append content from a file they
own into another user's knowledge-base collection by passing the victim's
KB UUID as collection_name. Identical pattern to the process_text and
process_web gaps that ba83613ff closed.

Apply the same one-line gate as the sibling endpoints: when
collection_name is user-supplied (not the default file-{file.id} fallback),
require write access via _validate_collection_access. The shared validator
delegates to filter_accessible_collections, which already correctly
handles file-* prefixes (via has_access_to_file) and KB UUIDs
(via Knowledges.check_access_by_user_id) — admins bypass.
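
A sketch of the gate described above, under stated assumptions: the
writable set stands in for whatever filter_accessible_collections
returns for the caller, AccessDenied and resolve_target_collection are
hypothetical names, and a name equal to the default file-{id} fallback
is treated as not user-supplied.

```python
from __future__ import annotations


class AccessDenied(Exception):
    """Raised when the caller lacks write access to the target collection."""


def resolve_target_collection(
    collection_name: str | None, file_id: str, writable: set[str]
) -> str:
    """Return the collection to write into, enforcing write access."""
    default = f"file-{file_id}"
    # No user-supplied name: fall back to the per-file default, which is
    # already covered by the file ownership check.
    if not collection_name or collection_name == default:
        return default
    # User-supplied name: require write access before any write happens.
    if collection_name not in writable:
        raise AccessDenied(collection_name)
    return collection_name
```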

Reported by tenbbughunters (Tenable) in GHSA-4g37-7p2c-38r9 (the
comprehensive write-path filing covering process_text / process_file /
process_web / process_youtube and the _validate_collection_access UUID
root cause), and independently re-identified for the missed process_file
call site by kodareef5 in GHSA-4m74-3cmc-293g.

Co-authored-by: tenbbughunters <tenbbughunters@users.noreply.github.com>
Co-authored-by: kodareef5 <kodareef5@users.noreply.github.com>

* fix: enforce collection write access on process_files_batch endpoint

Cohort follow-up to ba83613ff and the prior process_file fix on this
branch. process_files_batch (line 2604) is the third write endpoint in
the same router that accepts a user-supplied collection_name; it was
covered in the same Tenable filing as process_file and was missed by
the same cohort fix. The endpoint validates per-file ownership at line
2642 but does not check whether the caller has write access to the
target collection_name before save_docs_to_vector_db writes into it
at lines 2683-2690 with add=True.

Apply the same one-line gate as the sibling endpoints. Validate only
when collection_name is user-supplied (truthy) so the existing
fall-through behavior for the None case is unchanged.

Same Tenable / kodareef5 cohort as the previous commit.

Co-authored-by: tenbbughunters <tenbbughunters@users.noreply.github.com>
Co-authored-by: kodareef5 <kodareef5@users.noreply.github.com>

---------

Co-authored-by: nayakchinmohan <nayakchinmohan@users.noreply.github.com>
Co-authored-by: tenbbughunters <tenbbughunters@users.noreply.github.com>
Co-authored-by: kodareef5 <kodareef5@users.noreply.github.com>
2026-05-11 01:09:15 +09:00
Timothy Jaeryang Baek df42d96c95 refac 2026-05-09 21:05:49 +09:00
Timothy Jaeryang Baek 6116c6dca0 refac 2026-05-09 16:06:09 +09:00
Timothy Jaeryang Baek 93931efaa7 refac 2026-05-09 16:05:21 +09:00
Timothy Jaeryang Baek 3ccf263b10 refac 2026-05-09 15:46:33 +09:00
Timothy Jaeryang Baek 7bcc0e2e5c chore: format 2026-05-09 15:25:27 +09:00
Timothy Jaeryang Baek 4d99baa292 refac 2026-05-09 15:04:09 +09:00
Timothy Jaeryang Baek 85c7373f68 refac 2026-05-09 07:37:53 +09:00
Timothy Jaeryang Baek 5b80932e59 refac 2026-05-09 06:56:22 +09:00
Timothy Jaeryang Baek 2ba6b423aa refac 2026-05-09 06:50:11 +09:00