`lookup` is already keyed by `wp.id.to_s`, so the second index built
inside `fold_in_alias_keys` was redundant. The alias-lookup branch can
read straight from `lookup` using the plucked work_package_id.
The doc-level preload that backs the macro pipeline reads identifiers
straight out of user-pasted CKEditor content. Without a bound, a single
multi-megabyte comment could push thousands of values into one
`WHERE id IN (...)` and an alias `WHERE identifier IN (...)`, both built
in memory and shipped in one query.
`MAX_PRELOAD_IDENTIFIERS = 500` caps the per-render Set; references past
the cap render via the link handler's cache-miss fallback (numeric → bare
`#N` link, semantic → literal text).
Spec stubs the constant to 2 and asserts the WP IN-list never exceeds it.
Replace duplicated `%w(# ## ###).include?(matcher.sep)` and
`["##", "###"].include?(matcher.sep)` checks with named predicates on
the WorkPackages link handler: HASH_TRIGGERS for the `#`/`##`/`###`
family, hash_trigger? to gate the handler, and quickinfo?/detailed?
to distinguish the three macro shapes. The PDF-export subclass picks
both up via inheritance.
"Trigger" replaces "separator" in the WP context: in `#1234` the `#`
is a sigil that triggers mention recognition, not something that
separates parts of the reference. The matcher's generic `sep`
vocabulary stays — it still fits the `version#3` and `version:1.0.0`
cases that other handlers own.
Cuts comments that restate code mechanics, narrate the journey, or
pile on implementation detail. Genuine WHYs (perf rationale of the
to_i.to_s round-trip, the alias second-query reason, leading-zero
rejection, classic-mode preload skip) all stayed.
The matcher regex now accepts both numeric and semantic identifier shapes
(`\d+|PROJ-\d+`) on a single alternation branch shared by the bare WP
handler and the prefixed-resource handler. Without this guard,
`version#PROJ-1` reaches HashSeparator with `oid = "PROJ-1".to_i = 0`,
which is `present?`, so it issues `Version.find_by(id: 0)` — a guaranteed
miss against the wrong table. The same shape applies to every prefix
HashSeparator handles (document, message, meeting, project, user, group,
view), turning a single pasted comment into one wasted SELECT per prefix.
Gating `applicable?` on `numeric_id?(matcher.identifier)` short-circuits
the handler before any DB lookup; the regex still matches and the text
falls through to literal output. Regression test asserts zero
`FROM "versions"` SELECTs for `version#PROJ-1`.
The `value == value.to_i.to_s` round-trip check that filters leading-
zero ID forms ("0123") was duplicated across the WP link handler, the
PDF export macro, and the cost-query filter, and the matching guard in
`ResourceLinksMatcher#extract_work_package_identifier` was a no-op
(the regex's `\d+|[A-Z][A-Z0-9_]*-\d+` branch already gates the shape).
A new `WorkPackage::SemanticIdentifier.numeric_id?(value)` predicate
captures the canonical-numeric check at one site. It pairs with
`semantic_id?` as the WP-finder shape gate; the two answer different
questions (shape vs routing) and so are kept independent rather than
expressed as one another's negation.
The redundant `extract_work_package_identifier` guard is dropped along
with its misleading "rejecting leading-zero forms" comment — rejection
actually happens downstream in the link handler, where the new
predicate now reads as the contract it always was.
Comments shouldn't read as commit-history narration or as reviewer
shorthand. Each one now describes the present invariant in plain words:
- Drop "Pre-PR-E behaviour" / "matching pre-PR behaviour" / "matches the
pre-PR" — the spec and inline comments now state the current rule.
- Drop "load-bearing", "already in this codebase", "we fall back" — say
what the code does, not how a reviewer would describe it.
- Reword the `semantic_id?` rationale to a forward-looking constraint
(`don't tighten it`) instead of a comparative description of why this
module owns it.
The regex now declares each capture group by name, so `parse_match`
reads as a flat name-to-name map rather than "group 7 || 9 || 11"
arithmetic. Adding or removing a branch later doesn't ripple through
group numbers — historically that was the bug-window in this method.
Also bring the class-level docstring up to date with `##N`/`###N`
quickinfo macros (which were missing from the docs entirely) and the
new `#PROJ-N` semantic-id forms.
`#1234` text macros render `formatted_id`/`display_id` already, but the
*input* side still requires the numeric primary key. Authors typing or
pasting `#PROJ-1` (or `##PROJ-1` / `###PROJ-1`) in a comment, WP
description, or meeting body would see literal text rather than a
resolved link.
Three changes that move together:
1. `ResourceLinksMatcher.regexp` — the hash-separator branch now accepts
either the numeric shape `\d+` or the semantic shape
`[A-Z][A-Z0-9_]*-\d+`, mirroring `WorkPackage::SemanticIdentifier::
ID_ROUTE_CONSTRAINT`. The revision branch (`r\d+`) stays numeric-only
via a separate alternation. `parse_match` is the single site that
maps the new regex group indices to semantic field names; everything
else flows from there.
2. `LinkHandlers::WorkPackages#call` — splits into a numeric path
(preserving the leading-zero rejection from before) and a semantic
path. Semantic-shape input only links when `semantic_mode_active?` is
true; classic instances render literal text. Plain `#PROJ-N` requires
a cache hit (literal-text fallback when missing); `##PROJ-N` /
`###PROJ-N` quickinfo elements emit unconditionally with `data-id`
set to the user-facing identifier — APIv3 already resolves either
shape, and the frontend Angular component handles missing WPs.
Hover-card URLs now also speak `display_id` so the URL matches the
user-facing identifier (the route accepts both shapes — see
`HoverCardComponent#initialize`).
3. The preload cache extends to string keys via the
`WorkPackage.where_display_id_in` batch finder added in #23016.
`with_preloaded_resources` runs one WP SELECT for the common case
(numerics + current semantic identifiers); historical alias
references add a second targeted alias-table pluck, so an
alias-heavy doc costs at most two round-trips per render.
Specs cover: `#PROJ-N` resolves with formatted_id / display_id href in
semantic mode; classic mode leaves it as literal text and issues zero
WP SELECTs; `##/###` quickinfo carries `display_id` in `data-id`; mixed
numeric+semantic resolves in 1 SELECT; alias references resolve in 2
round-trips; `#GHOST-99` falls through cleanly; nested `format_text`
calls preserve outer save/restore semantics.
Address follow-up polish on PR 22976 review:
- Extract `OpenProject::TextFormatting::PreformattedBlocks` (BLOCKS set +
ancestor? helper) so PatternMatcherFilter and ResourceLinksMatcher share
a single source of truth for the `<pre>`/`<code>` ancestry skip.
- Lift `parse_match(match)` so `process_match` and `extract_work_package_id`
consume the same regex group → semantic name mapping.
- `with_preloaded_resources` captures `previous` unconditionally so the
`ensure` block no longer needs a `defined?(previous)` guard.
- Preload `WorkPackage.where(id: ids).select(:id, :identifier)` only —
`display_id`/`formatted_id` don't read other columns.
- N+1 spec switches from `have_a_query_limit(1)` to a SQL-notification
subscriber filtered to `FROM "work_packages"` SELECTs, avoiding false
positives from incidental Setting/User/Project queries.
Three changes from the rails-code-reviewer agent + the GitHub Copilot
reviewer:
1. Switch from `thread_mattr_accessor` to `RequestStore.store` for the
per-render WorkPackage lookup. Aligns with the existing `Cache`,
`Setting`, `CustomStyle`, and `WorkPackage#available_custom_field_key`
conventions in the codebase. Cleanup-on-request-end via
`RequestStore::Middleware` is defense in depth — the filter's explicit
save/restore is what actually scopes the cache to a single render.
2. Save/restore around the per-node loop, exposed via a single yielding
API on the matcher: `with_preloaded_resources(doc, context) { … }`.
The previous lookup is captured on entry and restored on exit, so a
nested `format_text` (custom-field formatter, recursive markdown
render, etc.) cannot clobber the outer render's lookup. The flat
`preload_for_doc`/`cleanup_after_doc` pair is replaced; the filter
recursively wraps the loop in each matcher's hook.
3. Wrap the lookup behind two class methods — `work_package_for(id)` for
the link handler, `with_preloaded_resources` for the filter — so the
RequestStore key stays a private constant. Future storage swaps don't
ripple to callers.
Plus the GitHub Copilot review's two findings:
- Skip the preload entirely in classic mode (`semantic_mode_active?`
guard). `display_id` and `formatted_id` collapse to the numeric form,
so the link handler renders the legacy shape from `wp_id` alone — no
DB load required, restoring pre-PR query-free behaviour. New spec
"classic mode is query-free" is the regression guard.
- `extract_work_package_id` now requires `prefix.nil?` so prefixed
matches like `version#3` and `message#12` don't contribute to the
preload set, mirroring `LinkHandlers::WorkPackages#applicable?`.
New specs:
- save/restore semantics: nested `with_preloaded_resources` calls
preserve the outer lookup and clear after the outer block exits.
- classic mode: `format_text("#1#2#3")` issues zero `work_packages`
SELECTs.
Refs https://github.com/opf/openproject/pull/22976#pullrequestreview-4194628380
In semantic mode, plain `#N` references inside formatted text rendered
`<a href="/work_packages/N">#N</a>`. The link should carry the
human-readable identifier on both the label and the href, matching how
the `##N` quickinfo macro already renders via Angular.
Two pieces:
1. `PatternMatcherFilter` gains an opt-in pre/cleanup hook around the
per-node loop. Matchers that own per-render lookup caches (e.g. a
batched WP load) implement `preload_for_doc` and `cleanup_after_doc`
to populate and drop them.
2. `ResourceLinksMatcher` implements those hooks: it scans every text
node for `#N` matches, runs a single batched `WorkPackage.where(id:
ids)`, and exposes the result via a thread-isolated class attribute.
The `WorkPackages` link handler reads from it to choose
`wp.formatted_id` for the label and `wp.display_id` for the href.
Falls back to the legacy `#N` shape when the WP isn't loadable
(deleted, out of scope, or no preload ran).
Visibility filtering is intentionally not introduced — the matcher
links regardless of viewer permissions on the referenced WP, preserving
pre-existing behaviour. Out of scope for this ticket.
Refs https://community.openproject.org/work_packages/74315
The activity feed renders an automatic entry whenever a parent / child /
predecessor / related work package change cascades dates onto the
current WP. Both render paths spoke numeric ids:
- The plain-text branch (used for the API JSON) hardcoded `##{id}`.
- The HTML branch went through `link_to_work_package`, which built the
visible link label as `Type ##{id}: subject`.
Both now use `formatted_id`, which produces `PROJ-7` in semantic mode
and `#42` in classic — same shape, mode-aware. The link href was
already mode-correct via `to_param`; only the visible label needed the
swap.
The performance decreased because of a combination of calls that were supposed to increase performance.
We have a mechanism in place which automatically eager loads models needed in the element representers when a collection of them is rendered. This is to avoid N+1 queries of course. But, if eager loading is combined with e.g., a LIMIT, which we do because we paginate, rails automatically falls back to issuing two instead of just one SQL statement. Which makes sense as otherwise LEFT JOINS might mess with the result set.
But Rails does so in a somewhat simple fashion. It uses the first query to get the DISTINCT ids. The second is used to load the values (without a limit). But instead of removing all WHERE statements in the second SQL statement and then apply just the one for the ids, it keeps the original WHERE statement and applies the one for the ids on top. The problem with that is that the database trips on that (I didn't check the why) and uses a less than optimal query plan.
That was the problem here as well. The first query remained reasonable quick (300ms) but the second one took 25s.
The fix is to split the two statements by hand in the representer whenever eager loading is defined. The first query has all the filters but no eager loading and fetches the ids. The second takes the ids, and with eager loading included loads the data. Et voila, second query takes 10ms.
That at least works for relations, work_packages and projects. But there are other representers that also seem to have custom behaviour. I'm looking into whether they can be easily adapted.
So far, we've been using the relaxed sanitization rule set. We only need some styles for the table display in CKEditor,
other styles should not be allowed.
The ancestors/children representer change calls `child.display_id`,
which consults `identifier` in semantic mode. The Hierarchy eager
loader preloads children with a minimal `SELECT id, subject, project_id,
parent_id` for performance, so `identifier` was missing and
`ActiveModel::MissingAttributeError` fired the moment a query endpoint
rendered a work package with visible children in semantic mode.
Add `identifier` to the SELECT. It's one extra short text column per
child row.
In semantic mode, the work package breadcrumb renders numeric IDs
instead of the semantic identifier because ancestor HAL resources are
built from `_links.ancestors[]` entries that only carry `href` and
`title`. With no top-level `displayId` in `$source`, the frontend getter
falls through to the numeric id parsed from the href.
Emit `displayId` alongside `href`/`title` on each ancestor and child
link in the representer, and have the `displayId` getter fall back to
the self link's `displayId` so resources built from a link payload
alone still surface the semantic identifier.
We delegate the resource name to the object itself, that way we can
override it in our record and no knowledge about the backlogs module is
necessary in the core.