Accept semantic work-package identifiers in text macros

`#1234` text macros render `formatted_id`/`display_id` already, but the
*input* side still requires the numeric primary key. Authors typing or
pasting `#PROJ-1` (or `##PROJ-1` / `###PROJ-1`) in a comment, WP
description, or meeting body would see literal text rather than a
resolved link.

Three changes that move together:

1. `ResourceLinksMatcher.regexp` — the hash-separator branch now accepts
   either the numeric shape `\d+` or the semantic shape
   `[A-Z][A-Z0-9_]*-\d+`, mirroring `WorkPackage::SemanticIdentifier::
   ID_ROUTE_CONSTRAINT`. The revision branch (`r\d+`) stays numeric-only
   via a separate alternation. `parse_match` is the single site that
   maps the new regex group indices to semantic field names; everything
   else flows from there.

2. `LinkHandlers::WorkPackages#call` — splits into a numeric path
   (preserving the leading-zero rejection from before) and a semantic
   path. Semantic-shape input only links when `semantic_mode_active?` is
   true; classic instances render literal text. Plain `#PROJ-N` requires
   a cache hit (literal-text fallback when missing); `##PROJ-N` /
   `###PROJ-N` quickinfo elements emit unconditionally with `data-id`
   set to the user-facing identifier — APIv3 already resolves either
   shape, and the frontend Angular component handles missing WPs.
   Hover-card URLs now also speak `display_id` so the URL matches the
   user-facing identifier (the route accepts both shapes — see
   `HoverCardComponent#initialize`).

3. The preload cache extends to string keys via the
   `WorkPackage.where_display_id_in` batch finder added in #23016.
   `with_preloaded_resources` runs one WP SELECT for the common case
   (numerics + current semantic identifiers); historical alias
   references add a second targeted alias-table pluck, so an
   alias-heavy doc costs at most two round-trips per render.

Specs cover: `#PROJ-N` resolves with formatted_id / display_id href in
semantic mode; classic mode leaves it as literal text and issues zero
WP SELECTs; `##/###` quickinfo carries `display_id` in `data-id`; mixed
numeric+semantic resolves in 1 SELECT; alias references resolve in 2
round-trips; `#GHOST-99` falls through cleanly; nested `format_text`
calls preserve outer save/restore semantics.
This commit is contained in:
Kabiru Mwenja
2026-05-06 20:27:46 +03:00
parent 5518ccfe6b
commit 8bddd2d24b
3 changed files with 293 additions and 68 deletions
@@ -42,50 +42,89 @@ module OpenProject::TextFormatting::Matchers
#
# Examples:
#
# #1234, ##1234, ###1234
# #1234, ##1234, ###1234, #PROJ-7, ##PROJ-7, ###PROJ-7
def call
wp_id = matcher.identifier.to_i
identifier = matcher.identifier
# Ensure that the element was entered numeric,
# prohibits links to things like #0123
return if wp_id.to_s != matcher.identifier
if WorkPackage::SemanticIdentifier.semantic_id?(identifier)
# Semantic shape is only valid when the instance is in semantic
# mode. Classic instances render the literal text fallback.
return nil unless Setting::WorkPackageIdentifier.semantic_mode_active?
render_link(wp_id, matcher)
end
def render_link(wp_id, matcher)
if ["##", "###"].include?(matcher.sep)
render_work_package_macro(wp_id, detailed: (matcher.sep === "###"))
render_for_semantic(identifier, matcher)
else
render_work_package_link(wp_id)
# Numeric branch: reject leading-zero shapes ("#0123") that round-
# trip back to a different integer.
return nil if identifier != identifier.to_i.to_s
render_for_numeric(identifier.to_i, matcher)
end
end
private
def render_work_package_macro(wp_id, detailed: false)
ApplicationController.helpers.content_tag "opce-macro-wp-quickinfo",
"",
data: { id: wp_id, detailed: }
def render_for_semantic(display_id, matcher)
if ["##", "###"].include?(matcher.sep)
# Quickinfo: the frontend Angular component does its own APIv3
# lookup and handles missing WPs. data-id carries the user-facing
# display_id straight through.
render_work_package_macro(display_id, detailed: matcher.sep == "###")
else
# Plain `#PROJ-N` link: needs the WP record for the formatted_id
# label and hover-card URL. Cache miss → literal text fallback
# rather than a broken `/work_packages/PROJ-N` URL.
wp = OpenProject::TextFormatting::Matchers::ResourceLinksMatcher.work_package_for(display_id)
return nil unless wp
render_work_package_link(wp, fallback_id: display_id)
end
end
def render_work_package_link(wp_id)
# Nil means no preload ran (classic mode, no `#N` references in the
# doc, or a render path that bypasses `PatternMatcherFilter`) OR the
# WP wasn't found. In both cases we fall back to the legacy `#N`
# shape rather than running a per-link query inside the renderer.
def render_for_numeric(wp_id, matcher)
wp = OpenProject::TextFormatting::Matchers::ResourceLinksMatcher.work_package_for(wp_id)
label = wp&.formatted_id || "##{wp_id}"
# `display_id` is the semantic identifier (PROJ-7) in semantic mode and
# the numeric id in classic mode — same field, mode-agnostic.
href_id = wp&.display_id || wp_id
if ["##", "###"].include?(matcher.sep)
# Prefer the resolved WP's display_id so `##1234` rendered in
# semantic mode also shows the user-facing identifier in the
# editor model. Cache miss (classic mode or unknown WP) keeps the
# numeric data-id, matching the pre-PR behaviour.
data_id = wp&.display_id || wp_id
render_work_package_macro(data_id, detailed: matcher.sep == "###")
else
render_work_package_link(wp, fallback_id: wp_id)
end
end
# `data-id` carries the user-facing display id (semantic in semantic
# mode, numeric in classic) end-to-end. The frontend Angular component
# passes it straight to APIv3, which resolves either shape via
# `find_by_display_id`.
def render_work_package_macro(data_id, detailed: false)
ApplicationController.helpers.content_tag "opce-macro-wp-quickinfo",
"",
data: { id: data_id, detailed: }
end
def render_work_package_link(work_package, fallback_id:)
# Nil `work_package` means no preload ran (classic mode, no
# references in the doc, or a render path that bypasses
# `PatternMatcherFilter`) OR the WP wasn't found. We fall back to
# the legacy `#N` shape rather than running a per-link query inside
# the renderer.
label = work_package&.formatted_id || "##{fallback_id}"
# `display_id` is the semantic identifier (PROJ-7) in semantic mode
# and the numeric id in classic mode — same field, mode-agnostic.
href_id = work_package&.display_id || fallback_id
link_to(label,
work_package_path_or_url(id: href_id, only_path: context[:only_path]),
class: "issue work_package",
data: {
hover_card_trigger_target: "trigger",
hover_card_url: hover_card_work_package_path(wp_id)
# The hover-card route accepts both numeric and semantic
# ids (HoverCardComponent calls find_by_display_id). Pass
# display_id so the URL matches the user-facing identifier.
hover_card_url: hover_card_work_package_path(href_id)
})
end
end
@@ -86,19 +86,27 @@ module OpenProject::TextFormatting
include ActionView::Helpers::UrlHelper
def self.regexp
# Hash and revision separators are split into separate alternation
# branches so the semantic-id identifier shape only applies to `#`
# references — `r` revisions stay numeric-only. Splitting them shifts
# the colon-separator group indices; `parse_match` is the single
# place that maps regex group numbers to semantic field names.
semantic_id = WorkPackage::SemanticIdentifier::ID_ROUTE_CONSTRAINT.source
%r{
([[[:space:]](,~\-\[>]|^) # Leading string
(!)? # Escaped marker
(([a-z0-9\-_]+):)? # Project identifier
(#{allowed_prefixes.join('|')})? # prefix
(
(\#+|r)(\d+) # separator and its identifier
([[[:space:]](,~\-\[>]|^) # Leading string [1]
(!)? # Escaped marker [2]
(([a-z0-9\-_]+):)? # Project identifier wrapper [3] + identifier [4]
(#{allowed_prefixes.join('|')})? # prefix [5]
( # [6] outer
(\#+)(#{semantic_id}) # hash sep [7] + identifier (numeric or semantic) [8]
|
(:) # or colon separator
(
[^"\s<>][^\s<>]*? # And a non-quoted value [10]
(r)(\d+) # revision sep [9] + numeric identifier [10]
|
(:) # colon separator [11]
( # [12] non-quoted-or-quoted
[^"\s<>][^\s<>]*? # And a non-quoted value
|
"([^"]+)" # Or a quoted value [11]
"([^"]+)" # Or a quoted value [13]
)
)
(?=
@@ -138,11 +146,12 @@ module OpenProject::TextFormatting
end
# Reader for the link handler. Returns the preloaded WorkPackage for the
# given numeric id, or nil if no preload is active (e.g. classic mode,
# no `#N` references in the doc, or pipeline path that bypasses
# `with_preloaded_resources`).
def self.work_package_for(id)
RequestStore.store[WORK_PACKAGES_LOOKUP_KEY]&.[](id)
# given identifier (numeric id as Integer or String, or semantic shape
# like "PROJ-7"), or nil if no preload is active (classic mode, no `#N`
# references in the doc, or pipeline path that bypasses
# `with_preloaded_resources`) or the WP couldn't be resolved.
def self.work_package_for(identifier)
RequestStore.store[WORK_PACKAGES_LOOKUP_KEY]&.[](identifier.to_s)
end
# Doc-level preload (called by `PatternMatcherFilter` around the per-node
@@ -168,48 +177,90 @@ module OpenProject::TextFormatting
return yield unless Setting::WorkPackageIdentifier.semantic_mode_active?
ids = collect_work_package_ids(doc)
return yield if ids.empty?
identifiers = collect_work_package_identifiers(doc)
return yield if identifiers.empty?
# Only `id` and `identifier` are read downstream (by `display_id` /
# `formatted_id`). Skip `description` and other heavy columns — a
# comment thread with 50+ `#N` references would otherwise pull in
# megabytes of unused payload.
RequestStore.store[WORK_PACKAGES_LOOKUP_KEY] =
WorkPackage.where(id: ids).select(:id, :identifier).index_by(&:id)
RequestStore.store[WORK_PACKAGES_LOOKUP_KEY] = build_lookup(identifiers)
yield
ensure
RequestStore.store[WORK_PACKAGES_LOOKUP_KEY] = previous
end
def self.collect_work_package_ids(doc)
ids = Set.new
def self.collect_work_package_identifiers(doc)
identifiers = Set.new
doc.search(".//text()").each do |node|
next if OpenProject::TextFormatting::PreformattedBlocks.ancestor?(node)
node.to_s.scan(regexp) { extract_work_package_id(Regexp.last_match)&.then { |id| ids << id } }
node.to_s.scan(regexp) do
extract_work_package_identifier(Regexp.last_match)&.then { identifiers << it }
end
end
ids
identifiers
end
# Returns the numeric WP id for a `#N` plain link match, or nil for any
# other shape (`##`/`###` quickinfo, prefixed `resource#id` links like
# `version#3` / `message#12`, `:`-separator resources, or leading-zero
# / non-numeric identifiers we don't link). The `prefix.nil?` guard
# mirrors `LinkHandlers::WorkPackages#applicable?` so the preload set
# only contains ids the WP handler will actually try to render.
def self.extract_work_package_id(match)
# Returns the WP identifier string for any `#N` / `##N` / `###N` (or
# semantic-shape) reference the WP link handler will try to render —
# `#PROJ-1` plain links need the WP record for the `formatted_id`
# label and hover-card URL; `##PROJ-1` / `###PROJ-1` quickinfo macros
# use it to emit the user-facing `display_id` in `data-id`. Returns
# nil for prefixed resource links (`version#3`, `message#12`),
# `:`-separator resources, and leading-zero numerics we don't link.
def self.extract_work_package_identifier(match)
parts = parse_match(match)
identifier = parts[:identifier]
return nil unless parts[:prefix].nil? && parts[:sep] == "#" && identifier.present?
return nil unless identifier == identifier.to_i.to_s
return nil unless parts[:prefix].nil? && parts[:sep]&.start_with?("#") && identifier.present?
identifier.to_i
# Accept either the semantic shape (PROJ-7) or a numeric round-trip
# (rejecting leading-zero "0123" forms that hit the regex's numeric
# branch but aren't valid PK references).
return nil unless WorkPackage::SemanticIdentifier.semantic_id?(identifier) ||
identifier == identifier.to_i.to_s
identifier
end
# Builds the per-render WP cache from a Set of identifier strings (mixed
# numeric and semantic).
#
# Step 1 — `where_display_id_in` resolves all references in one SELECT
# via id-IN / current-identifier-IN / alias-EXISTS. Rows index by
# `id.to_s` and `identifier` (the WP's *current* slug).
#
# Step 2 — any input still unmapped after Step 1 must have matched via
# the alias EXISTS subquery, since the loaded row only carries the
# current identifier. One targeted `WorkPackageSemanticAlias` lookup
# fills those mappings in. Skipped when no historical aliases are
# referenced — the common case stays at 1 SELECT.
def self.build_lookup(identifiers)
work_packages = WorkPackage.where_display_id_in(identifiers).select(:id, :identifier).to_a
lookup = index_by_id_and_identifier(work_packages)
fold_in_alias_keys(lookup, identifiers, work_packages)
lookup
end
def self.index_by_id_and_identifier(work_packages)
work_packages.each_with_object({}) do |wp, lookup|
lookup[wp.id.to_s] = wp
lookup[wp.identifier] = wp if wp.identifier.present?
end
end
private_class_method :index_by_id_and_identifier
def self.fold_in_alias_keys(lookup, identifiers, work_packages)
unmapped = identifiers.map(&:to_s) - lookup.keys
return if unmapped.empty?
wps_by_id = work_packages.index_by(&:id)
WorkPackageSemanticAlias
.where(identifier: unmapped)
.pluck(:identifier, :work_package_id)
.each { |ident, wp_id| lookup[ident] = wps_by_id[wp_id] }
end
private_class_method :fold_in_alias_keys
# Single source of truth for which regex group means what. Both
# `process_match` and `extract_work_package_id` consume this — change
# the regex layout in `regexp` and only this site needs to follow.
# `process_match` and `extract_work_package_identifier` consume this —
# change the regex layout in `regexp` and only this site needs to follow.
def self.parse_match(match)
{
leading: match[1],
@@ -217,9 +268,9 @@ module OpenProject::TextFormatting
project_prefix: match[3],
project_identifier: match[4],
prefix: match[5],
sep: match[7] || match[9],
raw_identifier: match[8] || match[10],
identifier: match[8] || match[11] || match[10]
sep: match[7] || match[9] || match[11],
raw_identifier: match[8] || match[10] || match[12],
identifier: match[8] || match[10] || match[13] || match[12]
}
end
@@ -156,4 +156,139 @@ RSpec.describe OpenProject::TextFormatting::Matchers::LinkHandlers::WorkPackages
"expected exactly one work_packages SELECT, got #{wp_selects.size}:\n#{wp_selects.join("\n")}"
end
end
describe "the `#PROJ-N` semantic reference",
with_flag: { semantic_work_package_ids: true },
with_settings: { work_packages_identifier: "semantic" } do
let(:role) { create(:project_role, permissions: %i[view_work_packages]) }
let(:author) { create(:user, member_with_roles: { project => role }) }
let(:project) { create(:project, identifier: "MACROPROJ") }
let(:work_package) { create(:work_package, project:, author:) }
before do
allow(User).to receive(:current).and_return(author)
work_package.allocate_and_register_semantic_id
end
it "renders the formatted_id label and display_id href for `#PROJ-N`" do
wp = work_package.reload
rendered = format_text("##{wp.display_id}")
expect(wp.display_id).to start_with("MACROPROJ-")
expect(rendered).to include(">#{wp.formatted_id}<")
expect(rendered).to include(%(href="/work_packages/#{wp.display_id}"))
# Hover-card URL also speaks the user-facing identifier — the
# controller's HoverCardComponent calls find_by_display_id, so the
# numeric and semantic shapes both resolve.
expect(rendered).to include(%(data-hover-card-url="/work_packages/#{wp.display_id}/hover_card"))
end
it "renders `##PROJ-N` as a quickinfo macro element with display_id in data-id" do
wp = work_package.reload
# Prepend "see " so Markly doesn't parse `##...` as an H2 ATX heading.
rendered = format_text("see ###{wp.display_id} here")
expect(rendered).to include(%(<opce-macro-wp-quickinfo data-id="#{wp.display_id}" data-detailed="false">))
end
it "renders `###PROJ-N` as a detailed quickinfo macro element" do
wp = work_package.reload
rendered = format_text("see ####{wp.display_id} here")
expect(rendered).to include(%(<opce-macro-wp-quickinfo data-id="#{wp.display_id}" data-detailed="true">))
end
context "when the referenced work package does not exist" do
it "falls back to literal text (no DB error, no broken link)" do
rendered = format_text("see #GHOST-99 here")
# No `<a>` tag, no quickinfo element — the matcher leaves the literal
# text alone when a semantic-shaped reference can't be resolved. This
# mirrors the user expectation: a `/work_packages/GHOST-99` URL would
# 404, so we'd rather show the bare text.
expect(rendered).to include("#GHOST-99")
expect(rendered).not_to include('href="/work_packages/GHOST-99"')
expect(rendered).not_to include("opce-macro-wp-quickinfo")
end
end
context "with mixed numeric and semantic references in one render" do
it "resolves both with a single work_packages SELECT" do
wps = create_list(:work_package, 2, project:, author:)
wps.each(&:allocate_and_register_semantic_id)
loaded = wps.map(&:reload)
text = "see ##{loaded[0].id} and ##{loaded[1].display_id}"
sql = []
callback = ->(_, _, _, _, v) { sql << v[:sql] unless %w[CACHE SCHEMA].include?(v[:name]) }
rendered = ActiveSupport::Notifications.subscribed(callback, "sql.active_record") { format_text(text) }
wp_selects = sql.grep(/FROM "work_packages"/i)
expect(wp_selects.size).to eq(1),
"expected exactly one work_packages SELECT, got #{wp_selects.size}:\n#{wp_selects.join("\n")}"
# Both render with the user-facing display_id, regardless of which
# form the user typed.
expect(rendered).to include(%(href="/work_packages/#{loaded[0].display_id}"))
expect(rendered).to include(%(href="/work_packages/#{loaded[1].display_id}"))
end
end
context "with a historical alias reference" do
it "resolves via the alias table with two round-trips total" do
wp = work_package.reload
# Simulate a project rename: the WP keeps its current MACROPROJ-N
# identifier on the row, but a historical OLD-prefix alias row points
# at the same WP. Authors writing pre-rename content shouldn't see
# broken refs.
WorkPackageSemanticAlias.create!(work_package_id: wp.id, identifier: "OLDPROJ-1")
sql = []
callback = ->(_, _, _, _, v) { sql << v[:sql] unless %w[CACHE SCHEMA].include?(v[:name]) }
rendered = ActiveSupport::Notifications.subscribed(callback, "sql.active_record") { format_text("see #OLDPROJ-1") }
# Two database round-trips: (1) `where_display_id_in` runs a single
# WP SELECT whose WHERE clause includes an EXISTS subquery against
# the alias table (matching by historical identifier); (2) the
# sidecar alias pluck maps the historical input string back to its
# WP for the cache. Round-trips are what we care about for N+1, not
# which tables show up in each query.
wp_selects = sql.grep(/FROM "work_packages"/i)
standalone_alias_selects = sql.grep(/FROM "work_package_semantic_aliases"/i)
.grep_v(/FROM "work_packages"/i)
expect(wp_selects.size).to eq(1)
expect(standalone_alias_selects.size).to eq(1)
# Renders against the WP's CURRENT display_id, not the historical
# alias the user typed — old content stays alive but points at the
# current identifier.
expect(rendered).to include(%(href="/work_packages/#{wp.display_id}"))
expect(rendered).to include(">#{wp.formatted_id}<")
end
end
end
describe "the `#PROJ-N` semantic reference in classic mode",
with_flag: { semantic_work_package_ids: false },
with_settings: { work_packages_identifier: "classic" } do
let(:role) { create(:project_role, permissions: %i[view_work_packages]) }
let(:project) { create(:project, identifier: "macroproj") }
let(:author) { create(:user, member_with_roles: { project => role }) }
before { allow(User).to receive(:current).and_return(author) }
it "leaves `#PROJ-1` as literal text and issues no work_packages SELECTs" do
sql = []
callback = ->(_, _, _, _, v) { sql << v[:sql] unless %w[CACHE SCHEMA].include?(v[:name]) }
rendered = ActiveSupport::Notifications.subscribed(callback, "sql.active_record") { format_text("see #PROJ-1 here") }
expect(rendered).to include("#PROJ-1")
expect(rendered).not_to include('href="/work_packages/PROJ-1"')
wp_selects = sql.grep(/FROM "work_packages"/i)
expect(wp_selects).to be_empty,
"classic mode added unexpected WP SELECTs for semantic input:\n#{wp_selects.join("\n")}"
end
end
end