
Nested Documents UI (#3382) #3667

Open
luis100 wants to merge 18 commits into development from feature/nested-documents-ui

luis100 commented Apr 30, 2026

Summary

This PR implements UI support for Solr nested documents, enabling advanced search against child document fields (e.g. individual emails inside an EmailArchive AIP) while returning parent AIPs in search results.

Closes #3382

Phases

  • Phase 0 — EmailArchive nested-document metadata schema (XSLT crosswalk, Solr schema)
  • Phase 1 — Advanced AIP Search with nested filter groups (ParentWhichFilterParameter block-join)
  • Phase 2 — Virtual Catalogue list view

Changes in this branch

EmailArchive metadata & indexing

  • Added emailarchive.xslt ingest crosswalk generating parent AIP fields and nested child email documents
  • Added HTML dissemination crosswalk for email archive display
  • Fixed state=ACTIVE on child email documents (required for Solr block-join visibility)
  • Mapped dateStart/dateEnd → standard dateInitial/dateFinal AIP fields

Search — nested block-join queries

  • ParentWhichFilterParameter / ChildOfFilterParameter support in SolrUtils.parseFilter
  • Block-join query format fixes for q.op=AND compatibility:
    • Use which="field:value" (double-quoted local param) instead of which=(field:"value")
    • Use v="child_query" local param to pass child queries — avoids outer Lucene parser splitting the child query at whitespace when the block-join is combined with other filter clauses
    • Use +field:"value" required-operator syntax in child/parent sub-queries
    • Added range parameter handlers (DateIntervalFilterParameter, DateRangeFilterParameter, LongRangeFilterParameter) in the block-join sub-query builder for clean ISO8601 range clauses
  • AllFilterParameter is not suppressed alongside block-join (harmless *:* AND prefix)
  • emailSentDate search field changed from date → date_interval (from/to date picker, proper ISO8601 format via DateIntervalFilterParameter)
  • Wildcard values in block-join child queries are emitted unquoted (+field:value*) since wildcards cannot appear inside quoted phrases
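
The query format described in the list above can be sketched in plain Java as follows. This is a minimal illustration, not the code in SolrUtils: the class and method names are hypothetical, the parent mask field `doc_kind:parent` is a made-up placeholder, and the sketch assumes that double quotes embedded in the `v` local param are backslash-escaped.

```java
// Hypothetical helper, for illustration only; not part of RODA.
final class BlockJoinQuerySketch {
  private BlockJoinQuerySketch() {
  }

  /** Escapes backslashes and double quotes so text can sit inside a double-quoted local param. */
  static String escapeForLocalParam(String text) {
    return text.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  /** Builds one required clause of a sub-query in the form +field:"value". */
  static String requiredClause(String field, String value) {
    String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
    return "+" + field + ":\"" + escaped + "\"";
  }

  /**
   * Builds {!parent which="field:value" v="child query"}. Passing the child
   * query through v keeps it inside the local params, so an outer parser
   * cannot split it at whitespace.
   */
  static String parentWhich(String maskField, String maskValue, String childQuery) {
    return "{!parent which=\"" + maskField + ":" + maskValue + "\" v=\""
      + escapeForLocalParam(childQuery) + "\"}";
  }
}
```

For a child clause on subject_txt with the value invoice, the sketch yields `{!parent which="doc_kind:parent" v="+subject_txt:\"invoice\""}`, which can then be combined with other clauses such as `*:* AND`.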

Virtual Catalogue (Phase 2)

  • New createVirtualListAndSearchPanel() in SearchWrapper for list+search panels keyed by string list ID (not class)
  • CatalogueSearch reads ui.catalogue.virtual array and constructs either:
    • Type 1 (filtered AIP): catalogue.filter = field:value → standard Filter + bindOpener
    • Type 2 (nested child docs): catalogue.childOf.filter = field:value → ChildOfFilterParameter + custom opener navigating to parent AIP
  • Virtual catalogue tabs appear after AIPs but before Representations/Files in the dropdown
  • AsyncTableCellOptions: customOpener (per-row click override) and includeNestedDocuments flag
  • SearchPanel.resolveSearchFieldScope() falls back to list ID if ui.search.fields.{listId} is configured
  • Advanced search enabled for Search_emailarchive (title, description, dates) and Search_emails (sender, subject, sent date)
  • Email subject (subject_txt) rendered via LIST hint — no more [value] array toString
  • Email sent date (sentDate_dt) rendered via DATETIME_FORMAT_SIMPLE — human-readable formatted date
  • Fix: LIST rendering hint recursive call now passes null hint to avoid ClassCastException on string inner values
  • Configured Search_emailarchive and Search_emails virtual catalogues in roda-wui.properties
  • i18n labels for all 9 supported locales

i18n

  • Added missing GWT plural forms ([one], [few]) across 7 locale files (de_AT, de_DE, es, hr, hu, pt_PT, sv_SE)

Tests

  • NestedDocumentSearchTest: full Solr round-trip test with 5 assertion layers including string-level query format verification

luis100 force-pushed the feature/nested-documents-ui branch from 4f5de6f to 3edd6bb on May 4, 2026 14:17
luis100 requested a review from hmiguim on May 4, 2026 14:29
luis100 changed the title from "WIP: Nested Documents UI (#3382)" to "Nested Documents UI (#3382)" on May 7, 2026
luis100 marked this pull request as ready for review on May 7, 2026 08:01
dosubot (bot) added the size:XXL label ("This PR changes 1000+ lines, ignoring generated files.") on May 7, 2026
dosubot (bot) added the enhancement label on May 7, 2026
luis100 and others added 18 commits May 7, 2026 09:09
Introduces the EmailArchive metadata type as the reference implementation
for Solr nested-document support in RODA — fully config-driven, zero Java.

- emailarchive.xsd: XML schema (parent mailbox + child email elements)
- emailarchive.xslt: ingest crosswalk producing nested Solr child docs via
  <field name="emails"><doc>…</doc></field> blocks; follows rakenskapsinfo pattern
- Register type in roda-wui.properties and i18n ServerMessages.properties
- EmailArchiveCrosswalkTest: 12 TestNG tests (full/minimal/no-emails fixtures)
  covering parent fields, date fields, child count, multi-value recipients,
  and absent-optional-field assertions

Part of: #3382

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `nestedType` and `nestedParentType` properties to SearchField so
config-declared fields can carry a Solr block-join context.
`SearchPanel.buildSearchFilter()` groups all nested-type fields by their
(nestedType, nestedParentType) pair and wraps each group in a
`ParentWhichFilterParameter` — following the RepresentationInformation
pattern — instead of emitting flat filter parameters.

Config registers three EmailArchive child search fields
(emailSubject, emailSender, emailSentDate) as the reference example;
any future nested metadata type benefits automatically through
`roda-wui.properties` alone — no Java changes required.

Note: the implementation uses a `nestedType` property on SearchField
rather than the `nested_group` field type originally proposed in #3661.
This is simpler (no new GWT widget), equally expressive, and avoids
adding a UI rendering path that would be unused for all current types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…tion crosswalk

Ingest XSLT now emits `title` (custodian name, emailAddress as fallback)
and a fixed `level = item` so emailarchive AIPs display correctly in the
browse list alongside other AIP types.

New HTML dissemination crosswalk renders all mailbox-level fields with
i18n labels and shows the indexed email messages as a compact table
(subject, sender, sent date, folder).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Format: "Custodian <email> (dateStart / dateEnd)"
Date range is omitted when both dates are absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without the AND operator prefix, combining AllFilterParameter (*:*) with
ParentWhichFilterParameter or ChildOfFilterParameter produced invalid Solr
syntax (*:*{!parent which=...}) that returned 0 results. Also corrects
inner parseFilterParameter calls to use false since inner builders are empty.

Update CLAUDE.md to reflect that tests use Testcontainers — no docker compose
setup or environment variables required to run tests, only Docker on the host.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add [one] forms alongside existing [=1] forms for de_AT, de_DE, es, pt_PT,
sv_SE — required by GWT for locales where 'one' is a mandatory plural
category. Add [one] and [few] forms for hr (Croatian), which also requires
'few'. Fix hu (Hungarian) by renaming [one] to [=1] since Hungarian uses
DefaultRule which has no 'one' plural category.

Affected keys: selected, representationInformation*, retentionPeriod,
liftDisposalHoldDialogMessage, disposalHoldAssociatedFromValue,
applyDisposalHoldDialogMessage, clearDisposalHoldDialogMessage,
disassociateDisposalHoldDialogMessage, disassociateDisposalScheduleDialogMessage,
disposalPolicySchedule{Day,Month,Year}Summary.

Also adds missing Swedish translation for liftDisposalHoldDialog and
Croatian translation for liftDisposalHoldDialog.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…action

Solr's standard query parser tokenizes {!parent which=...} and the child
query as SEPARATE clauses when whitespace separates them. With q.op=AND,
the child query becomes a required top-level clause matching child documents
directly, returning 0 AIP results.

Removing the space makes the child/parent filter directly adjacent to the
closing brace, so Solr correctly treats it as the sub-query for the local
params parser: {!parent which=<parentFilter>}<childFilter>.

Extend NestedDocumentSearchTest with a layer-5 assertion that uses
AllFilterParameter + ParentWhichFilterParameter (the production scenario),
covering the *:* AND {!parent ...}<children> query form.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use double-quoted which= local param: `which="field:value"` instead of `which=(field:"value")`
- Use required-operator (+) prefix in child/parent sub-queries: `(+field:"value")`
- Skip AllFilterParameter when block-join params are present to avoid spurious *:* AND prefix
- Add dedicated buildBlockJoinMask() and buildBlockJoinSubQuery() helpers to generate
  correct Solr block-join syntax without relying on the standard filter-to-query path
- Add string-level query assertion in NestedDocumentSearchTest (Layer 4) to lock in format
- Update Layer 5 to assert AllFilterParameter is suppressed when block-join is present

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…teFinal fields

Map the email archive date range fields to the standard AIP date fields so they
appear in the existing date range search facet without needing custom Solr fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…join path

The *:* AND prefix produced by AllFilterParameter alongside a block-join query
is functionally harmless — *:* matches every document so intersecting with it
leaves the block-join result unchanged. The specialization was unnecessary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lder

Wildcards (* ?) cannot appear inside quoted phrases in Solr's standard query
parser. When a BasicSearchFilterParameter value contains wildcards, emit each
whitespace-separated token as an unquoted +field:token* clause instead of the
phrase-quoted form that was causing SyntaxError.
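
A rough sketch of the wildcard handling described above, in plain Java. The class and method names are hypothetical, and the sketch assumes tokens are emitted exactly as typed (it does not append a trailing `*` and does not escape other special characters in unquoted tokens).

```java
// Hypothetical helper, for illustration only; not part of RODA.
final class WildcardClauseSketch {
  private WildcardClauseSketch() {
  }

  static boolean hasWildcard(String value) {
    return value.indexOf('*') >= 0 || value.indexOf('?') >= 0;
  }

  /**
   * Returns a phrase-quoted required clause for plain values, or one
   * unquoted required clause per whitespace-separated token when the
   * value contains a wildcard.
   */
  static String clause(String field, String value) {
    String trimmed = value.trim();
    if (!hasWildcard(trimmed)) {
      String escaped = trimmed.replace("\\", "\\\\").replace("\"", "\\\"");
      return "+" + field + ":\"" + escaped + "\"";
    }
    StringBuilder query = new StringBuilder();
    for (String token : trimmed.split("\\s+")) {
      if (query.length() > 0) {
        query.append(' ');
      }
      query.append('+').append(field).append(':').append(token);
    }
    return query.toString();
  }
}
```

With this sketch, the value `annual rep*` on subject_txt becomes `+subject_txt:annual +subject_txt:rep*`, while `annual report` stays a quoted phrase.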

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Change emailSentDate search field type from 'date' to 'date_interval' so the UI
  uses a proper from/to date picker and creates DateIntervalFilterParameter with
  Date objects rather than SimpleFilterParameter with Date.toString() (which
  produces 'Mon Mar 15 00:00:00 GMT+000 2021' — invalid for Solr)
- Add explicit DateIntervalFilterParameter, DateRangeFilterParameter and
  LongRangeFilterParameter handlers in buildBlockJoinSubQueryClause to emit
  clean +field:[from TO to] range clauses with proper ISO8601 dates instead
  of going through the verbose appendRangeInterval fallback
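
The difference between Date.toString() and an ISO8601 range clause can be illustrated with a small Java sketch. The class and method names are hypothetical, and treating a missing bound as `*` (open-ended range) is an assumption rather than something this PR states.

```java
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical helper, for illustration only; not part of RODA.
final class DateRangeClauseSketch {
  private DateRangeClauseSketch() {
  }

  /** Formats a bound as an ISO8601 UTC instant; a null bound becomes "*" (assumed). */
  static String bound(Date date) {
    return date == null ? "*" : DateTimeFormatter.ISO_INSTANT.format(date.toInstant());
  }

  /** Builds a required range clause in the form +field:[from TO to]. */
  static String rangeClause(String field, Date from, Date to) {
    return "+" + field + ":[" + bound(from) + " TO " + bound(to) + "]";
  }
}
```

For sentDate_dt with bounds 2021-03-15T00:00:00Z and 2021-03-31T23:59:59Z, the sketch produces `+sentDate_dt:[2021-03-15T00:00:00Z TO 2021-03-31T23:59:59Z]`.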

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Solr block-join nested documents (e.g. email children inside an EmailArchive
AIP) were leaking into the normal AIP catalogue search, showing up as
ghost rows with no title/level and navigating to broken child UUIDs on click.

Adds a -_nest_path_:* filter query to all IndexedAIP find/cursor/suggest
paths when ChildOfFilterParameter is not present. Behaviour is also exposed
as a configurable flag:

  FindRequest.includeNestedDocuments (default: false)

Clients can set ui.lists.<listId>.includeNestedDocuments = true in
roda-wui.properties to opt-in to receiving nested documents in a specific
list (used by the upcoming virtual catalogue for EmailArchive).
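
The decision described above can be summarised in a short Java sketch. Only the `-_nest_path_:*` filter and the includeNestedDocuments flag come from this commit; the class, method, and parameter names are hypothetical, and filter queries are modelled as plain strings rather than SolrJ objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of RODA.
final class NestedDocumentFilterSketch {
  static final String EXCLUDE_NESTED = "-_nest_path_:*";

  private NestedDocumentFilterSketch() {
  }

  /**
   * Returns a copy of the existing filter queries, adding the exclusion of
   * nested child documents unless the request targets children explicitly
   * (a ChildOfFilterParameter is present) or the caller opted in.
   */
  static List<String> filterQueries(List<String> existing, boolean hasChildOfFilter,
    boolean includeNestedDocuments) {
    List<String> filterQueries = new ArrayList<>(existing);
    if (!hasChildOfFilter && !includeNestedDocuments) {
      filterQueries.add(EXCLUDE_NESTED);
    }
    return filterQueries;
  }
}
```

With the default flag value, a normal AIP search gains the exclusion filter; a list configured with includeNestedDocuments = true leaves the filter queries unchanged.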

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… in virtual catalogues

- Fix LIST rendering hint bug: recursive call passed LIST hint causing
  ClassCastException when inner values are strings; now passes null
- Apply LIST rendering hint to subject_txt so multivalued subjects
  display cleanly instead of as Java array toString ([value])
- Apply DATETIME_FORMAT_SIMPLE rendering hint to sentDate_dt for
  human-readable date formatting
- Enable advanced search for Search_emailarchive and Search_emails
  virtual catalogue tabs
- Add explicit search fields for Search_emailarchive (title,
  description, dates) to avoid inheriting nestedType email fields
  from the IndexedAIP scope fallback
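
The LIST rendering fix in the first bullet can be pictured with the following Java sketch. It is not the RODA renderer: the enum, class, and method names are hypothetical, and the ", " separator is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical renderer, for illustration only; not part of RODA.
final class RenderingHintSketch {
  enum Hint {
    LIST
  }

  private RenderingHintSketch() {
  }

  /**
   * Renders a value according to its hint. For LIST, each inner value is
   * rendered with a null hint; passing LIST down again would try to cast
   * an inner String to List and throw ClassCastException.
   */
  static String render(Object value, Hint hint) {
    if (value == null) {
      return "";
    }
    if (hint == Hint.LIST) {
      List<?> items = (List<?>) value;
      return items.stream().map(item -> render(item, null)).collect(Collectors.joining(", "));
    }
    return String.valueOf(value);
  }
}
```

A single-element subject list then renders as the bare subject text instead of the bracketed array form.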

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
luis100 force-pushed the feature/nested-documents-ui branch from 4da8efb to 51c85d0 on May 7, 2026 08:16