Nested Documents UI (#3382) #3667
Open
luis100 wants to merge 18 commits into
Introduces the EmailArchive metadata type as the reference implementation for Solr nested-document support in RODA: fully config-driven, zero Java.
- emailarchive.xsd: XML schema (parent mailbox + child email elements)
- emailarchive.xslt: ingest crosswalk producing nested Solr child docs via <field name="emails"><doc>…</doc></field> blocks; follows rakenskapsinfo pattern
- Register type in roda-wui.properties and i18n ServerMessages.properties
- EmailArchiveCrosswalkTest: 12 TestNG tests (full/minimal/no-emails fixtures) covering parent fields, date fields, child count, multi-value recipients, and absent-optional-field assertions
Part of: #3382
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `nestedType` and `nestedParentType` properties to SearchField so config-declared fields can carry a Solr block-join context. `SearchPanel.buildSearchFilter()` groups all nested-type fields by their (nestedType, nestedParentType) pair and wraps each group in a `ParentWhichFilterParameter`, following the RepresentationInformation pattern, instead of emitting flat filter parameters.

Config registers three EmailArchive child search fields (emailSubject, emailSender, emailSentDate) as the reference example; any future nested metadata type benefits automatically through `roda-wui.properties` alone, with no Java changes required.

Note: the implementation uses a `nestedType` property on SearchField rather than the `nested_group` field type originally proposed in #3661. This is simpler (no new GWT widget), equally expressive, and avoids adding a UI rendering path that would be unused for all current types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
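The grouping step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `SearchPanel.buildSearchFilter()`: the `Field` class, the `groupNested` method, and the `email` / `emailarchive` type values are assumptions made for the example; only the three field names come from the commit message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a configured search field; not RODA's SearchField class.
final class NestedFieldGrouping {

  static final class Field {
    final String name;
    final String nestedType;       // null for ordinary (flat) fields
    final String nestedParentType; // null for ordinary (flat) fields

    Field(String name, String nestedType, String nestedParentType) {
      this.name = name;
      this.nestedType = nestedType;
      this.nestedParentType = nestedParentType;
    }
  }

  /** Groups nested field names by their (nestedType, nestedParentType) pair, in encounter order. */
  static Map<List<String>, List<String>> groupNested(List<Field> fields) {
    Map<List<String>, List<String>> groups = new LinkedHashMap<>();
    for (Field f : fields) {
      if (f.nestedType == null || f.nestedParentType == null) {
        continue; // flat fields keep producing ordinary filter parameters
      }
      List<String> key = Arrays.asList(f.nestedType, f.nestedParentType);
      groups.computeIfAbsent(key, k -> new ArrayList<>()).add(f.name);
    }
    return groups;
  }

  public static void main(String[] args) {
    List<Field> fields = Arrays.asList(
        new Field("title", null, null),
        new Field("emailSubject", "email", "emailarchive"),
        new Field("emailSender", "email", "emailarchive"));
    // Each map entry would become one block-join wrapper around its fields.
    System.out.println(groupNested(fields));
  }
}
```

Keying on the pair rather than on `nestedType` alone keeps fields that share a child type but belong to different parent types in separate block-join groups.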
…tion crosswalk

Ingest XSLT now emits `title` (custodian name, emailAddress as fallback) and a fixed `level = item` so emailarchive AIPs display correctly in the browse list alongside other AIP types.

New HTML dissemination crosswalk renders all mailbox-level fields with i18n labels and shows the indexed email messages as a compact table (subject, sender, sent date, folder).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Format: "Custodian <email> (dateStart / dateEnd)" Date range is omitted when both dates are absent. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without the AND operator prefix, combining AllFilterParameter (*:*) with
ParentWhichFilterParameter or ChildOfFilterParameter produced invalid Solr
syntax (*:*{!parent which=...}) that returned 0 results. Also corrects
inner parseFilterParameter calls to use false since inner builders are empty.
Update CLAUDE.md to reflect that tests use Testcontainers — no docker compose
setup or environment variables required to run tests, only Docker on the host.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
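The effect of the missing AND prefix can be reproduced with a small sketch. The `appendClause` helper and its boolean parameter are hypothetical and only mimic the behaviour the commit message describes; `{!parent which=...}` is kept as the placeholder used in the message.

```java
// Hypothetical helper; the real logic lives in RODA's filter-to-query code.
final class AndPrefixSketch {

  /** Appends a clause, joining with " AND " when requested and the builder is non-empty. */
  static void appendClause(StringBuilder query, String clause, boolean prefixWithAnd) {
    if (prefixWithAnd && query.length() > 0) {
      query.append(" AND ");
    }
    query.append(clause);
  }

  public static void main(String[] args) {
    StringBuilder broken = new StringBuilder("*:*");
    appendClause(broken, "{!parent which=...}", false);
    System.out.println(broken); // *:*{!parent which=...}

    StringBuilder fixed = new StringBuilder("*:*");
    appendClause(fixed, "{!parent which=...}", true);
    System.out.println(fixed); // *:* AND {!parent which=...}
  }
}
```

Passing `false` is still appropriate when the target builder is empty, which matches the note about the inner calls.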
Add [one] forms alongside existing [=1] forms for de_AT, de_DE, es, pt_PT,
sv_SE — required by GWT for locales where 'one' is a mandatory plural
category. Add [one] and [few] forms for hr (Croatian), which also requires
'few'. Fix hu (Hungarian) by renaming [one] to [=1] since Hungarian uses
DefaultRule which has no 'one' plural category.
Affected keys: selected, representationInformation*, retentionPeriod,
liftDisposalHoldDialogMessage, disposalHoldAssociatedFromValue,
applyDisposalHoldDialogMessage, clearDisposalHoldDialogMessage,
disassociateDisposalHoldDialogMessage, disassociateDisposalScheduleDialogMessage,
disposalPolicySchedule{Day,Month,Year}Summary.
Also adds missing Swedish translation for liftDisposalHoldDialog and
Croatian translation for liftDisposalHoldDialog.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…action
Solr's standard query parser tokenizes {!parent which=...} and the child
query as SEPARATE clauses when whitespace separates them. With q.op=AND,
the child query becomes a required top-level clause matching child documents
directly, returning 0 AIP results.
Removing the space makes the child/parent filter directly adjacent to the
closing brace, so Solr correctly treats it as the sub-query for the local
params parser: {!parent which=<parentFilter>}<childFilter>.
Extend NestedDocumentSearchTest with a layer-5 assertion that uses
AllFilterParameter + ParentWhichFilterParameter (the production scenario),
covering the *:* AND {!parent ...}<children> query form.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
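A minimal sketch of the adjacency rule, assuming a hypothetical `parentWhich` helper and illustrative field names and values. It only shows the string shape described in the commit message and does not exercise Solr itself.

```java
// Hypothetical helper illustrating the query shape; not the code in SolrUtils.
final class BlockJoinAdjacency {

  /** Builds {!parent which=<parentFilter>}<childFilter> with no space after the closing brace. */
  static String parentWhich(String parentFilter, String childFilter) {
    return "{!parent which=" + parentFilter + "}" + childFilter;
  }

  public static void main(String[] args) {
    // Field names and values are illustrative only.
    String blockJoin = parentWhich("level:item", "subject_txt:invoice");
    // The production scenario combines the match-all clause with the block-join.
    System.out.println("*:* AND " + blockJoin);
  }
}
```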
- Use double-quoted which= local param: `which="field:value"` instead of `which=(field:"value")`
- Use required-operator (+) prefix in child/parent sub-queries: `(+field:"value")`
- Skip AllFilterParameter when block-join params are present to avoid spurious *:* AND prefix
- Add dedicated buildBlockJoinMask() and buildBlockJoinSubQuery() helpers to generate correct Solr block-join syntax without relying on the standard filter-to-query path
- Add string-level query assertion in NestedDocumentSearchTest (Layer 4) to lock in format
- Update Layer 5 to assert AllFilterParameter is suppressed when block-join is present

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
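The query format listed above might be produced along these lines. This is a sketch under assumptions: `mask`, `subQuery`, and `escapePhrase` are hypothetical names (the actual `buildBlockJoinMask()` and `buildBlockJoinSubQuery()` signatures are not shown in this PR text), and the field names are illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the string format only; not RODA's helpers.
final class BlockJoinFormat {

  /** Escapes backslashes and double quotes so a value can sit inside a quoted phrase. */
  static String escapePhrase(String value) {
    return value.replace("\\", "\\\\").replace("\"", "\\\"");
  }

  /** Double-quoted mask local param: which="field:value". Assumes a value without quotes. */
  static String mask(String field, String value) {
    return "which=\"" + field + ":" + value + "\"";
  }

  /** Sub-query of required clauses: (+f1:"v1" +f2:"v2"). */
  static String subQuery(Map<String, String> fieldValues) {
    StringBuilder sb = new StringBuilder("(");
    for (Map.Entry<String, String> e : fieldValues.entrySet()) {
      if (sb.length() > 1) {
        sb.append(' ');
      }
      sb.append('+').append(e.getKey()).append(":\"").append(escapePhrase(e.getValue())).append('"');
    }
    return sb.append(')').toString();
  }

  static String parentWhich(String maskField, String maskValue, Map<String, String> children) {
    return "{!parent " + mask(maskField, maskValue) + "}" + subQuery(children);
  }

  public static void main(String[] args) {
    Map<String, String> children = new LinkedHashMap<>();
    children.put("subject_txt", "annual report");
    System.out.println(parentWhich("level", "item", children));
  }
}
```

A string-level assertion on this output is the kind of check the Layer 4 test is described as locking in.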
…teFinal fields

Map the email archive date range fields to the standard AIP date fields so they appear in the existing date range search facet without needing custom Solr fields.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…join path

The *:* AND prefix produced by AllFilterParameter alongside a block-join query is functionally harmless: *:* matches every document, so intersecting with it leaves the block-join result unchanged. The specialization was unnecessary.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…lder

Wildcards (* ?) cannot appear inside quoted phrases in Solr's standard query parser. When a BasicSearchFilterParameter value contains wildcards, emit each whitespace-separated token as an unquoted +field:token* clause instead of the phrase-quoted form that was causing SyntaxError.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
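The wildcard handling could look roughly like the sketch below. The `clause` method is hypothetical, tokens are emitted exactly as typed (no `*` is appended, which is an assumption about the intended behaviour), and escaping of other query-syntax characters is left out for brevity.

```java
// Hypothetical sketch; not the code handling BasicSearchFilterParameter in RODA.
final class WildcardClauses {

  /** Returns a phrase clause, or one unquoted required clause per token when wildcards are present. */
  static String clause(String field, String value) {
    boolean hasWildcard = value.indexOf('*') >= 0 || value.indexOf('?') >= 0;
    if (!hasWildcard) {
      // No wildcards: keep the phrase-quoted form.
      return "+" + field + ":\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
    StringBuilder sb = new StringBuilder();
    for (String token : value.trim().split("\\s+")) {
      if (token.isEmpty()) {
        continue;
      }
      if (sb.length() > 0) {
        sb.append(' ');
      }
      // Unquoted so that * and ? keep their wildcard meaning.
      sb.append('+').append(field).append(':').append(token);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.println(clause("emailSubject", "annual report")); // +emailSubject:"annual report"
    System.out.println(clause("emailSubject", "inv* 2021"));     // +emailSubject:inv* +emailSubject:2021
  }
}
```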
- Change emailSentDate search field type from 'date' to 'date_interval' so the UI uses a proper from/to date picker and creates DateIntervalFilterParameter with Date objects rather than SimpleFilterParameter with Date.toString() (which produces 'Mon Mar 15 00:00:00 GMT+000 2021', invalid for Solr)
- Add explicit DateIntervalFilterParameter, DateRangeFilterParameter and LongRangeFilterParameter handlers in buildBlockJoinSubQueryClause to emit clean +field:[from TO to] range clauses with proper ISO8601 dates instead of going through the verbose appendRangeInterval fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
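A short sketch of the range clause format, using `java.time.Instant` for the ISO8601 rendering. The `dateRange` method is hypothetical, and treating a missing bound as `*` (open-ended) is an assumption made for the example.

```java
import java.time.Instant;

// Hypothetical sketch of the clause format; not buildBlockJoinSubQueryClause itself.
final class RangeClauseSketch {

  /** Builds +field:[from TO to]; a null bound is rendered as the open-ended marker *. */
  static String dateRange(String field, Instant from, Instant to) {
    String lower = from == null ? "*" : from.toString(); // Instant.toString() is ISO-8601 in UTC
    String upper = to == null ? "*" : to.toString();
    return "+" + field + ":[" + lower + " TO " + upper + "]";
  }

  public static void main(String[] args) {
    Instant from = Instant.parse("2021-03-15T00:00:00Z");
    Instant to = Instant.parse("2021-03-31T23:59:59Z");
    System.out.println(dateRange("sentDate_dt", from, to));
    // +sentDate_dt:[2021-03-15T00:00:00Z TO 2021-03-31T23:59:59Z]
  }
}
```

Formatting from a typed instant, rather than from `Date.toString()`, is what keeps the locale-style string out of the query.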
Solr block-join nested documents (e.g. email children inside an EmailArchive AIP) were leaking into the normal AIP catalogue search, showing up as ghost rows with no title/level and navigating to broken child UUIDs on click.

Adds a -_nest_path_:* filter query to all IndexedAIP find/cursor/suggest paths when ChildOfFilterParameter is not present.

Behaviour is also exposed as a configurable flag: FindRequest.includeNestedDocuments (default: false). Clients can set ui.lists.<listId>.includeNestedDocuments = true in roda-wui.properties to opt in to receiving nested documents in a specific list (used by the upcoming virtual catalogue for EmailArchive).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
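The exclusion rule can be summarised in a few lines. This is a sketch only: the `filterQueries` method and its parameters are hypothetical, and the `state:ACTIVE` base filter is just a sample value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the decision; not the code in RODA's index service.
final class NestedExclusionSketch {

  static final String EXCLUDE_NESTED = "-_nest_path_:*";

  /** Adds the nested-document exclusion unless the request targets children or opts in. */
  static List<String> filterQueries(List<String> base, boolean hasChildOfFilter,
      boolean includeNestedDocuments) {
    List<String> fqs = new ArrayList<>(base);
    if (!hasChildOfFilter && !includeNestedDocuments) {
      fqs.add(EXCLUDE_NESTED);
    }
    return fqs;
  }

  public static void main(String[] args) {
    List<String> base = Arrays.asList("state:ACTIVE");
    System.out.println(filterQueries(base, false, false)); // default catalogue search
    System.out.println(filterQueries(base, true, false));  // child-of query keeps children
    System.out.println(filterQueries(base, false, true));  // list opted in via configuration
  }
}
```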
… in virtual catalogues

- Fix LIST rendering hint bug: recursive call passed LIST hint causing ClassCastException when inner values are strings; now passes null
- Apply LIST rendering hint to subject_txt so multivalued subjects display cleanly instead of as Java array toString ([value])
- Apply DATETIME_FORMAT_SIMPLE rendering hint to sentDate_dt for human-readable date formatting
- Enable advanced search for Search_emailarchive and Search_emails virtual catalogue tabs
- Add explicit search fields for Search_emailarchive (title, description, dates) to avoid inheriting nestedType email fields from the IndexedAIP scope fallback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
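The LIST hint fix can be illustrated with a simplified renderer. Everything here is a stand-in: the `Hint` enum, the `render` method, and the `", "` separator are assumptions for the example and do not reflect RODA's rendering code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified renderer illustrating the recursion fix.
final class ListHintSketch {

  enum Hint { LIST }

  static String render(Object value, Hint hint) {
    if (hint == Hint.LIST) {
      // This cast is what fails if a plain String arrives here with the LIST hint.
      List<?> items = (List<?>) value;
      StringBuilder sb = new StringBuilder();
      for (Object item : items) {
        if (sb.length() > 0) {
          sb.append(", ");
        }
        // Inner values are rendered without the LIST hint (null), as in the fix.
        sb.append(render(item, null));
      }
      return sb.toString();
    }
    return String.valueOf(value);
  }

  public static void main(String[] args) {
    System.out.println(render(Arrays.asList("Budget", "Q1"), Hint.LIST)); // Budget, Q1
  }
}
```

Passing `Hint.LIST` in the recursive call instead of `null` would attempt to cast each inner `String` to `List` and throw `ClassCastException`, which is the failure the first bullet describes.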
Summary
This PR implements UI support for Solr nested documents, enabling advanced search against child document fields (e.g. individual emails inside an EmailArchive AIP) while returning parent AIPs in search results.
Closes #3382
Phases
`ParentWhichFilterParameter` block-join)

Changes in this branch
EmailArchive metadata & indexing
- `emailarchive.xslt` ingest crosswalk generating parent AIP fields and nested child email documents
- `state=ACTIVE` on child email documents (required for Solr block-join visibility)
- `dateStart`/`dateEnd` → standard `dateInitial`/`dateFinal` AIP fields

Search: nested block-join queries
- `ParentWhichFilterParameter`/`ChildOfFilterParameter` support in `SolrUtils.parseFilter`
- `q.op=AND` compatibility:
  - `which="field:value"` (double-quoted local param) instead of `which=(field:"value")`
  - `v="child_query"` local param to pass child queries; avoids the outer Lucene parser splitting the child query at whitespace when the block-join is combined with other filter clauses
  - `+field:"value"` required-operator syntax in child/parent sub-queries
- Explicit range handlers (`DateIntervalFilterParameter`, `DateRangeFilterParameter`, `LongRangeFilterParameter`) in the block-join sub-query builder for clean ISO8601 range clauses
- `AllFilterParameter` is not suppressed alongside block-join (harmless `*:* AND` prefix)
- `emailSentDate` search field changed from `date` → `date_interval` (from/to date picker, proper ISO8601 format via `DateIntervalFilterParameter`)
- Wildcard values emitted as unquoted per-token clauses (`+field:value*`) since wildcards cannot appear inside quoted phrases

Virtual Catalogue (Phase 2)
- `createVirtualListAndSearchPanel()` in `SearchWrapper` for list+search panels keyed by string list ID (not class)
- `CatalogueSearch` reads the `ui.catalogue.virtual` array and constructs either:
  - `catalogue.filter = field:value` → standard `Filter` + `bindOpener`
  - `catalogue.childOf.filter = field:value` → `ChildOfFilterParameter` + custom opener navigating to parent AIP
- `AsyncTableCellOptions`: `customOpener` (per-row click override) and `includeNestedDocuments` flag
- `SearchPanel.resolveSearchFieldScope()` falls back to list ID if `ui.search.fields.{listId}` is configured
- Search fields for `Search_emailarchive` (title, description, dates) and `Search_emails` (sender, subject, sent date)
- Multivalued subjects (`subject_txt`) rendered via `LIST` hint; no more `[value]` array toString
- Sent dates (`sentDate_dt`) rendered via `DATETIME_FORMAT_SIMPLE` as a human-readable formatted date
- `LIST` rendering hint recursive call now passes `null` hint to avoid `ClassCastException` on string inner values
- `Search_emailarchive` and `Search_emails` virtual catalogues in `roda-wui.properties`

i18n
- Plural form fixes (`[one]`, `[few]`) across 7 locale files (de_AT, de_DE, es, hr, hu, pt_PT, sv_SE)

Tests
- `NestedDocumentSearchTest`: full Solr round-trip test with 5 assertion layers including string-level query format verification