Skip to content

feat(sof): sync upstream tests, implement %rowIndex, MongoDB conformance parity#178

Open
smunini wants to merge 2 commits into
mainfrom
feat/sof-rowindex-and-mongodb-parity
Open

feat(sof): sync upstream tests, implement %rowIndex, MongoDB conformance parity#178
smunini wants to merge 2 commits into
mainfrom
feat/sof-rowindex-and-mongodb-parity

Conversation

@smunini

@smunini smunini commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Summary

Brings the SQL-on-FHIR implementation up to date with the relocated upstream test suite and closes the %rowIndex gap that showed red on https://sql-on-fhir.org/extra/impls.html, then brings the MongoDB in-DB runner to full conformance parity with SQLite/PostgreSQL.

Three threads, delivered together because they're interdependent:

1. Sync the upstream conformance fixtures

  • The official tests moved to FHIR/sql-on-fhir.js. Resynced all 22 fixtures verbatim + added the new row_index.json (9 cases).
  • crates/sof/tests/sql-on-fhir-v2/README.md documents the upstream source + verbatim-sync policy.
  • In-memory runners now tolerate a missing top-level fhirVersion (constant_types.json dropped it) and compare rows order-insensitively, matching the upstream compliance runner.

2. Implement %rowIndex across every SOF engine

  • Primary engine (run_view_definition, used by sof-cli/sof-server, S3 in-process, ES): per-iteration index for forEach/forEachOrNull/repeat; 144/144 in-memory fixtures pass.
  • In-DB SQLite + PostgreSQL: json_each.key / WITH ORDINALITY for forEach; pre-order ord_path + DENSE_RANK in the repeat CTE. Verified against real Postgres via testcontainers.
  • Fixed second-precision time highBoundary()/lowBoundary() in the in-DB emitter (surfaced by the synced fn_boundary data).
  • Removed obsolete row_index KNOWN_SKIPS; raised SQLite/PG PASS_FLOOR 124 → 132.

3. MongoDB in-DB runner: 59 → 132 (parity)

New crates/rest/tests/sof_conformance_mongodb.rs (testcontainers, floor 132). Emitter work:

  • Array-guarded [n]/.first(); three-valued comparison (null, not false).
  • unionAll via $facet; where()/extension() via $filter with a scoped $$var; join() + collection columns via $reduce; date/time/dateTime/decimal boundary.
  • repeat: via a server-side-JS $function pre-order traversal + $unwind.
  • Coerce serde_json arbitrary_precision number wrappers (which bson::to_bson stores as objects) to real numbers in comparison/arithmetic/boundary, preserving the digit-string for decimal-boundary precision.
  • Flattened trailing-[N] forEach (contact.telecom[0]) → single full-path unnest, flattened before indexing.

The 12 remaining MongoDB failures are the identical structural gaps SQLite/PG also fail (nested/sibling repeat, unionAll nested inside another select).

Verification

  • MongoDB / PostgreSQL conformance: real DBs via testcontainers, 132 each
  • SQLite conformance 132; in-memory engine 144/144; SQLite runner, sof unit tests (39), Mongo emit unit tests (9)
  • Default (SQLite-only) build compiles; cargo fmt; clippy clean under CI flags

Note

repeat: on MongoDB relies on server-side JavaScript ($function) — enabled by default, but disabled in some hardened/Atlas configs, where repeat: views would fail at runtime. Everything else uses native aggregation operators.

smunini added 2 commits June 18, 2026 10:28
…nce parity

Sync the SQL-on-FHIR v2 conformance fixtures from the relocated upstream
(FHIR/sql-on-fhir.js) verbatim — including the new row_index.json — implement
the %rowIndex environment variable across every SOF execution engine, and bring
the MongoDB in-DB runner to full conformance parity with SQLite/PostgreSQL.

Test fixtures
- Resync all 22 fixtures byte-for-byte from upstream main + add row_index.json.
- README documents the upstream source and verbatim-sync policy.
- Make the in-memory runners tolerate a missing top-level fhirVersion
  (constant_types.json dropped it) and compare rows order-insensitively, matching
  the upstream compliance runner.

%rowIndex
- Primary engine (run_view_definition): default 0; per-iteration index threaded
  through forEach/forEachOrNull (enumerate) and repeat (pre-order flatten).
  S3 (in-process) and Elasticsearch inherit this. All 144 in-memory fixtures pass.
- In-DB SQLite/PostgreSQL compiler: json_each.key / WITH ORDINALITY for forEach,
  a pre-order ord_path ranked with DENSE_RANK in the repeat CTE.
- Fix second-precision (HH:MM:SS) and full-precision time boundary in the in-DB
  emitter (surfaced by the synced fn_boundary data).
- Drop the obsolete row_index KNOWN_SKIPS; raise SQLite/PG PASS_FLOOR 124 -> 132.

MongoDB in-DB runner: 59 -> 132 (parity)
- Array-guarded [n]/.first(); three-valued comparison (null, not false).
- unionAll via $facet; where()/extension via $filter with a scoped $$var; join()
  and collection columns via $reduce; date/time/dateTime/decimal boundary.
- repeat: via a server-side-JS $function pre-order traversal + $unwind.
- Coerce serde_json arbitrary_precision number wrappers (stored as objects by
  bson::to_bson) to real numbers in comparison/arithmetic/boundary, preserving
  the wrapper's digit-string for decimal-boundary precision.
- Flattened trailing-[N] forEach (contact.telecom[0]) compiles to a single
  full-path unnest flattened before indexing.
- New testcontainers suite crates/rest/tests/sof_conformance_mongodb.rs (floor
  132); the remaining 12 failures are the structural gaps shared with SQLite/PG
  (nested/sibling repeat, unionAll nested in another select).

Note: repeat: on MongoDB requires server-side JavaScript ($function).
The synced constant_types.json drops the top-level `fhirVersion` field
(upstream treats absence as "all versions"). Two more test parsers still
required it and panicked on parse — `test_base64_binary_constant.rs` (loads
constant_types.json directly) and, defensively, `run_foreach_tests.rs`. Make
their `fhirVersion` optional like the other conformance runners.
@claude

claude Bot commented Jun 18, 2026

Copy link
Copy Markdown

Code review

One documentation discrepancy found. Checked for bugs and CLAUDE.md compliance.


README floor out of sync with test file

File: crates/sof/tests/sql-on-fhir-v2/README.md (lines 32–34) —

- `crates/rest/tests/sof_conformance_mongodb.rs` — MongoDB via testcontainers
(floor 59; the Stage-1 aggregation emitter doesn't yet cover `unionAll`,
`repeat`, collections, and several functions).

The README describes the MongoDB conformance suite as having floor 59 and lists unionAll, repeat, and collections as unsupported:

(floor 59; the Stage-1 aggregation emitter doesn't yet cover unionAll, repeat, collections, and several functions)

But the same PR sets MONGO_PASS_FLOOR = 132 in sof_conformance_mongodb.rs

— and the new emitter implements all three constructs. This description appears to be from an earlier draft state of the PR.

Suggested fix: replace lines 32–34 with:

  - `crates/rest/tests/sof_conformance_mongodb.rs` — MongoDB via testcontainers (floor 132).

@codecov

codecov Bot commented Jun 18, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 71.13821% with 71 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
crates/persistence/src/sof/emit.rs 47.76% 35 Missing ⚠️
crates/sof/src/lib.rs 82.94% 22 Missing ⚠️
crates/persistence/src/sof/compile_view.rs 64.10% 14 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant