Advance the facade to the pin-12l MEOS catalog (stacks on #26) #27

Open
estebanzimanyi wants to merge 15 commits into MobilityDB:main from estebanzimanyi:reconcile/jmeos-pin-12l

@estebanzimanyi

Member

Summary

Advances the canonical org.mobilitydb.meos facade (the #22#26 architecture) to the pin-12l MEOS catalog. This closes the last equivalence gap between the consolidate line and the deliverable stack: the thirteen base-type operations that the facade referenced but that were absent from the catalog.

The catalog codegen/input/meos-idl.json is regenerated from the integration pin (4408 functions) so the binding surface tracks the current MEOS public API:

  • the base-type operations exposed upstream by MobilityDB#1203 (int32_cmp, int64_cmp, float8_exp, float8_ln, float8_log10, add_interval_interval, mul_interval_double, minus_date_date, minus_date_int, minus_timestamptz_interval, minus_timestamptz_timestamptz, date_to_timestamp, date_to_timestamptz),
  • the tjsonb surface,
  • the clean standalone base-type names (float8_exp, date_in, cstring_to_text … replacing the float_ / pg_ / 2 spellings — see the standalone-naming rule).

The build-time GeneratedFunctions binding regenerates from this catalog, and the MeosOps* facade is re-derived from the freshly built jar by regen_facade_from_jar.py. That tool now resolves each facade method to its jar-current backing by documented naming convention only (drop the pg_ collision prefix, 2 -> _to_, float_ -> float8_) and drops any method with no backing, so the facade auto-tracks the surface with no per-function hand map (North Star: bindings generated, zero hand special-cases). All thirteen base-type operations are retained under their clean names.
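
A minimal sketch of how such a convention-only resolver could work, assuming the three rules above are applied as plain string rewrites. The function names `candidate_names` and `resolve_facade` and the legacy spellings used in the usage note (`float_exp`, `pg_date_in`, `cstring2text`) are illustrative assumptions, not taken from regen_facade_from_jar.py itself.

```python
import re


def candidate_names(legacy_name):
    """Return possible jar-current spellings for a legacy name, in lookup order."""
    candidates = []

    def add(name):
        if name not in candidates:
            candidates.append(name)

    add(legacy_name)  # names that did not change resolve to themselves
    # Rule 1: drop the pg_ collision prefix.
    stripped = legacy_name[3:] if legacy_name.startswith("pg_") else legacy_name
    add(stripped)
    for base in (legacy_name, stripped):
        # Rule 2: a "2" wedged between two letters becomes "_to_".
        renamed = re.sub(r"(?<=[a-z])2(?=[a-z])", "_to_", base)
        # Rule 3: a standalone float_ becomes float8_ (tfloat_ is left alone).
        renamed = re.sub(r"(?<![a-z0-9])float_", "float8_", renamed)
        add(renamed)
    return candidates


def resolve_facade(facade_methods, jar_surface):
    """Map each facade backing name to a name present in the jar, or drop it."""
    resolved, dropped = {}, []
    for name in facade_methods:
        backing = next((c for c in candidate_names(name) if c in jar_surface), None)
        if backing is None:
            dropped.append(name)  # no backing in the jar: the method is dropped
        else:
            resolved[name] = backing
    return resolved, dropped
```

With a jar surface containing `float8_exp`, `date_in` and `cstring_to_text`, the assumed legacy spellings resolve to those clean names, and a name with no candidate in the jar ends up in `dropped`. Because the rules are generic, no per-function table is needed.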

Verification

  • mvn compile green; GeneratedFunctions exports all 13.
  • Test suite green apart from ten pre-existing span/box adjacency cases (testAdjacency/testAdjacent/testIsSame/IntSpanTopologicalPositionFunctions) that are unrelated to this change — the span-adjacency bindings are byte-identical across the catalog bump.

Dependencies

Supersedes the divergent consolidate line in #19.

estebanzimanyi and others added 15 commits May 29, 2026 19:08
… surface

Bump codegen/input/meos-idl.json to the MEOS-API IDL and regenerate
functions.GeneratedFunctions over the full consolidated superset: mul_* (incl.
tbigint); minDistance; the circular-buffer and network-point MF-JSON readers; the
ever- and always-covers families (ecovers_*/acovers_*); trgeo_*; the H3 /
th3index family (ever_eq_h3indexset_th3index, h3index_in/out, H3Index lowered to
long); PostgreSQL type I/O; tgeogpoint_great_circle_distance;
meos_initialize_noexit_error_handler. 2916 functions.
…ld flags

The functions.GeneratedFunctions facade is generated at build time from the MEOS
IDL with the optional type families selected by the same flag names and ON|OFF
(also 1|0) values as the MobilityDB/MEOS build: -DCBUFFER, -DNPOINT, -DPOSE,
-DRGEO, -DH3. Every family is included by default; passing -DCBUFFER=OFF (or =0)
drops that family's functions from the generated binding so a subset jar ships
without it (RGEO needs POSE). FunctionsGenerator maps each function's source
header to its family and omits excluded families; jmeos-core runs the generator
at generate-sources (so the flag flows through mvn) and compiles the generated
functions.GeneratedFunctions.
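
The family selection described above can be sketched roughly as follows. This is an illustration in Python, not the Java FunctionsGenerator: the header-to-family table (apart from meos_rgeo.h, the header names are guesses), the placement of `tpose_to_tpoint` in a pose header, the `hypothetical_cbuffer_fn` entry, and the reading of "RGEO needs POSE" as "disabling POSE also drops RGEO" are all assumptions.

```python
# Assumed mapping from source header to optional family; unmapped headers are core.
FAMILY_BY_HEADER = {
    "meos_cbuffer.h": "CBUFFER",
    "meos_npoint.h": "NPOINT",
    "meos_pose.h": "POSE",
    "meos_rgeo.h": "RGEO",
    "meos_h3.h": "H3",
}
DEPENDS_ON = {"RGEO": "POSE"}  # RGEO needs POSE


def parse_flag(value):
    """Accept ON|OFF and 1|0, as the build flags do."""
    normalized = value.strip().upper()
    if normalized in ("ON", "1"):
        return True
    if normalized in ("OFF", "0"):
        return False
    raise ValueError(f"unsupported flag value: {value!r}")


def enabled_families(flags):
    """Every family is on by default; a flag can switch it off."""
    families = set(FAMILY_BY_HEADER.values())
    enabled = {f: parse_flag(flags.get(f, "ON")) for f in families}
    for family, dependency in DEPENDS_ON.items():
        if not enabled[dependency]:
            enabled[family] = False  # assumption: a missing dependency drops the family
    return {f for f, on in enabled.items() if on}


def select_functions(functions, flags):
    """Keep core functions and those whose family is enabled."""
    on = enabled_families(flags)
    kept = []
    for fn in functions:
        family = FAMILY_BY_HEADER.get(fn["header"])
        if family is None or family in on:
            kept.append(fn["name"])
    return kept


functions = [
    {"name": "temporal_as_mfjson", "header": "meos.h"},
    {"name": "tpose_to_tpoint", "header": "meos_pose.h"},
    {"name": "trgeoinst_make", "header": "meos_rgeo.h"},
    {"name": "hypothetical_cbuffer_fn", "header": "meos_cbuffer.h"},
]
subset = select_functions(functions, {"CBUFFER": "OFF"})
```

Here `subset` keeps everything except the circular-buffer entry, which corresponds to building with -DCBUFFER=OFF.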
…try C API

MobilityDB #1137 renamed the public rigid-geometry C API from trgeo to
trgeometry. The MEOS IDL the facade is generated from adopts the new names
(verified 1:1 against the master meos_rgeo.h: 67 trgeo->trgeometry; the
trgeoinst_make instant constructor is unchanged, matching master), so the
generated functions.GeneratedFunctions and the bundled jar resolve against a
post-#1137 libmeos.

Bumps codegen/input/meos-idl.json to the public+bound MEOS surface of the
ecosystem pin: the set-set spatial-join family (edwithin/tdwithin/adisjoint
_tgeoarr_tgeoarr), the mindistance_tgeoarr_tgeoarr rename, the trgeometry
analytics (frechet/hausdorff/dyntimewarp/centroid/length/speed), tpose and
tnpoint value accessors, tcbuffer traversed-area, and the aggregate combine
functions. 3031 bound functions (was 2916).

Hoists the tier-aware MeosOps* facade (62 classes) into JMEOS so every
JVM binding inherits the one canonical Java idiom from the shared jar
instead of duplicating it per engine. The facade forwards to
functions.GeneratedFunctions under a package-private MeosOpsRuntime probe
gated by the canonical -Dmeos.enabled property; javadoc is engine-neutral.
Relocates the maintained generator (regen_facade_from_jar + the gap / sql /
tbigint / h3 emitters + parity_audit + meos-ref) under jmeos-core/tools so
the facade stays regenerated, not hand-edited; regeneration is idempotent
against the pin jar.

MeosSetSetJoin exposes the MEOS *_tgeoarr_tgeoarr family as eDwithinPairs /
tDwithinPairs / aDisjointPairs over two arrays of temporal-geometry handles:
it marshals the native pointer arrays the kernel prunes in C, keeps them
reachable across the call with reachabilityFence, and reads back the
flattened 0-based index pairs (and, for tDwithin, the per-pair tstzspanset of
in-range times). Both JVM engines call it from the shared org.mobilitydb.meos
layer, so the NxN spatial-join surface derives once. Verified against
libmeos.
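
As a rough illustration of the read-back step only, the sketch below turns a flattened buffer of 0-based indices into pairs and then into handle pairs. The buffer layout `[i0, j0, i1, j1, ...]` is an assumption made for the example; the commit does not spell out how the kernel lays out its result, and both function names are hypothetical.

```python
def unflatten_pairs(flat):
    """Turn [i0, j0, i1, j1, ...] into [(i0, j0), (i1, j1), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("flattened pair buffer must have even length")
    return [(flat[k], flat[k + 1]) for k in range(0, len(flat), 2)]


def pair_handles(flat, left, right):
    """Resolve 0-based index pairs against the two input handle arrays."""
    return [(left[i], right[j]) for i, j in unflatten_pairs(flat)]
```

For example, `pair_handles([0, 1, 2, 0], ["a", "b", "c"], ["x", "y"])` pairs `"a"` with `"y"` and `"c"` with `"x"`.
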
The IDL and bundled libmeos carry the 54a9d4bc54 public surface: the per-thread
PROJ context, the box3d_in/gbox_in parsers, and tpose_to_tpoint. The parity-gap
forwarders bind the value-at-timestamptz wrappers through their result-returning
form and drop the pointcloud initializer absent from the surface.

Compile jmeos-core for Java 17 and rewrite the type-pattern switches in STBox and
the time types as instanceof-pattern if/else chains: pattern matching for
instanceof is available in Java 17, whereas pattern matching for switch was only
finalized in Java 21. The facade bytecode then loads on the Spark Connect server's
Java 17 runtime (Spark 3.5's supported JRE), and still runs on later runtimes.

Add extract_named_surface.py, which produces meos-named-surface.json from the two
canonical sources already in the MobilityDB tree: the SQL CREATE FUNCTION catalog
(named functions, overloads, per-argument DEFAULTs -> valid call arities) and the
doxygen chain (@sqlfn on the PG wrapper, @csqlfn on the MEOS function) linking
each SQL name to its PG and MEOS C functions. This is the layer above the C-FFI
IDL from which a binding's named surface and its Spark Connect registrar are
generated, rather than hand-maintained. 1284 named functions, asMFJSON resolving
to temporal_as_mfjson with minArity 1 / maxArity 4.
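
The step from per-argument DEFAULTs to valid call arities can be sketched as below. This is a simplified illustration, not extract_named_surface.py: the argument list used in the usage note is a hypothetical declaration chosen to reproduce the 1 / 4 arities quoted above, and the comma split ignores types or literals that contain commas.

```python
import re


def parse_sql_args(arg_list):
    """Split a CREATE FUNCTION argument list into declarations and DEFAULT literals."""
    args = []
    for raw in arg_list.split(","):  # naive: breaks on types such as numeric(10,2)
        raw = raw.strip()
        if not raw:
            continue
        parts = re.split(r"\s+DEFAULT\s+", raw, maxsplit=1, flags=re.IGNORECASE)
        default = parts[1].strip() if len(parts) == 2 else None
        args.append({"decl": parts[0].strip(), "default": default})
    return args


def call_arities(args):
    """Return (minArity, maxArity) for one overload.

    PostgreSQL requires every parameter after a defaulted one to have a
    default as well, so only trailing arguments can be omitted.
    """
    max_arity = len(args)
    min_arity = max_arity
    while min_arity > 0 and args[min_arity - 1]["default"] is not None:
        min_arity -= 1
    return min_arity, max_arity
```

Calling `call_arities(parse_sql_args("temp temporal, options int4 DEFAULT 0, flags int4 DEFAULT 0, maxdecimaldigits int4 DEFAULT 15"))` yields minArity 1 and maxArity 4, so any call with one to four arguments is valid for that overload.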
…tter

extract_spark_impls.py scans the MobilitySpark UDFs (register name + field +
body GeneratedFunctions call) and joins on the named surface's SQL->MEOS C
linkage to recover canonical name -> Spark impl mechanically, so the emitter
needs no hand-written remap. The join classifies each function for emission:
single-impl (identity name over one impl), multi-impl (identity name with a
WKB-type-tag dispatch builder), and join gaps to close.
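
A rough sketch of the join and the three-way classification, assuming the scan yields (register name, primary MEOS function) pairs and the named surface maps each canonical SQL name to its MEOS C functions. The `classify` function, the data shapes, and the `atTstzspan` / `atTstzset` register names are illustrative assumptions.

```python
def classify(named_surface, spark_impls):
    """Join canonical names to Spark impls through their shared MEOS C function.

    named_surface: canonical SQL name -> set of MEOS C function names
    spark_impls:   list of (register_name, primary_meos_function)
    """
    impls_by_c_function = {}
    for register_name, c_function in spark_impls:
        impls_by_c_function.setdefault(c_function, []).append(register_name)

    result = {"single": {}, "multi": {}, "gap": []}
    for name, c_functions in sorted(named_surface.items()):
        impls = sorted(
            reg for c in c_functions for reg in impls_by_c_function.get(c, [])
        )
        if not impls:
            result["gap"].append(name)        # no Spark impl yet: a join gap to close
        elif len(impls) == 1:
            result["single"][name] = impls[0]  # identity name over one impl
        else:
            result["multi"][name] = impls      # needs a type-tag dispatch builder
    return result


named_surface = {
    "asMFJSON": {"temporal_as_mfjson"},
    "atTime": {"temporal_at_tstzspan", "temporal_at_tstzset"},
    "speed": {"tpoint_speed"},
}
spark_impls = [
    ("temporalAsMfjson", "temporal_as_mfjson"),
    ("atTstzspan", "temporal_at_tstzspan"),
    ("atTstzset", "temporal_at_tstzset"),
]
classified = classify(named_surface, spark_impls)
```

In this toy input `asMFJSON` is single-impl, `atTime` is multi-impl, and `speed` is a join gap because no scanned UDF calls `tpoint_speed`.
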
…face

generate_spark_registrar.py joins the canonical named surface with the Spark impl
scan and emits MobilitySparkConnectExtensionsGen.scala: a SparkSessionExtensions
that injects each canonical function under its identity name (asMFJSON, not
temporalAsMfjson), no hand-written remap. Shipped ScalaUDF closures live in a
companion object so they capture only the serializable UDF; the builder null-pads
the impl's optional args to the call-site arity. The 81 single-impl functions are
generated, compiled, and serve live over Spark Connect under their identity names;
the 139 multi-impl names are listed for the per-row meos_typeof_hexwkb dispatch.
…trar

A multi-impl canonical name (one SQL name over several type-specific Spark impls)
whose first argument differs in MEOS type is emitted as a single ScalaUDF that
peeks meostype_name(meos_typeof_hexwkb(arg0)) per row and routes to the impl whose
receiver type matches, with the Temporal-receiver impl as the catch-all default.
The receiver category is read from the first C-parameter type of the impl's primary
MEOS function (the last non-marshaling GeneratedFunctions call in the UDF body).
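
The per-row routing can be sketched as a small closure. The real registrar emits a Scala ScalaUDF and peeks the type with meostype_name(meos_typeof_hexwkb(arg0)); in this Python illustration that lookup is replaced by a caller-supplied `type_name_of` function, and the fake `"type:payload"` strings stand in for hex-WKB values. All names here are hypothetical.

```python
def make_dispatcher(routes, default_impl, type_name_of):
    """Build one callable that routes on the MEOS type of the first argument.

    routes:       receiver type name -> type-specific impl
    default_impl: the Temporal-receiver impl, used as the catch-all
    type_name_of: stand-in for meostype_name(meos_typeof_hexwkb(value))
    """
    def dispatch(*args):
        receiver_type = type_name_of(args[0])
        impl = routes.get(receiver_type, default_impl)
        return impl(*args)

    return dispatch


# Stub type peek for the example: the "tag" is whatever precedes the colon.
def fake_type_name(value):
    return value.split(":", 1)[0]


num_sequences = make_dispatcher(
    routes={"tgeompoint": lambda v: "tpoint impl", "tfloat": lambda v: "tnumber impl"},
    default_impl=lambda v: "temporal impl",
    type_name_of=fake_type_name,
)
```

A value tagged `tgeompoint` reaches the point-specific impl, while an unlisted tag such as `tbool` falls through to the Temporal default.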

The registrar serves the /items-collection OGC function set under identity names:
asMFJSON, stbox, the Xmin/Ymin/Xmax/Ymax/Tmin/Tmax accessors, numSequences,
sequenceN, trajectory. Functions that differ only on a later argument (atTime on
its time argument) are listed for the arg-N dispatch extension.
…th the SQL default

Several MobilitySpark UDFs register one MEOS operation under both a bare name and a
camelCase name (asText/tpointAsText, getTime/time, cumulativeLength/...). When a
canonical name's impls all share one primary MEOS function, bind the identity name
to a single impl rather than treating it as a type dispatch.

Capture each optional argument's SQL DEFAULT literal in the named surface and fill
an omitted optional argument with it, but only when the impl exposes a full
overload's worth of arguments (impl arity equals that overload's maxArity, so the
positions align); otherwise null-pad and let the impl's own default hold. This
serves asText/asEWKT (maxdecimaldigits default 15) while keeping asMFJSON at full
coordinate precision (its impl exposes a non-leading argument subset).
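
The fill rule can be sketched as follows, assuming the overload's defaults are available as a list with one slot per SQL argument (None for required positions). The function name and the impl arities in the usage note are illustrative assumptions.

```python
def fill_omitted(call_args, impl_arity, overload_defaults):
    """Complete a call up to the impl's arity.

    overload_defaults has one entry per SQL argument of the overload, so its
    length is that overload's maxArity. SQL defaults are used only when the
    impl exposes the full overload (positions align); otherwise the omitted
    arguments are null-padded and the impl's own default holds.
    """
    filled = list(call_args)
    positions_align = impl_arity == len(overload_defaults)
    for position in range(len(filled), impl_arity):
        filled.append(overload_defaults[position] if positions_align else None)
    return filled
```

For an asText-like overload with defaults `[None, 15]` and an impl of arity 2, `fill_omitted(["T"], 2, [None, 15])` supplies 15. For an asMFJSON-like overload with four SQL arguments but an impl that exposes only two of them, the omitted argument stays None.
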
Generalize the multi-impl dispatch from arg0 to the first argument position at
which the type-specific impls differ, peeking that argument's MEOS type tag: atTime
routes on its time argument (tstzspan/tstzset/tstzspanset), duration routes on its
span argument. The concrete type for a generic Span/Set receiver is taken from the
MEOS type-name embedded in the impl's primary function name.
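
Two small helpers illustrate the generalization, under the assumption that each impl is summarized by a tuple of its parameter type names. Both function names are hypothetical, and the type-name lookup simply scans the underscore-separated tokens of the primary function name.

```python
def dispatch_position(signatures):
    """Return the first argument position at which the impls' types differ."""
    shortest = min(len(signature) for signature in signatures)
    for position in range(shortest):
        if len({signature[position] for signature in signatures}) > 1:
            return position
    return None  # the impls are indistinguishable by argument type


def embedded_type(function_name, known_types):
    """Find a MEOS type name embedded in a primary function name."""
    for token in reversed(function_name.split("_")):
        if token in known_types:
            return token
    return None


at_time_impls = [
    ("temporal", "tstzspan"),
    ("temporal", "tstzset"),
    ("temporal", "tstzspanset"),
]
position = dispatch_position(at_time_impls)
concrete = embedded_type(
    "temporal_at_tstzspanset", {"tstzspan", "tstzset", "tstzspanset"}
)
```

Here `position` is 1, so atTime routes on its time argument, and `concrete` is `tstzspanset` because tokens are matched whole rather than by prefix.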

Each dispatch route carries its impl's arity and SQL-default fills, so an omitted
optional argument of the chosen impl is filled with the canonical default rather
than a null pad (duration on a tstzspanset supplies boundspan=FALSE); boolean SQL
defaults are emitted alongside integer ones.

The named surface is regenerated from the pin, so speed(tgeompoint) resolves
through the dedicated Tpoint_speed wrapper to tpoint_speed and binds to the speed
impl.

Regenerate codegen/input/meos-idl.json from the integration pin (4408
functions) so the binding surface tracks the current MEOS public API:
the base-type operations exposed by MobilityDB#1203, the tjsonb surface,
and the clean standalone base-type names (float8_exp, date_in,
cstring_to_text … replacing the float_/pg_/2 spellings). The build-time
GeneratedFunctions binding is regenerated from this catalog.

Re-derive the MeosOps* facade from the freshly built jar via
regen_facade_from_jar.py, which now resolves each facade method to its
jar-current backing by documented naming convention only (drop the pg_
collision prefix, 2->_to_, float_->float8_) and drops any method with no
backing — no per-function hand map. All thirteen base-type operations
(int32_cmp, int64_cmp, float8_exp, float8_ln, float8_log10,
add_interval_interval, mul_interval_double, minus_date_date,
minus_date_int, minus_timestamptz_interval, minus_timestamptz_timestamptz,
date_to_timestamp, date_to_timestamptz) are retained under their clean
names. Build is green; the test suite is green apart from ten pre-existing
span/box adjacency cases unrelated to this change.