Register the catalog-generated UDF dispatch surface #28
Open
estebanzimanyi wants to merge 3 commits into
tools/codegen_spark_udfs.py emits MobilitySpark UDF-registration classes from the MEOS-API catalog (output/meos-idl.json), resolving each SQL name to its MEOS-C backing via the @sqlfn / @sqlop map (MEOS-API MobilityDB#18). Two modes:
- SINGLE: one backing -> a 1:1 UDF (type-marshalling: each MEOS C type <-> its parse-from-String / serialize-to-String form).
- DISPATCH: an overloaded SQL name / operator (overlaps via &&, stbox(geom,time), timeSpan) -> ONE UDF that classifies each arg by its MEOS type and routes to the catalog-determined backing.

Classification is MEOS-driven and wire-format-safe: spans/stboxes/geometries travel as TEXT, only temporals as hex, so the leading token disambiguates ('['/'(' span, STBOX stbox, hex temporal, else geometry) and temporal_from_hexwkb is never fed a non-temporal. Emitted lambdas call only static GeneratedFunctions (no captured state -> Spark-serializable). Zero hand heuristics, zero new MEOS functions.
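The leading-token classification described above can be sketched in a few lines. This is only an illustration of the rule as stated in the commit message, written in Python for brevity; the emitted UDFs are Java, and the function names `classify` and `route`, the even-length hex check, and the backing labels in the table are assumptions made for the example, not the generator's actual output.

```python
import string

HEX_DIGITS = frozenset(string.hexdigits)


def classify(arg: str) -> str:
    """Classify a wire-format string by its leading token.

    Rule taken from the commit message: '[' or '(' -> span,
    STBOX -> stbox, hex -> temporal, anything else -> geometry.
    """
    text = arg.strip()
    if not text:
        raise ValueError("empty argument")
    if text[0] in "[(":
        return "span"
    if text.startswith("STBOX"):
        return "stbox"
    # Assumption: a hex-WKB payload is an even-length run of hex digits.
    if len(text) % 2 == 0 and all(c in HEX_DIGITS for c in text):
        return "temporal"
    return "geometry"


# Hypothetical routing table; the real one is derived from the catalog,
# and these backing labels are placeholders, not MEOS function names.
OVERLAPS_BACKINGS = {
    ("span", "span"): "backing_span_span",
    ("stbox", "stbox"): "backing_stbox_stbox",
    ("temporal", "temporal"): "backing_temporal_temporal",
}


def route(table: dict, *args: str) -> str:
    """Pick the backing for the classified argument kinds."""
    kinds = tuple(classify(a) for a in args)
    if kinds not in table:
        raise TypeError(f"no backing for argument kinds {kinds}")
    return table[kinds]
```

Because only a string classified as temporal would be handed to the hex parser, a span or geometry argument never reaches temporal_from_hexwkb in this scheme.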
…roup Generalize the generator over the whole JMEOS public surface (was a 4-UDF POC): mirror JMEOS FunctionsGenerator's marshalling conventions:
- temporals / spans / sets / boxes / jsonb as hex-WKB or type text
- TimestampTz as OffsetDateTime
- DateADT as int
- bool f(.., result) out-params dropped, with their value returned instead

Cross-check every emission against the JMEOS jar signatures (arity + return kind) so a collapsed catalog type can never miscompile. Organize the emitted UDFs into one class per doxygen @ingroup module (the reference-manual structure, so a function is found in the same place across tools), excluding meos_internal_* and splitting oversized groups to stay under the JVM class limits. Emits ~2200 1:1 UDFs, compiling green.
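A minimal sketch of the per-group class layout, assuming a simple fixed-size split. The `GeneratedUdfs_<group>` prefix follows the one class name this PR mentions (GeneratedUdfs_portable_comparison); the function name `split_group`, the `_N` suffix for split classes, and the `max_per_class` threshold are hypothetical choices for illustration, since the commit message does not say how the generator chooses its split points.

```python
from typing import Dict, List


def split_group(group: str, functions: List[str],
                max_per_class: int) -> Dict[str, List[str]]:
    """Map one doxygen group to one or more emitted class names."""
    if max_per_class < 1:
        raise ValueError("max_per_class must be positive")
    # Internal groups are excluded from the emitted surface.
    if group.startswith("meos_internal_"):
        return {}
    chunks = [functions[i:i + max_per_class]
              for i in range(0, len(functions), max_per_class)]
    classes: Dict[str, List[str]] = {}
    for index, chunk in enumerate(chunks, start=1):
        # Assumption: only split groups get a numeric suffix.
        suffix = "" if len(chunks) == 1 else f"_{index}"
        classes[f"GeneratedUdfs_{group}{suffix}"] = chunk
    return classes
```

Keeping the group name in the class name preserves the reference-manual lookup path even when a group is split across several classes.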
Add a dispatch pass: each portable comparison bare name (everEq/everNe/everLt/everLe/everGt/everGe, alwaysEq.., tempEq..; RFC #920 / contract families everComparison/alwaysComparison/temporalComparison) is emitted ONCE, wrapping its MEOS superclass entrypoint (ever_<op>_temporal_temporal / always_<op>_temporal_temporal / temporal_<op>). That entrypoint dispatches every concrete temporal type internally from the type-erased hex-WKB string, so Spark needs no per-type overload and no Java type-inspection. 18 bare names, emitted into GeneratedUdfs_portable_comparison, compiling green.
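The 18 bare names are the product of three families and six operators, which the following sketch enumerates. The entrypoint patterns are copied from the commit message; lower-casing the operator to fill `<op>` and the function name `bare_name_backings` are assumptions made for the example.

```python
# Family prefix -> entrypoint pattern, as given in the commit message.
FAMILIES = {
    "ever": "ever_{op}_temporal_temporal",
    "always": "always_{op}_temporal_temporal",
    "temp": "temporal_{op}",
}
OPS = ("Eq", "Ne", "Lt", "Le", "Gt", "Ge")


def bare_name_backings() -> dict:
    """Return a map from each bare name to its superclass entrypoint."""
    return {
        f"{prefix}{op}": pattern.format(op=op.lower())  # assumed casing
        for prefix, pattern in FAMILIES.items()
        for op in OPS
    }
```

Each bare name maps to exactly one entrypoint, so the generator has nothing to classify at the Spark level for this family.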
1cb68f7 to de62e78
estebanzimanyi added a commit to estebanzimanyi/MobilitySpark that referenced this pull request Jun 13, 2026
Pick up the generator's dispatch pass (MobilityDB#28): adds GeneratedUdfs_portable_comparison with the 18 contract comparison bare names (everEq..everGe / alwaysEq..alwaysGe / tempEq..tempGe) wrapping the MEOS superclass *_temporal_temporal entrypoints. Compiles green; the bare names register alongside the hand PortableOperatorAliasUDFs.
Stacked on #26. Adds GeneratedSpatioTemporalUDFs, emitted by the generator (#27) from the MEOS-API @sqlfn / @sqlop catalog (#18), and registers it last in create(). It provides runtime type-dispatching overlaps / stbox(geom,time) / timeSpan: each String arg is classified by MEOS type (spans/stboxes/geometries as text, only temporals as hex, so the temporal parser is never misfed) and routed to the catalog-determined backing. Closes the BerlinMOD bench's overlaps/stbox gaps with generated, serialization-safe code: no hand UDFs, no MEOS-API growth. Validated: 17/18 suite queries run clean, no OOM.