codegen(meos): generate tier-aware MEOS facade for the full JMEOS 1.4 surface (stacks on #4) #5
fad7fed to
e5707ac
Coordination confirmation: rebased onto the post-union-jar refresh ([…]).
Local verification: the full module now compiles green; the codegen wedge sits on top of […].
Coordination item resolved. Thanks for the union-jar refresh.
b532536 to
e5707ac
…OS facades

Adds the org.mobilitydb.flink.meos.wirings package: thin, generic Flink-DataStream wrappers around the generated MeosOps* facades from PR MobilityDB#5, organized per streaming tier.

This PR ships the stateless tier:
- MeosStatelessMap<IN, OUT>: generic MapFunction wrapping any stateless MeosOps* method (804 of the 2,097 generated methods qualify per the v4 baseline: 92 OO-classified + 712 free-fn)
- MeosStatelessFilter<IN>: generic FilterFunction wrapping any stateless boolean-returning MeosOps* method, plus a .fromIntPredicate(...) adapter for JMEOS' int-coded predicates
- demo/MeosWiringsDemoJob: runnable end-to-end DataStream pipeline parsing TBox WKT → filtering by overlap with a query box → serializing surviving boxes to hex-WKB, all through the generated facades wired via this package
- README documenting tier vocabulary, the wrap-once-use-everywhere pattern, the DataStream-API-only design choice (Table API as future follow-up), and coexistence with berlinmod.MEOSBridge

Future follow-ups (one PR per tier, mirroring this one's shape):
- MeosBoundedStateMap (generic KeyedProcessFunction with ValueState<Pointer> for MEOS handle per key; covers 797 of the generated methods)
- MeosWindowedAggregate (generic ProcessWindowFunction; 161 methods)
- MeosCrossStreamJoin (generic KeyedCoProcessFunction or interval-join; 140 methods)
- Optional: Table API sibling (MeosScalarUDF + MeosCatalogRegistrar) if the repo adopts Table API for other reasons

Stacks on codegen/flink-meos-ops (PR MobilityDB#5). Additive-only; touches no existing file. Locally compile-verified: 129 .class files total (123 from the parent PR + 6 new from this package's classes + demo + their nested lambdas).
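A minimal sketch of how the stateless wrappers named in this commit message might be shaped. It is written without a Flink dependency so it runs on a plain JDK: `MapFn` and `FilterFn` are local stand-ins for Flink's `MapFunction` and `FilterFunction`, `SerFn` and `SerIntFn` are hypothetical serializable lambda types, and the constructor and factory signatures are assumptions, since the message names only the classes and `.fromIntPredicate(...)`. The int-coded predicate is assumed to mean "non-zero is true", and `overlapsUnitInterval` is a toy predicate, not a MEOS function.

```java
import java.io.Serializable;
import java.util.List;
import java.util.stream.Collectors;

final class StatelessWiringSketch {
    // Local stand-ins for Flink's MapFunction / FilterFunction (assumption: kept local to avoid the Flink dependency).
    interface MapFn<IN, OUT> extends Serializable { OUT map(IN value) throws Exception; }
    interface FilterFn<IN> extends Serializable { boolean filter(IN value) throws Exception; }

    // Hypothetical serializable lambda shapes for the wrapped facade method.
    interface SerFn<IN, OUT> extends Serializable { OUT apply(IN value); }
    interface SerIntFn<IN> extends Serializable { int apply(IN value); }

    /** Wraps any one-argument stateless method as a map step. */
    static final class MeosStatelessMap<IN, OUT> implements MapFn<IN, OUT> {
        private final SerFn<IN, OUT> op;
        MeosStatelessMap(SerFn<IN, OUT> op) { this.op = op; }
        @Override public OUT map(IN value) { return op.apply(value); }
    }

    /** Wraps any boolean-returning stateless method as a filter step. */
    static final class MeosStatelessFilter<IN> implements FilterFn<IN> {
        private final SerFn<IN, Boolean> predicate;
        MeosStatelessFilter(SerFn<IN, Boolean> predicate) { this.predicate = predicate; }

        // Adapter for int-coded predicates (assumption: non-zero means true).
        static <T> MeosStatelessFilter<T> fromIntPredicate(SerIntFn<T> intPredicate) {
            return new MeosStatelessFilter<>(v -> intPredicate.apply(v) != 0);
        }

        @Override public boolean filter(IN value) { return predicate.apply(value); }
    }

    // Toy int-coded predicate standing in for a generated facade method.
    static int overlapsUnitInterval(double x) { return (x >= 0.0 && x <= 1.0) ? 1 : 0; }

    // Applies filter-then-map to a list, mimicking a two-operator stateless pipeline.
    static List<String> run(List<Double> input) {
        MeosStatelessFilter<Double> keep =
                MeosStatelessFilter.fromIntPredicate(StatelessWiringSketch::overlapsUnitInterval);
        MeosStatelessMap<Double, String> render = new MeosStatelessMap<>(x -> "v=" + x);
        return input.stream().filter(keep::filter).map(render::map).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(-0.5, 0.25, 0.75, 2.0))); // prints [v=0.25, v=0.75]
    }
}
```

In a real job the two wrappers would be handed to `DataStream.filter(...)` and `DataStream.map(...)`; the list-based `run` only imitates that chaining.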
… surface

Add a generated, tier-aware Java facade over the MEOS public API, organized as one Java class per MEOS object-model class plus one per public-MEOS-header for free functions:

- 50 `MeosOps<Class>` classes (751 methods): one per MEOS object-model class (TFloat, TInt, TBool, TText, TGeomPoint, TGeogPoint, TCbuffer, TNpoint, TPose, TRGeometry, TBox, STBox, Set, Span, SpanSet, …).
- 6 `MeosOpsFree<Header>` classes (1,346 methods): one per public MEOS header for functions not assigned to any object-model class (MeosOpsFreeCore, MeosOpsFreeGeo, MeosOpsFreeCbuffer, MeosOpsFreeNpoint, MeosOpsFreePose, MeosOpsFreeRgeo).
- 1 shared `MeosOpsRuntime` (single `MEOS_AVAILABLE` static-init across all 56 facades).

Each emitted method forwards to `functions.GeneratedFunctions.<name>(...)` after probing the shared `MeosOpsRuntime.MEOS_AVAILABLE` flag. Each method carries a Javadoc tier marker (stateless / bounded-state / windowed / cross-stream / io-meta) so consumers know the per-method wiring shape.

Total emit: 2,097 of JMEOS PR MobilityDB#19's 2,699-method surface (77.7%); remainder is the JMEOS-deliberately-omitted type-catalog helpers plus the streaming-relevance-baseline ambiguous (59) and sequence-only (14) buckets, both surfaced separately for design decisions before emit.

Two generators under flink-processor/tools/codegen/:

- codegen-oo.py: reads JMEOS jar signatures via javap -p + streaming-relevance baseline + MEOS object model → emits per-OO-class facades.
- codegen-free.py: same shape, but for functions not in the OO model → emits per-header facades.

Both are ~250 LOC, deterministic, audit-by-regeneration. Manifests record provenance (JMEOS method total, baseline target count, emit count, per-tier breakdown, per-class/per-header method count, sample of functions absent from JMEOS).

Coexists with the existing berlinmod.MEOSBridge hand-written BerlinMOD-scoped bridge (high-level, query-shaped); the generated MeosOps* facades expose the raw MEOS surface tier-by-tier (low-level, catalog-shaped). Both share the same MEOS_AVAILABLE discipline and `functions.GeneratedFunctions` delegation.

Stacks on feat/jmeos-bridge-swap; additive-only; touches no existing file. Locally compile-verified against the union of JMEOS PR MobilityDB#19's jmeos-core + PR MobilityDB#18's utils.spatial (the latter needed by MEOSBridge, separately tracked).

(cherry picked from commit e5707ac)
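The counts quoted in this commit message can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the message; the type-catalog-helper count is not stated anywhere and is derived here under the assumption that the three named buckets account for the entire remainder.

```java
import java.util.Locale;

final class EmitCoverage {
    // Figures quoted in the commit message.
    static final int OO_METHODS = 751;       // across the 50 MeosOps<Class> facades
    static final int FREE_METHODS = 1_346;   // across the 6 MeosOpsFree<Header> facades
    static final int JMEOS_SURFACE = 2_699;  // methods in the JMEOS surface
    static final int AMBIGUOUS = 59;
    static final int SEQUENCE_ONLY = 14;

    static int emitted() { return OO_METHODS + FREE_METHODS; }

    static int notEmitted() { return JMEOS_SURFACE - emitted(); }

    static double coveragePercent() { return 100.0 * emitted() / JMEOS_SURFACE; }

    // Derived, not quoted: assumes the remainder is exactly helpers + ambiguous + sequence-only.
    static int impliedTypeCatalogHelpers() { return notEmitted() - AMBIGUOUS - SEQUENCE_ONLY; }

    public static void main(String[] args) {
        System.out.printf(Locale.ROOT, "emitted=%d (%.1f%%), not emitted=%d, implied helpers=%d%n",
                emitted(), coveragePercent(), notEmitted(), impliedTypeCatalogHelpers());
        // prints: emitted=2097 (77.7%), not emitted=602, implied helpers=529
    }
}
```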
e5707ac to
b37ac88
…OS facades (commit message identical to the wirings commit above; cherry picked from commit 457987c)
Superseded by the Path-B consolidation: the former 18-deep stack is collapsed into two reviewable topical PRs on top of the merged scaffold (MEOS integration #30 → benchmark #31), each a single clean squashed commit with the generated-facade bulk, dead family-flag profiles, committed target/ artifacts, and invented synthetic corpus removed. Closing as folded into #30/#31.
Add a generated, tier-aware Java facade over the full MEOS public API surface, so downstream Flink-side parity work can stop hand-wiring per-operator JMEOS calls and instead consume one mechanical facade per MEOS object-model class (or per public header for free functions).
What is generated
| Facade | Generator |
|---|---|
| `MeosOps<Class>`: one per MEOS object-model class | `tools/codegen/codegen-oo.py` |
| `MeosOpsFree<Header>`: one per public MEOS header for fns not assigned to any OO class | `tools/codegen/codegen-free.py` |
| `MeosOpsRuntime` (singleton `MEOS_AVAILABLE` static init across all 56 facades) | |

Each emitted method forwards verbatim to `functions.GeneratedFunctions.<name>(...)` after probing `MeosOpsRuntime.MEOS_AVAILABLE` (set once per JVM). Each method carries a Javadoc tier marker:

| Tier | Wiring shape |
|---|---|
| `stateless` | `ScalarFunction` / direct call in `MapFunction` |
| `bounded-state` | `ScalarFunction` (state in MEOS handle) |
| `windowed` | `AggregateFunction` over `TUMBLE` / `HOP` |
| `cross-stream` | `CoProcessFunction` / `IntervalJoin` |
| `io-meta` | `format` clause |

Tier breakdown of the 2,097 emitted methods: 804 stateless · 797 bounded-state · 161 windowed · 140 cross-stream · 195 io-meta.
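The probe-then-forward shape described above can be illustrated with a small self-contained sketch. Everything in it is a stand-in: the nested `GeneratedFunctions` replaces the real `functions.GeneratedFunctions`, `hypothetical_scale` is an invented method name, the probe reads a made-up system property instead of checking the native library, and throwing `IllegalStateException` when the flag is false is an assumption, since the description does not say what an emitted method does in that case.

```java
final class FacadeShapeSketch {
    /** Stand-in for the shared runtime: the flag is computed once, when the class initializes. */
    static final class MeosOpsRuntime {
        static int probeCount = 0; // instrumentation for this sketch only
        static final boolean MEOS_AVAILABLE = probe();

        private static boolean probe() {
            probeCount++;
            // Assumption: the real probe checks native MEOS availability; a system property stands in here.
            return !"false".equals(System.getProperty("sketch.meos.available"));
        }
    }

    /** Stand-in for functions.GeneratedFunctions; the real class calls into native MEOS. */
    static final class GeneratedFunctions {
        static double hypothetical_scale(double value, double factor) { return value * factor; }
    }

    /** Shape of one generated facade class holding one emitted method. */
    static final class MeosOpsExample {
        /** Tier: stateless */
        static double hypothetical_scale(double value, double factor) {
            if (!MeosOpsRuntime.MEOS_AVAILABLE) {
                // Assumed failure mode; the source does not specify it.
                throw new IllegalStateException("MEOS is not available in this JVM");
            }
            return GeneratedFunctions.hypothetical_scale(value, factor); // verbatim forward
        }
    }

    public static void main(String[] args) {
        System.out.println(MeosOpsExample.hypothetical_scale(2.0, 3.0));
        System.out.println(MeosOpsExample.hypothetical_scale(1.5, 2.0));
        // The probe ran once even though two facade calls were made.
        System.out.println("probes: " + MeosOpsRuntime.probeCount);
    }
}
```

Keeping the flag in one shared class means all facades observe the same result and the availability check costs a single static field read per call.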
What's not emitted (honest gap)
- Type-catalog helpers deliberately omitted by JMEOS (`*_basetype`, `*_type`, `*_spantype`, …)
- `sequence-only` tier (14 methods): inherently non-streamable, marked as honest "cannot satisfy" pending an emission-shape decision
- Ambiguous bucket of the streaming-relevance baseline (59 methods): … `streamingSemantics` facet RFC for MEOS-API

Coexistence with `berlinmod.MEOSBridge`

`MEOSBridge.java` (hand-written, BerlinMOD-scoped, introduced on this branch's parent `feat/jmeos-bridge-swap`) and the generated `MeosOps*` facades coexist by design:

- `MEOSBridge` keeps the per-BerlinMOD-query intent (Haversine fallback, `dwithinSegmentMetres`, etc.): high-level, query-shaped.
- `MeosOps*` exposes the raw MEOS surface tier-by-tier: low-level, catalog-shaped.

Both share the same `MEOS_AVAILABLE` discipline (via `MeosOpsRuntime`) and the same `functions.GeneratedFunctions` delegation.

How to regenerate
Both generators are ~250 LOC, deterministic, audit-by-regeneration. Manifests under `tools/codegen/` record per-class / per-header / per-tier breakdowns + absent-from-JMEOS audit.

Stacking
This PR stacks on `feat/jmeos-bridge-swap`. Additive-only: 57 new Java files + 5 files under `tools/codegen/`. No existing file is touched (no diff to `MEOSBridge.java`, `Main.java`, `TrajectoryWindowFunction.java`, `pom.xml`, or `jar/JMEOS.jar`).

Note on the base branch's current compile state
`feat/jmeos-bridge-swap`'s `MEOSBridge.java:116` imports `utils.spatial.PointToSegment` from JMEOS PR #18's `feat/spatial-haversine` branch. The recent bundled-jar refresh on this branch (commit `0a57c07`, JMEOS PR #19's `jmeos-core` jar) brought in the 2,699-method `functions.GeneratedFunctions` surface but did not include PR #18's `utils.spatial.*` wrappers. As a result, the base-branch `mvn compile` currently fails on `MEOSBridge.java`.

This PR's own diff is green in isolation (`javac` of just `org.mobilitydb.flink.meos.*` succeeds against the refreshed jar) and green in the full module when the bundled jar is the union of JMEOS PR #19's `jmeos-core` + PR #18's `utils.spatial.*` (locally verified: 123 .class files compile clean, including all 57 new `MeosOps*`).

Recipe to produce the union jar (~2 minutes):
Once the bundled jar is refreshed with the union, the base branch + this PR compile together cleanly.
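A union jar of this kind can also be assembled programmatically. The sketch below is one possible approach using only `java.util.zip`, and is not the recipe referenced above: it copies every entry of each input archive into one output archive and keeps the first copy of any duplicated name (such as `META-INF/MANIFEST.MF`). The file names and the placeholder entry contents in `demo()` are hypothetical, and the first-wins duplicate policy is an assumption.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class UnionJarSketch {
    /** Merges the inputs into output; returns the entry names written, in order. */
    static Set<String> union(Path output, List<Path> inputs) {
        Set<String> written = new LinkedHashSet<>();
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(output))) {
            for (Path input : inputs) {
                try (ZipInputStream in = new ZipInputStream(Files.newInputStream(input))) {
                    for (ZipEntry e = in.getNextEntry(); e != null; e = in.getNextEntry()) {
                        if (!written.add(e.getName())) {
                            continue; // duplicate name: keep the copy from the earlier input
                        }
                        out.putNextEntry(new ZipEntry(e.getName()));
                        in.transferTo(out); // copies only the current entry's bytes
                        out.closeEntry();
                    }
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return written;
    }

    /** Writes a tiny archive whose entries contain placeholder bytes (demo helper). */
    static Path writeArchive(Path path, String... entryNames) {
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(path))) {
            for (String name : entryNames) {
                out.putNextEntry(new ZipEntry(name));
                out.write(name.getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return path;
    }

    /** Builds two placeholder archives in a temp directory and merges them. */
    static Set<String> demo() {
        try {
            Path dir = Files.createTempDirectory("union-jar-sketch");
            Path core = writeArchive(dir.resolve("core.jar"),
                    "META-INF/MANIFEST.MF", "functions/GeneratedFunctions.class");
            Path spatial = writeArchive(dir.resolve("spatial.jar"),
                    "META-INF/MANIFEST.MF", "utils/spatial/PointToSegment.class");
            return union(dir.resolve("union.jar"), List.of(core, spatial));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Listing the `jmeos-core` jar first keeps its manifest as the first entry of the result; signed jars would need their signature files handled separately, which this sketch does not attempt.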