fix: replace GraphSONExporter spray-json write path with Jackson Core streaming #361

Open: eoliphan wants to merge 4 commits into joernio:master from eoliphan:feature/streaming-graphson-export

eoliphan commented Jun 18, 2026


Fixes #360.

Summary

  • GraphSONExporter.runExport previously built the entire graph as a single java.lang.String via spray-json's PrettyPrinter, causing OutOfMemoryError on large graphs (Java strings are UTF-16, ~1 GB ceiling).
  • Replaces the export write path with the Jackson Core streaming API: tokens are written directly to a BufferedOutputStream as each node/edge is processed, keeping peak memory bounded by the generator buffer regardless of graph size.
  • Smoke-tested against a 3.4M node / 24M edge CPG → 19 GB GraphSON output with no OOM.
  • Import path (GraphSONImporter), GraphSONProtocol, and all other exporters are unchanged.
  • Public API of GraphSONExporter is unchanged.
  • Adds Float, Double, Long, and Boolean property types to the generic test schema (pre-existing coverage gap) and a round-trip test exercising all PropertyValue subtypes.
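
To illustrate why streaming keeps memory bounded, here is a minimal, dependency-free sketch of the idea in Java. It is not the code from this PR: the PR uses Jackson Core's generator, whereas this sketch hand-writes JSON with the standard library only. The `Node` class, the `writeNodes` and `escape` helpers, and the `vertices`/`id`/`label` field names are hypothetical simplifications and do not reproduce the actual GraphSON layout.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

/** Hypothetical illustration of a streaming write path; not the PR's implementation. */
public class StreamingExportSketch {

    /** Hypothetical stand-in for a graph node. */
    static final class Node {
        final long id;
        final String label;

        Node(long id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    /** Escapes the characters that JSON does not allow unescaped inside a string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /**
     * Writes each node to the stream as soon as it is pulled from the iterator.
     * Only the current node and the writer's buffer are held in memory, so the
     * full document is never materialized as a single String.
     * Returns the number of nodes written.
     */
    static long writeNodes(Iterator<Node> nodes, OutputStream out) throws IOException {
        Writer w = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        long count = 0;
        w.write("{\"vertices\":[");
        while (nodes.hasNext()) {
            Node n = nodes.next();
            if (count > 0) {
                w.write(',');
            }
            w.write("{\"id\":" + n.id + ",\"label\":\"" + escape(n.label) + "\"}");
            count++;
        }
        w.write("]}");
        w.flush(); // push buffered bytes to the stream; the caller owns and closes `out`
        return count;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory sink is used only for the demo; a real export would pass a file stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long written = writeNodes(
            Arrays.asList(new Node(1, "node_a"), new Node(2, "node_b")).iterator(), sink);
        System.out.println(written + " nodes: "
            + new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

The trade-off in this style is that pretty-printing and structural validation have to be handled incrementally; a streaming JSON library such as the one adopted in this PR takes care of escaping and nesting, which the sketch does by hand.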

eoliphan added 4 commits June 17, 2026 19:18
…ypes

Add floatOptional, doubleOptional, longOptional, and booleanOptional
properties to the Generic test schema. These properties enable testing of
FloatValue, DoubleValue, LongValue, and BooleanValue branches in the
exporter and GraphSONProtocol that were previously untested.

Regenerate domain classes and update Neo4j CSV test data and assertions
to reflect the new schema.
… streaming

Fixes joernio#360 (OOM when exporting large graphs). The spray-json PrettyPrinter
built the entire graph as a single java.lang.String (UTF-16, ~1 GB ceiling).
Jackson Core streams tokens directly to a BufferedOutputStream, so memory
usage is bounded by the generator buffer, not graph size.

The import path (GraphSONImporter) is unchanged. Public API is unchanged.
@mpollmeier mpollmeier self-requested a review June 22, 2026 15:34
@mpollmeier (Contributor)

Thank you @eoliphan, I'll take a look at this, hopefully tomorrow 🤞🏻

Development

Successfully merging this pull request may close these issues.

GraphSONExporter OOMs on large graphs

2 participants