
Add REAPI content-defined chunking (SplitBlob/SpliceBlob) support #2497

Draft
erneestoc wants to merge 3 commits into TraceMachina:main from erneestoc:experimental-cdc-split-splice

erneestoc commented Jul 2, 2026

Implements the server side of the REAPI blob split/splice extension (remote-apis#282) used by Bazel's --experimental_remote_cache_chunking (Bazel 8.7.0+ / 9.1.0+).

Fixes #2496.

What this adds

  • Vendored protocol: SplitBlob/SpliceBlob RPCs, ChunkingFunction, FastCdc2020Params, RepMaxCdcParams, and CacheCapabilities fields 8–12, verbatim from upstream remote-apis.
  • SpliceBlob re-assembles chunked uploads. It validates chunk sizes and counts, checks that all chunks exist (touching them), and streams them through an incremental hasher into the CAS, with reads pipelined 10 at a time. The digest is verified before EOF, so a mismatch aborts the upload uncommitted. On success, the chunk layout is persisted in a configurable index store. The blob is fully materialized so ByteStream/FindMissingBlobs stay correct for non-chunking clients.
  • SplitBlob serves stored layouts (validated to sum to the blob size, so corrupt or truncated index entries are never served), and otherwise chunks blobs on demand with FastCDC 2020. The on-demand path is the one that matters for RBE, since worker-uploaded action outputs are never spliced by a client. Unusable layouts (evicted chunks, decode failures, transient existence-check errors) fall back to re-chunking.
  • Native proxy forwarding: grpc-store-backed instances forward both RPCs verbatim to the backend (instance-name rewriting + retries), making NativeLink relays transparent for chunking.
  • Capabilities advertise split_blob_support/splice_blob_support and FastCDC 2020 params per instance (collected across all server blocks, so split-listener deployments work), gated behind the new opt-in experimental_chunking CAS service config.
  • Observability: ChunkingMetrics — splice/split request totals, split hit/miss/on-demand counters, byte totals, and digest verification failures (tracked precisely via an explicit flag, not error-code inference).
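
To make the SplitBlob decision above concrete, here is a minimal sketch of the "serve the stored layout only if it sums to the blob size, otherwise re-chunk" rule. It is not the code in this PR: `SplitPlan` and `plan_split` are hypothetical names, a layout is reduced to a list of chunk sizes, and the chunk existence checks are left out.

```rust
/// Outcome of consulting the layout index for a SplitBlob request.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
enum SplitPlan {
    /// The stored layout is internally consistent and can be returned.
    ServeStored(Vec<u64>),
    /// No usable layout: chunk the blob on demand instead.
    RechunkOnDemand,
}

/// Decide whether a stored layout (chunk sizes in bytes, in order) is usable.
/// `stored` is `None` when the index has no entry or the entry failed to decode.
fn plan_split(stored: Option<Vec<u64>>, blob_size: u64, max_chunk_count: usize) -> SplitPlan {
    let Some(sizes) = stored else {
        return SplitPlan::RechunkOnDemand;
    };
    // A layout with more chunks than the configured cap is treated as unusable.
    if sizes.len() > max_chunk_count {
        return SplitPlan::RechunkOnDemand;
    }
    // `checked_add` guards against overflow from a corrupt index entry.
    let total = sizes.iter().try_fold(0u64, |acc, &s| acc.checked_add(s));
    match total {
        // Only a layout whose sizes sum to the blob size is ever served.
        Some(sum) if sum == blob_size => SplitPlan::ServeStored(sizes),
        _ => SplitPlan::RechunkOnDemand,
    }
}
```

A truncated entry such as `[4, 5]` for a 10-byte blob yields `RechunkOnDemand`, which matches the "never served" guarantee described above.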

Conformance

The chunker is the fastcdc crate at 3.2.1 — the exact version the REAPI spec names as a compliant implementation — used with the spec-mandated normalization level 2 (the crate default level 1 does not match). fastcdc_conformance_test.rs validates chunk offsets, lengths, SHA-256s, and gear-hash fingerprints against the official fastcdc2020_test_vectors.txt from remote-apis, for both seed 0 and seed 666, in both in-memory and streaming form. This is the guarantee that server-produced chunks dedupe byte-for-byte against Bazel-produced chunks. The canonical fixture image is the one already vendored at nativelink-util/tests/data/SekienAkashita.jpg for the existing FastCDC tests (its SHA-256 is asserted in the test).

Note: the vendored nativelink-util/src/fastcdc.rs used by DedupStore is not FastCDC-2020-compliant and is deliberately untouched — deployed dedup indexes depend on its exact boundaries; the two implementations coexist.

Safety / hardening

  • Digests are never trusted: splice verifies the re-assembled digest before commit; per-chunk content lengths are validated against their digests.
  • index_store == cas_store is rejected at startup (layouts stored under blob digests would overwrite blob content).
  • Setting index_store on a grpc-store instance is rejected (the backend owns layouts there).
  • Per-chunk size cap (16 MiB) and a configurable per-instance max_chunk_count (default 50k) bound memory, layout size, and response size; layout reads are capped consistently with the configured count and truncation is detected by the size-consistency check.
  • Off by default; with experimental_chunking unset, behavior is byte-for-byte identical to before (new RPCs return Unimplemented, capabilities advertise false).
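
The startup rules for `index_store` can be summarized in a small sketch. This is an illustration under assumptions, not the constructor in this PR: `ChunkingSetup` and `validate_chunking_setup` are hypothetical names, and stores are identified by plain strings. The rule that locally chunked instances require an `index_store` comes from the second commit message.

```rust
/// Hypothetical, simplified view of one CAS instance's chunking settings.
struct ChunkingSetup<'a> {
    /// Name of the store backing the CAS.
    cas_store: &'a str,
    /// Whether the CAS store is a grpc-store (proxy) instance.
    cas_is_grpc_store: bool,
    /// Name of the store holding chunk layouts, if configured.
    index_store: Option<&'a str>,
}

/// Apply the startup checks described above and return a reason on rejection.
fn validate_chunking_setup(cfg: &ChunkingSetup) -> Result<(), String> {
    match (cfg.cas_is_grpc_store, cfg.index_store) {
        // Proxy instances forward the RPCs; the backend owns the layouts.
        (true, Some(_)) => Err("index_store must not be set on a grpc-store instance".to_string()),
        (true, None) => Ok(()),
        // Locally chunked instances need somewhere to persist layouts.
        (false, None) => Err("index_store is required for locally chunked instances".to_string()),
        // Layouts keyed by blob digest would overwrite blob content.
        (false, Some(index)) if index == cfg.cas_store => {
            Err("index_store must differ from cas_store".to_string())
        }
        (false, Some(_)) => Ok(()),
    }
}
```

Rejecting these combinations at startup rather than at request time means a misconfigured instance never begins serving.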

Design decisions worth reviewer attention

  1. Materialize-on-splice: the spliced blob is written whole into the CAS rather than serving reads from the layout. This keeps every existing read path correct and lets storage-level dedup compose naturally when cas_store is a DedupStore; the cost is chunks + blob both stored. A ChunkingStore that serves reads from layouts is a possible future direction.
  2. Chunk/index lifetime coupling is best-effort: existence checks touch chunks, but stores answering from existence caches may not promote backing entries, and layouts can outlive chunks. Degradation is graceful (fallback to re-chunking or full download), never incorrect. A CompletenessCheckingStore-style wrapper is the analogous future fix.
  3. ChunkingMetrics follows the ByteStreamMetrics pattern (not yet wired into a metrics root — same as existing service metrics).
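
As an illustration of counting digest verification failures through an explicit flag rather than by inspecting error codes, here is a rough sketch. The names `ChunkingCounters`, `SpliceFailure`, and `record_splice` are hypothetical and do not reflect the actual `ChunkingMetrics` fields.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical subset of the chunking counters.
#[derive(Default)]
struct ChunkingCounters {
    splice_requests: AtomicU64,
    spliced_bytes: AtomicU64,
    digest_verification_failures: AtomicU64,
}

/// Hypothetical splice error that carries the failure cause explicitly.
struct SpliceFailure {
    /// Set only when the re-assembled digest did not match the requested one.
    digest_mismatch: bool,
}

impl ChunkingCounters {
    /// Record one SpliceBlob outcome; `Ok` carries the spliced blob size in bytes.
    fn record_splice(&self, outcome: &Result<u64, SpliceFailure>) {
        self.splice_requests.fetch_add(1, Ordering::Relaxed);
        match outcome {
            Ok(bytes) => {
                self.spliced_bytes.fetch_add(*bytes, Ordering::Relaxed);
            }
            // The flag, not the error code, decides whether this counter moves.
            Err(failure) if failure.digest_mismatch => {
                self.digest_verification_failures.fetch_add(1, Ordering::Relaxed);
            }
            Err(_) => {}
        }
    }
}
```

With this shape, an unrelated failure that happens to share an error code with a digest mismatch does not inflate the failure counter.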

Testing

  • 23 service tests: splice/split round trip, digest & size mismatch rejection, missing-chunk NotFound, on-demand chunking (single- and multi-chunk with reassembly verification), layout reuse, unusable-layout fallback, absent-blob NotFound, disabled-instance Unimplemented, config rejection cases, and metric assertions throughout.
  • 2 conformance tests against upstream vectors.
  • GrpcStore forwarding tested against a fake CAS backend over a real gRPC round trip (passthrough + instance-name rewriting).
  • bazel test green across nativelink-service, nativelink-store, nativelink-config, nativelink-util (63 tests) with clippy + rustfmt aspects; cargo fmt --check clean; bazel build //:nativelink succeeds.

Notes

  • Blobs exceeding the configurable max_chunk_count (default 50k, ~25 GiB at the default 512 KiB average) are served without chunking: SplitBlob returns NOT_FOUND and clients fall back to a regular download. Raise the knob for larger blobs, keeping client gRPC message limits in mind.
  • The chunking flag requires Bazel 8.7.0+/9.1.0+ on the client; the repo's current Bazel pin (9.0.2) predates it, so the LRE E2E jobs will start exercising this path once the pin reaches 9.1.0 or later.
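
For sizing `max_chunk_count`, the arithmetic behind the "~25 GiB" figure can be sketched as follows. The helper names are hypothetical, and the results are estimates only, since FastCDC chunk sizes vary around the configured average.

```rust
const KIB: u64 = 1024;
const GIB: u64 = 1024 * KIB * KIB;

/// Approximate largest blob that stays under the chunk-count cap,
/// assuming chunks average `avg_chunk_size_bytes`.
fn approx_max_chunkable_bytes(max_chunk_count: u64, avg_chunk_size_bytes: u64) -> u64 {
    max_chunk_count.saturating_mul(avg_chunk_size_bytes)
}

/// Approximate chunk count for a blob of the given size (ceiling division).
/// `avg_chunk_size_bytes` must be non-zero.
fn approx_chunk_count_for(blob_size_bytes: u64, avg_chunk_size_bytes: u64) -> u64 {
    blob_size_bytes / avg_chunk_size_bytes
        + u64::from(blob_size_bytes % avg_chunk_size_bytes != 0)
}
```

With the defaults, `approx_max_chunkable_bytes(50_000, 512 * KIB)` is 26,214,400,000 bytes, about 24.4 GiB, which is consistent with the rounded figure above. By the same estimate, a 100 GiB blob at a 512 KiB average needs roughly 204,800 chunks, so the knob would have to be raised well above the default, subject to client gRPC message limits.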

🤖 Generated with Claude Code



erneestoc and others added 2 commits July 2, 2026 13:41
Implements the server side of the remote-apis blob split/splice
extension used by Bazel's --experimental_remote_cache_chunking
(Bazel 8.7.0+/9.1.0+), fixes TraceMachina#2496.

- Vendor SplitBlob/SpliceBlob RPCs, ChunkingFunction, FastCdc2020Params
  and CacheCapabilities fields 8-12 from upstream remote-apis.
- SpliceBlob re-assembles chunked uploads: verifies chunk existence and
  the spliced digest before committing, materializes the blob so
  non-chunking clients stay correct, and persists the chunk layout in a
  configurable index store. Chunk reads are pipelined while hashing
  stays in chunk order.
- SplitBlob serves stored layouts (validated against the blob size so
  corrupt or truncated index entries are never served), or chunks blobs
  on demand with FastCDC 2020 (fastcdc crate, normalization level 2) so
  outputs uploaded whole by remote execution workers also get chunked
  downloads. Unusable layouts fall back to re-chunking.
- Capabilities advertise split/splice support and FastCDC 2020
  parameters per instance, collected across all server blocks, gated
  behind the new opt-in experimental_chunking CAS service config (off
  by default; zero behavior change when unset).
- Reject foot-gun configs at startup: index_store == cas_store (chunk
  layouts stored under blob digests would overwrite blob content) and
  chunking on grpc proxy stores (would download/re-upload entire blobs
  instead of forwarding RPCs).
- Conformance-test the chunker against the official REAPI
  fastcdc2020_test_vectors.txt (offsets, lengths, sha256s and gear
  fingerprints, seeds 0 and 666, in-memory and streaming).
- Add ChunkingMetrics (splice/split totals, hit/miss/on-demand rates,
  byte counters, digest verification failures).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

For grpc-store-backed CAS instances the chunking RPCs are now forwarded
verbatim to the backend (with instance-name rewriting and the store's
usual retry handling) instead of being rejected at startup. This makes
NativeLink relays transparent for content-defined chunking: one RPC in,
one RPC out, the backend owns chunking and the layout index.

- Add GrpcStore::split_blob/splice_blob following the existing
  find_missing_blobs/batch_*/get_tree forwarding pattern.
- Shortcut to the proxy in the CAS handlers before any local chunking
  machinery is consulted, mirroring the other four CAS RPCs.
- Make experimental_chunking.index_store optional: required for locally
  chunked instances, rejected for grpc-store instances where the
  backend owns the chunk layouts. The capabilities service still
  advertises split/splice + FastCDC params from the same config block,
  so relay operators set avg_chunk_size_bytes to match their backend.
- Test forwarding against a fake CAS backend over a real gRPC round
  trip (verifies passthrough and instance-name rewriting) and the new
  constructor rules.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

- Reuse the FastCDC test fixture already vendored at
  nativelink-util/tests/data/SekienAkashita.jpg (and already excluded
  from the forbid-binary-files hook) instead of adding a duplicate
  binary copy; export it from nativelink-util for the conformance test.
- Format nativelink-service/Cargo.toml per taplo.
- Replace the hardcoded 50k chunk cap with a per-instance
  experimental_chunking.max_chunk_count knob (default 50000). Blobs
  above the cap are served without chunking; the layout read cap is
  derived from the configured count so the two can never disagree.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Support REAPI content-defined chunking (SplitBlob/SpliceBlob) for Bazel's --experimental_remote_cache_chunking
