Sliding-Window RLC (RFC 8681) FEC scheme + --fec-scheme switch by josephnef · Pull Request #87 · OpenIPC/devourer

josephnef · 2026-06-07T18:27:45Z

Summary

Companion to PR #86. RaptorQ is a block code: each IP packet has to wait for K source symbols to accumulate before the encoder emits anything, which gives an unavoidable ~95 ms per-packet latency floor on this link even at K=8/flush=20 ms. RFC 8681 RLC is a sliding-window code: every source symbol is shipped systematically (zero encoder buffer), repair symbols are linear combinations over the last `window` source symbols, and the decoder can emit each source symbol the instant it arrives, which is better for interactive traffic.
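To make the block-versus-sliding-window distinction concrete, here is a toy sketch in Python. It is not RFC 8681 RLC: it replaces the GF(2^8) combinations and TinyMT32-driven coefficients with a plain XOR over the window, and every name in it (`ToySlidingEncoder`, `recover_one`) is hypothetical. It only illustrates that each source symbol leaves the encoder immediately and that a repair covering the last `window` symbols can rebuild one missing source.

```python
from collections import deque


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class ToySlidingEncoder:
    """Toy sliding-window encoder: XOR repairs only, NOT RFC 8681 coefficients."""

    def __init__(self, window: int):
        self.window = deque(maxlen=window)  # holds (esi, symbol) pairs
        self.next_esi = 0

    def push(self, symbol: bytes):
        """Emit the source symbol at once, plus one repair over the window."""
        esi = self.next_esi
        self.next_esi += 1
        self.window.append((esi, symbol))
        repair = bytes(len(symbol))
        for _, s in self.window:
            repair = xor_bytes(repair, s)
        first_esi = self.window[0][0]
        return [("src", esi, symbol),
                ("rep", first_esi, len(self.window), repair)]


def recover_one(repair, received: dict):
    """Rebuild a single missing source symbol covered by one XOR repair."""
    _, first, count, payload = repair
    covered = range(first, first + count)
    missing = [e for e in covered if e not in received]
    if len(missing) != 1:
        return None  # nothing to do, or too many losses for one XOR repair
    for e in covered:
        if e in received:
            payload = xor_bytes(payload, received[e])
    return missing[0], payload


enc = ToySlidingEncoder(window=4)
out = [enc.push(s) for s in (b"aaaa", b"bbbb", b"cccc")]
received = {0: b"aaaa", 2: b"cccc"}       # source symbol 1 was lost on air
recovered = recover_one(out[2][1], received)
```

In this toy run the repair emitted with the third symbol covers ESIs 0 to 2, so the lost symbol 1 is rebuilt without the encoder ever having buffered a full block.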

Per the user's direction the implementation wraps Inria's reference swif-codec C library (irtf-nwcrg/swif-codec, co-authored by Vincent Roca, the RFC 8681 author) via cffi rather than reimplementing the codec in pure Python.

Wire format

A new RLC envelope MAGIC 0xF534 lives alongside RaptorQ's frozen 0xF52E. The receive dispatcher peeks the first two bytes of the stream payload and routes per-frame — a mixed-scheme deployment is silently rejected (the foreign decoder counts the symbol as malformed) instead of corrupting IP traffic.
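The routing rule above can be sketched as follows. The two MAGIC values come from this description; the big-endian read, the `Receiver` class, and its counters are assumptions made for illustration and are not the actual `stream_fec.py` dispatcher.

```python
RAPTORQ_MAGIC = 0xF52E  # frozen RaptorQ envelope MAGIC (from the PR text)
RLC_MAGIC = 0xF534      # new RLC envelope MAGIC (from the PR text)

SCHEME_BY_MAGIC = {RAPTORQ_MAGIC: "raptorq", RLC_MAGIC: "rlc"}


def peek_scheme(payload: bytes):
    """Return the scheme named by the first two bytes, or None if unknown."""
    if len(payload) < 2:
        return None
    # Assumption: MAGIC is serialized big-endian.
    return SCHEME_BY_MAGIC.get(int.from_bytes(payload[:2], "big"))


class Receiver:
    """Hypothetical receiver that accepts one scheme and counts the rest."""

    def __init__(self, scheme: str):
        self.scheme = scheme
        self.malformed = 0
        self.accepted = []

    def on_payload(self, payload: bytes) -> bool:
        if peek_scheme(payload) != self.scheme:
            self.malformed += 1  # foreign or garbage envelope: drop silently
            return False
        self.accepted.append(payload)
        return True


rx = Receiver("rlc")
ok_rlc = rx.on_payload(b"\xf5\x34" + b"rlc-envelope")
ok_rq = rx.on_payload(b"\xf5\x2e" + b"raptorq-envelope")
```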

MAGIC(2)  VER(1)  TYPE(1)  SYMBOL_SIZE(2)  ESI(4)  WIN(1)  KEY(2)  DT(1)  PAYLOAD(symbol_size)
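A minimal sketch of packing and parsing this header with Python's `struct` module. The field widths are taken from the layout above and add up to 14 bytes; the big-endian byte order, the `RlcHeader` name, and the example field values are assumptions, since the description does not state them.

```python
import struct
from typing import NamedTuple, Optional, Tuple

RLC_MAGIC = 0xF534
# MAGIC(2) VER(1) TYPE(1) SYMBOL_SIZE(2) ESI(4) WIN(1) KEY(2) DT(1)
# Assumption: network (big-endian) byte order, no padding.
HEADER = struct.Struct(">HBBHIBHB")


class RlcHeader(NamedTuple):
    magic: int
    ver: int
    type: int
    symbol_size: int
    esi: int
    win: int
    key: int
    dt: int


def pack_envelope(hdr: RlcHeader, payload: bytes) -> bytes:
    """Serialize header + payload; the payload must be exactly symbol_size."""
    if len(payload) != hdr.symbol_size:
        raise ValueError("payload length must equal symbol_size")
    return HEADER.pack(*hdr) + payload


def unpack_envelope(data: bytes) -> Optional[Tuple[RlcHeader, bytes]]:
    """Parse an envelope, returning None for anything malformed."""
    if len(data) < HEADER.size:
        return None
    hdr = RlcHeader(*HEADER.unpack_from(data))
    payload = data[HEADER.size:]
    if hdr.magic != RLC_MAGIC or len(payload) != hdr.symbol_size:
        return None
    return hdr, payload


# Example values below are illustrative only.
hdr = RlcHeader(RLC_MAGIC, 1, 0, 8, 42, 16, 7, 15)
wire = pack_envelope(hdr, b"\x00" * 8)
parsed = unpack_envelope(wire)
```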

Source symbols ride the same length-prefix concat-packing scheme as RaptorQ.
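For readers unfamiliar with length-prefix concat-packing, a rough sketch follows. The 2-byte big-endian prefix, the zero-length terminator, and the function names are assumptions for illustration; the real prefix width is defined by the shared constant in the dispatcher, which this description does not spell out.

```python
import struct

# Assumption: each packet is preceded by a 2-byte big-endian length.
LEN_PREFIX = struct.Struct(">H")


def pack_symbol(packets, symbol_size: int) -> bytes:
    """Concatenate length-prefixed packets into one zero-padded symbol."""
    out = bytearray()
    for p in packets:
        if not p:
            raise ValueError("empty packet")  # length 0 marks padding
        if len(out) + LEN_PREFIX.size + len(p) > symbol_size:
            raise ValueError("packets do not fit in one symbol")
        out += LEN_PREFIX.pack(len(p)) + p
    out += bytes(symbol_size - len(out))  # zero padding up to symbol_size
    return bytes(out)


def unpack_symbol(symbol: bytes):
    """Split a symbol back into packets, stopping at the zero padding."""
    packets, off = [], 0
    while off + LEN_PREFIX.size <= len(symbol):
        (n,) = LEN_PREFIX.unpack_from(symbol, off)
        if n == 0:
            break
        off += LEN_PREFIX.size
        if off + n > len(symbol):
            raise ValueError("truncated packet inside symbol")
        packets.append(symbol[off:off + n])
        off += n
    return packets


symbol = pack_symbol([b"abc", b"de"], symbol_size=16)
unpacked = unpack_symbol(symbol)
```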

Files

| Path | Purpose |
|------|---------|
| `vendor/swif-codec/` | Pinned snapshot of upstream commit `de8cd8e` (CeCILL-B) + four small patches in `PATCHES.md` (C99 callback-signature fix, drop `#define DEBUG` and a handful of unconditional `printf`/`full_symbol_dump` calls). |
| `_swif_build.py` | cffi extension builder. Drives gcc directly, because modern setuptools' distutils path-handling broke the standard `ffi.compile()` path. |
| `stream_fec.py` (modified) | Reshaped into a thin dispatcher: `FecConfig` grows `scheme`/`window`/`density_threshold`; `make_encoder`/`make_decoder` route to the right module. `FecConfig(k=…) / FecEncoder(cfg) / FecDecoder(cfg)` callers still work via backward-compat aliases, and `scheme` defaults to `raptorq` for them; `tun_p2p.py`'s `--fec-scheme` defaults to `rlc`. |
| `stream_fec_raptorq.py` | Moved-out RaptorQ code, renamed `RaptorQEncoder` / `RaptorQDecoder`. Otherwise unchanged. |
| `stream_fec_rlc.py` | New `RlcEncoder` / `RlcDecoder` over the cffi binding. Encoder emits 1 source envelope + `ceil(overhead)` repair envelopes per sealed source symbol; decoder feeds source symbols straight through (systematic) and rebuilds the encoder's coding window on each repair to re-derive the same TinyMT32 coefficients. |
| `test_stream_fec_rlc.py` | 13 RLC tests: round-trip, loss tolerance at 0/10/20 %, overhead bumping for 30 % loss, concatenation packing, oversized rejection, dispatcher MAGIC routing, garbage envelope drop, partial-symbol flush, distinct-MAGIC, config validation. |
| `tun_p2p.py` (modified) | New `--fec-scheme {rlc,raptorq}`, `--fec-window`, `--fec-density-threshold`. `make_encoder`/`make_decoder` factories. TUN-write boundary drops malformed FEC-recovered packets as `mal`-counted instead of taking the bridge down. |
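The TUN-write boundary check mentioned for `tun_p2p.py` could look roughly like the sketch below. The actual validation rules are not given in this description, so the IPv4/IPv6 header checks, the `write_to_tun` helper, and the `stats` dictionary are all assumptions; only the idea of counting a bad packet as `mal` instead of raising comes from the text.

```python
def looks_like_ip_packet(pkt: bytes) -> bool:
    """Cheap sanity check on an FEC-recovered packet before a TUN write."""
    if not pkt:
        return False
    version = pkt[0] >> 4
    if version == 4:
        if len(pkt) < 20:
            return False
        ihl = (pkt[0] & 0x0F) * 4                  # header length in bytes
        total_len = int.from_bytes(pkt[2:4], "big")
        return 20 <= ihl <= total_len and total_len == len(pkt)
    if version == 6:
        if len(pkt) < 40:
            return False
        payload_len = int.from_bytes(pkt[4:6], "big")
        return 40 + payload_len == len(pkt)
    return False


def write_to_tun(pkt: bytes, write, stats: dict) -> bool:
    """Write pkt via `write`, or count it as `mal` and drop it."""
    if not looks_like_ip_packet(pkt):
        stats["mal"] = stats.get("mal", 0) + 1
        return False
    write(pkt)
    return True


written = []
stats = {}
good = bytes([0x45, 0x00, 0x00, 20]) + bytes(16)   # minimal 20-byte IPv4 header
first = write_to_tun(good, written.append, stats)
second = write_to_tun(b"\x00\x01garbage", written.append, stats)
```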

Verification

Offline (`cd tools/precoder && uv sync && uv run pytest`):

100 tests pass (31 pipeline + 37 stream + 19 raptorq + 13 rlc).

Hardware (two-netns single-host bench, RTL8812AU 0x8812 ↔ TP-Link Archer T2U Plus / RTL8821AU 2357:0120, ch 6), 10-min soak (600 pings at 1 Hz, no --repeat):

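| Scheme | Config | Loss | RTT min/avg/max | Airtime/pkt | blk-lost |
|--------|--------|-----:|----------------:|-------------|---------:|
| RLC | W=16, R/K=1, 20 ms | 6.0 % | 13 / 42 / 79 ms | ~4 ms (2 envelopes) | 10 / 586 |
| RaptorQ | K=8, R/K=1, 20 ms | 0.17 % | 59 / 95 / 146 ms | ~34 ms (17 envelopes) | 1 / 606 |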

The 6 % RLC loss is end-to-end after recovery: ~1.7 % of RLC blocks are unrecoverable (window expires before enough repairs land), and each unrecoverable block can carry multiple concatenation-packed ICMP packets. Raising --fec-overhead closes the gap at the cost of airtime.
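The relationship between the packet-level and block-level figures can be checked with a few lines of arithmetic. The inputs are the soak numbers reported in this description; the final ratio assumes every lost ping is attributable to one of the unrecoverable blocks, which is an interpretation rather than something the soak logs confirm.

```python
pings_sent = 600

# Packet-level results from the soak (received / sent).
rlc_received, raptorq_received = 564, 599
rlc_pkt_loss = (pings_sent - rlc_received) / pings_sent
raptorq_pkt_loss = (pings_sent - raptorq_received) / pings_sent

# Block-level results from the blk-lost column (lost / total).
rlc_blk_loss = 10 / 586
raptorq_blk_loss = 1 / 606

# Assumption: all 36 lost pings fall inside the 10 unrecoverable RLC blocks.
rlc_pings_per_lost_block = (pings_sent - rlc_received) / 10

print(f"RLC packet loss:     {rlc_pkt_loss:.1%}")
print(f"RLC block loss:      {rlc_blk_loss:.1%}")
print(f"RaptorQ packet loss: {raptorq_pkt_loss:.2%}")
print(f"Lost pings per unrecoverable RLC block: {rlc_pings_per_lost_block:.1f}")
```

Under that assumption each unrecoverable RLC block costs about 3.6 pings on average, which is why a ~1.7 % block loss shows up as ~6 % packet loss.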

Bandwidth-vs-recovery trade-off (this is the bigger story than RTT)

Per-IP-packet envelope count from the same soak: RLC ships 2 envelopes per packet, RaptorQ ships 17 — an 8.5× airtime gap at this traffic shape. The reason is structural: RaptorQ's block code emits K source + R repair symbols per block regardless of how full the block is, so a 1 Hz ping triggers a flush of 1 real packet plus K-1 zero-padded sources + K repairs. RLC's sliding-window code emits 1 source + R repair per source symbol, so sparse traffic stays sparse on the wire.
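A back-of-the-envelope model of the two emission rules, as a sketch. It assumes `overhead` is the R/K ratio, that RaptorQ emits `ceil(K * overhead)` repairs per flushed block, and that airtime is roughly 2 ms per envelope (the ratio implied by the table: ~4 ms for 2 envelopes, ~34 ms for 17). The function names are hypothetical.

```python
import math

AIRTIME_PER_ENVELOPE_MS = 2.0  # assumption derived from the soak table


def rlc_envelopes_per_symbol(overhead: float) -> int:
    """RLC: 1 systematic source envelope + ceil(overhead) repairs."""
    return 1 + math.ceil(overhead)


def raptorq_envelopes_per_block(k: int, overhead: float) -> int:
    """RaptorQ: K source symbols (padded if sparse) + R repairs per block."""
    return k + math.ceil(k * overhead)


rlc = rlc_envelopes_per_symbol(overhead=1.0)
raptorq = raptorq_envelopes_per_block(k=8, overhead=1.0)

print("RLC envelopes per sparse packet:    ", rlc)
print("RaptorQ envelopes per sparse packet:", raptorq)
print("Modelled RLC airtime (ms):    ", rlc * AIRTIME_PER_ENVELOPE_MS)
print("Modelled RaptorQ airtime (ms):", raptorq * AIRTIME_PER_ENVELOPE_MS)
```

This simple model gives 2 envelopes for RLC and 16 for RaptorQ at K=8, R/K=1. The soak measured ~17 for RaptorQ; the description does not say where the extra envelope comes from, so treat the model as an approximation of the structure rather than a reproduction of the measured count.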

This makes --fec-scheme a bandwidth knob as much as a latency one:

- RLC wins on interactive / sparse traffic: lower per-packet airtime and lower per-packet latency, at the cost of higher residual loss.
- RaptorQ wins on bulk / saturated traffic: its asymptotic ~50 % loss tolerance at large K beats RLC's ~30 % at small W, and the K-symbol amortisation cost only stings when blocks aren't naturally full.

Test plan

- `uv run pytest` → 100 passed
- `python tun_p2p.py --help | grep '^\s*--fec'` parses scheme + window + density-threshold + the existing five flags
- 10-min RLC soak: 564/600 (6 %), RTT avg 42 ms, ~2 envelopes/pkt
- 10-min RaptorQ regression: 599/600 (0.17 %), RTT avg 95 ms, ~17 envelopes/pkt, matching the baseline from PR #86 (RaptorQ (RFC 6330) FEC layer for the stream link)
- Mixed-scheme negative: `test_dispatcher_routes_by_magic` proves RLC envelopes go nowhere through a RaptorQ decoder and vice-versa
- Reviewer to re-soak on their bench with their own pair of 8812/8821 adapters

Builds on master (#86 merged). The cffi extension is built lazily on first import; `pyproject.toml` adds `cffi>=1.16`.

Open caveats (documented in script)

- swif-codec upstream's C99 / verbose-codec quirks are patched in the vendored copy; reapply on next upstream pull.
- The fixed encoder repair cadence (no rateless overhead bumping) is shared with the RaptorQ path; a future PR could add reverse-channel hints.
- Windows MSVC build of the cffi extension is untested; Linux/macOS only on CI initially.

🤖 Generated with Claude Code

Companion to the RaptorQ implementation (#86). RaptorQ is a block code; each IP packet has to wait for K source symbols to accumulate before any encoded output ships, giving an unavoidable ~95 ms per-packet latency floor on this link even at K=8/flush=20 ms. RFC 8681 RLC is a sliding-window code: every source symbol is shipped systematically (zero encoder buffer), repair symbols are linear combinations over the last `window` source symbols, and the decoder can emit each source the instant it arrives unmodified, which is better for interactive traffic.

Per user direction the implementation wraps Inria's reference C library (irtf-nwcrg/swif-codec, co-authored by Vincent Roca, the RFC 8681 author) via cffi rather than reimplementing the codec in pure Python; the vendored source is patched (see vendor/swif-codec/PATCHES.md) to (1) fix a void/void* mismatch in the generic callback signature that breaks strict C99 builds, (2) drop a debug `#define DEBUG 1` and a few hardcoded `printf` / `full_symbol_dump` calls that flooded stdout.

Wire format

A new RLC envelope MAGIC `0xF534` lives alongside RaptorQ's frozen `0xF52E`. The receive dispatcher peeks the first two bytes of the stream payload and routes per-frame: a mixed-scheme deployment is silently rejected (the foreign decoder counts the symbol as malformed) instead of corrupting IP traffic.

RLC inner envelope (14 B header + symbol_size payload):

MAGIC(2) VER(1) TYPE(1) SYMBOL_SIZE(2) ESI(4) WIN(1) KEY(2) DT(1)

Source symbols ride the same length-prefix concat-packing scheme as RaptorQ; the shared `PACKET_LEN_PREFIX` constant lives at the dispatcher.

Files

* `stream_fec.py` reshapes into a thin dispatcher: `FecConfig` grows `scheme` / `window` / `density_threshold` fields, `make_encoder` / `make_decoder` route to the right module. Pre-RLC callers that construct `FecConfig(k=…)` and call `FecEncoder(cfg)` still work via the backward-compat aliases; `scheme` defaults to "raptorq" so existing tests don't change behaviour. `tun_p2p.py`'s `--fec-scheme` flag flips the user-facing default to "rlc".
* `stream_fec_raptorq.py` carries the moved-out RaptorQ classes unchanged (renamed `RaptorQEncoder` / `RaptorQDecoder`).
* `stream_fec_rlc.py` (new): `RlcEncoder` / `RlcDecoder` over the cffi binding. Encoder emits 1 source envelope + ceil(overhead) repair envelopes per sealed source symbol; decoder feeds source symbols straight through (systematic) and rebuilds the encoder's coding window on each repair to let swif-codec re-derive the same TinyMT32-driven coefficients.
* `_swif_build.py`: cffi extension builder (drives gcc directly, bypasses setuptools-distutils path mangling that broke the standard `ffi.compile()` path on modern setuptools).
* `vendor/swif-codec/`: pinned snapshot of upstream commit `de8cd8e`, CeCILL-B; PATCHES.md / COMMIT / AUTHORS retained.
* `test_stream_fec_rlc.py`: 13 RLC tests: round-trip, loss tolerance at 0/10/20%, overhead bumping for 30% loss, concatenation packing, oversized packet, dispatcher MAGIC routing, garbage envelope drop, partial-symbol flush, distinct-MAGIC assertion, config validation.
* `tun_p2p.py`: `--fec-scheme` / `--fec-window` / `--fec-density-threshold` flags. The `make_encoder` / `make_decoder` factories replace the direct constructor calls; the tx_thread / rx_thread / fec_flush_thread are scheme-agnostic. TUN write paths now drop malformed FEC-recovered packets at the boundary (count as `mal`) instead of taking the bridge down.

Verification

Offline (`cd tools/precoder && uv sync && uv run pytest`):

* 100 tests pass (31 pipeline + 37 stream + 19 raptorq + 13 rlc).
* `python tun_p2p.py --help` parses both scheme flag families.

Hardware (two-netns single-host bench, RTL8812AU 0x8812 + TP-Link Archer T2U Plus / RTL8821AU 0x0120, ch 6), 10-minute soak (600 pings at 1 Hz, no --repeat):

| Scheme  | Config              |   Loss |      RTT min/avg/max | blk-lost |
|---------|---------------------|-------:|---------------------:|---------:|
| RLC     | W=16, R/K=1, 20ms   |   6.0% |  13 / **42** / 79 ms | 10 / 586 |
| RaptorQ | K=8, R/K=1, 20ms    |  0.17% | 59 / **95** / 146 ms |  1 / 606 |

The 6 % RLC loss is end-to-end after recovery: ~1.7 % of RLC blocks are unrecoverable (window expires before enough repairs land); each unrecoverable block can carry multiple concatenation-packed ICMP packets, hence the higher packet-level number. Raising `--fec-overhead` closes the gap at the cost of airtime. Median RTT drops by ~45 % vs RaptorQ, which is the systematic-emission win that motivated bringing in a second scheme.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Both fields are already on the RX descriptor: `seq_num` is parsed at FrameParser.cpp:98, `tsfl` was one commented-out line at line 129. The FEC layer (#86 / #87) and any latency-measurement consumer want both visible; this is the data the chip already gives us.

* src/FrameParser.h: add `uint32_t tsfl` to rx_pkt_attrib alongside the existing seq_num.
* src/FrameParser.cpp: uncomment the TSFL parser:

  - /* pattrib.tsfl=(byte)GET_RX_STATUS_DESC_TSFL_8812(pdesc); */
  + pattrib.tsfl = GET_RX_STATUS_DESC_TSFL_8812(pdesc);

  Drop the bogus `(byte)` cast: the macro reads all 32 bits of pdesc+20 as a u32, not a byte (verified against rtl8812a_recv.h).
* demo/main.cpp: extend the <devourer-stream> printf with `seq=%u tsfl=%u`. Optional fields; PR #84's regex pattern in stream_rx.py / tun_p2p.py / corruption_analysis.py already tolerates the new fields via the same pass-through approach used for rssi/evm/snr (no Python-side change required to keep working).

What this enables (out of scope for this PR, just data surfacing)

* FEC RX side can dedup by chip-side seq before feeding the codec, so air-level retransmissions stop double-counting at the codec.
* One-way latency measurement by diffing TSF against the host clock at TX time, a building block for the F5 TX-RPT goodput numbers and for any adaptive `--fec-overhead` loop.

Verification

* `cmake --build build -j` clean.
* Default behaviour: <devourer-stream> lines now carry seq + tsfl fields; existing Python consumers (regexes are tolerant) keep working. tests/regress.py 4-cell matrix byte-identical.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

## Summary

Both fields are already on the RX descriptor: `seq_num` is parsed at `FrameParser.cpp:98`, `tsfl` was one commented-out line at line 129. The FEC layer (#86 / #87) and any latency-measurement consumer want both visible; this PR surfaces what the chip already gives us.

## Changes

- **`src/FrameParser.h`**: add `uint32_t tsfl` to `rx_pkt_attrib` alongside the existing `seq_num`.
- **`src/FrameParser.cpp`**: uncomment the TSFL parser and drop the bogus `(byte)` cast (the macro reads all 32 bits of `pdesc+20` as a u32, not a byte; verified against `rtl8812a_recv.h`):

  ```diff
  - /* pattrib.tsfl=(byte)GET_RX_STATUS_DESC_TSFL_8812(pdesc); */
  + pattrib.tsfl = GET_RX_STATUS_DESC_TSFL_8812(pdesc);
  ```
- **`demo/main.cpp`**: extend the `<devourer-stream>` printf with `seq=%u tsfl=%u`. Optional fields; PR #84's regex pattern in `stream_rx.py` / `tun_p2p.py` / `corruption_analysis.py` already tolerates them via the same pass-through approach used for rssi/evm/snr.

## What this enables (out of scope for this PR, just data surfacing)

- FEC RX side can dedup by chip-side seq before feeding the codec, so air-level retransmissions stop double-counting at the codec.
- One-way latency measurement by diffing TSF against the host clock at TX time, a building block for the F5 TX-RPT goodput numbers and any adaptive `--fec-overhead` loop.

## Test plan

- [x] `cmake --build build -j` clean
- [x] `<devourer-stream>` lines on master now carry `seq` + `tsfl` fields; existing Python consumers tolerate the additions via their existing regex pass-through (no Python-side change required).
- [ ] Reviewer to run an existing tun_p2p bench and confirm the new fields appear without disturbing throughput / loss numbers.

Second in the five-feature C++ series. Followed by:

- F3: selectable stream-carrier rate/BW (uses F1's HT-MCS unlock + this PR's seq/tsfl plumbing for dup detection)
- F5: C2H TX-RPT parser + REG_FIFOPAGE_INFO queue-depth poll
- F2: BB-dbgport per-subcarrier IQ spike (research)

Predecessor: F1 (#88).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

josephnef merged commit 006891a into master Jun 7, 2026
5 checks passed

josephnef deleted the rlc-fec branch June 7, 2026 18:30

josephnef mentioned this pull request Jun 7, 2026

F4: Surface RX seq_num + TSF low on <devourer-stream> #89

Merged

3 tasks

josephnef merged 1 commit into master from rlc-fec
