
Phy-level soft metrics on stream lines + BER-vs-SNR analyser#84

Merged: josephnef merged 1 commit into master from phy-soft-metrics on Jun 7, 2026

Conversation

@josephnef
Collaborator

Summary

Follow-up A from #83. Adds per-path RSSI / EVM / SNR to every <devourer-stream> line so corruption_analysis.py can correlate BER with link quality per frame instead of relying on aggregated statistics.

Changes

  • demo/main.cpp: the <devourer-stream> line is now rate=R len=L crc_err=X icv_err=Y rssi=A,B evm=A,B snr=A,B body=HEX. Same source as the Tier-2 diagnostics in <devourer-body>; no new RX-status fields, just surfacing what FrameParser already populates on RxAtrib.
  • tools/precoder/corruption_analysis.py — parses the new fields, reports two new sections:
    • SNR distribution (min/p25/med/p75/max) for chip-clean vs chip-corrupt populations
    • BER per 5-dB SNR bucket
      Uses max(snr_A, snr_B) as the "effective" SNR — on single-antenna 1T1R sticks path B reads 0 (no signal, not "0 dB"), so a naive min would collapse the bucket view; max picks the active path on 1T1R and the stronger path on 2T2R single-stream operation.
  • stream_rx.py / tun_p2p.py / precoder_stream_roundtrip.py — regex updated to tolerate the new optional rssi=/evm=/snr= fields. None of them use the metrics yet (pass-through compatibility).
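A minimal sketch of how a consumer might tolerate the optional fields and derive the "effective" SNR described above. The pattern and the names `STREAM_RE` and `effective_snr` are illustrative assumptions, not the regex actually used in the tools; the field order is taken from the line format quoted in the first bullet.

```python
import re

# Hypothetical pattern: the rssi/evm/snr group is optional, so lines
# logged before this change still match.
STREAM_RE = re.compile(
    r"<devourer-stream>rate=(?P<rate>\S+) len=(?P<len>\d+) "
    r"crc_err=(?P<crc_err>\d+) icv_err=(?P<icv_err>\d+) "
    r"(?:rssi=(?P<rssi>-?\d+,-?\d+) evm=(?P<evm>-?\d+,-?\d+) "
    r"snr=(?P<snr>-?\d+,-?\d+) )?"
    r"body=(?P<body>[0-9a-fA-F]*)"
)


def effective_snr(line):
    """Return max(snr_A, snr_B), or None if the line has no snr field."""
    m = STREAM_RE.search(line)
    if m is None or m.group("snr") is None:
        return None
    snr_a, snr_b = (int(x) for x in m.group("snr").split(","))
    # Path B reads 0 on 1T1R sticks, so max() picks the active path.
    return max(snr_a, snr_b)
```

Taking the maximum rather than the minimum keeps a 1T1R receiver's frames out of the 0-5 dB bucket, which is the behaviour the description argues for.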

Hardware verification

500 frames at default TX power, RTL8812AU → T2U Plus RTL8821AU, ch 6:

phy SNR (stronger path, dB):
  chip-clean    : n=467 min=0 p25=30 med=33 p75=38 max=51
  chip-corrupt  : n=0

BER by SNR bucket (stronger path, 5-dB buckets):
  bucket       frames   bits-cmp   bit-err    BER
       0-5 dB        1        192        0   0.000e+00
     20-25 dB       11       2112        0   0.000e+00
     25-30 dB       76      14592        0   0.000e+00
     30-35 dB      178      34176        0   0.000e+00
     35-40 dB      122      23424        0   0.000e+00
     40-45 dB       55      10560        0   0.000e+00
     45-50 dB       19       3648        0   0.000e+00
     50-55 dB        5        960        0   0.000e+00

Bench link is too clean for chip-corrupt events even at the SNR tails — same finding as the post-PR-investigation in #83 (loss is at PHY sync, not FCS). The analyser is ready for noisier deployments / range-extended captures (follow-up B).

Offline analyser smoke

Synthetic 5-clean@28dB + 5-corrupt@5dB injection. Analyser correctly buckets:

BER by SNR bucket (stronger path, 5-dB buckets):
  bucket       frames   bits-cmp   bit-err    BER
      5-10 dB        5        960       10   1.042e-02
     25-30 dB        5        960        0   0.000e+00

The per-bucket correlation works as designed — corrupted samples land in the 5-10 dB bucket at 1.04×10⁻² BER, clean samples land at high SNR with BER 0.
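The bucketing arithmetic behind that table can be sketched as follows. This is an illustration, not the analyser's code: `ber_by_bucket` is a hypothetical name, buckets are assumed to be half-open intervals [lo, lo+5), and the synthetic input assumes 192 bits per frame with two bit errors in each corrupt frame (only the totals of 960 bits and 10 errors come from the output above).

```python
from collections import defaultdict


def ber_by_bucket(frames, width=5):
    """frames: iterable of (snr_db, bits_compared, bit_errors) tuples.

    Returns {bucket_low_edge: (frame_count, bits, errors, ber)}.
    """
    acc = defaultdict(lambda: [0, 0, 0])
    for snr_db, bits, errors in frames:
        low = (snr_db // width) * width  # 5 dB falls in the 5-10 bucket
        acc[low][0] += 1
        acc[low][1] += bits
        acc[low][2] += errors
    return {
        low: (n, bits, errs, errs / bits if bits else 0.0)
        for low, (n, bits, errs) in sorted(acc.items())
    }


# Synthetic input mirroring the smoke test: 5 clean frames at 28 dB and
# 5 corrupt frames at 5 dB (the per-frame error split is an assumption).
frames = [(28, 192, 0)] * 5 + [(5, 192, 2)] * 5
table = ber_by_bucket(frames)
```

With this input the 5-10 dB bucket accumulates 10 errors over 960 bits, which is the 1.042e-02 figure reported by the analyser.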

Builds on #83 (merged). Next: follow-up B — characterise real-world background corruption patterns (burst-length distribution, byte-position distribution) to inform stream-layer FEC design.

🤖 Generated with Claude Code

Follow-up A from #83. Adds per-path RSSI / EVM / SNR to every
<devourer-stream> line so corruption_analysis.py can correlate BER
with link quality on a per-frame basis instead of relying on
aggregated statistics.

* demo/main.cpp: <devourer-stream>rate=R len=L crc_err=X icv_err=Y
  rssi=A,B evm=A,B snr=A,B body=HEX. Same source as the Tier-2
  diagnostics in <devourer-body>; no new RX-status fields, just
  surfacing what FrameParser already populates.
* tools/precoder/corruption_analysis.py: parses the new fields,
  reports
    - SNR distribution (min/p25/med/p75/max) for chip-clean vs
      chip-corrupt populations
    - BER per 5-dB SNR bucket
  Uses max(snr_A, snr_B) as the "effective" SNR — on single-antenna
  1T1R sticks path B reads 0 (no signal, not "0 dB"), so a naive min
  would always report 0 and the bucket view collapses; max picks
  the active path on 1T1R and the stronger path on 2T2R
  single-stream operation.
* stream_rx.py / tun_p2p.py / precoder_stream_roundtrip.py: regex
  updated to tolerate the new optional rssi/evm/snr fields (none
  read them yet — pass-through compatibility).

Verification

Hardware (500 frames at default TX power, RTL8812AU → T2U Plus
RTL8821AU, ch 6):

    phy SNR (stronger path, dB):
      chip-clean    : n=467 min=0 p25=30 med=33 p75=38 max=51
      chip-corrupt  : n=0
    BER by SNR bucket (stronger path, 5-dB buckets):
      bucket       frames   bits-cmp   bit-err    BER
           0-5 dB        1        192        0   0.000e+00
         20-25 dB       11       2112        0   0.000e+00
         25-30 dB       76      14592        0   0.000e+00
         30-35 dB      178      34176        0   0.000e+00
         35-40 dB      122      23424        0   0.000e+00
         40-45 dB       55      10560        0   0.000e+00
         45-50 dB       19       3648        0   0.000e+00
         50-55 dB        5        960        0   0.000e+00

Bench link is too clean for chip-corrupt events even at the SNR tails,
which matches the post-PR-investigation finding for #83: at bench
distance the loss is at PHY sync, not FCS. The analyser is ready for
noisier deployments / range-extended captures (follow-up B).

Offline smoke (synthetic 5-clean@28dB + 5-corrupt@5dB injection)
correctly buckets BER=0 in the 25-30 dB bucket and BER=1.04e-2 in the
5-10 dB bucket — the per-bucket correlation works as designed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@josephnef josephnef merged commit fa71838 into master Jun 7, 2026
5 checks passed
@josephnef josephnef deleted the phy-soft-metrics branch June 7, 2026 14:20
josephnef added a commit that referenced this pull request Jun 7, 2026
## Summary

Follow-up B from #83 (depends on #84's phy soft metrics): a chip-side
`DEVOURER_RX_DUMP_ALL=1` env var that emits one line per RX frame with
the chip's full integrity + phy soft-metric vector, plus an aggregate
analyser that turns those into FEC-design-grade statistics.

The previous work showed that the **chip-corrupt** pipeline now reaches
the application layer (#83) and that per-frame phy metrics let the
analyser correlate BER with SNR (#84). This PR is the third leg: a
**long-capture survey** tool that characterises the actual
corruption-pattern distribution real-world deployments face, so a FEC
layer on top of the stream link can be sized empirically rather than
guessed.

## Changes

- **`demo/main.cpp`** — new `DEVOURER_RX_DUMP_ALL=1` knob emits
`<devourer-corrupt-any>len=L crc_err=X icv_err=Y rate=R rssi=A,B evm=A,B
snr=A,B`. Body bytes are deliberately omitted (a hot survey would
inflate the log past usable size); the aggregate report only needs
length + flags + phy.
- **`tools/precoder/corruption_survey.py`** — new tool that reads those
lines and reports:
  - headline chip-clean vs chip-corrupt counts
  - **corruption rate broken down by DESC_RATE** (the CCK-vs-OFDM split;
    without this the headline is dominated by always-clean CCK
    ACKs/beacons and underestimates what OFDM data faces)
  - frame-size distribution for each population
  - phy-metric stats per population, filtered to frames where the chip
    populated phy stats (CCK reports 0/0; we treat as "no measurement"
    instead of "0 dB" so the buckets don't collapse)
  - per-SNR-bucket corruption rate (where measurable)
  - temporal clustering (live captures only)
  - a heuristic FEC recommendation based on median-vs-peak corruption rate
## Bench finding

60-second ch6 capture in a busy office environment with several APs in
range:

```
=== corruption survey (2266 frames, file/pipe) ===
chip-clean       :   1663 ( 73.4%)
chip-corrupt     :    603 ( 26.6%)
corruption rate  : 26.61%
no-phy-measurement:  2103  (CCK/short frames, chip reports 0/0)

Corruption rate by DESC_RATE:
   idx name            count      %    corrupt    rate
  0x00 1M CCK           2075  91.6%        412  19.9%
  0x02 5.5M CCK            2   0.1%          2 100.0%
  0x03 11M CCK             1   0.0%          1 100.0%
  0x04 6M OFDM            17   0.8%         17 100.0%
  0x05 9M OFDM            19   0.8%         19 100.0%
  0x06 12M OFDM           20   0.9%         20 100.0%
  0x07 18M OFDM           31   1.4%         31 100.0%
  0x08 24M OFDM           22   1.0%         22 100.0%
  0x09 36M OFDM           30   1.3%         30 100.0%
  0x0a 48M OFDM           31   1.4%         31 100.0%
  0x0b 54M OFDM           18   0.8%         18 100.0%
```

## Reading the result

- **1M CCK loses ~20%** even at this location — CCK is robust but
background interference still nukes one in five ACKs/beacons.
- **Every OFDM rate above CCK is 100% corrupt** because we're hearing
distant APs at marginal SNR — the chip detects them, decodes them, fails
the FCS, and now (with #83's RCR change) surfaces them.

The FEC-design takeaway:

- The PoC's 6M OFDM stream link only works because TX and RX are
co-located. At any real range the chip will surface FCS failures at high
rate.
- The stream layer needs **inter-frame parity** (Reed-Solomon over N
frames + K parity, Raptor, etc.) to recover from blocks of lost frames,
not just per-frame FEC.
- For a P2P link's typical "moderate range" use case (e.g. OpenIPC
long-range video), expect frame loss rates in the 30–70% range. FEC
overhead has to be sized accordingly: at 50% loss an ideal erasure code
needs at least as many parity frames as source frames (K ≥ N, i.e. a
code rate N/(N+K) of at most 0.5) to be reliable.
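The sizing arithmetic can be made concrete. Taking N as the number of source frames and K as the number of parity frames (as in the Reed-Solomon bullet above), an ideal erasure code decodes once any N of the N+K frames arrive, so on average (N+K)(1-p) ≥ N is required at loss rate p, giving K ≥ N·p/(1-p). The helper below is a sketch under that idealised, independent-loss assumption; `min_parity` is a hypothetical name, and a real link with bursty loss would need margin on top.

```python
import math


def min_parity(n_source, loss):
    """Smallest K with (N + K) * (1 - loss) >= N, for an ideal erasure code."""
    if not 0.0 <= loss < 1.0:
        raise ValueError("loss must be in [0, 1)")
    return math.ceil(n_source * loss / (1.0 - loss))
```

For example, `min_parity(16, 0.5)` returns 16 (one parity frame per source frame), while 30% loss on a 10-frame block needs 5 parity frames.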

## Follow-ups (for whoever picks up the FEC layer)

- Pick a parity scheme (Reed-Solomon is simplest, Raptor scales better)
and parametrise N, K against captures from realistic ranges.
- Decide where parity rides: in-band on the same SA (current TX path)
vs. on a dedicated SA / frame type. In-band keeps the link simple but
eats stream airtime.
- Consider degrading rate gracefully (rateless codes) so the receiver
can decode at whatever fraction of N+K frames it actually receives.

Builds on #83 (chip-level filter open, merged) and #84 (phy soft
metrics, open).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 7, 2026
## Summary

The corruption survey in #85 showed real-range OFDM frames on this link
will see **30–70% loss**. tun_p2p.py's blind `--repeat N` is a
fixed-cost workaround that can't compose to handle the tail; this PR
ships a real erasure code on top of the existing stream framing.

## Library

`raptorq` from cberner (Rust+PyO3 binding to the RFC 6330 reference
port). MIT, manylinux abi3 wheels on PyPI, ~26 Gbps enc / ~7 Gbps dec at
K=1000 on commodity x86. `uv add raptorq` is the only install step.

## Wire format

The existing `stream.py` framing stays untouched. FEC is an **inner
envelope** living inside `StreamFrame.payload`:

```
   FEC_MAGIC      (2)  = 0xF52E
   VERSION/FLAGS  (1)  = 0
   K              (1)  = source symbols per block
   KREAL          (1)  = real source symbols in this block (≤ K). Trailing
                        (K - KREAL) decoded symbols are zero-pad to discard.
   SYMBOL_SIZE    (2)  = LE u16
   BLOCK_ID       (2)  = LE u16 wraps
   RAPTORQ_PKT    (var) = lib-managed SBN+ESI+symbol
   inner overhead   = 9 B + raptorq's 4 B SBN/ESI = 13 B
```

Source symbols are themselves concatenations of length-prefixed IP
packets:

```
[u16 len_a][packet_a]…[u16 len_b][packet_b]…[zero pad to SYMBOL_SIZE]
```

So small packets (ACK floods) share symbols instead of each burning a
whole symbol's worth of airtime.
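A minimal sketch of that packing, under stated assumptions: the length prefix is little-endian, every packet fits inside one symbol, and a zero length prefix marks the start of the padding. `pack_symbol` and `unpack_symbol` are hypothetical names, not the `FecEncoder` API.

```python
import struct


def pack_symbol(packets, symbol_size):
    """Concatenate [u16 len][packet] records and zero-pad to symbol_size."""
    out = bytearray()
    for pkt in packets:
        if not pkt:
            raise ValueError("empty packet would be mistaken for padding")
        out += struct.pack("<H", len(pkt)) + pkt
    if len(out) > symbol_size:
        raise ValueError("packets do not fit in one symbol")
    return bytes(out) + bytes(symbol_size - len(out))


def unpack_symbol(symbol):
    """Recover the packets; a zero length prefix means padding starts."""
    packets, offset = [], 0
    while offset + 2 <= len(symbol):
        (length,) = struct.unpack_from("<H", symbol, offset)
        if length == 0:
            break
        offset += 2
        if offset + length > len(symbol):
            break  # truncated record, stop rather than return junk
        packets.append(symbol[offset:offset + length])
        offset += length
    return packets
```

Two small packets such as `b"ab"` and `b"cde"` occupy 9 bytes of a 16-byte symbol here, instead of two full symbols.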

## Files

- `tools/precoder/pyproject.toml` — add `raptorq>=2`.
- `tools/precoder/stream_fec.py` — `FecConfig`, `FecEncoder`
(concatenation packing + block encoding), `FecDecoder`
(block-incremental decode + late-symbol drop + block expiry).
- `tools/precoder/test_stream_fec.py` — 19 unit tests: round-trip, loss
tolerance 0/20/40% at R/K=1, 50% at R/K=2, unrecoverable-block
bookkeeping at 70%, concatenation, partial flush, block-id wrap, MTU
enforcement, garbage envelopes.
- `tools/precoder/tun_p2p.py` — new
`--fec-k`/`--fec-overhead`/`--fec-symbol-size`/`--fec-flush-ms`/`--fec-block-expire-ms`
flags. tx_thread feeds packets through the encoder; a parallel
`fec_flush_thread` force-encodes partial blocks every flush-ms (sparse
traffic doesn't stall). rx_thread feeds payloads through the decoder;
decoded IP packets go to TUN. Outer `SeqWindow` dedup is forced OFF when
FEC is on (RaptorQ symbols self-dedup via SBN+ESI). New `fec=[...]`
segment in the periodic stderr report. Docstring extended.

## Hardware verification

Two-netns single-host bench (RTL8812AU `0x8812` + TP-Link Archer T2U
Plus / RTL8821AU `2357:0120`, ch 6, no `--repeat`, `ping -c 30 -i 1`):

| Config | RTT min/avg/max | Loss | DUP | Blocks ok/lost |
|---|---|---:|---:|---:|
| `--fec-k 16 --fec-overhead 1.0 --fec-flush-ms 50` | 121 / **160** / 207 ms | 0% | 0 | 30 / 1 (startup) |
| `--fec-k 8 --fec-overhead 1.0 --fec-flush-ms 20` | 73 / **95** / 145 ms | 0% | 0 | 30 / 1 (startup) |

The K=8 config trades a bit of recovery margin for a 65 ms drop in
average RTT (160 ms to 95 ms). Both decode 100% of source packets on a
healthy link; the survey's noisier regimes are what motivate
`--fec-overhead > 1`.

For comparison from PR #82's earlier numbers (same bench, byte mode):

| Mode | Loss | Avg RTT |
|---|---:|---:|
| Byte mode `--repeat 1` | 10% | 7 ms |
| Byte mode `--repeat 4` + dedup | 0% | 10 ms (with up to 25 DUPs per ping eaten by dedup) |
| **FEC K=8 R/K=1 flush=20**  | **0%** | **95 ms** |

FEC moves us from "blind redundancy + dedup" to "real erasure code". The
latency cost is the K-source-symbol encode buffer; the win is that the
codec scales gracefully to higher loss rates by raising `--fec-overhead`
instead of running out at `--repeat=∞`.

## Test plan

- [x] `cd tools/precoder && uv run pytest` → 87 passed (31 pipeline + 37
stream + 19 fec)
- [x] `python -m pytest tests/precoder_smoke.py
tests/precoder_stream_smoke.py` → 8 passed
- [x] tun_p2p.py --help parses cleanly (incl. all FEC flags)
- [x] Bench: K=16/R=1 and K=8/R=1, both 30/30 ping with 0% loss and 0
DUPs

## Open caveats (documented in script)

- Strict block boundaries — no cross-block FEC, no Raptor carousel. Good
enough at K=8–16 + 20–50 ms flush; revisit if the latency budget
tightens further.
- No rateless dynamic overhead — R/K is fixed at construction. A future
PR could let RX hint TX to send more repair symbols via a
reverse-channel feedback envelope.
- Patent note: RFC 6330 has Qualcomm patents largely expired in primary
jurisdictions by 2026; cberner's MIT lib explicitly notes this.

Builds on #82 (TUN bridge, merged), #83 (corrupted-frame surfacing,
merged), #84 (phy soft metrics, open), #85 (corruption survey, open).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 7, 2026
Both fields are already on the RX descriptor: `seq_num` is parsed at
FrameParser.cpp:98, `tsfl` was one commented-out line at line 129. The
FEC layer (#86 / #87) and any latency-measurement consumer want both
visible; this is the data the chip already gives us.

* src/FrameParser.h — add `uint32_t tsfl` to rx_pkt_attrib alongside
  the existing seq_num.
* src/FrameParser.cpp — uncomment the TSFL parser:
  -   /* pattrib.tsfl=(byte)GET_RX_STATUS_DESC_TSFL_8812(pdesc); */
  +   pattrib.tsfl = GET_RX_STATUS_DESC_TSFL_8812(pdesc);
  Drop the bogus `(byte)` cast — the macro reads all 32 bits of
  pdesc+20 as a u32, not a byte (verified against rtl8812a_recv.h).
* demo/main.cpp — extend the <devourer-stream> printf with
  `seq=%u tsfl=%u`. Optional fields; PR #84's regex pattern in
  stream_rx.py / tun_p2p.py / corruption_analysis.py already tolerates
  the new fields via the same pass-through approach used for
  rssi/evm/snr (no Python-side change required to keep working).

What this enables (out of scope for this PR — just data surfacing)

* FEC RX side can dedup by chip-side seq before feeding the codec, so
  air-level retransmissions stop double-counting at the codec.
* One-way latency measurement by diffing TSF against the host clock
  at TX time — a building block for the F5 TX-RPT goodput numbers and
  for any adaptive `--fec-overhead` loop.

Verification

* `cmake --build build -j` clean.
* Default behaviour: <devourer-stream> lines now carry seq + tsfl
  fields; existing Python consumers (regexes are tolerant) keep
  working. tests/regress.py 4-cell matrix byte-identical.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
josephnef added a commit that referenced this pull request Jun 7, 2026
## Summary

Both fields are already on the RX descriptor: `seq_num` is parsed at
`FrameParser.cpp:98`, `tsfl` was one commented-out line at line 129. The
FEC layer (#86 / #87) and any latency-measurement consumer want both
visible; this PR surfaces what the chip already gives us.

## Changes

- **`src/FrameParser.h`** — add `uint32_t tsfl` to `rx_pkt_attrib`
alongside the existing `seq_num`.
- **`src/FrameParser.cpp`** — uncomment the TSFL parser and drop the
bogus `(byte)` cast (the macro reads all 32 bits of `pdesc+20` as a u32,
not a byte — verified against `rtl8812a_recv.h`):
  ```diff
  - /* pattrib.tsfl=(byte)GET_RX_STATUS_DESC_TSFL_8812(pdesc); */
  + pattrib.tsfl = GET_RX_STATUS_DESC_TSFL_8812(pdesc);
  ```
- **`demo/main.cpp`** — extend the `<devourer-stream>` printf with
`seq=%u tsfl=%u`. Optional fields; PR #84's regex pattern in
`stream_rx.py` / `tun_p2p.py` / `corruption_analysis.py` already
tolerates them via the same pass-through approach used for rssi/evm/snr.

## What this enables (out of scope for this PR — just data surfacing)

- FEC RX side can dedup by chip-side seq before feeding the codec, so
air-level retransmissions stop double-counting at the codec.
- One-way latency measurement by diffing TSF against the host clock at
TX time — a building block for the F5 TX-RPT goodput numbers and any
adaptive `--fec-overhead` loop.
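Both uses need wrap-aware handling, sketched below under stated assumptions: `tsfl` is taken to be the low 32 bits of the microsecond TSF counter (so differences are computed modulo 2^32), and `seq` is taken to be the 12-bit 802.11 sequence number. The names `tsfl_delta_us` and `RecentSeq` are hypothetical and no consumer in this PR implements them.

```python
from collections import deque

TSFL_MODULUS = 1 << 32  # tsfl is a 32-bit counter and wraps


def tsfl_delta_us(earlier, later):
    """Microseconds from `earlier` to `later`, tolerant of one wrap."""
    return (later - earlier) % TSFL_MODULUS


class RecentSeq:
    """Remember the last `depth` sequence numbers to spot retransmissions."""

    def __init__(self, depth=32):
        self._order = deque(maxlen=depth)
        self._seen = set()

    def is_duplicate(self, seq):
        seq &= 0x0FFF  # assumed 12-bit sequence space
        if seq in self._seen:
            return True
        if len(self._order) == self._order.maxlen:
            # The oldest entry is about to be evicted by append().
            self._seen.discard(self._order[0])
        self._order.append(seq)
        self._seen.add(seq)
        return False
```

A bounded window matters because the sequence space is small: a number seen long ago will legitimately reappear, so only recent history should count as a duplicate.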

## Test plan

- [x] `cmake --build build -j` clean
- [x] `<devourer-stream>` lines on master now carry `seq` + `tsfl`
fields; existing Python consumers tolerate the additions via their
existing regex pass-through (no Python-side change required).
- [ ] Reviewer to run an existing tun_p2p bench and confirm the new
fields appear without disturbing throughput / loss numbers.

Second in the five-feature C++ series. Followed by:
- F3 — selectable stream-carrier rate/BW (uses F1's HT-MCS unlock + this
PR's seq/tsfl plumbing for dup detection)
- F5 — C2H TX-RPT parser + REG_FIFOPAGE_INFO queue-depth poll
- F2 — BB-dbgport per-subcarrier IQ spike (research)

Predecessor: F1 (#88).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>