Skip to content

fix(payment): use XOR-only local lookup for close-group verification#140

Merged
jacderida merged 1 commit into
WithAutonomi:rc-2026.6.2from
jacderida:fix/xor-only-closeness-verification
Jun 13, 2026
Merged

fix(payment): use XOR-only local lookup for close-group verification#140
jacderida merged 1 commit into
WithAutonomi:rc-2026.6.2from
jacderida:fix/xor-only-closeness-verification

Conversation

@jacderida

Copy link
Copy Markdown
Collaborator

Problem

On a testnet with NAT-simulated nodes, ~100% of uploads fail. Each failed upload reports N-1/N chunks stored, 1 failed, dominated by:

ClientPut receiver <peer> is not among this node's local 9 closest peers (close group plus storage margin)
Paid quote issuer <peer> is not among this node's local 7 closest peers for <chunk>

Because one un-storable chunk fails the whole file, the failure rate scales multiplicatively with file size.

Cause

Both local-admission verification checks call find_closest_nodes_local_with_self:

  • AntProtocol::validate_store_membership (src/storage/handler.rs) — is the receiver responsible for this chunk?
  • PaymentVerifier::validate_paid_quote_issuer_close_group (src/payment/verifier.rs) — is the paid quote's issuer in the close group?

find_closest_nodes_local_with_self ranks the local routing table by reachability (directly-reachable peers first, XOR distance only as a tiebreaker). That ordering demotes an XOR-close relay-only / NAT'd peer out of the compared window. The client, however, selects its quoted close group by pure XOR distance (network lookup). So on a network with NAT'd nodes the two views diverge and the node rejects honest payments — including the receiver wrongly deciding it is not responsible for a chunk it is XOR-close to.

Fix

Closeness verification must mirror the uploader's pure XOR-distance peer selection. Switch both checks to the XOR-only sibling find_closest_nodes_local_by_distance_with_self (which exists for exactly this purpose). No dependency change.

  • Receiver check: XOR-only, keeps its storage-admission width.
  • Issuer check: XOR-only, verifies against the configured close group.

Note

This supersedes and reverts the earlier issuer-check width-widening (#139, close_group_sizestorage_admission_width). That change targeted the wrong mechanism: widening a reachability-reranked window cannot recover an XOR-close peer that the re-rank demoted, and the receiver check (unchanged, already at the wider width) was failing just as hard — which is what pinpointed the re-rank, not the width, as the cause.

🤖 Generated with Claude Code

Both local-admission verification checks — the ClientPut receiver
storage-responsibility check and the paid-quote issuer close-group check —
called `find_closest_nodes_local_with_self`, which ranks the local routing
table by reachability (preferring directly-reachable peers, XOR distance
only as a tiebreaker). That ordering demotes an XOR-close relay-only / NAT'd
peer out of the compared window, so on a network with NAT'd nodes the
verifying node's close-group view diverges from the client's pure
XOR-distance quote selection and honest payments are rejected:

  ClientPut receiver <peer> is not among this node's local 9 closest peers
  Paid quote issuer <peer> is not among this node's local 7 closest peers

One un-storable chunk fails the whole upload, so the failure rate scales
multiplicatively with file size — on a 30%-NAT testnet uploads fail ~100%.

Closeness *verification* must mirror the uploader's pure XOR-distance peer
selection, so switch both checks to the XOR-only sibling
`find_closest_nodes_local_by_distance_with_self` (added for exactly this
purpose). The receiver check keeps its storage-admission width; the issuer
check verifies against the configured close group.

This supersedes the earlier width-widening of the issuer check
(close_group_size + STORAGE_ADMISSION_MARGIN), which targeted the wrong
mechanism — widening a reachability-reranked window cannot recover a demoted
XOR-close peer — and reverts that change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dirvine

dirvine commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

Hermes review

Verdict: LGTM / no code blockers found. The change is well-scoped and matches the staging failure mode described.

What I checked:

  • Diff is only two files:
    • src/storage/handler.rs
    • src/payment/verifier.rs
  • ClientPut receiver admission now uses find_closest_nodes_local_by_distance_with_self(..., storage_admission_width(...))
  • paid quote issuer verification now uses find_closest_nodes_local_by_distance_with_self(..., close_group_size)
  • This keeps the important split intact:
    • receiver/storage admission: close group + storage margin
    • paid issuer authority: strict configured close group
  • Confirmed the saorsa-core method is explicitly documented as XOR-only, self-inclusive, and intended for closeness verification rather than reachability-ranked storage selection.

Local checks run:

cargo test --no-run -q
cargo test issuer_close_group --lib --no-fail-fast
cargo test test_put_rejects_out_of_range_receiver_before_payment_cache --lib --no-fail-fast
cargo test --lib --no-fail-fast
cargo fmt --all -- --check
cargo clippy --lib -- -D warnings

Results:

  • cargo test --lib: 540 passed
  • targeted issuer/receiver tests: passed
  • fmt: passed
  • clippy: passed

CI status when checked:

  • Build: ubuntu/mac/windows passed
  • Clippy, format, docs, security audit, no-logging tests: passed
  • OS test matrix still had ubuntu/mac/windows test jobs pending, so I’d wait for those before merge.

Only minor note: there is no new ant-node-level regression test directly modelling “XOR-close relay-only peer demoted by reachability ranking”. The saorsa-core dependency does have tests/documentation for the XOR-only comparator, and this PR is mostly wiring to that intended API, so I don’t see that as blocking.

@jacderida jacderida merged commit 2cf8ad2 into WithAutonomi:rc-2026.6.2 Jun 13, 2026
11 checks passed
jacderida added a commit that referenced this pull request Jun 13, 2026
Includes PR #140 (XOR-only local lookup for close-group verification).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants