fix(payment): use XOR-only local lookup for close-group verification#140
Merged
jacderida merged 1 commit on Jun 13, 2026
Both local-admission verification checks (the ClientPut receiver storage-responsibility check and the paid-quote issuer close-group check) called `find_closest_nodes_local_with_self`, which ranks the local routing table by reachability (preferring directly-reachable peers, with XOR distance only as a tiebreaker). That ordering demotes an XOR-close relay-only / NAT'd peer out of the compared window, so on a network with NAT'd nodes the verifying node's close-group view diverges from the client's pure XOR-distance quote selection and honest payments are rejected:

    ClientPut receiver <peer> is not among this node's local 9 closest peers
    Paid quote issuer <peer> is not among this node's local 7 closest peers

One un-storable chunk fails the whole upload, so the failure rate scales multiplicatively with file size: on a 30%-NAT testnet, ~100% of uploads fail.

Closeness *verification* must mirror the uploader's pure XOR-distance peer selection, so switch both checks to the XOR-only sibling `find_closest_nodes_local_by_distance_with_self` (added for exactly this purpose). The receiver check keeps its storage-admission width; the issuer check verifies against the configured close group.

This supersedes the earlier width-widening of the issuer check (close_group_size + STORAGE_ADMISSION_MARGIN) and reverts that change. The widening targeted the wrong mechanism: widening a reachability-reranked window cannot recover a demoted XOR-close peer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hermes review
Verdict: LGTM / no code blockers found. The change is well-scoped and matches the staging failure mode described. What I checked:
Local checks run: Results:
CI status when checked:
Only minor note: there is no new ant-node-level regression test directly modelling "XOR-close relay-only peer demoted by reachability ranking". The saorsa-core dependency does have tests/documentation for the XOR-only comparator, and this PR is mostly wiring to that intended API, so I don't see that as blocking.
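The scenario named in that note can be modelled in a few lines. The sketch below is a toy model only: `Peer`, `sample_peers`, `closest_by_reachability`, and `closest_by_xor` are hypothetical names, the 8-bit IDs are a simplification, and none of this is the saorsa-core or ant-node API. It assumes the reachability-first ordering behaves as described in the commit message (directly-reachable peers first, XOR distance as tiebreaker) and uses the window widths 7 and 9 from the rejection messages.

```rust
// Toy model, NOT the saorsa-core / ant-node API. All names are hypothetical.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Peer {
    id: u8,       // toy 8-bit ID; real peer IDs are much wider
    direct: bool, // true = directly reachable, false = relay-only / NAT'd
}

/// Reachability-first ranking: directly reachable peers sort first,
/// XOR distance to `target` only breaks ties within each group.
fn closest_by_reachability(peers: &[Peer], target: u8, k: usize) -> Vec<u8> {
    let mut ranked = peers.to_vec();
    ranked.sort_by_key(|p| (!p.direct, p.id ^ target));
    ranked.into_iter().take(k).map(|p| p.id).collect()
}

/// Pure XOR-distance ranking, ignoring reachability.
fn closest_by_xor(peers: &[Peer], target: u8, k: usize) -> Vec<u8> {
    let mut ranked = peers.to_vec();
    ranked.sort_by_key(|p| p.id ^ target);
    ranked.into_iter().take(k).map(|p| p.id).collect()
}

/// Ten directly reachable peers (IDs 2..=11) plus one relay-only peer (ID 1)
/// that is the XOR-closest peer to target 0.
fn sample_peers() -> Vec<Peer> {
    let mut peers: Vec<Peer> = (2..=11).map(|id| Peer { id, direct: true }).collect();
    peers.push(Peer { id: 1, direct: false });
    peers
}

fn main() {
    let peers = sample_peers();
    let target = 0u8;

    // The uploader's view: pure XOR distance keeps the relay-only peer.
    assert!(closest_by_xor(&peers, target, 7).contains(&1));

    // The verifier's old view: the relay-only peer is demoted out of the window.
    assert!(!closest_by_reachability(&peers, target, 7).contains(&1));

    // Widening the window from 7 to 9 does not bring it back, because
    // at least 9 directly reachable peers still rank ahead of it.
    assert!(!closest_by_reachability(&peers, target, 9).contains(&1));

    println!("relay-only peer 1 is XOR-closest but absent from both reranked windows");
}
```

In this model the relay-only peer only re-enters a reachability-ranked window once the window is wider than the number of directly reachable peers, which is why a fixed margin does not help on a populated routing table.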
jacderida added a commit that referenced this pull request on Jun 13, 2026
Includes PR #140 (XOR-only local lookup for close-group verification).
Problem
On a testnet with NAT-simulated nodes, ~100% of uploads fail. Each failed upload reports `N-1/N chunks stored, 1 failed`, dominated by:

Because one un-storable chunk fails the whole file, the failure rate scales multiplicatively with file size.
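To make the multiplicative scaling concrete, here is a minimal sketch under a simplifying assumption that is not from the PR: each chunk is rejected independently with the same probability `p`, so an upload of `n` chunks succeeds with probability `(1 - p)^n`. The function name and the 5% per-chunk rate are hypothetical and chosen only for illustration; real rejections depend on which peers are NAT'd and are likely correlated.

```rust
/// Probability that an upload of `chunks` chunks succeeds, assuming each
/// chunk is independently un-storable with probability `p_chunk_fail`.
/// (Independence is an assumption made for illustration only.)
fn upload_success_probability(p_chunk_fail: f64, chunks: u32) -> f64 {
    (1.0 - p_chunk_fail).powi(chunks as i32)
}

fn main() {
    // Hypothetical per-chunk rejection rate, not a measured value.
    let p_chunk_fail = 0.05;

    for chunks in [1u32, 10, 100, 1000] {
        let p_ok = upload_success_probability(p_chunk_fail, chunks);
        println!("{chunks:>5} chunks -> success probability {p_ok:.6}");
    }
}
```

Even a small per-chunk rejection rate compounds quickly: under these assumptions a 100-chunk upload already succeeds less than 1% of the time.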
Cause
Both local-admission verification checks call `find_closest_nodes_local_with_self`:
- `AntProtocol::validate_store_membership` (`src/storage/handler.rs`): is the receiver responsible for this chunk?
- `PaymentVerifier::validate_paid_quote_issuer_close_group` (`src/payment/verifier.rs`): is the paid quote's issuer in the close group?

`find_closest_nodes_local_with_self` ranks the local routing table by reachability (directly-reachable peers first, XOR distance only as a tiebreaker). That ordering demotes an XOR-close relay-only / NAT'd peer out of the compared window. The client, however, selects its quoted close group by pure XOR distance (network lookup). So on a network with NAT'd nodes the two views diverge and the node rejects honest payments, including the receiver wrongly deciding it is not responsible for a chunk it is XOR-close to.
Fix
Closeness verification must mirror the uploader's pure XOR-distance peer selection. Switch both checks to the XOR-only sibling `find_closest_nodes_local_by_distance_with_self` (which exists for exactly this purpose). No dependency change.
Note
This supersedes and reverts the earlier issuer-check width-widening (#139, `close_group_size` → `storage_admission_width`). That change targeted the wrong mechanism: widening a reachability-reranked window cannot recover an XOR-close peer that the re-rank demoted, and the receiver check (unchanged, already at the wider width) was failing just as hard, which is what pinpointed the re-rank, not the width, as the cause.
🤖 Generated with Claude Code