feat: MCP rate-limit handling (typed 429, init cool-off, init race fix, ensure_initialized, docs) by KhaledSalhab-Develeap · Pull Request #25 · develeap/hyperping-python

KhaledSalhab-Develeap · 2026-05-21T03:32:57Z

Summary

Fixes how the MCP client handles initialize-bucket rate limiting and adds a small ergonomic surface:

A1 (JSON-RPC rate-limit classification): Sync and async _send_rpc now recognise HTTP 200 responses carrying JSON-RPC error.code == -32000 with "rate limit" in the message and raise HyperpingRateLimitError with retry_after parsed from Retry after Ns, status_code=200, and the original error preserved as response_body. Previously these surfaced as a plain HyperpingAPIError with no typed handling and no .retry_after.
A2 (initialize cool-off latch): Both transports gained _init_blocked_until on a monotonic clock. A rate-limited handshake arms the latch with max(retry_after, 30); subsequent call_tool calls within the window short-circuit via initialize() raising HyperpingRateLimitError(retry_after=remaining) with zero further HTTP requests until the deadline elapses. Prevents the for _ in range(...): HyperpingMcpClient(...) repro from burning further slots after the first hit.
A3 (TOCTOU init race): Added a dedicated _init_lock (separate from _lock, which still guards _request_id). initialize() now takes the init lock and either returns the cached _init_result or runs _initialize_locked(). call_tool unconditionally calls initialize() (no separate flag read), so two concurrent first calls produce exactly one handshake.
A4 (narrow retry): Explicit regression tests pin that HyperpingRateLimitError is never retried by call_tool's 5xx-only retry block. Docstrings updated.
A5 (ensure_initialized()): HyperpingMcpClient and AsyncHyperpingMcpClient gained an ensure_initialized() method that delegates to the transport. Lets services perform a startup-time handshake probe and catch HyperpingRateLimitError early.
B (docs): New "MCP rate limits and connection lifecycle" subsection in README; CHANGELOG [Unreleased] block.
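As an illustration of the A1 classification, the following is a minimal sketch and not the library's actual code. `RateLimitError` and `classify_rpc_payload` are hypothetical stand-ins for `HyperpingRateLimitError` and the check inside `_send_rpc`; case-insensitive matching is an assumption, and other JSON-RPC errors are simply passed through here, whereas the real client raises its own error type for them.

```python
import re
from typing import Any, Optional

# Matches the "Retry after Ns" hint described in the summary.
_RETRY_AFTER_RE = re.compile(r"retry after (\d+)s", re.IGNORECASE)


class RateLimitError(Exception):
    """Hypothetical stand-in for HyperpingRateLimitError."""

    def __init__(
        self,
        message: str,
        retry_after: Optional[int],
        status_code: int,
        response_body: Any,
    ) -> None:
        super().__init__(message)
        self.retry_after = retry_after
        self.status_code = status_code
        self.response_body = response_body


def classify_rpc_payload(payload: dict) -> dict:
    """Raise RateLimitError for a rate-limited JSON-RPC body, else return it.

    The payload is assumed to be the decoded body of an HTTP 200 response.
    """
    error = payload.get("error")
    if not isinstance(error, dict):
        return payload  # success response, nothing to classify

    message = str(error.get("message", ""))
    if error.get("code") == -32000 and "rate limit" in message.lower():
        match = _RETRY_AFTER_RE.search(message)
        retry_after = int(match.group(1)) if match else None
        # status_code is 200 because the signal arrived inside a 200 body;
        # the original error object is preserved as response_body.
        raise RateLimitError(message, retry_after, 200, error)

    return payload  # other errors are left to the caller in this sketch
```

A caller can then catch the typed error and read `.retry_after` instead of string-matching a generic API error.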

Server-side asks (undocumented per-verb initialize cap, true rolling window, HTTP 429 vs 200/JSON-RPC, accurate Retry-After) remain open and should be raised separately with Hyperping; this change is pure client-side mitigation.

Test Plan

pytest -q -> 454 passed, 0 skipped, coverage 96.07% (gate 85%).
ruff check src tests -> clean.
mypy --strict src -> clean.
29 new tests covering: JSON-RPC -32000 classification with one positive and three negative variants (sync + async); TOCTOU concurrency with threading.Barrier(2) (sync) and asyncio.gather (async); initialize() idempotency; cool-off latch with monkeypatched time.monotonic; latch-clear after deadline; rate-limit-not-retried invariants; the user's 6-fresh-client repro; ensure_initialized() delegation and real-transport idempotency; a README docs-artifact gate test.
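The shape of the sync TOCTOU test can be sketched roughly as follows. This is an assumption-laden illustration, not the test from the PR: `FakeTransport` is a hypothetical class that counts handshakes instead of issuing HTTP requests, and the placeholder result dict is invented for the example.

```python
import threading


class FakeTransport:
    """Hypothetical transport that counts handshakes instead of doing HTTP."""

    def __init__(self) -> None:
        self._init_lock = threading.Lock()
        self._init_result = None
        self.handshakes = 0

    def initialize(self) -> dict:
        # Lockless fast path: once initialized, skip the lock entirely.
        if self._init_result is not None:
            return self._init_result
        with self._init_lock:
            # Second check under the lock closes the check-then-act race.
            if self._init_result is None:
                self.handshakes += 1
                self._init_result = {"ok": True}  # placeholder result
            return self._init_result


def run_concurrent_first_calls(transport: FakeTransport, workers: int = 2) -> list:
    """Release all workers at once so their first calls overlap."""
    barrier = threading.Barrier(workers)
    results = []

    def worker() -> None:
        barrier.wait()  # every thread reaches initialize() together
        results.append(transport.initialize())

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Asserting `transport.handshakes == 1` after `run_concurrent_first_calls(transport)` is the invariant such a test would pin; without the second check under the lock, both threads could run the handshake.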

… race

Classify HTTP 200 + JSON-RPC error code -32000 with a rate-limit message as HyperpingRateLimitError with the parsed retry_after, alongside the existing HTTP 429 path.

After a rate-limited initialize, latch a process-monotonic cool-off so subsequent call_tool invocations on the same client fail fast with HyperpingRateLimitError until the deadline elapses, without burning more slots from the server's bucket.

Make initialize() idempotent under a dedicated _init_lock with the double-checked flag, closing the lazy-init TOCTOU race where two concurrent first calls could each POST initialize. Mirror the change in the async transport using asyncio.Lock held across the awaitable handshake.

Pin that call_tool's transient retry never catches a rate-limit (HTTP 429 or JSON-RPC -32000).

Add ensure_initialized() on HyperpingMcpClient and AsyncHyperpingMcpClient for startup health checks, delegating to the transport's idempotent initialize().

Document the new behaviour in README under "MCP rate limits and connection lifecycle" and record it under an [Unreleased] CHANGELOG block.
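The cool-off latch described in this commit can be sketched as below. This is a simplified illustration under stated assumptions: `CoolOffLatch`, `CoolOffActive`, and `guarded_handshake` are hypothetical names, the clock is injectable so the behaviour can be checked without sleeping, and the sketch shows only the basic `max(retry_after, 30)` rule from the summary, not the later refinements to rounding and `retry_after=0`.

```python
import time
from typing import Callable, Optional


class CoolOffActive(Exception):
    """Hypothetical stand-in for the rate-limit error raised while latched."""

    def __init__(self, remaining: float) -> None:
        super().__init__(f"initialize is cooling off for {remaining:.1f}s")
        self.remaining = remaining


class CoolOffLatch:
    """Deadline on a monotonic clock; immune to wall-clock adjustments."""

    def __init__(
        self,
        clock: Callable[[], float] = time.monotonic,
        floor: float = 30.0,
    ) -> None:
        self._clock = clock
        self._floor = floor
        self._blocked_until: Optional[float] = None

    def arm(self, retry_after: Optional[float]) -> None:
        # Wait at least `floor` seconds, or longer if the server asked for it.
        wait = max(retry_after or 0.0, self._floor)
        self._blocked_until = self._clock() + wait

    def remaining(self) -> float:
        if self._blocked_until is None:
            return 0.0
        left = self._blocked_until - self._clock()
        if left <= 0:
            self._blocked_until = None  # deadline elapsed, clear the latch
            return 0.0
        return left


def guarded_handshake(latch: CoolOffLatch, handshake: Callable[[], object]) -> object:
    """Fail fast without calling `handshake` while the latch is armed."""
    left = latch.remaining()
    if left > 0:
        raise CoolOffActive(left)
    return handshake()
```

Because the short-circuit happens before `handshake` is invoked, repeated calls inside the window cost no HTTP requests, which is the property the commit message describes.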

Addresses the issues surfaced by the unbiased review of #25:

- Tighten the rate-limit marker to "rate limit exceeded" so future server messages that merely mention "rate limit" cannot be misclassified.
- Broaden the Retry-After parser to accept "Retry-After: Ns" header-style, "retry after N seconds" wordy units, mixed case, and no-units variants. Parametrized tests cover the variants.
- Persist the originating status_code (200 vs 429) on the cool-off latch so the short-circuit no longer falsely reports 200 for HTTP 429 sources.
- Use math.ceil(remaining) for the cool-off retry_after instead of int(remaining)+1; eliminates the systematic +1s over-report.
- Treat retry_after=0 as "no latch" (server says retry now) instead of falling back to the 30s default.
- Add a lockless fast path on initialize() so post-handshake call_tool invocations do not acquire _init_lock on every call.
- Classify JSON-RPC -32000 rate-limit signals returned on the notifications/initialized leg too; previously they were silently swallowed by the early notification short-circuit.
- Add response_body to the HTTP 429 path for symmetry with the JSON-RPC path.
- Strengthen tests: concurrency test inspects request bodies for the initialize method count, idempotency test asserts call_count before/after the second initialize(), cool-off-clears test asserts route.call_count at each phase so a regression that hit the network during the latch would fail.
- Add tests for status_code preservation, math.ceil semantics, retry_after=0 no-latch behavior, the tightened marker, and the notification-leg case.
- Simplify the CHANGELOG docs gate to a single regex assertion.
- Expand HyperpingRateLimitError, ensure_initialized(), and README copy to accurately reflect the new behavior; cite the Hyperping MCP docs for the "stateless over HTTP" claim and call out that the latch is per-process.

pytest: 482 passed; ruff: clean; mypy --strict: clean; coverage: 95.76%.
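Two of the review fixes above lend themselves to a small sketch: the broadened retry hint parser and the rounding change. The regex and the function names below are hypothetical and only illustrate the listed variants; the pattern actually used in the PR is not shown in this page.

```python
import math
import re
from typing import Optional

# Accepts "Retry after 12s", "Retry-After: 12s", "retry after 12 seconds",
# mixed case, and a bare number with no unit. Illustrative pattern only.
_RETRY_HINT_RE = re.compile(
    r"retry[-\s]after:?\s*(\d+)\s*(?:seconds?|s)?\b",
    re.IGNORECASE,
)


def parse_retry_after(message: str) -> Optional[int]:
    """Return the hinted delay in seconds, or None if no hint is present."""
    match = _RETRY_HINT_RE.search(message)
    return int(match.group(1)) if match else None


def report_truncate_plus_one(remaining: float) -> int:
    """Previous rounding: int(remaining) + 1."""
    return int(remaining) + 1


def report_ceil(remaining: float) -> int:
    """Rounding after the review fix: math.ceil(remaining)."""
    return math.ceil(remaining)
```

The two rounding rules agree for fractional values (4.2 seconds reports as 5 either way) and differ when the remaining time is a whole number of seconds: 5.0 reports as 6 under the old rule and 5 under `math.ceil`.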

KhaledSalhab-Develeap added 5 commits May 20, 2026 22:20

chore: ignore .worktrees/ directory

e982bc9

docs(plan): MCP rate-limit fixes implementation plan (A1+A2+A3+A4+A5+B)

fd86599

docs(plan): MCP rate-limit fixes test plan (A1+A2+A3+A4+A5+B)

7d36009

KhaledSalhab-Develeap merged commit afae9e1 into main May 21, 2026
3 checks passed

KhaledSalhab-Develeap merged 5 commits into main from mcp-rate-limit-fixes
