Releases: develeap/hyperping-python

v1.7.0 — MCP rate-limit handling, init cool-off, ensure_initialized()

21 May 08:36
9fcbba7

Highlights

This release hardens how the MCP client handles the Hyperping server's initialize rate limit. Until now, a rate limit on the MCP handshake surfaced as a generic HyperpingAPIError that callers couldn't dispatch on, and a tight retry loop on a fresh HyperpingMcpClient could burn through the bucket in seconds. As of 1.7.0, the SDK detects these signals, raises them as a typed error, and enforces a cool-off after them.

Why upgrade

  • Typed rate-limit errors on MCP initialize. The Hyperping MCP server signals its undocumented initialize cap by returning HTTP 200 with JSON-RPC error.code = -32000 (not 429). The SDK now classifies that as HyperpingRateLimitError with retry_after parsed from the message, so except HyperpingRateLimitError works just like it does for the REST 429 path.

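    import time

    from hyperping import HyperpingMcpClient, HyperpingRateLimitError

    mcp = HyperpingMcpClient(api_key="sk_...")
    try:
        mcp.get_status_summary()
    except HyperpingRateLimitError as e:
        # Fall back to 30 seconds when no retry_after was parsed.
        time.sleep(e.retry_after or 30)
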
  • Initialize cool-off latch. After a rate-limited handshake, subsequent calls on the same client short-circuit with HyperpingRateLimitError and issue zero further HTTP requests until the advertised retry_after elapses. This stops a retry loop from accidentally burning more slots from the bucket.

  • ensure_initialized() on both clients. Lets you perform the MCP handshake explicitly at service boot so you fail fast on cold-start rate-limits instead of failing on the first business call.

    from hyperping import HyperpingMcpClient, HyperpingRateLimitError

    mcp = HyperpingMcpClient(api_key="sk_...")
    try:
        mcp.ensure_initialized()  # raises HyperpingRateLimitError if capped
    except HyperpingRateLimitError as e:
        print(f"cold-start rate-limited; retry in {e.retry_after}s")
        raise

  • Concurrency fix. Closed a TOCTOU race in lazy initialize where two concurrent first calls on one HyperpingMcpClient could each POST initialize. Now exactly one handshake fires, guarded by a dedicated lock with a lockless fast path so post-handshake call_tool doesn't contend.

  • status_code preservation through cool-off. When the latch short-circuits, the raised exception carries the originating status (200 for JSON-RPC -32000, 429 for HTTP 429) so callers can distinguish the two buckets.

  • Notification-leg classification. A rate-limit signal returned on the notifications/initialized leg of the handshake is now classified too, instead of being silently swallowed.

  • New README section: "MCP rate limits and connection lifecycle" with operational guidance: one long-lived HyperpingMcpClient per process, avoid instantiating in a loop, and how multiple workloads on one API key collide on the initialize cap.
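
The cool-off latch and status_code preservation described in the list above can be pictured with a small sketch. This is not the SDK's implementation: the CoolOffLatch class and its trip/check methods are hypothetical names, and the use of a monotonic clock for the deadline is an assumption.

```python
import time


class CoolOffLatch:
    """Remembers a rate-limit deadline so callers can skip doomed requests."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock        # injectable so the latch can be tested
        self._deadline = 0.0       # clock value at which the cool-off ends
        self._status_code = None   # status of the response that tripped it

    def trip(self, retry_after, status_code):
        """Record a rate-limit signal that asked for retry_after seconds."""
        self._deadline = self._clock() + retry_after
        self._status_code = status_code

    def check(self):
        """Return (seconds_left, status_code) while latched, else None."""
        left = self._deadline - self._clock()
        if left > 0:
            return left, self._status_code
        return None
```

A transport built this way would call check() before every request and raise its rate-limit error immediately while the latch is set, carrying the stored status code, instead of sending anything over the wire.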

Server-side caveats (not fixed here, by design)

The Hyperping MCP server still:

  • Returns HTTP 200 + JSON-RPC -32000 for rate-limit (instead of HTTP 429).
  • Enforces an undocumented per-key cap on initialize (observed ~5/minute) on top of the documented 300/min shared with REST.

These are tracked for upstream feedback; this release is pure client-side mitigation so existing users get value today.
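
To make the first caveat concrete, here is a minimal sketch of how a client might recognise an HTTP 200 response carrying JSON-RPC -32000 as a rate limit. The function name, the RateLimitSignal type, the regular expression, and the example message wording are assumptions for illustration; only the -32000 code, the "rate limit exceeded" phrase, and the use of math.ceil come from these notes.

```python
import math
import re
from dataclasses import dataclass
from typing import Optional

RATE_LIMIT_CODE = -32000
RATE_LIMIT_PHRASE = "rate limit exceeded"
# Assumed hint formats: "Retry-After: 12" and "retry after 12 seconds".
_RETRY_RE = re.compile(r"retry[-\s]after:?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


@dataclass
class RateLimitSignal:
    status_code: int             # HTTP status of the response that carried it
    retry_after: Optional[int]   # whole seconds, or None if no hint was found


def classify_jsonrpc_rate_limit(http_status, body):
    """Return a RateLimitSignal for a JSON-RPC rate-limit error, else None."""
    error = body.get("error") or {}
    if error.get("code") != RATE_LIMIT_CODE:
        return None
    message = str(error.get("message", ""))
    # Require the exact phrase so unrelated -32000 errors are not misread.
    if RATE_LIMIT_PHRASE not in message.lower():
        return None
    match = _RETRY_RE.search(message)
    retry_after = math.ceil(float(match.group(1))) if match else None
    return RateLimitSignal(status_code=http_status, retry_after=retry_after)
```

Checking both the error code and the phrase keeps the classifier conservative: a -32000 error with any other message still falls through to the generic error path.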


Full changelog

Added

  • ensure_initialized() on HyperpingMcpClient and AsyncHyperpingMcpClient for startup health checks. Performs the MCP handshake now if it hasn't happened yet and raises HyperpingRateLimitError if the server's initialize cap is hit.
  • New "MCP rate limits and connection lifecycle" section in README documenting Hyperping's stateless MCP server, the undocumented initialize cap, and the recommended client lifetime per process.
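
For a startup health check, one possible pattern is to wrap ensure_initialized() in a small bounded wait loop. The helper below is a hypothetical sketch, not part of the SDK: initialize_with_backoff and its parameters are invented names, and it takes the handshake callable and the exception type as arguments so it does not depend on the library being installed.

```python
import time


def initialize_with_backoff(ensure_initialized, rate_limit_error,
                            max_attempts=3, default_wait=30.0,
                            sleep=time.sleep):
    """Call ensure_initialized(), waiting out rate limits between attempts.

    Returns the attempt number that succeeded. Re-raises the last
    rate-limit error once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            ensure_initialized()
            return attempt
        except rate_limit_error as exc:
            if attempt == max_attempts:
                raise
            # retry_after may be None when the server gave no usable hint.
            sleep(getattr(exc, "retry_after", None) or default_wait)
```

With a real client this would be called as initialize_with_backoff(mcp.ensure_initialized, HyperpingRateLimitError). Keeping max_attempts small is deliberate, since every real handshake attempt may spend a slot from the initialize bucket.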

Fixed

  • MCP rate-limit errors that the server returns as HTTP 200 with JSON-RPC error.code = -32000 (notably the initialize per-minute cap) are now classified as HyperpingRateLimitError with retry_after parsed from the message, instead of a generic HyperpingAPIError. Existing HTTP 429 handling is unchanged.
  • After a rate-limit on initialize, the MCP transport latches a cool-off so subsequent call_tool invocations short-circuit with HyperpingRateLimitError until the advertised retry_after elapses, instead of issuing further HTTP requests that would burn more slots from the bucket.
  • TOCTOU race in lazy initialize where two concurrent first calls on the same HyperpingMcpClient could each POST initialize. The handshake is now performed under a dedicated lock with a double-checked flag, including a lockless fast path so post-handshake call_tool does not contend on it.
  • Cool-off short-circuit now preserves the originating status code (200 for JSON-RPC -32000, 429 for HTTP 429) so callers can distinguish buckets, and retry_after uses math.ceil to avoid over-reporting by one second.
  • JSON-RPC rate-limit signals returned on the notifications/initialized leg are now classified as HyperpingRateLimitError (previously they were silently treated as a successful notification).
  • Rate-limit detection requires the message to contain "rate limit exceeded" (the observed phrasing) to avoid false positives on unrelated server messages that happen to mention "rate limit". The Retry-After parser now also accepts the "Retry-After:" and "retry after N seconds" variants.
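
The TOCTOU fix in the list above relies on double-checked locking. A generic sketch of that pattern follows; LazyHandshake and its ensure method are illustrative names rather than the SDK's internals, and the handshake is passed in as a plain callable.

```python
import threading


class LazyHandshake:
    """Runs a handshake callable at most once, even under concurrent first calls."""

    def __init__(self, do_handshake):
        self._do_handshake = do_handshake
        self._lock = threading.Lock()
        self._initialized = False

    def ensure(self):
        # Fast path: skip the lock entirely once the handshake has completed.
        if self._initialized:
            return
        with self._lock:
            # Re-check under the lock: another thread may have finished first.
            if self._initialized:
                return
            self._do_handshake()
            # Set the flag only after success so a failed handshake can be retried.
            self._initialized = True
```

The first check keeps steady-state calls free of lock contention, while the second check inside the lock is what prevents two concurrent first callers from both running the handshake.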

Compatibility

  • Public API additive only (ensure_initialized() on both MCP clients).
  • Existing HyperpingRateLimitError consumers continue to work; the same exception type is now raised on a wider set of server responses, with status_code set to whichever signal was used (200 or 429).
  • Python 3.11 / 3.12 / 3.13. No new dependencies.

Install

pip install hyperping==1.7.0
# or
uv add hyperping==1.7.0

Links