Skip to content

fix: add upload retry#399

Open
not-matthias wants to merge 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts
Open

fix: add upload retry#399
not-matthias wants to merge 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts

Conversation

@not-matthias

@not-matthias not-matthias commented Jun 11, 2026

Copy link
Copy Markdown
Member

No description provided.

@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 7 untouched benchmarks


Comparing cod-2839-retry-upload-on-connection-errorstimeouts (2298ed2) with main (41f4db9)

Open in CodSpeed

@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from df83123 to bee5a31 Compare June 11, 2026 14:41
@not-matthias not-matthias marked this pull request as ready for review June 11, 2026 16:48
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

Adds upload retry for on-disk (streaming) profile archives. Previously STREAMING_CLIENT had no retry middleware because the stream body can't be replayed; this PR introduces send_streamed_with_retry, which reopens the file from disk on each attempt and drives the shared ExponentialBackoff policy manually.

  • upload_backoff() is extracted to request_client.rs so both the middleware on REQUEST_CLIENT and the new manual retry loop share the same policy; under #[cfg(test)] the intervals are shrunk to milliseconds.
  • send_streamed_with_retry uses DefaultRetryableStrategy to classify responses and ExponentialBackoff::should_retry to gate each attempt, rebuilding the tokio file stream on every retry.
  • Two new unit tests (streamed_upload_is_retried, in_memory_upload_is_retried) use a raw TCP mock server to assert that both code paths make exactly 1 + UPLOAD_RETRY_COUNT attempts on persistent 503 responses.

Confidence Score: 5/5

Safe to merge. The production retry logic is correct, response-level errors are handled by the existing caller check, and both new tests pass through realistic code paths.

The production retry logic is correct — send_streamed_with_retry correctly rebuilds the stream body, classifies transient errors via DefaultRetryableStrategy, and gates retries through ExponentialBackoff. The only finding is a test-infrastructure concern that does not affect production behaviour.

No files require special attention.

Important Files Changed

Filename Overview
src/request_client.rs Extracts upload_backoff() as a shared ExponentialBackoff factory; uses #[cfg(test)] to shrink intervals to milliseconds so tests don't sleep through real backoff. UPLOAD_RETRY_COUNT is made pub so the uploader can reference it.
src/upload/uploader.rs Adds send_streamed_with_retry which manually rebuilds the stream body from disk on each attempt and drives ExponentialBackoff + DefaultRetryableStrategy for transient errors; replaces the previous single-shot streaming upload. Two new unit tests verify the retry counts for both the streaming and in-memory code paths using a raw TCP mock server.

Sequence Diagram

sequenceDiagram
    participant U as upload_profile_archive
    participant S as send_streamed_with_retry
    participant F as File (disk)
    participant SC as STREAMING_CLIENT
    participant SV as Upload Server

    U->>S: call (path, upload_data, …)
    loop up to 1 + UPLOAD_RETRY_COUNT attempts
        S->>F: File::open(path)
        F-->>S: file handle
        S->>SC: PUT upload_url (streaming body)
        SC->>SV: HTTP PUT
        SV-->>SC: 503 (transient)
        SC-->>S: "Ok(Response{503})"
        S->>S: DefaultRetryableStrategy → Transient
        S->>S: policy.should_retry(start, n_past_retries)
        alt Retry allowed
            S->>S: "sleep(backoff), n_past_retries += 1"
        else Retries exhausted
            S-->>U: "Ok(Response{503})"
        end
    end
    U->>U: !response.status().is_success() → bail!
Loading

Reviews (2): Last reviewed commit: "fix(upload): retry streamed uploads on t..." | Re-trigger Greptile

Comment thread src/upload/uploader.rs Outdated
Comment thread src/upload/uploader.rs
Streamed (on-disk) uploads went through STREAMING_CLIENT, which has no
retry middleware because a consumed stream body can't be replayed. Add a
manual retry loop that rebuilds the stream from disk on each attempt and
reuses reqwest_retry's backoff policy and transient classifier, matching
the in-memory path's behavior.

Extract the shared backoff into request_client::upload_backoff() so both
the retry middleware and the streamed loop use one policy; under cfg(test)
its intervals shrink to milliseconds to keep the retry tests fast. Add
tests asserting both the streamed and in-memory paths retry transient
503s 1 + UPLOAD_RETRY_COUNT times.
@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from bee5a31 to 2298ed2 Compare June 11, 2026 17:22

@GuillaumeLagrange GuillaumeLagrange left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants