Data integrity: score or leaderboard write fails after Finished status #2423

@hanane-ca

Problem

The compute worker sets submission status to FINISHED before uploading scores and outputs. If push_scores() or push_output() fails (network error, timeout, API failure), the submission is marked Finished but has no scores or outputs.

Root Cause

In compute_worker.py, the flow was:

run.start()         # Sets status to FINISHED at line 1448
if run.is_scoring:
    run.push_scores()  # Upload scores AFTER status update
run.push_output()      # Upload outputs AFTER status update

If push_scores() fails, the submission stays Finished with no scores, which is a data integrity violation.

Impact

  • Users see Finished submissions with no results
  • Leaderboard is incomplete
  • No retry mechanism exists to recover

This bug was discovered during the EEG Foundation Challenge incident analysis.

Solution

  1. Reorder operations: upload scores/outputs before setting status to FINISHED
  2. Add retry logic to push_scores() with exponential backoff
  3. Validate HTTP responses and raise errors on 4xx/5xx
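
Items 2 and 3 could look roughly like the sketch below. It is not the code from compute_worker.py: PushError, check_response, push_with_retries, and the send callable are hypothetical names, and the attempt count and delays are assumed values chosen for illustration.

```python
import time


class PushError(Exception):
    """Raised when an upload response indicates failure (hypothetical name)."""


def check_response(status_code):
    # Treat any 4xx/5xx status as a failed upload.
    if status_code >= 400:
        raise PushError(f"upload failed with HTTP {status_code}")


def push_with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    send is a hypothetical callable that performs the upload and returns
    an HTTP status code; it may also raise OSError on network problems
    (ConnectionError and TimeoutError are OSError subclasses).
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            check_response(send())
            return attempt
        except (PushError, OSError):
            if attempt == max_attempts:
                raise  # give up; the caller must not mark the run FINISHED
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

This sketch retries 4xx responses as well as 5xx ones, which keeps it short; a real implementation might prefer to fail immediately on client errors that a retry cannot fix.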

New Flow

run.start()         # Completes scoring but doesn't set FINISHED
if run.is_scoring:
    run.push_scores()       # Upload scores first (with retries)
run.push_output()           # Then upload outputs
if run.is_scoring:
    run._update_status(SubmissionStatus.FINISHED)  # Only now mark as FINISHED
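
To see why the ordering matters, here is a small self-contained simulation of the new flow. FakeRun is a hypothetical stand-in for the worker's run object, and the status strings are placeholders for the real SubmissionStatus values; only the order of the calls mirrors the flow above.

```python
class FakeRun:
    """Stand-in for the worker's run object (hypothetical, for illustration)."""

    def __init__(self, fail_scores=False):
        self.is_scoring = True
        self.status = "Scoring"
        self.fail_scores = fail_scores

    def start(self):
        pass  # scoring work would happen here; status is left untouched

    def push_scores(self):
        if self.fail_scores:
            raise ConnectionError("simulated upload failure")

    def push_output(self):
        pass

    def _update_status(self, status):
        self.status = status


def execute(run):
    run.start()
    if run.is_scoring:
        run.push_scores()
    run.push_output()
    if run.is_scoring:
        # Reached only if both uploads succeeded.
        run._update_status("Finished")


ok = FakeRun()
execute(ok)

broken = FakeRun(fail_scores=True)
try:
    execute(broken)
except ConnectionError:
    pass  # how the real worker handles this case is not shown in this issue
```

After this runs, ok.status is "Finished" while broken.status is still "Scoring", so a failed upload can no longer produce a Finished submission without scores.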

Testing

Added a k6 integration test:

  • tests/k6/test_m8_finished_has_scores.js — verifies invariant: Finished ⟹ has scores
  • tests/k6/run_m8_test.sh — bash orchestrator
  • Pass criteria: finished_without_scores == 0
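
The invariant itself can be expressed in a few lines. The sketch below is in Python rather than the k6 script's JavaScript, and the submission record shape (a dict with "status" and "scores" keys) is an assumption, not the actual API response format.

```python
def count_finished_without_scores(submissions):
    """Count submissions that violate the invariant: Finished implies has scores."""
    return sum(
        1
        for s in submissions
        if s["status"] == "Finished" and not s.get("scores")
    )


# Hypothetical sample data: the second record violates the invariant.
sample = [
    {"status": "Finished", "scores": [0.91]},
    {"status": "Finished", "scores": []},
    {"status": "Failed", "scores": []},
]
violations = count_finished_without_scores(sample)
```

The pass criterion corresponds to this count being zero across all submissions created during the test.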
