FEAT: Better Scenario Tracking#1758
Conversation
Previously, if a Scenario was interrupted mid-AtomicAttack (Ctrl-C, OOM, crash), completed AttackResults persisted to the DB became orphaned because the scenario-to-attack-result link only lived in a JSON manifest (attack_results_json) written after the whole AtomicAttack returned. On resume, those objectives re-executed wastefully.
This change makes scenario linkage a first-class column on AttackResultEntry:
- New columns: scenario_result_id (indexed FK, ON DELETE SET NULL) and scenario_data (JSON with fixed schema {atomic_attack_name, objective_index}).
- New ExecutionAttribution dataclass in pyrit/executor/attack/core/ (so the executor never imports from the scenario layer) is set on AttackContext by AttackExecutor per-task before scheduling, and read by the default attack event handler when persisting.
- Hydration in get_scenario_results uses the FK with a merge-mode fallback to the legacy manifest for partially-migrated DBs.
- Resume uses objective_index (deterministic, parallel-safe; derived from seed_groups input_indices) rather than objective text, so duplicate objective text doesn't collapse two seed groups.
- Drops the unreleased error_attack_result_ids_json column outright; error AttackResults are now linkable via get_attack_results(scenario_result_id=..., outcome=ERROR).
- attack_results_json stays write-through this release for downgrade safety; future releases will stop populating and then drop.
- update_scenario_run_state becomes a targeted UPDATE rather than a full row rebuild (so it doesn't clobber the manifest during the deprecation window).
Includes Alembic migration with idempotent backfill, scenario_data round-trip on AttackResultEntry, and tests for: event-handler attribution stamping, executor attribution propagation at max_concurrency>1, FK + manifest + mixed hydration paths, migration backfill correctness/idempotency/downgrade, interruption-recovery regression, duplicate-objective-text resume safety, and duplicate atomic_attack_name validation.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lify hydration - Delete dishonest no-op add_attack_results_to_scenario shim. - Standardize on print_deprecation_message (drop ad-hoc warnings.warn). Style guide gains a concise Deprecations section with the `removed_in = current minor + 2` rule. - Remove stale per-scenario state left over from the manifest era: AttackContext._error_attack_result_id, _StrategyRuntimeError.error_attack_result_id, Scenario._result_lock, Scenario._original_objectives_map, and the stray `import asyncio`. Replace defensive `getattr(context, '_attribution', None)` with direct attribute access — the contract is mandatory. - Rename ExecutionAttribution -> ScenarioExecutionAttribution (and the module file) to match its scenario-specific schema. - Refactor MemoryInterface.get_scenario_results: split into _build_scenario_result_query_conditions, _query_scenario_result_entries, _hydrate_scenario_attack_results. The hydrator now issues a single batched IN-query on AttackResultEntry.scenario_result_id (fixes the previous N+1) and drops the legacy attack_results_json manifest fallback entirely — the FK is the sole source of truth. - Narrow _stamp_attribution(result=) to AttackResult to satisfy ty. - Update affected tests; rewrite the four hydration tests that incidentally relied on the manifest fallback to use the production FK write path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Bumps pyrit/scenario/core/atomic_attack.py coverage from 37% to 94% by exercising the three resume-critical surfaces the PR introduced or changed but that had no dedicated tests: - TestAtomicAttackFilterSeedGroupsByIndices: stable-identity filter that drops completed seeds while preserving each survivor's original index across successive filter calls. - TestAtomicAttackFilterSeedGroupsByObjectives: keeps the deprecated legacy path under test and asserts the DeprecationWarning fires until removed_in=0.16.0. - TestAtomicAttackAttributionFactory: the closure built in run_async when _scenario_result_id is set — no factory outside a Scenario, factory maps input_index -> original objective_index after filtering, and the snapshot is taken by value so post-call mutations cannot poison in-flight attributions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Decouple the attack persistence path from scenario vocabulary. The attack layer now ships an opaque attribution dataclass (parent_id, parent_collection, position) — the scenario layer interprets those fields to mean (scenario_result_id, atomic_attack_name, objective_index). - ScenarioExecutionAttribution -> AttackResultAttribution (renamed module and class) - AttackResult.scenario_result_id / scenario_data -> attribution_parent_id / attribution_data - AttackResultEntry columns, index, and foreign key constraint renamed; migration 9c8b7a6d5e4f rewritten in place (still unreleased on this branch) - Replaced FK abbreviation with foreign key / ForeignKey in comments and docstrings The DB foreign key still targets ScenarioResultEntries.id; that is a relational fact, not a layering violation. The attack layer has no scenario-specific identifiers in its type signatures. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| what they mean and how to query them back later. For example, | ||
| ``Scenario`` uses ``parent_id`` for the scenario result UUID, | ||
| ``parent_collection`` for the atomic attack name, and ``position`` for | ||
| the original 0-based seed-group index. |
There was a problem hiding this comment.
i think the field definitions are more well defined than this comment makes it seem with the "chooses" wording but I think these fields should be standardized and clear so parent_id is the orchestrator results uuid, and parent_collection is the grouping, etc
but this also makes me question how generic this is. position and parent id make sense but parent_collection seems specific to atomic attacks. can this be folded into the parent_id or excluded here and just included in attribution_data
There was a problem hiding this comment.
The docstring now describes it as "free-form label naming the per-parent collection" — generic, not atomic-attack-specific. But folding into others feels a bit wrong; I'd like to keep for now, but we can revisit if we have other thigns besides scenarios and it feels wrong
| ) | ||
|
|
||
| def _get_completed_objectives_for_attack(self, *, atomic_attack_name: str) -> set[str]: | ||
| def _get_completed_objective_indices_for_attack(self, *, atomic_attack_name: str) -> set[int]: |
There was a problem hiding this comment.
I have concerns over relying on indices for resume, which seems brittle. The position is only stable if the seed group list is identical between runs, but I don't think there's a guarantee of that. For example, _apply_max_dataset_size uses random.sample with no fixed seed, so a resumed run gets a different sample and the indices silently point to wrong objectives. Even without sampling, if memory.get_seed_groups() doesn't have a deterministic ORDER BY, the list could come back in a different order. There's a comment that calls out the sampling issue ("the very ADO 9012 bug this PR fixes") but only fixes it for that one path.
I know we moved away from objective-text matching because of duplicate objectives colliding, but could we use more of the objective/seed group data to generate a composite hash as the resume key instead? That way reordering or resampling wouldn't silently corrupt the mapping.
There was a problem hiding this comment.
yeah this is great feedback. I'm going to re-look at this.
There was a problem hiding this comment.
resume identity is now the objective_sha256; AtomicAttack.init rejects duplicate objective hashes; max_dataset_size samples are pinned in ScenarioResult.metadata so a fresh random.sample on resume can't shift the working set.
This is a lot more resilient. Leaving open so you can take a look because it changes how things work
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…5_18_scenario_resume # Conflicts: # pyrit/memory/memory_models.py # pyrit/scenario/core/atomic_attack.py # tests/unit/models/test_scenario_result.py
Previously, if a Scenario was interrupted mid-AtomicAttack, completed AttackResults persisted to the DB became orphaned because the scenario-to-attack-result link only lived in a JSON manifest (attack_results_json) written after the whole AtomicAttack returned. On resume, those objectives re-executed wastefully.
This change makes scenario linkage a first-class column on AttackResultEntry. It allows resume to use more completed results. It also allows for progress to be tracked better.