[TDD-red] CloudAIGymEnv cache key must include env_params (drives domain-randomization correctness fix) by rutayan-nv · Pull Request #900 · NVIDIA/cloudai

rutayan-nv · 2026-05-26T23:10:59Z

Issue

Cache correctness: CloudAIGymEnv keys the trajectory cache on action only, so the same action under a different env_params sample returns a stale reward — the domain-randomization signal is silently lost.

Fix (test-only, TDD-red)

4 tests pin that the cache key must include env_params: 2 fail today (exposing the bug), 2 pass as controls. The production fix lands in fix(configurator): make env_params first-class to fix the trajectory cache key #901.

Testing

The 2 bug-exposing tests fail by design (turned green by fix(configurator): make env_params first-class to fix the trajectory cache key #901); the 2 control tests pass; rest of test_cloudaigym.py green (30 passed). ruff + pyright clean.

Stack: main ← #893 ← #900 ← #901. TDD-red — do not merge alone; #901 turns the failing tests green. (The GymnasiumAdapter previously bundled here moved to its own stacked PR.)

coderabbitai · 2026-05-26T23:11:05Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds TestRun.increment_step() to advance the trial counter. CloudAIGymEnv calls this at the start of step(). BaseAgent.run() implements the full env-reset → action-select → env-step → policy-update loop. handle_dse_job delegates to agent.run() and gains exception handling with dse_failure.txt reporting via _record_run_failure. Tests verify step advancement semantics, trajectory cache behavior with env_params, and agent orchestration flows.

Changes

Agent Run Loop and DSE Handler Delegation

Layer / File(s)	Summary
TestRun step counter `src/cloudai/_core/test_scenario.py`, `tests/test_test_scenario.py`	`increment_step()` advances `step` by 1 and returns the new value; unit tests cover initial increment, repeated monotonic calls, and pre-seeded starting values.
CloudAIGymEnv step counter integration `src/cloudai/configurator/cloudai_gym.py`	`step()` calls `self.test_run.increment_step()` at the start before action application and cached trajectory logic.
CloudAIGymEnv step and cache behavior tests `tests/test_cloudaigym.py`	Updates existing tests to seed and assert post-increment step indices; adds unit tests for `env_params`-aware trajectory cache (miss on differing params, hit on match, back-compat when absent); adds integration test for workload re-execution on `env_params` change.
BaseAgent.run() orchestration loop `src/cloudai/configurator/base_agent.py`	Implements `BaseAgent.run()`: resets env, loops up to `max_steps` with `select_action` → `env.step` → `update_policy` feedback dict, early-exits on `None` action, returns 0.
handle_dse_job delegation and failure reporting `src/cloudai/cli/handlers.py`	Refactors `handle_dse_job` to call `agent.run()` and accumulate non-zero codes; catches unexpected exceptions, logs them, and calls `generate_reports(error=...)` before re-raising; adds `_record_run_failure` writing `dse_failure.txt`; extends `generate_reports` with optional `error` parameter.
handle_dse_job tests with stub agent `tests/test_handlers.py`	Adds `CustomRunStubAgent` with class-level run controls and `custom_run_agent_name` fixture; tests verify delegation, return code accumulation, exception propagation, aborted remaining runs, and failure file creation before re-raise.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐇 Hop! The counter ticks — one step, then two,
The agent loops around the gym, brand new.
On error caught, a file records the pain,
Then re-raises the exception once again.
Each trial tracked, each policy fed,
The rabbit checks the cache — no dupe ahead! 🥕

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically describes the main change: adding tests to verify that CloudAIGymEnv's cache key must include env_params to fix a domain-randomization bug.
Description check	✅ Passed	The description directly explains the cache correctness issue, the test-only fix approach, testing results, and the TDD-red stack context—all closely related to the changeset.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Line 148: The log call assumes observation is a numeric list and will raise
TypeError for non-numeric entries; update the logging around the line with
logging.info so it formats observation safely by mapping each item to a rounded
number when isinstance(item, (int, float)) and falling back to str(item) (or by
wrapping the comprehension in try/except and formatting with str on failure),
then log the resulting list along with step and reward; modify the code that
constructs the observation string (the expression using round(obs, 4)) to use
this safe formatting approach.
- Around line 137-147: The loop handling env.step(...) in base_agent.py
currently ignores the done flag and continues stepping; modify the control flow
in the method that contains the lines where observation, reward, done, *_ =
self.env.step(action) and the subsequent self.update_policy({...}) call so that
after calling update_policy you check if done is True and break (or return) to
stop the episode; ensure the terminal transition is still recorded (i.e., keep
the update_policy call for the final step) and then exit the loop to avoid
stepping into terminal states.

In `@tests/test_cloudaigym.py`:
- Around line 505-520: The test function
test_cache_hit_when_neither_has_env_params has a multi-line signature that fails
ruff-format; reformat the function signature and body to conform to ruff (e.g.,
collapse the signature to a single line and run ruff format) so the file passes
ruff-format checks, then re-run tests; target the function named
test_cache_hit_when_neither_has_env_params in the tests/test_cloudaigym.py diff
and apply ruff format or equivalent style fixes.

In `@tests/test_handlers.py`:
- Around line 287-288: The override of select_action must match the base
signature; update the tests/test_handlers.py select_action method to accept the
observation parameter (e.g., def select_action(self, observation: Any) ->
tuple[int, dict[str, Any]]), preserving the same return type and behavior (still
raise AssertionError), so the signature is compatible with
BaseAgent.select_action and satisfies the type checker.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 5198c57f-3f25-4c55-92be-50bcc9233c1a

📥 Commits

Reviewing files that changed from the base of the PR and between 18ce72e and 6b9f126.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (11)

pyproject.toml
src/cloudai/_core/test_scenario.py
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_cloudaigym.py
tests/test_gymnasium_adapter.py
tests/test_handlers.py
tests/test_test_scenario.py

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cloudai/configurator/cloudai_gym.py (1)

255-258: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Include env_params in the cache match predicate.

At Line 255–Line 258, cache lookup keys on action only, so repeated actions under different current_env_params can return stale rewards/observations from another environment sample.

Proposed targeted fix

 def get_cached_trajectory_result(self, action: Any) -> TrajectoryEntry | None:
+    current_env_params = getattr(self.test_run, "current_env_params", None)
     for entry in self.current_trajectory:
-        if self._values_match_exact(entry.action, action):
+        if self._values_match_exact(entry.action, action) and self._values_match_exact(
+            getattr(entry, "env_params", None), current_env_params
+        ):
             return entry
 
     return None

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 255 - 258, The
get_cached_trajectory_result method currently matches cache entries only on the
action parameter, which causes it to return stale results when the same action
is repeated under different environment configurations. Modify the cache lookup
logic in get_cached_trajectory_result to include environment parameter
validation in addition to the action match check. Add a condition that verifies
both the entry.action matches the provided action AND that the environment
parameters associated with the cached entry match the current_env_params before
returning the cached entry.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/cli/handlers.py`:
- Line 163: Replace the f-string in the logging.exception call with lazy
interpolation using % formatting instead. Change the f-string message to use the
% operator for string formatting, passing agent_type as a parameter to the
logging method, so that the string interpolation only happens if the message is
actually logged. This eliminates the Ruff G004 violation in the exception
logging path.

In `@tests/test_cloudaigym.py`:
- Line 502: The compound assertion using the `and` operator prevents pytest from
providing clear introspection about which part of the assertion failed. Split
the compound assertion `assert result is not None and result.step == 1` into two
separate assertions: one to verify `result is not None` and a second to verify
`result.step == 1`. Apply this same change to all similar compound assertions in
the test file to ensure pytest can clearly indicate which clause fails during
test execution.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 255-258: The get_cached_trajectory_result method currently matches
cache entries only on the action parameter, which causes it to return stale
results when the same action is repeated under different environment
configurations. Modify the cache lookup logic in get_cached_trajectory_result to
include environment parameter validation in addition to the action match check.
Add a condition that verifies both the entry.action matches the provided action
AND that the environment parameters associated with the cached entry match the
current_env_params before returning the cached entry.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: cd6c14de-9eea-4691-bcee-a41f93fd8e2b

📥 Commits

Reviewing files that changed from the base of the PR and between 6b9f126 and 923ca0b.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

pyproject.toml
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_cloudaigym.py
tests/test_gymnasium_adapter.py
tests/test_handlers.py

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/cloudai/configurator/cloudai_gym.py (1)

74-83: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep constraint-failure observations shape-consistent with the declared observation space.

After this change, observation width is metric-count based, but step() still returns [-1.0] on constraint failure (Line 144). That can emit the wrong observation length for multi-metric runs.

Suggested fix

         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_observation = [-1.0] * len(self.define_observation_space())
+            return failure_observation, self.rewards.constraint_failure, True, {}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 74 - 83, The
`define_observation_space()` method declares an observation space with width
equal to the number of agent metrics, but the `step()` method returns a
fixed-length observation of `[-1.0]` when a constraint failure occurs at line
144. This creates a shape mismatch: multi-metric runs will get an observation of
incorrect length during constraint failures. Fix this by modifying the
constraint failure return in `step()` to generate an observation list with the
same shape as `define_observation_space()` — use `[-1.0] *
max(len(self.test_run.test.agent_metrics), 1)` instead of the hard-coded
`[-1.0]` to ensure observation consistency across all execution paths.

♻️ Duplicate comments (1)

src/cloudai/configurator/base_agent.py (1)

149-161: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop stepping once the environment reports termination.

Line 149 captures done, but the loop never checks it, so run() can continue stepping after episode termination.

Proposed minimal fix

             observation, reward, done, *_ = self.env.step(action)
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+            if done:
+                break
         return 0

Based on learnings from prior review comments, this is the same unresolved done-handling gap.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The `run()`
method unpacks the `done` flag from the environment step at line 149 but never
checks it, causing the loop to continue even after the episode has terminated.
Add a conditional check after the `update_policy()` call to break out of the
loop when `done` is True, ensuring that stepping stops once the environment
reports episode termination.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 74-83: The `define_observation_space()` method declares an
observation space with width equal to the number of agent metrics, but the
`step()` method returns a fixed-length observation of `[-1.0]` when a constraint
failure occurs at line 144. This creates a shape mismatch: multi-metric runs
will get an observation of incorrect length during constraint failures. Fix this
by modifying the constraint failure return in `step()` to generate an
observation list with the same shape as `define_observation_space()` — use
`[-1.0] * max(len(self.test_run.test.agent_metrics), 1)` instead of the
hard-coded `[-1.0]` to ensure observation consistency across all execution
paths.

---

Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The `run()` method unpacks the `done` flag from the
environment step at line 149 but never checks it, causing the loop to continue
even after the episode has terminated. Add a conditional check after the
`update_policy()` call to break out of the loop when `done` is True, ensuring
that stepping stops once the environment reports episode termination.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 39e97025-4afa-46ac-aa93-b4260ba58cfa

📥 Commits

Reviewing files that changed from the base of the PR and between 923ca0b and 468419b.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (8)

pyproject.toml
src/cloudai/configurator/__init__.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_cloudaigym.py
tests/test_gymnasium_adapter.py
tests/test_handlers.py

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

src/cloudai/configurator/cloudai_gym.py (2)

142-145: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure observation shape can violate the declared observation space.

Line 144 returns [-1.0] unconditionally, but observation shape is now metric-count based (Line 82). Multi-metric configs can produce shape mismatch in downstream consumers.

Proposed fix

         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_obs = [-1.0] * len(self.define_observation_space())
+            return failure_obs, self.rewards.constraint_failure, True, {}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 142 - 145, The
constraint failure path in the method that handles constraint_check() returns a
hardcoded observation of [-1.0], but the observation space shape is metric-count
based as defined elsewhere in the code. For multi-metric configurations, this
creates a shape mismatch. Fix this by replacing the hardcoded [-1.0] return
value with an observation array that matches the correct shape based on the
metric count, so the observation shape remains consistent regardless of the
number of metrics in the configuration.

255-258: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Include env_params in cache matching to prevent stale DR labels.

Line 255 currently keys cache hits by action only. This reuses rewards across different current_env_params samples and breaks reward/condition integrity.

Proposed fix

 def get_cached_trajectory_result(self, action: Any) -> TrajectoryEntry | None:
+    current_env_params = getattr(self.test_run, "current_env_params", None)
     for entry in self.current_trajectory:
-        if self._values_match_exact(entry.action, action):
+        entry_env_params = getattr(entry, "env_params", None)
+        if self._values_match_exact(entry.action, action) and self._values_match_exact(
+            entry_env_params, current_env_params
+        ):
             return entry

     return None

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 255 - 258, The
get_cached_trajectory_result method only matches cache entries based on action,
which causes it to reuse rewards across different environment parameter samples
and breaks reward/condition integrity. Modify the matching logic in this method
to check both the action AND the current_env_params when iterating through
self.current_trajectory. Update the condition used in _values_match_exact or add
an additional check to ensure that a cache entry is only returned if both the
action matches and the environment parameters match the current state. This
prevents stale DR labels from being reused when env_params differ.

Source: Pipeline failures

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 151-163: The _sync_underlying_step_counter method sets
test_run.step to self._step_count + 1 before the underlying CloudAIGymEnv.step()
method is called, but that method increments the step counter again internally,
resulting in an off-by-one error. Remove the + 1 and set test_run.step to just
self._step_count so that when CloudAIGymEnv.step() increments it internally, the
step counter will have the correct final value.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 142-145: The constraint failure path in the method that handles
constraint_check() returns a hardcoded observation of [-1.0], but the
observation space shape is metric-count based as defined elsewhere in the code.
For multi-metric configurations, this creates a shape mismatch. Fix this by
replacing the hardcoded [-1.0] return value with an observation array that
matches the correct shape based on the metric count, so the observation shape
remains consistent regardless of the number of metrics in the configuration.
- Around line 255-258: The get_cached_trajectory_result method only matches
cache entries based on action, which causes it to reuse rewards across different
environment parameter samples and breaks reward/condition integrity. Modify the
matching logic in this method to check both the action AND the
current_env_params when iterating through self.current_trajectory. Update the
condition used in _values_match_exact or add an additional check to ensure that
a cache entry is only returned if both the action matches and the environment
parameters match the current state. This prevents stale DR labels from being
reused when env_params differ.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 7bff3cab-2384-4f5f-9360-3fc537c95a58

📥 Commits

Reviewing files that changed from the base of the PR and between 468419b and f6515cf.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

pyproject.toml
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_cloudaigym.py
tests/test_gymnasium_adapter.py
tests/test_handlers.py

coderabbitai · 2026-06-15T23:27:02Z

+    def _sync_underlying_step_counter(self) -> None:
+        """
+        Mirror ``handle_dse_job``'s 1-based ``test_run.step`` so artifact paths match.
+
+        The first step is written under ``…/<iteration>/1/``, matching how
+        ``handle_dse_job`` numbers steps; this keeps reports and trajectory
+        analysis consistent regardless of whether the env is driven by the
+        DSE loop or by an external training loop wrapping the adapter.
+        """
+        test_run = getattr(self._env, "test_run", None)
+        if test_run is not None:
+            test_run.step = self._step_count + 1
+


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Step synchronization is off-by-one when wrapping CloudAIGymEnv.

Line 162 sets test_run.step to self._step_count + 1 before delegating, but CloudAIGymEnv.step() increments again internally. This shifts artifact/trajectory step indexing by one.

Proposed fix

def _sync_underlying_step_counter(self) -> None: @@ test_run = getattr(self._env, "test_run", None) if test_run is not None: - test_run.step = self._step_count + 1 + # Underlying CloudAIGymEnv.step() increments at step start. + test_run.step = self._step_count

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/gymnasium_adapter.py` around lines 151 - 163, The _sync_underlying_step_counter method sets test_run.step to self._step_count + 1 before the underlying CloudAIGymEnv.step() method is called, but that method increments the step counter again internally, resulting in an off-by-one error. Remove the + 1 and set test_run.step to just self._step_count so that when CloudAIGymEnv.step() increments it internally, the step counter will have the correct final value.

…ole mutator The trial index lives on TestRun and is advanced exactly once per CloudAIGymEnv.step(), before output_path and the trajectory row are computed. Step artifacts and trajectory rows are therefore tagged with the advanced index rather than the pre-step value.

handle_dse_job calls agent.run() uniformly; agents (e.g. RLlib-based) implement their own training loop in run(). Failure handling follows a hybrid contract: a non-zero rc accumulates via `err |= rc` and the sweep continues, while an unexpected exception hard-fails the scenario after reports are generated and the aborting error is documented.

rutayan-nv · 2026-06-16T14:09:01Z

@coderabbitai review

coderabbitai · 2026-06-16T14:09:12Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

tests/test_cloudaigym.py (2)

502-502: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Split compound assertion for clearer pytest introspection.

The compound assertion assert result is not None and result.step == 1 prevents pytest from clearly indicating which clause failed.

✂️ Proposed fix

-    assert result is not None and result.step == 1
+    assert result is not None
+    assert result.step == 1

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cloudaigym.py` at line 502, The compound assertion `assert result
is not None and result.step == 1` combines two independent checks which prevents
pytest from clearly identifying which specific condition failed during test
execution. Split this into two separate assertions: one to verify that result is
not None, and another to verify that result.step equals 1. This allows pytest's
introspection to pinpoint exactly which condition caused the failure.

Source: Linters/SAST tools

520-520: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Split compound assertion for clearer pytest introspection.

The compound assertion assert result is not None and result.step == 1 prevents pytest from clearly indicating which clause failed.

✂️ Proposed fix

-    assert result is not None and result.step == 1
+    assert result is not None
+    assert result.step == 1

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cloudaigym.py` at line 520, The compound assertion using the `and`
operator on line 520 prevents pytest from clearly indicating which clause failed
when the assertion breaks. Split the assertion `assert result is not None and
result.step == 1` into two separate assert statements, one checking that result
is not None and another checking that result.step equals 1. This allows pytest
to provide more granular failure reporting and makes debugging test failures
easier.

Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cloudaigym.py`:
- Around line 523-525: The function signature for
test_step_reruns_workload_when_env_params_change violates ruff-format rules due
to improper multi-line formatting. Reformat the function signature to comply
with the project's formatting standards by either running ruff format on the
file or manually adjusting the parameter list and opening parenthesis placement
to align with ruff's style requirements.

---

Duplicate comments:
In `@tests/test_cloudaigym.py`:
- Line 502: The compound assertion `assert result is not None and result.step ==
1` combines two independent checks which prevents pytest from clearly
identifying which specific condition failed during test execution. Split this
into two separate assertions: one to verify that result is not None, and another
to verify that result.step equals 1. This allows pytest's introspection to
pinpoint exactly which condition caused the failure.
- Line 520: The compound assertion using the `and` operator on line 520 prevents
pytest from clearly indicating which clause failed when the assertion breaks.
Split the assertion `assert result is not None and result.step == 1` into two
separate assert statements, one checking that result is not None and another
checking that result.step equals 1. This allows pytest to provide more granular
failure reporting and makes debugging test failures easier.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: bd869259-1364-4c02-9fda-40317c432bf3

📥 Commits

Reviewing files that changed from the base of the PR and between f6515cf and 1f2191c.

📒 Files selected for processing (7)

src/cloudai/_core/test_scenario.py
src/cloudai/cli/handlers.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
tests/test_cloudaigym.py
tests/test_handlers.py
tests/test_test_scenario.py

rutayan-nv · 2026-06-16T15:15:15Z

@coderabbitai review

coderabbitai · 2026-06-16T15:15:27Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

run() already treats a None return from select_action() as a stop signal, but the abstract method declared only tuple[int, dict[str, Any]]. Widen the return annotation to tuple[int, dict[str, Any]] | None and document the termination semantics so the declared contract matches run()'s usage.

CloudAIGymEnv.get_cached_trajectory_result(action) keys the trajectory cache on action alone. When a workload declares env_params (e.g. drop_rate) and the agent re-selects the same action under a different env_params sample, the cache returns a stale reward measured under a different env, silently invalidating any domain-randomization workflow. Four tests pin the contract; today two FAIL (bug-exposing), two PASS (back-compat sanity). All 27 pre-existing tests are untouched. FAIL test_cache_miss_when_env_params_differ FAIL test_step_reruns_workload_when_env_params_change PASS test_cache_hit_when_action_and_env_params_match PASS test_cache_hit_when_neither_has_env_params Driver for the follow-up PR that adds env_params as a first-class field on TestDefinition + TrajectoryEntry and rewrites the cache key as (action, env_params). Both unit and integration shapes are covered so the fix can be validated end-to-end through env.step().

…ge signature Satisfy ruff-format on the env_params cache-key TDD test.

CodeRabbit (Ruff PT018): `assert a and b` obscures which clause failed. Split the two `assert result is not None and result.step == 1` checks in the env_params cache-key tests into separate assertions for clearer pytest introspection. Test-only change, no behavior impact.

Make env_params a first-class part of CloudAIGymEnv trial identity so the trajectory cache keys on (action, env_params) rather than action alone, fixing the domain-randomization correctness bug where the same action under a different env_params sample returned a stale reward. - Cache key now includes env_params; cache-key tests pin the contract (formerly the TDD-red specs of NVIDIA#900, folded in here). - Keep env.csv and trajectory.csv 1:1 step-aligned: a single TrajectoryEntry sinks both files coherently, including on constraint failure. - Reject env_params on non-DSE jobs; reject non-finite / negative weights. - Add cache-hit + declared-env_params integration coverage. Folds the test-only PR NVIDIA#900 (cache-key TDD) into this PR so the stack has no permanently-red standalone PR.

rutayan-nv · 2026-06-20T22:16:43Z

Folded into #901. As of c7f7f1f, #901 now carries the cache-key TDD specs (test_cache_miss_when_env_params_differ, test_step_reruns_workload_when_env_params_change) together with the production fix in a single green commit, eliminating this permanently-red standalone PR.

The stack now goes #933 -> #901 -> ... (this branch is dropped). Closing.

Make env_params a first-class part of CloudAIGymEnv trial identity so the trajectory cache keys on (action, env_params) rather than action alone, fixing the domain-randomization correctness bug where the same action under a different env_params sample returned a stale reward. - Cache key now includes env_params; cache-key tests pin the contract (formerly the TDD-red specs of NVIDIA#900, folded in here). - Keep env.csv and trajectory.csv 1:1 step-aligned: a single TrajectoryEntry sinks both files coherently, including on constraint failure. - Reject env_params on non-DSE jobs; reject non-finite / negative weights. - Add cache-hit + declared-env_params integration coverage. Folds the test-only PR NVIDIA#900 (cache-key TDD) into this PR so the stack has no permanently-red standalone PR.

rutayan-nv mentioned this pull request May 26, 2026

fix(configurator): make env_params first-class to fix the trajectory cache key #901

Open

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 7a27fa2 to 6b9f126 Compare June 12, 2026 21:14

rutayan-nv mentioned this pull request Jun 12, 2026

[CLI/Core] Polymorphic agent dispatch + TestRun owns trial counter #893

Closed

rutayan-nv marked this pull request as ready for review June 12, 2026 21:27

rutayan-nv requested review from jeffnvidia, podkidyshev and srivatsankrishnan as code owners June 12, 2026 21:27

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

Comment thread src/cloudai/configurator/base_agent.py

Comment thread src/cloudai/configurator/base_agent.py

Comment thread tests/test_cloudaigym.py Outdated

Comment thread tests/test_handlers.py Outdated

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 6b9f126 to 923ca0b Compare June 15, 2026 17:39

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

Comment thread src/cloudai/cli/handlers.py Outdated

Comment thread tests/test_cloudaigym.py Outdated

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 923ca0b to 468419b Compare June 15, 2026 22:49

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 468419b to f6515cf Compare June 15, 2026 23:20

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from f6515cf to 3608921 Compare June 16, 2026 00:01

rutayan-nv mentioned this pull request Jun 16, 2026

feat(core): add ContinuousSpace action-space primitive #927

Draft

rutayan-nv added 2 commits June 16, 2026 09:44

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 3608921 to 1f2191c Compare June 16, 2026 13:50

rutayan-nv mentioned this pull request Jun 16, 2026

[CLI/Core] Polymorphic agent dispatch + TestRun owns trial counter #933

Merged

coderabbitai Bot reviewed Jun 16, 2026

View reviewed changes

Comment thread tests/test_cloudaigym.py Outdated

rutayan-nv added 3 commits June 16, 2026 17:55

style(tests): one-line test_step_reruns_workload_when_env_params_chan…

210e411

…ge signature Satisfy ruff-format on the env_params cache-key TDD test.

rutayan-nv force-pushed the rpatro/env-params-cache-key-tdd branch from 7368d06 to 210e411 Compare June 16, 2026 22:15

rutayan-nv closed this Jun 20, 2026

rutayan-nv deleted the rpatro/env-params-cache-key-tdd branch June 20, 2026 22:17

Uh oh!

Conversation

rutayan-nv commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue

Fix (test-only, TDD-red)

Testing

Uh oh!

coderabbitai Bot commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

rutayan-nv commented Jun 16, 2026

Uh oh!

coderabbitai Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

rutayan-nv commented Jun 16, 2026

Uh oh!

coderabbitai Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rutayan-nv commented Jun 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rutayan-nv commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented May 26, 2026 •

edited

Loading

coderabbitai Bot commented Jun 16, 2026 •

edited

Loading

coderabbitai Bot commented Jun 16, 2026 •

edited

Loading