Skip to content

fix(configurator): make env_params first-class to fix the trajectory cache key#901

Open
rutayan-nv wants to merge 10 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/env-params-first-class
Open

fix(configurator): make env_params first-class to fix the trajectory cache key#901
rutayan-nv wants to merge 10 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/env-params-first-class

Conversation

@rutayan-nv

@rutayan-nv rutayan-nv commented May 26, 2026

Copy link
Copy Markdown
Contributor

Issue

Fix

  • Cache correctness: env_params becomes first-class schema (TestDefinition.env_params, TestRun.current_env_params, TrajectoryEntry.env_params); the cache key is now (action, current_env_params).
  • Artifact alignment: CloudAIGymEnv builds an EnvParamsObserver that samples deterministically and writes env.csv before the cache lookup, so it aligns 1:1 with trajectory.csv; workloads without env_params pay zero overhead.

Testing

Stack: main#893#900#901 (top). Depends on #900#893.

@coderabbitai

coderabbitai Bot commented May 26, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: c9f41e83-f4ec-40e7-9ab6-6a43ed0109c8

📥 Commits

Reviewing files that changed from the base of the PR and between 7fec7f5 and 8f91664.

📒 Files selected for processing (3)
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_env_params.py

📝 Walkthrough

Walkthrough

Adds per-trial env parameter sampling and validation, threads sampled env_params through TestRun and CloudAIGymEnv, persists env.csv alongside trajectories, tightens cache matching on env_params, and updates DSE CLI handling and tests for the new flow.

Changes

Env parameter sampling and DSE flow

Layer / File(s) Summary
Env parameter primitives
src/cloudai/configurator/env_params.py, tests/test_env_params.py
EnvParamSpec, EnvParam, EnvParams, and EnvParamsSink are added for weighted sampling, deterministic per-trial draws, and CSV persistence. Tests cover validation, sampling, sink output, and resolution from test definitions.
Workload and TestRun overlays
src/cloudai/models/workload.py, src/cloudai/_core/test_scenario.py, tests/test_env_params.py
TestDefinition adds env_params, validates env-param annotations, and excludes env-sampled cmd_args from DSE classification. TestRun stores current_env_params, filters env-sampled keys from param_space, and overlays action and env_params through apply_params_set. Tests cover validation, DSE classification, and overlay behavior.
CloudAIGymEnv env_params flow
src/cloudai/configurator/cloudai_gym.py, tests/test_cloudaigym.py
CloudAIGymEnv samples env_params per step, records them in trajectory entries and env.csv, and matches cached trajectories on both action and env_params. Tests cover cache keying, step alignment, overlay application, and the no-env_params path.
DSE validation and execution
src/cloudai/configurator/base_agent.py, src/cloudai/configurator/env_params.py, src/cloudai/cli/handlers.py, tests/test_handlers.py
BaseAgent gains samples_env_params. validate_dse_env_params is wired into dry-run and scenario verification. handle_dse_job runs each DSE case through CloudAIGymEnv and agent.run(), records dse_failure.txt on unexpected errors, and passes run errors into report generation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/cloudai#933: Refactors handle_dse_job and generate_reports around agent.run() and optional error handling, which this PR extends for env-parameter-aware DSE execution.

Suggested labels

enhancement

Suggested reviewers

  • srivatsankrishnan
  • jeffnvidia
  • amaslenn

Poem

A rabbit hopped through env.csv at dawn,
With weighted carrots freshly drawn.
The cache kept pace, the trials sang,
And DSE twinkled as samples sprang.
\o/ 🐇

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly captures the main change: making env_params first-class to fix trajectory cache correctness.
Description check ✅ Passed The description accurately summarizes the cache-key and env.csv alignment changes plus testing coverage.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (2)

143-171: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Keep trajectory.csv aligned on the constraint-failure path.

observer.before_step() has already persisted the sampled env row by Line 147, but the early return on Lines 169-171 skips write_trajectory(). Any trial rejected by constraint_check() now leaves an orphan env.csv row, which breaks the PR’s stated 1:1 env.csvtrajectory.csv contract and makes per-trial artifacts inconsistent.

Suggested fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 143 - 171, The early
return on constraint_check leaves an env row without a matching trajectory; to
fix, when test_run.test.constraint_check(test_run, runner.system) returns False,
write a TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.

186-202: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Preserve current_env_params when recovering from runner failures.

On Line 189 the fallback rebuilds self.test_run from self.original_test_run, which resets current_env_params to its default. The trajectory row written on Line 202 then records {} even though env.csv already contains the sampled params, and the cached failure is stored under (action, {}) instead of (action, sampled_env_params). Repeating the same failing trial will miss the cache and produce disagreeing artifacts.

Suggested fix
         if self.runner.test_scenario.test_runs and self.runner.test_scenario.test_runs[0].output_path.exists():
             self.test_run = self.runner.test_scenario.test_runs[0]
         else:
-            self.test_run = copy.deepcopy(self.original_test_run)
+            self.test_run = copy.deepcopy(self.original_test_run).apply_params_set(action)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = dict(new_tr.current_env_params)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 186 - 202, When
rebuilding self.test_run in the fallback branch (inside cloudai_gym.py) you must
preserve the sampled env params so the trajectory/cache keys match the actual
env.csv; after creating the deep-copied self.test_run from
self.original_test_run and after setting step/output_path, copy the sampled
params into self.test_run.current_env_params (e.g. assign a deepcopy of
self.original_test_run.current_env_params or the sampled params present on the
runner) before calling get_observation/compute_reward and write_trajectory; this
ensures dict(self.test_run.current_env_params) in the TrajectoryEntry reflects
the real sampled parameters.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 116-149: In run(), replace the eager f-string logging calls with
%-style deferred logging so formatting happens only if the message will be
emitted; specifically update the two logging.info calls inside the run method
(the "Running step ..." and "Step {step}: Observation..." messages) to use
logging.info("Running step %s (of %s) with action %s", step, self.max_steps,
action) and logging.info("Step %s: Observation: %s, Reward: %.4f", step, ...,
reward) respectively, ensuring you format the observation list without using
f-strings so the logging framework can lazily interpolate.

In `@src/cloudai/configurator/env_params.py`:
- Around line 1-2: The file header years in
src/cloudai/configurator/env_params.py are incorrect for the touched year;
update the two top lines (the SPDX copyright line and the subsequent Copyright
line) by replacing "2024-2026" with the exact year string the repo's header test
expects for this file (e.g., "2026") so the header matches
tests/test_check_copyright_headers.py.

In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 133-165: Add a public close(self) -> None method to
GymnasiumAdapter that safely delegates to the wrapped environment's close
method: check for an attribute (e.g. getattr(self._env, "close", None)), call it
if callable, and return None; do not assume inheritance from gymnasium.Env. Add
the method near other lifecycle helpers (e.g. alongside
_sync_underlying_step_counter, _step_with_params) so callers that call
env.close() won't raise AttributeError.
- Around line 77-80: GymnasiumAdapter.__init__ currently filters action space
lists into _tunable_params (len>1) and _fixed_params (len==1) but silently drops
any keys with empty lists; modify GymnasiumAdapter.__init__ to detect any entry
in the raw_action_space where len(v) == 0 and raise a ValueError identifying the
offending key(s) so callers (e.g., TestRun.param_space) fail fast; add a unit
test that constructs a TestRun or raw_action_space containing an empty list for
a cmd_args key and asserts that GymnasiumAdapter(...) raises ValueError with a
message mentioning the empty key(s).

In `@tests/test_env_params.py`:
- Around line 86-87: The test uses a broad pytest.raises(ValueError) when
calling sink.write(0, {"drop_rate": 0.0}); narrow it by adding a match parameter
to pytest.raises that matches the specific ValueError message expected (e.g.,
the message about invalid drop_rate or index) so the test asserts the exact
error condition raised by sink.write; update the pytest.raises invocation around
sink.write to include match="expected message substring" referencing the
sink.write call and pytest.raises.
- Around line 1-2: The copyright header in the file contains the range
"2025-2026" which fails CI; update the header lines (the SPDX line and the
Copyright line that currently reads "2025-2026 NVIDIA CORPORATION & AFFILIATES")
to use a single year "2026" so they match the expected format for new 2026
files.

In `@tests/test_handlers.py`:
- Around line 287-288: The override of select_action in tests/test_handlers.py
has the wrong signature; change the function declaration for select_action in
that file to match the base class signature BaseAgent.select_action(self,
observation: list[float] | None = None) (i.e., add the observation parameter
with the same type and default) and keep the existing body (raising
AssertionError) so pyright/CI type-checking passes while preserving the test
intent.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 143-171: The early return on constraint_check leaves an env row
without a matching trajectory; to fix, when
test_run.test.constraint_check(test_run, runner.system) returns False, write a
TrajectoryEntry (use self.write_trajectory with step=self.test_run.step,
action=action, reward=self.rewards.constraint_failure, observation=[-1.0],
env_params=dict(self.test_run.current_env_params)) and call each
observer.after_step(self.test_run, observation, reward) before returning the
same quadruple (observation, reward, True, {}); this mirrors the cached-result
path and preserves the 1:1 env.csv ↔ trajectory.csv contract.
- Around line 186-202: When rebuilding self.test_run in the fallback branch
(inside cloudai_gym.py) you must preserve the sampled env params so the
trajectory/cache keys match the actual env.csv; after creating the deep-copied
self.test_run from self.original_test_run and after setting step/output_path,
copy the sampled params into self.test_run.current_env_params (e.g. assign a
deepcopy of self.original_test_run.current_env_params or the sampled params
present on the runner) before calling get_observation/compute_reward and
write_trajectory; this ensures dict(self.test_run.current_env_params) in the
TrajectoryEntry reflects the real sampled parameters.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 0097c65c-5ab7-4bd2-8683-6e1ea26cc182

📥 Commits

Reviewing files that changed from the base of the PR and between 18ce72e and d076095.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • pyproject.toml
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_gymnasium_adapter.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

Comment thread src/cloudai/configurator/base_agent.py
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/gymnasium_adapter.py Outdated
Comment thread src/cloudai/configurator/gymnasium_adapter.py Outdated
Comment thread tests/test_env_params.py Outdated
Comment thread tests/test_env_params.py Outdated
Comment thread tests/test_handlers.py Outdated
@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch from d076095 to 5c85e08 Compare June 15, 2026 17:39

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

169-171: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve env/traj alignment and env_params identity on non-happy paths.

Line 170 currently returns on constraint failure after before_step() has already persisted env.csv, so env.csv and trajectory.csv can diverge.
Also, in Lines 189-203, replacing self.test_run with original_test_run can drop current_env_params, so the trajectory row is written with empty env params and weakens (action, env_params) cache identity.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
@@
         if self.runner.test_scenario.test_runs and self.runner.test_scenario.test_runs[0].output_path.exists():
             self.test_run = self.runner.test_scenario.test_runs[0]
         else:
+            preserved_env_params = dict(new_tr.current_env_params)
             self.test_run = copy.deepcopy(self.original_test_run)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = preserved_env_params

Also applies to: 189-203

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 169 - 171, The
constraint check at lines 169-171 returns after before_step() has already
persisted env.csv, causing env.csv and trajectory.csv to diverge. Move the
constraint check before the before_step() call so that early returns don't
create misalignment between the two files. Additionally, at lines 189-203,
ensure that self.test_run is used instead of original_test_run to preserve
current_env_params in the trajectory row, maintaining proper action-env_params
cache identity.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 151-163: The _sync_underlying_step_counter method has an
off-by-one error where it sets test_run.step to self._step_count + 1, but
CloudAIGymEnv.step() also increments test_run.step on entry, causing a double
increment. Fix this by changing the assignment to test_run.step =
self._step_count instead, since the increment will be handled by
CloudAIGymEnv.step() when it is called, ensuring the first adapter-driven step
is recorded as trial 1 rather than trial 2.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 169-171: The constraint check at lines 169-171 returns after
before_step() has already persisted env.csv, causing env.csv and trajectory.csv
to diverge. Move the constraint check before the before_step() call so that
early returns don't create misalignment between the two files. Additionally, at
lines 189-203, ensure that self.test_run is used instead of original_test_run to
preserve current_env_params in the trajectory row, maintaining proper
action-env_params cache identity.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ea1d6771-30cc-49dd-bc66-394901097b13

📥 Commits

Reviewing files that changed from the base of the PR and between d076095 and 5c85e08.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • pyproject.toml
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_gymnasium_adapter.py
  • tests/test_handlers.py

Comment thread src/cloudai/configurator/gymnasium_adapter.py Outdated
@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch from 5c85e08 to 0efcd11 Compare June 15, 2026 22:49

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (1)

146-171: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep env.csv and trajectory.csv aligned on constraint-failure steps.

before_step() runs before the constraint check, so env.csv is already appended. The early return on Line 171 skips write_trajectory(), creating step drift between artifacts when constraints fail.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 146 - 171, The
constraint check failure on line 171 returns early without calling
write_trajectory(), while before_step() has already been called on line 146,
causing env.csv and trajectory.csv to become misaligned. Add a
write_trajectory() call before the early return on constraint failure, similar
to what is done in the cached result case, to write a TrajectoryEntry with the
current step, action, constraint_failure reward, an appropriate observation, and
current environment parameters to keep the CSV artifacts synchronized.
tests/test_cloudaigym.py (1)

165-194: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add an env_params-enabled constraint-failure regression test.

Current constraint-failure coverage does not exercise observer-enabled runs. Add a variant with tdef.env_params declared and assert failed steps keep env.csv and trajectory.csv step-aligned.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cloudaigym.py` around lines 165 - 194, The test_constraint_failure
function needs to be expanded to cover observer-enabled runs. Add a new test
variant by either parameterizing the existing test or creating an additional
test case where tdef.env_params is declared before creating the CloudAIGymEnv.
After the env.step() call, add assertions to verify that the env.csv and
trajectory.csv output files are kept step-aligned following the constraint
failure, ensuring that observer-enabled runs handle failed steps correctly
without misaligning the step records across output files.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 146-171: The constraint check failure on line 171 returns early
without calling write_trajectory(), while before_step() has already been called
on line 146, causing env.csv and trajectory.csv to become misaligned. Add a
write_trajectory() call before the early return on constraint failure, similar
to what is done in the cached result case, to write a TrajectoryEntry with the
current step, action, constraint_failure reward, an appropriate observation, and
current environment parameters to keep the CSV artifacts synchronized.

In `@tests/test_cloudaigym.py`:
- Around line 165-194: The test_constraint_failure function needs to be expanded
to cover observer-enabled runs. Add a new test variant by either parameterizing
the existing test or creating an additional test case where tdef.env_params is
declared before creating the CloudAIGymEnv. After the env.step() call, add
assertions to verify that the env.csv and trajectory.csv output files are kept
step-aligned following the constraint failure, ensuring that observer-enabled
runs handle failed steps correctly without misaligning the step records across
output files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 804b42a1-6f0a-462b-b3b1-291ec8eea021

📥 Commits

Reviewing files that changed from the base of the PR and between 5c85e08 and 0efcd11.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • pyproject.toml
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_gymnasium_adapter.py
  • tests/test_handlers.py

@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch from 0efcd11 to b3adf41 Compare June 15, 2026 23:20

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 146-148: The constraint-failure early return in the code is
preventing `trajectory.csv` from being written and `after_step()` hooks from
being called, which breaks trial artifact consistency. Additionally, the early
return provides a fixed one-element observation instead of one matching the
configured observation space. Restructure the constraint-failure handling at
both affected locations to ensure that after encountering a constraint failure,
the code still executes the necessary finalization steps including writing
`trajectory.csv` and calling `after_step()` on all observers, while also
returning an observation that matches the shape of the configured observation
space rather than a hardcoded single-element observation. This applies at
src/cloudai/configurator/cloudai_gym.py lines 146-148 (anchor) and lines 169-172
(sibling).

In `@tests/test_cloudaigym.py`:
- Around line 567-615: Add a new test function to validate step alignment when
constraint_check fails during the env_params flow. Create a test similar to
test_env_csv_is_step_aligned_with_trajectory but configure the environment and
test scenario to trigger a constraint violation during env.step() calls. The
test must verify that despite the constraint failure: (1) env.csv and
trajectory.csv maintain 1:1 step row alignment (same step values in both files),
and (2) the observation returned from env.step() has length matching the number
of entries in agent_metrics. Use the same CloudAIGymEnv, TestRun, TestScenario,
and BaseRunner mocking patterns as the existing alignment test, but introduce
constraint conditions that will cause failures and ensure the alignment
verification assertions apply to the failure case.

In `@tests/test_gymnasium_adapter.py`:
- Around line 70-76: The `_GymWithTestRun` class initializes a `test_run` object
with a step counter but never increments it in the `step()` method, unlike the
real `CloudAIGymEnv.step()`. This means test assertions won't catch counter-sync
regressions in the adapter. Override or update the `step()` method in
`_GymWithTestRun` to increment `self.test_run.step` each time the method is
called, ensuring the test model accurately reflects the behavior of
`CloudAIGymEnv` and allows the adapter's pre-step sync assertions to detect
regressions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 0af1c210-2370-43d6-90f0-330aa01c925f

📥 Commits

Reviewing files that changed from the base of the PR and between 0efcd11 and b3adf41.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • pyproject.toml
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/__init__.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/configurator/gymnasium_adapter.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_gymnasium_adapter.py
  • tests/test_handlers.py

Comment thread src/cloudai/configurator/cloudai_gym.py Outdated
Comment thread tests/test_cloudaigym.py Outdated
Comment thread tests/test_gymnasium_adapter.py Outdated
@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

168-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure early return still breaks env.csv/trajectory.csv alignment.

When constraint_check fails, before_step() has already executed (line 146), writing to env.csv. But this early return skips both write_trajectory() and after_step() calls. This creates misalignment: env.csv has step N while trajectory.csv does not.

The past review flagged this and was marked "Addressed", but the current code still shows the bare early return. Either the fix was incomplete or reverted.

🛠️ Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_obs = [-1.0]
+            failure_reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=failure_reward,
+                    observation=failure_obs,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, failure_obs, failure_reward)
+            return failure_obs, failure_reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 168 - 170, The
constraint_check failure early return at line 168-170 skips both
write_trajectory() and after_step() calls while before_step() has already
executed and written to env.csv, causing misalignment between env.csv and
trajectory.csv. Move the write_trajectory() and after_step() method calls to
execute before this early return statement to ensure both files stay
synchronized, so that whenever a step is written to env.csv, the corresponding
trajectory is written to trajectory.csv.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 168-170: The constraint_check failure early return at line 168-170
skips both write_trajectory() and after_step() calls while before_step() has
already executed and written to env.csv, causing misalignment between env.csv
and trajectory.csv. Move the write_trajectory() and after_step() method calls to
execute before this early return statement to ensure both files stay
synchronized, so that whenever a step is written to env.csv, the corresponding
trajectory is written to trajectory.csv.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 3128d272-6f62-499c-b04a-76f177dbb799

📥 Commits

Reviewing files that changed from the base of the PR and between b3adf41 and 2411f39.

📒 Files selected for processing (10)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch from 2411f39 to c53596d Compare June 16, 2026 15:06
@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

168-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve step finalization on constraint failure to keep artifacts aligned.

At Line 168, this early return skips both trajectory persistence and observer after_step() even though before_step() already ran. That can leave env.csv and trajectory.csv out of sync by step and breaks the observer lifecycle contract on failure paths.

💡 Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_observation = [-1.0] * max(len(self.test_run.test.agent_metrics), 1)
+            failure_reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=failure_reward,
+                    observation=failure_observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, failure_observation, failure_reward)
+            return failure_observation, failure_reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 168 - 170, The
constraint check failure at line 168 returns early before executing trajectory
persistence and the observer after_step() method, breaking the observer
lifecycle since before_step() already executed. Refactor this code so that the
finalization steps (trajectory persistence and observer after_step() call) run
even when constraint_check() fails. Move the early return to after all
finalization code completes, ensuring env.csv and trajectory.csv remain
synchronized and the observer contract is maintained on all code paths including
constraint failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 168-170: The constraint check failure at line 168 returns early
before executing trajectory persistence and the observer after_step() method,
breaking the observer lifecycle since before_step() already executed. Refactor
this code so that the finalization steps (trajectory persistence and observer
after_step() call) run even when constraint_check() fails. Move the early return
to after all finalization code completes, ensuring env.csv and trajectory.csv
remain synchronized and the observer contract is maintained on all code paths
including constraint failures.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: e0a47585-a77f-469b-ad79-79b222c1b20a

📥 Commits

Reviewing files that changed from the base of the PR and between 2411f39 and c53596d.

📒 Files selected for processing (6)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py

@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch 2 times, most recently from 1298463 to 9e379d1 Compare June 17, 2026 18:33
@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch 2 times, most recently from 4fce989 to 5c3f65e Compare June 21, 2026 15:34

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/cli/handlers.py`:
- Around line 147-162: Replace the f-string formatting in the logging.error call
(starting with "No agent available for type") with lazy logging interpolation by
converting the f-string to a regular string with placeholder values and passing
the variables as separate arguments to the logging.error function. The same
pattern should also be applied to the logging.info call at line 162 (starting
with "Using deterministic first sweep"). This defers string formatting until the
log is actually emitted, improving performance and satisfying the Ruff G004
rule.

In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 66-67: The CsvSink for env.csv is created once during
initialization but trajectory_file_path is iteration-dependent, causing the sink
to append to stale directories when the current_iteration changes. Instead of
creating the env sink once in the initialization method, move the CsvSink
creation logic for self._env_sink from the initialization code around line 80 to
be called per iteration where current_iteration is known. Initialize
self._env_sink as None in __init__ and then recreate it for each iteration using
the current iteration-dependent trajectory_file_path to ensure proper
per-iteration alignment of env.csv with the trajectory files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 062e4bc1-2770-4594-8ee6-896015e5c6b2

📥 Commits

Reviewing files that changed from the base of the PR and between c53596d and 5c3f65e.

📒 Files selected for processing (10)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

Comment thread src/cloudai/cli/handlers.py
Comment thread src/cloudai/configurator/cloudai_gym.py Outdated
Comment thread src/cloudai/configurator/env_params.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

80-81: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Rebind env.csv sink using the current iteration path before each write.

Line 81 initializes CsvSink once, but trajectory writes use the dynamic trajectory_file_path. If current_iteration changes, trajectory.csv moves while env.csv keeps appending to the old sink path, breaking 1:1 alignment.

🛠️ Proposed fix
         if self._env_sink is not None:
-            self._env_sink.write(entry.step, entry.env_params)
+            # Keep env.csv colocated with the current iteration's trajectory.csv.
+            self._env_sink = CsvSink(self._env_csv_path())
+            self._env_sink.write(entry.step, entry.env_params)

Also applies to: 303-305

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 80 - 81, The CsvSink
for _env_sink is initialized once at the beginning but needs to be dynamically
rebinded before each write to track the current iteration path, similar to how
trajectory files use dynamic paths. Instead of setting self._env_sink once
during initialization in the constructor, you need to rebind the _env_sink to
the current iteration path before each write operation by calling
CsvSink(self._env_csv_path()) dynamically when needed. This ensures that env.csv
stays in sync with the trajectory.csv files as the current_iteration changes.
Apply this same fix to all locations where _env_sink is used for writing,
including the area referenced at lines 303-305.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/models/workload.py`:
- Around line 236-244: The validation loop that checks env_params should reject
structured cmd_args targets instead of silently skipping them. Remove or modify
the early continue statement when spec.weights is None, and instead add
validation to check if the cmd_args field referenced by name is a structured
type (not a list). If it is a structured field without weights declared, raise a
ValueError to prevent the field from being silently excluded from the action
space by param_space or is_dse_job checks. This ensures that env_params only
references candidate list fields in cmd_args, preventing action dimensions from
being unintentionally removed.

---

Duplicate comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 80-81: The CsvSink for _env_sink is initialized once at the
beginning but needs to be dynamically rebinded before each write to track the
current iteration path, similar to how trajectory files use dynamic paths.
Instead of setting self._env_sink once during initialization in the constructor,
you need to rebind the _env_sink to the current iteration path before each write
operation by calling CsvSink(self._env_csv_path()) dynamically when needed. This
ensures that env.csv stays in sync with the trajectory.csv files as the
current_iteration changes. Apply this same fix to all locations where _env_sink
is used for writing, including the area referenced at lines 303-305.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 1960514a-f9b1-4af4-b0c4-ed40d1947f92

📥 Commits

Reviewing files that changed from the base of the PR and between 5c3f65e and d81001c.

📒 Files selected for processing (8)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py

Comment thread src/cloudai/models/workload.py Outdated
@rutayan-nv

rutayan-nv commented Jun 24, 2026

Copy link
Copy Markdown
Contributor Author

Core owns the randomization and the TestRun update; it is not agent-specific.

Real-world use. Domain randomization: vary an uncontrolled environment condition per trial so the learned policy generalizes instead of overfitting one fixed setting. Example: the agent tunes paddle_width (action) while each trial randomizes ball_speed (env_param).

Injection. env_params is an annotation, not a value store. Candidate values live in cmd_args, same as the action space (cmd_args.ball_speed = [4, 6, 8]); [env_params.ball_speed] reclassifies that list from "search" to "sample." Per trial, CloudAIGymEnv samples one value (EnvParamsObserver), writes it to TestRun.current_env_params, and overlays it onto cmd_args via model_copy(update=…) before the workload runs.

Core vs agent_config. The agent never reads or sets env_params; it only sees the reward. Sampling and the TestRun write happen in core (cloudai's Gym environment). It is environment state, not an agent hyperparameter.

vs. sweep. Inverse of a sweep over the same list: a bare list in cmd_args is the action space (searched); the same list annotated env_params is sampled, not searched (excluded from param_space/is_dse_job). Hence the hard reject under grid_search.

#900. current_env_params is part of the trajectory cache key, so revisiting an action under a different sample no longer serves a stale reward.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/models/workload.py (1)

228-230: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Don’t skip validation for empty cmd_args schemas.

Line 229 treats an empty model_fields mapping the same as a missing one, so env_params={"foo": ...} is accepted when cmd_args is a Pydantic model with no fields. Only skip when model_fields is absent.

Proposed fix
-        if not cmd_args_fields:
+        if cmd_args_fields is None:
             return self
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/models/workload.py` around lines 228 - 230, The validation logic
in the workload model is treating an empty cmd_args schema the same as no schema
at all, so env_params can slip through when the Pydantic model has zero fields.
Update the check in the method that inspects cmd_args_fields to only return
early when model_fields is missing (None), not when it is an empty mapping, so
empty schemas still trigger validation and reject unexpected env_params.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/models/workload.py`:
- Around line 228-230: The validation logic in the workload model is treating an
empty cmd_args schema the same as no schema at all, so env_params can slip
through when the Pydantic model has zero fields. Update the check in the method
that inspects cmd_args_fields to only return early when model_fields is missing
(None), not when it is an empty mapping, so empty schemas still trigger
validation and reject unexpected env_params.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: c111e710-1247-4ba9-b834-99fd924d1ef8

📥 Commits

Reviewing files that changed from the base of the PR and between cf7dbf6 and 3ce5507.

📒 Files selected for processing (2)
  • src/cloudai/models/workload.py
  • tests/test_env_params.py

Comment thread src/cloudai/cli/handlers.py Outdated
Comment thread src/cloudai/configurator/cloudai_gym.py Outdated
Comment thread src/cloudai/configurator/env_params.py
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/cloudai_gym.py Outdated
Comment thread src/cloudai/configurator/cloudai_gym.py Outdated
Comment thread src/cloudai/models/workload.py Outdated
if isinstance(value, list) or (isinstance(value, dict) and check_dict(value, path)):
# env_params lists are sampled by the env, not searched by the agent, so they
# are not action-space dimensions and must not make a run count as DSE.
if skip_env_params and path.split(".", 1)[0] in self.env_params:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep seeing this path.split(".", 1)[0] in regards of env_params. Does it really need to be dot-separated in all of those places?

moreover, some workloads declare more than just the root cmd_args to contain CLIs. e.g. vllm has bench_cmd_args just as well. at the same time certain feature are truly constrained to support only cmd_args

What I'm saying, I suggest the format of env_params is the full dot-separated path to a given attributed inside TestDefinition, not CmdArgs. I believe it will look cleaner in the toml:

num_nodes = 2

[cmd_args]
  foo = 1

  [cmd_args.args]
  bar = [1, 2, 3]

[env_params]
"cmd_args.bar" = [0.1, 0.1, 0.8]

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done - f25ef4d (removed the redundant split('.', 1)[0]; env_params keys are flat cmd_args field names. Kept scope to cmd_args for now rather than full TestDefinition dot-paths — can generalize to bench_cmd_args etc. in a follow-up if you'd prefer)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would do a follow-up later to support full paths cmd_args.foo.bar or anything.abc.def meaning any nested attribute starting from TestDefinition. So that all parsing feature act consistently

just to be clear - don't need to be this PR but highly appreciated not to be prolonged. I can take it myself as well

@podkidyshev

Copy link
Copy Markdown
Contributor

Overall arch suggestion (summary of some of my comments): move parsing and applying of env_params closer to where the regular sweeping knobs are processed - TestRun, TestDefinition classes. So that, we reuse most of the code related to sweeping/knobs in regards of how it works with the TestDefinition and CmdArgs body

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

157-159: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Return a failure observation with the configured metric shape.

BaseAgent.run() intentionally does not stop on done=True; it forwards this observation to update_policy(). With multiple agent_metrics, [-1.0] has the wrong shape compared with normal get_observation() output.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_observation = [self.rewards.metric_failure] * max(len(self.test_run.test.agent_metrics), 1)
+            return failure_observation, self.rewards.constraint_failure, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 157 - 159, The
constraint-check early return in cloudai_gym.py uses a hardcoded [-1.0]
observation, which does not match the configured agent_metrics shape expected by
BaseAgent.run() and update_policy(). Update the failure path in the gym step
logic to return a failure observation built with the same structure as
get_observation(), using the configured metric set rather than a single scalar
placeholder, while keeping the existing reward, done flag, and info dict
behavior.

Source: Learnings

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/_core/test_scenario.py`:
- Around line 205-215: The revalidation pass in test_scenario’s constructor call
is re-running validate_env_params after action and env_params have already been
applied, so the post-overlay cmd_args can make weighted specs fail. Update the
validation-only path around type(tdef)(**tdef.model_dump()) to clear env_params
before the second construction, or otherwise avoid revalidating that field after
overlays in this flow. Keep the fix localized to the trial setup logic that
applies _apply to action and env_params.

---

Duplicate comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 157-159: The constraint-check early return in cloudai_gym.py uses
a hardcoded [-1.0] observation, which does not match the configured
agent_metrics shape expected by BaseAgent.run() and update_policy(). Update the
failure path in the gym step logic to return a failure observation built with
the same structure as get_observation(), using the configured metric set rather
than a single scalar placeholder, while keeping the existing reward, done flag,
and info dict behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: d61df24e-8964-44da-b280-80330b0f1f78

📥 Commits

Reviewing files that changed from the base of the PR and between 3ce5507 and 99d7bf2.

📒 Files selected for processing (9)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py

Comment thread src/cloudai/_core/test_scenario.py Outdated
@rutayan-nv

rutayan-nv commented Jun 25, 2026

Copy link
Copy Markdown
Contributor Author

Overall arch suggestion (summary of some of my comments): move parsing and applying of env_params closer to where the regular sweeping knobs are processed - TestRun, TestDefinition classes. So that, we reuse most of the code related to sweeping/knobs in regards of how it works with the TestDefinition and CmdArgs body

done - d81001c, f25ef4d.

  • Parsing: env_params is a TestDefinition field; candidates live in cmd_args and validate_env_params validates against them. is_dse_job / is_env_sampled traverse cmd_args in the same pass as sweep detection.
  • Applying: TestRun.apply_params_set writes the agent action and the sampled env_params through the same _apply onto cmd_args; param_space reuses the same cmd_args list-values, excluding env-sampled keys.
  • Per-trial sampling stays in the env layer (CloudAIGymEnv.step), keeping apply_params_set deterministic.

Make env_params a first-class part of CloudAIGymEnv trial identity so the
trajectory cache keys on (action, env_params) rather than action alone,
fixing the domain-randomization correctness bug where the same action under
a different env_params sample returned a stale reward.

- Cache key now includes env_params; cache-key tests pin the contract
  (formerly the TDD-red specs of NVIDIA#900, folded in here).
- Keep env.csv and trajectory.csv 1:1 step-aligned: a single TrajectoryEntry
  sinks both files coherently, including on constraint failure.
- Reject env_params on non-DSE jobs; reject non-finite / negative weights.
- Add cache-hit + declared-env_params integration coverage.

Folds the test-only PR NVIDIA#900 (cache-key TDD) into this PR so the stack has no
permanently-red standalone PR.
…rom search space

Make env_params a thin annotation over cmd_args fields instead of a holder of
candidate values. Candidate values live in cmd_args (the single source of truth,
exactly like an action-space dimension); env_params.<name> only marks a field as
env-sampled and carries optional sampling weights, never the values.

- EnvParamSpec drops `values`; validates weights (finite, non-negative, sum=1.0).
- Sampler/observer resolve candidate lists from cmd_args; scalar knobs are no-ops.
- TestDefinition.validate_env_params cross-checks annotations against cmd_args
  (key must be a real field; weights require a list and must match its length).
- Exclude env_params keys from both param_space and is_dse_job: an env-sampled
  list is not a search dimension, so an env-params-only workload is not a DSE job.
- validate_dse_env_params rejects env_params on non-DSE runs and on grid_search
  (exhaustive search cannot exploit per-trial randomization).
- Scrub private-implementation references from public docstrings.
- Unit tests use generic Atari Breakout semantics (ball_speed / paddle_width).
…pyright

- validate_env_params: reject structured (non-leaf) cmd_args targets. The
  observer cannot sample them, yet param_space/is_dse_job exclude the whole
  key, which would silently drop nested action dimensions.
- CloudAIGymEnv.write_trajectory: rebind the env.csv sink to the current
  iteration path before each write, so env.csv stays 1:1 aligned with
  trajectory.csv when the env instance is reused across iterations.
- test_env_params: assert the unknown-field rejection via model_validate so the
  negative test no longer trips pyright's call-arg check (CI Linting fix); add a
  structured-target rejection test.
An unweighted env_params spec skipped the candidate-list check, so an empty cmd_args list
(e.g. ball_speed = []) passed validation and only failed later in EnvParamsSampler.sample()
via rng.choice([]) (IndexError). Guard against an empty candidate list in validate_env_params
so the error surfaces at TestDefinition build time. Addresses CodeRabbit feedback.
…ms value objects

Replace the EnvParamsSampler class and the StepObserver/EnvParamsObserver
indirection with two frozen dataclasses: EnvParam (one resolved knob: candidates,
optional weights, single draw) and EnvParams (per-run knobs + seed, built via
from_test, sampled per trial). The sampling RNG lives in the env: step() draws
this trial's values and hands concrete values to TestRun.apply_params_set(action,
env_params=...), which overlays action and sample through one deterministic path.

Centralize the cmd_args -> env_params lookup in TestDefinition.is_env_sampled and
access current_env_params directly. Expand EnvParam/EnvParams unit tests to cover
draw, from_test, sample, and immutability.
Drop the EnvParamsSink Protocol + CsvSink pair (and runtime_checkable) for a
single concrete EnvParamsSink, built unconditionally in CloudAIGymEnv. The
sink is now stateless: write() takes the record path per call and skips empty
samples, so non-DR runs write nothing and write_trajectory needs no branch.

Derive both records from a new iteration_dir property and expose the env
record via the env_params_record_path property (was _env_csv_path), keeping
env.csv and trajectory.csv step-aligned without coupling the name to CSV.
…ty flag

Replace the hardcoded `agent == "grid_search"` check with a BaseAgent.samples_env_params
capability flag (opt-in, defaults False). Only agents whose search consumes per-trial
env_params sampling set it True; enumerating/surrogate agents leave it False, so a config
that declares env_params for an agent that would ignore them is rejected up front instead
of silently no-op'ing. New agents answer for themselves with no string to maintain.

Relocate validate_dse_env_params out of the CLI handlers into configurator/env_params.py
next to the logic it guards, looking the agent up via the Registry. Unknown agents are
deferred to the dedicated agent-resolution error rather than masked here.

Keep all public-facing comments, docstrings, and the error message generic (no internal
agent names). Cover the full validator matrix, including the unknown-agent deferral.
Compress multi-line inline comments down to the single non-obvious rationale (or drop
them where the code already speaks), per the self-documenting-code principle. Public API
docstrings and test intent comments are left intact.
apply_params_set overlays sampled scalar draws onto cmd_args, then
reconstructs the TestDefinition to validate the applied action values.
That pass re-ran validate_env_params, which rejects a weighted env_param
whose cmd_args target is no longer a candidate list - exactly what the
overlay produces. env_params is already validated at parse time, so drop
it from the validation-only dump. Adds a regression test covering a
weighted env_param's scalar draw.
@rutayan-nv rutayan-nv force-pushed the rpatro/env-params-first-class branch from 5790ab9 to 7fec7f5 Compare June 25, 2026 19:51

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

153-155: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Return a failure observation with the normal metric width.

This branch still returns [-1.0] unconditionally. If agent_metrics has more than one entry, the agent sees a different observation shape on constraint-failure trials than on every successful trial.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            failure_observation = [self.rewards.metric_failure] * max(len(self.test_run.test.agent_metrics), 1)
+            return failure_observation, self.rewards.constraint_failure, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 153 - 155, The
constraint-failure branch in the gym step logic returns a fixed one-element
observation, which can change the observation shape when agent_metrics normally
has multiple values. Update the early return in the step method that performs
the constraint_check so it builds a failure observation with the same width as
the normal observation from agent_metrics, while still returning the failure
reward and done flag.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/_core/test_scenario.py`:
- Around line 206-222: The num-nodes sweep handling in test_scenario should
special-case NUM_NODES before the generic action overlay, since _apply()
currently treats every action key as a cmd_args field and can incorrectly write
NUM_NODES into the test definition. Update the logic in the action application
block so NUM_NODES is removed or handled separately from the loop that calls
_apply(), then continue using the existing new_tr.num_nodes assignment to
preserve the sweep value on the copied trajectory object.

In `@src/cloudai/cli/handlers.py`:
- Around line 307-312: The validation in handle_dry_run_and_run only checks
validate_dse_env_params against the parsed test_scenario, so env_params can
still slip through when args.single_sbatch forces the non-DSE path. Update the
validation flow in handle_dry_run_and_run to also consider args.single_sbatch
and reject env-aware DSE scenarios before dispatching to handle_non_dse_job. Use
the existing symbols validate_dse_env_params, handle_dry_run_and_run, and
handle_non_dse_job to locate the check and add the explicit guard for the
single-sbatch case.

In `@src/cloudai/configurator/env_params.py`:
- Around line 181-188: The validation loop in EnvParams.from_test is treating
any non-empty tr.test.env_params as requiring DSE sampling, which incorrectly
rejects scalar-only annotations that are meant to be fixed no-ops. Update the
check around the sampled calculation so it distinguishes scalar markers from
actual sampled env params, and only raises/blocks when the env_params entry
contains values that would truly be sampled. Keep the logic localized to
EnvParams.from_test and use the existing agent resolution path (agents.get,
tr.is_dse_job, agent.samples_env_params) to preserve the current unknown-agent
behavior.

---

Duplicate comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 153-155: The constraint-failure branch in the gym step logic
returns a fixed one-element observation, which can change the observation shape
when agent_metrics normally has multiple values. Update the early return in the
step method that performs the constraint_check so it builds a failure
observation with the same width as the normal observation from agent_metrics,
while still returning the failure reward and done flag.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 998335ea-88c1-4486-8001-0d885dc9d775

📥 Commits

Reviewing files that changed from the base of the PR and between 5790ab9 and 7fec7f5.

📒 Files selected for processing (9)
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/models/workload.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py

Comment thread src/cloudai/_core/test_scenario.py
Comment thread src/cloudai/cli/handlers.py
Comment thread src/cloudai/configurator/env_params.py
An env_params entry only reclassifies a list-valued cmd_args sweep as
env-sampled; a scalar is already fixed, so annotating it is a meaningless
label. Previously such an annotation was tolerated as a silent no-op, which
let it slip through parse-time validation and inconsistently trip (or not)
the downstream "no agent will sample them" check depending on run mode.

Reject it where the contract lives - TestDefinition.validate_env_params -
so the failure is immediate and mode-independent. EnvParams.from_test's
non-list guard becomes defensive (parse-time now guarantees lists); the
post-overlay path already drops env_params before re-validating, so concrete
scalar draws are unaffected.

Extract the per-field checks into a helper to keep the validator under the
complexity limit, and update tests: scalar annotations now assert rejection
instead of no-op tolerance.
Comment on lines -189 to +193
for key, value in action.items():

def _apply(key: str, value: Any) -> None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'd suggest reverting the changes with the _apply function (local functions are a troublesome thing during debugging) and rather replace the previous root

for key, value in action.items():

with

full_action = action | (env_params or {})
for key, value in full_action.items():

it makes the delta of the PR quite shorter + no inner function

new_tr.test = tdef
if "NUM_NODES" in action:
new_tr.num_nodes = action["NUM_NODES"]
new_tr.current_env_params = dict(env_params or {})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
new_tr.current_env_params = dict(env_params or {})
new_tr.current_env_params = env_params or {}

action=action,
reward=cached_result.reward,
observation=cached_result.observation,
env_params=dict(self.test_run.current_env_params),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

and I see this (to my understanding) redundant dict(...) call across entire PR. is there really a need for it?

Suggested change
env_params=dict(self.test_run.current_env_params),
env_params=self.test_run.current_env_params,

Comment on lines +177 to +178
# The test_run rebuild above drops the sample; restore it so the entry, cache key, and env.csv match.
self.test_run.current_env_params = new_tr.current_env_params

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the same line in src/cloudai/_core/test_scenario.py:223. If it's indeed a duplicate - I would vote for removing this one and keeping one in scenario

Comment on lines +296 to 299
if not self._values_match_exact(entry.action, action):
continue
if self._values_match_exact(entry.env_params, current_env_params):
return entry

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit again, but I'd like to push a bit for readibility. we sacrifice a few ms for validation but I believe it's just mush easier to read

Suggested change
if not self._values_match_exact(entry.action, action):
continue
if self._values_match_exact(entry.env_params, current_env_params):
return entry
action_match = self._values_match_exact(entry.action, action)
env_params_match = self._values_match_exact(entry.env_params, current_env_params)
if action_match and env_params_match:
return entry

seed: int

@classmethod
def from_test(cls, test: Any) -> Optional["EnvParams"]:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it really Any?

Comment thread src/cloudai/models/workload.py Outdated
if isinstance(value, list) or (isinstance(value, dict) and check_dict(value, path)):
# env_params lists are sampled by the env, not searched by the agent, so they
# are not action-space dimensions and must not make a run count as DSE.
if skip_env_params and path.split(".", 1)[0] in self.env_params:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I would do a follow-up later to support full paths cmd_args.foo.bar or anything.abc.def meaning any nested attribute starting from TestDefinition. So that all parsing feature act consistently

just to be clear - don't need to be this PR but highly appreciated not to be prolonged. I can take it myself as well

return {name: param.draw(random.Random(f"{self.seed}:{name}:{trial}")) for name, param in self.params.items()}


class EnvParamsSink:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant class definition: no state, no polymorphism

writer.writerow([step, sample])


def validate_dse_env_params(test_scenario: "TestScenario") -> None:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if you could just merge this validation with TestDefinition.validate_env_params (eliminating this one at all and calling it explicitly from handlers.py)

@rutayan-nv

Copy link
Copy Markdown
Contributor Author

Follow-up gap: single-candidate ([1]) env_params still slip through

This PR rejects a scalar env_params target ("nothing to sample"), but a single-element list target (cmd_args.osl = [1] + [env_params.osl]) currently passes: validate_env_params only checks "is a non-empty list", not "≥ 2 candidates".

Semantically [1] is "sample from {1}" = always 1 = a constant draw — exactly the same no-op this PR is meant to kill, just spelled as a list. So it's the same class of bug leaking through.

Consistent fix: require an env_params target to be a candidate list of ≥ 2 values (a single candidate is a no-op, like a scalar). This also lines up with the parse-time canonicalization proposed in #946 (collapse non-env [x]x): once that exists, a [1] env_param naturally reduces to the already-rejected scalar case.

Filing as a follow-up rather than expanding this PR's scope.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants