fix(aiconfig): unwrap single-value list dims in standalone command-gen by rutayan-nv · Pull Request #938 · NVIDIA/cloudai

rutayan-nv · 2026-06-21T23:07:31Z

Summary

The Aiconfigurator standalone command-gen rendered single-value parallelism dims — declared as one-element lists in TOML (e.g. p_pp = [1]) — as the literal string [1]. simple_predictor.py parses those args as int and rejected "[1]", so every predictor invocation failed and the workload emitted empty reports (all -1 observations).

This adds a _scalar() helper that unwraps one-element lists to their scalar and raises on multi-element lists (an unresolved sweep leaking into command generation). Applied to the disagg, agg, and isl/osl args.

Independent of the RL/Gym enablement stack — a standalone correctness fix for the Aiconfigurator workload. It was surfaced while running aiconfigurator disagg RL end-to-end: the run only produced real metrics after this fix.

Test plan

test_gen_exec_command_unwraps_single_value_list_dims — p_pp=[1] renders as --p-pp 1, never [1]
test_gen_exec_command_rejects_unresolved_sweep — a multi-element list raises
existing standalone command-gen tests stay green
e2e: aiconfigurator disagg RL (ppo + dqn) now produces real metrics in trajectory.csv (was all -1 before)

coderabbitai · 2026-06-21T23:07:42Z

📝 Walkthrough

Walkthrough

Adds env-parameter sampling and validation, wires env params into scenario handling, CLI validation, CloudAIGym caching and trajectories, introduces a Gymnasium adapter, and tightens Aiconfig command generation to require resolved scalars.

Changes

Env parameter sampling, workload rules, and validation

Layer / File(s)	Summary
Env-param models and workload rules `src/cloudai/configurator/env_params.py`, `src/cloudai/models/workload.py`, `tests/test_env_params.py`	Defines env-parameter specs, sampling, CSV persistence, and DSE validation; adds workload-level env-parameter annotations and classification rules; and covers the new models and validators with tests.

Scenario overlay and CLI validation

Layer / File(s)	Summary
Scenario overlay and CLI validation `src/cloudai/_core/test_scenario.py`, `src/cloudai/cli/handlers.py`, `tests/test_handlers.py`	Adds env-parameter state to test runs, applies env overlays during parameter updates, validates parsed scenarios in CLI handlers, and covers the new validation paths in handler tests.

Gymnasium adapter and exports

Layer / File(s)	Summary
Gymnasium adapter and exports `pyproject.toml`, `src/cloudai/util/lazy_imports.py`, `src/cloudai/configurator/base_agent.py`, `src/cloudai/configurator/gymnasium_adapter.py`, `src/cloudai/configurator/__init__.py`, `src/cloudai/core.py`, `tests/test_gymnasium_adapter_contract.py`	Adds lazy gymnasium imports, the Gymnasium adapter, optional RL dependencies, and public exports for adapter and structured-observation types, with contract tests for action mapping and observation shaping.

Env sampling in CloudAIGym

Layer / File(s)	Summary
Env sampling in CloudAIGym `src/cloudai/configurator/cloudai_gym.py`, `tests/test_cloudaigym.py`	Adds per-step env-parameter sampling, env-aware caching, env.csv persistence, observation-shape updates, and trajectory entries that carry sampled env parameters, with integration tests for cache and CSV behavior.

Aiconfig scalar validation

Layer / File(s)	Summary
Aiconfig scalar validation `src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py`, `tests/workloads/aiconfig/test_command_gen_strategy_standalone.py`	Adds a scalar-collapse helper for command generation, applies it to all scalar dimension arguments in agg and disagg modes, and adds tests for unwrapping and unresolved sweep rejection.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested reviewers

srivatsankrishnan
jeffnvidia
amaslenn

Poem

🐇 I hopped through env params, neat and clear,
Gym got a mask of helpers here.
One value? Hoppity, in it goes,
Too many? The command generator knows.
With caches, CSVs, and scalars in line,
This rabbit approves — the trail is fine!

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main standalone command-generation fix and matches the changeset.
Description check	✅ Passed	The description matches the changeset and explains the same standalone command-gen fix and tests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands.}

Make env_params a first-class part of CloudAIGymEnv trial identity so the trajectory cache keys on (action, env_params) rather than action alone, fixing the domain-randomization correctness bug where the same action under a different env_params sample returned a stale reward. - Cache key now includes env_params; cache-key tests pin the contract (formerly the TDD-red specs of NVIDIA#900, folded in here). - Keep env.csv and trajectory.csv 1:1 step-aligned: a single TrajectoryEntry sinks both files coherently, including on constraint failure. - Reject env_params on non-DSE jobs; reject non-finite / negative weights. - Add cache-hit + declared-env_params integration coverage. Folds the test-only PR NVIDIA#900 (cache-key TDD) into this PR so the stack has no permanently-red standalone PR.

…rom search space Make env_params a thin annotation over cmd_args fields instead of a holder of candidate values. Candidate values live in cmd_args (the single source of truth, exactly like an action-space dimension); env_params.<name> only marks a field as env-sampled and carries optional sampling weights, never the values. - EnvParamSpec drops `values`; validates weights (finite, non-negative, sum=1.0). - Sampler/observer resolve candidate lists from cmd_args; scalar knobs are no-ops. - TestDefinition.validate_env_params cross-checks annotations against cmd_args (key must be a real field; weights require a list and must match its length). - Exclude env_params keys from both param_space and is_dse_job: an env-sampled list is not a search dimension, so an env-params-only workload is not a DSE job. - validate_dse_env_params rejects env_params on non-DSE runs and on grid_search (exhaustive search cannot exploit per-trial randomization). - Scrub private-implementation references from public docstrings. - Unit tests use generic Atari Breakout semantics (ball_speed / paddle_width).

…pyright - validate_env_params: reject structured (non-leaf) cmd_args targets. The observer cannot sample them, yet param_space/is_dse_job exclude the whole key, which would silently drop nested action dimensions. - CloudAIGymEnv.write_trajectory: rebind the env.csv sink to the current iteration path before each write, so env.csv stays 1:1 aligned with trajectory.csv when the env instance is reused across iterations. - test_env_params: assert the unknown-field rejection via model_validate so the negative test no longer trips pyright's call-arg check (CI Linting fix); add a structured-target rejection test.

An unweighted env_params spec skipped the candidate-list check, so an empty cmd_args list (e.g. ball_speed = []) passed validation and only failed later in EnvParamsSampler.sample() via rng.choice([]) (IndexError). Guard against an empty candidate list in validate_env_params so the error surfaces at TestDefinition build time. Addresses CodeRabbit feedback.

…ms value objects Replace the EnvParamsSampler class and the StepObserver/EnvParamsObserver indirection with two frozen dataclasses: EnvParam (one resolved knob: candidates, optional weights, single draw) and EnvParams (per-run knobs + seed, built via from_test, sampled per trial). The sampling RNG lives in the env: step() draws this trial's values and hands concrete values to TestRun.apply_params_set(action, env_params=...), which overlays action and sample through one deterministic path. Centralize the cmd_args -> env_params lookup in TestDefinition.is_env_sampled and access current_env_params directly. Expand EnvParam/EnvParams unit tests to cover draw, from_test, sample, and immutability.

Drop the EnvParamsSink Protocol + CsvSink pair (and runtime_checkable) for a single concrete EnvParamsSink, built unconditionally in CloudAIGymEnv. The sink is now stateless: write() takes the record path per call and skips empty samples, so non-DR runs write nothing and write_trajectory needs no branch. Derive both records from a new iteration_dir property and expose the env record via the env_params_record_path property (was _env_csv_path), keeping env.csv and trajectory.csv step-aligned without coupling the name to CSV.

…ty flag Replace the hardcoded `agent == "grid_search"` check with a BaseAgent.samples_env_params capability flag (opt-in, defaults False). Only agents whose search consumes per-trial env_params sampling set it True; enumerating/surrogate agents leave it False, so a config that declares env_params for an agent that would ignore them is rejected up front instead of silently no-op'ing. New agents answer for themselves with no string to maintain. Relocate validate_dse_env_params out of the CLI handlers into configurator/env_params.py next to the logic it guards, looking the agent up via the Registry. Unknown agents are deferred to the dedicated agent-resolution error rather than masked here. Keep all public-facing comments, docstrings, and the error message generic (no internal agent names). Cover the full validator matrix, including the unknown-agent deferral.

Compress multi-line inline comments down to the single non-obvious rationale (or drop them where the code already speaks), per the self-documenting-code principle. Public API docstrings and test intent comments are left intact.

apply_params_set overlays sampled scalar draws onto cmd_args, then reconstructs the TestDefinition to validate the applied action values. That pass re-ran validate_env_params, which rejects a weighted env_param whose cmd_args target is no longer a candidate list - exactly what the overlay produces. env_params is already validated at parse time, so drop it from the validation-only dump. Adds a regression test covering a weighted env_param's scalar draw.

rutayan-nv · 2026-06-25T23:16:29Z

Rebased onto #930 (rpatro/gymnasium-adapter) so this now sits on top of the env_params/gym stack: #901 -> #928 -> #930 -> #938. Base stays main (the parent branch lives on the fork, so GitHub can't use it as a PR base), so the diff here is cumulative and will shrink to just the aiconfig command-gen fix once #930 lands. Review the top commit b0731e7e for this PR's actual change. This pairing (#930 + #938) is what cloudaix #612 pins to for the aiconfigurator RL release. @podkidyshev

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 88-91: The constraint-check failure path in cloudai_gym should
return an observation with the same length as the metric-sized space used by
define_observation_space() and the Gymnasium adapter, rather than a fixed
single-value failure signal. Update the failure branch in the relevant
observation-building logic so it produces a vector matching
len(self.test_run.test.agent_metrics) (with at least one element), keeping the
shape consistent with the success path and the adapter’s Box expectations.

In `@tests/workloads/aiconfig/test_command_gen_strategy_standalone.py`:
- Around line 89-161: The current tests only verify unwrapping and rejection
behavior for disagg dimensions, but they miss the newly normalized agg/isl/osl
paths in AiconfiguratorStandaloneCommandGenStrategy. Add a focused test or
parameterized coverage in test_gen_exec_command_unwraps_single_value_list_dims
and/or test_gen_exec_command_rejects_unresolved_sweep that exercises isl, osl,
or agg so the command generation logic in standalone_command_gen_strategy is
validated for those fields too, not just disagg.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 5370e937-294a-4a5e-bf3c-7c926ccd9d64

📥 Commits

Reviewing files that changed from the base of the PR and between 6bc3247 and b0731e7.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (17)

pyproject.toml
src/cloudai/_core/test_scenario.py
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/base_agent.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/env_params.py
src/cloudai/configurator/gymnasium_adapter.py
src/cloudai/core.py
src/cloudai/models/workload.py
src/cloudai/util/lazy_imports.py
src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py
tests/test_cloudaigym.py
tests/test_env_params.py
tests/test_gymnasium_adapter_contract.py
tests/test_handlers.py
tests/workloads/aiconfig/test_command_gen_strategy_standalone.py

An env_params entry only reclassifies a list-valued cmd_args sweep as env-sampled; a scalar is already fixed, so annotating it is a meaningless label. Previously such an annotation was tolerated as a silent no-op, which let it slip through parse-time validation and inconsistently trip (or not) the downstream "no agent will sample them" check depending on run mode. Reject it where the contract lives - TestDefinition.validate_env_params - so the failure is immediate and mode-independent. EnvParams.from_test's non-list guard becomes defensive (parse-time now guarantees lists); the post-overlay path already drops env_params before re-validating, so concrete scalar draws are unaffected. Extract the per-field checks into a helper to keep the validator under the complexity limit, and update tests: scalar annotations now assert rejection instead of no-op tolerance.

…otocol Add ObsLeafDescriptor (a self-describing observation leaf: "box" of width dim, or "discrete" of size n) and a StructuredObservation Protocol that documents the optional env hooks structured_observation_descriptors() and encode_observation(). These let an env expose a named, per-leaf observation so adapters (e.g. GymnasiumAdapter) can build the matching gymnasium spaces.Dict; the hooks are duck-typed, so envs need not subclass. Both exported via cloudai.core.

…ejection tests Negative tests pass an extra kwarg and an out-of-Literal kind to assert ValidationError; mark the deliberate type violations with type: ignore.

Wrap a CloudAI BaseGym as a gymnasium.Env-shaped object: a spaces.Dict of Discrete (list params) and Box (ContinuousSpace) actions over the tunable params with fixed (single-value) params injected each step; observations as either a flat float32 Box or, when the env opts in via the structured-obs hooks, a spaces.Dict of per-leaf ObsLeafDescriptor subspaces. Continuous dtype="int" params are quantized (rounded/clamped) at decode_action so the trajectory cache key collapses float jitter. The adapter is a pure pass-through over test_run.step (never mutates it), so contextual-bandit rollouts that reset() per trial keep a monotonic trial index. gymnasium is an optional dependency lazy-imported behind the new [rl] extra (also added to dev); CloudAIGymEnv.define_observation_space() now returns one slot per agent metric so adapters get the right Box shape. Exported via cloudai.core. Caller-contract tests pin the step-monotonicity, observation pass-through, continuous-quantization, and structured-obs invariants.

…ature step() delegates to decode_action(dict[str, Any]) and exists precisely to round float/continuous policy actions to ints; widen its parameter type from dict[str, int] to dict[str, Any] to match.

…reserve traceback on DSE re-raise - _as_obs_array(): assert encoded keys match descriptors before coercion (reuses _assert_keys, same guard as decode_action/step_raw) and materialize output by descriptor keys to avoid KeyError on extra keys and silent partial observations on missing keys. - handlers.py: re-raise the captured hard-fail with its original traceback. Addresses CodeRabbit findings on NVIDIA#930.

…ngleton Replace the bespoke _import_gymnasium() in-method seam with the canonical lazy.gymnasium / lazy.np properties; addresses the in-method-import review concern. No behavior change — gymnasium stays an optional [rl] extra.

The lazy.gymnasium.spaces refactor gives the adapter precise gymnasium types instead of Any, which surfaced two latent issues the scoped pre-commit run missed: - pyright now sees Space[Any]/Dict in the adapter contract test, so concrete attribute access (.low/.high/.n/.spaces) is flagged. Narrow via local bindings + isinstance before access. - lazy_imports.py now has a 2026 commit in its history, so the ci_only copyright check requires the year range 2025-2026.

Inherit from gymnasium.Env (guarded import, falls back to object when the optional [rl] extra is absent) so ecosystem tooling that performs isinstance checks (e.g. Stable-Baselines3) accepts the adapter. - Use the TYPE_CHECKING import form so pyright sees a concrete base class while runtime keeps the optional-dependency fallback. - Drop ClassVar on metadata to match Env's attribute shape (noqa RUF012). - Rename the inner-env accessor unwrapped -> cloudai_env; gymnasium's Env.unwrapped (returns self) is the correct base-env semantics, and the old override returned a non-Env (BaseGym), which would mislead ecosystem code calling .unwrapped.

Inheriting gymnasium.Env widened the static type of action_space from the concrete spaces.Dict the adapter builds to the base spaces.Space, which has no __getitem__. Cast at the test call site to restore subspace indexing for pyright (runtime is unchanged; action_space is always a Dict).

…rse of decode_action decode_action had no public inverse, so consumers needing value->index encoding (e.g. RLlib warm-start / behavioral cloning) reached into the private _tunable_params dict. When ContinuousSpace support split that internal, those consumers broke with AttributeError. encode_action closes the contract: discrete values map to their candidate index, continuous values wrap into the clamped float32 Box array, so decode_action(encode_action(v)) == v for any native v. Adds round-trip contract tests pinning the invariant and rejection of non-candidate values / key mismatches.

…tinuousSpace The GymnasiumAdapter's continuous-action path depends on ContinuousSpace, which ships separately. Until then nothing constructs a ContinuousSpace, so the continuous branches here are unreachable and the only effect of the import is an ImportError at module load. Drop the continuous import, _continuous_params, the Box action mapping, and decode/encode continuous handling so the adapter builds and ships standalone over discrete + structured-observation support. The continuous support rejoins when ContinuousSpace lands.

Disagg/Agg parallelism dims are typed Union[int, List[int]] so a TOML may express a sweep. DSE resolves tunable dims to scalars, but non-tunable single-value dims (e.g. p_pp = [1]) reached command-gen as one-element lists and were rendered as "[1]". simple_predictor.py parses these as int and rejected the value, yielding empty reports (all -1 observations). _scalar() unwraps one-element lists and raises on multi-element lists (an unresolved sweep leaking into command-gen). Applied to disagg, agg, and isl/osl. Adds tests for the unwrap and the unresolved-sweep guard.

coderabbitai

♻️ Duplicate comments (1)

tests/workloads/aiconfig/test_command_gen_strategy_standalone.py (1)
89-161: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Add coverage for the agg / isl / osl normalization paths.

These tests still only exercise disagg, but src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py also changed --isl, --osl, and all agg scalar flags on Lines 89-113. A regression there would still pass this suite, so please add one focused agg case and one isl/osl normalization assertion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/workloads/aiconfig/test_command_gen_strategy_standalone.py` around
lines 89 - 161, The current tests only cover disagg list unwrapping and
unresolved sweep rejection, but they miss the new normalization logic in
AiconfiguratorStandaloneCommandGenStrategy for agg scalar flags plus --isl and
--osl. Add a focused test that exercises the agg path and asserts those scalar
flags render correctly, and add an assertion that isl/osl normalization still
produces the expected command output so regressions in gen_exec_command are
caught.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@tests/workloads/aiconfig/test_command_gen_strategy_standalone.py`:
- Around line 89-161: The current tests only cover disagg list unwrapping and
unresolved sweep rejection, but they miss the new normalization logic in
AiconfiguratorStandaloneCommandGenStrategy for agg scalar flags plus --isl and
--osl. Add a focused test that exercises the agg path and asserts those scalar
flags render correctly, and add an assertion that isl/osl normalization still
produces the expected command output so regressions in gen_exec_command are
caught.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4fbf31e4-1de3-4a73-84f7-626d2014f21f

📥 Commits

Reviewing files that changed from the base of the PR and between b0731e7 and b6b739e.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (13)

pyproject.toml
src/cloudai/cli/handlers.py
src/cloudai/configurator/__init__.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/env_params.py
src/cloudai/configurator/gymnasium_adapter.py
src/cloudai/core.py
src/cloudai/models/workload.py
src/cloudai/util/lazy_imports.py
src/cloudai/workloads/aiconfig/standalone_command_gen_strategy.py
tests/test_env_params.py
tests/test_gymnasium_adapter_contract.py
tests/workloads/aiconfig/test_command_gen_strategy_standalone.py

rutayan-nv requested review from jeffnvidia, podkidyshev and srivatsankrishnan as code owners June 21, 2026 23:07

rutayan-nv added 9 commits June 25, 2026 15:50

rutayan-nv force-pushed the rpatro/aiconfig-cmdgen-scalar branch from 6bc3247 to b0731e7 Compare June 25, 2026 23:16

coderabbitai Bot reviewed Jun 25, 2026

View reviewed changes

Comment thread src/cloudai/configurator/cloudai_gym.py

Comment thread tests/workloads/aiconfig/test_command_gen_strategy_standalone.py

rutayan-nv added 14 commits June 26, 2026 11:12

test(env-params): suppress pyright on intentional ObsLeafDescriptor r…

584f145

…ejection tests Negative tests pass an extra kwarg and an out-of-Literal kind to assert ValidationError; mark the deliberate type violations with type: ignore.

fix(gymnasium-adapter): accept continuous actions in step() type sign…

6c2452f

…ature step() delegates to decode_action(dict[str, Any]) and exists precisely to round float/continuous policy actions to ints; widen its parameter type from dict[str, int] to dict[str, Any] to match.

style(configurator): ruff format gymnasium_adapter.py

2e2d431

rutayan-nv force-pushed the rpatro/aiconfig-cmdgen-scalar branch from b0731e7 to b6b739e Compare June 26, 2026 15:41

coderabbitai Bot reviewed Jun 26, 2026

View reviewed changes

rutayan-nv mentioned this pull request Jun 26, 2026

DSE: single-element list dims ([1]) aren't scalarized before command-gen (no central invariant) #946

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(aiconfig): unwrap single-value list dims in standalone command-gen#938

fix(aiconfig): unwrap single-value list dims in standalone command-gen#938
rutayan-nv wants to merge 23 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/aiconfig-cmdgen-scalar

rutayan-nv commented Jun 21, 2026

Uh oh!

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

rutayan-nv commented Jun 25, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

rutayan-nv commented Jun 21, 2026

Summary

Test plan

Uh oh!

coderabbitai Bot commented Jun 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

rutayan-nv commented Jun 25, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 21, 2026 •

edited

Loading