Skip to content

feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928

Open
rutayan-nv wants to merge 12 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/obs-leaf-descriptor
Open

feat(configurator): add ObsLeafDescriptor + structured-observation protocol#928
rutayan-nv wants to merge 12 commits into
NVIDIA:mainfrom
rutayan-nv:rpatro/obs-leaf-descriptor

Conversation

@rutayan-nv

Copy link
Copy Markdown
Contributor

Issue

  • CloudAI envs only expose a flat observation vector; there is no way to describe a named, per-leaf observation so an adapter can build a structured (Dict) observation space.

Fix

  • Add ObsLeafDescriptor(kind="box"|"discrete", dim, n) and a StructuredObservation Protocol documenting the optional, duck-typed env hooks structured_observation_descriptors() / encode_observation(). Adapters read these to build a matching gymnasium.spaces.Dict; envs need not subclass. Both exported via cloudai.core.

Testing

  • tests/test_env_params.py (+3 tests): box defaults, discrete requires n >= 1, bad-dim / extra-field / unknown-kind rejection. ruff + pyright + vulture clean.

Stack: #901#927 (ContinuousSpace) ← this. Second of the gymnasium-adapter upstreaming; PR3 ports the adapter that consumes both primitives.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds DSE domain-randomization infrastructure: a ContinuousSpace action-space model, EnvParamSpec/EnvParamsSampler/CsvSink/EnvParamsObserver primitives in env_params.py, observer hooks in CloudAIGymEnv.step() and cache-key logic, a default BaseAgent.run() loop with termination control via select_action() returning None, and structured failure reporting in handle_dse_job via _record_run_failure and dse_failure.txt.

Changes

DSE Domain Randomization and Error Handling

Layer / File(s) Summary
Action space and env-params schemas and infrastructure
src/cloudai/_core/action_space.py, src/cloudai/configurator/env_params.py
Defines ContinuousSpace for action-space bounds with strict validation (low < high), EnvParamSpec for categorical parameter specs with optional validated weights, ObsLeafDescriptor for observation leaf structure, StructuredObservation protocol for observation encoding, EnvParamsSampler for deterministic per-trial categorical sampling with RNG isolation, EnvParamsSink protocol and CsvSink for step-aligned CSV persistence, and StepObserver protocol with EnvParamsObserver for per-step sampling and recording.
TestRun and TestDefinition extensions for env-params
src/cloudai/_core/test_scenario.py, src/cloudai/models/workload.py
Extends TestRun with current_env_params: dict[str, Any] field and increment_step() -> int method. Extends TestDefinition with env_params: dict[str, EnvParamSpec] field defaulting to empty dict, documenting sampling per trial, persistence to env.csv, and trajectory cache-key incorporation.
Public type exports via core.py
src/cloudai/core.py
Re-exports ContinuousSpace, ObsLeafDescriptor, and StructuredObservation from internal modules by adding imports and updating the __all__ list.
Test coverage for action-space and env-params primitives
tests/test_action_space.py, tests/test_env_params.py, tests/test_test_scenario.py
Comprehensive pytest coverage for ContinuousSpace dtype defaults and bounds validation, EnvParamSpec weight validation, EnvParamsSampler determinism and per-parameter independence, CsvSink file I/O and step validation, EnvParamsObserver current_env_params updates and sink persistence, ObsLeafDescriptor box/discrete validation rules, and TestRun.increment_step() monotonic advancement.
CloudAIGymEnv observer wiring, TrajectoryEntry env_params, and cache key change
src/cloudai/configurator/cloudai_gym.py
Adds env_params field to TrajectoryEntry, wires EnvParamsObserver initialization and before_step/after_step hooks into CloudAIGymEnv.step() for both cached and non-cached branches, increments test run step before cache lookup, and updates get_cached_trajectory_result to require exact match of both action and current_env_params with back-compat for missing env_params.
Test coverage for CloudAIGymEnv observer integration and cache behavior
tests/test_cloudaigym.py
Tests step advancement before output_path computation, cached trajectory row step coupling to advance ordering, cache-miss when env_params differ, cache-hit when action and current_env_params match, back-compat cache-hit when neither has env_params, workload re-execution on env_params change, env.csv/trajectory.csv step alignment, observer execution on cache-hit with declared env_params, and observer absence when env_params not declared.
BaseAgent.run() default exploration loop
src/cloudai/configurator/base_agent.py
Updates abstract select_action signature to optionally return None for early termination. Adds logging import and concrete run() method that resets the environment, iterates select_action/env.step/update_policy up to max_steps, breaks when select_action returns None, logs step progress, and returns 0 on completion. Docstring defines failure contract distinguishing recoverable return codes from unexpected exceptions.
handle_dse_job refactor: agent.run() delegation and failure reporting
src/cloudai/cli/handlers.py
Refactors handle_dse_job to instantiate the configured agent and call agent.run() per TestRun, accumulates recoverable non-zero return codes via bitwise OR while capturing unexpected exceptions in run_error. Adds _record_run_failure helper to write dse_failure.txt with formatted traceback and OSError handling. Updates generate_reports signature to accept optional error parameter and calls _record_run_failure before normal report generation. Adds validate_dse_env_params to reject non-DSE test runs declaring env_params, wired into handle_dry_run_and_run and verify_test_scenarios.
Test coverage for handle_dse_job, validate_dse_env_params, and error reporting
tests/test_handlers.py
Introduces CustomRunStubAgent with configurable run() behavior, custom_run_agent_name fixture for agent registry management, and comprehensive tests covering agent.run() invocation, non-zero return code propagation and accumulation, exception propagation as hard failures aborting remaining runs, dse_failure.txt generation on hard-fail with exception type/message, and validate_dse_env_params/verify_test_scenarios enforcement that env_params are allowed only for DSE jobs.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • srivatsankrishnan
  • jeffnvidia
  • podkidyshev

🐰 Hoppity-hop through the action space wide,
env_params sampled per trial — what a ride!
Observers before_step, CsvSink writes the row,
Cache keys match action AND env, don't you know.
On hard fail, dse_failure.txt tells the tale —
A rabbit reports errors without fail! 🌿

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the primary change: adding ObsLeafDescriptor and StructuredObservation protocol for structured observation support, which is the core feature of this changeset.
Description check ✅ Passed The PR description clearly explains the issue (flat observation vectors lack structure), the fix (new descriptor and protocol with env hooks), testing approach, and context within the upstreaming stack. It is directly related to the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure path breaks env.csvtrajectory.csv step alignment.

Line 146 triggers observer.before_step(...) (which persists env.csv) before constraints are checked, but Line 170 returns early without appending a trajectory row. For env-param-enabled runs, this creates unmatched env.csv rows.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint check failure path breaks the alignment between env.csv and
trajectory.csv because observer.before_step() is called before the constraint
validation, writing an env.csv entry that never gets a matching trajectory.csv
row when the function returns early. Move the constraint check (the if not
self.test_run.test.constraint_check() block) to execute before calling
observer.before_step() at the beginning of the step. This ensures that if
constraints fail, no env.csv entry is written, keeping both CSV files aligned.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop captures the `done` flag from the
env.step(action) call but never checks it, causing the loop to continue
executing after the environment signals termination. Add a conditional check
after the update_policy call to break out of the loop when `done` evaluates to
True, ensuring the agent stops running trials when the episode terminates as
intended.

In `@src/cloudai/configurator/env_params.py`:
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field to
expose the box data type contract. Add an optional dtype field to
ObsLeafDescriptor that defaults to "float32" for box leaves, allowing adapters
to consume dtype metadata from the descriptor schema. Update the _validate
method to ensure dtype is appropriately set for box-type leaves and validate
that the dtype value is valid when specified.
- Around line 113-115: The `encode_observation()` method signature in
StructuredObservation diverges from the documented public contract. The
documented contract specifies a zero-arg hook, but the current implementation
includes an `observation: list` parameter. Remove the `observation` parameter
from the `encode_observation()` method signature to align it with the documented
public contract and ensure compatibility with duck-typed environments and
adapters that expect a zero-arg method.
- Line 2: The copyright header on line 2 of
src/cloudai/configurator/env_params.py does not match the CI header-check test
expectations for the year format. Update the copyright year string in the header
from the current "2024-2026" format to match the exact year string that the
repository's header-check test expects for this file (which should be "2026"
based on the CI requirement). Ensure the format precisely matches what the
enforced header check validates.

In `@tests/test_action_space.py`:
- Around line 43-50: Add type-check suppressions to the test functions that
intentionally pass invalid arguments to validate runtime error handling. For the
test_continuous_space_rejects_unknown_dtype function, suppress the type-check
error on the line where ContinuousSpace is instantiated with the invalid
"double" dtype argument. For the test_continuous_space_forbids_extra_fields
function, suppress the type-check error on the line where ContinuousSpace is
instantiated with the unexpected step parameter. These suppressions allow the
runtime validation assertions to execute without being blocked by static type
checkers like pyright, which would otherwise reject these intentionally invalid
arguments.

In `@tests/test_env_params.py`:
- Line 2: The copyright header on line 2 contains an incorrect year range that
fails the CI header check. Update the copyright year in the NVIDIA CORPORATION &
AFFILIATES copyright line from the current range (2025-2026) to only 2026 to
match the CI expectation and unblock the header validation.
- Around line 143-148: Replace the three direct ObsLeafDescriptor constructor
calls with model_validate() calls to satisfy pyright type checking. For the
first validation test (with the dim=0 case and match pattern), call
model_validate() instead of the constructor directly. For the second test (with
the unexpected=1 parameter), similarly use model_validate(). For the third test,
use model_validate() AND correct the test data from kind="categorical" to a
valid literal value that ObsLeafDescriptor accepts, which should be either "box"
or "discrete" based on the kind validation constraints.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint check failure path breaks the alignment
between env.csv and trajectory.csv because observer.before_step() is called
before the constraint validation, writing an env.csv entry that never gets a
matching trajectory.csv row when the function returns early. Move the constraint
check (the if not self.test_run.test.constraint_check() block) to execute before
calling observer.before_step() at the beginning of the step. This ensures that
if constraints fail, no env.csv entry is written, keeping both CSV files
aligned.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ed2eee75-0fae-450f-a244-ee6a41e8986a

📥 Commits

Reviewing files that changed from the base of the PR and between 1b0e8cc and 7c6a1ab.

📒 Files selected for processing (13)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

Comment thread src/cloudai/configurator/base_agent.py
Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread src/cloudai/configurator/env_params.py
Comment thread src/cloudai/configurator/env_params.py
Comment thread tests/test_action_space.py Outdated
Comment thread tests/test_env_params.py Outdated
Comment thread tests/test_env_params.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cloudai/configurator/cloudai_gym.py (1)

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure early return skips trajectory persistence and after_step hooks.

After Line 145-147, observers run before_step. But Line 168-170 returns immediately on failed constraints, so no trajectory row is written and after_step is never called. With env-param observers, this can leave env.csv and trajectory.csv out of sync for that step.

Proposed fix
         if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
             logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint-failure return path at lines 168-170 skips calling write_trajectory
and observer.after_step, creating a mismatch with the before_step call at line
146-147. Before the early return statement that handles the failed constraint
check, add a trajectory entry write using write_trajectory with appropriate
step, action, and constraint-failure reward values, then call
observer.after_step for all observers in self.observers (similar to the pattern
used in the cached_result branch at lines 164-169) to ensure trajectory
persistence and observer symmetry are maintained.
♻️ Duplicate comments (7)
src/cloudai/configurator/env_params.py (3)

113-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

StructuredObservation signatures diverge from the declared public contract.

Line 113-115 currently uses Optional[...] for descriptors and requires an observation arg in encode_observation, but the documented contract is zero-arg hooks returning a descriptor dict and encoded leaves.

Proposed fix
-    def structured_observation_descriptors(self) -> Optional[Dict[str, ObsLeafDescriptor]]: ...
-    def encode_observation(self, observation: list) -> Dict[str, Any]: ...
+    def structured_observation_descriptors(self) -> Dict[str, ObsLeafDescriptor]: ...
+    def encode_observation(self) -> Dict[str, Any]: ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 113 - 115, The method
signatures for structured_observation_descriptors and encode_observation in the
StructuredObservation class do not match the documented public contract. Fix
structured_observation_descriptors to return Dict[str, ObsLeafDescriptor]
(removing the Optional wrapper) to indicate it always returns a descriptor dict,
and remove the observation parameter from encode_observation to make it a
zero-arg hook that returns encoded leaves, aligning both methods with the
declared contract for these zero-arg hook methods.

2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the SPDX copyright year string to unblock CI.

Line 2 still uses 2024-2026, but CI expects the exact 2026 string for this file.

Proposed fix
-# Copyright (c) 2024-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Based on learnings, header year formatting must match the repository’s enforced checker exactly.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` at line 2, The copyright header on
line 2 uses the year range "2024-2026" but the CI checker requires the exact
year "2026" for this file. Update the copyright year in the first file header
comment from the range format to just "2026" to match the repository's enforced
header format checker requirements.

Sources: Learnings, Pipeline failures


87-97: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

ObsLeafDescriptor does not expose the documented box dtype contract.

Line 87-90 defines kind/dim/n only; the PR objective says box leaves default to float32. Without a descriptor dtype, adapters cannot consume that schema metadata.

Proposed fix
 class ObsLeafDescriptor(BaseModel):
@@
     kind: Literal["box", "discrete"]
     dim: int = 1
     n: Optional[int] = None
+    dtype: Literal["float32"] = "float32"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 87 - 97, The
ObsLeafDescriptor class is missing a dtype field that should be exposed as part
of the schema metadata. Add a dtype field to ObsLeafDescriptor alongside the
existing kind, dim, and n fields, with a default value of float32 for box type
observations, so that adapters can access the data type information from the
descriptor schema.
tests/test_action_space.py (1)

45-45: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Unblock pyright for intentionally-invalid constructor tests.

Line 45 and Line 50 are valid runtime-validation tests, but static typing rejects them and currently breaks CI lint.

Proposed fix
+from typing import Any, cast
@@
 def test_continuous_space_rejects_unknown_dtype() -> None:
     with pytest.raises(ValidationError):
-        ContinuousSpace(low=0.0, high=1.0, dtype="double")
+        ContinuousSpace(low=0.0, high=1.0, dtype=cast(Any, "double"))
@@
 def test_continuous_space_forbids_extra_fields() -> None:
     with pytest.raises(ValidationError):
-        ContinuousSpace(low=0.0, high=1.0, step=0.1)
+        ContinuousSpace(low=0.0, high=1.0, **cast(Any, {"step": 0.1}))

Also applies to: 50-50

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_action_space.py` at line 45, The ContinuousSpace constructor calls
at lines 45 and 50 in tests/test_action_space.py are intentionally-invalid
runtime validation tests but violate static typing contracts checked by pyright,
causing CI lint failures. Add a pyright ignore directive comment to each of
these lines (both the line 45 ContinuousSpace call with dtype="double" and the
line 50 similar construct) to allow the static type checker to skip validation
while preserving the runtime validation test behavior.

Source: Pipeline failures

tests/test_env_params.py (2)

146-148: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use model_validate(...) for these negative constructor tests so pyright passes.

Line 146 and Line 148 intentionally pass invalid payloads, but direct ObsLeafDescriptor(...) calls are rejected by the typed constructor before runtime validation, causing CI lint failure. Keep the invalid test data, but route it through model_validate({...}).

Proposed fix
 def test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields() -> None:
     with pytest.raises(ValidationError, match="dim must be"):
         ObsLeafDescriptor(kind="box", dim=0)
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="box", dim=1, unexpected=1)
+        ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1})
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="categorical", dim=1)
+        ObsLeafDescriptor.model_validate({"kind": "categorical", "dim": 1})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 146 - 148, The direct constructor
calls to ObsLeafDescriptor on lines 146 and 148 are rejected by the type checker
before runtime validation can occur, causing CI lint failures. Convert both
constructor calls to use model_validate({...}) instead, wrapping the invalid
test data (the keyword arguments) into a dictionary payload. This allows the
invalid data to pass through type checking and be caught by the runtime
validation in the pytest.raises blocks.

Source: Pipeline failures


2-2: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fix the copyright year on Line 2 to unblock CI.

Line 2 still uses 2025-2026, but CI expects this file’s second header line to be 2026 only.

Proposed fix
-# Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Based on learnings, copyright year formatting must match the repository’s header-check logic in tests/test_check_copyright_headers.py.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` at line 2, Update the copyright year in the header
line to match the CI validation requirements. Change the copyright year format
from the current `2025-2026` to `2026` only in the copyright header comment.
This aligns with the copyright header check logic that validates the formatting
across the repository, ensuring CI will pass the header validation.

Sources: Learnings, Pipeline failures

src/cloudai/configurator/base_agent.py (1)

149-161: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Honor terminal done in the default run loop.

Line 149 reads done, but the loop continues through max_steps. This can run extra steps after episode termination and feed invalid transitions into update_policy.

Proposed fix
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
+            if done:
+                break
         return 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 149 - 161, The run loop
in the base_agent.py file reads the done flag from the environment but does not
honor it to exit early. After calling update_policy with the step data (which
includes the done flag), add a check: if done is True, break out of the loop
immediately. This prevents extra steps from being executed after the episode has
terminated and avoids feeding invalid transitions into update_policy beyond the
natural episode boundary.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 145-170: The constraint-failure return path at lines 168-170 skips
calling write_trajectory and observer.after_step, creating a mismatch with the
before_step call at line 146-147. Before the early return statement that handles
the failed constraint check, add a trajectory entry write using write_trajectory
with appropriate step, action, and constraint-failure reward values, then call
observer.after_step for all observers in self.observers (similar to the pattern
used in the cached_result branch at lines 164-169) to ensure trajectory
persistence and observer symmetry are maintained.

---

Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 149-161: The run loop in the base_agent.py file reads the done
flag from the environment but does not honor it to exit early. After calling
update_policy with the step data (which includes the done flag), add a check: if
done is True, break out of the loop immediately. This prevents extra steps from
being executed after the episode has terminated and avoids feeding invalid
transitions into update_policy beyond the natural episode boundary.

In `@src/cloudai/configurator/env_params.py`:
- Around line 113-115: The method signatures for
structured_observation_descriptors and encode_observation in the
StructuredObservation class do not match the documented public contract. Fix
structured_observation_descriptors to return Dict[str, ObsLeafDescriptor]
(removing the Optional wrapper) to indicate it always returns a descriptor dict,
and remove the observation parameter from encode_observation to make it a
zero-arg hook that returns encoded leaves, aligning both methods with the
declared contract for these zero-arg hook methods.
- Line 2: The copyright header on line 2 uses the year range "2024-2026" but the
CI checker requires the exact year "2026" for this file. Update the copyright
year in the first file header comment from the range format to just "2026" to
match the repository's enforced header format checker requirements.
- Around line 87-97: The ObsLeafDescriptor class is missing a dtype field that
should be exposed as part of the schema metadata. Add a dtype field to
ObsLeafDescriptor alongside the existing kind, dim, and n fields, with a default
value of float32 for box type observations, so that adapters can access the data
type information from the descriptor schema.

In `@tests/test_action_space.py`:
- Line 45: The ContinuousSpace constructor calls at lines 45 and 50 in
tests/test_action_space.py are intentionally-invalid runtime validation tests
but violate static typing contracts checked by pyright, causing CI lint
failures. Add a pyright ignore directive comment to each of these lines (both
the line 45 ContinuousSpace call with dtype="double" and the line 50 similar
construct) to allow the static type checker to skip validation while preserving
the runtime validation test behavior.

In `@tests/test_env_params.py`:
- Around line 146-148: The direct constructor calls to ObsLeafDescriptor on
lines 146 and 148 are rejected by the type checker before runtime validation can
occur, causing CI lint failures. Convert both constructor calls to use
model_validate({...}) instead, wrapping the invalid test data (the keyword
arguments) into a dictionary payload. This allows the invalid data to pass
through type checking and be caught by the runtime validation in the
pytest.raises blocks.
- Line 2: Update the copyright year in the header line to match the CI
validation requirements. Change the copyright year format from the current
`2025-2026` to `2026` only in the copyright header comment. This aligns with the
copyright header check logic that validates the formatting across the
repository, ensuring CI will pass the header validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4dd48574-dda1-4f02-8138-c5510caa80a9

📥 Commits

Reviewing files that changed from the base of the PR and between 7c6a1ab and 5657f1e.

📒 Files selected for processing (13)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py
  • tests/test_test_scenario.py

@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from 5657f1e to 29867a8 Compare June 16, 2026 15:06

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/cloudai/configurator/env_params.py (2)

100-115: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

StructuredObservation.encode_observation signature diverges from the documented contract.

The PR objective documents a zero-arg hook encode_observation() -> dict[str, Any], but Line 115 includes an observation: list parameter. This breaks the duck-typing contract for environments implementing the protocol.

Proposed fix
-    def encode_observation(self, observation: list) -> Dict[str, Any]: ...
+    def encode_observation(self) -> Dict[str, Any]: ...
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 100 - 115, The
encode_observation method signature in the StructuredObservation Protocol class
on line 115 includes an observation: list parameter, which diverges from the
documented contract that specifies a zero-argument method. Remove the
observation: list parameter from the encode_observation method definition to
align with the documented contract and maintain proper duck-typing compatibility
with environments implementing this protocol.

74-97: ⚠️ Potential issue | 🟠 Major | ⚖️ Poor tradeoff

ObsLeafDescriptor is missing the box dtype contract.

The PR objective states box leaves default to float32, but this model exposes no dtype field, so adapters cannot consume dtype metadata from the descriptor schema. Add a dtype field with appropriate validation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/env_params.py` around lines 74 - 97, The
ObsLeafDescriptor class is missing a dtype field needed to expose data type
metadata for box leaves to adapters. Add a dtype field to the class (appropriate
type should be determined based on your dtype representation strategy, likely
Optional[str] with a default of "float32" for box kind), then update the
_validate method to enforce that dtype is present and valid when kind equals
"box", ensuring adapters can properly consume the dtype contract from the
descriptor schema.
tests/test_env_params.py (1)

142-148: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix negative-constructor tests to satisfy type checking.

Line 148 uses kind="categorical" which is not a valid Literal["box", "discrete"] value. This fails type checking before Pydantic validation runs. Use model_validate() for negative tests or choose a valid literal value that triggers the intended validation error.

Proposed fix
     with pytest.raises(ValidationError, match="dim must be"):
         ObsLeafDescriptor(kind="box", dim=0)
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="box", dim=1, unexpected=1)  # type: ignore
+        ObsLeafDescriptor.model_validate({"kind": "box", "dim": 1, "unexpected": 1})
     with pytest.raises(ValidationError):
-        ObsLeafDescriptor(kind="categorical", dim=1)  # type: ignore
+        ObsLeafDescriptor.model_validate({"kind": "invalid", "dim": 1})
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_env_params.py` around lines 142 - 148, The test case on lines
147-148 in the test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields
function uses kind="categorical" which is not a valid Literal value (only "box"
and "discrete" are valid). This causes type checking to fail before Pydantic
validation runs. Either use model_validate() with a dictionary to bypass type
checking for this negative test, or replace the invalid kind value with a valid
literal ("box" or "discrete") that still triggers the validation error you are
testing for (likely related to dim=1 being invalid for that kind).
tests/test_action_space.py (1)

43-50: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Pyright errors still present (duplicate of past review).

The # type: ignore comments don't suppress pyright-specific errors. CI is still failing with reportArgumentType on line 45 and reportCallIssue on line 50. Use pyright-specific suppressions as noted in the previous review.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_action_space.py` around lines 43 - 50, The generic `# type:
ignore` comments are not suppressing pyright-specific errors in the test
functions test_continuous_space_rejects_unknown_dtype and
test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment
on the ContinuousSpace call at line 45 with a pyright-specific suppression for
the reportArgumentType error, and replace the `# type: ignore` comment on the
ContinuousSpace call at line 50 with a pyright-specific suppression for the
reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each
respective error code.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_env_params.py`:
- Around line 87-88: The pytest.raises context manager for ValueError at the
sink.write call lacks a match parameter. Add a match parameter with an
appropriate regex pattern to the pytest.raises(ValueError) call that matches the
expected error message thrown when sink.write is called with drop_rate of 0.0,
which will improve test specificity and provide better failure diagnostics.

---

Duplicate comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 100-115: The encode_observation method signature in the
StructuredObservation Protocol class on line 115 includes an observation: list
parameter, which diverges from the documented contract that specifies a
zero-argument method. Remove the observation: list parameter from the
encode_observation method definition to align with the documented contract and
maintain proper duck-typing compatibility with environments implementing this
protocol.
- Around line 74-97: The ObsLeafDescriptor class is missing a dtype field needed
to expose data type metadata for box leaves to adapters. Add a dtype field to
the class (appropriate type should be determined based on your dtype
representation strategy, likely Optional[str] with a default of "float32" for
box kind), then update the _validate method to enforce that dtype is present and
valid when kind equals "box", ensuring adapters can properly consume the dtype
contract from the descriptor schema.

In `@tests/test_action_space.py`:
- Around line 43-50: The generic `# type: ignore` comments are not suppressing
pyright-specific errors in the test functions
test_continuous_space_rejects_unknown_dtype and
test_continuous_space_forbids_extra_fields. Replace the `# type: ignore` comment
on the ContinuousSpace call at line 45 with a pyright-specific suppression for
the reportArgumentType error, and replace the `# type: ignore` comment on the
ContinuousSpace call at line 50 with a pyright-specific suppression for the
reportCallIssue error. Use the format `# pyright: ignore[error_code]` for each
respective error code.

In `@tests/test_env_params.py`:
- Around line 142-148: The test case on lines 147-148 in the
test_obs_leaf_descriptor_rejects_bad_dim_and_extra_fields function uses
kind="categorical" which is not a valid Literal value (only "box" and "discrete"
are valid). This causes type checking to fail before Pydantic validation runs.
Either use model_validate() with a dictionary to bypass type checking for this
negative test, or replace the invalid kind value with a valid literal ("box" or
"discrete") that still triggers the validation error you are testing for (likely
related to dim=1 being invalid for that kind).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 10cda741-a2d5-4271-a790-dac9d27e8a32

📥 Commits

Reviewing files that changed from the base of the PR and between 5657f1e and 29867a8.

📒 Files selected for processing (9)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py

Comment thread tests/test_env_params.py Outdated
@rutayan-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch 2 times, most recently from 1072426 to a7a2c01 Compare June 16, 2026 22:15

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/cloudai/configurator/cloudai_gym.py (2)

185-201: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve sampled current_env_params in the fallback restore path.

When the output path is missing, Line 188 resets from original_test_run and only restores step/output_path; by Line 201 the trajectory row can record empty env params even though a sample existed for this step.

Proposed fix
         else:
             self.test_run = copy.deepcopy(self.original_test_run)
             self.test_run.step = new_tr.step
             self.test_run.output_path = new_tr.output_path
+            self.test_run.current_env_params = dict(new_tr.current_env_params)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 185 - 201, When the
fallback restore path is taken (when test output path does not exist), the code
resets from original_test_run but only restores the step and output_path
attributes from new_tr. This causes the current_env_params to be empty when
writing the trajectory entry, even though new_tr contains sampled environment
parameters. In the else block after line 188, after setting
self.test_run.output_path from new_tr, also restore
self.test_run.current_env_params from new_tr to preserve the sampled parameters
for the trajectory record.

145-170: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Constraint-failure exits leave env.csv and trajectory.csv out of sync.

At Line 146, before_step can persist a row to env.csv, but the early return at Line 170 skips write_trajectory, so the same trial step exists only in one file.

Proposed fix
-        if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
-            logging.info("Constraint check failed. Skipping step.")
-            return [-1.0], self.rewards.constraint_failure, True, {}
+        if not self.test_run.test.constraint_check(self.test_run, self.runner.system):
+            logging.info("Constraint check failed. Skipping step.")
+            observation = [-1.0]
+            reward = self.rewards.constraint_failure
+            self.write_trajectory(
+                TrajectoryEntry(
+                    step=self.test_run.step,
+                    action=action,
+                    reward=reward,
+                    observation=observation,
+                    env_params=dict(self.test_run.current_env_params),
+                )
+            )
+            for observer in self.observers:
+                observer.after_step(self.test_run, observation, reward)
+            return observation, reward, True, {}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/cloudai_gym.py` around lines 145 - 170, The
constraint-failure early return at the end of the code snippet skips the
write_trajectory call, while the before_step observer has already potentially
written to env.csv, causing the files to be out of sync. Before the early return
statement in the constraint_check failure case, add a write_trajectory call with
a TrajectoryEntry that mirrors the structure used in the cached result handling,
including the current step, action, the constraint_failure reward, the current
observation, and the current env_params dictionary, to ensure both csv files
remain synchronized.
♻️ Duplicate comments (1)
src/cloudai/configurator/base_agent.py (1)

150-162: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Stop the default run loop when env.step returns done=True.

Line 150 captures done, but the loop never checks it. This keeps stepping after terminal state and can skew DSE trial flow and persisted step-aligned artifacts.

Proposed fix
             self.update_policy(
                 {
                     "trial_index": step,
                     "value": reward,
                     "observation": observation,
                     "prev_observation": prev_observation,
                     "action": action,
                     "done": done,
                 }
             )
+            if done:
+                break
             logging.info(f"Step {step}: Observation: {[round(obs, 4) for obs in observation]}, Reward: {reward:.4f}")
         return 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cloudai/configurator/base_agent.py` around lines 150 - 162, The loop
captures the done flag from self.env.step(action) on line 150 but never checks
it to terminate the loop when the environment reaches a terminal state. Add a
check after the self.update_policy() call that breaks out of the loop when done
is True. This ensures the default run loop stops stepping once the environment
signals completion, preventing artifacts misalignment and incorrect trial flow.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/env_params.py`:
- Around line 66-70: The `EnvParamSpec._validate_weights()` method currently
only validates that weights are non-negative and have a positive sum, but it
does not reject non-finite values like infinity or NaN. Add validation to check
that all weights in self.weights are finite numbers before the existing positive
sum check. Raise a ValueError with a descriptive message if any non-finite
weights are detected, using a utility function like math.isfinite() to test each
weight value. This ensures malformed configurations are rejected at schema
validation time rather than failing later at runtime when random.choices() is
called.

In `@tests/test_cloudaigym.py`:
- Around line 567-615: Add a new test function alongside
test_env_csv_is_step_aligned_with_trajectory that validates env.csv and
trajectory.csv step alignment when a constraint failure causes early exit. This
test should follow a similar structure to the existing test but configure the
CloudAIGymEnv with constraint parameters that will be violated during step
execution, triggering an early termination. Verify that even when the
environment exits due to constraint failure, the step columns in env.csv and
trajectory.csv remain perfectly aligned (same number of rows, matching step
numbers), ensuring the corpus-friendly merge contract holds across both normal
execution and constraint-failure paths.

---

Outside diff comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 185-201: When the fallback restore path is taken (when test output
path does not exist), the code resets from original_test_run but only restores
the step and output_path attributes from new_tr. This causes the
current_env_params to be empty when writing the trajectory entry, even though
new_tr contains sampled environment parameters. In the else block after line
188, after setting self.test_run.output_path from new_tr, also restore
self.test_run.current_env_params from new_tr to preserve the sampled parameters
for the trajectory record.
- Around line 145-170: The constraint-failure early return at the end of the
code snippet skips the write_trajectory call, while the before_step observer has
already potentially written to env.csv, causing the files to be out of sync.
Before the early return statement in the constraint_check failure case, add a
write_trajectory call with a TrajectoryEntry that mirrors the structure used in
the cached result handling, including the current step, action, the
constraint_failure reward, the current observation, and the current env_params
dictionary, to ensure both csv files remain synchronized.

---

Duplicate comments:
In `@src/cloudai/configurator/base_agent.py`:
- Around line 150-162: The loop captures the done flag from
self.env.step(action) on line 150 but never checks it to terminate the loop when
the environment reaches a terminal state. Add a check after the
self.update_policy() call that breaks out of the loop when done is True. This
ensures the default run loop stops stepping once the environment signals
completion, preventing artifacts misalignment and incorrect trial flow.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 64a943dc-eea4-4d96-8190-7db0c7da3c25

📥 Commits

Reviewing files that changed from the base of the PR and between 1072426 and a7a2c01.

📒 Files selected for processing (12)
  • src/cloudai/_core/action_space.py
  • src/cloudai/_core/test_scenario.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/configurator/base_agent.py
  • src/cloudai/configurator/cloudai_gym.py
  • src/cloudai/configurator/env_params.py
  • src/cloudai/core.py
  • src/cloudai/models/workload.py
  • tests/test_action_space.py
  • tests/test_cloudaigym.py
  • tests/test_env_params.py
  • tests/test_handlers.py

Comment thread src/cloudai/configurator/env_params.py Outdated
Comment thread tests/test_cloudaigym.py Outdated
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch 2 times, most recently from a7cb274 to f44b1ea Compare June 17, 2026 20:26
rutayan-nv added a commit to rutayan-nv/cloudai that referenced this pull request Jun 17, 2026
random.choices treats inf/nan weights inconsistently (they slip past the
existing non-negative + positive-sum checks), so validate finiteness in
EnvParamSpec and reject inf/nan/-inf. Also add a match= to the CsvSink
zero-step pytest.raises for clarity (CodeRabbit, NVIDIA#901/NVIDIA#928).
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch 8 times, most recently from dfc5777 to 3336c51 Compare June 24, 2026 10:33
Comment thread src/cloudai/configurator/cloudai_gym.py
Make env_params a first-class part of CloudAIGymEnv trial identity so the
trajectory cache keys on (action, env_params) rather than action alone,
fixing the domain-randomization correctness bug where the same action under
a different env_params sample returned a stale reward.

- Cache key now includes env_params; cache-key tests pin the contract
  (formerly the TDD-red specs of NVIDIA#900, folded in here).
- Keep env.csv and trajectory.csv 1:1 step-aligned: a single TrajectoryEntry
  sinks both files coherently, including on constraint failure.
- Reject env_params on non-DSE jobs; reject non-finite / negative weights.
- Add cache-hit + declared-env_params integration coverage.

Folds the test-only PR NVIDIA#900 (cache-key TDD) into this PR so the stack has no
permanently-red standalone PR.
…rom search space

Make env_params a thin annotation over cmd_args fields instead of a holder of
candidate values. Candidate values live in cmd_args (the single source of truth,
exactly like an action-space dimension); env_params.<name> only marks a field as
env-sampled and carries optional sampling weights, never the values.

- EnvParamSpec drops `values`; validates weights (finite, non-negative, sum=1.0).
- Sampler/observer resolve candidate lists from cmd_args; scalar knobs are no-ops.
- TestDefinition.validate_env_params cross-checks annotations against cmd_args
  (key must be a real field; weights require a list and must match its length).
- Exclude env_params keys from both param_space and is_dse_job: an env-sampled
  list is not a search dimension, so an env-params-only workload is not a DSE job.
- validate_dse_env_params rejects env_params on non-DSE runs and on grid_search
  (exhaustive search cannot exploit per-trial randomization).
- Scrub private-implementation references from public docstrings.
- Unit tests use generic Atari Breakout semantics (ball_speed / paddle_width).
…pyright

- validate_env_params: reject structured (non-leaf) cmd_args targets. The
  observer cannot sample them, yet param_space/is_dse_job exclude the whole
  key, which would silently drop nested action dimensions.
- CloudAIGymEnv.write_trajectory: rebind the env.csv sink to the current
  iteration path before each write, so env.csv stays 1:1 aligned with
  trajectory.csv when the env instance is reused across iterations.
- test_env_params: assert the unknown-field rejection via model_validate so the
  negative test no longer trips pyright's call-arg check (CI Linting fix); add a
  structured-target rejection test.
An unweighted env_params spec skipped the candidate-list check, so an empty cmd_args list
(e.g. ball_speed = []) passed validation and only failed later in EnvParamsSampler.sample()
via rng.choice([]) (IndexError). Guard against an empty candidate list in validate_env_params
so the error surfaces at TestDefinition build time. Addresses CodeRabbit feedback.
…ms value objects

Replace the EnvParamsSampler class and the StepObserver/EnvParamsObserver
indirection with two frozen dataclasses: EnvParam (one resolved knob: candidates,
optional weights, single draw) and EnvParams (per-run knobs + seed, built via
from_test, sampled per trial). The sampling RNG lives in the env: step() draws
this trial's values and hands concrete values to TestRun.apply_params_set(action,
env_params=...), which overlays action and sample through one deterministic path.

Centralize the cmd_args -> env_params lookup in TestDefinition.is_env_sampled and
access current_env_params directly. Expand EnvParam/EnvParams unit tests to cover
draw, from_test, sample, and immutability.
Drop the EnvParamsSink Protocol + CsvSink pair (and runtime_checkable) for a
single concrete EnvParamsSink, built unconditionally in CloudAIGymEnv. The
sink is now stateless: write() takes the record path per call and skips empty
samples, so non-DR runs write nothing and write_trajectory needs no branch.

Derive both records from a new iteration_dir property and expose the env
record via the env_params_record_path property (was _env_csv_path), keeping
env.csv and trajectory.csv step-aligned without coupling the name to CSV.
…ty flag

Replace the hardcoded `agent == "grid_search"` check with a BaseAgent.samples_env_params
capability flag (opt-in, defaults False). Only agents whose search consumes per-trial
env_params sampling set it True; enumerating/surrogate agents leave it False, so a config
that declares env_params for an agent that would ignore them is rejected up front instead
of silently no-op'ing. New agents answer for themselves with no string to maintain.

Relocate validate_dse_env_params out of the CLI handlers into configurator/env_params.py
next to the logic it guards, looking the agent up via the Registry. Unknown agents are
deferred to the dedicated agent-resolution error rather than masked here.

Keep all public-facing comments, docstrings, and the error message generic (no internal
agent names). Cover the full validator matrix, including the unknown-agent deferral.
Compress multi-line inline comments down to the single non-obvious rationale (or drop
them where the code already speaks), per the self-documenting-code principle. Public API
docstrings and test intent comments are left intact.
apply_params_set overlays sampled scalar draws onto cmd_args, then
reconstructs the TestDefinition to validate the applied action values.
That pass re-ran validate_env_params, which rejects a weighted env_param
whose cmd_args target is no longer a candidate list - exactly what the
overlay produces. env_params is already validated at parse time, so drop
it from the validation-only dump. Adds a regression test covering a
weighted env_param's scalar draw.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch 2 times, most recently from 31fc658 to e581e0c Compare June 25, 2026 22:17
An env_params entry only reclassifies a list-valued cmd_args sweep as
env-sampled; a scalar is already fixed, so annotating it is a meaningless
label. Previously such an annotation was tolerated as a silent no-op, which
let it slip through parse-time validation and inconsistently trip (or not)
the downstream "no agent will sample them" check depending on run mode.

Reject it where the contract lives - TestDefinition.validate_env_params -
so the failure is immediate and mode-independent. EnvParams.from_test's
non-list guard becomes defensive (parse-time now guarantees lists); the
post-overlay path already drops env_params before re-validating, so concrete
scalar draws are unaffected.

Extract the per-field checks into a helper to keep the validator under the
complexity limit, and update tests: scalar annotations now assert rejection
instead of no-op tolerance.
…otocol

Add ObsLeafDescriptor (a self-describing observation leaf: "box" of width
dim, or "discrete" of size n) and a StructuredObservation Protocol that
documents the optional env hooks structured_observation_descriptors() and
encode_observation(). These let an env expose a named, per-leaf observation
so adapters (e.g. GymnasiumAdapter) can build the matching gymnasium
spaces.Dict; the hooks are duck-typed, so envs need not subclass.

Both exported via cloudai.core.
…ejection tests

Negative tests pass an extra kwarg and an out-of-Literal kind to assert
ValidationError; mark the deliberate type violations with type: ignore.
@rutayan-nv rutayan-nv force-pushed the rpatro/obs-leaf-descriptor branch from e581e0c to 584f145 Compare June 26, 2026 15:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants