[codex] Add reproducible minimal PPO WSL workflow #484

Draft
HC-Seaple wants to merge 2 commits into Emerge-Lab:2.0 from HC-Seaple:codex/minimal-ppo-wsl

@HC-Seaple

What changed

  • Add a self-contained continuous-action PPO trainer with vectorized rollouts, GAE, clipped updates, checkpointing, and deterministic evaluation.
  • Add Windows/WSL setup and launch scripts for the Linux-native Raylib build.
  • Add generic WOMD JSON-to-map preparation without committing datasets or generated binaries.
  • Add native third-person checkpoint visualization and JSON metrics.
  • Ensure complete renderer frames are written to ffmpeg.
  • Document clone, setup, map preparation, training, visualization, and handoff.
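For readers unfamiliar with the terms in the first bullet, GAE and the clipped update can be sketched as follows. This is a minimal pure-Python illustration for a single environment, not the code in this PR; the function names `compute_gae` and `clipped_surrogate` and the default hyperparameters are assumptions chosen for the example.

```python
def compute_gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one rollout (hypothetical helper).

    rewards[t], values[t], dones[t] describe step t. dones[t] is True when
    the episode ended at step t. last_value is the critic's estimate for the
    state that follows the final step and is used for bootstrapping.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards so each step can reuse the later estimate.
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error; bootstrapping is cut at episode boundaries.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    # Value targets are the advantages added back onto the value estimates.
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


def clipped_surrogate(ratio, advantage, clip_eps=0.2):
    """PPO clipped objective for one sample (to be maximized).

    ratio is pi_new(a|s) / pi_old(a|s) for the sampled action.
    """
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)
```

Taking the minimum of the clipped and unclipped terms removes the incentive to push the probability ratio outside `[1 - clip_eps, 1 + clip_eps]` in the direction that would increase the objective, while still penalizing moves that make a sample worse.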

Validation

  • Python scripts pass `python -m py_compile`.
  • WSL launchers pass `bash -n`.
  • The staged change set passes `git diff --check`.
  • The end-to-end workflow was previously exercised in WSL with a 10,112-step checkpoint and 92-frame native render.
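The byte-compilation check from the first bullet can also be run from Python through the standard-library `py_compile` module. The sketch below is illustrative only: the helper name `find_syntax_errors` is hypothetical, and the PR does not say which script paths were checked, so the caller supplies them.

```python
import py_compile
import tempfile
from pathlib import Path


def find_syntax_errors(paths):
    """Return the paths that fail byte-compilation (hypothetical helper)."""
    failures = []
    # Compile into a scratch directory so no __pycache__ entries are left behind.
    with tempfile.TemporaryDirectory() as scratch:
        for index, path in enumerate(paths):
            target = str(Path(scratch) / f"{index}.pyc")
            try:
                py_compile.compile(str(path), cfile=target, doraise=True)
            except py_compile.PyCompileError:
                failures.append(str(path))
    return failures
```

An empty return value corresponds to every script passing `python -m py_compile`. Like that command, this only catches syntax errors; it does not import or run the scripts.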

Current limitation

This is a smoke-test training architecture. Reward shaping still needs route-progress reward, reverse-motion penalties, and stronger collision/off-road costs before scaling.

@eugenevinitsky

Hi! Thanks for the PR. We can review and discuss this once it's out of draft status, but one quick note: this probably needs a slightly different folder structure, since otherwise it pollutes the script folder with a lot of unstructured code.
