8 changes: 7 additions & 1 deletion .gitignore
@@ -1,4 +1,4 @@
# Debug server state (generated by debug.sh)
# Debug server state (generated by verify.py)
.debug/

# Compiled Lua sources
@@ -42,3 +42,9 @@ luac.out
*.x86_64
*.hex


# Riptide artifacts (cloud-synced)
.humanlayer/tasks/

# Python bytecode cache (verify.py / sandbox.py)
__pycache__/
33 changes: 22 additions & 11 deletions .pi/skills/factorio-mod-dev/DEBUGGING.md
@@ -10,26 +10,37 @@ Errors appear in `factorio-current.log` just after the script checksum lines.

## Protocol

1. `./debug.sh reset` + run `--create` → fix data-stage errors to zero
2. `./debug.sh` → fix control-stage errors (check log after checksum lines)
3. `./debug.sh reset` again → verify `on_init` ran (`storage.creative_mode ~= nil`)
4. Use RCON to inspect live state
1. `uv run verify.py load` → data + control load gate. It runs `--create`, scans
the log for `^Error`, and asserts the control-stage sentinel
`CREATIVE_MOD_CONTROL_OK` is present. `load=FAIL (data/control error)` →
data/control-stage error to fix; `load=FAIL (control stage incomplete)` → the
silent mid-`require` crash (see below).
2. `uv run verify.py behavior` → boots the headless server and asserts `on_init`
ran (`storage_initialized`) and the default state (`default_disabled`).
3. Use `uv run verify.py shell '<cmd>'` to inspect live state interactively.

## Silent control-stage failure

If `control.lua` crashes mid-`require`, the game still starts and RCON responds —
but all mod globals are `nil` and `on_init` never fires. Diagnose:
but all mod globals are `nil` and `on_init` never fires. `verify.py load` catches
this via the missing sentinel (`load=FAIL (control stage incomplete)`). To
diagnose at runtime, drive the mod's own remote interface (a bare `/c` runs in
the scenario context, not the mod's):

```bash
./rcon.sh '/c rcon.print(tostring(storage.creative_mode ~= nil))'
uv run verify.py shell '/c rcon.print(tostring(pcall(function() return remote.call("creative-mode", "is_enabled") end)))'
```

`false`/`nil` → look in `factorio-current.log` right after the mod's checksum line.
`false` → the call errored / `on_init` never ran; look in `factorio-current.log`
right after the mod's checksum line.

## Save lifecycle gotcha

`--create` with a broken `control.lua` creates the save but skips `on_init`.
After fixing control-stage errors, always `./debug.sh reset` — a server restart alone is not enough.
After fixing control-stage errors, re-run `uv run verify.py load` (it re-runs
`--create` every time, emitting a fresh sentinel) and `uv run verify.py behavior`
to confirm `on_init` ran; use `--clean` to recreate the save from scratch when a
restart alone is not enough.

## RCON caveats

@@ -42,9 +53,9 @@ After fixing control-stage errors, always `./debug.sh reset` — a server restar
## Porting to a new Factorio version

1. Bump `factorio_version` and `base >=` in `info.json`
2. Fix data-stage errors (`--create`)
3. Fix control-stage errors (`--start-server`)
4. `./debug.sh reset` — confirm `on_init` ran
2. Fix data-stage errors (`uv run verify.py load`)
3. Fix control-stage errors (`uv run verify.py load` → sentinel present)
4. `uv run verify.py behavior` — confirm `on_init` ran

**Reference:** `data/changelog.txt` in the Factorio install lists every API change by version.

46 changes: 36 additions & 10 deletions .pi/skills/factorio-mod-dev/SKILL.md
@@ -1,6 +1,6 @@
---
name: factorio-mod-dev
description: Develop and debug the creative-mod Factorio mod. Use when working on mod scripts, prototypes, GUI, events, or porting to a new Factorio version. Covers repo layout, mod structure, key internals, and the debug toolchain.
description: Develop and debug the creative-mod Factorio mod. Use when working on mod scripts, prototypes, GUI, events, or porting to a new Factorio version. Covers repo layout, mod structure, key internals, and the verify.py verification pipeline.
---

# Factorio Mod Dev
@@ -21,26 +21,52 @@ creative-mod/
└── locale/ # translations
```

## Debug toolchain
## Verification loop

`verify.py` is the canonical way to check the mod. It loads creative-mod in the
local Factorio install, runs assertions, and exits `0`/non-zero with a stable,
greppable `RESULT:` line, so you can edit → verify → read result → iterate.
Run it via `uv`:

```bash
uv run verify.py doctor # preflight: factorio binary + version, uv, jq
uv run verify.py static # luacheck . + stylua --check .
uv run verify.py load # data + control load gate (incl. silent-crash guard)
uv run verify.py behavior # headless server + RCON assertion batch
uv run verify.py all # static → load → behavior, aggregated
uv run verify.py --help
```

The layered model is **static → load → behavior** (cheapest to deepest); `all`
runs the three in sequence. Read the result by grepping `^RESULT:` and/or
checking `$?`:

```
RESULT: load=PASS # exit 0
RESULT: load=FAIL (control stage incomplete) # exit non-zero, reason names the failure
```

For investigation, use the bounded tooling modes (successors to the removed
standalone shell wrappers):

```bash
./debug.sh # start headless server
./debug.sh gui --window-size 1920x1080 # start with full GUI (windowed)
./debug.sh log # tail factorio-current.log
./debug.sh reset # wipe save → re-triggers on_init
./rcon.sh '/c rcon.print(...)' # one-shot RCON command
./rcon-shell.sh # interactive REPL
uv run verify.py shell '/c rcon.print(game.tick)' # one-shot RCON; omit arg for a stdin REPL
uv run verify.py debug --command '/c ...' # bounded headless session
uv run verify.py debug --gui # manual-only graphical escape hatch
uv run verify.py load --clean # recreate the debug save from scratch
```

Output channels:
Output channels for the values you inspect:

| Goal | Use | Where |
|---|---|---|
| Inspect a value | `rcon.print(v)` | echoed back to terminal |
| Trace code | `log("msg")` | `factorio-current.log` |
| Dump large table | `helpers.write_file("f", d)` | `.debug/script-output/f` |

→ See `DEBUG.md` for full tool reference.
→ See `VERIFY.md` (this skill folder) for the full subcommand reference, the
`RESULT:`/exit-code contract, and the replicable local install setup.
→ See `DEBUG.md` for the output-channel reference.
→ See `DEBUGGING.md` (this skill folder) for debugging methodology and porting guide.
→ See `RELEASE.md` (this skill folder) for release checklist and GitHub Actions workflow reference.

148 changes: 148 additions & 0 deletions .pi/skills/factorio-mod-dev/VERIFY.md
@@ -0,0 +1,148 @@
# verify.py — Verification Pipeline Reference

`verify.py` is the canonical way to check the mod. It is a single bounded tool
that loads creative-mod in the local Factorio install, runs assertions, and
exits `0`/non-zero with a stable, greppable `RESULT:` line so an agent can
edit → verify → read result → iterate without a human. **Local only** (not CI).

Run it via `uv`. The script carries a PEP 723 inline metadata header, depends
only on the standard library, and imports `rcon.py` and `sandbox.py` as local
modules:

```bash
uv run verify.py <subcommand> [flags]
```

## The RESULT / exit-code contract

Every subcommand prints **exactly one** line of the form:

```
RESULT: <name>=PASS
RESULT: <name>=FAIL (reason)
```

and exits `0` on PASS, non-zero on FAIL. To drive it programmatically, grep for
`^RESULT:` and/or check `$?`. The `reason` names the failing tool / assertion /
phase, e.g.:

```
RESULT: static=FAIL (luacheck=PASS stylua=FAIL)
RESULT: load=FAIL (control stage incomplete)
RESULT: behavior=FAIL (assert storage_initialized)
RESULT: all=FAIL (static=PASS load=PASS behavior=FAIL)
```

Layered subcommands also print their per-step lines (e.g. `assert
storage_initialized=PASS (...)`) before the final `RESULT:` line, so partial
progress is visible.
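
A minimal sketch of consuming this contract from Python, assuming the
tool's combined output has already been captured as a string. The names
`RESULT_RE`, `Result`, and `parse_result` are hypothetical and not part of
`verify.py`; the regex only encodes the two line shapes shown above.

```python
import re
from typing import NamedTuple, Optional

# Encodes the two documented shapes:
#   RESULT: <name>=PASS
#   RESULT: <name>=FAIL (reason)
RESULT_RE = re.compile(
    r"^RESULT: (?P<name>[\w-]+)=(?P<status>PASS|FAIL)(?: \((?P<reason>.*)\))?$"
)


class Result(NamedTuple):
    name: str
    passed: bool
    reason: Optional[str]  # None when the line carries no "(reason)"


def parse_result(output: str) -> Result:
    """Return the single RESULT line found in captured output (hypothetical helper)."""
    matches = [
        m for line in output.splitlines() if (m := RESULT_RE.match(line.rstrip()))
    ]
    # The contract promises exactly one RESULT line; treat anything else as an error.
    if len(matches) != 1:
        raise ValueError(f"expected exactly one RESULT line, found {len(matches)}")
    m = matches[0]
    return Result(m["name"], m["status"] == "PASS", m["reason"])
```

Per-step lines such as `assert storage_initialized=PASS (...)` do not start with
`RESULT:`, so this sketch ignores them. Checking the exit code remains the
simpler option when only pass/fail matters.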

## Subcommands

| Subcommand | What it does | Layer |
|---|---|---|
| `doctor` | Preflight: Factorio binary + `--version`, `uv`, `jq` on PATH. Distinguishes "install problem" from "mod problem". | preflight |
| `static` | Wraps `luacheck .` + `stylua --check .` (same invocations as `lint.yml`); excludes the gitignored `.debug/` sandbox from luacheck so the local result matches a clean checkout. | static |
| `load` | Bootstraps the `.debug/` sandbox, runs the bounded `--create` data+control stage, scans the log for `^Error`, and asserts the control-stage sentinel `CREATIVE_MOD_CONTROL_OK` is present (guards the silent control-crash case). | load |
| `behavior` | Boots the headless server, polls RCON until it answers, runs a read-only assertion batch, then terminates + reaps under a watchdog. | behavior |
| `all` | Runs `static` → `load` → `behavior` in sequence and aggregates into one `RESULT: all=…` line. | all |
| `debug` | Bounded, scriptable headless session: boot server, poll RCON, optional one-shot `--command`, reap under a watchdog. `--gui` is the manual-only graphical escape hatch. | tooling |
| `shell` | Bounded RCON pass-through: one-shot command argument, or stdin REPL (auto-prefixes `/c`). Attaches to a running server or starts a bounded one. | tooling |
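
The `load` gate described in the table can be sketched as a pure function over
the log text. This is an illustration under stated assumptions, not the code in
`verify.py`: `classify_load` is a hypothetical name, the order of the two checks
is assumed, and the failure reasons are the two `load=FAIL` strings documented
in `DEBUGGING.md`.

```python
import re

# Control-stage sentinel and error pattern as named in the table above.
SENTINEL = "CREATIVE_MOD_CONTROL_OK"
ERROR_RE = re.compile(r"^Error", re.MULTILINE)


def classify_load(log_text: str) -> str:
    """Map the text of factorio-current.log to a RESULT line (hypothetical helper)."""
    # Assumption: an explicit error line takes priority over a missing sentinel.
    if ERROR_RE.search(log_text):
        return "RESULT: load=FAIL (data/control error)"
    # No error line but no sentinel either: the silent control-stage crash.
    if SENTINEL not in log_text:
        return "RESULT: load=FAIL (control stage incomplete)"
    return "RESULT: load=PASS"
```

The second branch is the one that matters: a log with no error line can still
fail the gate, which is what distinguishes this check from a plain error scan.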

### Flags

- `static` — no flags.
- `load` — `--clean` (recreate the debug save from scratch; default reuses for a
fast loop), `--timeout <s>` (hard timeout for `--create`, default `180`).
- `behavior` / `all` / `debug` / `shell` — `--clean`, `--timeout <s>`, and
`--ready-timeout <s>` (hard timeout to wait for the server to answer RCON,
default `120`).
- `debug` additionally — `--command '<rcon cmd>'` (one-shot, run once ready) and
`--gui` (manual-only full graphical client; blocks, needs a display).
- `shell` additionally — a positional `command` argument (one-shot); omit it to
read commands from stdin.

### Examples

```bash
uv run verify.py doctor
uv run verify.py static
uv run verify.py load
uv run verify.py load --clean
uv run verify.py behavior
uv run verify.py all
uv run verify.py shell '/c rcon.print(game.tick)'
uv run verify.py debug --command '/c rcon.print(tostring(remote.call("creative-mode", "is_enabled")))'
uv run verify.py debug --gui # manual escape hatch
```

## The behavior assertion batch (read-only)

Assertions run in the **mod's** context via the remote interface — a bare `/c`
runs in the *scenario* script context, where `storage` is the scenario's
storage, not creative-mod's. So the batch drives the mod's own interface:

- `storage_initialized` — `remote.call("creative-mode", "is_enabled")` succeeds
(`on_init` ran to completion → runtime confirmation of the silent-crash guard).
- `default_disabled` — that same call returns `false` (creative mode off by
default).

The batch is fully read-only for now; the GUI-driven "enable all cheats" path is
out of scope (no connected player on a headless server).
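
As a rough sketch of how the two assertions relate, assume the outcome of the
remote call has already been reduced to two values: whether the call succeeded
and what it returned. `evaluate_batch` is a hypothetical helper, and the
per-step lines here omit the parenthesised detail the real tool prints.

```python
from typing import List, Optional


def evaluate_batch(call_ok: bool, is_enabled: Optional[bool]) -> List[str]:
    """Turn the outcome of remote.call("creative-mode", "is_enabled") into
    per-step lines followed by one RESULT line (hypothetical helper)."""
    checks = [
        # The call succeeding is taken as evidence that on_init completed.
        ("storage_initialized", call_ok),
        # The same call must return false: creative mode is off by default.
        ("default_disabled", call_ok and is_enabled is False),
    ]
    lines = [f"assert {name}={'PASS' if ok else 'FAIL'}" for name, ok in checks]
    failed = [name for name, ok in checks if not ok]
    if failed:
        # Assumption: the reason names the first failing assertion.
        lines.append(f"RESULT: behavior=FAIL (assert {failed[0]})")
    else:
        lines.append("RESULT: behavior=PASS")
    return lines
```

Because `default_disabled` depends on the same call, it cannot pass when
`storage_initialized` fails; the sketch reports the earlier failure as the
reason in that case.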

## Replicable local install setup

The verifier assumes a working local Factorio install is present; it does not
install Factorio. `verify.py doctor` is the runnable companion that confirms the
prerequisites below. To reproduce the environment on a new machine:

### Factorio binary

- **Factorio 2.1.7** (full install — the base mods `base`, `elevated-rails`,
`quality`, `space-age` ship with it, so no mod provisioning step is needed).
- The binary path is **fixed and self-located** relative to the mod:
`../../bin/x64/factorio` (resolved from `verify.py`'s own location, exactly as
the old shell launcher derived it from `SCRIPT_DIR`). Nothing is read from
environment variables.

### Tooling on PATH

- **`uv`** — runs `verify.py` via its PEP 723 inline header.
- **`jq`** — used in the toolchain (and checked by `doctor`).
- **`stylua`** — Lua formatter; `static` runs `stylua --check .`. It skips
gitignored paths automatically.
- **`luacheck`** — Lua linter; `static` runs `luacheck .`. **It must be built
against Lua 5.3** — it crashes under the system Lua 5.5. Install it via
luarocks pinned to Lua 5.3, into the per-user tree:

```bash
luarocks --lua-version=5.3 install luacheck --local
```

  This puts the `luacheck` binary under `~/.luarocks/bin`, which **must be on
  `PATH`** so that `verify.py static` can find it (`doctor` assumes the same
  setup):

```bash
export PATH="$HOME/.luarocks/bin:$PATH"
```

### Sandbox paths (created by `sandbox.py`)

`verify.py` stands up an isolated `.debug/` sandbox next to the mod (gitignored,
absent in CI):

```
.debug/
├── config/config.ini # read-data → factorio/data, write-data → .debug/
├── mods/
│ ├── creative-mod_<version> → ../../ (symlink to the live working tree)
│ ├── mod-list.json # copied from mods_dev/
│ └── mod-settings.dat # copied from mods_dev/ (if present)
├── saves/debug-save.zip # the live debug save
├── factorio-current.log # game + Lua log
├── console.log # server console / RCON command log
└── script-output/ # helpers.write_file() output
```

The live-tree symlink means edits are instant (no packaging step). The symlink is
re-pointed each run to the current version, and stale differently-versioned
symlinks are pruned. RCON runs on port `27015` with password `factorio-debug`.
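
The symlink handling described above could look roughly like the following. It
is a sketch only: `repoint_mod_symlink` and its parameters are hypothetical, it
links to an absolute path rather than the relative `../../` target shown in the
tree, and the real logic lives in `sandbox.py`.

```python
import os
from pathlib import Path


def repoint_mod_symlink(
    mods_dir: Path, mod_root: Path, version: str, name: str = "creative-mod"
) -> Path:
    """Leave mods_dir/<name>_<version> as the only symlink for this mod."""
    mods_dir.mkdir(parents=True, exist_ok=True)
    wanted = mods_dir / f"{name}_{version}"
    # Snapshot the listing first so entries can be removed while looping.
    for entry in list(mods_dir.iterdir()):
        # Drop every existing symlink for this mod, including stale versions.
        if entry.is_symlink() and entry.name.startswith(f"{name}_"):
            entry.unlink()
    # Recreate the link so it always points at the current working tree.
    os.symlink(mod_root, wanted, target_is_directory=True)
    return wanted
```

Only symlinks are pruned, so regular files in `mods/` such as `mod-list.json`
are left untouched.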