| 1 | +--- |
| 2 | +name: fix-art-issues |
| 3 | +description: > |
| 4 | + Fix a GitHub issue on OpenPipe/ART and open a PR. |
| 5 | + Use when the user asks to fix, solve, or work on an ART issue, |
| 6 | + or references a GitHub issue URL containing "OpenPipe/ART". |
| 7 | + Triggers: "fix ART issue", "solve this issue" with an OpenPipe/ART URL, |
| 8 | + "work on ART #N". |
| 9 | +--- |
| 10 | + |
| 11 | +# Fix ART Issue |
| 12 | + |
| 13 | +Fix a GitHub issue on `OpenPipe/ART` and open a PR. |
| 14 | + |
| 15 | +- **Repo**: `OpenPipe/ART` |
| 16 | +- **Base branch**: `main` |
| 17 | + |
| 18 | +Assumes the workspace is already set up with the correct branch checked out and `.env` in place (handled by the system-level `fix-art-workspace` skill). |
| 19 | + |
| 20 | +## Workflow |
| 21 | + |
| 22 | +### 1. Read the Issue |
| 23 | +``` |
| 24 | +gh issue view <number> --repo OpenPipe/ART --json title,body,labels,assignees,comments |
| 25 | +``` |
| 26 | + |
| 27 | +### 2. Explore, Plan, Implement |
| 28 | +- Use the Explore agent to understand relevant code before making changes. |
| 29 | +- Plan the change first, then implement it with minimal, focused edits. Avoid over-engineering. |
| 30 | + |
| 31 | +### 3. Commit and Push |
| 32 | +- Commit with a message that includes `Closes #<issue-number>`. |
| 33 | +- Push the feature branch. If the HTTPS push fails because of SAML SSO, switch the remote to SSH: `git remote set-url origin git@github.com:OpenPipe/ART.git` |
| 34 | + |
| 35 | +### 4. Open a Draft PR |
| 36 | +- `gh pr create --base main --draft`. |
| 37 | +- PR body: `## Summary`, `Closes #<number>`, `## Changes`, `## Test plan`. |
| 38 | + |
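
The PR body layout from step 4 can be sketched as a small shell helper. This is only an illustration: `pr_body` is a hypothetical function name, and the angle-bracket placeholders stand in for text you would write yourself.

```shell
# Hypothetical helper (not part of the skill): print a PR body containing the
# four sections named in step 4, with the issue number filled in.
pr_body() {
  # $1 is the issue number; the remaining arguments are placeholder text.
  printf '## Summary\n\n%s\n\nCloses #%s\n\n## Changes\n\n%s\n\n## Test plan\n\n%s\n' \
    "<one-paragraph summary>" "$1" "- <change>" "- <check to run>"
}

# Usage sketch (not executed here):
#   gh pr create --base main --draft --title "<title>" --body "$(pr_body <number>)"
```

Generating the body from one place keeps the `Closes #<number>` line consistent with the commit message, which is what links the PR to the issue.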
| 39 | +### 5. Testing |
| 40 | +- **No test artifacts in the final PR**: debug prints, test scripts, and temporary changes must NOT be committed. |
| 41 | +- Update the PR's test plan section with detailed results. |
| 42 | +- When testing passes, mark the PR as ready: `gh pr ready`. |
| 43 | + |
| 44 | +## Reference |
| 45 | + |
| 46 | +Read `CONTRIBUTING.md` at the repo root for guidance on code quality checks (prek), CI cache refresh, and the release process. |
| 47 | + |
| 48 | +## Dependency Management Tips |
| 49 | + |
| 50 | +- **Pin versions strictly** (`==`) for critical deps like `transformers`, `trl`, `unsloth`, `unsloth-zoo`, `vllm` to avoid surprise breakage from new releases. |
| 51 | +- **Don't loosen pins without reason**: if a dep was `==X.Y.Z`, keep it pinned unless there's a specific reason to change. Don't use `>=` just because it seems more flexible. |
| 52 | +- **`uv run` fails on macOS** for backend deps (apex/torch need CUDA). This is expected: use `uvx ruff` for local linting and run tests on a GPU cluster. |
| 53 | + |
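
As a rough illustration of the strict-pinning rule above, the following sketch checks whether a single requirement string uses an exact `==` pin. `is_strict_pin` is a hypothetical helper, and the check is deliberately simplistic: it assumes a bare `name==version` string and rejects anything with range operators, wildcards, or multiple clauses. It does not handle environment markers or extras.

```shell
# Hypothetical helper (not part of the repo): succeed (exit 0) only when the
# requirement string is an exact pin such as name==X.Y.Z.
is_strict_pin() {
  case "$1" in
    # Range operators, wildcards, multiple clauses, or arbitrary equality (===).
    *'*'*|*,*|*'>'*|*'<'*|*'~'*|*'!'*|*===*) return 1 ;;
    # An exact pin: "==" followed by at least one character.
    *==?*) return 0 ;;
    # No version specifier at all.
    *) return 1 ;;
  esac
}

# Usage sketch (not executed here):
#   is_strict_pin "vllm==X.Y.Z" || echo "vllm is not strictly pinned"
```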
| 54 | +## Deploying a GPU Cluster |
| 55 | + |
| 56 | +Name the SkyPilot cluster after the branch name without the `fix/` prefix, replacing `/` with `-` (SkyPilot doesn't allow slashes). For example, if the branch is `fix/short-description`: |
| 57 | +``` |
| 58 | +uv run sky launch -c short-description skypilot-config.yaml -y |
| 59 | +``` |
| 60 | + |
| 61 | +To connect: `ssh short-description` |
| 62 | + |
| 63 | +To tear down when done: `uv run sky down short-description` |
| 64 | + |
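
The cluster naming rule above can be sketched as a one-line shell function. `cluster_name` is a hypothetical helper name; the logic follows the rule as stated: drop a leading `fix/`, then replace any remaining `/` with `-`.

```shell
# Hypothetical helper (not part of the repo): derive a SkyPilot cluster name
# from a branch name.
cluster_name() {
  # ${1#fix/} strips a leading "fix/" if present; tr replaces remaining slashes.
  printf '%s\n' "${1#fix/}" | tr '/' '-'
}

# Usage sketch (not executed here), assuming the feature branch is checked out:
#   uv run sky launch -c "$(cluster_name "$(git branch --show-current)")" skypilot-config.yaml -y
```

Deriving the name from the branch avoids typos when the same name is reused later for `ssh` and `sky down`.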
| 65 | +## GPU Cluster Testing Tips |
| 66 | + |
| 67 | +- **Kill stale GPU processes** before re-running tests: `nvidia-smi --query-compute-apps=pid --format=csv,noheader | xargs -r kill -9`. Previous failed runs leave processes holding GPU memory. |
| 68 | +- **Set `gpu_memory_utilization`** in test scripts (e.g. `0.7`) — the default `0.9` is too high when Unsloth's training model is also loaded on the same GPU. |
| 69 | +- **Redirect test output to a log file**: `nohup python test.py > /tmp/output.log 2>&1 &`, then follow it with `tail -f /tmp/output.log`. Background tasks started over SSH lose their output when the connection drops. |
| 70 | +- **Git on cluster**: SSH keys may not be configured. Use HTTPS with a token instead: `git remote set-url origin https://${GITHUB_TOKEN}@github.com/OpenPipe/ART.git` |
| 71 | +- **Tear down clusters** when done: `sky down <cluster-name> -y` |
| 72 | + |
| 73 | +$ARGUMENTS |