Commit 320a15c

JS-1495 Refine peach-check skill: improve grep filters, mass failure verdict rules, step numbering (#6653)
Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
1 parent 736f101 commit 320a15c

File tree: 3 files changed, +361 −64 lines
Lines changed: 181 additions & 57 deletions
@@ -1,9 +1,9 @@
 ---
 name: peach-check
-description: Fetch the latest Peach Main Analysis run from SonarSource/peachee-js, classify all
-failed jobs as critical analyzer issues or safe-to-ignore infrastructure problems, and print a
-summary table. Use before a SonarJS release to verify analyzer stability.
-disable-model-invocation: true
+description: Use before a SonarJS release or when the nightly Peach Main Analysis workflow shows
+failures that need triage. Classifies each failure as a critical analyzer bug or a safe-to-ignore
+infrastructure problem.
+allowed-tools: Bash(gh run list:*), Bash(gh api:*), Bash(gh run rerun:*), Bash(mkdir:*), Bash(sed --sandbox:*), Read, Agent
 ---
 
 # Peach Main Analysis Check
@@ -14,6 +14,21 @@ This skill fetches the latest Peach Main Analysis GitHub Actions run, classifies
 and produces a summary. It is self-contained: all instructions are in this file and in
 `docs/peach-main-analysis.md`.
 
+## Prerequisites
+
+Before running this skill, ensure:
+
+- `gh` is installed and authenticated (`gh auth status`)
+- the current GitHub identity has access to `SonarSource/peachee-js`
+- the environment permits outbound GitHub API requests
+- parallel `gh api` calls are allowed, since failed jobs should be triaged concurrently
+
+## Command discipline
+
+**Never chain commands** with `&&`, `;`, or `|`. Each command in this skill must be issued as a
+separate Bash call. Chaining bypasses the per-tool permission prompts that allow the user to
+review each action individually.
+
 ## Invocation
 
 ```
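Review note on the new Prerequisites section: the checks can be scripted. Below is a minimal preflight sketch, assuming only that `gh` and `sed` are the CLIs the skill shells out to; it reports missing tools rather than aborting, so it stays safe to run anywhere. It is illustrative and not part of the commit.

```shell
# Preflight sketch for the Prerequisites list (illustrative only).
# Collects missing tools instead of exiting, so the caller decides what to do.
missing=""
for tool in gh sed; do
  command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
done
if [ -n "$missing" ]; then
  echo "missing tools:$missing"
else
  echo "tooling present"
fi
```

Authentication (`gh auth status`) and repository access still need a live check against GitHub, so they are deliberately left out of the sketch.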
@@ -25,13 +40,7 @@ and produces a summary. It is self-contained: all instructions are in this file
 
 ## Execution
 
-You are the **orchestrator**. Follow these phases exactly.
-
----
-
-### Phase 1: Fetch failed jobs
-
-**Step 1.1 — Find the run to analyse**
+**Step 1 — Find the run to analyse**
 
 If the user provided a `run-id`, use it directly.
 
@@ -43,25 +52,46 @@ gh run list \
 --workflow main-analysis.yml \
 --branch js-ts-css-html \
 --limit 5 \
---json databaseId,conclusion,createdAt,status
+--json databaseId,conclusion,createdAt,status \
+--jq '[.[] | select(.status == "completed")] | first | {databaseId, conclusion, createdAt}'
+```
+
+This prints the `databaseId`, `conclusion`, and `createdAt` of the most recent completed run (meaning finished running, not necessarily passed — a completed run can have failed jobs, which is what we're looking for). Record `databaseId` as `RUN_ID`.
+
+**Step 1b — Rerun if the run was cancelled**
+
+If the run `conclusion` is `"cancelled"`, the run did not finish normally — some jobs were cut short before they could produce results. Rerun the cancelled/failed jobs automatically:
+
+```bash
+gh run rerun RUN_ID --repo SonarSource/peachee-js --failed
+```
+
+Then print:
+
+```
+⚠️ Run RUN_ID (DATE) was cancelled before completion.
+Rerun triggered for all failed/cancelled jobs. Check back once the rerun completes.
 ```
 
-Pick the most recent run with `status == "completed"` (meaning finished running, not necessarily passed — a completed run can have failed jobs, which is what we're looking for). Record its `databaseId` as `RUN_ID`.
+Then stop — do not attempt to triage the incomplete results.
 
-**Step 1.2 — Collect all failed jobs**
+**Step 2 — Collect all failed jobs**
 
 The run has ~250 jobs across 3 pages. Fetch all three pages and collect jobs where
 `conclusion == "failure"`:
 
 ```bash
-gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=1"
-gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=2"
-gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=3"
+gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=1" \
+--jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]'
+gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=2" \
+--jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]'
+gh api "repos/SonarSource/peachee-js/actions/runs/RUN_ID/jobs?per_page=100&page=3" \
+--jq '[.jobs[] | select(.conclusion == "failure") | {name, id, completedAt}]'
 ```
 
-For each page, collect jobs where `conclusion == "failure"`. Record each job's `name` and `id`.
+Each command outputs only the failed jobs for that page. `completedAt` may be `null` — see Step 7 for handling.
 
-**Step 1.3 — Early exit if no failures**
+**Step 3 — Early exit if no failures**
 
 If there are no failed jobs, print:
 
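Review note on Step 2: the collect-only-failures loop is easy to dry-run without touching the API by putting a stub in place of `gh api`. Everything below is hypothetical demo data (the `fetch_page` helper and the job names); only the keep-failures filtering mirrors what the `--jq` expression does.

```shell
# Sketch of Step 2's page-by-page collection, with a stub standing in
# for `gh api`. fetch_page and its data are invented for illustration.
fetch_page() {
  # Emit "name conclusion" pairs, as if extracted from one API page.
  case "$1" in
    1) printf 'gutenberg failure\nhono success\n' ;;
    2) printf 'builderbot failure\n' ;;
    *) : ;;  # later pages empty
  esac
}

failed=""
for page in 1 2 3; do
  while IFS=' ' read -r name conclusion; do
    # Keep only jobs whose conclusion is "failure", like the --jq filter.
    if [ "$conclusion" = "failure" ]; then
      failed="$failed $name"
    fi
  done <<EOF
$(fetch_page "$page")
EOF
done

echo "failed jobs:$failed"   # prints: failed jobs: gutenberg builderbot
```

Feeding the loop via a heredoc rather than a pipe keeps `failed` in the current shell, so the result survives the loop.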
@@ -71,81 +101,175 @@ If there are no failed jobs, print:
 
 Then stop.
 
-**Step 1.4 — Create tasks for each failure**
+**Step 4 — Mass failure detection**
 
-For each failed job, create a task:
+If **≥80% of jobs failed** (e.g. 200+ out of 253), this indicates a single shared root cause.
+Do not triage every job individually.
 
+Instead:
+1. Sample 5 representative jobs (spread across pages 1–3)
+2. Run the Phase 2 filter on each (see below) to classify each sampled job individually, including sensor and stack trace origin
+3. If **any** sampled job is CRITICAL, the mass verdict is CRITICAL — CRITICAL takes priority regardless of how many other jobs match an IGNORE pattern
+4. Otherwise, apply the shared pattern's verdict to all failed jobs
+5. In the summary, note the mass event and list only the sampled jobs as evidence
+
+**Mass failure verdict rules — check in this order:**
+
+- **CRITICAL** if the shared stack trace originates in `org.sonar.plugins.javascript` — the
+SonarJS plugin itself is broken (e.g. fails to initialize, crashes during analysis). This
+takes priority over any infrastructure explanation: if the SonarJS plugin is at fault, it
+must be fixed before release regardless of how many jobs are affected.
+
+- **IGNORE** if the shared error is a Peach infrastructure failure with no SonarJS involvement:
+  - Peach server down: `HTTP 502 Bad Gateway` at `peach.sonarsource.com/api/server/version`
+  - Artifact expired: `Artifact has expired (HTTP 410)` during JAR download — **only when
+exit code 1 is the sole failure**. If exit code 3 also appears, the analysis still ran
+after the download failure; treat exit code 3 as a separate failure and check it for
+SonarJS involvement before concluding IGNORE.
+
+- **NEEDS-MANUAL-REVIEW** if the pattern does not match either of the above.
+
+**Step 5 — Read the classification guide and triage all logs**
+
+Read `docs/peach-main-analysis.md` once to load the failure categories and decision flowchart.
+
+Create the work directory where logs will be stored for inspection:
+
+```bash
+mkdir -p target/peach-logs
 ```
-TaskCreate subject: "Assess peach failure: JOB_NAME"
-description: "Classify the failure of job JOB_NAME (id: JOB_ID) in Peach Main Analysis run RUN_ID.
-Download the logs and classify using docs/peach-main-analysis.md."
-```
 
-**Step 1.5 — Launch parallel assessment agents**
+Then triage each failed job using a graduated approach. Work through phases as needed — stop as
+soon as a job can be classified. Run all jobs in parallel within each phase.
+
+**Phase 1 — Download log and filter for failure signals (always, all jobs in parallel)**
 
-Before launching agents, download the logs for ALL failed jobs:
+Download the log to disk, then filter for key failure signals. Saving to disk avoids re-downloading
+in Phase 2 and leaves logs available for manual inspection after the run. Do NOT use `tail -40`:
+cleanup steps often run after the scan step fails (e.g. always-run SHA extraction), pushing the
+exit code out of the tail window. A multi-line `sed -n` script is more reliable and easier to
+maintain than one long regular expression. `--sandbox` prevents sed from executing shell commands
+via the `e` command, which is a risk when processing untrusted log content:
 
 ```bash
-gh api "repos/SonarSource/peachee-js/actions/jobs/JOB_ID/logs"
+gh api "repos/SonarSource/peachee-js/actions/jobs/JOB_ID/logs" \
+> target/peach-logs/JOB_ID.log
+sed --sandbox -n '
+/Process completed with exit code/p
+/EXECUTION FAILURE/p
+/OutOfMemoryError/p
+/502 Bad Gateway/p
+/503 Service Unavailable/p
+/Artifact has expired/p
+/All 3 attempts failed/p
+/ERR_PNPM/p
+/ERESOLVE/p
+/ETARGET/p
+/notarget/p
+/Invalid value of sonar/p
+/does not exist for/p
+' target/peach-logs/JOB_ID.log
 ```
 
-Run this for each failed job. Store the output (trim to the most relevant ~100 lines if very long — keep lines containing ERROR, exit code, exception names, step group headers). You will pass these logs inline to each agent.
+Use the decision flowchart and failure categories from `docs/peach-main-analysis.md` to classify
+the filtered output. If the filtered lines show exit code 3 (EXECUTION FAILURE from the
+SonarQube scanner), always continue to Phase 2 — Phase 1 does not surface Java stack traces,
+so SonarJS plugin involvement cannot be ruled out from Phase 1 alone.
 
-Then update each agent prompt to include the logs inline. The per-job agent prompt becomes (fill in JOB_NAME, JOB_ID, TASK_ID, LOG_CONTENT):
+**Phase 2 — Sensor and stack trace filter (for exit code 3 failures)**
 
-In a SINGLE message, launch one Agent tool call per failed job. All agents run concurrently.
-Each agent receives this prompt (fill in JOB_NAME, JOB_ID, TASK_ID, LOG_CONTENT):
+When Phase 1 shows exit code 3, run this to find the last sensor that ran and surface any
+SonarJS plugin stack trace. The log is already on disk from Phase 1 — no re-download needed:
 
+```bash
+sed --sandbox -n '
+/Sensor /p
+/EXECUTION FAILURE/p
+/OutOfMemoryError/p
+/Process completed with exit code/p
+/org\.sonar\.plugins\.javascript/p
+' target/peach-logs/JOB_ID.log
 ```
-You are assessing a failed job in the Peach Main Analysis workflow.
 
-Job: JOB_NAME
-Job ID: JOB_ID
-Task ID: TASK_ID (mark this task complete when done)
+This surfaces both the last sensor that ran and any `org.sonar.plugins.javascript` frames in the
+stack trace. Apply the classification rules in `docs/peach-main-analysis.md` and run this only
+for jobs that need it, all concurrently.
 
-Your steps:
-0. Working directory: use the directory where this skill was invoked (run `pwd` to confirm). Verify that `docs/peach-main-analysis.md` exists before reading it.
-1. Read docs/peach-main-analysis.md to understand failure categories and the decision flowchart
-2. The job logs are provided below between the <job-logs> tags. Read them carefully.
+**Phase 3 — Full log (only when Phase 2 is still ambiguous)**
 
-<job-logs>
-LOG_CONTENT
-</job-logs>
+If the failure still cannot be classified (unrecognised stack trace, unexpected exit code), read
+the full log from disk using the `Read` tool on `target/peach-logs/JOB_ID.log`. This should be
+rare.
 
-3. Classify the failure using the decision flowchart in docs/peach-main-analysis.md
-4. Mark the task TASK_ID as complete
-5. Return a structured assessment:
+**Step 6 — Classify each job**
 
+Using the decision flowchart from the classification guide, classify each job directly from the
+logs. Most failures are unambiguous (clone timeout, dep install failure, project misconfiguration)
+and need no further help.
+
+**Only launch parallel agents when** a job's logs are ambiguous — e.g. an unfamiliar stack trace
+or an exit code that doesn't match any known category. Launch one Agent per ambiguous job,
+all concurrently, passing the classification rules and log excerpt inline:
+
+```
+You are assessing a failed job in the Peach Main Analysis workflow.
+
+Job: JOB_NAME
+Classification rules: <classification-rules>CLASSIFICATION_RULES</classification-rules>
+Job logs: <job-logs>LOGS</job-logs>
+
+Classify and return:
 Job: JOB_NAME
 Verdict: CRITICAL | IGNORE | NEEDS-MANUAL-REVIEW
-Category: <category name from docs/peach-main-analysis.md>
-Evidence: <the key log line(s) that led to this verdict, max 2 lines>
-
-Do not do anything else. Just classify and return the assessment.
+Category: <category name>
+Evidence: <key log line(s), max 2>
 ```
 
-**Step 1.6 — Collect results and print summary**
+If an agent returns no structured assessment, record that job as `NEEDS-MANUAL-REVIEW` with
+evidence `Agent returned no output`.
+
+**Step 7 — Check for clustered failures**
 
-Wait for all agents to return. If any agent returned no structured assessment, record that job as `NEEDS-MANUAL-REVIEW` with evidence `Agent returned no output`. Then print the summary table:
+If 2 or more jobs share the same category, check whether they failed within a
+5-minute window. Use `completedAt` timestamps if available; otherwise extract the timestamp prefix
+from log lines (format: `2026-MM-DDTHH:MM:SS.`). If clustered, record a general note for the
+summary, for example:
+> ⚠️ N jobs failed with the same pattern within a 5-minute window — likely caused by a single infrastructure event.
+
+**Step 8 — Print summary**
+
+Sort rows by verdict: CRITICAL first, then NEEDS-MANUAL-REVIEW, then IGNORE.
+Place the Category column first. After the verdict counts and release recommendation, list any
+general notes collected during log analysis (for example clustered failures or mass-failure
+observations):
 
 ```
 ## Peach Main Analysis — Run RUN_ID (DATE)
 
-| Job | Verdict | Category | Evidence |
-|-------------|-----------------------|---------------------------|----------------------------------------------|
-| gutenberg | 🔴 CRITICAL | Analyzer crash | IllegalArgumentException: invalid line offset |
-| builderbot | ✅ IGNORE | Dep install failure | ERR_PNPM_OUTDATED_LOCKFILE |
-| hono | ✅ IGNORE | Dep install failure | ETARGET: No matching version for @hono/... |
+| Category | Job | Verdict | Evidence |
+|---------------------------|-------------|-----------------------|----------------------------------------------|
+| Analyzer crash | gutenberg | 🔴 CRITICAL | IllegalArgumentException: invalid line offset |
+| Dep install failure | builderbot | ✅ IGNORE | ERR_PNPM_OUTDATED_LOCKFILE |
+| Dep install failure | hono | ✅ IGNORE | ETARGET: No matching version for @hono/... |
 
 ### Summary
 - 🔴 CRITICAL: N jobs — investigate before release
 - ⚠️ NEEDS-MANUAL-REVIEW: N jobs — manual check required
 - ✅ IGNORE: N jobs — unrelated to SonarJS analyzer
 
 **Release recommendation:** SAFE / NOT SAFE / REVIEW NEEDED
+
+### Notes
+- ⚠️ N jobs failed with the same pattern within a 5-minute window — likely caused by a single infrastructure event.
 ```
 
 The release recommendation is:
 - **SAFE** — zero CRITICAL or NEEDS-MANUAL-REVIEW jobs
 - **NOT SAFE** — one or more CRITICAL jobs
 - **REVIEW NEEDED** — zero CRITICAL but one or more NEEDS-MANUAL-REVIEW jobs
+
+**Step 9 — Update docs if a new failure pattern was found**
+
+If any job was classified as NEEDS-MANUAL-REVIEW and you identified its root cause during this
+session, update `docs/peach-main-analysis.md` with a new category entry. This keeps the
+classification guide current for future runs.
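Review note on Phase 1: the sed filter can be exercised against a fabricated log. The log lines below are invented, JOB_ID `12345` is a placeholder, and `--sandbox` (GNU sed only) is dropped here for portability; the skill itself keeps it because it processes untrusted log content.

```shell
# Demo of the Phase 1 failure-signal filter on a fabricated log excerpt.
mkdir -p target/peach-logs
cat > target/peach-logs/12345.log <<'EOF'
2026-01-15T02:10:01. INFO Sensor JavaScript analysis [javascript]
2026-01-15T02:14:52. ERROR EXECUTION FAILURE
2026-01-15T02:14:53. Process completed with exit code 3.
2026-01-15T02:14:54. Post job cleanup: extracting SHA
EOF

# Same shape as the skill's script, shortened to three patterns and
# without --sandbox (GNU sed only). Matching lines print in file order.
signals=$(sed -n '
/Process completed with exit code/p
/EXECUTION FAILURE/p
/OutOfMemoryError/p
' target/peach-logs/12345.log)

printf '%s\n' "$signals"
```

Here the filter surfaces the EXECUTION FAILURE and exit-code-3 lines while dropping the cleanup noise; per the skill's rules, exit code 3 means Phase 2 runs next.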

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,7 @@ bin/
 .flattened-pom.xml
 
 lib/
+node_modules
 node_modules/
 coverage/
 
@@ -39,6 +40,7 @@ sonarjs-1.0.0.tgz
 .vs/
 
 # VS Code
+.vscode
 .vscode/
 
 # ---- Mac OS X
