Skip to content

feat: add skip_permissions, continue command, and build retry for resilient opencode automation#2

Open
jc01rho wants to merge 3 commits into
cortexkit:mainfrom
jc01rho:main
Open

feat: add skip_permissions, continue command, and build retry for resilient opencode automation#2
jc01rho wants to merge 3 commits into
cortexkit:mainfrom
jc01rho:main

Conversation

@jc01rho

@jc01rho jc01rho commented Jun 12, 2026

Copy link
Copy Markdown

Adds automatic retry, permission bypass, and session resume to opencode automation.

Features

  • Provider error retry: Auto-retries opencode run on transient provider errors (rate limits, 5xx, network failures) using --continue to resume the interrupted session
  • Build retry: Re-runs opencode when build artifacts are missing or session/build errors occur
  • Permission bypass: New skip_permissions config (default: true) passes --dangerously-skip-permissions to skip interactive permission prompts in automation
  • Session resume: New orw continue command resumes the last interrupted opencode session without re-running git clone/bun install

Config fields

  • retry_attempts: max provider error retry attempts (default: 3)
  • retry_delay_ms: delay between provider retries in ms (default: 10000)
  • build_retry_attempts: max build retry attempts (default: 3)
  • build_retry_delay_ms: delay between build retries in ms (default: 30000)
  • skip_permissions: pass --dangerously-skip-permissions to opencode (default: true)

Implementation

  • src/retry.ts: isRetryableError, readLogSlice, fileSize with per-attempt log isolation
  • src/index.ts: Two nested retry loops: runOpenCodeWithRetry (provider) and runOpenCodeWithBuildRetry (build artifacts)
  • src/retry.test.ts: 11 tests covering pattern detection, per-attempt log slicing, false-positive guards

Commands

orw check # Build latest release with full retry
orw check --force # Force rebuild even if already processed
orw continue # Resume last interrupted opencode session

Greptile Summary

This PR adds automatic retry logic, permission bypass, and session resume to the opencode automation pipeline. Two nested retry loops (runOpenCodeWithRetry for provider errors and runOpenCodeWithBuildRetry for missing artifacts) replace the previous single-shot opencode run call, and a new orw continue command wires into the same retry path.

  • src/retry.ts (new): pure-function helpers isRetryableError, readLogSlice, and fileSize with per-attempt log isolation to avoid false-positive retries from accumulated log content.
  • src/index.ts: replaces the inline opencode run block with runOpenCodeWithBuildRetryrunOpenCodeWithRetry; adds continueRun, isBuildRetryableError, and five new config fields.
  • src/retry.test.ts (new): 11 unit tests covering pattern detection and per-attempt log slicing.

Confidence Score: 4/5

Mostly safe to merge; the duplicate writeState call should be fixed first.

The check function calls writeState(cfg, next) twice in a row — a copy-paste artifact at the seam between the old and new code. While currently idempotent, it is a structural mistake that would break if writeState ever gains append semantics. The retry and permission-bypass logic itself is well-structured and covered by tests.

src/index.ts — the duplicate writeState and two dead variables (afterSize, logOffset) in the new retry functions.

Important Files Changed

Filename Overview
src/index.ts Adds continueRun, runOpenCodeWithBuildRetry, runOpenCodeWithRetry, and isBuildRetryableError. Contains a duplicate writeState call in check, a dead afterSize variable, and a dead logOffset variable in runOpenCodeWithRetry.
src/retry.ts New module providing isRetryableError, readLogSlice, and fileSize. Logic is clean; HTTP-status regexes use word-boundary context guards to avoid false positives.
src/retry.test.ts 11 unit tests covering isRetryableError — pattern matching, false-positive guards, and per-attempt log isolation. Tests are correct and match the implementation.
README.md Documents the new retry_attempts / retry_delay_ms config fields and orw continue command. Does not document build_retry_* and skip_permissions fields.
config.example.json Adds retry_attempts and retry_delay_ms to the example config. Does not include build_retry_attempts, build_retry_delay_ms, or skip_permissions.

Sequence Diagram

sequenceDiagram
    participant CLI as orw check / orw continue
    participant BRetry as runOpenCodeWithBuildRetry
    participant PRetry as runOpenCodeWithRetry
    participant OC as opencode run
    participant VB as verifyBuild

    CLI->>BRetry: prompt, env, log, release
    loop build_retry_attempts
        BRetry->>PRetry: cfg, prompt, env, log
        loop retry_attempts
            PRetry->>OC: opencode run
            alt success
                OC-->>PRetry: exit 0
                PRetry-->>BRetry: return
            else retryable provider error
                OC-->>PRetry: exit 1
                PRetry->>PRetry: isRetryableError(err, logSlice)
                PRetry->>PRetry: sleep(retry_delay_ms)
            else non-retryable error
                PRetry-->>BRetry: throw
            end
        end
        BRetry->>VB: verifyBuild
        alt artifacts verified
            VB-->>BRetry: State
            BRetry-->>CLI: State
        else isBuildRetryableError
            VB-->>BRetry: throw
            BRetry->>BRetry: sleep(build_retry_delay_ms)
        else non-retryable
            BRetry-->>CLI: throw
        end
    end
Loading

Reviews (8): Last reviewed commit: "feat: add build retry logic with artifac..." | Re-trigger Greptile

When opencode run fails with retryable provider errors (Provider
returned error, rate limits, 5xx, network timeouts), ORW now
automatically retries using `opencode run --continue` to resume
the interrupted session.

New config fields:
- retry_attempts: max attempts including first (default 3)
- retry_delay_ms: delay between retries (default 10000ms)

Retryable patterns detected from log output:
- Provider returned error, rate limit, overloaded
- HTTP 429/500/502/503
- Network errors (ECONNRESET, ETIMEDOUT, socket hang up)
Copilot AI review requested due to automatic review settings June 12, 2026 17:08

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds configurable retry behavior for opencode run failures that appear to be transient provider errors, improving resilience of the check workflow.

Changes:

  • Introduces retry_attempts / retry_delay_ms to config types, defaults, init config, and example config.
  • Wraps OpenCode execution in runOpenCodeWithRetry() and retries with opencode run --continue on retryable errors.
  • Documents the new retry behavior and config options in README/help text.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/index.ts Adds retry config + new retry wrapper around opencode run and log-based retryable error detection.
config.example.json Shows the new retry configuration keys and defaults.
README.md Documents retry behavior and adds the new config keys to the example.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/index.ts
Comment on lines +420 to +450
let lastErr: unknown;

for (let attempt = 1; attempt <= cfg.retry_attempts; attempt++) {
const isRetry = attempt > 1;
if (isRetry) {
out(`Retry attempt ${attempt}/${cfg.retry_attempts} after provider error (waiting ${cfg.retry_delay_ms / 1000}s)...`);
await note(log, `\n--- Retry attempt ${attempt}/${cfg.retry_attempts} ---\n`);
await sleep(cfg.retry_delay_ms);
}

const args = [cfg.opencode_bin, "run"];
if (isRetry) args.push("--continue");
args.push("--agent", cfg.agent, "--model", cfg.model, prompt);

try {
await run(args, {
cwd: cfg.work_repo,
log,
env: envWithFlag,
});
return;
} catch (err) {
lastErr = err;
if (!isRetryableError(err, log)) {
throw err;
}
out(`Attempt ${attempt} failed with retryable provider error.`);
}
}

throw lastErr;
Comment thread src/index.ts Outdated

if (msg.includes("exited with")) {
try {
const logContent = readFileSync(log, "utf8").toLowerCase();
Comment thread src/index.ts Outdated
Comment thread src/index.ts
Comment thread src/index.ts Outdated
Comment thread src/index.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 3 files

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread src/index.ts

@ualtinok ualtinok left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR — the retry behavior is useful, but I don't think this is safe to merge yet.

Please address these before we reconsider:

  1. Validate retry_attempts. Right now retry_attempts: 0 skips opencode run entirely and ends up throwing undefined. Since attempts include the initial run, values below 1 should either be rejected clearly or clamped to 1.

  2. Classify retryability from only the current attempt's output/log slice. The implementation scans the full accumulated log file, so after attempt 1 logs a retryable provider error, any later non-retryable failure can still be treated as retryable because the old message remains in the shared log. Capture a log offset before each attempt, or otherwise inspect only output from that attempt.

  3. Tighten the HTTP matching. Bare "429", "502", and "503" can match unrelated numbers in logs. Please match provider/HTTP error context instead.

  4. Add focused regression coverage for the retry behavior: invalid/zero attempts, retry using --continue, per-attempt retry classification, and a non-retryable second failure not being retried because of stale first-attempt log content.

Typecheck passes, and the feature direction makes sense, but these failure-mode issues need to be fixed before merge.

@jc01rho jc01rho requested a review from ualtinok June 13, 2026 16:15
1. Clamp retry_attempts to minimum 1 to prevent undefined throw
2. Use per-attempt log slicing to avoid stale retryable errors
3. Tighten HTTP status code matching with context patterns
4. Add focused regression tests for retry behavior

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/index.ts">

<violation number="1" location="src/index.ts:447">
P1: Reading the log slice immediately after the process exits can miss provider-error text because `exec` appends log chunks asynchronously via unawaited `void write(chunk)`. Missing error text causes `isRetryableError` to return false, skipping a needed retry.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread src/index.ts
Comment thread src/retry.ts
@socket-security

socket-security Bot commented Jun 14, 2026

Copy link
Copy Markdown

No dependency changes detected. Learn more about Socket for GitHub.

👍 No dependency changes detected in pull request

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/index.ts">

<violation number="1" location="src/index.ts:447">
P1: Reading the log slice immediately after the process exits can miss provider-error text because `exec` appends log chunks asynchronously via unawaited `void write(chunk)`. Missing error text causes `isRetryableError` to return false, skipping a needed retry.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

Comment thread src/index.ts
Comment thread src/index.ts Outdated
Comment thread src/index.ts

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="prompt.txt">

<violation number="1" location="prompt.txt:24">
P2: Step 5's bug search runs after the build but before verification without an explicit rebuild step, creating ambiguity about whether fixes require recompilation. If bugs are fixed after step 4, the binary verified in steps 6-7 would be stale. The phrase 'before final build' implies a rebuild is needed but no such step is present.</violation>
</file>

Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

Comment thread prompt.txt Outdated
4. Build the host CLI in `packages/opencode` with `OPENCODE_CHANNEL={{channel}} OPENCODE_VERSION={{version}} bun run build -- --single`.
5. Ensure the built CLI binary exists at `{{cli}}`.
6. Run `{{cli}} --version` and verify the output is exactly `{{version}}`.
5. Search for session ID encoding bugs in merged PRs before final build. Focus on:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Step 5's bug search runs after the build but before verification without an explicit rebuild step, creating ambiguity about whether fixes require recompilation. If bugs are fixed after step 4, the binary verified in steps 6-7 would be stale. The phrase 'before final build' implies a rebuild is needed but no such step is present.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At prompt.txt, line 24:

<comment>Step 5's bug search runs after the build but before verification without an explicit rebuild step, creating ambiguity about whether fixes require recompilation. If bugs are fixed after step 4, the binary verified in steps 6-7 would be stale. The phrase 'before final build' implies a rebuild is needed but no such step is present.</comment>

<file context>
@@ -21,8 +21,13 @@ Exact tasks:
 4. Build the host CLI in `packages/opencode` with `OPENCODE_CHANNEL={{channel}} OPENCODE_VERSION={{version}} bun run build -- --single`.
-5. Ensure the built CLI binary exists at `{{cli}}`.
-6. Run `{{cli}} --version` and verify the output is exactly `{{version}}`.
+5. Search for session ID encoding bugs in merged PRs before final build. Focus on:
+   - Auto-resume plugins (`opencode-auto-resume`) that may pass URL-encoded placeholder values like `%7Bid%7D` instead of real session IDs
+   - Session validation code that expects strings starting with `ses` but may receive encoded placeholders
</file context>

@jc01rho jc01rho changed the title feat: retry opencode run on provider errors using --continue feat: add skip_permissions, continue command, and build retry for resilient opencode automation Jun 14, 2026
Comment thread src/index.ts
Comment on lines +386 to +411
async function continueRun(cfg: Cfg) {
const release = await latest(cfg);
const prev = await readState(cfg);
if (!prev.tag) throw new Error("No previous build state found. Run `orw check` first.");

const free = await hold(cfg, true);
try {
const log = prev.log ?? path.join(
logDir(cfg),
`${stamp()}-${release.tag_name.replaceAll("/", "-")}-continue.log`,
);
await fs.mkdir(path.dirname(log), { recursive: true });
await note(log, `\n--- Resuming: continue run for ${release.tag_name} ---\n`);

const env = releaseEnv(release);
const sources = resolveSources(cfg);
const prompt = await render(cfg, sources, release);

out(`Resuming last opencode session for ${release.tag_name}...`);
const next = await runOpenCodeWithBuildRetry(cfg, prompt, env, log, release);
await writeState(cfg, next);
out(`Continue completed for ${release.tag_name}`);
} finally {
await free();
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 orw continue never actually resumes the interrupted session

continueRun delegates to runOpenCodeWithBuildRetryrunOpenCodeWithRetry, which only adds --continue when isRetry = attempt > 1. On the first attempt, isRetry is false, so opencode run is invoked without --continue — starting a brand-new session in the existing work directory rather than resuming the interrupted one. The flag only reaches opencode if that fresh run happens to fail with a retryable provider error and then retries.

To fix this, runOpenCodeWithRetry (and by extension runOpenCodeWithBuildRetry) needs an option like { forceContinue?: boolean } that continueRun can pass through to make --continue unconditional on every attempt.

1. Add build_retry_attempts/build_retry_delay_ms config fields
2. Implement isMissingArtifactError() helper to detect artifact completion
3. Add retry loop around verifyBuild that waits for background artifacts
4. Update help text and config.example.json with new options
5. Add focused regression tests for build retry behavior

This enables ORW to automatically retry opencode builds when artifacts
aren't immediately available, which handles the case where bun test
background processes take longer than 8 seconds to complete.

Fixes: build artifact retry for background processes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants