[aw-failures] [aw] Failure Investigation Report — 2026-04-20 13:09–19:09 UTC #27410

@github-actions

Executive Summary

6-hour window analysis (2026-04-20T13:09–19:09 UTC) surfaced 3 distinct failure modes across 31 total runs. Two are already tracked; one new failure pattern (chatgpt.com firewall block in the Codex AI Moderator) has no existing coverage and is tracked in sub-issue #27412.

Failure Clusters

| Cluster | Runs | Engine | Root Cause | Tracked? | Priority |
| --- | --- | --- | --- | --- | --- |
| AI Moderator — OpenAI 401 Unauthorized | §24680519515 | Codex | Missing bearer auth header for api.openai.com | #27404 | P1 |
| AI Moderator — chatgpt.com firewall block | §24681803841 | Codex | 1 blocked egress to chatgpt.com:443 | #27412 | P1 |
| Design Decision Gate — safeoutputs MCP drop | §24680939211 | Claude Code | HTTP connection to safeoutputs MCP dropped after 63s | #27405 | P1 |
| Smoke CI cancellations | 24681457679, 24680574781, 24680545696 | | Superseded by newer pushes on main | ✗ (expected) | |

Evidence

AI Moderator — chatgpt.com firewall block (run 24681803841)
  • Engine: Codex v0.121.0 (auto model)
  • Trigger: issues event, actor: verkyyi
  • Firewall: api.openai.com:443 — 13 allowed; chatgpt.com:443 — 1 blocked; github.com:443 — 2 allowed
  • 0 agent turns recorded despite agent job running 1.4 minutes
  • No auth errors; OpenAI credentials functional for the 13 allowed calls
  • Log file: 1 MB of agent stdio recorded despite 0 parsed turns
  • Different failure mode from issue [aw] AI Moderator failed #27404 (which is a 401 auth error on a different run)
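
The firewall summary for this run can be reproduced from per-request records. The following is a minimal sketch, assuming a hypothetical record format of (destination, decision) pairs; the actual firewall log schema does not appear in this report, so `tally`, `unexpected_destinations`, and the contents of `ALLOWLIST` are illustrative only.

```python
from collections import Counter

# Hypothetical allowlist; the real firewall configuration is not shown in this report.
ALLOWLIST = {"api.openai.com:443", "github.com:443"}

def tally(records):
    """Count allowed and blocked requests per destination.

    `records` is an iterable of (destination, decision) pairs, where
    decision is "allowed" or "blocked" (an assumed format).
    """
    counts = {}
    for dest, decision in records:
        counts.setdefault(dest, Counter())[decision] += 1
    return counts

def unexpected_destinations(counts, allowlist=ALLOWLIST):
    """Return blocked destinations that are absent from the allowlist, sorted."""
    return sorted(d for d, c in counts.items() if c["blocked"] and d not in allowlist)

# Records mirroring the counts reported for run 24681803841.
records = (
    [("api.openai.com:443", "allowed")] * 13
    + [("chatgpt.com:443", "blocked")]
    + [("github.com:443", "allowed")] * 2
)
counts = tally(records)
print(unexpected_destinations(counts))
```

With the counts reported above, the only destination flagged is chatgpt.com:443, which matches the new failure pattern tracked in #27412.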

AI Moderator — OpenAI 401 (run 24680519515)
  • Engine: Codex, trigger: issues event, actor: Daidanny008
  • Error: unexpected status 401 Unauthorized: Missing bearer or basic authentication in header
  • URL: api.openai.com/v1/responses
  • Reconnect loop: 5 retries before terminal failure
  • Already tracked in [aw] AI Moderator failed #27404
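
Because the error reports a missing bearer or basic authentication header, a pre-flight check could surface the problem before the agent starts its reconnect loop. This is a hedged sketch, not the workflow's actual setup: the helper `build_auth_headers` is hypothetical, and the environment variable name `OPENAI_API_KEY` is an assumption about how the credential is supplied.

```python
import os

def build_auth_headers(env=os.environ, var="OPENAI_API_KEY"):
    """Build the Authorization header, failing early if the token is missing.

    `var` is an assumed variable name; adjust it to whatever secret the
    workflow actually injects.
    """
    token = env.get(var, "").strip()
    if not token:
        raise RuntimeError(f"{var} is empty or unset; request would be sent without auth")
    return {"Authorization": f"Bearer {token}"}

# Usage with an explicit mapping so the example does not depend on the real environment.
headers = build_auth_headers({"OPENAI_API_KEY": "sk-example"})
print(headers)

try:
    build_auth_headers({})
except RuntimeError as exc:
    print("pre-flight failed:", exc)
```

Failing at this point would turn five retries against api.openai.com/v1/responses into a single, clearly labelled configuration error.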

Design Decision Gate — safeoutputs MCP drop (run 24680939211)
  • Engine: Claude Code, trigger: PR copilot/add-codemod-for-serena-conversion
  • Agent completed 6 turns; wrote ADR file; attempted git add docs/adr/274...
  • safeoutputs MCP: HTTP connection dropped after 63s uptime
  • detection and safe_outputs jobs were skipped as a result
  • 0/12 network requests blocked (not a firewall issue — MCP-layer timeout)
  • Already tracked in [aw] Design Decision Gate 🏗️ failed #27405

Existing Issue Correlation

| Issue | Status | Assessment |
| --- | --- | --- |
| #27404 — AI Moderator OpenAI 401 | Open | Still active; run 24680519515 confirms recurrence |
| #27405 — Design Decision Gate safeoutputs MCP drop | Open | Transient or recurring; run 24680939211 confirms occurrence |
| #27396 — No-Op Runs tracker | Open | Normal operation tracker, no action needed |

No issues to close — all open issues reflect genuinely unresolved failure modes.

Fix Roadmap

| Priority | Issue | Action |
| --- | --- | --- |
| P1 | #27412 (new) | Investigate why Codex v0.121.0 calls chatgpt.com; add to allowlist or fix agent behavior |
| P1 | #27404 | Investigate intermittent OpenAI API 401 auth failures for Codex engine |
| P1 | #27405 | Fix safeoutputs MCP connection stability (timeout threshold or retry logic) |
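
For the #27405 action, the retry-logic option could take a shape like the sketch below. It is generic Python and not the safeoutputs MCP client: `call_with_retry`, its parameters, and the use of `ConnectionError` to model a dropped HTTP connection are all assumptions made for illustration.

```python
import time

def call_with_retry(fn, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fn(); on ConnectionError, wait and retry with exponential backoff.

    Re-raises the last ConnectionError once `attempts` calls have failed.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)

# Stand-in for an MCP call whose connection drops twice before succeeding.
failures_left = [2]

def flaky_call():
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("connection dropped")
    return "ok"

# Record the delays instead of sleeping so the example runs instantly.
delays = []
result = call_with_retry(flaky_call, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the backoff schedule testable without real waits. Whether retrying is appropriate depends on whether the dropped call is safe to repeat, which this report does not establish.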

Sub-Issues Created

References:
