[Pelis Agent Factory Advisor] Agentic Workflow Advisor: Maturity Assessment & Recommendations (2026-04-18) #2084
📊 Executive Summary
gh-aw-firewall is one of the most mature agentic workflow repositories I've analyzed: 25 agentic .md workflows spanning security, testing, documentation, issue management, token optimization, and smoke testing across three AI engines. The primary opportunities are filling a few symmetry gaps (Codex token analysis, container image scanning) and closing minor CI monitoring blind spots rather than building foundational automation from scratch.
🎓 Patterns Learned vs. Current Repo
📋 Workflow Inventory
build-test, security-guard, smoke-claude, smoke-codex, smoke-copilot, smoke-copilot-byok, smoke-chroot, smoke-opencode, smoke-services, ci-doctor, secret-digger-{claude,codex,copilot}, security-review, dependency-security-monitor, claude-token-usage-analyzer, claude-token-optimizer, copilot-token-usage-analyzer, copilot-token-optimizer, doc-maintainer, cli-flag-consistency-checker, update-release-notes, issue-monster, issue-duplication-detector, firewall-issue-dispatcher, ci-cd-gaps-assessment, test-coverage-improver, plan, pelis-agent-factory-advisor
🚀 Recommendations
P0 — High Impact, Low Effort (implement immediately)
1. Codex Token Usage Analyzer + Optimizer
What: Mirror claude-token-usage-analyzer.md and claude-token-optimizer.md for the Codex engine.
Why: You have token analyzers for Claude and Copilot but not Codex. With smoke-codex and secret-digger-codex running regularly, Codex costs are unmonitored. Symmetry also makes it easier to compare costs across engines.
How: Clone claude-token-usage-analyzer.md → codex-token-usage-analyzer.md and adjust the engine filter in the log query. Clone claude-token-optimizer.md → codex-token-optimizer.md and set the workflow_run trigger to "Daily Codex Token Usage Analyzer".
Effort: ~30 minutes (copy + adjust engine filter)
2. Fix ci-doctor Monitored Workflow List
What: Add missing workflows to ci-doctor.md's workflow_run.workflows list.
Why: smoke-opencode, smoke-services, smoke-copilot-byok, Secret Digger (Claude/Codex/Copilot), Firewall Issue Dispatcher, CI Doctor (self), Doc Maintainer, Dependency Security Monitor, CLI Flag Consistency Checker, and Test Coverage Improver are absent. Failures in these go uninvestigated.
How: Edit the workflows: list in ci-doctor.md to add the missing names. Recompile with gh-aw compile.
Effort: ~15 minutes
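The gap check behind this recommendation is a set difference between the workflows that exist and the ones ci-doctor already watches. A minimal sketch, assuming both lists have already been read out of the repository; the function name and the sample lists are illustrative, not taken from the repo:

```python
def missing_workflows(monitored, all_names):
    """Return workflow names from all_names that are not monitored.

    Comparison ignores case and surrounding whitespace; the original
    spelling and order of all_names are preserved in the result.
    """
    seen = {name.strip().lower() for name in monitored}
    return [name for name in all_names if name.strip().lower() not in seen]


if __name__ == "__main__":
    # Illustrative data only; the real lists live in the workflow files.
    monitored = ["Smoke Claude", "Smoke Codex"]
    all_names = ["Smoke Claude", "Smoke Codex", "Doc Maintainer", "Test Coverage Improver"]
    for name in missing_workflows(monitored, all_names):
        print(f"not monitored: {name}")
```

Running a check like this on a schedule would surface new gaps whenever a workflow is added without updating the monitored list.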
P1 — High Impact, Medium Effort (near-term)
3. Container Image Vulnerability Scanner
What: A weekly or per-release agentic workflow that runs Trivy against the three container images (squid, agent, api-proxy) and creates security issues for HIGH/CRITICAL CVEs.
Why: dependency-security-monitor covers npm packages, but the Docker images (Ubuntu 22.04 base, squid, Node.js) accumulate OS-level CVEs that npm audit never sees. Since this is a security tool, unpatched base images undermine the product's security posture.
How: The prompt instructs the agent to pull each GHCR image, run trivy image --severity HIGH,CRITICAL, parse the JSON output, and create a single consolidated issue per scan with findings grouped by image.
Effort: ~2 hours
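The parse-and-group step could look roughly like the sketch below. It assumes each image was scanned with Trivy's JSON output enabled (for example by adding --format json to the command above) and that the report text is already in hand. The function names and the issue body layout are illustrative assumptions, not part of the proposed workflow.

```python
import json

SEVERITIES = ("CRITICAL", "HIGH")


def group_findings(reports):
    """Map image name -> sorted list of HIGH/CRITICAL findings.

    `reports` maps an image name to the raw text of a Trivy JSON report.
    """
    grouped = {}
    for image, raw in reports.items():
        report = json.loads(raw)
        findings = []
        # "Results" and "Vulnerabilities" may be missing or null when clean.
        for result in report.get("Results") or []:
            for vuln in result.get("Vulnerabilities") or []:
                if vuln.get("Severity") in SEVERITIES:
                    findings.append((
                        vuln["Severity"],
                        vuln.get("VulnerabilityID", "unknown"),
                        vuln.get("PkgName", "unknown"),
                        vuln.get("FixedVersion") or "no fix listed",
                    ))
        # Alphabetical sort puts CRITICAL ahead of HIGH.
        grouped[image] = sorted(findings)
    return grouped


def render_issue_body(grouped):
    """Render one consolidated, plain-text issue body grouped by image."""
    lines = []
    for image, findings in grouped.items():
        lines.append(f"{image}: {len(findings)} finding(s)")
        for severity, vuln_id, pkg, fixed in findings:
            lines.append(f"  [{severity}] {vuln_id} in {pkg} (fix: {fixed})")
    return "\n".join(lines)
```

Keeping the grouping separate from the rendering makes it easy to skip issue creation entirely when every image comes back with an empty list.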
4. Stale Issue / PR Manager
What: Weekly workflow that labels and comments on issues/PRs with no activity for 30+ days, and closes them after 60 days of inactivity (with a grace comment).
Why: As the issue-monster auto-assigns issues, there's no lifecycle management on the other end. Stale Copilot-assigned issues accumulate without resolution signals.
How: The agent queries for stale items, posts a "This issue has been inactive for 30 days. Still relevant?" comment, labels them stale, and closes issues labeled stale with no response after another 7 days.
Effort: ~1.5 hours
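The decision logic in the How step can be sketched as a small pure function. The 30-day and 7-day thresholds come from the description above. The sketch also assumes that the bot's own stale comment counts as the last activity, and that a human reply removes the stale label. Both are assumptions, and the function and action names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)        # inactivity before labeling stale
CLOSE_AFTER_LABEL = timedelta(days=7)   # further silence before closing


def next_action(last_activity, labels, now):
    """Decide what to do with one issue or PR.

    Returns "mark-stale", "close", "wait", or "none".
    """
    idle = now - last_activity
    if "stale" in labels:
        # Already warned: close only once the grace period has elapsed.
        return "close" if idle >= CLOSE_AFTER_LABEL else "wait"
    return "mark-stale" if idle >= STALE_AFTER else "none"


if __name__ == "__main__":
    now = datetime(2026, 4, 18, tzinfo=timezone.utc)
    print(next_action(now - timedelta(days=45), set(), now))
    print(next_action(now - timedelta(days=8), {"stale"}, now))
```

Keeping the decision free of API calls means the thresholds can be unit-tested without touching GitHub.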
5. Network Egress Policy Drift Detector
What: Weekly agent that audits all allowed domains across smoke test configs, AGENTS.md documented allowlists, and the actual --allow-domains flags used in examples/scripts. It reports any domain that appears in tests but isn't in documented allowlists, or vice versa.
Why: This is a firewall product, so the firewall's own test infrastructure should model minimum-privilege egress. Unreviewed domain additions in tests could indicate supply-chain or test hygiene issues.
How: Agent uses the bash tool with grep to extract --allow-domains values from all test scripts, smoke test MD files, and examples/. Compares against a known-good baseline stored in cache-memory. Diffs are posted as a discussion or issue.
Effort: ~2 hours
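The extract-and-compare step could be sketched as follows, as an alternative to the grep approach described above. It assumes the flag takes a comma-separated value with no spaces, written either as --allow-domains VALUE or --allow-domains=VALUE. The regex, the function names, and the shape of the baseline are illustrative assumptions.

```python
import re

# Matches the flag followed by "=" or whitespace, an optional quote,
# then a comma-separated run of domain characters (no spaces).
ALLOW_RE = re.compile(r"""--allow-domains[=\s]\s*['"]?([A-Za-z0-9.*,_-]+)""")


def extract_domains(text):
    """Collect every domain passed to --allow-domains in a blob of text."""
    domains = set()
    for match in ALLOW_RE.finditer(text):
        for domain in match.group(1).split(","):
            if domain.strip():
                domains.add(domain.strip().lower())
    return domains


def diff_against_baseline(found, baseline):
    """Report domains that appeared or disappeared relative to the baseline."""
    return {
        "added": sorted(found - baseline),
        "removed": sorted(baseline - found),
    }
```

A value quoted with spaces after the commas would be captured only up to the first space, so a real implementation would need to handle that case if the repo uses it.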
P2 — Medium Impact (plan for later)
6. PR Quality / Complexity Advisor
What: Agentic PR reviewer that goes beyond security-guard to assess code complexity, test coverage delta, API surface changes, and whether the PR has appropriate tests for changed security paths.
Why: security-guard covers security posture but not general quality signals like "this PR changes setup-iptables.sh but adds no tests" or "this function has cyclomatic complexity >15".
Effort: Medium (needs good prompting to avoid noise)
7. Benchmark Tracker
What: Weekly workflow that runs the benchmarks/ suite, stores results in cache-memory, and creates a trend issue when performance regresses >10% vs. the 4-week average.
Why: Container startup time and firewall overhead are user-facing performance metrics. No current workflow tracks these over time.
Effort: Medium (need to understand what benchmarks already exist)
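The regression rule (more than 10% worse than the 4-week average) reduces to a short calculation. This sketch assumes one stored result per weekly run and a lower-is-better metric such as startup time in seconds. The function names and the stored-history format are hypothetical.

```python
from statistics import mean


def regression_pct(history, latest, window=4):
    """Percent change of `latest` vs. the mean of the last `window` results.

    Positive values mean the metric got larger (slower, for timings).
    """
    baseline = mean(history[-window:])
    return (latest - baseline) / baseline * 100


def is_regression(history, latest, threshold_pct=10.0, window=4):
    """True when `latest` is more than threshold_pct worse than the baseline."""
    recent = history[-window:]
    if not recent or mean(recent) == 0:
        return False  # no usable baseline yet
    return regression_pct(history, latest, window) > threshold_pct


if __name__ == "__main__":
    weekly_startup_seconds = [2.0, 2.1, 1.9, 2.0]  # illustrative values
    print(is_regression(weekly_startup_seconds, 2.5))
```

For higher-is-better metrics such as throughput, the sign of the comparison would need to be flipped.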
8. Release Readiness Checker
What: Triggered on release: created (before publish), validates that CHANGELOG is updated, all smoke tests passed on the release branch, container images are built and pushed, and version numbers are consistent across package.json, action.yml, and install.sh.
Why: update-release-notes only runs after the release is published. A pre-release gate catches issues before the release is public.
Effort: Medium
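The version consistency part of this gate could be approximated without knowing exactly how each file stores its version. The sketch below treats the version in package.json as the source of truth and checks that the same string appears in the other files. That convention, and the function name, are assumptions rather than something the repo is known to follow.

```python
import json
import re


def files_missing_version(package_json_text, other_files):
    """Return names of files that do not mention the package.json version.

    `other_files` maps a file name to its text content.
    """
    version = json.loads(package_json_text)["version"]
    # Guard both ends so that 1.2.3 does not match inside 1.2.30 or 11.2.3.
    pattern = re.compile(r"(?<![\d.])" + re.escape(version) + r"(?!\.?\d)")
    return [name for name, text in other_files.items() if not pattern.search(text)]
```

An empty result means every file mentions the expected version; any names returned are candidates for a blocking comment on the release.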
P3 — Nice to Have
9. Cross-Engine Performance Comparison
What: Monthly workflow that compares token usage, turn counts, and task completion quality across Claude/Codex/Copilot for the same smoke test prompts.
Why: Helps the team make informed decisions about which engine to use for which workflow type, and tracks whether model updates are net positive.
10. CHANGELOG Auto-Updater
What: On PR merge to main, append a CHANGELOG entry derived from the PR title/body/labels.
Why: Complements update-release-notes with a developer-facing rolling changelog.
📈 Maturity Assessment
🔄 Best Practice Comparison
What This Repo Does Exceptionally Well
security-guard on every PR + daily security-review + dependency-security-monitor is best-in-class
firewall-issue-dispatcher pulling from github/gh-aw is a rare, valuable pattern
What to Improve
📝 Notes
Analysis run: 2026-04-18. Patterns stored in cache-memory at /tmp/gh-aw/cache-memory/pelis_patterns. Next run will compare against this baseline to detect workflow drift or new gaps introduced by additions.
Top 3 actions by ROI: