[Pelis Agent Factory Advisor] Agentic Workflow Advisor: Maturity Assessment & Recommendations (2026-04-18) #2084

2026-04-18T21:42:51Z

github-actions[bot]
bot Apr 18, 2026

📊 Executive Summary

gh-aw-firewall is one of the most mature agentic workflow repositories I've analyzed — 25 agentic .md workflows spanning security, testing, documentation, issue management, token optimization, and smoke testing across three AI engines. The primary opportunities are filling a few symmetry gaps (Codex token analysis, container image scanning) and closing minor CI monitoring blind spots rather than building foundational automation from scratch.

🎓 Patterns Learned vs. Current Repo

Pattern	Status
Multi-engine smoke tests	✅ Excellent (Claude, Codex, Copilot, BYOK, chroot, opencode, services)
Token cost monitoring + optimizer chain	✅ Claude + Copilot; ❌ Codex missing
Red team / adversarial testing	✅ secret-digger across all 3 engines
CI failure triage (ci-doctor)	✅ Present; partial coverage gap
Issue lifecycle management	✅ issue-monster, issue-duplication-detector, firewall-issue-dispatcher
Slash command automation	✅ /plan command
Daily security review + dep monitor	✅ Comprehensive
Container image scanning	❌ Missing (only npm dep audit)
Stale issue/PR management	❌ Missing
Benchmark / performance tracking	❌ Missing

📋 Workflow Inventory

Workflow	Purpose	Trigger	Assessment
`build-test`	Build validation on PRs	PR, dispatch	✅ Good
`security-guard`	PR security posture review	PR, dispatch	✅ Strong (Claude engine)
`smoke-claude`	Claude engine smoke test	12h schedule, PR	✅ Good
`smoke-codex`	Codex engine smoke test	Schedule, PR	✅ Good
`smoke-copilot`	Copilot engine smoke test	Schedule, PR	✅ Good
`smoke-copilot-byok`	BYOK Copilot smoke test	Schedule, PR	✅ Good
`smoke-chroot`	Chroot isolation smoke test	Schedule, PR	✅ Security-focused
`smoke-opencode`	OpenCode engine smoke test	Schedule	✅ Good
`smoke-services`	Services smoke test	Schedule	✅ Good
`ci-doctor`	CI failure investigator	workflow_run failure	✅ Good; gaps in monitored list
`secret-digger-{claude,codex,copilot}`	Red team secret search	dispatch	✅ Multi-engine coverage
`security-review`	Daily comprehensive security review	Daily	✅ Thorough
`dependency-security-monitor`	Daily npm CVE monitoring	Daily	✅ Good; no Docker image scan
`claude-token-usage-analyzer`	Daily Claude token analysis	Daily	✅ Good
`claude-token-optimizer`	Claude optimization recommendations	workflow_run	✅ Good
`copilot-token-usage-analyzer`	Daily Copilot token analysis	Daily	✅ Good
`copilot-token-optimizer`	Copilot optimization recommendations	workflow_run	✅ Good
`doc-maintainer`	Daily doc sync with code	Daily	✅ Good
`cli-flag-consistency-checker`	Weekly CLI doc/code consistency	Weekly	✅ Good
`update-release-notes`	Release notes from git diff	release published	✅ Good
`issue-monster`	Auto-assigns issues to Copilot	issue opened, hourly	✅ Good
`issue-duplication-detector`	Flags duplicate issues	issue opened	✅ Good
`firewall-issue-dispatcher`	Syncs gh-aw issues to this repo	Every 6h	✅ Cross-repo
`ci-cd-gaps-assessment`	Daily CI/CD quality gap analysis	Daily	✅ Meta-automation
`test-coverage-improver`	Weekly test coverage PRs	Weekly	✅ Good
`plan`	/plan slash command	slash_command	✅ Good
`pelis-agent-factory-advisor`	This workflow	Daily	✅ Meta-monitoring

🚀 Recommendations

P0 — High Impact, Low Effort (implement immediately)

1. Codex Token Usage Analyzer + Optimizer

What: Mirror claude-token-usage-analyzer.md and claude-token-optimizer.md for the Codex engine.

Why: You have token analyzers for Claude and Copilot but not Codex. With smoke-codex and secret-digger-codex running regularly, Codex costs are unmonitored. Symmetry also makes it easier to compare costs across engines.

How: Clone claude-token-usage-analyzer.md → codex-token-usage-analyzer.md, adjust the engine filter in the log query. Clone claude-token-optimizer.md → codex-token-optimizer.md, set workflow_run trigger to "Daily Codex Token Usage Analyzer".

Effort: ~30 minutes (copy + adjust engine filter)

2. Fix ci-doctor Monitored Workflow List

What: Add missing workflows to ci-doctor.md's workflow_run.workflows list.

Why: The list omits smoke-opencode, smoke-services, smoke-copilot-byok, Secret Digger (Claude/Codex/Copilot), Firewall Issue Dispatcher, CI Doctor (self), Doc Maintainer, Dependency Security Monitor, CLI Flag Consistency Checker, and Test Coverage Improver, so failures in these workflows go uninvestigated.

How: Edit the workflows: list in ci-doctor.md to add the missing names. Recompile with gh-aw compile.

Effort: ~15 minutes
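
One way to find the gap before editing the list is a set difference between the workflows in the repository and those ci-doctor already watches. The sketch below is illustrative only: the function name and the sample display names are hypothetical, and it assumes entries under workflow_run.workflows are matched against each workflow's display name, as in standard GitHub Actions.

```python
def missing_from_monitor(all_workflows, monitored):
    """Return workflow names that ci-doctor does not watch, sorted for stable output."""
    return sorted(set(all_workflows) - set(monitored))


# Hypothetical sample data; real names would come from the compiled workflows.
all_workflows = ["Smoke Claude", "Smoke OpenCode", "Smoke Services", "Doc Maintainer"]
monitored = ["Smoke Claude"]

print(missing_from_monitor(all_workflows, monitored))
# ['Doc Maintainer', 'Smoke OpenCode', 'Smoke Services']
```

Running a check like this after every new workflow is added would keep the monitored list from drifting again.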

P1 — High Impact, Medium Effort (near-term)

3. Container Image Vulnerability Scanner

What: A weekly or per-release agentic workflow that runs Trivy against the three container images (squid, agent, api-proxy) and creates security issues for HIGH/CRITICAL CVEs.

Why: dependency-security-monitor covers npm packages but the Docker images (Ubuntu 22.04 base, squid, Node.js) accumulate OS-level CVEs that npm audit never sees. Since this is a security tool, unpatched base images undermine the product's security posture.

How:

---
description: Weekly container image vulnerability scan using Trivy
on:
  schedule: weekly
  workflow_dispatch:
  release:
    types: [published]
permissions:
  contents: read
  issues: read
  security-events: read
tools:
  bash:
    - "trivy image:*"
    - "docker pull:*"
safe-outputs:
  create-issue:
    labels: [security, container-vulnerability]

The prompt instructs the agent to pull each GHCR image, run trivy image --severity HIGH,CRITICAL --format json, parse the JSON output, and create a single consolidated issue per scan with findings grouped by image.

Effort: ~2 hours
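
The parsing and grouping step could look roughly like the sketch below. It assumes each image was scanned with Trivy's JSON output and that the report follows the usual layout (a top-level Results list whose entries carry a Vulnerabilities list). The function names and the sample report, including its placeholder CVE IDs and package name, are hypothetical.

```python
from collections import defaultdict

SEVERITIES = {"HIGH", "CRITICAL"}


def collect_findings(reports):
    """Map image name -> list of (severity, id, package, fixed version) tuples."""
    findings = defaultdict(list)
    for image, report in reports.items():
        for result in report.get("Results") or []:
            # Trivy omits or nulls "Vulnerabilities" when a target is clean.
            for vuln in result.get("Vulnerabilities") or []:
                if vuln.get("Severity") in SEVERITIES:
                    findings[image].append((
                        vuln["Severity"],
                        vuln["VulnerabilityID"],
                        vuln.get("PkgName", "unknown"),
                        vuln.get("FixedVersion") or "no fix",
                    ))
    return dict(findings)


def render_issue_body(findings):
    """Build one consolidated issue body with findings grouped by image."""
    lines = []
    for image in sorted(findings):
        lines.append(f"### {image}")
        # Tuples sort by severity first, so CRITICAL precedes HIGH.
        for severity, vuln_id, pkg, fixed in sorted(findings[image]):
            lines.append(f"- {severity} {vuln_id} in {pkg} (fixed in: {fixed})")
    return "\n".join(lines) if lines else "No HIGH or CRITICAL findings."


# Synthetic sample data, not real scan output.
reports = {
    "squid": {"Results": [{"Target": "squid", "Vulnerabilities": [
        {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample",
         "FixedVersion": "1.2.4", "Severity": "HIGH"},
        {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libexample",
         "Severity": "LOW"},
    ]}]},
    "agent": {"Results": [{"Target": "agent", "Vulnerabilities": None}]},
}

print(render_issue_body(collect_findings(reports)))
```

Filtering again in code is deliberate: it keeps the issue body correct even if the scan was run without the severity flag.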

4. Stale Issue / PR Manager

What: Weekly workflow that labels and comments on issues/PRs with no activity for 30+ days, and closes them after 60 days of inactivity (with a grace comment).

Why: issue-monster auto-assigns issues to Copilot, but nothing manages the other end of the lifecycle. Stale Copilot-assigned issues accumulate without resolution signals.

How:

on:
  schedule: weekly
  workflow_dispatch:
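# Note: GitHub search date qualifiers take an absolute date (updated:<YYYY-MM-DD); a relative value such as <30d would need to be computed first.
skip-if-no-match: "is:open is:issue updated:<30d"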
tools:
  github:
    toolsets: [issues, pull_requests]

The agent queries for stale items, posts a comment such as "This issue has been inactive for 30 days. Is it still relevant?", applies the stale label, and closes items that are still labeled stale with no response after another 30 days (60 days of total inactivity).

Effort: ~1.5 hours
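
The staleness decision itself is simple date arithmetic. The sketch below is a simplification under stated assumptions: the function names are hypothetical, the thresholds are parameters defaulting to the 30-day and 60-day figures above, and "no response" is approximated by the last-activity date alone. It also shows how an absolute cutoff date for a GitHub search qualifier could be derived.

```python
from datetime import date, timedelta


def stale_query(today, days=30):
    """Build a search query for open issues untouched for `days` days."""
    cutoff = today - timedelta(days=days)
    return f"is:open is:issue updated:<{cutoff.isoformat()}"


def next_action(last_activity, today, warn_after=30, close_after=60):
    """Decide what to do with an item based on days since its last activity."""
    idle_days = (today - last_activity).days
    if idle_days >= close_after:
        return "close"
    if idle_days >= warn_after:
        return "label-stale"
    return "keep"


today = date(2026, 4, 18)
print(stale_query(today))                     # is:open is:issue updated:<2026-03-19
print(next_action(date(2026, 3, 1), today))   # label-stale (48 idle days)
print(next_action(date(2026, 2, 1), today))   # close (76 idle days)
```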

5. Network Egress Policy Drift Detector

What: Weekly agent that audits all allowed domains across smoke test configs, AGENTS.md documented allowlists, and the actual --allow-domains flags used in examples/scripts. Flags any domain that appears in tests but isn't in documented allowlists or vice versa.

Why: This is a firewall product — the firewall's own test infrastructure should model minimum-privilege egress. Unreviewed domain additions in tests could indicate supply-chain or test hygiene issues.

How: The agent uses the bash tool with grep to extract --allow-domains values from all test scripts, smoke test .md files, and examples/, then compares them against a known-good baseline stored in cache-memory. Differences are posted as a discussion or issue.

Effort: ~2 hours
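
The extraction and comparison could be sketched as follows. This is a rough illustration, not the repository's tooling: it assumes --allow-domains takes a comma-separated value (optionally quoted, passed with a space or an equals sign), and the function names, the firewall-cli command, and the sample domains are hypothetical.

```python
import re

# Capture the value after --allow-domains, with or without "=" and quotes.
ALLOW_RE = re.compile(r"""--allow-domains[=\s]+["']?([^\s"']+)""")


def extract_domains(text):
    """Collect every domain passed to --allow-domains in the given text."""
    domains = set()
    for match in ALLOW_RE.finditer(text):
        for domain in match.group(1).split(","):
            if domain.strip():
                domains.add(domain.strip().lower())
    return domains


def drift(used, documented):
    """Report domains used but undocumented, and documented but unused."""
    return {
        "undocumented": sorted(used - documented),
        "unused": sorted(documented - used),
    }


sample = (
    "firewall-cli --allow-domains github.com,api.github.com -- curl https://api.github.com\n"
    'firewall-cli --allow-domains="example.com" -- true\n'
)
baseline = {"github.com", "registry.npmjs.org"}

print(drift(extract_domains(sample), baseline))
# {'undocumented': ['api.github.com', 'example.com'], 'unused': ['registry.npmjs.org']}
```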

P2 — Medium Impact (plan for later)

6. PR Quality / Complexity Advisor

What: Agentic PR reviewer that goes beyond security-guard to assess code complexity, test coverage delta, API surface changes, and whether the PR has appropriate tests for changed security paths.

Why: security-guard covers security posture but not general quality signals like "this PR changes setup-iptables.sh but adds no tests" or "this function has cyclomatic complexity >15".

Effort: Medium — needs good prompting to avoid noise

7. Benchmark Tracker

What: Weekly workflow that runs the benchmarks/ suite, stores results in cache-memory, and creates a trend issue when performance regresses >10% vs. the 4-week average.

Why: Container startup time and firewall overhead are user-facing performance metrics. No current workflow tracks these over time.

Effort: Medium — need to understand what benchmarks already exist
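
The regression rule can be expressed in a few lines. The sketch below assumes a lower-is-better metric such as container startup time in milliseconds, with one stored result per week. The function name and the sample numbers are hypothetical.

```python
from statistics import mean


def regressed(history, latest, threshold=0.10, window=4):
    """True if `latest` is more than `threshold` worse than the recent average.

    `history` holds earlier results, oldest first; only the last `window`
    entries form the baseline. Assumes lower values are better.
    """
    recent = history[-window:]
    if not recent:
        return False  # No baseline yet, so nothing to compare against.
    baseline = mean(recent)
    return baseline > 0 and (latest - baseline) / baseline > threshold


# Hypothetical weekly startup times in milliseconds.
weekly_ms = [2000, 2100, 1900, 2000]   # 4-week average: 2000

print(regressed(weekly_ms, 2300))  # True  (15% slower)
print(regressed(weekly_ms, 2100))  # False (5% slower)
```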

8. Release Readiness Checker

What: Triggered on release: created (before publish), validates that: CHANGELOG is updated, all smoke tests passed on the release branch, container images are built and pushed, and version numbers are consistent across package.json, action.yml, and install.sh.

Why: update-release-notes only runs after release is published. A pre-release gate catches issues before the release is public.

Effort: Medium
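
The version consistency check reduces to comparing normalized strings. The sketch below assumes the version has already been read out of each file; how that is done depends on each file's format, which is not shown here. The function name and the sample versions are hypothetical.

```python
def version_mismatches(versions):
    """Return normalized versions per source if they disagree, else an empty dict."""
    # Treat "v1.2.3" and "1.2.3" as the same version.
    normalized = {src: v.strip().removeprefix("v") for src, v in versions.items()}
    if len(set(normalized.values())) <= 1:
        return {}
    return normalized


# Hypothetical values extracted from each file.
print(version_mismatches({"package.json": "1.2.3", "action.yml": "v1.2.3", "install.sh": "1.2.3"}))
# {}
print(version_mismatches({"package.json": "1.2.3", "action.yml": "v1.2.2", "install.sh": "1.2.3"}))
# {'package.json': '1.2.3', 'action.yml': '1.2.2', 'install.sh': '1.2.3'}
```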

P3 — Nice to Have

9. Cross-Engine Performance Comparison

What: Monthly workflow that compares token usage, turn counts, and task completion quality across Claude/Codex/Copilot for the same smoke test prompts.

Why: Helps the team make informed decisions about which engine to use for which workflow type, and tracks whether model updates are net positive.

10. CHANGELOG Auto-Updater

What: On PR merge to main, append a CHANGELOG entry derived from the PR title/body/labels.

Why: Complements update-release-notes with a developer-facing rolling changelog.

📈 Maturity Assessment

Dimension	Current (1–5)	Target	Gap
CI/CD Coverage	4	5	ci-doctor monitoring gaps; missing container image scan
Security Automation	5	5	✅ Excellent — red team, daily review, dep monitor, security-guard
Token Cost Management	4	5	Codex missing from analyzer/optimizer chain
Issue Lifecycle	4	5	No stale management
Documentation	4	5	Good; no CHANGELOG automation
Release Automation	3	4	Pre-release readiness gate missing
Performance Monitoring	2	4	No benchmark tracking
Overall	3.7	4.7	Mostly polish; no foundational gaps
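
If the Overall row is read as an unweighted mean of the seven dimension scores (an assumption; the report does not state how it is derived), it can be recomputed from the table as a quick check:

```python
# (current, target) scores copied from the table above.
scores = {
    "CI/CD Coverage": (4, 5),
    "Security Automation": (5, 5),
    "Token Cost Management": (4, 5),
    "Issue Lifecycle": (4, 5),
    "Documentation": (4, 5),
    "Release Automation": (3, 4),
    "Performance Monitoring": (2, 4),
}

current_mean = sum(current for current, _ in scores.values()) / len(scores)
target_mean = sum(target for _, target in scores.values()) / len(scores)

print(round(current_mean, 1), round(target_mean, 1))  # 3.7 4.7
```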

🔄 Best Practice Comparison

What This Repo Does Exceptionally Well

Multi-engine redundancy: Secret digger and smoke tests run across all 3 engines — adversarial and functional coverage
Token cost awareness: The analyzer → optimizer chain pattern is sophisticated and self-funding (reduces cost over time)
Security-first automation: security-guard on every PR + daily security-review + dependency-security-monitor is best-in-class
Cross-repo automation: firewall-issue-dispatcher pulling from github/gh-aw is a rare, valuable pattern
skip-if-match guards: Prevent duplicate issue/PR creation across schedulers; very clean

What to Improve

Coverage symmetry: When you add a new engine, add it to the token analyzer chain and ci-doctor monitor list
Container layer security: Docker image CVE scanning is the most impactful unaddressed gap for a security tool
Issue lifecycle closure: issue-monster opens Copilot tasks; nothing closes them when abandoned

📝 Notes

Analysis run: 2026-04-18. Patterns stored in cache-memory at /tmp/gh-aw/cache-memory/pelis_patterns. Next run will compare against this baseline to detect workflow drift or new gaps introduced by additions.

Top 3 actions by ROI:

1. Fix ci-doctor workflow list (15 min, closes monitoring blind spots)
2. Add codex-token-usage-analyzer + codex-token-optimizer (30 min, closes symmetry gap)
3. Add container image vulnerability scanner (2h, closes the biggest actual security gap in the toolchain)

Generated by Pelis Agent Factory Advisor · ● 370.9K · ◷

expires on Apr 25, 2026, 9:42 PM UTC
