[Pelis Agent Factory Advisor] Agentic Workflow Maturity Assessment & Recommendations — April 2026 #1969

2026-04-13T21:45:50Z

github-actions[bot]
bot Apr 13, 2026

📊 Executive Summary

gh-aw-firewall is one of the most mature agentic workflow repositories observed, operating at maturity level 4.5/5 with 27 agentic workflows spanning security, smoke testing, documentation, issue management, and cost optimization. The top opportunities are: adding a container image CVE scanner for GHCR-published images, scheduling the secret-digger workflows to run automatically rather than manually, and creating a codex token optimizer to match the existing claude/copilot optimization pipeline.

🎓 Patterns Learned (Pelis Agent Factory vs. This Repo)

| Pattern | Factory Standard | This Repo |
| --- | --- | --- |
| Cache-memory for cross-run state | ✅ Recommended | ✅ Used in security-review, issue-duplication-detector |
| `skip-if-match` guards | ✅ Recommended | ✅ Consistently applied |
| `workflow_run` chaining | ✅ Recommended | ✅ token-analyzer → token-optimizer |
| Shared import modules | ✅ Recommended | ✅ `shared/secret-audit.md`, `shared/reporting.md`, `shared/mcp/gh-aw.md` |
| Threat detection disabled on security tools | ✅ Recommended | ✅ All workflows use `threat-detection: false` |
| Per-engine smoke tests | ✅ Best practice | ✅ Claude + Copilot + Codex + Chroot + Services |
| `expires:` on auto-created issues | ✅ Prevents stale issues | ✅ Used in dependency-security-monitor |
| `roles: all` for PR triggers | ✅ Ensures all contributors covered | ✅ smoke and build-test workflows |
| `reaction:` event trigger | ✅ Interactive triggers | ✅ Used in smoke-claude, smoke-services |
| Container image scanning | ✅ Should accompany container releases | ❌ Missing |
| Scheduled red-team runs | ✅ Continuous adversarial testing | ⚠️ Secret diggers are manual-only |
| Multi-engine cost parity | ✅ Optimize all engines | ⚠️ Codex optimizer missing |

📋 Workflow Inventory

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `security-guard` | Blocks PRs that weaken security | PR opened/sync | ✅ Excellent: fast, targeted |
| `security-review` | Daily threat modeling + escape test results | Daily + manual | ✅ Strong: uses cache-memory, imports escape data |
| `dependency-security-monitor` | CVE detection + dep update PRs | Daily + manual | ✅ Strong: creates issues with expiry |
| `secret-digger-claude` | Container isolation red-team (Claude) | Manual only | ⚠️ Should run weekly |
| `secret-digger-codex` | Container isolation red-team (Codex) | Manual only | ⚠️ Should run weekly |
| `secret-digger-copilot` | Container isolation red-team (Copilot) | Manual only | ⚠️ Should run weekly |
| `smoke-claude` | End-to-end AWF smoke test (Claude) | 12h + PR + reaction | ✅ Well-configured |
| `smoke-copilot` | End-to-end AWF smoke test (Copilot) | 12h + PR + reaction | ✅ Well-configured |
| `smoke-codex` | End-to-end AWF smoke test (Codex) | 12h + PR + reaction | ✅ Well-configured |
| `smoke-chroot` | Chroot integration validation | 12h + PR | ✅ Good |
| `smoke-services` | Redis/PostgreSQL host-service-ports test | 12h + PR + reaction | ✅ Good: tests specific AWF feature |
| `build-test` | Multi-language build validation | PR + manual | ✅ Comprehensive language matrix |
| `test-coverage-improver` | Weekly PR adding tests for uncovered paths | Weekly + manual | ✅ Security-path focused, well-scoped |
| `ci-doctor` | Investigates CI failures, creates issues | `workflow_run` failure | ✅ Well-integrated; ⚠️ workflow list is manually maintained |
| `ci-cd-gaps-assessment` | Daily CI/CD gap detection | Daily + manual | ✅ Uses `agentic-workflows` MCP |
| `doc-maintainer` | Syncs docs with code changes | Daily + skip-if-match | ✅ Good `skip-if-match` guard; ⚠️ no `network.allowed` |
| `cli-flag-consistency-checker` | Audits CLI flags vs docs | Weekly + manual | ✅ Domain-specific, valuable |
| `update-release-notes` | Enriches release notes from diff | `release` event | ✅ Event-driven, well-targeted |
| `issue-monster` | Auto-assigns open issues to Copilot | Issues opened + hourly | ✅ Classic factory pattern |
| `issue-duplication-detector` | Finds duplicate issues | Issues opened | ✅ Uses cache-memory correctly |
| `firewall-issue-dispatcher` | Syncs `gh-aw` issues → this repo | Every 6h + manual | ✅ Cross-repo with PAT, cli-proxy enabled |
| `claude-token-usage-analyzer` | Analyzes Claude token spend | Daily + manual | ✅ Good |
| `claude-token-optimizer` | Creates optimization recommendations | `workflow_run` from analyzer | ✅ Correctly chained |
| `copilot-token-usage-analyzer` | Analyzes Copilot token spend | Daily + manual | ✅ Good |
| `copilot-token-optimizer` | Creates optimization recommendations | `workflow_run` from analyzer | ✅ Correctly chained |
| `plan` | `/plan` slash command for task breakdown | `slash_command` | ✅ Good UX pattern |
| `pelis-agent-factory-advisor` | This workflow: daily repo maturity review | Daily + manual | ✅ Meta-advisory |

🚀 Recommendations

P0 — High Impact, Low Effort (Implement immediately)

1. Schedule Secret Digger Workflows Weekly

What: Add `schedule: weekly` to `secret-digger-claude.md`, `secret-digger-codex.md`, and `secret-digger-copilot.md`.

Why: These red-team agents are among the most valuable security assets in the repo: they actively probe container isolation boundaries. Running them only manually means new code changes can silently degrade isolation for weeks. `shared/secret-audit.md` already rotates techniques via cache-memory, so repeated scheduled runs should not simply replay the same probes.

How: Add to each secret-digger frontmatter:

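```yaml
on:
  schedule: weekly
  workflow_dispatch:
  # Skip the run while a matching open issue already exists.
  # Note: the created: date below is fixed and will need updating over time.
  skip-if-match:
    query: 'is:issue is:open label:isolation-testing created:>2026-04-06'
    max: 1
```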

Effort: 5 minutes × 3 files. Risk: Low — skip-if-match prevents flooding.

2. Add Codex Token Optimizer

What: Create codex-token-usage-analyzer.md and codex-token-optimizer.md to match the existing Claude/Copilot optimization pipeline.

Why: Three AI engines are used (claude, codex, copilot) and codex is the only one without cost instrumentation. Codex runs are used in smoke tests, build tests, and the secret digger — unmonitored costs accumulate.

How: Clone `copilot-token-usage-analyzer.md` → `codex-token-usage-analyzer.md`, substitute `copilot` → `codex` and `GH_AW_MODEL_COPILOT` → `GH_AW_MODEL_CODEX`. Chain the optimizer the same way.
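The substitution step could be scripted rather than done by hand. The following is a minimal sketch, assuming a case-preserving rename of `copilot` is all the file needs. The function name `retarget_engine` and the sample text are hypothetical, and the result would still need a manual review for engine-specific settings.

```python
import re

def retarget_engine(text: str, old: str = "copilot", new: str = "codex") -> str:
    """Replace every occurrence of `old` with `new`, keeping the casing style."""
    def swap(match: re.Match) -> str:
        found = match.group(0)
        if found.isupper():
            return new.upper()        # GH_AW_MODEL_COPILOT -> GH_AW_MODEL_CODEX
        if found[0].isupper():
            return new.capitalize()   # Copilot -> Codex
        return new                    # copilot -> codex
    return re.sub(re.escape(old), swap, text, flags=re.IGNORECASE)

# Made-up sample text, not the real analyzer file.
sample = "engine: copilot\nmodel variable: GH_AW_MODEL_COPILOT\n# Copilot token usage"
print(retarget_engine(sample))
```

Writing the output to `codex-token-usage-analyzer.md` and diffing it against the Copilot original is a quick way to confirm that nothing unexpected was renamed.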

Effort: ~30 minutes (copy + adapt two files). Risk: Minimal.

P1 — High Impact, Medium Effort (Near-term)

3. Container Image CVE Scanner

What: New agentic workflow container-image-security.md that scans GHCR-published Docker images (awf-squid, awf-agent, awf-api-proxy) for CVEs using Trivy or Grype.

Why: The repo publishes container images to GHCR on every release. Base images (ubuntu/squid:latest, ubuntu:22.04) accumulate CVEs over time. There is a dependency-audit.yml for npm packages and codeql.yml for source code, but no container layer scanning. This is a critical gap for a security product.

How:

```yaml
on:
  schedule: weekly
  release:
    types: [published]
  workflow_dispatch:
tools:
  bash: true
network:
  allowed:
    - ghcr.io
    - github
safe-outputs:
  create-issue:
    title-prefix: "[Container CVE] "
    labels: [security, containers]
    expires: 7d
```

Agent pulls each image from GHCR, runs `trivy image` (already available on GitHub runners), parses HIGH/CRITICAL findings, and creates issues.
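The parsing step can be sketched as follows, assuming the scan is run with `trivy image --format json` and that the report uses Trivy's `Results` / `Vulnerabilities` layout. The helper names and the sample report (including the placeholder CVE IDs) are hypothetical and only illustrate the filtering.

```python
def high_critical_findings(report: dict) -> list[dict]:
    """Collect HIGH and CRITICAL vulnerabilities from a Trivy JSON report."""
    findings = []
    for result in report.get("Results", []):
        # "Vulnerabilities" can be missing or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in {"HIGH", "CRITICAL"}:
                findings.append({
                    "target": result.get("Target", ""),
                    "id": vuln.get("VulnerabilityID", ""),
                    "package": vuln.get("PkgName", ""),
                    "severity": vuln["Severity"],
                })
    return findings

def issue_title(finding: dict) -> str:
    """Build an issue title using the title prefix from the frontmatter above."""
    return f"[Container CVE] {finding['severity']} {finding['id']} in {finding['package']}"

# Made-up report for illustration; real input would come from json.load() on Trivy's output.
sample_report = {
    "Results": [
        {
            "Target": "example-image (ubuntu 22.04)",
            "Vulnerabilities": [
                {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample", "Severity": "HIGH"},
                {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libother", "Severity": "LOW"},
            ],
        },
        {"Target": "clean-layer", "Vulnerabilities": None},
    ]
}

for finding in high_critical_findings(sample_report):
    print(issue_title(finding))
```

Grouping findings into one issue per image rather than one per CVE may be preferable to keep issue volume low; that choice is left open here.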

Effort: ~2 hours. Risk: Low — read-only GHCR pull + issue creation.

4. Domain Whitelist Auditor

What: New weekly workflow domain-whitelist-auditor.md that reviews domains in network.allowed across all workflow .md files and checks whether each domain is still justified.

Why: AWF is a firewall product, so its own CI workflows should model minimal egress. Over time, network allowlists in workflows accumulate domains that were added for one-off reasons and never removed, which undermines that example.

How: Agent reads all .github/workflows/*.md files, extracts network.allowed domains, cross-references them against what each workflow actually needs based on its tools/commands, and creates a discussion report flagging over-permissive entries.
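The extraction step can be sketched as follows, assuming each workflow file starts with a `---`-delimited frontmatter block and lists domains as `- item` entries under `network:` / `allowed:` (inline lists such as `allowed: [a, b]` are not handled). The function names and the sample files are hypothetical, and a real implementation would more likely use a YAML parser.

```python
def extract_allowed_domains(markdown: str) -> list[str]:
    """Return the entries listed under network.allowed in a workflow's frontmatter."""
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    domains = []
    in_network = in_allowed = False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":                 # end of frontmatter
            break
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(line) - len(line.lstrip())
        if indent == 0:                       # a new top-level key
            in_network = stripped == "network:"
            in_allowed = False
        elif in_network and stripped == "allowed:":
            in_allowed = True
        elif in_network and in_allowed and stripped.startswith("- "):
            domains.append(stripped[2:].strip().strip("\"'"))
        elif in_network:                      # another key nested under network:
            in_allowed = False
    return domains

def workflows_by_domain(files: dict[str, str]) -> dict[str, list[str]]:
    """Invert the mapping so each domain lists the workflows that allow it."""
    usage: dict[str, list[str]] = {}
    for name, text in sorted(files.items()):
        for domain in extract_allowed_domains(text):
            usage.setdefault(domain, []).append(name)
    return usage

# Made-up workflow files for illustration.
sample_files = {
    "a.md": '---\non:\n  schedule: weekly\nnetwork:\n  allowed:\n    - defaults\n    - "registry.npmjs.org"\ntools:\n  bash: true\n---\n# Body',
    "b.md": "---\nnetwork:\n  allowed:\n    - defaults\n---\n# Body",
}
print(workflows_by_domain(sample_files))
```

Domains that appear in only one workflow are natural first candidates for the report, since they are the most likely one-off additions.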

Effort: ~2 hours. Risk: Low (read-only; creates a discussion, not issues).

5. Performance Regression Agentic Analyst

What: New workflow `performance-regression-analyzer.md` that reads the output of `performance-monitor.yml` (which already runs), interprets the results, creates issues for regressions, and proposes fixes.

Why: performance-monitor.yml generates benchmark data but doesn't create actionable follow-up. Benchmarks only add value when regressions are automatically surfaced and addressed.

How: Trigger via workflow_run on Performance Monitor, read benchmark artifacts, compare against baseline stored in cache-memory, create issues for >10% regressions with specific optimization suggestions.
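The baseline comparison can be sketched as follows, assuming every metric is a "lower is better" number such as a duration. The metric names and values are hypothetical, since the report does not describe the artifact format produced by `performance-monitor.yml`.

```python
def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.10,
) -> dict[str, float]:
    """Return metrics whose value grew by more than `threshold` relative to the baseline."""
    regressions = {}
    for name, base in baseline.items():
        # Skip metrics that are missing from this run or have no usable baseline.
        if name not in current or base <= 0:
            continue
        change = (current[name] - base) / base
        if change > threshold:
            regressions[name] = round(change, 3)
    return regressions

# Made-up numbers for illustration.
baseline = {"container_startup_ms": 1200.0, "proxy_latency_ms": 40.0}
current = {"container_startup_ms": 1380.0, "proxy_latency_ms": 41.0}
print(find_regressions(baseline, current))
```

With these sample values only `container_startup_ms` is flagged (a 15% increase); the 2.5% change in `proxy_latency_ms` stays under the 10% threshold.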

Effort: ~3 hours. Risk: Low.

6. Release Readiness Checklist Agent

What: New workflow release-readiness.md that runs on PRs targeting main with [release] or version bump commits, checking that: CHANGELOG is updated, container images are tested, all smoke tests pass, docs are current, and release notes draft exists.

Why: update-release-notes.md runs after release, but there's no pre-release gate. For a security product, a missed changelog entry or untested container can cause serious user trust issues.

How: Triggered on push to main or workflow_dispatch with version parameter. Agent queries recent CI runs, checks doc freshness, validates CHANGELOG, creates a release-readiness issue/comment.

Effort: ~3 hours. Risk: Low.

P2 — Medium Impact

7. Stale Issue Closer

What: Weekly `stale-issue-triage.md` that identifies issues with no activity in 30+ days, comments asking for status, and closes them if there is no response within 7 days.

Why: `issue-monster.md` assigns issues to Copilot, but nothing cleans up issues whose PRs are abandoned. Over time, unresolved issues inflate the backlog.

Effort: ~1 hour. Risk: Low with dry-run option first.
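The 30-day and 7-day rules can be sketched as a pure decision function, which also serves as a dry run because it only names the action. This is a minimal sketch, assuming `last_activity` excludes the bot's own status comment; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER = timedelta(days=30)   # no activity for this long triggers a status comment
CLOSE_AFTER = timedelta(days=7)    # no response for this long after the comment closes the issue

def triage_action(last_activity: datetime, warned_at: Optional[datetime], now: datetime) -> str:
    """Return one of "keep", "comment", "wait", or "close" for a single issue."""
    if warned_at is not None:
        if last_activity > warned_at:
            return "keep"          # someone responded after the status comment
        if now - warned_at >= CLOSE_AFTER:
            return "close"
        return "wait"
    if now - last_activity >= STALE_AFTER:
        return "comment"
    return "keep"

now = datetime(2026, 4, 13, tzinfo=timezone.utc)
quiet_since = datetime(2026, 3, 1, tzinfo=timezone.utc)
print(triage_action(quiet_since, None, now))
print(triage_action(quiet_since, datetime(2026, 4, 1, tzinfo=timezone.utc), now))
```

Logging the returned actions for a few weeks before enabling the close step matches the dry-run-first approach noted under Risk.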

8. PR Quality Reviewer for AWF PRs

What: Extend security-guard.md or create a companion pr-quality-reviewer.md that checks PRs for: test coverage of changed code, documentation updates for new CLI flags, and network allowlist additions.

Why: Currently `security-guard` only reviews changes that weaken security. A quality reviewer would catch new features that lack tests or docs, complementing the existing weekly `cli-flag-consistency-checker`.

Effort: ~2 hours. Risk: Low — add-comment only.

P3 — Nice to Have

9. Auto-Promote Agent for Copilot-Created Draft PRs

What: Workflow that monitors draft PRs created by copilot-swe-agent, promotes them from draft when CI passes, and requests review.

Why: issue-monster.md auto-assigns issues but human review is still needed to merge. An auto-promote-from-draft step would streamline the flow.

Effort: Medium. Risk: Medium — needs careful branch protection integration.

10. CI Doctor Workflow Auto-Discovery

What: Enhance ci-doctor.md to auto-discover monitored workflows from the repository's workflow files rather than maintaining a hardcoded list.

Why: The current ci-doctor.md has 25+ hardcoded workflow names. Every new workflow added to the repo requires a manual update to CI Doctor's workflows: list, or it won't be monitored for failures.

How: Add a pre-agent step that runs `gh workflow list --json name` and populates the watched list dynamically, or use a separate `.github/workflows/ci-doctor-workflows.yml` that generates the list.
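The pre-agent step could turn the JSON from `gh workflow list --json name` into the watched list along these lines. This is a sketch only: the sample JSON and workflow display names are made up, and excluding CI Doctor itself is an assumption intended to avoid the workflow watching its own runs.

```python
import json
from typing import Iterable

def discover_workflow_names(gh_json: str, exclude: Iterable[str] = ()) -> list[str]:
    """Parse `gh workflow list --json name` output into a sorted list of names."""
    skipped = set(exclude)
    workflows = json.loads(gh_json)
    return sorted(w["name"] for w in workflows if w["name"] not in skipped)

# Made-up output for illustration; the real input would be the command's stdout.
sample_output = '[{"name": "Smoke Claude"}, {"name": "CI Doctor"}, {"name": "Build Test"}]'
print(discover_workflow_names(sample_output, exclude=["CI Doctor"]))
```

Because the list is rebuilt on every run, a newly added workflow would be picked up without editing `ci-doctor.md`.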

Effort: ~2 hours. Risk: Low.

📈 Maturity Assessment

| Dimension | Current (1–5) | Target | Gap |
| --- | --- | --- | --- |
| Security Automation | 5 | 5 | ✅ Container image scanning is the only gap |
| Smoke/Integration Testing | 5 | 5 | ✅ Excellent multi-engine coverage |
| Cost & Token Optimization | 4 | 5 | Codex optimizer missing |
| Documentation Maintenance | 4.5 | 5 | PR quality reviewer would close the gap |
| Issue Management | 4.5 | 5 | Stale issue triage missing |
| Release Automation | 3.5 | 5 | Pre-release readiness check missing |
| Observability & Alerting | 4 | 5 | Performance regression analysis not agentic |
| Overall | 4.5 | 5 | Container CVE scan + scheduled red-team are blockers |

🔄 Best Practice Comparison

What this repo does exceptionally well

- `workflow_run` chaining: The token analyzer → optimizer pipeline is a textbook Pelis factory pattern. Triggers are data-driven, not time-driven.
- Threat detection disabled: Every workflow correctly sets `threat-detection: false`. For a security product's own CI, false positives from safe-outputs threat detection would be costly.
- `skip-if-match` discipline: Consistently prevents duplicate issues/PRs from automation, a common failure mode in other repos.
- Shared modules: `shared/secret-audit.md`, `shared/reporting.md`, and `shared/mcp/gh-aw.md` reduce duplication across all three secret-digger variants and other workflows.
- Reaction-triggered smoke tests: Using `reaction: "heart"` / `reaction: "rocket"` for interactive re-triggering is a sophisticated UX touch rarely seen elsewhere.
- Red-team coverage: Having three separate secret-digger variants (one per AI engine) catches engine-specific behaviors that single-engine tests would miss.
- `expires:` on auto-issues: Prevents stale automation issues from polluting the backlog.

What to improve

- Container image CVE scanning: The biggest actual security gap. The product publishes container images but doesn't scan them.
- Scheduled adversarial testing: Secret diggers run manually; weekly automation would catch regressions within days of a code change.
- CI Doctor maintenance burden: The hardcoded workflow list in `ci-doctor.md` will silently miss new workflows added to the repo.
- Performance regression: `performance-monitor.yml` generates data, but no agent interprets and acts on it.
- Cost coverage parity: Codex is unmonitored for token costs relative to Claude and Copilot.

📝 Notes

Cache-memory updated with:

- Content hash: cdd0a0ce84f26f5119f7edac2510378b84c221c9fe2515d48c25d44f63f6f075
- Maturity level: 4.5/5
- Top gaps tracked for next run comparison: container image CVE scan, secret digger scheduling, codex token optimizer, domain whitelist auditor

Items to watch next run:

- Was a schedule added to `secret-digger-*.md`?
- Was `codex-token-optimizer.md` created?
- Was container image scanning introduced?
- Did CI Doctor gain auto-discovery?

Generated by Pelis Agent Factory Advisor on 2026-04-13.


expires on Apr 20, 2026, 9:45 PM UTC

2026-04-20T22:53:14Z

github-actions[bot]
bot Apr 20, 2026
Author

This discussion was automatically closed because it expired on 2026-04-20T21:45:50.556Z.

Closed by Workflow


2026-04-21T00:17:44Z

github-actions[bot]
bot Apr 21, 2026
Author

🔮 The ancient spirits stir in the firewall logs.
The smoke-test wanderer passed through this hall and marked the run complete.
May your workflows remain guarded and your signals clear.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow this domain, add it to the `network.allowed` list in your workflow frontmatter:

```yaml
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
```

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex


2026-04-21T01:11:14Z

github-actions[bot]
bot Apr 21, 2026
Author

🔮 The ancient spirits stir, and the smoke-test agent has walked this thread.
The runes show command, browser, build, and file checks completed.
May the firewall remain steadfast under watchful stars.

Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

registry.npmjs.org

To allow this domain, add it to the `network.allowed` list in your workflow frontmatter:

```yaml
network:
  allowed:
    - defaults
    - "registry.npmjs.org"
```

See Network Configuration for more information.

🔮 The oracle has spoken through Smoke Codex

