[Pelis Agent Factory Advisor] Agentic Workflow Advisor Report — April 2026 #2061

2026-04-17T21:43:33Z

github-actions[bot]
bot Apr 17, 2026

📊 Executive Summary

gh-aw-firewall is an exceptionally mature agentic workflow repository with 26 agentic .md workflows covering security, CI/CD, documentation, token optimization, issue automation, and multi-engine smoke testing. The top remaining opportunities are a container image CVE scanner and a stale issue/PR manager (both high-value, low-effort), plus a per-PR breaking-change detector (high-value, medium-effort). All three align directly with this repo's security mission.

🎓 Patterns Learned from Pelis Agent Factory

The Pelis Agent Factory documentation establishes the following key patterns. The table shows whether each one is applied in this repo:

| Pattern | Description | Present in Repo? |
| --- | --- | --- |
| Skip-if-match guard | Avoid duplicate work using `skip-if-match` on open issues/PRs | ✅ Yes (doc-maintainer, coverage-improver, token-optimizer) |
| Workflow chaining | `workflow_run` triggers for pipeline composition | ✅ Yes (token-optimizer chains from analyzer) |
| Cache memory | Persistent state across runs via `cache-memory` tool | ✅ Yes (security-review, issue-duplication-detector) |
| Multi-engine parity | Same workflow implemented for Claude, Copilot, Codex | ✅ Yes (smoke tests, secret-diggers) |
| Safe-outputs threat detection | DIFC integrity layer on all write operations | ✅ Mostly (a few workflows disable it) |
| Shared imports / fragments | Reusable `shared/*.md` snippets | ✅ Yes (mcp-pagination, reporting, secret-audit) |
| Schedule + manual dispatch | All scheduled workflows also have `workflow_dispatch` | ✅ Yes, consistently |
| Role-based triggers | PR workflows using `roles: all` | ✅ Yes (build-test, security-guard, smoke tests) |
| Network firewall on agents | `network.allowed` domain whitelisting | ⚠️ Partial (not all workflows specify this) |

📋 Workflow Inventory

Agentic Workflows (`.md`)

| Workflow | Purpose | Trigger | Assessment |
| --- | --- | --- | --- |
| `build-test` | Run build/test suite on PRs | PR + manual | ✅ Good |
| `ci-cd-gaps-assessment` | Identify CI coverage gaps | Daily + manual | ✅ Good |
| `ci-doctor` | Investigate failed CI runs | `workflow_run` failed | ✅ Excellent pattern |
| `claude-token-usage-analyzer` | Analyze Claude token spend | Daily | ✅ Good |
| `claude-token-optimizer` | Recommend token reduction | After analyzer | ✅ Good chaining |
| `cli-flag-consistency-checker` | Flag docs vs. implementation drift | Weekly | ✅ Good |
| `copilot-token-usage-analyzer` | Analyze Copilot token spend | Daily | ✅ Good |
| `copilot-token-optimizer` | Recommend Copilot token reduction | After analyzer | ✅ Good chaining |
| `dependency-security-monitor` | Monitor CVEs in deps | Daily | ✅ Good |
| `doc-maintainer` | Sync docs with code changes | Daily | ✅ Good |
| `firewall-issue-dispatcher` | Cross-repo issue triage | Every 6h | ✅ Unique pattern |
| `issue-duplication-detector` | Flag duplicate issues | On issue open | ✅ Good |
| `issue-monster` | Assign issues to Copilot agents | On issue open + hourly | ✅ Good |
| `pelis-agent-factory-advisor` | This workflow! | Daily | ✅ Meta |
| `plan` | Generate plans via `/plan` slash command | Slash command | ✅ Good |
| `secret-digger-claude/codex/copilot` | Red-team container secret search | Manual | ✅ Excellent |
| `security-guard` | PR security boundary review | PR + manual | ✅ Excellent |
| `security-review` | Comprehensive daily threat modeling | Daily | ✅ Excellent |
| `smoke-chroot/claude/codex/copilot/byok/opencode/services` | Engine smoke tests | Schedule + PR | ✅ Excellent coverage |
| `test-coverage-improver` | Write tests for uncovered code | Weekly | ✅ Good |
| `update-release-notes` | Auto-update release notes | On release publish | ✅ Good |

Standard Workflows (`.yml`)

| Workflow | Purpose |
| --- | --- |
| `build.yml` | TypeScript compile + artifact |
| `codeql.yml` | Static code analysis |
| `dependency-audit.yml` | `npm audit` |
| `lint.yml` | ESLint |
| `performance-monitor.yml` | Benchmark tracking |
| `release.yml` | Docker image publish + npm |
| `test-integration.yml` | Integration test suite |
| `test-coverage.yml` | Jest coverage report |
| `test-chroot.yml` | Chroot security tests |
| `test-examples.yml` | Examples validation |
| `deploy-docs.yml` | Astro docs site deploy |
| `pr-title.yml` | Conventional commit title check |
| `link-check.yml` | Markdown link validation |

🚀 Recommendations

P0 — High Impact, Low Effort (Implement Now)

1. 🐳 Container Image CVE Scanner Agent

What: An agentic workflow that runs after each Docker image build (post-release or nightly), fetches Trivy/Grype scan results, and creates prioritized issues for any HIGH/CRITICAL CVEs found in the Squid, Agent, and API Proxy container images.

Why: This repo publishes container images to GHCR used by AI agents in production. A compromised base image is a critical security risk. The existing dependency-security-monitor only covers npm deps, not Docker layers.

How:

on:
  schedule: daily
  workflow_run:
    workflows: ["Release"]
    types: [completed]

Use the bash tool to run `docker run aquasec/trivy image ghcr.io/github/gh-aw-firewall/agent:latest --format json` (and repeat for the other published images), parse the results, and create issues for HIGH/CRITICAL CVEs. Combine with `skip-if-match` to avoid filing duplicates.

Effort: Low — Trivy is a single Docker command; issue creation via safe-outputs.
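The parsing step could look roughly like the sketch below. It assumes the report follows Trivy's JSON layout (a top-level `Results` list whose entries carry a `Vulnerabilities` list with `VulnerabilityID`, `PkgName`, `Severity`, and `FixedVersion`). The function names and the issue title format are hypothetical, chosen only so that a duplicate guard has a stable string to match on.

```python
import json

# Severities this sketch treats as issue-worthy, per the recommendation above.
ACTIONABLE = {"HIGH", "CRITICAL"}


def actionable_findings(report_json: str) -> list[dict]:
    """Return HIGH/CRITICAL findings from a Trivy JSON report, worst first."""
    report = json.loads(report_json)
    findings = []
    # Findings are grouped per scanned target under "Results"; a target
    # with no findings may omit "Vulnerabilities" or set it to null.
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in ACTIONABLE:
                findings.append({
                    "id": vuln.get("VulnerabilityID"),
                    "package": vuln.get("PkgName"),
                    "severity": vuln.get("Severity"),
                    "fixed_version": vuln.get("FixedVersion") or None,
                    "target": result.get("Target"),
                })
    # CRITICAL before HIGH, then by ID so the output order is stable.
    findings.sort(key=lambda f: (f["severity"] != "CRITICAL", f["id"] or ""))
    return findings


def issue_title(finding: dict, image: str) -> str:
    """Build a deterministic title (hypothetical format) for duplicate matching."""
    return f"[CVE] {finding['severity']} {finding['id']} in {finding['package']} ({image})"
```

Keeping the title deterministic means a rerun against an unchanged image produces the same titles, which is what makes a search-based duplicate guard workable.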

2. 🗂️ Stale Issue/PR Manager

What: A weekly agent that identifies stale issues (no activity > 30 days) and PRs (no review activity > 14 days), adds a stale label, posts a polite comment, and closes items with no response after 7 more days.

Why: The issue-monster assigns new issues but there's no lifecycle management. As the repo grows, stale items accumulate and obscure active work.

How:

on:
  schedule: weekly
  workflow_dispatch:
skip-if-match:
  query: 'is:issue is:open label:stale'
  max: 20

Use the github toolset to list old issues, add labels, and post comments. Use cache-memory to track which items have already been warned.

Effort: Low — standard GitHub API operations, well-trodden pattern.
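The lifecycle rules above (30 days for issues, 14 days for PRs, close 7 days after the warning) can be sketched as a small decision function. The `Item` type, the action names, and the "unmark" branch for items that received a response are assumptions made for illustration. The sketch also assumes `last_activity` excludes the bot's own stale comment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

ISSUE_STALE_AFTER = timedelta(days=30)  # issues: no activity for > 30 days
PR_STALE_AFTER = timedelta(days=14)     # PRs: no review activity for > 14 days
CLOSE_GRACE = timedelta(days=7)         # close 7 days after the warning


@dataclass
class Item:
    number: int
    is_pr: bool
    last_activity: datetime               # latest human activity (assumption)
    warned_at: Optional[datetime] = None  # when the stale comment was posted


def next_action(item: Item, now: datetime) -> str:
    """Return "none", "warn", "close", or "unmark" for one issue or PR."""
    if item.warned_at is not None:
        # Any activity after the warning counts as a response.
        if item.last_activity > item.warned_at:
            return "unmark"
        if now - item.warned_at >= CLOSE_GRACE:
            return "close"
        return "none"
    threshold = PR_STALE_AFTER if item.is_pr else ISSUE_STALE_AFTER
    return "warn" if now - item.last_activity > threshold else "none"
```

The `warned_at` timestamp is the piece of state that would live in cache-memory between weekly runs.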

P1 — High Impact, Medium Effort (Near-Term)

3. 🔍 Breaking Change Detector

What: A PR workflow that uses Claude to detect breaking changes in CLI flags, public API contracts, Docker container interfaces, or environment variable semantics — and requires explicit acknowledgment before merge.

Why: The CLI has a published interface (awf --allow-domains, --image-tag, etc.) used by downstream agentic workflows. Silent breaking changes cause widespread failures. The cli-flag-consistency-checker is weekly; this would be per-PR.

How:

on:
  pull_request:
    types: [opened, synchronize]
    paths:
      - "src/cli.ts"
      - "src/types.ts"
      - "action.yml"
      - "containers/**"

Diff the PR against main, detect flag removals/renames, env var changes, Docker image interface changes, and comment with a breaking-change checklist.

Effort: Medium — requires careful prompt engineering to minimize false positives.
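A deterministic pre-check can narrow what the model has to judge. The sketch below is an assumption rather than the repo's method: it extracts long-option names from the base and head versions of a source file with a regular expression and reports removals, which are the likeliest breaking changes. The function names are hypothetical.

```python
import re

# Matches long options such as --allow-domains inside source text.
FLAG_PATTERN = re.compile(r"--[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def extract_flags(source: str) -> set[str]:
    """Collect every long-option name that appears in the given source text."""
    return set(FLAG_PATTERN.findall(source))


def breaking_flag_changes(base_source: str, head_source: str) -> dict[str, list[str]]:
    """Compare the flags found on the base branch with those on the PR head."""
    base = extract_flags(base_source)
    head = extract_flags(head_source)
    return {
        "removed": sorted(base - head),  # potentially breaking
        "added": sorted(head - base),    # additive; may pair with a removal as a rename
    }
```

A removal paired with an addition in the same PR is a rename candidate, and the agent could ask about it explicitly in the checklist comment. Text matching like this will also pick up flags mentioned only in comments or help strings, so its output is best treated as a hint for the prompt, not a verdict.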

4. 📊 Performance Regression Advisor

What: An agentic workflow that reads the output of performance-monitor.yml and provides natural-language analysis of regressions, trends, and recommendations — posting results as a discussion or GitHub Step Summary.

Why: performance-monitor.yml runs benchmarks but there's no agent analyzing the results for regressions. The raw numbers are hard to interpret without context.

How: Chain via workflow_run from performance-monitor.yml. Use agentic-workflows MCP to read benchmark artifacts, compare to historical baselines stored in cache-memory, and flag regressions > 10%.

Effort: Medium — requires understanding the benchmark output format and establishing baseline tracking.
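The 10% rule can be sketched as a relative-change comparison against the stored baseline. The benchmark output format is not described here, so the sketch assumes each run reduces to a mapping from metric name to a "lower is better" number such as a latency. The metric names and function name are hypothetical.

```python
REGRESSION_THRESHOLD = 0.10  # flag slowdowns greater than 10%


def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    threshold: float = REGRESSION_THRESHOLD,
) -> list[tuple[str, float]]:
    """Return (metric, relative_change) pairs that regressed past the threshold.

    Assumes every metric is "lower is better" (for example a latency in ms).
    """
    regressions = []
    for name, new_value in current.items():
        old_value = baseline.get(name)
        if not old_value:
            # New metric or zero baseline: nothing meaningful to compare against.
            continue
        change = (new_value - old_value) / old_value
        if change > threshold:
            regressions.append((name, change))
    # Worst regression first.
    return sorted(regressions, key=lambda pair: pair[1], reverse=True)
```

A single noisy run can cross a fixed threshold, so comparing against a rolling baseline of several runs would likely cut false alarms. That refinement is left out of the sketch.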

P2 — Medium Impact

5. 👋 First-Time Contributor Onboarding Agent

What: Triggers on first-time contributor PRs/issues and posts a helpful, personalized welcome message with relevant docs links, development setup tips, and pointers to good first issues.

Why: This is a complex security tool with non-obvious setup requirements (Docker, iptables, sudo). New contributors often struggle without guidance.

How:

on:
  pull_request:
    types: [opened]

Check if the actor has previous PRs; if not, post a tailored welcome. Reference CONTRIBUTING.md and key sections of AGENTS.md.

Effort: Low-Medium; straightforward, but it requires a good prompt to be genuinely helpful rather than boilerplate.

6. 🔄 Cross-Engine Smoke Test Comparator

What: After all smoke tests complete, an agent compares results across Claude/Copilot/Codex engines and flags behavioral divergences (e.g., one engine passing PR review while another fails).

Why: The repo runs smoke tests on 6+ engine variants but results are siloed. Cross-engine divergences can indicate bugs in engine-specific code paths.

How: Chain from all smoke-* workflows via workflow_run. Compare agentic-workflows audit outputs across runs. Post a weekly summary discussion.

Effort: Medium — requires correlating multiple runs.
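Once the runs are correlated, the comparison itself is small. The sketch below assumes each engine's smoke run has already been reduced to a mapping from check name to outcome. That input shape, the outcome strings, and the function name are hypothetical, not the format of the audit output.

```python
def find_divergences(results: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Map each check name to its per-engine outcomes when engines disagree.

    `results` maps engine name -> {check name -> outcome}, e.g. "pass"/"fail".
    A check missing from an engine is recorded as "missing" so that gaps in
    coverage also show up as divergences.
    """
    checks: set[str] = set()
    for outcomes in results.values():
        checks.update(outcomes)

    divergent = {}
    for check in sorted(checks):
        per_engine = {
            engine: outcomes.get(check, "missing")
            for engine, outcomes in results.items()
        }
        # More than one distinct outcome means the engines disagree.
        if len(set(per_engine.values())) > 1:
            divergent[check] = per_engine
    return divergent
```

Only the disagreeing checks are returned, which keeps a weekly summary short when the engines behave the same.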

P3 — Nice to Have

7. 📝 ADR (Architecture Decision Record) Suggester

Detects significant architectural changes in PRs (new container, new network topology, new security boundary) and suggests creating an ADR in docs/.

8. 🧪 Integration Test Auto-Filler

When ci-cd-gaps-assessment or test-coverage-improver identifies a gap but doesn't create a PR (due to complexity), create a GitHub issue with a detailed specification for a human to implement.

9. 🌐 Firewall Rule Audit Agent

Weekly agent that reviews the generated squid.conf patterns against known bypass techniques and domain squatting, creating issues when suspicious ACL rules are detected.

📈 Maturity Assessment

Dimension	Current (1-5)	Target (1-5)	Gap
Security Automation	5	5	✅ None
CI/CD Coverage	4	5	Container image scanning
Issue Lifecycle	3	4	Stale management
Documentation	4	4	✅ Met
Multi-Engine Testing	5	5	✅ None
Token Optimization	5	5	✅ None
Contributor Experience	2	4	Onboarding agent
Breaking Change Detection	2	4	Per-PR detector
Performance Analysis	2	3	Regression advisor

Overall: 4/5. This is one of the most complete agentic workflow setups observed; the remaining gaps are refinements rather than holes in fundamental coverage.

🔄 Best Practice Comparison

What This Repo Does Exceptionally Well

✅ Multi-engine parity: Same critical workflows run on Claude, Copilot, and Codex
✅ Defense-in-depth security: Secret-diggers, security-guard, security-review, dependency-monitor all complement each other
✅ Workflow chaining: Token analyzer → optimizer is a clean composition pattern
✅ Skip-if-match guards: Prevents duplicate issues/PRs from scheduled workflows
✅ Cache memory for state: Issue deduplication uses persistent memory correctly
✅ Cross-repo automation: firewall-issue-dispatcher bridges gh-aw → gh-aw-firewall

What Could Be Improved

⚠️ Inconsistent network.allowed: Not all agentic workflows specify network constraints — a few could be tightened
⚠️ No container image scanning: npm deps monitored but not Docker base images
⚠️ No issue lifecycle management: Issues are created and assigned but not closed/triaged over time
⚠️ Breaking changes not caught per-PR: Only weekly CLI flag checker, not per-PR diff analysis

📝 Notes

Cache memory updated with:

Content hash: cdd0a0ce84f26f5119f7edac2510378b84c221c9fe2515d48c25d44f63f6f075
Maturity score: 4/5
Top gaps tracked: container image CVE scanner, stale manager, breaking change detector
Next advisor run will skip re-reading docs if hash unchanged
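The hash-based skip can be sketched as follows. The 64-character hex digest above is consistent with SHA-256, but the algorithm and the way pages are combined are assumptions, and the function names are hypothetical.

```python
import hashlib
from typing import Optional


def content_hash(documents: list[str]) -> str:
    """Hash the fetched documentation pages into one hex digest."""
    digest = hashlib.sha256()
    for doc in documents:
        data = doc.encode("utf-8")
        # Length-prefix each page so that page boundaries affect the digest.
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def docs_changed(documents: list[str], cached_hash: Optional[str]) -> bool:
    """True when there is no cached hash or the content differs from it."""
    return cached_hash is None or content_hash(documents) != cached_hash
```

When `docs_changed` returns `False`, the run can reuse the patterns already stored in cache memory instead of analyzing the documentation again.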

Generated by pelis-agent-factory-advisor workflow — run ID 24587829426

Generated by Pelis Agent Factory Advisor · ● 383.4K · ◷

expires on Apr 24, 2026, 9:43 PM UTC
