docs(readme): reframe for cloud incident triage in plain language #62

Open
sourcehawk wants to merge 1 commit into main from docs/readme-cloud-positioning

@sourcehawk
Owner

Description

Reframe the README so an SRE/platform engineer without an AI background can tell what triagent is, decide whether they want it, and get it running. The previous version led with the architecture (MCP catalog, reasoning agent, playbook walker) and anchored everything on Kubernetes. This version leads with the job, lists the surfaces it reaches as peers, and glosses the AI vocabulary inline.

Changes

  • Lead with the general capability (cloud incident triage) and list reachable surfaces as peers (Kubernetes, AWS, GCP, Prometheus, Slack, GitHub, incident.io) instead of anchoring on Kubernetes.
  • Gloss AI terms inline (agent, MCP, playbook) for readers without an AI background; explain the Claude Code dependency and that investigation context is sent to Claude.
  • State the operating model plainly: read-only cluster/cloud access, the only writes are reviewed Git PRs, runs per-laptop, teams share via Git.
  • Clarify that only the Claude Code CLI is needed to launch; other integrations attach through the profile.
  • Trim the command list to triagent help plus the two start forms.
  • Drop the user-facing "preflight" step, including its inaccurate claim that the launcher verifies the namespace exists and that you can list pods (it does neither at start, by design).
  • Add the contributor merge bar (frontend typecheck) and a ~/.local/bin PATH note.

Testing

Docs-only change; no code touched. Validated the rewrite with fresh-reader passes per the writing-docs workflow: three personas (evaluating SRE, team adopter, contributor) answered their arrival questions from the revised text alone, then a follow-up pass confirmed the previously-failing questions (safety model, data egress, AWS/GCP-vs-setup consistency, deployment model, contributor merge bar) now resolve. No em dashes; one line per paragraph.

Rewrite the README so an SRE/platform engineer without AI background can
tell what triagent is and get it running:

- Lead with the general capability (cloud incident triage) and list the
  surfaces it reaches as peers (Kubernetes, AWS, GCP, Prometheus, Slack,
  GitHub, incident.io) rather than anchoring everything on Kubernetes.
- Gloss the AI terms inline (agent, MCP, playbook, walker) instead of
  assuming them; explain the Claude Code dependency and that investigation
  context is sent to Claude.
- State the safety and operating model plainly: read-only cluster/cloud
  access, the only writes are reviewed Git PRs, runs per-laptop, teams
  share via Git.
- Trim the command list to `triagent help` plus the two start forms.
- Drop the internal "preflight" vocabulary from the user-facing flow.
- Add the contributor merge bar (frontend typecheck) and a PATH note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 30, 2026 17:43

Copilot AI left a comment

Pull request overview

Docs-only rewrite of README.md that reframes triagent from Kubernetes-first "agentic incident investigation" toward cloud incident triage in plain language, glossing AI vocabulary inline and clarifying the operating/safety model. Also trims the command list, drops the now-inaccurate user-facing preflight step, and updates the contributor section to reflect that frontend typecheck is a CI gate not covered by make test.

Changes:

  • Replace AI/architecture-led intro and "what it does" with a plainer SRE-facing pitch, glossing terms like agent, MCP, and playbook.
  • Reword requirements/run flow: clarify Claude Code dependency and data egress, drop the inaccurate preflight namespace/pod-list claim, simplify command list to help and the two start forms.
  • Add ~/.local/bin PATH note and a contributor merge bar covering make test, make lint, and frontend npm run typecheck.

Comment thread README.md

Every tool call stays visible, so you can audit the chain or interrupt at any point. Finished sessions can be shared
so the next operator starts from where you ended, not from the alert.
It connects to the places you already look: Kubernetes, AWS, and GCP (all read-only), Prometheus, Slack, GitHub, and incident.io, plus anything else you wire up via [MCP](https://modelcontextprotocol.io) (the open standard for plugging AI assistants into tools).
Labels

None yet

Projects

None yet

2 participants