Evaluate. Ship. Observe. Own.
Continuous evaluation, safety testing, observability, and release readiness for Microsoft Foundry agents.
AgentOps Accelerator helps Microsoft Foundry agent teams evaluate quality, prepare releases, monitor behavior, and stay accountable after launch. It gives you a practical starting point for agent operations, with Foundry integration as the default path and deeper setup guidance in the full docs.
python -m pip install agentops-accelerator
agentops init
agentops init starts a guided setup that creates your agentops.yaml and
.agentops/ workspace.
Next, follow the tutorial that matches your agent type. The Foundry Prompt Agent and Hosted or HTTP Agent tutorials are listed later in this README.
Use AgentOps Accelerator when you need to:
- Evaluate an agent before release
- Compare changes across versions
- Capture release evidence
- Monitor agent quality and regressions
- Give teams a repeatable way to own agent behavior in production
The accelerator keeps the local workflow simple, then points you to the full docs when you are ready to configure pipelines, dashboards, and release practices.
For setup guides, tutorials, architecture, CI/CD guidance, Doctor checks, and evaluator reference, start with the documentation site:
https://aka.ms/agentops-accelerator
az login
$env:AZURE_AI_FOUNDRY_PROJECT_ENDPOINT = "https://<resource>.services.ai.azure.com/api/projects/<project>"
$env:AZURE_OPENAI_ENDPOINT = "https://<openai-resource>.openai.azure.com"
$env:AZURE_OPENAI_DEPLOYMENT = "gpt-4o-mini"
agentops eval analyze
agentops eval run
agentops doctor --evidence-pack
For Foundry targets, use either project_endpoint: in agentops.yaml or
AZURE_AI_FOUNDRY_PROJECT_ENDPOINT. Config wins when both are set.
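The precedence rule above (config first, environment variable second) can be illustrated with a small sketch. This is not the accelerator's own code: the function name `resolve_project_endpoint` is hypothetical, and it assumes `project_endpoint` is a top-level key in the already-parsed agentops.yaml.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "AZURE_AI_FOUNDRY_PROJECT_ENDPOINT"


def resolve_project_endpoint(
    config: Mapping[str, object],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the config value if set, else the environment value, else None.

    Assumption: `config` is the parsed agentops.yaml as a mapping with
    `project_endpoint` at the top level (hypothetical layout).
    """
    env = os.environ if env is None else env

    # Config wins when both are set.
    from_config = config.get("project_endpoint")
    if isinstance(from_config, str) and from_config.strip():
        return from_config.strip()

    # Fall back to the environment variable.
    from_env = env.get(ENV_VAR, "").strip()
    return from_env or None
```

A quick way to debug "wrong project" surprises is to call a helper like this with your parsed config and print which source supplied the value.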
Outputs land in .agentops/results/latest/:
- results.json - machine-readable (versioned, stable schema)
- report.md - human-readable, PR-friendly
Release evidence lands in .agentops/release/latest/:
- evidence.json - machine-readable production-readiness projection
- evidence.md - PR/release summary
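If you want to script the quick start, the sequence above can be wrapped in a small helper that runs the three commands and then reports which of the listed artifacts are missing. This is a sketch, not part of the accelerator: `run_and_check` is a hypothetical name, and treating a non-zero exit code as failure is an assumption about the CLI.

```python
import subprocess
from pathlib import Path
from typing import Callable, List, Optional, Sequence

# Commands from the quick start, in order.
COMMANDS: List[List[str]] = [
    ["agentops", "eval", "analyze"],
    ["agentops", "eval", "run"],
    ["agentops", "doctor", "--evidence-pack"],
]

# Artifact paths as documented above, relative to the workspace root.
EXPECTED_ARTIFACTS: List[str] = [
    ".agentops/results/latest/results.json",
    ".agentops/results/latest/report.md",
    ".agentops/release/latest/evidence.json",
    ".agentops/release/latest/evidence.md",
]


def run_and_check(
    workspace: Path = Path("."),
    runner: Optional[Callable[[Sequence[str]], int]] = None,
) -> List[str]:
    """Run each command in order, then return the artifacts that are missing.

    Assumption: a non-zero exit code means the step failed.
    """
    if runner is None:
        # Default runner: execute the CLI from the workspace root.
        def runner(cmd: Sequence[str]) -> int:
            return subprocess.run(list(cmd), cwd=workspace).returncode

    for cmd in COMMANDS:
        code = runner(cmd)
        if code != 0:
            raise RuntimeError(f"{' '.join(cmd)} exited with code {code}")

    return [p for p in EXPECTED_ARTIFACTS if not (workspace / p).is_file()]
```

The `runner` parameter exists only so the helper can be exercised without the CLI installed; in normal use, call `run_and_check()` from the workspace root after signing in and setting the endpoint variables.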
Capture the first successful run as a baseline:
New-Item -ItemType Directory -Force .agentops\baseline | Out-Null
Copy-Item .agentops\results\latest\results.json .agentops\baseline\results.json
To see a visible comparison, publish a new agent version with a prompt
that paraphrases instead of copying exact-answer requests, update
agentops.yaml to that new name:version, and compare against the
baseline:
agentops eval run --baseline .agentops/baseline/results.json
The report grows a Comparison vs Baseline section with per-metric deltas.
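For shells other than PowerShell, the baseline capture step can be done with a few lines of Python, and the idea of a per-metric delta is easy to show on toy data. Both helpers below are illustrative sketches: `capture_baseline` and `metric_deltas` are hypothetical names, and the flat metric-name-to-score mapping is an assumption, since the actual results.json schema is not shown here. The accelerator computes the real comparison itself when you pass `--baseline`.

```python
import shutil
from pathlib import Path
from typing import Dict, Mapping


def capture_baseline(workspace: Path = Path(".")) -> Path:
    """Copy the latest results.json into .agentops/baseline/ and return its path."""
    src = workspace / ".agentops" / "results" / "latest" / "results.json"
    dst = workspace / ".agentops" / "baseline" / "results.json"
    dst.parent.mkdir(parents=True, exist_ok=True)  # same effect as New-Item -Force
    shutil.copyfile(src, dst)
    return dst


def metric_deltas(
    baseline: Mapping[str, float],
    current: Mapping[str, float],
) -> Dict[str, float]:
    """Return current minus baseline for every metric present in both runs.

    Assumption: each run is reduced to a flat {metric_name: score} mapping.
    """
    shared = sorted(baseline.keys() & current.keys())
    return {name: current[name] - baseline[name] for name in shared}
```

A positive delta means the current run scored higher than the baseline on that metric; metrics that appear in only one run are skipped rather than guessed.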
Install optional extras as needed: [agent] for Doctor/Cockpit and [mcp] for MCP.
- agentops --version - show installed version.
- agentops init - bootstrap config and seed data.
- agentops eval analyze - check eval readiness.
- agentops eval init - bootstrap an azd eval.yaml recipe and wire execution: azd.
- agentops eval run [--baseline PATH] - run an evaluation.
- agentops eval promote-traces --source FILE [--apply] - promote local trace export files.
- agentops telemetry validate NAME - validate an Azure Monitor or Application Insights import.
- agentops telemetry preview NAME --rows N - preview telemetry import rows.
- agentops telemetry import NAME --apply - write the imported telemetry dataset.
- agentops report generate - regenerate report.md.
- agentops workflow analyze - recommend CI/CD shape.
- agentops workflow generate - generate CI/CD workflows.
- agentops skills install - install Copilot or Claude skills.
- agentops mcp serve - start the MCP server.
- agentops doctor [--evidence-pack] - run readiness checks.
- agentops cockpit - open the local Cockpit.
- agentops agent serve - serve Doctor as a Copilot Extension.
agentops cockpit opens a localhost command center for the current workspace.
It combines eval history, Doctor findings, workflow status, and links to the
matching Foundry and Azure Monitor views.
Cockpit sections, in display order:
- Foundry connection - project, tenant, agent, App Insights.
- Foundry launchpad - links for the agent, project, and telemetry.
- Observability readiness - tracing, evals, red team, alerts.
- AgentOps Doctor - latest Doctor findings.
- Eval gate summary - local and CI gate history.
- Quality gate summary - score trends and regressions.
- Production signal - App Insights health snapshot.
- CI/CD Pipelines - GitHub Actions status.
- Next actions - contextual recommendations.
Tutorials and guides:
- Foundry Prompt Agent tutorial - use this when the Foundry target is agent: name:version. Walks the sandbox to dev journey with a PR gate.
- Hosted or HTTP Agent tutorial - use this when the target is a Foundry hosted or HTTP endpoint URL. Same sandbox to dev journey for endpoint-based agents.
- End-to-end tutorial - extends either of the above with the full sandbox to dev to qa to prod promotion, Foundry red-team scans, and trace-to-regression promotion.
- Evaluation paths - choose static dataset, grey-box HTTP, or telemetry/trace import.
- Core concepts
- How it works
- Doctor explained
- CI/CD with GitHub Actions
- Built-in evaluator reference
- Release process
See CONTRIBUTING.md for development, testing, and contribution guidance.