[Agents Extension] Add test scenarios using the cli-interactive-tester tool by trangevi · Pull Request #8524 · Azure/azure-dev

trangevi · 2026-06-02T18:16:58Z

No description provided.

Copilot

Pull request overview

Adds a comprehensive, goal-based suite of manual interactive test scenarios for the azure.ai.agents azd extension, designed to be driven via the cli-interactive-tester MCP server. This codifies repeatable end-to-end command flows (from offline help/version checks through Tier 2 cloud provision/deploy/invoke) along with a profile/override mechanism and supporting fixtures.

Changes:

- Introduces a tiered scenario catalog (00-, 10-, 2x-) with tagging conventions for selective runs and fleet orchestration.
- Adds shared profile defaults (profile.yaml), a local override template (profile.local.yaml.example), and gitignore rules for local profiles and run artifacts.
- Adds a minimal “from-code” Python fixture used by scaffold-only init scenarios, and documents scenario usage in both the scenarios README and the extension AGENTS.md.
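
As a rough illustration of the tiered catalog, the sketch below derives a tier from a scenario file name. It assumes the tier is encoded by the first character of the prefix (00- for Tier 0, 10- for Tier 1, 2x- for Tier 2) and that lexical order is the intended run order; `tier_of` and `select` are hypothetical helper names, not part of the PR or the tester tool.

```python
import re

# Assumption: first character encodes the tier, second is a digit or an
# uppercase letter (e.g. 00-, 10-, 21-, 2A-, 2Z-).
_PREFIX = re.compile(r"^([0-2])[0-9A-Z]-")


def tier_of(filename: str) -> int:
    """Return the tier implied by a scenario file name."""
    match = _PREFIX.match(filename)
    if match is None:
        raise ValueError(f"not a scenario file name: {filename!r}")
    return int(match.group(1))


def select(filenames, max_tier: int):
    """Keep scenarios at or below max_tier, sorted lexically.

    Lexical order puts 20-setup first and 2Z-teardown last within Tier 2,
    because ASCII digits sort before uppercase letters.
    """
    return sorted(name for name in filenames if tier_of(name) <= max_tier)
```

For example, `select(["2Z-teardown-down.yaml", "00-version.yaml", "10-init-from-code.yaml"], 1)` keeps only the Tier 0 and Tier 1 files.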

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 4 comments.

File	Description
cli/azd/extensions/azure.ai.agents/AGENTS.md	Documents the existence/intent of the manual cli-interactive-tester scenario suite and how contributors should use it.
cli/azd/extensions/azure.ai.agents/cspell.yaml	Adds a new word to prevent false-positive spellcheck failures from scenario docs.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/.gitignore	Ignores local profiles and tester output artifacts.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/README.md	Provides full orchestration guidance (tiers, tags, WSL path rules, auth prerequisites, hooks, and fleet mode).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/profile.yaml	Defines repo-shared default profile values (region/model/shared suffix).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/profile.local.yaml.example	Provides a template for per-user/per-CI identifying values (prefix/subscription/optional tenant).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/fixtures/from-code/app.py	Minimal Python source fixture for “init from existing code” scenarios.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/fixtures/from-code/requirements.txt	Minimal requirements file to ensure Python project detection during init-from-code flows.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-version.yaml	Tier 0 scenario for `azd ai agent version`.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-help-root.yaml	Tier 0 scenario validating root help output/command discovery.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-sample-list-text.yaml	Tier 0 scenario for `sample list` text rendering.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-sample-list-json-filters.yaml	Tier 0 scenario for `sample list` JSON output and filtering flags.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-doctor-empty-dir.yaml	Tier 0 scenario for `doctor` behavior in an empty directory.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-doctor-local-only.yaml	Tier 0 scenario for `doctor --local-only`.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-init-validate-mutually-exclusive.yaml	Tier 0 negative-path scenario validating init argument conflicts.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-init-validate-no-prompt-missing.yaml	Tier 0 negative-path scenario validating `--no-prompt` missing inputs behavior.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/00-init-picker-navigation.yaml	Tier 0 scenario focusing on init picker UX (filtering, navigation, abort behavior).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-template-python.yaml	Tier 1 scenario scaffolding from a Python template (auth required; stops before provision).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-template-dotnet.yaml	Tier 1 scenario scaffolding from a .NET template (auth required; stops before provision).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-from-manifest-url.yaml	Tier 1 scenario scaffolding from a GitHub manifest URL (auth + gh auth prerequisite).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-from-code.yaml	Tier 1 scenario for “use code in current directory” flow using seeded fixture.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-flags-agent-name-model.yaml	Tier 1 scenario validating `--agent-name`/`--model` overrides when initializing from a manifest URL.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/10-init-deploy-mode-code.yaml	Tier 1 scenario validating interactive code-deploy mode prompts (entry point/runtime).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/20-setup-deploy-shared-agent.yaml	Tier 2 setup scenario that provisions and deploys a shared agent used by subsequent Tier 2 scenarios.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/21-show.yaml	Tier 2 scenario validating `show` table output.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/21-show-json.yaml	Tier 2 scenario validating `show --output json`.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/22-invoke-remote.yaml	Tier 2 scenario validating remote `invoke`.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/22-invoke-new-session.yaml	Tier 2 scenario validating session vs conversation memory semantics for invoke.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/22-invoke-input-file.yaml	Tier 2 scenario validating `invoke -f` request-body-from-file behavior.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/23-sessions-lifecycle.yaml	Tier 2 scenario validating the sessions lifecycle command group.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/24-files-lifecycle.yaml	Tier 2 scenario validating the files lifecycle command group.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/25-monitor-console.yaml	Tier 2 scenario validating monitor console logs.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/25-monitor-system.yaml	Tier 2 scenario validating monitor system/container events.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/26-endpoint-update.yaml	Tier 2 scenario validating `endpoint update` behavior (patching without new version).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/27-run-local-and-invoke-local.yaml	Tier 2 scenario validating `run` + `invoke --local` with allocated ports and two sessions.
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/2A-doctor-provisioned-all-pass.yaml	Tier 2 scenario validating doctor against a provisioned project (with a known-acceptable warning).
cli/azd/extensions/azure.ai.agents/tests/cli-interactive-tester-scenarios/2Z-teardown-down.yaml	Tier 2 teardown scenario to destroy resources and clean the shared working directory.
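
The profile.yaml and profile.local.yaml.example rows describe shared defaults overlaid by per-user values. A minimal sketch of that overlay follows, assuming a flat key/value profile where local values win. The key names and values are placeholders chosen for illustration, and `merge_profiles` is a hypothetical helper; the PR does not show how the tester actually resolves profiles.

```python
def merge_profiles(shared: dict, local: dict) -> dict:
    """Overlay local profile values on shared defaults; local wins.

    Keys left unset (None) in the local profile fall back to the shared value.
    """
    merged = dict(shared)
    merged.update({key: value for key, value in local.items() if value is not None})
    return merged


# Placeholder values, not taken from the PR.
shared_defaults = {"region": "example-region", "model": "example-model", "suffix": "shared"}
local_overrides = {"prefix": "alice", "subscription": "00000000-0000-0000-0000-000000000000", "tenant": None}

profile = merge_profiles(shared_defaults, local_overrides)
```

Keeping identifying values only in the local file matches the gitignore rule for local profiles described above.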

github-actions · 2026-06-05T21:51:37Z

📋 Prioritization Note

Thanks for the contribution! The linked issue isn't in the current milestone yet.
Review may take a bit longer — reach out to @rajeshkamal5050 or @kristenwomack if you'd like to discuss prioritization.

Adds a workflow skill under .github/skills/agent-scenario-tests/ that resolves the current branch's PR, maps changed files to impacted cli-interactive-tester scenario tags, drives the matching scenarios through the tester MCP server, and posts a results comment on the PR. Cost-aware: Tier 2 runs only after explicit user confirmation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
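
The mapping from changed files to scenario tags could look roughly like the sketch below. The glob patterns and the `impacted_tags` helper are hypothetical; only the cmd:eval and cmd:optimize tag names appear in this PR, and the skill's real impact mapping is not shown here.

```python
from fnmatch import fnmatch

# Hypothetical impact map: glob over a changed path -> scenario tags to run.
IMPACT_MAP = [
    ("*/cmd/eval*.go", {"cmd:eval"}),
    ("*/cmd/optimize*.go", {"cmd:optimize"}),
]


def impacted_tags(changed_files):
    """Collect the tags of every pattern that matches at least one changed file."""
    tags = set()
    for path in changed_files:
        for pattern, pattern_tags in IMPACT_MAP:
            if fnmatch(path, pattern):
                tags |= pattern_tags
    return tags
```

A file that matches no pattern contributes no tags, so an unrelated change selects no scenarios in this sketch.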

…coverage

Add 9 cli-interactive-tester scenarios closing the coverage gaps found in the PR #8524 review:

- eval (cmd:eval): 00-eval-context-required (offline endpoint-required) and 28-eval-lifecycle (Tier 2 init/run/list/show against the shared agent).
- optimize (cmd:optimize): 00-optimize-apply-requires-candidate (offline required-flag) and 29-optimize-submit-and-cancel (Tier 2, capped iteration).
- invoke: 00-invoke-validate-protocol (offline unsupported-protocol) and 23-invoke-protocol-invocations (Tier 2 invocations memory semantics).
- init: 00-init-validate-deploy-mode (offline value/required-flag validation) and 10-init-deploy-mode-container (Tier 1 container scaffold).
- doctor: 00-doctor-partial-failure (mixed PASS+FAIL, exit 1).

Add cmd:eval and cmd:optimize to the tag taxonomy, update the scenarios README tier tables, and update the agent-scenario-tests skill impact-mapping (eval and optimize are now covered Tier 2 commands, no longer listed as gaps).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

A local scenario run revealed that 00-init-validate-deploy-mode never exercised the --deploy-mode validation: in an empty directory with --no-prompt, init fails earlier with 'template selection requires interactive mode' because validateCodeDeployInput is only reached after an init method resolves.

Reclassify Tier 0 -> Tier 1, rename to 10-init-validate-deploy-mode.yaml, and seed the from-code fixture so the from-code method resolves and the bogus --deploy-mode value is actually rejected ('--deploy-mode must be container or code'). Reaching the check scaffolds a starter template (network), hence Tier 1.

Note the late-validation UX (template scaffolded before the flag is validated) as a report_finding. Update the README tier tables accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…dings

A full Tier 0+1+2 cli-interactive-tester run against a freshly deployed shared agent surfaced two scenario-accuracy issues (the CLI itself behaved correctly):

- 28-eval-lifecycle: 'eval init --no-wait' is ASYNC; it submits dataset (datagen-*) and evaluator (evaluatorgen-*) generation jobs and writes eval.yaml, but does NOT create an eval 'run'. So 'eval list' legitimately shows 0 rows right after init and 'eval show' (no id) errors cleanly. Refined the header + goals to describe the async semantics and treat an empty list / eval-id-required message as expected rather than a failure.
- 29-optimize-submit-and-cancel: the optimize command group is preview-gated per subscription. On a non-enrolled subscription both 'optimize' and 'optimize list' return a clean 400 SubscriptionNotRegistered (signup: aka.ms/ao/quickstart), so the submit->status->cancel lifecycle can't run. Documented the Agent Optimizer enrollment prerequisite and added a gating check that accepts the clean SubscriptionNotRegistered error as a valid outcome when enrollment is absent.

Add cspell words: datagen, evaluatorgen, signup.

All Tier 0 (13) and Tier 1 (2) scenarios and the Tier 2 setup/invoke/teardown passed; resources were fully torn down with azd down --force --purge.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Add 6 new cli-interactive-tester scenarios covering three recently merged commands:

Tier 0 (offline help validation):

- 00-delete-help.yaml: validates azd ai agent delete --help output
- 00-endpoint-show-help.yaml: validates azd ai agent endpoint show --help output
- 00-code-download-help.yaml: validates azd ai agent code download --help output

Tier 2 (cloud E2E, run between 2A-doctor and 2Z-teardown):

- 2B-endpoint-show.yaml: shows endpoint config (table + JSON output)
- 2C-code-download.yaml: negative-path test (container agent returns AgentNotCodeBased)
- 2D-delete.yaml: deletes agent with --force, confirms removal via show

All scenarios tested locally: 6/6 PASS.

Co-authored-by: Jian Wu <wujia@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The 4 Tier 1 scenarios that copy fixtures used a hardcoded /mnt/c/Repos/... fallback path (Travis's machine). Replace with bash :? expansion so that missing AZD_AGENTS_FIXTURES fails immediately with a clear message instead of a cryptic 'No such file or directory'.

Co-authored-by: Jian Wu <wujia@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

MCP-tool-driven workflow for running cli-interactive-tester scenarios. Manual dispatch only (workflow_dispatch) with tier selection (0, 0+1, 0+1+2).

Key design:

- All scenarios executed via cli-interactive-tester MCP tool (not shell parsing)
- Tool installed via git clone + pip install -e from coreai-microsoft repo
- Checkout hardcoded to trangevi/test-scenarios (until PR Azure#8524 merges)
- ubuntu-22.04 runner (consistent with existing pipelines)
- profile.local.yaml generated from GitHub secrets at runtime
- Tier 2 includes always-run teardown step for resource cleanup
- Results uploaded as artifacts

Blocking: python -m auto_test_tool.runner batch mode needs to be confirmed or implemented. Without it, scenarios cannot run headlessly in CI.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
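
To make the tier selector concrete, here is a small sketch of how a dispatch input such as 0+1+2 might be turned into a list of tiers, and when the teardown step applies. `parse_tier_selector` and `needs_teardown` are hypothetical names; the workflow's actual input handling is not shown in this PR.

```python
# The three selector values named in the workflow description.
ALLOWED_SELECTORS = ("0", "0+1", "0+1+2")


def parse_tier_selector(selector: str) -> list:
    """Turn a selector such as '0+1' into the list of tiers to run."""
    if selector not in ALLOWED_SELECTORS:
        raise ValueError(f"unsupported tier selector: {selector!r}")
    return [int(part) for part in selector.split("+")]


def needs_teardown(tiers) -> bool:
    """Teardown only matters when Tier 2 (cloud resources) was selected."""
    return 2 in tiers
```

Rejecting anything outside the three listed values keeps a typo in the dispatch input from silently running the cloud tier.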

Copilot-driven pipeline using cli-interactive-tester MCP tool. Same architecture as local testing:

Copilot CLI (LLM) ↔ MCP ↔ cli-interactive-tester ↔ tmux ↔ azd CLI

Design:

- workflow_dispatch only (tier selector: 0 / 0+1 / 0+1+2)
- ubuntu-22.04 runner
- cli-interactive-tester installed via git clone + pip install -e
- MCP config generated for Copilot to connect to the tool
- Copilot reads scenario goals and drives terminal autonomously
- Tier 2 has always-run teardown for Azure resource cleanup
- Results uploaded as artifacts (HTML reports + screenshots)

Checkout: hardcoded to trangevi/test-scenarios (until PR Azure#8524 merges)

Blocking: need to confirm how to invoke Copilot CLI headlessly in CI (copilot --mcp-config --prompt-file, gh copilot run, or Extensions API)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Agentic Workflow (.md source) for Copilot-driven E2E testing. Uses gh-aw framework, same pattern as extension-pr-labeler.

Architecture: gh-aw framework → Copilot CLI (LLM) ↔ MCP ↔ cli-interactive-tester ↔ tmux ↔ azd

Key design:

- gh-aw .md source file (compile with 'gh aw compile' to generate .lock.yml)
- cli-interactive-tester registered as MCP tool in frontmatter
- Copilot reads scenario YAML goals and drives terminal autonomously
- workflow_dispatch with tier selector (0 / 0+1 / 0+1+2)
- Setup: Go build, Python 3.12, tmux, uv, Azure login, test profile
- Checkout: trangevi/test-scenarios (until PR Azure#8524 merges)

TODO:

- Confirm cli-interactive-tester repo visibility (public/private)
- Run 'gh aw compile' to generate .lock.yml
- Configure secrets: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID, FOUNDRY_PROJECT_ENDPOINT, GH_TOKEN, COPILOT_GITHUB_TOKEN

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Agentic Workflow for Copilot-driven E2E testing. Uses gh-aw framework, same pattern as extension-pr-labeler.

Architecture: gh-aw framework → Copilot CLI (LLM) ↔ MCP ↔ cli-interactive-tester ↔ tmux ↔ azd

Key design:

- cli-interactive-tester registered as MCP tool in frontmatter
- Copilot reads scenario YAML goals and drives terminal autonomously
- workflow_dispatch with tier selector (0 / 0+1 / 0+1+2)
- Setup: Go build, Python 3.12, tmux, uv, Azure login, test profile
- Checkout: trangevi/test-scenarios (until PR Azure#8524 merges)

TODO:

- Confirm cli-interactive-tester repo visibility (public/private)
- Run 'gh aw compile' to generate .lock.yml
- Configure secrets: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID, FOUNDRY_PROJECT_ENDPOINT, GH_TOKEN, COPILOT_GITHUB_TOKEN

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot CLI-driven pipeline using cli-interactive-tester MCP tool. Same architecture as local testing: Copilot reads scenario goals and drives terminal via MCP protocol.

Implementation:

- Copilot CLI installed via npm install -g @github/copilot
- Auth via COPILOT_GITHUB_TOKEN (Fine-grained PAT, Copilot Requests perm)
- MCP config in ~/.copilot/mcp-config.json (auto-loaded by Copilot)
- Execution: copilot -p prompt --allow-tool=... --no-ask-user
- workflow_dispatch with tier selector (0 / 0+1 / 0+1+2)
- ubuntu-22.04 runner
- Checkout: trangevi/test-scenarios (until PR Azure#8524 merges)
- Tier 2 has always-run teardown for Azure resource cleanup
- Results uploaded as artifacts

TODO:

- Confirm --allow-tool syntax for MCP-registered tools
- Configure COPILOT_PAT secret (Fine-grained PAT)
- Confirm cli-interactive-tester repo visibility
- Create prompt-ci-run.md in scenarios directory

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

therealjohn

Approved, but I'm requesting changes to block the merge until someone on the AZD team can also take a look. I will ping them and re-approve once they have reviewed.

Add test scenarios for the cli-interactive-tester tool

16db9dd

Signed-off-by: trangevi <trangevi@microsoft.com>

microsoft-github-policy-service Bot assigned trangevi Jun 2, 2026

github-actions Bot added the ext-agents azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes} extensions label Jun 2, 2026

trangevi added 11 commits June 2, 2026 14:14

Some scenario edits

4fbaa67

Signed-off-by: trangevi <trangevi@microsoft.com>

Picking up recent tester tool updates

61fc535

Signed-off-by: trangevi <trangevi@microsoft.com>

Some scenario updates

596826b

Signed-off-by: trangevi <trangevi@microsoft.com>

Some more fixes to the scenarios

71767db

Signed-off-by: trangevi <trangevi@microsoft.com>

Some more improvements

df67a66

Signed-off-by: trangevi <trangevi@microsoft.com>

Remove optimization and evals because I don't understand them yet, wi…

3eac215

…ll add back later Signed-off-by: trangevi <trangevi@microsoft.com>

Add prompt to readme

c18315c

Signed-off-by: trangevi <trangevi@microsoft.com>

Add parameterization support

1d1e5f2

Signed-off-by: trangevi <trangevi@microsoft.com>

Add tags

d3cd9b5

Signed-off-by: trangevi <trangevi@microsoft.com>

Agents.md update, to direct people to the testing

ea2bac4

Signed-off-by: trangevi <trangevi@microsoft.com>

cspell

8a63c1d

Signed-off-by: trangevi <trangevi@microsoft.com>

trangevi marked this pull request as ready for review June 5, 2026 21:32

Copilot AI review requested due to automatic review settings June 5, 2026 21:32

trangevi requested review from JeffreyCA, glharper, therealjohn and trrwilson as code owners June 5, 2026 21:32

Copilot started reviewing on behalf of trangevi June 5, 2026 21:33 View session

Copilot AI reviewed Jun 5, 2026

trangevi linked an issue Jun 5, 2026 that may be closed by this pull request

Initial pass at scenario testing #8557

Open

PR comments

4793e9f

Signed-off-by: trangevi <trangevi@microsoft.com>

glharper requested review from RickWinter, danieljurek, tg-msft and vhvb1989 as code owners June 8, 2026 15:58

glharper and others added 5 commits June 9, 2026 13:21

v1212 mentioned this pull request Jun 11, 2026

Add E2E test pipeline for azure.ai.agents extension (Tier 0/1) #8607

Draft

glharper approved these changes Jun 11, 2026

glharper enabled auto-merge (squash) June 11, 2026 14:43

therealjohn requested changes Jun 11, 2026

trangevi wants to merge 19 commits into main from trangevi/test-scenarios

5 participants
