Add advanced Hosted Agent tutorial for network-isolated Azure AI Landing Zone environments #340

Description

@placerda

Summary

Add a separate advanced tutorial that shows how to run the AgentOps Hosted Agent workflow inside a network-isolated enterprise Azure environment, using Azure AI Landing Zone as the baseline infrastructure.

This should not be a short note in the basic Hosted Agent tutorial. It should be its own tutorial, or an advanced variant, because the target reader needs additional landing-zone prerequisites, connectivity decisions, validation steps, and operational guidance before the normal Hosted Agent flow can work safely.

Background

The basic Hosted Agent tutorial is useful for the standard sandbox-to-production workflow. Enterprise users also need guidance for environments where public access is restricted and core AI resources are deployed behind private networking.

The Azure AI Landing Zone repository, https://github.com/Azure/bicep-ptn-aiml-landing-zone, describes an enterprise-scale, production-ready reference architecture for secure and resilient AI applications and agents on Azure. This tutorial should treat Azure AI Landing Zone as the baseline infrastructure that is deployed first. AgentOps should not duplicate the landing-zone architecture. Instead, the tutorial should explain how to operate AgentOps and Hosted Agent within that isolated baseline.

Proposed tutorial scope

Create a tutorial that explains this workflow:

  1. Deploy Azure AI Landing Zone first, following the landing-zone guidance and enterprise controls.
  2. Confirm the isolated environment has the required networking, identity, monitoring, and DevOps access paths.
  3. Deploy or configure AgentOps and Hosted Agent to operate inside that environment.
  4. Reuse the existing Hosted Agent workflow where possible:
    • sandbox/dev
    • evaluate
    • ship through a PR gate
    • observe
    • own and collect evidence
  5. Add the extra validation steps needed when public network access is disabled or tightly restricted.

Topics to cover

The tutorial should include practical guidance for at least these areas:

  • Private networking and private endpoints: which resources need private access, DNS expectations, and how to validate name resolution and connectivity.
  • Identity and RBAC: managed identity or workload identity usage, least-privilege role assignments, and expected access boundaries between AgentOps, Foundry, storage, monitoring, and deployment resources.
  • GitHub Actions, OIDC, and runner connectivity: how PR gates and deployment workflows authenticate, and what changes when GitHub-hosted runners cannot reach private endpoints. Cover options such as self-hosted runners, private network access patterns, and OIDC trust configuration.
  • Azure AI Foundry project endpoints: how Hosted Agent connects to project endpoints in an isolated environment, including validation for private endpoint access and blocked public access.
  • Application Insights and Azure Monitor access: how telemetry is emitted, queried, and reviewed when monitoring resources are private or ingestion/query paths are restricted.
  • Egress constraints: required outbound dependencies, expected deny-by-default behavior, and how to document approved egress when needed.
  • Secrets and configuration: Key Vault or managed identity patterns, configuration values that differ from the basic tutorial, and how to avoid placing secrets in repository files or GitHub logs.
  • Evaluation and telemetry under restricted public access: how evaluation runs, traces, logs, and evidence collection work when the environment cannot call public endpoints freely.
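
As an illustration of the DNS validation mentioned under private networking and Foundry project endpoints, the tutorial could include a small check that a hostname resolves only to private addresses from inside the isolated network. The sketch below is a minimal example using only the Python standard library. The function names are hypothetical, and it assumes that "resolving privately" can be approximated by every returned address falling in a private range, which may not match every landing-zone DNS design.

```python
import ipaddress
import socket


def resolve_addresses(hostname: str, port: int = 443) -> list[str]:
    """Return the distinct IP addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses: list[str]) -> bool:
    """True only if at least one address is given and every address is private."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )


def check_private_resolution(hostname: str) -> tuple[bool, list[str]]:
    """Resolve hostname and report whether it resolved only to private addresses.

    A resolution failure is reported as (False, []) so the caller can treat it
    the same way as a public resolution: the check did not pass.
    """
    try:
        addresses = resolve_addresses(hostname)
    except socket.gaierror:
        return False, []
    return all_private(addresses), addresses
```

Run from a self-hosted runner or a jump host inside the virtual network, `check_private_resolution("<your-resource-hostname>")` returning `(False, [...])` with a public address would point to the "endpoint resolving publicly" case in the troubleshooting outline. The hostname placeholder is deliberately left unfilled because the real value depends on the deployed landing zone.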

Suggested structure

A possible tutorial outline:

  1. When to use this tutorial

    • Explain that this is for enterprise or regulated environments using Azure AI Landing Zone and private networking.
    • Link to the basic Hosted Agent tutorial for non-isolated scenarios.
  2. Architecture baseline

    • Introduce Azure AI Landing Zone as the prerequisite baseline.
    • Clarify that AgentOps builds on the landing zone instead of recreating it.
  3. Prerequisites

    • Azure AI Landing Zone deployed.
    • Required subscriptions, resource groups, identity, private DNS, private endpoints, and monitoring resources available.
    • Required GitHub repository, environments, OIDC trust, and runner strategy decided.
  4. Connectivity validation

    • Validate private DNS and private endpoint resolution.
    • Validate GitHub Actions or self-hosted runner access.
    • Validate Foundry project access.
    • Validate Application Insights or Azure Monitor ingestion and query access.
  5. Run the Hosted Agent flow inside the isolated environment

    • Adapt the sandbox/dev step.
    • Run evaluation.
    • Enforce the PR gate.
    • Ship safely.
    • Observe through private monitoring access.
    • Capture ownership and evidence.
  6. Troubleshooting

    • Private DNS failures.
    • Missing RBAC assignments.
    • Runner cannot reach private endpoints.
    • Foundry endpoint blocked or resolving publicly.
    • Telemetry ingestion/query failures.
    • Evaluation blocked by egress policy.
  7. Cleanup and governance notes

    • Explain what can be cleaned up from the tutorial and what is owned by the landing zone.
    • Warn readers not to remove shared landing-zone resources by mistake.
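
For the connectivity validation and "runner cannot reach private endpoints" items in the outline above, the tutorial could offer a simple TCP reachability probe that a workflow step runs before the Hosted Agent flow starts. The following is a minimal sketch using only the Python standard library. The function name, default port, and timeout are assumptions for illustration, and a successful TCP connection shows only that the network path is open, not that identity or RBAC is configured correctly.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    DNS failures, refused connections, and timeouts all raise OSError
    subclasses, so they are reported uniformly as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A workflow could call `can_connect` for each private endpoint hostname it depends on and fail early with a clear message, which keeps the failure distinct from later authentication or authorization errors.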

Acceptance criteria

  • A new standalone tutorial, or clearly separated advanced variant, is added for running Hosted Agent in a network-isolated enterprise environment.
  • The tutorial explicitly uses Azure AI Landing Zone as the pre-deployed baseline infrastructure and links to https://github.com/Azure/bicep-ptn-aiml-landing-zone.
  • The tutorial makes clear that AgentOps should consume the landing-zone baseline rather than duplicate the landing-zone architecture.
  • The tutorial reuses the existing Hosted Agent workflow where possible: sandbox/dev, evaluate, ship through a PR gate, observe, and own and collect evidence.
  • The tutorial documents additional prerequisites and validation steps for private networking, private endpoints, DNS, identity/RBAC, runner connectivity, Foundry access, monitoring access, egress restrictions, secrets/configuration, evaluation, and telemetry.
  • The tutorial includes a practical troubleshooting section for common isolated-environment failures.
  • The tutorial keeps security guidance concrete and avoids asking users to weaken enterprise isolation controls just to make the sample work.
  • Links from the existing Hosted Agent tutorial point readers to this advanced tutorial when they need enterprise network isolation.

Suggested labels

  • documentation
  • tutorial
  • enhancement
  • enterprise
  • security
