feat(appkit): reference agent-app, dev-playground chat UI, docs, and template by MarioCadenas · Pull Request #306 · databricks/appkit

MarioCadenas · 2026-04-21T17:58:48Z

Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.

Reference application: agent-app

apps/agent-app/ — a standalone app purpose-built around the agents
feature. Demonstrates every major capability in one place:

- Markdown orchestrator (config/agents/assistant.md, default)
  with destructive file tools (upload, delete) for HITL demo, and
  agents: [support, researcher] delegating to both a markdown
  sibling and a code-defined specialist.
- Markdown specialist (config/agents/support.md) — full analytics
  + files toolkits, plus ambient get_weather.
- Code-defined specialist (researcher in server.ts) — defined
  in code specifically because its MCP tool set is conditional on
  runtime env vars, which markdown frontmatter can't express.
  Referenced from assistant.md via the markdown → code cross-reference.
- server.ts — concise: ambient tool() factories, conditional MCP
  wiring, zero-trust host allowlist derived from the same env vars,
  agents() plugin config with autoInheritTools and mcp.trustedHosts.
- Vite + React 19 + TailwindCSS frontend with a chat UI showing
  streaming tokens, tool calls, and an approval card that approves or
  denies destructive tool requests over /api/agents/approve.
- Databricks deployment config (databricks.yml, app.yaml) and
  .env.example for local dev.

dev-playground chat UI + autocomplete agent

apps/dev-playground/client/src/routes/agent.route.tsx — chat UI
with inline autocomplete (hits the autocomplete markdown agent
configured with ephemeral: true) and a full threaded conversation
panel (hits the default agent).

apps/dev-playground/server/index.ts — code-defined helper agent
using fromPlugin(analytics) alongside the markdown-driven
autocomplete agent in config/agents/, demonstrating the mixed
setup against the same plugin list. Route tree (routeTree.gen.ts)
regenerated to include the new /agent route.

Docs

docs/docs/plugins/agents.md — progressive guide covering:

1. Drop a markdown file → it just works.
2. Scope tools via toolkits: / tools: frontmatter.
3. Code-defined agents with fromPlugin().
4. Sub-agents (markdown → markdown, markdown → code, code → code).
5. Standalone runAgent() (no createApp or HTTP).

Plus configuration reference (including approval, limits, mcp
keys), runtime API reference, and a full frontmatter schema table.

docs/docs/api/appkit/ — typedoc regenerated for the full agents
public surface including AgentDefinition.ephemeral,
AgentsPluginConfig.{approval, limits, mcp}, updated
loadAgentFromFile / loadAgentsFromDir signatures, expanded
AgentEvent union, and ToolkitEntry annotations.

Template

template/appkit.plugins.json — adds the agents plugin entry so
npx @databricks/appkit init --features agents scaffolds the plugin
correctly.

Test plan

- Full appkit vitest suite: 1552 tests passing at stack tip
- Typecheck clean across all 8 workspace projects
- pnpm docs:build clean (no broken links)
- pnpm --filter=@databricks/appkit build:package clean, publint clean

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

PR Stack

Shared agent types + LLM adapters — feat(appkit): shared agent types and LLM adapter implementations #301
Tool primitives + ToolProvider surfaces — feat(appkit): tool primitives and ToolProvider surfaces on core plugins #302
Plugin infrastructure (attachContext + PluginContext) — feat(appkit): plugin infrastructure — attachContext + PluginContext mediator #303
agents() plugin + createAgent(def) + markdown-driven agents — feat(appkit): agents() plugin, createAgent(def), and markdown-driven agents #304
fromPlugin() DX + runAgent plugins arg + toolkit-resolver — feat(appkit): fromPlugin() DX, runAgent plugins arg, shared toolkit-resolver #305
Reference app + dev-playground + docs (this PR)

Demo

agent-demo.mp4

pkosiec

(not yet a full review, just some partial comments)

pkosiec · 2026-05-07T10:29:37Z

+    ...fromPlugin(analytics),                             // all analytics tools
+    ...fromPlugin(files, { only: ["uploads.read"] }),     // filtered subset
+    get_weather: tool({
+      name: "get_weather",


Why do we need a name for a tool which is already registered as a unique key?

pkosiec · 2026-05-07T10:37:45Z

+toolkits:
+  - analytics                             # all analytics.* tools
+  - files: [uploads.read, uploads.list]   # only these files tools
+  - genie: { except: [getConversation] }  # everything but getConversation
+tools: [get_weather]                      # ambient tool declared in code


I'm not sure if it isn't a bit confusing to have such a distinction - in the end, it's always about tools (either custom ones or plugin ones).

Maybe we should rename that to: tools and extraTools
or, have something like:

tools:
  - plugin:analytics
  - plugin:files: { only: [uploads.read] }
  - get_weather

WDYT?

pkosiec · 2026-05-07T10:45:50Z

That's a lot of the code for the dev playground.

Honestly, I'm not sure if that fits in the dev playground - maybe we should move it somewhere? E.g. Move it as a separate template in app-templates?

IMO this is too good to keep it in the dev playground.
Just saying, it's not a blocker but I think it has a huge potential 👍

Actually, as a second thought - the "Smart dashboard" example is kind of reimplementing Metric Views? Correct me if I'm wrong 🤔

pkosiec · 2026-05-07T10:53:02Z

Wait, this file shouldn't be modified, right? Because there's no breaking change if we haven't released agent plugin yet?

(BTW, why do we have so many "Changelog" headers? 😄 something is wrong with our changelog generation, I guess)

pkosiec · 2026-05-07T11:50:35Z

Agentic reviews (aggregated report):

Code Review — agent/v2/6-apps-docs

Context

Branch agent/v2/6-apps-docs (based on agent/v2/5-fromplugin-runagent) adds ~22K lines implementing the full agents system: core agent runtime, agents plugin, MCP client, plugin context mediator, tool primitives, ToolProvider surfaces on all core plugins, smart dashboard demo, template updates, and docs. This review covers both correctness bugs and developer experience issues.

What works well

Progressive disclosure — 5 levels (markdown drop-in → frontmatter scoping → code agents → sub-agents → standalone) is a strong pedagogical ladder

tool() ergonomics — Zod schema drives type inference + JSON Schema generation + runtime validation with LLM-friendly errors

fromPlugin() pattern — elegant lazy references; no instance coupling, spread-friendly, clear error messages with Available: [...] listing

Safety defaults — autoInheritTools: false, approval gates, resource limits, MCP host policy, SQL classifier. Two-key operation for auto-inherit is well-designed

docs/docs/plugins/agents.md — comprehensive, well-structured, covers the full lifecycle with runnable examples at each level

P0 — Must fix

1. Doc/behavior contradiction on auto-inherit

File: docs/docs/plugins/agents.md:57-58 vs :233

Level 1 documentation says:

"4. Auto-inherits every registered ToolProvider plugin's tools (analytics.*, files.*, ...)"

But the Configuration Reference says:

"autoInheritTools defaults to { file: false, code: false } — no tools spread into any agent unless the developer explicitly opts in."

These directly contradict each other. A developer following Level 1 will expect their markdown agent to see all plugin tools out of the box, but it won't — they get zero tools. This is the single most confusing thing in the agents DX.

Fix: Either (a) add agents({ autoInheritTools: { file: true } }) to the Level 1 example, or (b) rewrite the Level 1 narrative to say the agent has no tools yet and point to Level 2.

P1 — Should fix before merge

2. MCP callTool corrupts results when text is undefined

File: packages/appkit/src/connectors/mcp/client.ts:264-275

McpToolCallResult.content[].text is typed text?: string (line 58), but callTool maps c.text without filtering undefined:

.filter((c) => c.type === "text")
.map((c) => c.text)   // c.text can be undefined
.join("\n");          // produces "undefined" in output

Fix: Change both occurrences (error path line 267 and success path line 274) to filter:

.filter((c): c is { type: string; text: string } => c.type === "text" && typeof c.text === "string")
.map((c) => c.text)
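
The filter above can be tried in isolation. The following is a minimal sketch, not the code in client.ts: the `McpContentItem` interface and the `joinTextContent` helper are hypothetical names, modeled only on the review's description of `content[].text` being optional.

```typescript
// Hypothetical shape, modeled on the description of McpToolCallResult.content[].
interface McpContentItem {
  type: string;
  text?: string;
}

// Narrow to items that are text blocks AND actually carry a string.
function isTextItem(c: McpContentItem): c is McpContentItem & { text: string } {
  return c.type === "text" && typeof c.text === "string";
}

// Join only the usable text blocks; items without text are skipped
// instead of being rendered as the literal string "undefined".
function joinTextContent(content: McpContentItem[]): string {
  return content
    .filter(isTextItem)
    .map((c) => c.text)
    .join("\n");
}

const sample: McpContentItem[] = [
  { type: "text", text: "first" },
  { type: "text" }, // text omitted by the server
  { type: "image" },
  { type: "text", text: "second" },
];

console.log(JSON.stringify(joinTextContent(sample))); // "first\nsecond"
```

Pulling the predicate into a named function lets the error path and the success path share one definition, which is one way to avoid the two call sites drifting apart.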

3. MCP sendNotification ignores HTTP error status

File: packages/appkit/src/connectors/mcp/client.ts:362-389

The notifications/initialized fetch doesn't check response.ok. A 4xx/5xx response is silently ignored, making connect() appear successful when the server may not have registered the client.

Fix: Warn (don't throw — MCP spec says notifications are fire-and-forget):

if (!response.ok) {
  logger.warn("MCP notification %s failed: %s %s", method, response.status, response.statusText);
}

4. MCP SSE response body read has no size limit

File: packages/appkit/src/connectors/mcp/client.ts:339-340

response.text() reads the entire SSE body into memory. A malicious or misconfigured MCP server could send unbounded data.

Fix: Read incrementally with a size cap (e.g., 10 MB), or keep response.text() but add a Content-Length check if the header is present and document the 30s AbortSignal.timeout as the backstop.
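
As an illustration of the "read incrementally with a size cap" option, here is a sketch that assumes a runtime with the WHATWG `ReadableStream` and `TextDecoder` globals (Node 18+ or a browser). `readTextWithCap` is a hypothetical helper name, not an existing appkit function.

```typescript
// Read a byte stream into a string, failing as soon as a size cap is exceeded.
async function readTextWithCap(
  body: ReadableStream<Uint8Array>,
  maxBytes: number,
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let received = 0;
  let text = "";
  try {
    for (;;) {
      const result = await reader.read();
      if (result.done) break;
      received += result.value.byteLength;
      if (received > maxBytes) {
        // Stop pulling data from the server before bailing out.
        await reader.cancel();
        throw new Error(`response body exceeded ${maxBytes} bytes`);
      }
      // stream: true keeps multi-byte characters intact across chunk boundaries.
      text += decoder.decode(result.value, { stream: true });
    }
    // Flush any bytes the decoder is still holding.
    return text + decoder.decode();
  } finally {
    reader.releaseLock();
  }
}
```

A caller would pass `response.body` (after checking it is not null) and the chosen cap, for example `10 * 1024 * 1024` for the 10 MB figure mentioned above.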

5. reload() races with in-flight streams

File: packages/appkit/src/plugins/agents/agents.ts:161-168

reload() calls mcpClient.close() and sets it to null. In-flight streams captured this.mcpClient at _streamAgent call time (line 802). The old client is now closed, so sendRpc throws "MCP client is closed" mid-stream.

Fix: Don't close the old client synchronously — let in-flight streams drain. Or simpler: don't close it at all (it has no keep-alive connections; GC collects after in-flight refs drop).

6. tool() return type lies about string-only

Files: packages/appkit/src/core/agent/tools/tool.ts:19, function-tool.ts:19

ToolConfig.execute is typed as (args) => Promise<string> | string, and FunctionTool.execute matches. But the template's helper.ts:29-30 returns objects:

execute: () => ({ now: new Date().toISOString() }),

This works at runtime (the result gets serialized downstream) but the type is wrong. Developers naturally want to return structured objects from tools.

Fix: Widen ToolConfig.execute and FunctionTool.execute return types to unknown | Promise<unknown> (matching defineTool's handler signature), with serialization handled by the runner.
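
A rough sketch of what "serialization handled by the runner" could look like once `execute` is allowed to return `unknown`. `serializeToolResult` is a hypothetical helper; how the real runner serializes results is not shown in this review.

```typescript
// Hypothetical runner-side helper: tools may return anything,
// and the runner turns the value into the string sent back to the model.
function serializeToolResult(result: unknown): string {
  if (typeof result === "string") return result;
  if (result === undefined) return "";
  try {
    // JSON.stringify yields undefined at runtime for functions and symbols.
    return JSON.stringify(result) ?? String(result);
  } catch {
    // Circular structures and BigInt values make JSON.stringify throw.
    return String(result);
  }
}

console.log(serializeToolResult({ ok: true })); // {"ok":true}
console.log(serializeToolResult("plain text")); // plain text
```

With a helper like this at the boundary, a tool such as the template's `execute: () => ({ now: new Date().toISOString() })` type-checks without the author stringifying by hand.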

7. SSE parsing boilerplate in template AgentChat

File: template/client/src/pages/agents/AgentChat.tsx:104-125

~20 lines of manual SSE frame splitting, data-line extraction, and JSON parsing. Every developer who scaffolds an app will copy this verbatim. The comment at line 34 acknowledges the gap. Consider shipping a thin parseSSEStream(reader) async generator in @databricks/appkit-ui so the template doesn't teach low-level SSE plumbing as the canonical pattern.
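
To make the suggestion concrete, a thin async generator along these lines could replace the inline parsing. This is only a sketch: `parseSSEStream` is the name proposed above, but the signature and behavior here are assumptions. It handles `\n\n`-delimited frames with `data:` lines, skips a `[DONE]` sentinel, and does not handle `\r\n` line endings, `event:` fields, or reconnection.

```typescript
// Sketch of a reusable SSE reader; yields one parsed JSON payload per frame.
async function* parseSSEStream(
  reader: ReadableStreamDefaultReader<Uint8Array>,
): AsyncGenerator<unknown> {
  const decoder = new TextDecoder();
  let buffer = "";
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    buffer += decoder.decode(result.value, { stream: true });

    // Frames are separated by a blank line; a frame may span several chunks.
    let sep = buffer.indexOf("\n\n");
    while (sep !== -1) {
      const frame = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data !== "" && data !== "[DONE]") yield JSON.parse(data);
      sep = buffer.indexOf("\n\n");
    }
  }
}
```

A component would then consume it with `for await (const event of parseSSEStream(response.body.getReader())) { ... }`, which keeps frame splitting out of the UI code.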

P2 — Fix if straightforward

8. Files and Genie plugin tools don't forward signal

Files:

packages/appkit/src/plugins/files/plugin.ts:1059-1123

packages/appkit/src/plugins/genie/genie.ts:63-99

All tool handlers in _defineVolumeTools and _defineSpaceTools ignore the signal parameter from ToolEntry.handler. The analytics and lakebase plugins correctly forward it.

Fix: Add signal parameter to each handler. For genie sendMessage, pass it to abort the async iteration.

9. MCP JSON-RPC response shape not validated

File: packages/appkit/src/connectors/mcp/client.ts:349-351

as JsonRpcResponse is an unsafe cast. A malformed server response could pass through without json.error, json.result, or matching json.id.

Fix: Add minimal validation:

if (typeof json !== "object" || json === null || json.jsonrpc !== "2.0") {
  throw new Error(`MCP response for ${method} is not valid JSON-RPC 2.0`);
}
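
The same check can be written as a type guard so the cast disappears entirely. This sketch goes slightly beyond the fix above by also enforcing the JSON-RPC 2.0 rule that a response carries exactly one of `result` or `error`. The `JsonRpcResponse` fields shown are assumptions based on the JSON-RPC 2.0 specification, not the actual type in client.ts.

```typescript
// Assumed shape, following the JSON-RPC 2.0 response object.
interface JsonRpcResponse {
  jsonrpc: "2.0";
  id: number | string | null;
  result?: unknown;
  error?: { code: number; message: string; data?: unknown };
}

function isJsonRpcResponse(value: unknown): value is JsonRpcResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (v.jsonrpc !== "2.0") return false;
  // Exactly one of `result` / `error` must be present.
  return ("result" in v) !== ("error" in v);
}

console.log(isJsonRpcResponse({ jsonrpc: "2.0", id: 1, result: {} })); // true
console.log(isJsonRpcResponse({ jsonrpc: "1.0", id: 1, result: {} })); // false
console.log(isJsonRpcResponse("<!DOCTYPE html>")); // false
```

Matching `json.id` against the request id would still be a separate check at the call site, since the guard does not know which request it is answering.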

10. Hardcoded 30s tool execution timeout

File: packages/appkit/src/core/plugin-context.ts:187

const timeout = 30_000; is not configurable. Complex SQL queries or batch operations may need more.

Fix: Accept an optional timeoutMs in the executeTool signature with 30_000 as default.

11. Missing approval timeout validation

File: packages/appkit/src/plugins/agents/agents.ts:124

cfg.timeoutMs from user config is not validated. A negative or zero value causes immediate denial.

Fix: timeoutMs: Math.max(cfg.timeoutMs ?? 60_000, 1_000)
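
One caveat with the one-liner: `Math.max(NaN, 1_000)` is `NaN`, so a non-numeric config value would still slip through. A slightly more defensive sketch, with hypothetical constant and function names, could look like this:

```typescript
const DEFAULT_APPROVAL_TIMEOUT_MS = 60_000;
const MIN_APPROVAL_TIMEOUT_MS = 1_000;

// Fall back to the default for missing or non-finite values, then apply the floor.
function resolveApprovalTimeout(timeoutMs?: number): number {
  if (timeoutMs === undefined || !Number.isFinite(timeoutMs)) {
    return DEFAULT_APPROVAL_TIMEOUT_MS;
  }
  return Math.max(timeoutMs, MIN_APPROVAL_TIMEOUT_MS);
}

console.log(resolveApprovalTimeout(undefined)); // 60000
console.log(resolveApprovalTimeout(0)); // 1000
console.log(resolveApprovalTimeout(-5)); // 1000
console.log(resolveApprovalTimeout(120_000)); // 120000
console.log(resolveApprovalTimeout(Number.NaN)); // 60000
```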

12. Inconsistent tool annotation fields across plugins

Files: lakebase.ts:181-185, analytics.ts:290, files/plugin.ts:1067,1108,1119, genie.ts:76,96

The ToolAnnotations type defines a preferred effect enum ("read" | "write" | "update" | "destructive") but no plugin uses it — all use the deprecated readOnly/destructive booleans. The lakebase plugin sets both readOnly AND destructive (inverse of each other) which is redundant.

Fix: Standardize on the effect field across all plugins. Remove redundant pairs in lakebase.

13. Stale comments reference destructive: true instead of effect

Files: shared/src/agent.ts:126, core/agent/types.ts:149, docs/docs/plugins/agents.md:334

These all reference the legacy boolean form. Since effect is now preferred and destructive is @deprecated, docs and comments should lead with effect.

14. createAgent({ name }) precedence unclear

File: packages/appkit/src/core/agent/types.ts:77-78

AgentDefinition.name is optional ("Filled in from the enclosing key"), but the template explicitly sets it. Does explicit name override the key? What if they differ? The template should either omit name or the docs should explain precedence.

15. tool() vs defineTool() callback naming inconsistency

Files: tool.ts, define-tool.ts

Two tool-creation functions for different audiences (agent authors vs plugin authors) use different callback names: execute vs handler. A developer switching from agent-side to plugin-side tool authoring will wonder why the API changed.

P3 — Low priority

16. Sub-agent input extraction is lenient

File: packages/appkit/src/plugins/agents/agents.ts:975-980

Falls back to JSON.stringify(args) when args.input isn't a string. An object like { input: null } serializes the entire args object.
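
One possible stricter behavior, sketched under the assumption that a present-but-non-string `input` should be treated as a caller error rather than silently serialized. `extractSubAgentInput` is a hypothetical name, and the real extraction logic in agents.ts may differ.

```typescript
function extractSubAgentInput(args: unknown): string {
  if (typeof args === "string") return args;
  if (typeof args === "object" && args !== null && "input" in args) {
    const input = (args as { input: unknown }).input;
    if (typeof input !== "string") {
      // { input: null } and similar now fail loudly instead of
      // forwarding the whole args object as the prompt.
      throw new TypeError('sub-agent "input" must be a string');
    }
    return input;
  }
  // No `input` key at all: keep the lenient JSON fallback.
  return JSON.stringify(args ?? {});
}
```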

17. AbortSignal.any() requires Node 20+

File: packages/appkit/src/connectors/mcp/client.ts:327

No polyfill or feature detection. Document Node 20+ as minimum requirement or add a fallback.
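
If a fallback is preferred over raising the minimum Node version, feature detection could look roughly like the sketch below. `anySignal` is a hypothetical helper; the fallback path does not remove its listeners from long-lived signals, which a production version would want to address.

```typescript
// Use AbortSignal.any when the runtime has it; otherwise combine manually.
function anySignal(signals: AbortSignal[]): AbortSignal {
  const native = (AbortSignal as unknown as {
    any?: (signals: AbortSignal[]) => AbortSignal;
  }).any;
  if (typeof native === "function") return native.call(AbortSignal, signals);

  const controller = new AbortController();
  for (const signal of signals) {
    if (signal.aborted) {
      // One input is already aborted: the combined signal starts aborted.
      controller.abort(signal.reason);
      break;
    }
    signal.addEventListener("abort", () => controller.abort(signal.reason), {
      once: true,
    });
  }
  return controller.signal;
}
```

The call site would pass the caller's signal together with the `AbortSignal.timeout(...)` signal, as the existing code presumably does with `AbortSignal.any()`.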

18. MCP connectAll doesn't surface partial failures clearly

File: packages/appkit/src/connectors/mcp/client.ts:103-116

Failed connections are logged but not thrown. Callers can't distinguish "all connected" from "partially connected".
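
One way to surface partial failures without throwing is to return a summary. The sketch below is illustrative only: `connectAllWithSummary`, `ConnectSummary`, and the per-server `connect` callbacks are hypothetical and do not mirror the real `connectAll` signature.

```typescript
interface ConnectSummary {
  connected: string[];
  failed: { name: string; reason: string }[];
}

// Attempt every connection, never reject, and report which ones failed.
async function connectAllWithSummary(
  servers: Record<string, () => Promise<void>>,
): Promise<ConnectSummary> {
  const outcomes = await Promise.all(
    Object.entries(servers).map(async ([name, connect]) => {
      try {
        await connect();
        return { name, reason: null as string | null };
      } catch (err) {
        return { name, reason: err instanceof Error ? err.message : String(err) };
      }
    }),
  );

  const summary: ConnectSummary = { connected: [], failed: [] };
  for (const { name, reason } of outcomes) {
    if (reason === null) summary.connected.push(name);
    else summary.failed.push({ name, reason });
  }
  return summary;
}
```

Callers can then treat `summary.failed.length > 0` as "partially connected" and decide for themselves whether that is fatal.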

19. tool() description silently falls back to name

File: tool.ts:40 — description: config.description ?? config.name

If a developer omits description, the LLM sees the tool name as the description (e.g., "get_weather"). Consider logging a warning or making description required.

20. Playground server is too dense for a reference

File: apps/dev-playground/server/index.ts — 750 lines

Mixes agent setup, file policy harness, saved-views CRUD, and OBO/SP smoke tests. Consider extracting non-agent concerns into separate files.

21. No shared SSE/stream hook in appkit-ui

The playground has custom hooks (use-agent-stream.ts, use-action-dispatcher.ts) that demonstrate SSE consumption, but these are local — not reusable. The template reimplements the same SSE parsing inline, creating two divergent implementations.

Not a bug (reviewer false positives)

Budget consumed before approval: Intentional security design ("Counted pre-dispatch so a prompt-injected agent cannot drain the budget silently via denied calls")

EventChannel push/close race: Node.js is single-threaded; these can't execute concurrently

Stream cleanup ordering: activeStreams.delete() in driver's finally runs before the generator drains, but approval gate entries are already resolved. The 404 to late approvals is correct behavior

OTel context leak in asUser() dev proxy: Proxy objects and closures are lightweight; GC handles them fine

pkosiec · 2026-05-07T12:08:09Z

Actually, as a second thought - the "Smart dashboard" example is kind of reimplementing Metric Views? Correct me if I'm wrong 🤔

pkosiec · 2026-05-07T12:58:32Z

I missed that part on previous PRs, but now I'm testing the flow and I have a small suggestion: can we make the description more high-level? I don't think we need to mention the fromModelServing stuff

Maybe we should also rephrase that the agent will require DATABRICKS_SERVING_ENDPOINT_NAME if the endpoint is not configured - just to improve the language stating that the env is required.

And why Model Serving (agents)? maybe it should be named Model Serving Endpoint for LLM used by Agent or sth shorter?

pkosiec · 2026-05-07T13:11:02Z

I initialized an app from the template and got

Error: Failed to register agent 'helper' (code): Agent 'helper' tool 'current_time' has an unrecognized shape
    at AgentsPlugin.loadAgents (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:125:10)
    at async AgentsPlugin.setup (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:86:3)
    at async Promise.all (index 0)
    at async Function._createApp (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/src/core/appkit.ts:225:5) {
  [cause]: Error: Agent 'helper' tool 'current_time' has an unrecognized shape
      at AgentsPlugin.buildToolIndex (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:259:10)
      at AgentsPlugin.buildRegisteredAgent (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:167:32)
      at async AgentsPlugin.loadAgents (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:121:23)
      at async AgentsPlugin.setup (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:86:3)
      at async Promise.all (index 0)
      at async Function._createApp (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/src/core/appkit.ts:225:5)
}

Here's what my agent thinks:

Root cause: The template's tool() calls omit name (e.g. tool({ description, schema, execute })). The tool() factory sets name: config.name → undefined. Then isFunctionTool() checks typeof obj.name === "string" → false, so it falls through all shape checks and throws "unrecognized shape". The irony: buildToolIndex already overrides the name with the object key at line 407: def: { ...functionToolToDefinition(tool), name: key }. So name in tool() is redundant — but the type guard rejects it before that line is reached. Fix: Make name optional in ToolConfig and relax isFunctionTool to not require it, since buildToolIndex always assigns it from the key.
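
The proposed relaxation can be sketched as follows. The types and functions here (`FunctionToolLike`, `isFunctionToolLike`, `indexToolsByKey`) are simplified, hypothetical stand-ins for the real `ToolConfig`, `isFunctionTool`, and `buildToolIndex`, which are not shown in this thread.

```typescript
// Simplified stand-in for a function tool; `name` is optional on purpose.
interface FunctionToolLike {
  name?: string;
  description?: string;
  execute: (args: unknown) => unknown;
}

// Accept a missing name, reject a name of the wrong type.
function isFunctionToolLike(value: unknown): value is FunctionToolLike {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  const nameOk = obj.name === undefined || typeof obj.name === "string";
  return nameOk && typeof obj.execute === "function";
}

// The registry key always becomes the tool name, so a name set
// inside the tool config is redundant.
function indexToolsByKey(
  tools: Record<string, unknown>,
): Map<string, FunctionToolLike & { name: string }> {
  const index = new Map<string, FunctionToolLike & { name: string }>();
  for (const [key, tool] of Object.entries(tools)) {
    if (!isFunctionToolLike(tool)) {
      throw new Error(`tool '${key}' has an unrecognized shape`);
    }
    index.set(key, { ...tool, name: key });
  }
  return index;
}

const index = indexToolsByKey({
  current_time: { execute: () => ({ now: new Date().toISOString() }) },
});
console.log(index.get("current_time")?.name); // current_time
```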

pkosiec · 2026-05-07T13:16:19Z

+ * Replace this with `<GenieChat>`-style components when AppKit ships a
+ * first-class agent chat primitive in `@databricks/appkit-ui/react`.
+ */
+export function AgentChat() {


Does it make sense to have an assistant tab? maybe we can just keep the helper one, as it's the only one that comes with tools?
Or, let me ask a different related question: what's the purpose of the "assistant" agent in the template?

pkosiec · 2026-05-08T09:26:14Z

Let's note that only streaming model serving endpoints are supported. On the list during the apps init we list all of the resources.

Here's the comparison of the endpoint usage in context of the Model Serving plugin:

Template (non-streaming) vs Dev-Playground (streaming)

Hook / Client approach

| Aspect | Template | Dev-Playground |
| --- | --- | --- |
| Hook | useServingInvoke | useServingStream |
| Backend endpoint | POST /api/serving/invoke | POST /api/serving/stream |
| Transport | Regular HTTP POST → JSON response | HTTP POST → SSE (Server-Sent Events) |
| State | { invoke, data, loading, error } | { stream, chunks, streaming, error, reset } |

Backend routes (both registered by ServingPlugin)

/invoke → calls servingConnector.invoke() → uses the SDK's high-level client.servingEndpoints.query() which returns a single QueryEndpointResponse JSON object.

/stream → calls servingConnector.stream() → uses the SDK's low-level client.apiClient.request({ raw: true }) hitting /serving-endpoints/{name}/invocations with Accept: text/event-stream and { stream: true } in the body. Returns raw SSE bytes piped directly to the client.

Response parsing

Template extracts choices[0].message.content — a complete response in one shot.

Dev-Playground extracts choices[0].delta.content from each SSE chunk and concatenates them in real-time, committing the full message when streaming ends.
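
The streaming side of this difference can be illustrated with a small reducer. This is a sketch assuming the OpenAI-style chunk shape implied by `choices[0].delta.content`; `ChatChunk` and `appendDelta` are hypothetical names, not part of the playground code.

```typescript
// Assumed chunk shape for a streamed chat-completions response.
interface ChatChunk {
  choices?: { delta?: { content?: string } }[];
}

// Append the chunk's delta text, treating missing fields as empty.
function appendDelta(message: string, chunk: ChatChunk): string {
  return message + (chunk.choices?.[0]?.delta?.content ?? "");
}

const chunks: ChatChunk[] = [
  { choices: [{ delta: { content: "Hel" } }] },
  { choices: [{ delta: {} }] }, // e.g. a chunk without content
  { choices: [{ delta: { content: "lo" } }] },
];

const fullMessage = chunks.reduce(appendDelta, "");
console.log(fullMessage); // Hello
```

The non-streaming path has no equivalent step: it reads `choices[0].message.content` once from the single JSON response.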

What "doesn't support streaming" means

A model serving endpoint "not supporting streaming" means:

The underlying model doesn't implement the SSE/chunked response protocol. When you send { stream: true } in the request body, the endpoint either ignores it and returns a full JSON response, or returns an error.

Practically: The /serving-endpoints/{name}/invocations endpoint with Accept: text/event-stream won't produce an incremental byte stream — the response.contents ReadableStream will be null or the call will fail (see client.ts:103: throw new Error("Response body is null — streaming not supported")).

Which endpoints support it: Foundation Model APIs (e.g. DBRX, Llama, external models like GPT) generally support streaming. Custom model endpoints (e.g. sklearn, MLflow pyfunc models) typically do not — they return a single JSON response.

So the template defaults to useServingInvoke (non-streaming) as the safe, universally compatible choice, with the streaming hook commented out as an opt-in. The dev-playground uses useServingStream because it's a showcase/reference app assumed to point at a chat-capable LLM endpoint.

…template

Final layer of the agents feature stack. Everything needed to exercise, demonstrate, and learn the feature.

`apps/agent-app/` — a standalone app purpose-built around the agents feature. Ships with:

- `server.ts` — full example of code-defined agents via `fromPlugin`:

  ```ts
  const support = createAgent({
    instructions: "…",
    tools: {
      ...fromPlugin(analytics),
      ...fromPlugin(files),
      get_weather,
      "mcp.vector-search": mcpServer("vector-search", "https://…"),
    },
  });

  await createApp({
    plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
  });
  ```

- `config/agents/assistant.md` — markdown-driven agent alongside the code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and deploy scripts.

`apps/dev-playground/client/src/routes/agent.route.tsx` — chat UI with inline autocomplete (hits the `autocomplete` markdown agent) and a full threaded conversation panel (hits the default agent).

`apps/dev-playground/server/index.ts` — adds a code-defined `helper` agent using `fromPlugin(analytics)` alongside the markdown-driven `autocomplete` agent in `config/agents/`. Exercises the mixed-style setup (markdown + code) against the same plugin list.

`apps/dev-playground/config/agents/*.md` — both agents defined with valid YAML frontmatter.

`docs/docs/plugins/agents.md` — progressive five-level guide:

1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).

Plus a configuration reference, runtime API reference, and frontmatter schema table.

`docs/docs/api/appkit/` — regenerated typedoc for the new public surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig, ToolkitEntry, ToolkitOptions, all adapter types, and the agents plugin factory).

`template/appkit.plugins.json` — adds the `agent` plugin entry so `npx @databricks/appkit init --features agent` scaffolds the plugin correctly.

- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint clean

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Documents the new `mcp` configuration block and the rules it enforces: same-origin-only by default, explicit `trustedHosts` for external MCP servers, plaintext `http://` refused outside localhost-in-dev, and DNS-level blocking of private / link-local IP ranges (covers cloud metadata services). See PR #302 for the policy implementation and PR #304 for the `AgentsPluginConfig.mcp` wiring.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection covering `analytics.query` readOnly enforcement, `lakebase.query` opt-in via `exposeAsAgentTool`, and the approval flow. New "Human-in-the-loop approval for destructive tools" subsection documents the config, SSE event shape, and `POST /chat/approve` contract.
- `apps/agent-app`: approval-card component rendered inline in the chat stream whenever an `appkit.approval_pending` event arrives. Destructive badge + Approve/Deny buttons POST to `/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent route, using the existing appkit-ui `Button` component and Tailwind utility classes.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Updates `docs/docs/plugins/agents.md` to document the new two-key auto-inherit model introduced in PR #302 (per-tool `autoInheritable` flag) and PR #304 (safe-by-default `autoInheritTools: { file: false, code: false }`).

Adds an "Auto-inherit posture" subsection explaining that the developer must opt into `autoInheritTools` AND the plugin author must mark each tool `autoInheritable: true` for a tool to spread without explicit wiring. Includes a table documenting the `autoInheritable` marking on each core plugin tool, plus an example of the setup-time audit log so operators can see exactly what's inherited vs. skipped.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- **Reference app no longer ships hardcoded dogfood URLs.** The three `https://e2-dogfood.staging.cloud.databricks.com/...` and `https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP URLs in `apps/agent-app/server.ts` are replaced with optional env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When set, their hostnames are auto-added to `agents({ mcp: { trustedHosts } })`. `.env.example` uses placeholder values the reader can replace instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The prior `appkit.agent as { list, getDefault }` cast papered over the plugin-name mismatch fixed in PR #304. The runtime key now matches the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the defaults flipped to `{ file: false, code: false }` (PR #304, S-3), the reference now explicitly enables `autoInheritTools: { file: true }` so the markdown agents that ship alongside the code-defined one still pick up the analytics / files read-only tools. This is the pattern a real deployment should follow — opt in deliberately.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

- `apps/dev-playground/config/agents/autocomplete.md` sets `ephemeral: true`. Each debounced autocomplete keystroke no longer leaves an orphan thread in `InMemoryThreadStore` — the server now deletes the thread in the stream's `finally` (PR #304). Closes R1 from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral` frontmatter key alongside the other AgentDefinition knobs.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Documents the MVP resource caps landed in PR #304: the static request-body caps (enforced by the Zod schemas) and the three configurable runtime limits (`maxConcurrentStreamsPerUser`, `maxToolCalls`, `maxSubAgentDepth`). Includes the config-block shape in the main reference and a new "Resource limits" subsection under the Configuration section explaining the intent and per-user semantics of each cap.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

The agents plugin's manifest `name` is `agents` (plural), so routes mount at `/api/agents/*` and its client config is keyed as `agents` — but three call sites still referenced the old singular `agent`:

- apps/agent-app/src/App.tsx: /api/agent/{info,chat,approve} returned an Express 404 HTML page, which the client then tried to JSON.parse, producing "Unexpected token '<', <!DOCTYPE ...". Swap to /api/agents/*.
- apps/dev-playground/client/src/routes/agent.route.tsx: same three paths, plus getPluginClientConfig("agent") returned {} so hasAutocomplete was false and the autocomplete hook short-circuited before ever firing a request. Swap the lookup key to "agents".
- template/appkit.plugins.json: the scaffolded plugin descriptor still used the singular name/key, which would have broken fresh apps the same way. Align with the plugin's real manifest name.

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Move reference apps to config/agents/<id>/agent.md; document migration and reserved skills folder; align generated API snippets and CHANGELOG.

typedoc picked up JSDoc changes from agent/v2/4-agents-plugin:

- New public export `agentIdFromMarkdownPath` (helper for path-based id resolution used by `loadAgentFromFile`).
- `loadAgentsFromDir` description/body now reflects the folder layout (`<id>/agent.md`, orphan `*.md` rejected, reserved `skills/` dir).

Generated by docusaurus-plugin-typedoc during pnpm --filter=docs build.

… retire agent-app

Stage 0 of the smart-dashboard-demo plan. Ports the prototype Smart Dashboard (NYC Taxi analytics) from the p3ju worktree into dev-playground as a new route, migrates its markdown agents to the folder layout, and deletes apps/agent-app — which is superseded by this demo as the integration test of the entire v2 agents stack.

Client:
- New route at client/src/routes/smart-dashboard.route.tsx with its own subdirectory for components/ and hooks/.
- Ported 8 components (ActiveFilters, AgentSidebar, AnomalyCard, FareChart, InsightCard, KPICards, QuerySection, TripChart) and 4 hooks (useActionDispatcher, useAgentStream, useChartColors, useDashboardData) as-is. Relative imports preserved.
- Nav link added in __root.tsx.
- TanStack routeTree.gen.ts auto-regenerated.

Server:
- Ports apply_filter and highlight_period inline tools.
- Adds sql_analyst (code-defined: fromPlugin(analytics)) and dashboard_pilot (code-defined: apply_filter + highlight_period) per the plan's Q2 = option B decision.
- Adds query markdown dispatcher in config/agents/query/agent.md delegating to both specialists via the agents: frontmatter.
- Ports insights and anomaly ephemeral markdown agents.

Config:
- Ports 4 SQL queries into config/queries/dashboard_*.sql.
- Note: shared/appkit-types/analytics.d.ts not regenerated in this commit; useAnalyticsQuery("dashboard_*", ...) uses explicit as casts and works at runtime. Regenerate with 'npx @databricks/appkit generate-types' locally when convenient.

Cleanup:
- apps/agent-app/ removed in full. No references outside pnpm-lock.yaml (regenerated).
- plans/smart-dashboard-demo.md added with the full staged plan.

Verification:
- pnpm --filter=dev-playground client typecheck: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main (files plugin union type, telemetry CacheManager, playwright DOM lib) — no new regressions.

Next stages (1-6, per the plan): dispatcher integration verified, save_view + approval card, dashboard-context injection + focus_chart, Stream Inspector, polish, demo script.

Stage 0 ported the dashboard shell verbatim from the prototype; this commit layers every v2-stack feature on top, moves the feature dir out of routes/ (TanStack was flagging files as stray routes), rewrites the agent -> UI action pipeline for correctness, and adds discoverability for the HITL flow.

Server (apps/dev-playground/server/index.ts)
- Split the polymorphic apply_filter into four narrower tools: filter_by_date_range, filter_by_pickup_zip, filter_by_fare, clear_filters. Each has exactly one client-side effect; removes the whole class of 'agent said it worked but nothing moved' bugs.
- Add clear_highlights, focus_chart, save_view (destructive; triggers the approval gate).
- dashboard_pilot instructions rewritten with a compact verb-per-line reference so the LLM picks the right single tool for each intent.

Client - moved out of routes/
- Feature code relocates to client/src/features/smart-dashboard/ (components/, hooks/, lib/). TanStack Router was warning that every non-route file under routes/ 'does not contain any route piece.'
- smart-dashboard.route.tsx uses @/features/ aliases; the route file is now the only thing under routes/.

Client - correctness fixes in the action dispatcher
- Act only on response.output_item.done (never .added, which fires with partial arguments and caused double-applied highlights plus silent JSON-parse races).
- Dedupe by call_id with a bounded LRU; reset on appkit.metadata (new-run signal).
- Use updater callbacks (onFilterUpdate(prev => ...)) instead of a currentFilters prop to eliminate stale-closure bugs when the agent fires multiple tool calls in one render cycle.
- Validate arg shapes per tool; anything malformed or unrecognized surfaces through onUnknownTool (route renders as a red banner + console.warn). Silent failure was the worst failure mode.
- Emit a human-readable summary for every applied action (onAction).

Client - discoverability / HITL
- New QuickActionsBar with Save view... (inline name input), Clear filters, Clear highlights. Each dispatches through the chat pipeline so the agent still reasons and the approval gate still fires for destructive actions - the bar just saves typing.
- ActionToast (bottom-left) confirms every dispatcher-applied action for ~3s. Answers 'did it work?' without opening the inspector.
- QuerySection refactored into a view: content/isLoading/onSend come from the route. Lifting useAgentStream one level up lets the Quick Actions bar and the chat input share a single agent stream.
- QuerySection example queries refreshed to cover the new tools.

Client - stream-inspector wiring
- SSEEvent extended with approval_pending payload fields.
- use-stream-inspector threaded through so every run's events flow into the inspector's module-level store.
- FocusableChart renamed its 'id' prop to 'chartId' (logical registry key, not a DOM id - biome was right to complain).

Verification
- pnpm --filter=dev-playground client tsc --noEmit: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main; no new regressions.
- apps/dev-playground/shared/appkit-types/analytics.d.ts regenerated by vite build to register the four dashboard_* queries; kept in the commit so CI and downstream consumers have typed useAnalyticsQuery access out of the box.
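The dispatcher rules described in this commit (act only on `response.output_item.done`, dedupe by `call_id` with a bounded LRU, reset on `appkit.metadata`) can be sketched roughly as follows. This is an illustration only, not the playground's code: the event shapes, `BoundedSeenSet`, `createDispatcher`, and the `onRejected` callback are hypothetical names introduced for the example.

```typescript
// Hypothetical event shapes: only the fields this sketch reads are modeled.
type ToolCallItem = { call_id: string; name: string; arguments: string };
type StreamEvent =
  | { type: "response.output_item.added"; item: ToolCallItem }
  | { type: "response.output_item.done"; item: ToolCallItem }
  | { type: "appkit.metadata" };

/** Remembers at most `capacity` keys, evicting the least recently seen one. */
class BoundedSeenSet {
  private readonly seen = new Map<string, true>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  /** Returns true the first time a key is offered, false for repeats. */
  addIfNew(key: string): boolean {
    const isRepeat = this.seen.delete(key); // delete + set refreshes recency
    this.seen.set(key, true);
    if (this.seen.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.seen.keys().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return !isRepeat;
  }

  clear(): void {
    this.seen.clear();
  }
}

function createDispatcher(
  apply: (name: string, args: unknown) => void,
  onRejected: (reason: string) => void,
  capacity = 64,
): (event: StreamEvent) => void {
  const seen = new BoundedSeenSet(capacity);
  return (event) => {
    if (event.type === "appkit.metadata") {
      seen.clear(); // new run: call_ids from the previous run no longer matter
      return;
    }
    if (event.type !== "response.output_item.done") return; // ignore partial .added events
    if (!seen.addIfNew(event.item.call_id)) return; // this call was already applied
    let args: unknown;
    try {
      args = JSON.parse(event.item.arguments);
    } catch {
      onRejected(`malformed arguments for ${event.item.name}`);
      return;
    }
    apply(event.item.name, args);
  };
}
```

Keeping the dedupe state inside the dispatcher closure, rather than in component state, means a repeated `done` event cannot slip through between renders.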

…iews panel + floating chat Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Fresh UC volumes don't have a saved-views/ subdirectory until the first save; the SDK throws FILES_API_DIRECTORY_IS_NOT_FOUND on list. The route was propagating that as a 500 which rendered as a red error banner in the SavedViewsPanel on first load. Catch the error explicitly, return { views: [] }, let the panel render its 'no saved views yet' empty state cleanly. Uploads still work the first time because the SDK auto-creates parent dirs on upload.
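A minimal sketch of the empty-state handling described above. The `listDirectory` callback stands in for the real SDK call, and how the SDK exposes the error code (an `errorCode` property or the message text) is an assumption; `isDirectoryNotFound` and `listSavedViews` are hypothetical names.

```typescript
type SavedView = { name: string };

// Assumption: the code is visible on an `errorCode` property or in the message text.
function isDirectoryNotFound(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { errorCode?: unknown; message?: unknown };
  return (
    e.errorCode === "FILES_API_DIRECTORY_IS_NOT_FOUND" ||
    (typeof e.message === "string" &&
      e.message.includes("FILES_API_DIRECTORY_IS_NOT_FOUND"))
  );
}

// `listDirectory` is a placeholder for whatever lists the saved-views/ folder.
async function listSavedViews(
  listDirectory: () => Promise<SavedView[]>,
): Promise<{ views: SavedView[] }> {
  try {
    return { views: await listDirectory() };
  } catch (err) {
    // A missing directory just means nothing has been saved yet.
    if (isDirectoryNotFound(err)) return { views: [] };
    throw err; // every other failure should still surface as an error
  }
}
```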

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

html2canvas 1.x throws on `oklch()` color values, which Tailwind v4 emits everywhere in computed styles. Swap to the maintained html2canvas-pro fork (drop-in API) so dashboard captures render without "Attempting to parse an unsupported color function 'oklch'" errors in the approval card. Keeps html2canvas pinned so types still resolve.

Databricks SDK `volume.download(path)` returns a wrapper `{ contents: ReadableStream, "content-type": string }`, not the stream itself. The previous handler tried to write the wrapper directly, which produced an empty body and broke thumbnails in the saved-views panel. Now we read `.contents`, drain the stream, and respond with the server-reported content-type (falling back to `image/png`). Also drops a couple of noisy console.logs left over from the debugging session.
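The fix amounts to unwrapping `.contents` and draining it before responding. A rough sketch using the standard `ReadableStream` reader API; the `DownloadResult` type mirrors the wrapper shape described above (with the content type made optional to show the fallback), and `readDownload` is a hypothetical helper name.

```typescript
// Shape described in the commit message; content-type is optional here
// only so the fallback branch can be exercised.
type DownloadResult = {
  contents: ReadableStream<Uint8Array>;
  "content-type"?: string;
};

async function readDownload(
  result: DownloadResult,
): Promise<{ body: Uint8Array; contentType: string }> {
  const reader = result.contents.getReader();
  const chunks: Uint8Array[] = [];
  let total = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value) {
      chunks.push(value);
      total += value.byteLength;
    }
  }
  // Concatenate all chunks into one contiguous buffer.
  const body = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    body.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return { body, contentType: result["content-type"] || "image/png" };
}
```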

… click Clicking a saved-view thumbnail was sending a chat prompt like "Load the saved view 'january'" and letting the agent reconstruct filters from the view name. That dropped the highlights (agent had no tool to fetch the stored metadata) so January-with-focus-on-week-1 came back as just January-wide. Since the client already holds the full authoritative metadata for the clicked thumbnail, bypass the agent and apply `meta.filters` and `meta.highlights` directly to local state, with a toast summarising what was restored. Also hardens the `appkit.approval_pending` handler: it now accepts both snake_case and camelCase fields and validates that approval_id/tool_name/stream_id are non-empty strings before enqueuing, so a malformed event can't push a broken approval card.
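The hardened `appkit.approval_pending` handling could look something like the sketch below. The snake_case names come from the description above; the camelCase counterparts (`approvalId`, `toolName`, `streamId`) and the `parseApprovalPending` helper are assumptions made for the example.

```typescript
type PendingApproval = { approvalId: string; toolName: string; streamId: string };

/** Returns a normalized approval, or null if any required field is missing or empty. */
function parseApprovalPending(raw: unknown): PendingApproval | null {
  if (typeof raw !== "object" || raw === null) return null;
  const record = raw as Record<string, unknown>;

  // Accept either spelling, but only non-empty strings.
  const pick = (snake: string, camel: string): string | null => {
    const value = record[snake] ?? record[camel];
    return typeof value === "string" && value.length > 0 ? value : null;
  };

  const approvalId = pick("approval_id", "approvalId");
  const toolName = pick("tool_name", "toolName");
  const streamId = pick("stream_id", "streamId");
  if (approvalId === null || toolName === null || streamId === null) return null;
  return { approvalId, toolName, streamId };
}
```

A caller would enqueue an approval card only when the result is non-null, so a malformed event is dropped instead of rendering a broken card.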

Picks up the new `annotations?: ToolAnnotations` field on `ToolConfig` and `FunctionTool` introduced upstream in the annotations-propagation fix.

…nable agent feed

Reshapes the Smart Dashboard demo from a sparse 2-chart layout into a 2x2 chart grid with a right-rail agent feed, and turns the previously read-only insights/anomaly cards into clickable actions that drive the dashboard directly.

New visualisations:
- HourlyHeatmap: day-of-week × hour-of-day grid, click a cell to ask the agent to investigate that slot.
- TopZonesChart: hand-rolled horizontal bar leaderboard with click-to-filter and a `highlight_zone` ring driven by the agent.
- KPI sparklines: inline 7-day micro-charts with windowed trend deltas baked into each KPI card.

Agent feed becomes interactive:
- `feed-actions.ts` defines a structured action schema (filter_date, filter_zip, filter_fare, highlight_period, highlight_zone, focus_chart, ask) and a parser. The `insights` and `anomaly` ephemeral agents now emit JSON matching that schema.
- `ActionableCard` renders insights/anomalies with action chips that invoke `useActionDispatcher.dispatch` directly: the same code path the SSE function-call handler uses, so UI clicks and agent tool calls behave identically.
- The feed re-runs (debounced) whenever filters or highlights change.

Server-side wiring:
- Adds `highlight_zone` and `clear_zone_highlights` tools.
- Extends the `focus_chart` enum with `hourly_heatmap` and `top_zones`.
- Updates `dashboard_pilot` instructions to prefer `highlight_zone` over `filter_by_pickup_zip` when calling out a single ZIP.
- Adds three SQL queries: `dashboard_hourly_heatmap`, `dashboard_top_zones`, `dashboard_kpi_sparklines`. The top-zones query casts `pickup_zip` (an INT in samples.nyctaxi.trips) to STRING so the client's highlight Map keys, the agent's `highlight_zone` arg, and the filter parameter all speak the same type.

Polish & defensive fixes:
- Defensive `Number()` coercion in `kpi-cards.tsx` for sparkline values so trend math doesn't render `NaN%` or string-concatenated revenue totals if a driver hands back DECIMAL-as-string.
- `Sparkline` reserves vertical space for intentionally-empty series (e.g. the categorical "Top Pickup Zone" KPI) instead of rendering a loading-style placeholder.
- 2x2 chart grid uses `items-start` + `auto-rows-min content-start` so the rail no longer stretches the chart column and creates dead space.
- `ChatDrawer` becomes a controlled component (`open` + `onOpenChange`) so any agent-triggering UI action can auto-open the chat; the user always sees the agent's response without manual disclosure.
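To illustrate the defensive `Number()` coercion mentioned under "Polish & defensive fixes", here is a small sketch. The helper names (`toFiniteNumber`, `sumValues`, `trendPercent`) and the choice to treat unparseable values as 0 are assumptions for the example, not the code in `kpi-cards.tsx`.

```typescript
/** Coerces driver output (number, DECIMAL-as-string, null, ...) to a finite number. */
function toFiniteNumber(value: unknown): number {
  const n = Number(value);
  return Number.isFinite(n) ? n : 0; // assumption: unparseable values count as 0
}

/** Sums values numerically; without coercion, "12.50" + 7.25 would concatenate. */
function sumValues(values: readonly unknown[]): number {
  return values.reduce<number>((total, v) => total + toFiniteNumber(v), 0);
}

/** Percent change from the first to the last point, or null when undefined. */
function trendPercent(values: readonly unknown[]): number | null {
  if (values.length < 2) return null;
  const first = toFiniteNumber(values[0]);
  const last = toFiniteNumber(values[values.length - 1]);
  if (first === 0) return null; // avoid rendering NaN% or Infinity%
  return ((last - first) / first) * 100;
}
```

Returning `null` instead of `NaN` lets the card decide how to render a missing trend rather than printing `NaN%`.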

The playground header was unscalable: 14 demo links rendered as side-by-side buttons that overflowed on narrow screens, and the home page maintained a parallel hand-curated grid that had already drifted (missing Smart Dashboard, Chart Inference, Vector Search, Policy Matrix, and Serving, ~30% of the catalog).

Introduces `client/src/lib/nav.ts` as the single source of truth: each demo declares its label, one-line description, lucide icon, and category group. Both surfaces now read from the same list, so adding a demo is a one-line change and they can no longer drift.

Header (`__root.tsx`):
- Replaces the button wall with a single "Menu" hamburger dropdown grouping demos by purpose (Data / AI / Platform).
- Active route is highlighted inside the dropdown and shown breadcrumb-style next to the brand, so the user always knows where they are.
- Caps dropdown height at viewport-minus-header with overflow scroll, so adding more demos won't break the layout.

Home page (`index.tsx`):
- Restrained hero with a soft dual-radial gradient wash (~6-8% opacity, primary + accent): depth without saturation.
- Featured card for the Smart Dashboard flagship demo: gradient accent, icon tile, eyebrow badge, animated CTA. The featured demo also appears in its category grid, de-emphasised with a "Featured above" note.
- Three category sections with one-line taglines, rendered as a 1/2/3-col responsive grid of icon + title + description cards. Each card is a real `<Link>` (not a button inside a decorative `<Card>`), so the whole surface is keyboard-accessible.
- Footer shows live demo and category counts driven by the catalog.
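The single-source-of-truth idea can be sketched as one typed list that both the header dropdown and the home page derive their views from. Everything concrete below is hypothetical: the field names, the sample entries, their paths and category assignments, and the use of a plain string for the icon (the real file references lucide icons).

```typescript
type DemoCategory = "Data" | "AI" | "Platform";

type DemoEntry = {
  path: string;
  label: string;
  description: string;
  icon: string; // placeholder for a lucide icon reference
  category: DemoCategory;
};

// Hypothetical sample entries; adding a demo means adding one object here.
const demos: readonly DemoEntry[] = [
  { path: "/smart-dashboard", label: "Smart Dashboard", description: "Agent-driven dashboard", icon: "layout-dashboard", category: "AI" },
  { path: "/vector-search", label: "Vector Search", description: "Similarity search demo", icon: "search", category: "AI" },
  { path: "/analytics", label: "Analytics", description: "Typed SQL queries", icon: "bar-chart", category: "Data" },
];

/** Groups entries by category, preserving catalog order within each group. */
function groupByCategory(entries: readonly DemoEntry[]): Map<DemoCategory, DemoEntry[]> {
  const groups = new Map<DemoCategory, DemoEntry[]>();
  for (const entry of entries) {
    const group = groups.get(entry.category) ?? [];
    group.push(entry);
    groups.set(entry.category, group);
  }
  return groups;
}

/** Counts a footer could display, derived from the same list. */
function catalogCounts(entries: readonly DemoEntry[]): { demos: number; categories: number } {
  return { demos: entries.length, categories: groupByCategory(entries).size };
}
```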

…tive Retag save_view as effect: "write" (it creates a PNG; it doesn't delete anything) and teach the approval card to render three distinct tiers. Capturing a screenshot no longer masquerades as deletion: writes get a calm blue card with a plus-circle icon, updates get a warning-amber card with a pencil, and real destructive actions retain the red shield-alert. Legacy destructive: true still maps to the red tier, so tools that haven't migrated keep their current look.
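One way to picture the three-tier mapping with the legacy fallback is sketched below. Only `effect: "write"` and `destructive: true` appear in the text above; the `"update"` and `"destructive"` effect values, the `ApprovalTier` shape, and the decision to return `null` for tools with neither field are assumptions for illustration.

```typescript
// Assumption: "write" is taken from the commit text; the other two names are guessed.
type ToolEffect = "write" | "update" | "destructive";

type ApprovalTier = {
  tone: "blue" | "amber" | "red";
  icon: "plus-circle" | "pencil" | "shield-alert";
};

const TIERS: Record<ToolEffect, ApprovalTier> = {
  write: { tone: "blue", icon: "plus-circle" },
  update: { tone: "amber", icon: "pencil" },
  destructive: { tone: "red", icon: "shield-alert" },
};

/** Picks the card tier; the legacy `destructive: true` flag still maps to red. */
function approvalTier(tool: { effect?: ToolEffect; destructive?: boolean }): ApprovalTier | null {
  if (tool.effect) return TIERS[tool.effect];
  if (tool.destructive === true) return TIERS.destructive;
  return null; // assumption: no annotation means no tiered card
}
```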

Tailwind v4 compiles `bg-blue-50/50` to a two-layer rule: an sRGB hex fallback plus an `@supports (color-mix)` override that mixes the oklch palette token with transparent in oklab. Browsers with color-mix support (recent Chrome/Arc) take the oklab path; older embedded Chromiums (e.g. Cursor's built-in browser) fall through to the sRGB hex. Those two paths produce visibly different tints against the dark `--card` token, which is why the agent-feed cards rendered inconsistently across Chrome, Arc, and Cursor's browser. Pin the four insight/anomaly-tier backgrounds to arbitrary 8-digit hex (`bg-[#eff6ff80]` etc.) so every browser lands on the same sRGB path. Values taken from Tailwind's own fallback output to preserve the intended look on color-mix-capable browsers.

appkit-ui's globals.css already defines dark-theme tokens via two paths — an explicit `.dark` class on <html>, and `@media (prefers-color-scheme: dark)` guarded by `:root:not(.light)` so an explicit `.light` class wins. Tailwind v4's default `dark:` variant, however, is purely media driven. That mismatch shows up when the user forces light via the playground's theme selector while their OS is in dark mode: the bootstrap script sets `<html class="light">`, --card/--background correctly resolve to light, but every `dark:*` utility keeps firing under the media query — cards end up painted with dark-mode backgrounds layered under light-mode chrome. Declare a playground-local `@custom-variant dark` that mirrors the token logic exactly: fire when the element is (or descends from) `.dark`, or when `prefers-color-scheme: dark` matches and no `.light` ancestor is present. This rebinds every `dark:*` utility to respect the theme selector's forced choice, keeping the rest of appkit-ui's consumers — which don't ship the bootstrap script — on the existing media-only behaviour.

The streaming-message bubble in the smart-dashboard chat drawer used `animate-pulse` while tokens arrived. The constant fade in/out reads as visual noise when the agent is mid-stream — especially with longer replies where it pulses for many seconds. Drop the animation; the ellipsis placeholder still communicates the loading state for empty streaming bubbles.

`server({ autoStart: false }).then(appkit => appkit.server.extend(...).start())` is gone — `createApp` now orchestrates server start itself, with the post-setup hook surfaced as the `onPluginsReady` config callback. Drop `autoStart: false`, hoist the `extend` block from the trailing `.then` chain into `onPluginsReady`, and replace the dangling promise with `.catch(console.error)` so unhandled rejections still surface. Tracks #280 / #291 (autoStart removal + on-plugins-ready codemod).

Selecting `agents` in `databricks apps init` previously produced an app that booted, logged "No agents registered.", and rendered no UI for the plugin. Fixes that by scaffolding two starter agents (one markdown, one code-defined) and a chat surface, gated on `{{if .plugins.agents}}`.

Added:
- template/config/agents/assistant/agent.md: markdown agent, default, no tools. Demonstrates the declarative form.
- template/server/agents/helper.ts: code-defined agent via createAgent({...}) with two inline tool({...}) definitions: current_time (returns ISO timestamp) and count_words. Tools are pure JS so the demo works regardless of which other plugins were selected at scaffold time.
- template/client/src/pages/agents/AgentChat.tsx: minimal SSE consumer for /api/agents/chat with an agent picker, streaming text bubbles, and inline tool-call rows. Hand-rolled because @databricks/appkit-ui doesn't yet ship a generic agent chat primitive; replace with one when it lands.

Modified:
- template/server/server.ts: when {{if .plugins.agents}}, imports the helper agent and wires it as agents({ agents: { helper } }) instead of bare agents(). The markdown 'assistant' loads automatically from config/agents/.
- template/client/src/App.tsx: conditional NavLink + route entry, mirroring the analytics/files/etc. blocks.

End-to-end shape after init with --features agents:
- GET /api/agents/info returns { agents: ['assistant', 'helper'], defaultAgent: 'assistant' }
- /agents page renders chat with picker
- 'what time is it?' to helper triggers a current_time tool round-trip
- 'count words in: the quick brown fox' triggers count_words → 4

The serving-endpoint resource (DATABRICKS_SERVING_ENDPOINT_NAME) is already declared in template/appkit.plugins.json from PR 4, so the CLI prompts for an endpoint when agents is selected.
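The two starter tools are described as pure JS, so their logic can be sketched without touching the agent API at all. The functions below are plain helpers with hypothetical names (`currentTime`, `countWords`); how they are wrapped in `tool({...})` is not shown because that signature is not spelled out here.

```typescript
/** Returns the current time as an ISO 8601 timestamp. */
function currentTime(now: Date = new Date()): string {
  return now.toISOString();
}

/** Counts whitespace-separated words; an empty or blank string has zero words. */
function countWords(text: string): number {
  return text.trim().split(/\s+/).filter((word) => word.length > 0).length;
}

console.log(countWords("the quick brown fox")); // 4
```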

agents, createAgent, fromPlugin, tool and all agent-related exports are now under the beta subpath. Update the dev-playground server and the template helper to import from @databricks/appkit/beta.

- Document agents as beta in docs and set stability in app template manifest
- Point Docusaurus Typedoc at typedoc.entry.ts so stable + beta APIs publish together (fixes agent symbol pages being dropped from index-only builds)
- Regenerate api/appkit index and sidebar; knip-ignore docs-only entry file

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Typedoc reference grew when the unified entry started exposing tool authoring primitives (defineTool, AppKitMcpClient, DatabricksAdapter, parseTextToolCalls, ToolEntry, ToolRegistry, etc.) that beta.ts now re-exports. Regenerating brings docs/docs/api/ back in sync so the docs:build CI gate passes. pnpm-lock.yaml gains the get-port@7.2.0 entry that was added to @databricks/appkit on main and merged into v4 during the stack rebase. Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

MarioCadenas force-pushed the agent/v2/5-fromplugin-runagent branch from 162e970 to 29e3534 on April 21, 2026, 20:41

MarioCadenas force-pushed the agent/v2/6-apps-docs branch from 57cc1e4 to 4a441d2 on April 21, 2026, 20:41

MarioCadenas mentioned this pull request Apr 22, 2026

feat(appkit): zero-trust MCP host policy with URL allowlist and scoped auth #307

Closed

7 tasks

MarioCadenas force-pushed the agent/v2/5-fromplugin-runagent branch from 29e3534 to b462716 on April 22, 2026, 08:45

MarioCadenas force-pushed the agent/v2/6-apps-docs branch 2 times, most recently from d16cdd5 to e81d8bb on April 22, 2026, 09:24

MarioCadenas force-pushed the agent/v2/5-fromplugin-runagent branch 2 times, most recently from 539487e to dac73b5 on April 22, 2026, 09:46

MarioCadenas force-pushed the agent/v2/6-apps-docs branch from e81d8bb to 829ad13 on April 22, 2026, 09:46

MarioCadenas force-pushed the agent/v2/5-fromplugin-runagent branch from dac73b5 to 624f2a0 on April 22, 2026, 09:59

MarioCadenas force-pushed the agent/v2/6-apps-docs branch 2 times, most recently from 3386200 to 263f587 on April 22, 2026, 10:21

MarioCadenas force-pushed the agent/v2/5-fromplugin-runagent branch from 8b0c28e to 22393bb on May 4, 2026, 12:59

pkosiec reviewed May 7, 2026

pkosiec reviewed May 8, 2026

MarioCadenas added 26 commits May 8, 2026 11:59

docs(agents): folder layout on disk, migrate samples, sync API refs

fcf76ab

Move reference apps to config/agents/<id>/agent.md; document migration and reserved skills folder; align generated API snippets and CHANGELOG.

feat(appkit): sub-agent approval gate + save view to volume + saved v…

cbc57a1

…iews panel + floating chat Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

fix(appkit): forward all sub-agent events except metadata

3dcf369

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

docs(appkit): regenerate typedoc for tool annotations

03e316d

Picks up the new `annotations?: ToolAnnotations` field on `ToolConfig` and `FunctionTool` introduced upstream in the annotations-propagation fix.

fix(playground, template): import agents from @databricks/appkit/beta

27bba72

agents, createAgent, fromPlugin, tool and all agent-related exports are now under the beta subpath. Update the dev-playground server and the template helper to import from @databricks/appkit/beta.

chore: remove plans scratch docs from agents stack branch

c2ac56c

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

chore(appkit): regenerate typedoc after rebase onto v5

cd272f7

Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>

feat(appkit): reference agent-app, dev-playground chat UI, docs, and template #306
MarioCadenas wants to merge 26 commits into agent/v2/5-fromplugin-runagent from agent/v2/6-apps-docs

| Aspect | Template | Dev-Playground |
| --- | --- | --- |
| Hook | `useServingInvoke` | `useServingStream` |
| Backend endpoint | `POST /api/serving/invoke` | `POST /api/serving/stream` |
| Transport | Regular HTTP POST → JSON response | HTTP POST → SSE (Server-Sent Events) |
| State | `{ invoke, data, loading, error }` | `{ stream, chunks, streaming, error, reset }` |
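The practical difference in the Transport row is that an SSE consumer must reassemble frames that can arrive split across network chunks, whereas the JSON path parses one complete body. A minimal, generic sketch of that reassembly follows; `createSseParser` is a hypothetical helper, not part of either hook, and it handles only `data:` fields.

```typescript
/**
 * Returns a function that accepts decoded text chunks and calls `onData`
 * once per complete SSE frame (frames are separated by a blank line).
 */
function createSseParser(onData: (data: string) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk) => {
    // Normalize CRLF so frame boundaries are always "\n\n".
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let boundary: number;
    while ((boundary = buffer.indexOf("\n\n")) !== -1) {
      const frame = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      const data = frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, "")) // drop one optional leading space
        .join("\n");
      if (data.length > 0) onData(data);
    }
  };
}
```

Because incomplete frames stay in `buffer` until the blank line arrives, callers can feed it raw decoded chunks as they come off the response stream.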

Code Review — agent/v2/6-apps-docs

Context

What works well

P0 — Must fix

1. Doc/behavior contradiction on auto-inherit

P1 — Should fix before merge

2. MCP `callTool` corrupts results when `text` is undefined

3. MCP `sendNotification` ignores HTTP error status

4. MCP SSE response body read has no size limit

5. `reload()` races with in-flight streams

6. `tool()` return type lies about string-only

7. SSE parsing boilerplate in template AgentChat

P2 — Fix if straightforward

8. Files and Genie plugin tools don't forward `signal`

9. MCP JSON-RPC response shape not validated

10. Hardcoded 30s tool execution timeout

11. Missing approval timeout validation

12. Inconsistent tool annotation fields across plugins

13. Stale comments reference `destructive: true` instead of `effect`

14. `createAgent({ name })` precedence unclear

15. `tool()` vs `defineTool()` callback naming inconsistency

P3 — Low priority

16. Sub-agent input extraction is lenient

17. `AbortSignal.any()` requires Node 20+

18. MCP `connectAll` doesn't surface partial failures clearly

19. `tool()` description silently falls back to name

20. Playground server is too dense for a reference

21. No shared SSE/stream hook in appkit-ui

Not a bug (reviewer false positives)

Template (non-streaming) vs Dev-Playground (streaming)

Hook / Client approach

Backend routes (both registered by `ServingPlugin`)

Response parsing

What "doesn't support streaming" means
