feat(appkit): reference agent-app, dev-playground chat UI, docs, and template #306
MarioCadenas wants to merge 26 commits into agent/v2/5-fromplugin-runagent
pkosiec left a comment
(not yet a full review, just some partial comments)
...fromPlugin(analytics), // all analytics tools
...fromPlugin(files, { only: ["uploads.read"] }), // filtered subset
get_weather: tool({
  name: "get_weather",
Why do we need a name for a tool which is already registered as a unique key?
toolkits:
  - analytics # all analytics.* tools
  - files: [uploads.read, uploads.list] # only these files tools
  - genie: { except: [getConversation] } # everything but getConversation
tools: [get_weather] # ambient tool declared in code
I'm not sure if it isn't a bit confusing to have such a distinction - in the end, it's always about tools (either custom ones or plugin ones).
Maybe we should rename that to: tools and extraTools
or, have something like:
tools:
  - plugin:analytics
  - plugin:files: { only: [uploads.read] }
  - get_weather
WDYT?
That's a lot of code for the dev playground.
Honestly, I'm not sure if that fits in the dev playground - maybe we should move it somewhere? E.g. Move it as a separate template in app-templates?
IMO this is too good to keep it in the dev playground.
Just saying, it's not a blocker but I think it has a huge potential 👍
Actually, as a second thought - the "Smart dashboard" example is kind of reimplementing Metric Views? Correct me if I'm wrong 🤔
Wait, this file shouldn't be modified, right? Because there's no breaking change if we haven't released agent plugin yet?
(BTW, why do we have so many "Changelog" headers? 😄 something is wrong with our changelog generation, I guess)
Agentic reviews (aggregated report):
Code Review — agent/v2/6-apps-docs
Context
Branch agent/v2/6-apps-docs (based on agent/v2/5-fromplugin-runagent) adds ~22K lines implementing the full agents system: core agent runtime, agents plugin, MCP client, plugin context mediator, tool primitives, ToolProvider surfaces on all core plugins, smart dashboard demo, template updates, and docs. This review covers both correctness bugs and developer experience issues.
What works well
- Progressive disclosure: 5 levels (markdown drop-in → frontmatter scoping → code agents → sub-agents → standalone) is a strong pedagogical ladder
- `tool()` ergonomics: Zod schema drives type inference + JSON Schema generation + runtime validation with LLM-friendly errors
- `fromPlugin()` pattern: elegant lazy references; no instance coupling, spread-friendly, clear error messages with `Available: [...]` listing
- Safety defaults: `autoInheritTools: false`, approval gates, resource limits, MCP host policy, SQL classifier. Two-key operation for auto-inherit is well-designed
- `docs/docs/plugins/agents.md`: comprehensive, well-structured, covers the full lifecycle with runnable examples at each level
P0 — Must fix
1. Doc/behavior contradiction on auto-inherit
File: docs/docs/plugins/agents.md:57-58 vs :233
Level 1 documentation says:
"4. Auto-inherits every registered ToolProvider plugin's tools (`analytics.*`, `files.*`, ...)"
But the Configuration Reference says:
"`autoInheritTools` defaults to `{ file: false, code: false }` - no tools spread into any agent unless the developer explicitly opts in."
These directly contradict each other. A developer following Level 1 will expect their markdown agent to see all plugin tools out of the box, but it won't — they get zero tools. This is the single most confusing thing in the agents DX.
Fix: Either (a) add agents({ autoInheritTools: { file: true } }) to the Level 1 example, or (b) rewrite the Level 1 narrative to say the agent has no tools yet and point to Level 2.
P1 — Should fix before merge
2. MCP callTool corrupts results when text is undefined
File: packages/appkit/src/connectors/mcp/client.ts:264-275
McpToolCallResult.content[].text is typed text?: string (line 58), but callTool maps c.text without filtering undefined:
.filter((c) => c.type === "text")
.map((c) => c.text) // c.text can be undefined
.join("\n"); // produces "undefined" in output
Fix: Change both occurrences (error path line 267 and success path line 274) to filter:
.filter((c): c is { type: string; text: string } => c.type === "text" && typeof c.text === "string")
.map((c) => c.text)
3. MCP sendNotification ignores HTTP error status
File: packages/appkit/src/connectors/mcp/client.ts:362-389
The notifications/initialized fetch doesn't check response.ok. A 4xx/5xx response is silently ignored, making connect() appear successful when the server may not have registered the client.
Fix: Warn (don't throw — MCP spec says notifications are fire-and-forget):
if (!response.ok) {
  logger.warn("MCP notification %s failed: %s %s", method, response.status, response.statusText);
}
4. MCP SSE response body read has no size limit
File: packages/appkit/src/connectors/mcp/client.ts:339-340
response.text() reads the entire SSE body into memory. A malicious or misconfigured MCP server could send unbounded data.
Fix: Read incrementally with a size cap (e.g., 10 MB), or keep response.text() but add a Content-Length check if the header is present and document the 30s AbortSignal.timeout as the backstop.
5. reload() races with in-flight streams
File: packages/appkit/src/plugins/agents/agents.ts:161-168
reload() calls mcpClient.close() and sets it to null. In-flight streams captured this.mcpClient at _streamAgent call time (line 802). The old client is now closed, so sendRpc throws "MCP client is closed" mid-stream.
Fix: Don't close the old client synchronously — let in-flight streams drain. Or simpler: don't close it at all (it has no keep-alive connections; GC collects after in-flight refs drop).
6. tool() return type lies about string-only
Files: packages/appkit/src/core/agent/tools/tool.ts:19, function-tool.ts:19
ToolConfig.execute is typed as (args) => Promise<string> | string, and FunctionTool.execute matches. But the template's helper.ts:29-30 returns objects:
execute: () => ({ now: new Date().toISOString() }),
This works at runtime (the result gets serialized downstream) but the type is wrong. Developers naturally want to return structured objects from tools.
Fix: Widen ToolConfig.execute and FunctionTool.execute return types to unknown | Promise<unknown> (matching defineTool's handler signature), with serialization handled by the runner.
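For illustration, the runner-side serialization that this fix relies on could look roughly like the sketch below. `serializeToolResult` is a hypothetical name, not an AppKit export, and the pass-through/JSON rules are assumptions rather than the runner's actual behavior.

```ts
// Hypothetical helper (not part of AppKit): turns whatever a tool's
// execute() returned into the string that is handed back to the model.
function serializeToolResult(result: unknown): string {
  // Strings pass through untouched, so string-returning tools keep working.
  if (typeof result === "string") return result;
  // undefined has no JSON representation; treat it as an empty result.
  if (result === undefined) return "";
  try {
    // JSON.stringify returns undefined for functions and symbols.
    const json = JSON.stringify(result) as string | undefined;
    return json ?? String(result);
  } catch {
    // Circular structures and BigInt values make JSON.stringify throw.
    return String(result);
  }
}
```

With something like this in the runner, `execute: () => ({ now: new Date().toISOString() })` would type-check against an `unknown | Promise<unknown>` return type and still reach the model as a JSON string.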
7. SSE parsing boilerplate in template AgentChat
File: template/client/src/pages/agents/AgentChat.tsx:104-125
~20 lines of manual SSE frame splitting, data-line extraction, and JSON parsing. Every developer who scaffolds an app will copy this verbatim. The comment at line 34 acknowledges the gap. Consider shipping a thin parseSSEStream(reader) async generator in @databricks/appkit-ui so the template doesn't teach low-level SSE plumbing as the canonical pattern.
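As a rough idea of how small the shared piece could be, here is a sketch of a pure frame splitter that a `parseSSEStream(reader)` generator could be built on. `parseSSEFrames` and `SSEParseResult` are hypothetical names, and the sketch assumes every `data:` payload is JSON and ignores `event:`, `id:` and `retry:` fields.

```ts
// Hypothetical helper: splits buffered SSE text into JSON payloads plus the
// incomplete trailing frame that must be carried into the next call.
interface SSEParseResult {
  events: unknown[];
  rest: string;
}

function parseSSEFrames(buffer: string): SSEParseResult {
  // Frames are separated by a blank line; normalize CRLF first.
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  // The last element is either "" or a frame that has not finished arriving.
  const rest = frames.pop() ?? "";
  const events: unknown[] = [];
  for (const frame of frames) {
    // Collect the data lines, dropping the field name and one leading space.
    const data = frame
      .split("\n")
      .filter((line) => line.startsWith("data:"))
      .map((line) => line.slice(5).replace(/^ /, ""))
      .join("\n");
    if (data === "") continue;
    try {
      events.push(JSON.parse(data));
    } catch {
      // Non-JSON payloads are skipped in this sketch; a real helper should
      // probably surface them to the caller instead.
    }
  }
  return { events, rest };
}
```

A reader loop would decode each chunk with `TextDecoder` and call `parseSSEFrames(rest + chunk)`, keeping `rest` between iterations.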
P2 — Fix if straightforward
8. Files and Genie plugin tools don't forward signal
Files:
packages/appkit/src/plugins/files/plugin.ts:1059-1123
packages/appkit/src/plugins/genie/genie.ts:63-99
All tool handlers in _defineVolumeTools and _defineSpaceTools ignore the signal parameter from ToolEntry.handler. The analytics and lakebase plugins correctly forward it.
Fix: Add signal parameter to each handler. For genie sendMessage, pass it to abort the async iteration.
9. MCP JSON-RPC response shape not validated
File: packages/appkit/src/connectors/mcp/client.ts:349-351
as JsonRpcResponse is an unsafe cast. A malformed server response could pass through without json.error, json.result, or matching json.id.
Fix: Add minimal validation:
if (typeof json !== "object" || json === null || json.jsonrpc !== "2.0") {
  throw new Error(`MCP response for ${method} is not valid JSON-RPC 2.0`);
}
10. Hardcoded 30s tool execution timeout
File: packages/appkit/src/core/plugin-context.ts:187
const timeout = 30_000; is not configurable. Complex SQL queries or batch operations may need more.
Fix: Accept an optional timeoutMs in the executeTool signature with 30_000 as default.
11. Missing approval timeout validation
File: packages/appkit/src/plugins/agents/agents.ts:124
cfg.timeoutMs from user config is not validated. A negative or zero value causes immediate denial.
Fix: timeoutMs: Math.max(cfg.timeoutMs ?? 60_000, 1_000)
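One caveat with the one-liner: `Math.max(NaN, 1_000)` evaluates to `NaN`, so a non-numeric value would still slip through. A slightly more defensive sketch follows; `resolveApprovalTimeout` and the two constants are hypothetical names, and falling back to the default for non-finite input is an assumption, not the plugin's current behavior.

```ts
// Hypothetical constants mirroring the values in the suggested fix.
const DEFAULT_APPROVAL_TIMEOUT_MS = 60_000;
const MIN_APPROVAL_TIMEOUT_MS = 1_000;

// Hypothetical helper: clamps a user-supplied approval timeout.
function resolveApprovalTimeout(timeoutMs?: number): number {
  // Missing, NaN and Infinity all fall back to the default (an assumption).
  if (timeoutMs === undefined || !Number.isFinite(timeoutMs)) {
    return DEFAULT_APPROVAL_TIMEOUT_MS;
  }
  // Zero and negative values are raised to the floor instead of denying at once.
  return Math.max(timeoutMs, MIN_APPROVAL_TIMEOUT_MS);
}
```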
12. Inconsistent tool annotation fields across plugins
Files: lakebase.ts:181-185, analytics.ts:290, files/plugin.ts:1067,1108,1119, genie.ts:76,96
The ToolAnnotations type defines a preferred effect enum ("read" | "write" | "update" | "destructive") but no plugin uses it — all use the deprecated readOnly/destructive booleans. The lakebase plugin sets both readOnly AND destructive (inverse of each other) which is redundant.
Fix: Standardize on the effect field across all plugins. Remove redundant pairs in lakebase.
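If the legacy booleans need to keep working during the migration, a small normalizer could bridge the two forms. This is only a sketch: `LegacyAnnotations` and `resolveEffect` are hypothetical names, and mapping `readOnly: false` to `"write"` is an assumption that the real `ToolAnnotations` semantics may not share.

```ts
type ToolEffect = "read" | "write" | "update" | "destructive";

// Simplified stand-in for ToolAnnotations; the real type may carry more fields.
interface LegacyAnnotations {
  effect?: ToolEffect;
  readOnly?: boolean; // deprecated boolean form
  destructive?: boolean; // deprecated boolean form
}

// Hypothetical helper: prefers `effect`, then falls back to the booleans.
function resolveEffect(annotations: LegacyAnnotations): ToolEffect | undefined {
  if (annotations.effect !== undefined) return annotations.effect;
  // destructive wins over readOnly when both legacy flags are present.
  if (annotations.destructive === true) return "destructive";
  if (annotations.readOnly === true) return "read";
  // Assumption: an explicit readOnly: false means the tool writes.
  if (annotations.readOnly === false) return "write";
  return undefined;
}
```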
13. Stale comments reference destructive: true instead of effect
Files: shared/src/agent.ts:126, core/agent/types.ts:149, docs/docs/plugins/agents.md:334
These all reference the legacy boolean form. Since effect is now preferred and destructive is @deprecated, docs and comments should lead with effect.
14. createAgent({ name }) precedence unclear
File: packages/appkit/src/core/agent/types.ts:77-78
AgentDefinition.name is optional ("Filled in from the enclosing key"), but the template explicitly sets it. Does explicit name override the key? What if they differ? The template should either omit name or the docs should explain precedence.
15. tool() vs defineTool() callback naming inconsistency
Files: tool.ts, define-tool.ts
Two tool-creation functions for different audiences (agent authors vs plugin authors) use different callback names: execute vs handler. A developer switching from agent-side to plugin-side tool authoring will wonder why the API changed.
P3 — Low priority
16. Sub-agent input extraction is lenient
File: packages/appkit/src/plugins/agents/agents.ts:975-980
Falls back to JSON.stringify(args) when args.input isn't a string. An object like { input: null } serializes the entire args object.
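A stricter alternative, sketched under the assumption that sub-agent calls should always carry a string `input`, would reject other shapes instead of stringifying them. `extractSubAgentInput` is a hypothetical name, and throwing (rather than returning a tool error to the model) is a design choice the plugin may not want.

```ts
// Hypothetical strict extractor: accepts only { input: string }.
function extractSubAgentInput(args: unknown): string {
  if (typeof args === "object" && args !== null && "input" in args) {
    const input = (args as { input: unknown }).input;
    if (typeof input === "string") return input;
    // typeof null is "object", so name it explicitly in the message.
    const actual = input === null ? "null" : typeof input;
    throw new TypeError(`sub-agent "input" must be a string, got ${actual}`);
  }
  throw new TypeError('sub-agent call is missing the "input" argument');
}
```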
17. AbortSignal.any() requires Node 20+
File: packages/appkit/src/connectors/mcp/client.ts:327
No polyfill or feature detection. Document Node 20+ as minimum requirement or add a fallback.
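If a fallback is preferred over a version bump, feature detection is fairly small. The sketch below is an assumption about how it could be done, not code from the MCP client; `anySignal` and `anySignalFallback` are hypothetical names.

```ts
type AnySignalFn = (signals: AbortSignal[]) => AbortSignal;

// Manual combination for runtimes without AbortSignal.any().
function anySignalFallback(signals: AbortSignal[]): AbortSignal {
  const controller = new AbortController();

  function cleanup(): void {
    for (const signal of signals) signal.removeEventListener("abort", onAbort);
  }
  function onAbort(event: Event): void {
    // Propagate the reason of whichever source signal fired first.
    controller.abort((event.target as AbortSignal).reason);
    cleanup();
  }

  for (const signal of signals) {
    if (signal.aborted) {
      // A source that is already aborted aborts the combined signal immediately.
      controller.abort(signal.reason);
      cleanup();
      break;
    }
    signal.addEventListener("abort", onAbort);
  }
  return controller.signal;
}

// Uses the native static method when present, the fallback otherwise.
function anySignal(signals: AbortSignal[]): AbortSignal {
  const native = (AbortSignal as unknown as { any?: AnySignalFn }).any;
  return typeof native === "function"
    ? native.call(AbortSignal, signals)
    : anySignalFallback(signals);
}
```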
18. MCP connectAll doesn't surface partial failures clearly
File: packages/appkit/src/connectors/mcp/client.ts:103-116
Failed connections are logged but not thrown. Callers can't distinguish "all connected" from "partially connected".
19. tool() description silently falls back to name
File: tool.ts:40 — description: config.description ?? config.name
If a developer omits description, the LLM sees the tool name as the description (e.g., "get_weather"). Consider logging a warning or making description required.
20. Playground server is too dense for a reference
File: apps/dev-playground/server/index.ts — 750 lines
Mixes agent setup, file policy harness, saved-views CRUD, and OBO/SP smoke tests. Consider extracting non-agent concerns into separate files.
21. No shared SSE/stream hook in appkit-ui
The playground has custom hooks (use-agent-stream.ts, use-action-dispatcher.ts) that demonstrate SSE consumption, but these are local — not reusable. The template reimplements the same SSE parsing inline, creating two divergent implementations.
Not a bug (reviewer false positives)
- Budget consumed before approval: Intentional security design ("Counted pre-dispatch so a prompt-injected agent cannot drain the budget silently via denied calls")
- EventChannel push/close race: Node.js is single-threaded; these can't execute concurrently
- Stream cleanup ordering: `activeStreams.delete()` in driver's `finally` runs before the generator drains, but approval gate entries are already resolved. The 404 to late approvals is correct behavior
- OTel context leak in `asUser()` dev proxy: Proxy objects and closures are lightweight; GC handles them fine
I missed that part on previous PRs, but now I'm testing the flow and I have a small suggestion: can we make the description more high-level? I don't think we need to mention the fromModelServing stuff
Maybe we should also rephrase that the agent will require DATABRICKS_SERVING_ENDPOINT_NAME if the endpoint is not configured - just to improve the language stating that the env is required.
And why Model Serving (agents)? maybe it should be named Model Serving Endpoint for LLM used by Agent or sth shorter?
I initialized an app from the template and got
Error: Failed to register agent 'helper' (code): Agent 'helper' tool 'current_time' has an unrecognized shape
at AgentsPlugin.loadAgents (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:125:10)
at async AgentsPlugin.setup (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:86:3)
at async Promise.all (index 0)
at async Function._createApp (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/src/core/appkit.ts:225:5) {
[cause]: Error: Agent 'helper' tool 'current_time' has an unrecognized shape
at AgentsPlugin.buildToolIndex (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:259:10)
at AgentsPlugin.buildRegisteredAgent (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:167:32)
at async AgentsPlugin.loadAgents (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:121:23)
at async AgentsPlugin.setup (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/dist/plugins/agents/agents.js:86:3)
at async Promise.all (index 0)
at async Function._createApp (/private/tmp/0705/agent/pkosiec-agent-local/node_modules/@databricks/appkit/src/core/appkit.ts:225:5)
}
Here's what my agent thinks:
Root cause: The template's tool() calls omit name (e.g. tool({ description, schema, execute })). The tool()
factory sets name: config.name → undefined. Then isFunctionTool() checks typeof obj.name === "string" →
false, so it falls through all shape checks and throws "unrecognized shape".
The irony: buildToolIndex already overrides the name with the object key at line 407: def: {
...functionToolToDefinition(tool), name: key }. So name in tool() is redundant — but the type guard rejects
it before that line is reached.
Fix: Make name optional in ToolConfig and relax isFunctionTool to not require it, since buildToolIndex always
assigns it from the key.
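To make the suggested fix concrete, here is a simplified sketch of a relaxed guard plus key-based naming. `FunctionToolLike`, `isFunctionToolLike` and `indexTools` are hypothetical stand-ins; the real `FunctionTool`, `isFunctionTool` and `buildToolIndex` carry more fields (schema, annotations) and handle other tool shapes.

```ts
// Simplified stand-in for FunctionTool with an optional name.
interface FunctionToolLike {
  name?: string;
  description?: string;
  execute: (args: unknown) => unknown;
}

// Relaxed guard: a missing name is fine, a non-string name is not.
function isFunctionToolLike(value: unknown): value is FunctionToolLike {
  if (typeof value !== "object" || value === null) return false;
  const obj = value as Record<string, unknown>;
  const nameOk = obj.name === undefined || typeof obj.name === "string";
  return nameOk && typeof obj.execute === "function";
}

// The record key always becomes the tool name, as buildToolIndex is
// described as doing above.
function indexTools(
  tools: Record<string, unknown>,
): Map<string, FunctionToolLike & { name: string }> {
  const index = new Map<string, FunctionToolLike & { name: string }>();
  for (const [key, tool] of Object.entries(tools)) {
    if (!isFunctionToolLike(tool)) {
      throw new Error(`tool '${key}' has an unrecognized shape`);
    }
    index.set(key, { ...tool, name: key });
  }
  return index;
}
```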
 * Replace this with `<GenieChat>`-style components when AppKit ships a
 * first-class agent chat primitive in `@databricks/appkit-ui/react`.
 */
export function AgentChat() {
Let's note that only streaming model serving endpoints are supported. On the list during the apps init we list all of the resources.
Here's the comparison of the endpoint usage in context of the Model Serving plugin:
Template (non-streaming) vs Dev-Playground (streaming)
Hook / Client approach
| Aspect | Template | Dev-Playground |
|---|---|---|
| Hook | useServingInvoke | useServingStream |
| Backend endpoint | POST /api/serving/invoke | POST /api/serving/stream |
| Transport | Regular HTTP POST → JSON response | HTTP POST → SSE (Server-Sent Events) |
| State | { invoke, data, loading, error } | { stream, chunks, streaming, error, reset } |
Backend routes (both registered by ServingPlugin)
- `/invoke` → calls `servingConnector.invoke()` → uses the SDK's high-level `client.servingEndpoints.query()`, which returns a single `QueryEndpointResponse` JSON object.
- `/stream` → calls `servingConnector.stream()` → uses the SDK's low-level `client.apiClient.request({ raw: true })` hitting `/serving-endpoints/{name}/invocations` with `Accept: text/event-stream` and `{ stream: true }` in the body. Returns raw SSE bytes piped directly to the client.
Response parsing
- Template extracts `choices[0].message.content`: a complete response in one shot.
- Dev-Playground extracts `choices[0].delta.content` from each SSE chunk and concatenates them in real-time, committing the full message when streaming ends.
What "doesn't support streaming" means
A model serving endpoint "not supporting streaming" means:
- The underlying model doesn't implement the SSE/chunked response protocol. When you send `{ stream: true }` in the request body, the endpoint either ignores it and returns a full JSON response, or returns an error.
- Practically: The `/serving-endpoints/{name}/invocations` endpoint with `Accept: text/event-stream` won't produce an incremental byte stream; the `response.contents` ReadableStream will be null or the call will fail (see `client.ts:103`: `throw new Error("Response body is null — streaming not supported")`).
- Which endpoints support it: Foundation Model APIs (e.g. DBRX, Llama, external models like GPT) generally support streaming. Custom model endpoints (e.g. sklearn, MLflow pyfunc models) typically do not; they return a single JSON response.
So the template defaults to useServingInvoke (non-streaming) as the safe, universally compatible choice, with the streaming hook commented out as an opt-in. The dev-playground uses useServingStream because it's a showcase/reference app assumed to point at a chat-capable LLM endpoint.
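The two parsing paths compared above come down to roughly the following. The interfaces are trimmed-down assumptions about the chat-completions payload (only the fields named in this comment), and `textFromResponse` / `textFromChunks` are hypothetical helper names, not hooks from `@databricks/appkit-ui`.

```ts
// Minimal shapes covering only the fields referenced above.
interface ChatResponse {
  choices?: Array<{ message?: { content?: string | null } }>;
}
interface ChatChunk {
  choices?: Array<{ delta?: { content?: string | null } }>;
}

// Non-streaming: the whole answer arrives in one JSON object.
function textFromResponse(response: ChatResponse): string {
  return response.choices?.[0]?.message?.content ?? "";
}

// Streaming: each SSE chunk carries a fragment; concatenate them in order.
function textFromChunks(chunks: ChatChunk[]): string {
  return chunks
    .map((chunk) => chunk.choices?.[0]?.delta?.content ?? "")
    .join("");
}
```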
…template
Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.
`apps/agent-app/` — a standalone app purpose-built around the agents
feature. Ships with:
- `server.ts` — full example of code-defined agents via `fromPlugin`:
```ts
const support = createAgent({
instructions: "…",
tools: {
...fromPlugin(analytics),
...fromPlugin(files),
get_weather,
"mcp.vector-search": mcpServer("vector-search", "https://…"),
},
});
await createApp({
plugins: [server({ port }), analytics(), files(), agents({ agents: { support } })],
});
```
- `config/agents/assistant.md` — markdown-driven agent alongside the
code-defined one, showing the asymmetric auto-inherit default.
- Vite + React 19 + TailwindCSS frontend with a chat UI.
- Databricks deployment config (`databricks.yml`, `app.yaml`) and
deploy scripts.
`apps/dev-playground/client/src/routes/agent.route.tsx` — chat UI with
inline autocomplete (hits the `autocomplete` markdown agent) and a
full threaded conversation panel (hits the default agent).
`apps/dev-playground/server/index.ts` — adds a code-defined `helper`
agent using `fromPlugin(analytics)` alongside the markdown-driven
`autocomplete` agent in `config/agents/`. Exercises the mixed-style
setup (markdown + code) against the same plugin list.
`apps/dev-playground/config/agents/*.md` — both agents defined with
valid YAML frontmatter.
`docs/docs/plugins/agents.md` — progressive five-level guide:
1. Drop a markdown file → it just works.
2. Scope tools via `toolkits:` / `tools:` frontmatter.
3. Code-defined agents with `fromPlugin()`.
4. Sub-agents.
5. Standalone `runAgent()` (no `createApp` or HTTP).
Plus a configuration reference, runtime API reference, and frontmatter
schema table.
`docs/docs/api/appkit/` — regenerated typedoc for the new public
surface (fromPlugin, runAgent, AgentDefinition, AgentsPluginConfig,
ToolkitEntry, ToolkitOptions, all adapter types, and the agents
plugin factory).
`template/appkit.plugins.json` — adds the `agent` plugin entry so
`npx @databricks/appkit init --features agent` scaffolds the plugin
correctly.
- Full appkit vitest suite: 1311 tests passing
- Typecheck clean across all 8 workspace projects
- `pnpm docs:build` clean (no broken links)
- `pnpm --filter=@databricks/appkit build:package` clean, publint
clean
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Documents the new `mcp` configuration block and the rules it enforces:
same-origin-only by default, explicit `trustedHosts` for external MCP
servers, plaintext `http://` refused outside localhost-in-dev, and
DNS-level blocking of private / link-local IP ranges (covers cloud
metadata services). See PR #302 for the policy implementation and
PR #304 for the `AgentsPluginConfig.mcp` wiring.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
- `docs/docs/plugins/agents.md`: new "SQL agent tools" subsection
covering `analytics.query` readOnly enforcement, `lakebase.query`
opt-in via `exposeAsAgentTool`, and the approval flow. New
"Human-in-the-loop approval for destructive tools" subsection
documents the config, SSE event shape, and `POST /chat/approve`
contract.
- `apps/agent-app`: approval-card component rendered inline in the
chat stream whenever an `appkit.approval_pending` event arrives.
Destructive badge + Approve/Deny buttons POST to
`/api/agent/approve` with the carried `streamId`/`approvalId`.
- `apps/dev-playground/client`: matching approval-card on the agent
route, using the existing appkit-ui `Button` component and
Tailwind utility classes.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Updates `docs/docs/plugins/agents.md` to document the new
two-key auto-inherit model introduced in PR #302 (per-tool
`autoInheritable` flag) and PR #304 (safe-by-default
`autoInheritTools: { file: false, code: false }`). Adds an
"Auto-inherit posture" subsection explaining that the developer
must opt into `autoInheritTools` AND the plugin author must mark
each tool `autoInheritable: true` for a tool to spread without
explicit wiring.
Includes a table documenting the `autoInheritable` marking on each
core plugin tool, plus an example of the setup-time audit log so
operators can see exactly what's inherited vs. skipped.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
- **Reference app no longer ships hardcoded dogfood URLs.** The three
`https://e2-dogfood.staging.cloud.databricks.com/...` and
`https://mario-mcp-hello-*.staging.aws.databricksapps.com/...` MCP
URLs in `apps/agent-app/server.ts` are replaced with optional
env-driven `VECTOR_SEARCH_MCP_URL` / `CUSTOM_MCP_URL` config. When
set, their hostnames are auto-added to `agents({ mcp: { trustedHosts
} })`. `.env.example` uses placeholder values the reader can replace
instead of another team's workspace.
- **`appkit.agent` → `appkit.agents` in the reference app.** The
prior `appkit.agent as { list, getDefault }` cast papered over the
plugin-name mismatch fixed in PR #304. The runtime key now matches
the docs, the manifest, and the factory name; the cast is gone.
- **Auto-inherit opt-in added to the reference config.** Since the
defaults flipped to `{ file: false, code: false }` (PR #304, S-3),
the reference now explicitly enables `autoInheritTools: { file:
true }` so the markdown agents that ship alongside the code-defined
one still pick up the analytics / files read-only tools. This is the
pattern a real deployment should follow — opt in deliberately.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
- `apps/dev-playground/config/agents/autocomplete.md` sets
`ephemeral: true`. Each debounced autocomplete keystroke no longer
leaves an orphan thread in `InMemoryThreadStore` — the server now
deletes the thread in the stream's `finally` (PR #304). Closes R1
from the MVP re-review.
- `docs/docs/plugins/agents.md` documents the new `ephemeral`
frontmatter key alongside the other AgentDefinition knobs.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Documents the MVP resource caps landed in PR #304: the static
request-body caps (enforced by the Zod schemas) and the three
configurable runtime limits (`maxConcurrentStreamsPerUser`,
`maxToolCalls`, `maxSubAgentDepth`). Includes the config-block
shape in the main reference and a new "Resource limits" subsection
under the Configuration section explaining the intent and per-user
semantics of each cap.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
The agents plugin's manifest `name` is `agents` (plural), so routes mount
at `/api/agents/*` and its client config is keyed as `agents` — but three
call sites still referenced the old singular `agent`:
- apps/agent-app/src/App.tsx: /api/agent/{info,chat,approve} returned an
Express 404 HTML page, which the client then tried to JSON.parse,
producing "Unexpected token '<', <!DOCTYPE ...". Swap to /api/agents/*.
- apps/dev-playground/client/src/routes/agent.route.tsx: same three
paths, plus getPluginClientConfig("agent") returned {} so
hasAutocomplete was false and the autocomplete hook short-circuited
before ever firing a request. Swap the lookup key to "agents".
- template/appkit.plugins.json: the scaffolded plugin descriptor still
used the singular name/key, which would have broken fresh apps the
same way. Align with the plugin's real manifest name.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Move reference apps to config/agents/<id>/agent.md; document migration and reserved skills folder; align generated API snippets and CHANGELOG.
typedoc picked up JSDoc changes from agent/v2/4-agents-plugin:
- New public export `agentIdFromMarkdownPath` (helper for path-based id resolution used by `loadAgentFromFile`).
- `loadAgentsFromDir` description/body now reflects the folder layout (`<id>/agent.md`, orphan `*.md` rejected, reserved `skills/` dir).
Generated by docusaurus-plugin-typedoc during pnpm --filter=docs build.
… retire agent-app
Stage 0 of the smart-dashboard-demo plan. Ports the prototype Smart
Dashboard (NYC Taxi analytics) from the p3ju worktree into dev-playground
as a new route, migrates its markdown agents to the folder layout, and
deletes apps/agent-app — which is superseded by this demo as the
integration test of the entire v2 agents stack.
Client:
- New route at client/src/routes/smart-dashboard.route.tsx with
its own subdirectory for components/ and hooks/.
- Ported 8 components (ActiveFilters, AgentSidebar, AnomalyCard,
FareChart, InsightCard, KPICards, QuerySection, TripChart) and
4 hooks (useActionDispatcher, useAgentStream, useChartColors,
useDashboardData) as-is. Relative imports preserved.
- Nav link added in __root.tsx.
- TanStack routeTree.gen.ts auto-regenerated.
Server:
- Ports apply_filter and highlight_period inline tools.
- Adds sql_analyst (code-defined: fromPlugin(analytics)) and
dashboard_pilot (code-defined: apply_filter + highlight_period)
per the plan's Q2 = option B decision.
- Adds query markdown dispatcher in config/agents/query/agent.md
delegating to both specialists via the agents: frontmatter.
- Ports insights and anomaly ephemeral markdown agents.
Config:
- Ports 4 SQL queries into config/queries/dashboard_*.sql.
- Note: shared/appkit-types/analytics.d.ts not regenerated in this
commit; useAnalyticsQuery("dashboard_*", ...) uses explicit as
casts and works at runtime. Regenerate with
'npx @databricks/appkit generate-types' locally when convenient.
Cleanup:
- apps/agent-app/ removed in full. No references outside
pnpm-lock.yaml (regenerated).
- plans/smart-dashboard-demo.md added with the full staged plan.
Verification:
- pnpm --filter=dev-playground client typecheck: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main (files plugin
union type, telemetry CacheManager, playwright DOM lib) — no new
regressions.
Next stages (1-6, per the plan): dispatcher integration verified,
save_view + approval card, dashboard-context injection + focus_chart,
Stream Inspector, polish, demo script.
Stage 0 ported the dashboard shell verbatim from the prototype; this commit layers every v2-stack feature on top, moves the feature dir out of routes/ (TanStack was flagging files as stray routes), rewrites the agent -> UI action pipeline for correctness, and adds discoverability for the HITL flow.
Server (apps/dev-playground/server/index.ts):
- Split the polymorphic apply_filter into four narrower tools: filter_by_date_range, filter_by_pickup_zip, filter_by_fare, clear_filters. Each has exactly one client-side effect; removes the whole class of 'agent said it worked but nothing moved' bugs.
- Add clear_highlights, focus_chart, save_view (destructive; triggers the approval gate).
- dashboard_pilot instructions rewritten with a compact verb-per-line reference so the LLM picks the right single tool for each intent.
Client - moved out of routes/:
- Feature code relocates to client/src/features/smart-dashboard/ (components/, hooks/, lib/). TanStack Router was warning that every non-route file under routes/ 'does not contain any route piece.'
- smart-dashboard.route.tsx uses @/features/ aliases; the route file is now the only thing under routes/.
Client - correctness fixes in the action dispatcher:
- Act only on response.output_item.done (never .added, which fires with partial arguments and caused double-applied highlights plus silent JSON-parse races).
- Dedupe by call_id with a bounded LRU; reset on appkit.metadata (new-run signal).
- Use updater callbacks (onFilterUpdate(prev => ...)) instead of a currentFilters prop to eliminate stale-closure bugs when the agent fires multiple tool calls in one render cycle.
- Validate arg shapes per tool; anything malformed or unrecognized surfaces through onUnknownTool (route renders as a red banner + console.warn). Silent failure was the worst failure mode.
- Emit a human-readable summary for every applied action (onAction).
Client - discoverability / HITL:
- New QuickActionsBar with Save view... (inline name input), Clear filters, Clear highlights. Each dispatches through the chat pipeline so the agent still reasons and the approval gate still fires for destructive actions - the bar just saves typing.
- ActionToast (bottom-left) confirms every dispatcher-applied action for ~3s. Answers 'did it work?' without opening the inspector.
- QuerySection refactored into a view: content/isLoading/onSend come from the route. Lifting useAgentStream one level up lets the Quick Actions bar and the chat input share a single agent stream.
- QuerySection example queries refreshed to cover the new tools.
Client - stream-inspector wiring:
- SSEEvent extended with approval_pending payload fields.
- use-stream-inspector threaded through so every run's events flow into the inspector's module-level store.
- FocusableChart renamed its 'id' prop to 'chartId' (logical registry key, not a DOM id - biome was right to complain).
Verification:
- pnpm --filter=dev-playground client tsc --noEmit: clean.
- pnpm --filter=dev-playground client vite build: clean.
- Server typecheck: same pre-existing errors as main; no new regressions.
- apps/dev-playground/shared/appkit-types/analytics.d.ts regenerated by vite build to register the four dashboard_* queries; kept in the commit so CI and downstream consumers have typed useAnalyticsQuery access out of the box.
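The call_id dedupe described in this commit can be pictured with a small bounded set like the one below. It is a sketch only: `SeenCallIds`, its default capacity of 256, and the method names are assumptions, not the dispatcher's actual implementation.

```ts
// Hypothetical bounded LRU of call ids, relying on Set insertion order.
class SeenCallIds {
  private readonly ids = new Set<string>();
  private readonly capacity: number;

  constructor(capacity = 256) {
    this.capacity = capacity;
  }

  /** Returns true the first time an id is seen, false for repeats. */
  markSeen(callId: string): boolean {
    if (this.ids.has(callId)) {
      // Re-insert so the id counts as most recently used.
      this.ids.delete(callId);
      this.ids.add(callId);
      return false;
    }
    this.ids.add(callId);
    if (this.ids.size > this.capacity) {
      // The first value in iteration order is the least recently used.
      const oldest = this.ids.values().next().value;
      if (oldest !== undefined) this.ids.delete(oldest);
    }
    return true;
  }

  /** Called on a new-run signal such as appkit.metadata. */
  reset(): void {
    this.ids.clear();
  }
}
```

The dispatcher would then apply an action only when `markSeen(call_id)` returns true.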
…iews panel + floating chat
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Fresh UC volumes don't have a saved-views/ subdirectory until the first
save; the SDK throws FILES_API_DIRECTORY_IS_NOT_FOUND on list. The
route was propagating that as a 500 which rendered as a red error
banner in the SavedViewsPanel on first load.
Catch the error explicitly, return { views: [] }, let the panel render
its 'no saved views yet' empty state cleanly. Uploads still work the
first time because the SDK auto-creates parent dirs on upload.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
html2canvas 1.x throws on `oklch()` color values, which Tailwind v4 emits everywhere in computed styles. Swap to the maintained html2canvas-pro fork (drop-in API) so dashboard captures render without "Attempting to parse an unsupported color function 'oklch'" errors in the approval card. Keeps html2canvas pinned so types still resolve.
Databricks SDK `volume.download(path)` returns a wrapper
`{ contents: ReadableStream, "content-type": string }`, not the stream
itself. The previous handler tried to write the wrapper directly, which
produced an empty body and broke thumbnails in the saved-views panel.
Now we read `.contents`, drain the stream, and respond with the
server-reported content-type (falling back to `image/png`).
Also drops a couple of noisy console.logs left over from the debugging
session.
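The unwrap-and-drain step might look like the sketch below. It is illustrative only: DownloadResult, concatChunks, and readDownload are hypothetical names, and `contents` is typed as an AsyncIterable on the assumption that the stream can be consumed with `for await` (as web ReadableStreams can in recent Node versions).

```typescript
// Assumed shape of the wrapper described above.
type DownloadResult = {
  contents: AsyncIterable<Uint8Array>;
  "content-type"?: string;
};

// Join the drained chunks into one contiguous buffer.
function concatChunks(chunks: Uint8Array[]): Uint8Array {
  const total = chunks.reduce((n, c) => n + c.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return out;
}

async function readDownload(
  result: DownloadResult,
): Promise<{ body: Uint8Array; contentType: string }> {
  const chunks: Uint8Array[] = [];
  // Read `.contents`, not the wrapper itself.
  for await (const chunk of result.contents) chunks.push(chunk);
  return {
    body: concatChunks(chunks),
    // Prefer the server-reported type, falling back to image/png.
    contentType: result["content-type"] || "image/png",
  };
}
```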
… click
Clicking a saved-view thumbnail was sending a chat prompt like "Load the saved view 'january'" and letting the agent reconstruct filters from the view name. That dropped the highlights (agent had no tool to fetch the stored metadata) so January-with-focus-on-week-1 came back as just January-wide.
Since the client already holds the full authoritative metadata for the clicked thumbnail, bypass the agent and apply `meta.filters` and `meta.highlights` directly to local state, with a toast summarising what was restored.
Also hardens the `appkit.approval_pending` handler: it now accepts both snake_case and camelCase fields and validates that approval_id/tool_name/stream_id are non-empty strings before enqueuing, so a malformed event can't push a broken approval card.
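The hardened `appkit.approval_pending` check could be sketched as below. The snake_case field names are from the commit message; the camelCase spellings (approvalId, toolName, streamId), the PendingApproval type, and parseApprovalPending are assumptions made for illustration.

```typescript
type PendingApproval = { approvalId: string; toolName: string; streamId: string };

// Returns the value only when it is a non-empty string.
function nonEmptyString(value: unknown): string | undefined {
  return typeof value === "string" && value.length > 0 ? value : undefined;
}

// Accepts snake_case or camelCase; returns null for anything malformed
// so the caller never enqueues a broken approval card.
function parseApprovalPending(payload: unknown): PendingApproval | null {
  if (typeof payload !== "object" || payload === null) return null;
  const p = payload as Record<string, unknown>;
  const approvalId = nonEmptyString(p.approval_id) ?? nonEmptyString(p.approvalId);
  const toolName = nonEmptyString(p.tool_name) ?? nonEmptyString(p.toolName);
  const streamId = nonEmptyString(p.stream_id) ?? nonEmptyString(p.streamId);
  if (!approvalId || !toolName || !streamId) return null;
  return { approvalId, toolName, streamId };
}
```

Normalising to one internal shape at the boundary keeps the rest of the approval UI free of casing checks.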
Picks up the new `annotations?: ToolAnnotations` field on `ToolConfig` and `FunctionTool` introduced upstream in the annotations-propagation fix.
…nable agent feed
Reshapes the Smart Dashboard demo from a sparse 2-chart layout into a 2x2 chart grid with a right-rail agent feed, and turns the previously read-only insights/anomaly cards into clickable actions that drive the dashboard directly.
New visualisations:
- HourlyHeatmap: day-of-week × hour-of-day grid, click a cell to ask the agent to investigate that slot.
- TopZonesChart: hand-rolled horizontal bar leaderboard with click-to-filter and a `highlight_zone` ring driven by the agent.
- KPI sparklines: inline 7-day micro-charts with windowed trend deltas baked into each KPI card.
Agent feed becomes interactive:
- `feed-actions.ts` defines a structured action schema (filter_date, filter_zip, filter_fare, highlight_period, highlight_zone, focus_chart, ask) and a parser. The `insights` and `anomaly` ephemeral agents now emit JSON matching that schema.
- `ActionableCard` renders insights/anomalies with action chips that invoke `useActionDispatcher.dispatch` directly. This is the same code path the SSE function-call handler uses, so UI clicks and agent tool calls behave identically.
- The feed re-runs (debounced) whenever filters or highlights change.
Server-side wiring:
- Adds `highlight_zone` and `clear_zone_highlights` tools.
- Extends the `focus_chart` enum with `hourly_heatmap` and `top_zones`.
- Updates `dashboard_pilot` instructions to prefer `highlight_zone` over `filter_by_pickup_zip` when calling out a single ZIP.
- Adds three SQL queries: `dashboard_hourly_heatmap`, `dashboard_top_zones`, `dashboard_kpi_sparklines`. The top-zones query casts `pickup_zip` (an INT in samples.nyctaxi.trips) to STRING so the client's highlight Map keys, the agent's `highlight_zone` arg, and the filter parameter all speak the same type.
Polish & defensive fixes:
- Defensive `Number()` coercion in `kpi-cards.tsx` for sparkline values so trend math doesn't render `NaN%` or string-concatenated revenue totals if a driver hands back DECIMAL-as-string.
- `Sparkline` reserves vertical space for intentionally-empty series (e.g. the categorical "Top Pickup Zone" KPI) instead of rendering a loading-style placeholder.
- 2x2 chart grid uses `items-start` + `auto-rows-min content-start` so the rail no longer stretches the chart column and creates dead space.
- `ChatDrawer` becomes a controlled component (`open` + `onOpenChange`) so any agent-triggering UI action can auto-open the chat; the user always sees the agent's response without manual disclosure.
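To illustrate the defensive `Number()` coercion mentioned under the polish fixes, here is a small sketch. The helper names (toNumber, sumValues, trendPercent) are hypothetical, and the trend is computed as a simple first-versus-last delta, which is an assumption rather than the windowed calculation used in `kpi-cards.tsx`.

```typescript
// Coerce driver output (number, or DECIMAL delivered as a string) to a finite number.
function toNumber(value: unknown): number {
  const n = typeof value === "number" ? value : Number(value);
  return Number.isFinite(n) ? n : 0; // unparseable input counts as 0
}

// Without coercion, "10.5" + 2 would concatenate to "10.52".
function sumValues(values: unknown[]): number {
  return values.reduce<number>((total, v) => total + toNumber(v), 0);
}

// Percentage change from the first to the last point; null when undefined.
function trendPercent(values: unknown[]): number | null {
  if (values.length < 2) return null;
  const first = toNumber(values[0]);
  const last = toNumber(values[values.length - 1]);
  if (first === 0) return null; // avoid dividing by zero and rendering NaN% or Infinity%
  return ((last - first) / first) * 100;
}
```

Returning null instead of NaN lets the card decide how to render "no trend" rather than printing a broken percentage.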
The playground header was unscalable: 14 demo links rendered as side-by-side buttons that overflowed on narrow screens, and the home page maintained a parallel hand-curated grid that had already drifted (missing Smart Dashboard, Chart Inference, Vector Search, Policy Matrix, and Serving, ~30% of the catalog).
Introduces `client/src/lib/nav.ts` as the single source of truth: each demo declares its label, one-line description, lucide icon, and category group. Both surfaces now read from the same list, so adding a demo is a one-line change and they can no longer drift.
Header (`__root.tsx`):
- Replaces the button wall with a single "Menu" hamburger dropdown grouping demos by purpose (Data / AI / Platform).
- Active route is highlighted inside the dropdown and shown breadcrumb-style next to the brand, so the user always knows where they are.
- Caps dropdown height at viewport-minus-header with overflow scroll, so adding more demos won't break the layout.
Home page (`index.tsx`):
- Restrained hero with a soft dual-radial gradient wash (~6-8% opacity, primary + accent): depth without saturation.
- Featured card for the Smart Dashboard flagship demo: gradient accent, icon tile, eyebrow badge, animated CTA. The featured demo also appears in its category grid, de-emphasised with a "Featured above" note.
- Three category sections with one-line taglines, rendered as a 1/2/3-col responsive grid of icon + title + description cards. Each card is a real `<Link>` (not a button inside a decorative `<Card>`), so the whole surface is keyboard-accessible.
- Footer shows live demo and category counts driven by the catalog.
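A catalog of this kind might be shaped roughly like the sketch below. It is not the contents of `client/src/lib/nav.ts`: the NavItem fields beyond label, description, icon, and group, the route paths, the descriptions, the icon names, and the group assigned to each demo are all hypothetical.

```typescript
type NavGroup = "Data" | "AI" | "Platform";

interface NavItem {
  label: string;
  description: string;
  icon: string; // icon name only, to keep the sketch import-free
  group: NavGroup;
  to: string; // hypothetical route path
  featured?: boolean;
}

// Hypothetical entries; group assignments and paths are illustrative.
const NAV_ITEMS: readonly NavItem[] = [
  { label: "Smart Dashboard", description: "Agent-driven dashboard", icon: "layout-dashboard", group: "AI", to: "/smart-dashboard", featured: true },
  { label: "Chart Inference", description: "Infer charts from query results", icon: "bar-chart", group: "Data", to: "/chart-inference" },
  { label: "Vector Search", description: "Similarity search demo", icon: "search", group: "Data", to: "/vector-search" },
  { label: "Policy Matrix", description: "Access policy overview", icon: "shield", group: "Platform", to: "/policy-matrix" },
];

// Used by both the header dropdown and the home page sections.
function groupByCategory(items: readonly NavItem[]): Map<NavGroup, NavItem[]> {
  const groups = new Map<NavGroup, NavItem[]>();
  for (const item of items) {
    const bucket = groups.get(item.group) ?? [];
    bucket.push(item);
    groups.set(item.group, bucket);
  }
  return groups;
}

// Footer counts derived from the catalog, so they cannot drift from it.
function catalogCounts(items: readonly NavItem[]): { demos: number; categories: number } {
  return { demos: items.length, categories: groupByCategory(items).size };
}
```

Because every surface derives its view from the same array, adding a demo means appending one entry.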
…tive
Retag save_view as effect: "write" (it creates a PNG; it doesn't delete anything) and teach the approval card to render three distinct tiers. Capturing a screenshot no longer masquerades as deletion: writes get a calm blue card with a plus-circle icon, updates get a warning-amber card with a pencil, and real destructive actions retain the red shield-alert. Legacy destructive: true still maps to the red tier, so tools that haven't migrated keep their current look.
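The tier selection can be pictured as a small pure function. This is a sketch under assumptions: only effect: "write" and the legacy destructive: true flag appear in the commit message, so the "update" and "destructive" effect values, the ApprovalAnnotations and ApprovalTier types, and the null result for unannotated tools are hypothetical.

```typescript
// "write" is from the commit message; "update" and "destructive" are assumed values.
type Effect = "write" | "update" | "destructive";

interface ApprovalAnnotations {
  effect?: Effect;
  destructive?: boolean; // legacy flag
}

interface ApprovalTier {
  color: "blue" | "amber" | "red";
  icon: "plus-circle" | "pencil" | "shield-alert";
}

function approvalTier(annotations: ApprovalAnnotations): ApprovalTier | null {
  switch (annotations.effect) {
    case "write":
      return { color: "blue", icon: "plus-circle" };
    case "update":
      return { color: "amber", icon: "pencil" };
    case "destructive":
      return { color: "red", icon: "shield-alert" };
  }
  // Tools that have not migrated to `effect` keep the red look.
  if (annotations.destructive === true) return { color: "red", icon: "shield-alert" };
  return null; // assumption: no tier when nothing is annotated
}
```

Checking `effect` first means a migrated tool is never downgraded to the red tier by a leftover legacy flag.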
Tailwind v4 compiles `bg-blue-50/50` to a two-layer rule: an sRGB hex fallback plus an `@supports (color-mix)` override that mixes the oklch palette token with transparent in oklab. Browsers with color-mix support (recent Chrome/Arc) take the oklab path; older embedded Chromiums (e.g. Cursor's built-in browser) fall through to the sRGB hex. Those two paths produce visibly different tints against the dark `--card` token, which is why the agent-feed cards rendered inconsistently across Chrome, Arc, and Cursor's browser. Pin the four insight/anomaly-tier backgrounds to arbitrary 8-digit hex (`bg-[#eff6ff80]` etc.) so every browser lands on the same sRGB path. Values taken from Tailwind's own fallback output to preserve the intended look on color-mix-capable browsers.
appkit-ui's globals.css already defines dark-theme tokens via two paths — an explicit `.dark` class on <html>, and `@media (prefers-color-scheme: dark)` guarded by `:root:not(.light)` so an explicit `.light` class wins. Tailwind v4's default `dark:` variant, however, is purely media driven. That mismatch shows up when the user forces light via the playground's theme selector while their OS is in dark mode: the bootstrap script sets `<html class="light">`, --card/--background correctly resolve to light, but every `dark:*` utility keeps firing under the media query — cards end up painted with dark-mode backgrounds layered under light-mode chrome. Declare a playground-local `@custom-variant dark` that mirrors the token logic exactly: fire when the element is (or descends from) `.dark`, or when `prefers-color-scheme: dark` matches and no `.light` ancestor is present. This rebinds every `dark:*` utility to respect the theme selector's forced choice, keeping the rest of appkit-ui's consumers — which don't ship the bootstrap script — on the existing media-only behaviour.
The streaming-message bubble in the smart-dashboard chat drawer used `animate-pulse` while tokens arrived. The constant fade in/out reads as visual noise when the agent is mid-stream — especially with longer replies where it pulses for many seconds. Drop the animation; the ellipsis placeholder still communicates the loading state for empty streaming bubbles.
`server({ autoStart: false }).then(appkit => appkit.server.extend(...).start())`
is gone — `createApp` now orchestrates server start itself, with the
post-setup hook surfaced as the `onPluginsReady` config callback.
Drop `autoStart: false`, hoist the `extend` block from the trailing
`.then` chain into `onPluginsReady`, and replace the dangling promise
with `.catch(console.error)` so unhandled rejections still surface.
Tracks #280 / #291 (autoStart removal + on-plugins-ready codemod).
Selecting `agents` in `databricks apps init` previously produced an
app that booted, logged "No agents registered.", and rendered no UI for
the plugin. Fixes that by scaffolding two starter agents (one markdown,
one code-defined) and a chat surface, gated on `{{if .plugins.agents}}`.
Added:
- template/config/agents/assistant/agent.md — markdown agent, default,
no tools. Demonstrates the declarative form.
- template/server/agents/helper.ts — code-defined agent via
createAgent({...}) with two inline tool({...}) definitions:
current_time (returns ISO timestamp) and count_words. Tools are pure
JS so the demo works regardless of which other plugins were selected
at scaffold time.
- template/client/src/pages/agents/AgentChat.tsx — minimal SSE consumer
for /api/agents/chat with an agent picker, streaming text bubbles,
and inline tool-call rows. Hand-rolled because @databricks/appkit-ui
doesn't yet ship a generic agent chat primitive — replace with one
when it lands.
Modified:
- template/server/server.ts: when {{if .plugins.agents}}, imports the
helper agent and wires it as agents({ agents: { helper } }) instead
of bare agents(). The markdown 'assistant' loads automatically from
config/agents/.
- template/client/src/App.tsx: conditional NavLink + route entry,
mirroring the analytics/files/etc. blocks.
End-to-end shape after init with --features agents:
- GET /api/agents/info returns { agents: ['assistant', 'helper'],
defaultAgent: 'assistant' }
- /agents page renders chat with picker
- 'what time is it?' to helper triggers a current_time tool round-trip
- 'count words in: the quick brown fox' triggers count_words → 4
The serving-endpoint resource (DATABRICKS_SERVING_ENDPOINT_NAME) is
already declared in template/appkit.plugins.json from PR 4, so the CLI
prompts for an endpoint when agents is selected.
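The logic behind the two starter tools is small enough to sketch as plain functions. The function names and the whitespace-splitting rule are assumptions, and the createAgent({...}) / tool({...}) wiring is deliberately left out because its exact signature is not shown here.

```typescript
// Returns the current time as an ISO 8601 timestamp.
// `now` is injectable so the function can be tested deterministically.
function currentTime(now: Date = new Date()): string {
  return now.toISOString();
}

// Counts whitespace-separated words; empty or blank input yields 0.
function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}
```

For example, countWords("the quick brown fox") returns 4, matching the round-trip described above.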
agents, createAgent, fromPlugin, tool and all agent-related exports are now under the beta subpath. Update the dev-playground server and the template helper to import from @databricks/appkit/beta.
- Document agents as beta in docs and set stability in app template manifest
- Point Docusaurus Typedoc at typedoc.entry.ts so stable + beta APIs publish together (fixes agent symbol pages being dropped from index-only builds)
- Regenerate api/appkit index and sidebar; knip-ignore docs-only entry file
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Typedoc reference grew when the unified entry started exposing tool authoring primitives (defineTool, AppKitMcpClient, DatabricksAdapter, parseTextToolCalls, ToolEntry, ToolRegistry, etc.) that beta.ts now re-exports. Regenerating brings docs/docs/api/ back in sync so the docs:build CI gate passes.
pnpm-lock.yaml gains the get-port@7.2.0 entry that was added to @databricks/appkit on main and merged into v4 during the stack rebase.
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>


Final layer of the agents feature stack. Everything needed to
exercise, demonstrate, and learn the feature.
Reference application: agent-app
apps/agent-app/ - a standalone app purpose-built around the agents feature. Demonstrates every major capability in one place:
config/agents/assistant.md, default) with destructive file tools (upload, delete) for HITL demo, and
agents: [support, researcher] delegating to both a markdown sibling and a code-defined specialist.
config/agents/support.md) - full analytics get_weather.
researcher in server.ts) - defined in code specifically because its MCP tool set is conditional on
runtime env vars, which markdown frontmatter can't express.
Referenced from
assistant.md via the markdown → code cross-reference.
server.ts - concise: ambient tool() factories, conditional MCP wiring, zero-trust host allowlist derived from the same env vars,
agents() plugin config with autoInheritTools and mcp.trustedHosts.
streaming tokens, tool calls, and an approval card that approves or
denies destructive tool requests over
/api/agents/approve.
databricks.yml, app.yaml) and .env.example for local dev.
dev-playground chat UI + autocomplete agent
apps/dev-playground/client/src/routes/agent.route.tsx - chat UI with inline autocomplete (hits the autocomplete markdown agent configured with ephemeral: true) and a full threaded conversation panel (hits the default agent).
apps/dev-playground/server/index.ts - code-defined helper agent using fromPlugin(analytics) alongside the markdown-driven autocomplete agent in config/agents/, demonstrating the mixed setup against the same plugin list. Route tree (routeTree.gen.ts) regenerated to include the new /agent route.
Docs
docs/docs/plugins/agents.md - progressive guide covering:
toolkits: / tools: frontmatter.
fromPlugin().
runAgent() (no createApp or HTTP).
Plus configuration reference (including approval, limits, mcp keys), runtime API reference, and a full frontmatter schema table.
docs/docs/api/appkit/ - typedoc regenerated for the full agents public surface including AgentDefinition.ephemeral, AgentsPluginConfig.{approval, limits, mcp}, updated loadAgentFromFile / loadAgentsFromDir signatures, expanded AgentEvent union, and ToolkitEntry annotations.
Template
template/appkit.plugins.json - adds the agents plugin entry so npx @databricks/appkit init --features agents scaffolds the plugin correctly.
Test plan
pnpm docs:build clean (no broken links)
pnpm --filter=@databricks/appkit build:package clean, publint clean
Signed-off-by: MarioCadenas <MarioCadenas@users.noreply.github.com>
PR Stack
agents() plugin + createAgent(def) + markdown-driven agents - feat(appkit): agents() plugin, createAgent(def), and markdown-driven agents #304
fromPlugin() DX + runAgent plugins arg + toolkit-resolver - feat(appkit): fromPlugin() DX, runAgent plugins arg, shared toolkit-resolver #305
Demo
agent-demo.mp4