[AI-860] Local terminal attach for hosted agents (Phase 1: run-agent / attach / ls)#155
Conversation
Code Review by Qodo
1. Spawn frame OOM risk
|
PR Summary by QodoPhase 1: Local terminal attach for daemon-hosted agents (run-agent/attach/ls) WalkthroughsDescription• Add kcap run-agent, attach, and ls to drive daemon-hosted agents from a local terminal. • Introduce an owner-only Unix domain control socket with a length-prefixed, AOT-safe framing protocol. • Run locally launched agents as PrivateLocal: no per-agent server calls and safe borrowed-cwd cleanup. Diagramgraph TD
A["User terminal"] --> B["kcap CLI"] --> C["Local control socket"] --> D["LocalControlServer"] --> E["AgentOrchestrator"] --> F["PTY agent"]
E --> G["LocalSocketSink(s)"]
subgraph Legend
direction LR
_cli["CLI"] ~~~ _sock(["Socket"]) ~~~ _svc["Service"]
end
High-Level AssessmentThe following are alternative approaches to this PR: 1. Reuse existing serialization (JSON/msgpack) over socket
2. Use an existing multiplexer (tmux/screen) instead of custom attach
3. Drive local attach through SignalR loopback
Recommendation: The PR’s approach (simple length-prefixed frames over an owner-only local socket) is the best fit for Phase 1: it stays AOT-safe, keeps latency low, and cleanly isolates local attach from server contracts. The key follow-up is ensuring long-term protocol evolution stays append-only (FrameType discipline) and that sink overflow/force-detach behavior remains well-documented and stable. File ChangesEnhancement (20)
Tests (7)
Documentation (3)
|
| static (string vendor, WorkLocation work, string cwd, string[] args, ushort cols, ushort rows) ParseSpawn(byte[] p) { | ||
| var o = 0; | ||
| var work = (WorkLocation)p[o++]; | ||
| var cols = Be16(p, o); o += 2; var rows = Be16(p, o); o += 2; | ||
| var vendor = ReadLp(p, ref o); var cwd = ReadLp(p, ref o); | ||
| var n = BinaryPrimitives.ReadInt32BigEndian(p.AsSpan(o)); o += 4; | ||
| var args = new string[n]; | ||
| for (var i = 0; i < n; i++) args[i] = ReadLp(p, ref o); | ||
| return (vendor, work, cwd, args, cols, rows); |
There was a problem hiding this comment.
1. Spawn frame oom risk 🐞 Bug ⛨ Security
FrameCodec.ParseSpawn allocates new string[n] from an on-wire value without validating n against the payload size or a sane upper bound, so a local socket client can force an OOM/crash by sending a Spawn frame with a huge arg count. The length-prefixed string reader (ReadLp) also lacks bounds checks, so malformed payloads can overrun offsets and throw non-protocol exceptions.
Agent Prompt
## Issue description
`FrameCodec.ParseSpawn()` trusts the arg-count field from the wire (`n`) and allocates an array of that size without validating it against remaining payload bytes or a reasonable maximum. A malformed/malicious local client can send a small payload with a very large `n` and trigger a huge allocation (OOM) before any further parsing occurs.
## Issue Context
The local control socket is a privileged local interface (0600), but it is still untrusted input from any process running as the same user. A single bad frame should fail fast with a bounded error, not take down the daemon.
## Fix Focus Areas
- src/Capacitor.Cli.Core/LocalIpc/FrameCodec.cs[80-88]
- src/Capacitor.Cli.Core/LocalIpc/FrameCodec.cs[109-110]
## Suggested fix
- Add explicit bounds checks before reading each field:
- Ensure `o + 4 <= p.Length` before reading any 32-bit length.
- Validate `n >= 0` and cap it to a small maximum (e.g. 4096) *and* validate `n` is plausible given remaining bytes (at minimum `remaining >= n * 4` for the per-arg length prefixes).
- In `ReadLp`, validate `len >= 0` and `o + len <= p.Length` before calling `Encoding.UTF8.GetString`.
- On violation, throw `InvalidDataException` (protocol error) rather than allowing OOM or `ArgumentOutOfRangeException`/`OutOfMemoryException` to surface.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Fixed in 75c6a23. ParseSpawn now validates the wire arg-count against a cap (4096) and remaining bytes before allocating, and ReadLp bounds-checks every length prefix — a malformed/oversized local frame is now a clean InvalidDataException (caught + logged by the socket handler), not an OOM or ArgumentOutOfRangeException. Added two codec tests (bogus arg-count, truncated string length).
| var sink = new LocalSocketSink(capacity: 4096, (chunk, _) => Send(LocalFrame.Stdout(chunk))); | ||
| lock (agent.SinksLock) agent.LocalSinks.Add(sink); | ||
|
|
||
| try { | ||
| // Bounded replay BEFORE any live chunk so the client paints a coherent screen. | ||
| await Send(FrameCodec.Attached(agent.Id, agent.OutputBuffer.Snapshot())); | ||
| var pump = sink.RunAsync(ct); | ||
|
|
||
| try { | ||
| while (!ct.IsCancellationRequested) { | ||
| var f = await FrameCodec.ReadAsync(stream, ct); | ||
| if (f is null || f.Type == FrameType.Detach) break; | ||
|
|
||
| switch (f.Type) { | ||
| case FrameType.Stdin: await agent.Process.WriteAsync(f.Bytes); break; | ||
| case FrameType.Resize: ApplyResizeClamp(agent, sink, f.Cols, f.Rows); break; | ||
| } | ||
| } | ||
| } catch (Exception ex) when (ex is EndOfStreamException or IOException or OperationCanceledException) { | ||
| /* client gone */ | ||
| } finally { | ||
| sink.Complete(); | ||
| await pump.ConfigureAwait(false); | ||
| } |
There was a problem hiding this comment.
3. Sink detach not propagated 🐞 Bug ≡ Correctness
When LocalSocketSink overflows it sets Detached and stops accepting output, but AttachClientLoopAsync never reacts to sink.Detached and keeps processing client stdin/resize frames until the client disconnects or explicitly detaches. This can leave an attach session running after its output path is dead, so the user may continue interacting without a valid terminal view.
Agent Prompt
## Issue description
`LocalSocketSink` can become `Detached` (queue overflow or send failure), but the attach session loop never checks that state and does not proactively end/notify the client. This breaks the intended “force-detach and replay on rejoin” behavior and can lead to blind input.
## Issue Context
- Overflow sets `Detached = true` and completes the sink channel, but does not close the socket.
- The attach loop terminates only on explicit `Detach`, EOF, or cancellation.
## Fix Focus Areas
- src/Capacitor.Cli.Daemon/Services/LocalSocketSink.cs[29-36]
- src/Capacitor.Cli.Daemon/Services/AgentOrchestrator.LocalIpc.cs[90-123]
## Suggested fix
- Create a per-session CTS (linked to `ct`) and pass it to both `sink.RunAsync(...)` and the input loop.
- Monitor sink termination/detach:
- e.g. start a task that awaits `pump` and if `sink.Detached` becomes true, attempt to send a terminal frame (Error or Detach) and then cancel the session CTS to break the input loop.
- Consider removing detached sinks from `agent.LocalSinks` immediately when they detach, to avoid accumulating inert sinks until the client eventually disconnects.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Fixed in 75c6a23. When a sink force-detaches (overflow/send failure) its pump completes with Detached set; a continuation now cancels the input loop's CTS so AttachClientLoopAsync stops reading client stdin, removes the sink, and sends an Error frame telling the client to reattach (a fresh kcap attach replays from a clean frame) — no more blind input. The agent-self-exit path (ExitedCts) already broke the loop the same way.
Spec for letting a human attach their local terminal to a daemon-hosted agent as one of N clients (alongside the web UI), enabling pair programming and start-local-continue-remotely. Daemon owns the PTY; local terminal attaches over a new owner-only local socket, teammates over existing SignalR. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses five findings verified against current daemon code: - Drop the "offline-capable" claim: daemon requires ServerUrl + SignalR (DaemonConfig.Validate / DaemonRunner). Phase 1 = no new server CONTRACT, not offline; run-agent stays off the offlineCommands list. - Owned-worktree vs borrowed-cwd distinction: cleanup must never remove a borrowed cwd or its branch (CleanupAgentAsync -> WorktreeManager.RemoveAsync does Directory.Delete / worktree remove --force / branch -D). Critical test. - Lossless per-sink terminal queues (never DropOldest, which reintroduced the AI-844 TUI corruption); overflow force-detaches that one client and replays on rejoin, without stalling the shared PTY loop or other sinks. - Resize arbitration: clamp the PTY to the smallest attached client (tmux). - Remote control (per decision): write-by-default once shared, mirroring hosted-agent run permissions; private-until-explicitly-shared; observe-only is the existing read-only sharing path. Blast-radius trade-off documented. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Five findings verified against current daemon code, all correct: - PrivateLocal agent state: locally-launched agents suppress ALL server interaction (AgentRegistered/status/events/terminal/unregister; SignalR sink not attached) until an explicit Phase 2 share flips PrivateLocal->Shared. Resolves the private-until-shared vs unconditional-register/stream conflict. - Borrowed-cwd Prepare() must skip repo-mutating steps (no .mcp.json, settings.local.json, ~/.claude.json, ~/.codex trust); only read-only/idempotent preflight allowed. Test asserts in-place launch writes no repo/global files. - Launcher-agnostic passthrough as an IHostedAgentLauncher contract; Codex injects mandatory --cd/--no-alt-screen + hooks preflight, then verbatim args. run-agent codex works in v1. New "Vendor passthrough" subsection + conflict policy. - Terminal stream: "no silent partial-stream corruption" + bounded (2 MB OutputBuffer) replay, not "never drops bytes". - Resize: Phase 1 min-clamps local socket clients only; Phase 2 caveat that SignalR has only one agent-level resize (no per-web-client dimensions). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Four findings verified against current daemon code: - PrivateLocal deny-all: orchestrator makes NO per-agent ServerConnection call (register, run events, DaemonUpdateRepoPaths, LaunchFailed, status, run-stopped, EndAgentSession, reconnect re-register, heartbeat) — single guard + strict mock test that fails on any per-agent server method. - PrivateLocal second channel (spawned-agent hooks): launch env omits the hosted-agent vars (KCAP_AGENT_ID/KCAP_DAEMON_URL/KCAP_RENDERED_AGENT) so permissions prompt natively in the terminal and it is not web-controllable; KCAP_URL is kept so the session records normally (per decision — private means not-a-controllable-hosted-agent, not unrecorded). - Share does a one-time bounded OutputBuffer replay before live chunks: the reconnect path skips replay because the server keeps its own buffer, but a freshly-shared PrivateLocal agent the server never saw has none. - Add --share to the command-surface flags (Phase 2); clarify "private" in the security section does not mean unrecorded. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three findings verified against current daemon/hook code: - Env is fixed at execvp (UnixPtyProcess.cs:62); hooks read it at runtime (ClaudeHookCommand.cs:49, CodexHookCommand.cs:87, PermissionRequestCommand.cs:37,77). Removed the false "apply hosted-agent env on share" claim. Private launch now KEEPS KCAP_URL + KCAP_AGENT_ID and OMITS KCAP_RENDERED_AGENT + KCAP_DAEMON_URL; permission prompts stay native for the agent's whole life (accepted, can't change). - Tag-and-link (per decision): KCAP_AGENT_ID is a tag only (sets agent_host_id, no server call — deny-all holds), so the recorded transcript is link-ready; share registers that same id and the server late-links the pre-tagged session. Server must tolerate agent_host_id for an as-yet-unregistered agent (Phase 2 contract). - Codex mandatory passthrough flags: reject a user-supplied duplicate of --cd / --no-alt-screen with a clear error rather than relying on arg-precedence. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bite-sized TDD plan for the local-only "tmux for your agent" foundation: frame codec, daemon local socket, N-client lossless fan-out, PrivateLocal deny-all + owned-vs-borrowed cleanup + conditional env, raw-mode CLI client (run-agent/attach/ls), launcher-agnostic passthrough. Six milestones (A-F), each task TDD with exact files, code, test commands, and AOT gates. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…rator Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…overflow Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…min-clamp Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d cwd Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gh BuildArgs Borrowed cwd launches skip repo-mutating Prepare steps (Claude overlay/.mcp.json/ trust; Codex overlay/trust — read-only hooks preflight still runs). New BuildPassthrough on IHostedAgentLauncher: Claude verbatim; Codex injects mandatory --cd/--no-alt-screen and rejects user duplicates of them. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d + conditional env HandleLocalSpawnAsync runs a local agent PrivateLocal (IsPrivate) in an owned worktree or borrowed cwd, with passthrough args. Deny-all: every per-agent ServerConnection call (status, run events, repo-path, launch-failed, end-session, reconnect re-register, heartbeat, unregister) is guarded by !IsPrivate. Hook env omits KCAP_RENDERED_AGENT/ KCAP_DAEMON_URL (native terminal permissions), keeps KCAP_AGENT_ID (tag-and-link) + KCAP_URL (recording), and re-adds ANTHROPIC_API_KEY (survives the headless scrub). Strict-tripwire test proves no server call fires for a private agent. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…test Socket round-trip test exercises LocalControlServer accept loop + frame routing over a real Unix domain socket (List -> AgentList). Full real-PTY spawn-over-socket is covered by the deterministic deny-all/env test (spawn+attach logic) plus manual smoke (E4), kept out of the automated suite to avoid forkpty-in-a-multithreaded-test flakiness. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…layout-agnostic) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ll, detach, replay) LocalAgentClient connects to the daemon socket, raw-modes the terminal, pumps stdin<-> stdout, polls window size for resize (SIGWINCH isn't in PosixSignal), and intercepts the detach sequence (Ctrl-Q d) via a unit-tested DetachScanner. RunAgentCommand parses args, ensures a daemon is running (auto-start + wait), and wires run-agent/attach/ls + --detached. Interactive paths are build-verified; they need manual smoke (TTY + live daemon). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
f1ebaf7 to
ae45697
Compare
System.Console on Unix re-cooks the terminal, defeating raw mode (double echo, line buffering, LF-not-CR on Enter so the agent never submits, missing cursor). Read stdin and write stdout through read(0)/write(1) directly; touch the Console size getter once before enabling raw mode so .NET's lazy init can't clobber it afterward. Window size still uses Console.WindowWidth (read-only; ioctl is unreliable via P/Invoke on macOS ARM64). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The attach loop was blocked on the client's stdin read and never noticed the agent exiting on its own, so it never flushed the final output or sent Exited — the local client hung. Add a per-agent ExitedCts tripped in CleanupAgentAsync; the attach loop's read is linked to it, so on agent exit it breaks, drains the sink, and sends Exited. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…oritative
Visibility & sharing live entirely on the server/web (AgentRegistered carries no
visibility field; no 'share' concept in CLI/daemon). Reframe Phase 2: a local agent
registers like a UI-launched one and inherits the identical owner-only-until-shared
model — owner sees/drives it from their own web UI immediately ('remote control'),
teammates only via the existing UI share. This drops the invented 'private-until-shared'
model, the bespoke kcap share / --share, and tag-and-link (registers from the start).
Phase 1 code: omit KCAP_AGENT_ID so a local agent records as a plain local session (no
agent_host_id for an unregistered agent); remove the --share flag (sharing is a UI action,
future kcap share CLI tracked as AI-861).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Update — rebased on
Manual smoke confirmed: clean input, cursor, |
…h (Qodo #155) - FrameCodec: validate the wire arg-count against a cap (4096) and remaining bytes before allocating, and bounds-check every length prefix in ReadLp — a malformed local frame is now a clean InvalidDataException, not an OOM or ArgumentOutOfRangeException. - AttachClientLoopAsync: when a sink force-detaches (overflow/send failure) its pump completes with Detached set; cancel the input loop and send an Error frame so the client stops typing blind and reattaches for a fresh replay (the intended force-detach behaviour). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…omic replay, resize) - UnixPtyProcess: unset KCAP_AGENT_ID/RENDERED/DAEMON_URL in the PTY child before applying extraEnv, so a private local agent can't inherit hosted-agent identity/routing from a daemon that was itself launched inside a kcap-tracked session (P1). - HandleLocalSpawnAsync: remove a daemon-created (--worktree) worktree if Prepare / passthrough / spawn fails after creation — no leaked worktrees/branches (P1). - Snapshot+subscribe is now atomic with the read loop's append+fan-out (both under SinksLock), so a chunk can't be both replayed and sent live (no duplicate on attach) (P2). - Resize clamp recomputed on detach too, so a departing smaller client no longer pins larger clients to its size (P2). - Spec: clarify Scope that web client/visibility is Phase 2; Phase 1 is terminal-only (P2 docs). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ign stale docs - LocalAgentClient: distinguish a clean end (explicit detach, stdin EOF, Exited, or Error) from an unexpected daemon/socket drop. A drop now prints 'lost connection to the daemon' and returns exit 1 instead of looking like a clean exit (0). - Docs: spec testing bullet and the plan's D4 task no longer claim KCAP_AGENT_ID is kept for Phase 1 — it's omitted (plain local session, no agent_host_id tag), matching the code. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Phase 1 of "local terminal attach" — start a daemon-hosted coding agent from your own terminal, drive it live, detach without killing it, and re-attach later ("tmux for your agent"). Entirely local; no new server contract. Web visibility (registering the agent so you can see/drive it from the web UI) is deferred to Phase 2.
Design + plan are on the branch:
docs/superpowers/specs/2026-06-13-local-attach-hosted-agent-design.mddocs/superpowers/plans/2026-06-14-local-attach-phase1.mdWhat's new
kcap run-agent <vendor> [kcap flags] -- [agent args]— auto-starts the daemon, spawns the agent, attaches your terminal.--worktree(default: in-place in your cwd),--name,--detached. Everything after--is forwarded to theclaude/codexCLI verbatim (launcher-agnostic passthrough; Codex injects mandatory--cd/--no-alt-screenand rejects user duplicates).kcap attach <id>/kcap ls— re-attach to / list daemon-hosted agents. Detach withCtrl-Q d.How it works
0600) on the daemon; length-prefixed, AOT-safe frame codec shared by CLI + daemon (with bounds-checked parsing — a malformed frame is a clean protocol error, never an OOM).DropOldest, which reintroduced AI-844 corruption; never stalls the shared PTY loop). Snapshot+subscribe is atomic with the read loop, so attach never duplicates live output.PrivateLocal: a deny-all guard means no per-agent server call (register/status/events/repo-path/heartbeat/finalize/unregister/reconnect), enforced by a strict-tripwire test. They record as a plain local session: hook env keepsKCAP_URLand re-addsANTHROPIC_API_KEY, and omits the hosted-agent vars (KCAP_AGENT_ID,KCAP_RENDERED_AGENT,KCAP_DAEMON_URL) — so the agent isn't tagged as a hosted agent and permissions prompt natively in the terminal. (UnixPtyProcessalso unsets those in the PTY child so nothing leaks from the daemon's own env.)Prepare()(the catastrophic-rm -rf-your-repo path is the top safety invariant + has a dedicated test); a failed--worktreelaunch cleans up the worktree it created.System.Console, which re-cooks the tty on Unix). Resize min-clamps the PTY across attached local clients (tmux semantics), recomputed on attach/detach/resize. Agent self-exit (e.g./exit) cleanly tears the client down.Test plan
LocalControlServer)claudesession — clean interactive I/O + cursor,Ctrl-Creaches the agent,Ctrl-Q ddetaches (agent keeps running),kcap attachrepaints,/exitquits cleanly.Review follow-ups (all addressed; see thread + commits)
--worktreecleanup; atomic snapshot/subscribe; resize reclamp on detach; spec Scope/Phasing consistency.Notes / scope
run-agent/attach/lsno-op with a message on Windows).ServerUrl+ SignalR as today.kcap shareCLI convenience is tracked as AI-861). Web clients attach via the existing SignalR fan-out; first-web-subscribe one-time replay; web-client resize aggregation; Windows support.🤖 Generated with Claude Code