Skip to content

feat(buzz-acp): steering as the default mid-turn mention delivery#1160

Open
tlongwell-block wants to merge 6 commits into
mainfrom
eva/steering-default-mention
Open

feat(buzz-acp): steering as the default mid-turn mention delivery#1160
tlongwell-block wants to merge 6 commits into
mainfrom
eva/steering-default-mention

Conversation

@tlongwell-block

@tlongwell-block tlongwell-block commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

What

Delivers a new mid-turn mention to an agent without ending its turn, on agents that support it — and falls back cleanly to cancel+merge on agents that don't.

When an eligible author (the agent's owner, an allowlisted user, or a cryptographically-verified sibling bot) @mentions the agent while it's working, that message is folded into the live turn via a non-standard ACP steering extension — a mid-turn "inject and continue" method that today exists only as a vendor extension (_goose/unstable/session/steer). The agent keeps working and weaves the new message in; the turn is not cancelled or restarted. On any agent that doesn't implement the extension — detected behaviorally, never by capability flag — the same delivery path falls back to cancel + merge + steering-framing, which works on every ACP agent today.

This PR carries both halves of that delivery:

  • Client side (buzz-acp): detect steer support, target the live run, route the mention through the steer extension with a clean fallback.
  • Server side (buzz-agent): implement the steer method so buzz-agent's own agent can receive a mid-turn steer and fold it into its in-flight prompt.

Per Tyler's directive, both land here on #1160, not a follow-up.

Standardization intent

ACP has no standard mid-turn "inject and continue" primitive — the standard gives a client exactly one mid-turn signal, session/cancel, and it is terminal. The steering capability this PR uses originated as a non-standard extension in goose (_goose/unstable/session/steer). We think it should be a standard, and we intend to propose it as one.

With this PR, buzz-agent becomes the second known implementation of that extension — the first independent one outside goose. Two interoperating implementations of the same wire contract is exactly the evidence a standards proposal needs: it demonstrates the method is implementable, useful, and not goose-specific. Until that proposal lands, we treat the method as what it is — a non-standard extension — and detect it behaviorally rather than assuming it.

Why this shape

The design is: use the steer extension where an agent implements it, fall back to cancel+merge where it doesn't. Detection is behavioral (try-and-tolerate: method_not_found / invalid_params → fall back), not name- or capability-gated — the extension advertises no capability field, so a flag-based gate would be a lie. Because buzz-agent implements the identical wire contract goose does, a single client-side delivery path serves both agents — which is the point: the contract is shared, not vendor-locked.

Lanes

Lane A — client-side steer (buzz-acp). Steer-support detection, activeRunId parse from session/update _meta.goose.activeRunId (the extension's run-targeting field), the try-and-tolerate plumbing, and the AcpClient read-loop seam that routes a steer around the supervisor's cancel select! so the in-flight prompt future is never dropped. Landed as ff46007d.

Lane B — server-side steer (buzz-agent). Landed (528280bf). Implements the steer method end-to-end:

  • Advertises a per-prompt activeRunId via params._meta.goose.activeRunId on session/update so steer-capable clients can target the live run.
  • Validates steer requests against the extension's contract (non-empty prompt, active run present, expectedRunId matches the live run; otherwise invalid_params → client falls back).
  • Queues accepted steers per session and drains them at each run-loop round boundary, appending as user turns. The turn continues; it is not restarted.
  • Tests (tests/fake_llm.rs): steer folds into a live turn (end_turn, not cancelled; steered text reaches the provider); rejected on no active run, run-id mismatch, and empty prompt.

Baseline — cancel+merge default + steering framing. The original three commits (cbb2f787, a01df096, c4cd4c20): a MultipleEventHandling::Steer variant (now the default, was queue), ControlSignal::Steer with a cancel+merge requeue that stamps a CancelReason and renders steer-vs-interrupt framing. This is the fallback path for agents that don't implement the extension, and the behavioral floor.

Steer seam (AcpClient)

The seam routes a steer around the supervisor's cancel select! so the in-flight prompt future is never dropped:

  1. mpsc::channel(1) + try_send; full/closed → fallback ControlSignal::Steer.
  2. AcpClient.steer_rx: Option<Receiver> — installed at dispatch, take()d on prompt entry, dropped on scope exit; reuse/heartbeat tests pin the no-stale-receiver invariant.
  3. TaskMeta { ..., steer_tx: Option<Sender<SteerRequest>> } — stored outside the moved OwnedAgent, same ownership pattern as control_tx.
  4. Mode-gate fork (lib.rs): ControlSignal::Steerpool.send_steer() + ack watcher; everything else → signal_in_flight_task.
  5. Ack watcher forwards SteerAck { channel_id, event_id, result } to the main loop only (no concurrent pool/queue mutation): success → drop the withheld mention so it isn't re-delivered post-turn; failure → signal_in_flight_task(.., ControlSignal::Steer) (cancel+merge fallback, never Cancel, which would drop the batch); unknown-delivery-after-completion → leave queued, normal dispatch handles it.
  6. Read loop reshape: select! over sleep_until / reader.next() / steer_rx.recv(), preserving pre-sleep idle classification. Steer write bounded by remaining hard deadline; idle resets only on a successful steer write, hard deadline stays absolute.

Eligibility / security

Author eligibility (owner ∪ allowlist ∪ verified siblings) is enforced upstream by the inbound author gate (author_allowed) — ineligible authors are dropped before the mode gate, so they can never steer. "Sibling" is a cryptographically-verified same-owner claim via the NIP-OA auth tag (is_owner_or_sibling), not a heuristic. The steer channel is an injection surface, so this authenticated definition is load-bearing.

Under the default respond-to=owner-only, the eligible set is owner ∪ siblings; allowlisted users are included only under --respond-to=allowlist. Widening the default gate would widen the inject surface for every deployment, so allowlist stays opt-in.

Behavior change

Flips the global default from queue to steer. On agents that implement the extension (today: goose, and now buzz-agent) the mention folds into the live turn; on other agents it falls back to cancel+merge with steering framing. Deliberate, per the "make steering the default" requirement.

Compatibility note

The steer extension is a recent capability: it shipped in goose 1.38.0. On older goose (and any agent that doesn't implement it), the client detects the absence behaviorally and falls back to cancel+merge with no lost events — slower path, identical correctness. A follow-up will tighten support detection and document the minimum version.

Files (9, buzz-acp + buzz-agent)

buzz-acp: config.rs (Steer variant + default flip + validation), pool.rs (ControlSignal::Steer requeue), queue.rs (CancelReason, steer-vs-interrupt framing), lib.rs (mode_gate_signal, steer detection + mode-gate fork), acp.rs (active_run_id parse + read-loop steer seam). buzz-agent: agent.rs, lib.rs, wire.rs (server-side steer method), tests/fake_llm.rs.

Tests

cargo test -p buzz-acp → 371/371 green under -D warnings (incl. the two read-loop steer tests). buzz-agent: steer folds-into-live-turn + three rejection cases. Steer-vs-interrupt framing, reason propagation, gate→signal mapping, config default + validation, queue→render e2e, and the OwnerOnly negative case (stranger dropped, owner+sibling admitted) all covered.

Live e2e: the full client→server steer round-trip was verified against a steer-capable agent (goose 1.38.0) over ACP — mid-turn steer accepted against the live activeRunId, steered content incorporated into the running turn, and the turn ended with EndTurn (a single turn, not a cancel + restart, and the mention was not requeued).

Reviewers

Dawn (framing wording), Perci (design correctness), Max (read-loop select! arm), Sami (Lane A owner).

npub1qyvc0c5kl4gqv2fd97fsk46tu378sqgy35vc83rvgfwne90sel7s0ed67d and others added 6 commits June 21, 2026 10:55
When a new, already-author-gated mention arrives while the agent is
mid-turn, deliver it by cancelling and re-prompting with a steering
framing: the message arrived while the agent was working, so it should
continue its in-progress task and weave the message in — rather than
treating it as a replacement request.

Builds on the existing cancel+merge machinery (--multiple-event-handling).
The eligible author set (owner ∪ allowlist ∪ siblings) is already enforced
by the inbound author gate, so every event reaching the mode gate is
steering-eligible.

- Add MultipleEventHandling::Steer; make it the default (was queue).
  Steer requires --dedup=queue (default), enforced by extracted
  validate_multiple_event_handling().
- Add ControlSignal::Steer and CancelReason {Interrupt, Steer}; carry the
  reason on FlushBatch from the pool cancel path through requeue into
  format_prompt.
- format_prompt selects wording via MergeFraming: steer vs interrupt.
- Extract the mode-gate decision into mode_gate_signal() for testability.

Tests: framing (incl. multi-event), reason propagation, gate mapping,
config default and dedup validation. 350 pass, clippy clean.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…or_header

Dawn's framing review: the steer prior-section header overclaimed captured
partial work, but session/cancel is terminal — the section actually holds the
original request. Swap to [What you were working on], honest about its contents.

Add coverage for the seams the split unit tests miss:
- queue->render end-to-end: real flush_next merge driven through the real
  format_prompt, asserting steer framing survives the whole path.
- default OwnerOnly author gate: a stranger is dropped (so it can never reach
  the mode gate to steer); owner and sibling are admitted. Pins the
  'ineligible author must NOT steer' invariant against the default mode.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
… steer; doc fix

Addresses Perci's clean-context review:
- FlushBatch::cancel_reason doc said None => Interrupt framing, but
  MergeFraming::for_reason(None) defaults to Steer. Fixed the doc, and
  aligned the defensive lib.rs fallback (unwrap_or) from Interrupt to Steer
  so both unset-reason paths agree on the gentler default. The path always
  sets a reason; this is purely defensive consistency.
- Pin the cross-thread edge: original work in thread A, steering message in
  thread B (same channel). The reply instruction targets the steering
  message (where the mentioner waits) while framing still says 'continue your
  in-progress work' — intended behavior, now asserted.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Implement the `_goose/unstable/session/steer` request so buzz-agent can
receive a mid-turn message and fold it into the in-flight prompt without
cancelling the turn — the server side of the steering delivery upgrade
that #1160 adds on the client side.

- Advertise a per-prompt `activeRunId` via `params._meta.goose.activeRunId`
  on session/update so steer-capable clients can target the live run.
- Validate steer requests the same way goose does: non-empty prompt,
  active run present, `expectedRunId` matches the live run — otherwise
  `invalid_params`, so the client falls back to cancel+merge.
- Queue accepted steers per session and drain them at each run-loop round
  boundary, appending them to history as user turns. The turn continues;
  it is not restarted.

Mirrors goose's wire contract exactly so a single client-side delivery
path serves both agents. Detection stays behavioral (try-and-tolerate),
not name- or capability-gated.

Tests (tests/fake_llm.rs): steer folds into a live turn (end_turn, not
cancelled; steered text reaches the provider), and is rejected on no
active run, run-id mismatch, and empty prompt.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…isor changes)

Lays the foundation for a goose-native steer follow-up to #1160 (Design B):
when an owner/allowed-user/sibling-bot tags an agent that's already working,
inject the mention mid-turn via goose's `_goose/unstable/session/steer`
instead of cancelling the in-flight turn.

This commit is intentionally scaffold-only — zero changes to the supervisor.
The `Either<Continue, Result>` extraction of `pool.rs:1185-1298` will land
in a follow-up commit after team review of the structural change.

## What's wired

- `AcpClient` gains `active_run_id: Option<String>`, seeded from `session/update`
  notifications of kind `session_info_update` by reading `_meta.goose.activeRunId`.
  Required because goose's steer endpoint rejects empty `expectedRunId`.
  Other agents leave the field `None` for the client lifetime.

- `handle_session_update` now takes `&mut self` (was `&self`) so the
  `session_info_update` arm can update `active_run_id`. Explicit null clears,
  string sets, missing/wrong-type are no-ops.

- `AcpClient::send_custom_request(method, params)` — public delegation to the
  private `send_request`. Lets callers dispatch agent-specific JSON-RPC methods
  (notably `_goose/unstable/session/steer`) without each one being a typed
  method in this crate. Method-not-found errors propagate cleanly so the
  supervisor can fall back to cancel+merge.

- `detect_goose_agent(init_result)` free function — conservative goose-gate
  via `agentInfo.name == "goose"` (or `serverInfo.name` as fallback).
  Per Eva's verification: goose advertises no `steer` capability field on
  `AgentCapabilities`; the `steer: true` markers in the codebase are
  outbound chunk-meta, not inbound capability advertisements. Detection is
  name-based; the try-and-tolerate fallback (method-not-found → cancel+merge)
  in the supervisor will widen the gate when other adapters expose
  compatible methods.

- `OwnedAgent` gains `supports_goose_steer: bool`, populated from
  `detect_goose_agent` at `spawn_and_init` time. Both the initial-spawn and
  respawn-recv `OwnedAgent` construction sites plumb it through.

- `spawn_and_init` return widened from `(AcpClient, u32)` to
  `(AcpClient, u32, bool)`. The third element is `supports_goose_steer`.
  `RespawnResult.result` widens to match. Pattern-match sites in the
  shutdown drain destructure with `_` for the new element until the
  supervisor needs it.

## Tests (+11, all green)

`detect_goose_agent_tests` (lib.rs, 6 tests):
- matches `agentInfo.name == "goose"`
- matches `serverInfo.name == "goose"` as fallback
- rejects other names (claude-code)
- rejects missing name field
- rejects non-string name (number)
- confirms `agentInfo` takes precedence over `serverInfo`

`acp::tests::active_run_id_*` (acp.rs, 4 tests):
- string value sets active_run_id
- explicit null clears active_run_id
- missing activeRunId field leaves state untouched (no clobber)
- non-string non-null value leaves state untouched

`acp::tests::send_custom_request_dispatches_method_and_params` (acp.rs, 1 test):
- verifies method name and params pass through verbatim onto the wire,
  including JSON-RPC envelope sanity (id is numeric, jsonrpc=2.0)

Full suite: `cargo test -p buzz-acp --lib` → 365 passed, 0 failed
(was 354 before this commit; +11 matches).

## What's NOT in this commit

- No `ControlSignal::Steer` handler changes in `pool.rs:1185-1298`.
- No feature flag / config wiring (`--steer-delivery`).
- No supervisor `Either<Continue, Result>` refactor — that's the next PR
  step, gated on team review of the structural extraction.

Dead-code warning on `supports_goose_steer` is expected and will clear when
the supervisor fork lands.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Completes the AcpClient seam for `_goose/unstable/session/steer` so that
the universal `MultipleEventHandling::Steer` signal takes the
non-cancelling goose-native path whenever the in-flight agent supports
it, falling back to cancel+merge whenever it doesn't. Implements the
contract locked by Perci/Max/Eva (unanimous, Option X): the main loop
carries only the steer prompt body; the read loop completes `sessionId`
and `expectedRunId` at write time so `expectedRunId` always matches the
freshest run goose has reported via `session/update`.

Pieces that became live in this commit (all dead-code in the prior
scaffold):

`SteerRequest { prompt_blocks, ack_tx }` — refactored from
`params: Value` to `prompt_blocks: Vec<String>`. The read loop now
builds the full JSON-RPC params via the new `build_steer_params` helper,
reading `self.active_run_id` at write time. If `active_run_id == None`,
the steer arm acks `SteerError::ExpectedRunIdMissing` and writes
nothing; the main loop maps this to the universal cancel+merge fallback.

`read_until_response_with_idle_timeout` now takes `session_id: &str`
lexically (both production call sites already had it in scope) so the
steer arm can fill the param without snapshotting outer state.

`try_native_steer` (lib.rs): the mode-gate fork. Builds the steer body
from `queue::native_steer_framing()` (Eva's drift-proof requirement:
header/closing pulled from `MergeFraming::for_reason(Some(Steer))` so
native and cancel+merge orient the agent identically) plus a single
event block rendered by `queue::format_event_block` (no batch-level
context — native steer is a pure delta into a live turn). On
`pool.send_steer == Ok(())`, synchronously marks the queued event as
withheld via `queue.mark_native_steer_pending` and spawns a watcher
that forwards the oneshot ack to the main loop's `steer_ack_rx`.

Main-loop wiring: new `SteerAckEvent` carrier struct, new
`PoolEvent::SteerAck` variant, new `steer_ack_rx` select arm. The
match arm implements the locked Success / Err-before-pending /
PromptCompletedNeutral semantics: Success drops the withheld event;
Err releases withheld AND issues the cancel+merge fallback; Neutral
releases withheld without firing the fallback (the in-flight turn
already ended).

Tests: two new lib tests in `acp::tests` proving the read-loop steer
arm. `native_steer_with_no_active_run_id_acks_expected_run_id_missing`
covers the None-run-id Err-before-pending path; the steer arm bails
without writing and acks `ExpectedRunIdMissing`.
`native_steer_with_active_run_id_routes_response_to_ack` proves the
happy path: with a set `active_run_id`, the steer write lands, the
response (id=0) is matched against `pending_steer`, and the ack arrives
as `Success`.

Also: stale doc-comment in `dispatch_pending` updated to point at the
actual mode-gate site (the main `select!` relay-event arm, not
`handle_prompt_result`).

Tests: `cargo test -p buzz-acp` → 371/371 pass under `-D warnings`
(369 prior + 2 new). `cargo check -p buzz-acp --tests` clean.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant