
[Spec 778] Swap the gemini consult lane to the Antigravity CLI (agy) #988

Open
waleedkadous wants to merge 45 commits into main from builder/spir-778

@waleedkadous (Contributor)

Summary

Google retires Gemini-CLI subscription serving (Pro/Ultra/free) on 2026-06-18. This swaps the gemini consult lane's backend from the retired Gemini CLI to the Antigravity CLI (agy) — a lean backend swap, not a redesign.

  • Single, OAuth-only backend: agy cannot take an API key (verified), so there is no separate Gemini Developer API backend.
  • Agentic file reading preserved: agy --print --sandbox --add-dir <dir> (proven live: the e2e reads a planted file through the consult -m gemini front door).
  • agy's default model, no Pro pin: the model identifier stays gemini everywhere and the pro alias is retained; only the backend binary changed.
  • Non-blocking skip: a missing, unauthenticated, IDE-symlink-stub, or timed-out agy emits VERDICT: COMMENT, so porch's allApprove treats it as non-blocking and phases still advance 2-way (the core failure this guards against).
  • Real-binary resolution: resolveAgyBin() rejects the Antigravity IDE's agy symlink (by realpath) and prefers the real headless CLI, with a CODEV_AGY_BIN override.
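
The non-blocking skip contract in the bullets above can be sketched as follows. This is a minimal illustration, assuming a verdict parser that takes the last VERDICT line and falls back to REQUEST_CHANGES when none is present (the commit notes say verdict.ts defaults that way). The function bodies and the skip message wording are assumptions, not the code in consult/index.ts or verdict.ts.

```typescript
type Verdict = "APPROVE" | "COMMENT" | "REQUEST_CHANGES";

// Assumed parsing rule: the last "VERDICT: X" line wins; no verdict means REQUEST_CHANGES.
function parseVerdict(review: string): Verdict {
  let verdict: Verdict = "REQUEST_CHANGES";
  for (const line of review.split("\n")) {
    const match = /^VERDICT:\s*(APPROVE|COMMENT|REQUEST_CHANGES)$/.exec(line.trim());
    if (match) verdict = match[1] as Verdict;
  }
  return verdict;
}

// Only REQUEST_CHANGES blocks; COMMENT (including a skipped lane) does not.
function allApprove(reviews: string[]): boolean {
  return reviews.every((review) => parseVerdict(review) !== "REQUEST_CHANGES");
}

// Hypothetical skip artifact: the wording is invented, the VERDICT line is the contract.
function agySkipContent(reason: string): string {
  return `gemini (agy) consultation skipped: ${reason}\n\nVERDICT: COMMENT\n`;
}
```

Under these assumptions a skipped gemini lane plus two approvals advances, while a genuine REQUEST_CHANGES from any other reviewer still blocks.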

Closes #778

Changes

  • consult/index.ts: gemini lane dispatches to runAgyConsultation (--print --sandbox --add-dir, role folded into the prompt); OAuth-marker and timeout detection → non-blocking COMMENT skip; resolveAgyBin / isRealAgyCli / agyRespondsToVersion.
  • consult/usage-extractor.ts: graceful cost/usage degradation for agy's plain-text output (no NaN).
  • doctor.ts: agy presence + streaming verifyAgy() (authed / needs-login / timeout) with current install guidance.
  • Docs + skeleton synced to the agy backend (CLAUDE.md, AGENTS.md, README.md, consult.md, DEPENDENCIES.md, codev.md, arch.md, consult SKILL.md); codev/ copies are byte-identical to their skeleton twins.
  • Tests: guarded real-agy e2e (front-door binary + agentic read), a porch-orchestrated progression test driving next() (skip → advance 2-way; real REQUEST_CHANGES still blocks), and unit-level skip-contract tests.
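
As an illustration of the graceful usage degradation listed above, here is a minimal sketch. It assumes usage is only recoverable when a lane emits a JSON object with numeric fields; the field names (inputTokens, outputTokens, costUsd) and both function names are hypothetical and not taken from usage-extractor.ts.

```typescript
interface UsageRow {
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

function isFiniteNumber(value: unknown): value is number {
  return typeof value === "number" && Number.isFinite(value);
}

// Returns null instead of a partially filled row, so NaN never reaches a cost table.
function extractUsageSketch(output: string): UsageRow | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(output);
  } catch {
    return null; // plain text (for example agy output): no usage data
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const { inputTokens, outputTokens, costUsd } = parsed as Record<string, unknown>;
  if (!isFiniteNumber(inputTokens) || !isFiniteNumber(outputTokens) || !isFiniteNumber(costUsd)) {
    return null;
  }
  return { inputTokens, outputTokens, costUsd };
}

// Rows without usage contribute nothing to the total.
function totalCostUsd(rows: Array<UsageRow | null>): number {
  return rows.reduce((sum, row) => sum + (row ? row.costUsd : 0), 0);
}
```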

Testing

  • Full unit suite green (3217 passing, 13 skipped — the guarded real-agy e2e cases).
  • Guarded e2e verified live against real agy: it read a planted file and returned the codeword through consult -m gemini.
  • Headline path exercised live: consult -m gemini --type spec|plan returned real reviews (COMMENT / APPROVE).
  • Model-identifier audit: gemini unchanged in MODEL_CONFIGS, VALID_MODELS, the skeleton protocol-schema.json enum, and all protocol-JSON defaults.

Out of scope (by design)

The Gemini-CLI builder harness (harness.ts + README CLI-flag/config examples) stays on the retired CLI per the approved spec — tracked as a follow-up.

Spec / Plan / Review

  • Spec: codev/specs/778-gemini-cli-antigravity-cli-jun.md
  • Plan: codev/plans/778-gemini-cli-antigravity-cli-jun.md
  • Review: codev/reviews/778-gemini-cli-antigravity-cli-jun.md

Address iter-1 3-way consultation:
- Gemini (fatal): consult prompts rely on filesystem access; API lane must inline
  review content (A1) or run a tool-use loop (A2). Removed wrong single-shot framing.
- Codex (fatal): define porch-safe non-blocking skip (verdict.ts defaults to
  REQUEST_CHANGES); relax doctor unrestricted-key detection; scope other surfaces.
- Both: resolve enterprise contradiction (API default, CLI retained as optional backend).
- Claude: @google/genai already a dep; keep gemini in defaults; add pro-alias test.

Architect rejected the Gemini Developer API pivot at the spec-approval gate and
directed swapping the gemini consult lane to the Antigravity CLI (agy), with:
preserve agentic file-reading, keep the Pro model, subscription/OAuth (AI Ultra)
auth, keep the porch-safe non-blocking skip + graceful cost degradation, stay lean.

Spec rewritten around the empirically-verified agy contract (v1.0.4):
- headless via 'agy --print'; file-reading via '--sandbox --add-dir' (no
  --dangerously-skip-permissions needed); OAuth/subscription auth.
- Footgun documented: PATH 'agy' is an IDE symlink, not the CLI -> Codev must
  resolve the real binary deterministically.
- Open: Pro-pinning mechanism (no --model flag); gate re-presentation mechanics.

Docs research: Antigravity CLI defaults to Gemini 3.5 Flash (High); Pro is
selected via the interactive /model slash command (no --model flag, no obvious
--print equivalent). A naive 'agy --print' would silently use Flash, violating
the architect's keep-Pro requirement. Sharpened the Pro-pinning open question
(default = wrong model) and raised the corresponding risk to High probability;
acceptance must positively confirm Pro served the review.

Architect decided against Pro-pinning to keep the swap lean. The gemini lane will
use agy's default model (currently Gemini 3.5 Flash). Removes the only critical
open question (no --model handling). Documented as an explicit accepted tradeoff
(Flash < Pro for review depth); risk downgraded to Low-impact accepted; success
criteria / test scenarios adjusted accordingly.

Codex REQUEST_CHANGES, Gemini COMMENT, Claude REQUEST_CHANGES (all addressed):
- Fix stale 'lane uses Pro model class' in Desired State (unanimous; missed when
  the don't-pin decision was applied).
- State the model identifier stays 'gemini' (no rename) across all config surfaces.
- Adopt the concrete non-blocking skip: lane emits 'VERDICT: COMMENT' when agy is
  unavailable (verdict.ts treats COMMENT as non-blocking; verified).
- Fast auth-skip (kill child on OAuth-URL detection), binary-resolution rejection
  rule (reject IDE stub -> skip, never launch IDE), Codev-owned timeout.
- Adapt extractReviewText gemini branch (JSON.parse -> raw output for agy plain text).
- Follow the hermes precedent (role inlined, temp-file >100k -> also handles E2BIG).
- Keep 'pro' alias as-is; note harness.ts GEMINI_HARNESS is untouched/distinct.
- Add porch-orchestrated progression test. Rebuttal in iter2-rebuttals.md.

Architect approved the spec-approval gate and added a requirement (reverses
'API out of scope'): the gemini lane supports BOTH backends co-equally —
- agy/OAuth: agentic file-reading, default model (Flash), cheap (keeps iter-2 work).
- Gemini Developer API/GEMINI_API_KEY: inline content, Pro model
  (gemini-3.1-pro-preview), real cost rows from usageMetadata, CI-friendly.
- Selector consult.gemini.backend: agy|api|auto — mechanism + auto-precedence
  (cost-vs-quality tradeoff) to be designed in the Plan and flagged for architect.

Updated Out-of-Scope, Desired State, Success Criteria, Test Scenarios; added the
Amendment A1 section. HOW (dispatch paths + selector) deferred to the Plan.

4 lean phases for the approved spec + Amendment A1:
1. agy_backend - agy --print --sandbox --add-dir, verified binary resolution,
   role inlined (hermes precedent), plain-text/graceful-cost, Codev-owned timeout,
   fast COMMENT-skip; agy doctor check.
2. api_backend - @google/genai gemini-3.1-pro-preview, inlined content,
   usageMetadata cost rows, GEMINI_API_KEY env auth (CI-friendly), COMMENT-skip.
3. backend_selector - consult.gemini.backend agy|api|auto; auto precedence
   proposed + flagged for architect (cost-vs-quality), not hard-coded.
4. docs_skeleton_e2e - doctor, docs/skeleton (model id stays gemini), e2e both
   backends + porch-progression test.

Passes plan_exists / has_phases_json / min_two_phases.

Gemini APPROVE, Codex REQUEST_CHANGES, Claude COMMENT (all addressed):
- Added Cross-Cutting Implementation Contracts: backend-aware dispatch + parsing
  (both backends are 'gemini' -> thread backend through extractReviewText/
  extractUsage; old stats.models path removed); consult.gemini.backend is a NEW
  top-level config key (not under porch.consultation); no config migration
  (missing -> default auto); dual-dispatch sub-branching.
- Phase 4: doctor operational-model counting treats API-only Gemini as operational.
- Fixed test paths to packages/codev/src/__tests__/ (+ /cli/ e2e); noted
  packages/codev/tests/e2e|unit don't exist; protocol-schema enum is skeleton-only.
Rebuttal: 778-plan-iter1-rebuttals.md.

Architect correction: agy v1.0.4 is OAuth-only (no API-key auth; verified: no
login subcommand, no key flag, no *_api_key env var; upstream feature request
open, not shipped). The separate Gemini Developer API backend is unwanted AND
unbuildable, so:
- Spec: removed Amendment A1 (dual-backend); restored the clean single-agy
  Approach-B spec (state at 92527c5). Amendment preserved in history at 7ef541f.
- Plan: rewritten to single-backend — 2 phases (agy_backend, docs_skeleton_e2e);
  dropped api_backend + backend_selector. Kept the agy-relevant iter-1 fixes
  (plain-text extractReviewText adaptation, corrected src/__tests__ paths, doctor
  operational counting, COMMENT non-blocking skip = the CI/headless story).
No API key anywhere.

… the Antigravity CLI (agy)

Replaces the retiring Gemini CLI with agy (OAuth/subscription, agentic file-reading):
- runAgyConsultation: agy --print --sandbox --add-dir <root>, reviewer role folded
  into the prompt (hermes precedent), large-prompt temp-file fallback, plain-text
  output. Codev-owned timeout; fast non-blocking COMMENT skip on OAuth prompt /
  missing-or-invalid binary (verdict.ts treats COMMENT as non-blocking) so porch
  runs still advance — the CI/headless story.
- resolveAgyBin/isRealAgyCli: deterministic binary resolution that rejects the IDE
  symlink by realpath (never launches the IDE); CODEV_AGY_BIN override escape hatch.
- usage-extractor: gemini lane is plain text -> usage degrades to null (no NaN
  cost rows); removed the dead Gemini-CLI stats.models JSON path.
- doctor: agy presence (resolveAgyBin) + OAuth-aware auth probe + operational-model
  counting so an agy-only setup counts as operational; official install hint.
- Model identifier stays 'gemini' (no rename); 'pro' alias kept.
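
The realpath-based rejection described in this list could look roughly like the sketch below. It is an illustration under assumptions: the candidate list, the isIdeTarget predicate, and the fall-through behaviour when the override is unusable are all hypothetical, and the --version behavioural check added in a later commit is omitted.

```typescript
import { realpathSync } from "node:fs";

type Realpath = (p: string) => string;

// Resolve symlinks, returning null when the path does not exist.
function tryRealpath(candidate: string, realpath: Realpath): string | null {
  try {
    return realpath(candidate);
  } catch {
    return null;
  }
}

// Sketch: an explicit CODEV_AGY_BIN override is tried first, then the other candidates.
// A candidate is rejected when its resolved target is the IDE launcher.
function resolveAgyBinSketch(
  candidates: string[],
  isIdeTarget: (resolved: string) => boolean, // hypothetical predicate
  env: Record<string, string | undefined> = process.env,
  realpath: Realpath = (p) => realpathSync(p),
): string | null {
  const override = env.CODEV_AGY_BIN;
  const ordered = override ? [override, ...candidates] : candidates;
  for (const candidate of ordered) {
    const resolved = tryRealpath(candidate, realpath);
    if (resolved !== null && !isIdeTarget(resolved)) return resolved;
  }
  return null; // the caller would emit the non-blocking skip rather than launch the IDE
}
```

Injecting realpath and the predicate keeps the decision testable without touching the real filesystem.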

Tests: agy dispatch/sandbox-safety/role-fold/large-prompt/plain-text/OAuth-skip,
binary-resolution unit tests, updated metrics gemini tests, doctor agy timeout.
consult 46/46, doctor 16/16, metrics-gemini 8/8, config 26/26 green; type-clean.

Note: better-sqlite3 MetricsDB tests are blocked locally by a missing Xcode CLT
(node-gyp can't build the native binding) — pre-existing/environmental, unrelated
to this change; they pass where the CLT is installed.

Codex REQUEST_CHANGES, Gemini COMMENT, Claude APPROVE (all addressed):
- resolveAgyBin: add agyRespondsToVersion(--version) behavioral verification for
  an untrusted PATH candidate (Codex CX1); canonical path + CODEV_AGY_BIN override
  stay realpath-trusted.
- verifyAgy: rewrite as async streaming so the OAuth URL is detected on the early
  stream and the probe terminates promptly ('needs login') instead of stalling
  for the full timeout (Codex CX2).
- Tests: agyRespondsToVersion unit test; doctor fast-'needs login' OAuth test
  (replaces obsolete spawnSync-timeout test) (Codex CX3).
- doctor: remove dead VERIFY_CONFIGS['Gemini'] (Gemini G1 / Claude).
- consult.test: assert real _MODEL_CONFIGS (gemini.cli==='agy') instead of a fake
  hardcoded config (Gemini G2).
- Dedup double agySkipContent() call (Claude).
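
The streaming probe described for verifyAgy above can be sketched as follows: watch the child's output as it arrives and stop as soon as an OAuth URL shows up, rather than waiting out the full timeout. The marker string, the function names, and the rule that any exit without the marker counts as authed are assumptions for illustration only.

```typescript
import { spawn } from "node:child_process";

type ProbeResult = "authed" | "needs-login" | "timeout";

// Hypothetical marker: assumes the login prompt prints a Google OAuth URL.
const OAUTH_MARKER = "https://accounts.google.com/";

// Returns a feed function that reports true once the marker has appeared,
// even when it is split across two chunks.
function createOAuthDetector(marker: string = OAUTH_MARKER): (chunk: string) => boolean {
  let buffered = "";
  let found = false;
  return (chunk) => {
    if (found) return true;
    buffered += chunk;
    found = buffered.includes(marker);
    // Keep only the tail that could still start a match.
    buffered = buffered.slice(-(marker.length - 1));
    return found;
  };
}

function probeAuth(cmd: string, args: string[], timeoutMs: number): Promise<ProbeResult> {
  return new Promise((resolve, reject) => {
    const child = spawn(cmd, args, { stdio: ["ignore", "pipe", "pipe"] });
    const sawOAuthUrl = createOAuthDetector();
    let settled = false;
    const timer = setTimeout(() => finish("timeout"), timeoutMs);

    function finish(result: ProbeResult): void {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      child.kill(); // stop waiting as soon as the answer is known
      resolve(result);
    }

    const onData = (chunk: Buffer): void => {
      if (sawOAuthUrl(chunk.toString("utf8"))) finish("needs-login");
    };
    child.stdout.on("data", onData);
    child.stderr.on("data", onData);
    child.on("error", (err) => {
      settled = true;
      clearTimeout(timer);
      reject(err); // for example a missing binary; presence is checked separately
    });
    // Simplification: any exit without the marker counts as authed.
    child.on("close", () => finish("authed"));
  });
}
```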

Full suite green: 152 files, 3209 passed, 0 failed. Rebuttal in
778-agy_backend-iter1-rebuttals.md.

Codex REQUEST_CHANGES (iter-2) addressed:
- Add a guarded real-agy integration smoke
  (src/__tests__/cli/agy-integration.e2e.test.ts): runs the real agy (no mock),
  plants a file, invokes the gemini
  lane, asserts the review contains the marker (agentic file-reading); skips
  cleanly when agy is unavailable/unauthed. Acceptance evidence (authed run):
  agy read planted.txt and returned the codeword (gemini (agy) completed 14.1s).
- pro alias now exercised through the real execution path (consult({model:'pro'})
  -> agy bin spawn); standalone alias test asserts the real exported _MODEL_ALIASES.
- Export _MODEL_ALIASES + _runAgyConsultation for the tests.

Default suite 3210 passed; cli-e2e 84 passed. Rebuttal in
778-agy_backend-iter2-rebuttals.md.

… as a non-blocking skip

Dogfooding surfaced a real gap: on a heavy agentic review that outruns agy's
--print-timeout, agy exits 0 emitting 'timed out waiting for response' (a
non-response, not a review). runAgyConsultation now detects that marker and emits
the non-blocking COMMENT skip instead of writing the timeout text as a 'review'.
This is the correct degraded-gemini behavior (the run proceeds 2-way). Test added.
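
A minimal sketch of the marker check described above. The marker text is the one quoted in this commit message; the length cap and the names classifyAgyOutput and AgyOutcome are assumptions added so that a long, genuine review that merely quotes the phrase is not discarded.

```typescript
// Marker text quoted in the commit message above.
const AGY_TIMEOUT_MARKER = "timed out waiting for response";

type AgyOutcome =
  | { kind: "review"; text: string }
  | { kind: "skip"; reason: string };

// Assumption: a timeout non-response is short, so the marker is only trusted
// in outputs under a small length cap. The cap is an invented value.
const MAX_NON_RESPONSE_CHARS = 500;

function classifyAgyOutput(stdout: string): AgyOutcome {
  const text = stdout.trim();
  const looksLikeTimeout =
    text.length <= MAX_NON_RESPONSE_CHARS &&
    text.toLowerCase().includes(AGY_TIMEOUT_MARKER);
  if (looksLikeTimeout) {
    return { kind: "skip", reason: "agy timed out waiting for a response" };
  }
  return { kind: "review", text };
}
```

A caller would turn the skip outcome into the COMMENT skip artifact and write only the review outcome to the review file.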
…ng-skip progression test

- consult.md / DEPENDENCIES.md / CLAUDE.md / AGENTS.md / README.md / both
  consult SKILL.md: document the gemini lane now dispatching to the Antigravity
  CLI (agy), OAuth login, agentic file access, and the non-blocking skip.
- Model identifier audit: 'gemini' stays the lane id everywhere (MODEL_CONFIGS,
  VALID_MODELS, pro alias, skeleton protocol-schema enum, protocol-JSON defaults)
  — backend swap only, no rename.
- Add agy-skip-progression test: pins that the real agySkipContent artifact
  parses as COMMENT and that allApprove treats a skipped gemini lane as
  non-blocking (phase still advances 2-way), while a genuine REQUEST_CHANGES
  from another reviewer still blocks. Exports agySkipContent as _agySkipContent.

…istency + stronger e2e/progression tests

Codex REQUEST_CHANGES (iter 1), all addressed:
- consult.md Performance table: 'File access via --yolo' (retired Gemini CLI flag)
  → Antigravity CLI (agy), --sandbox, plain text. (Also flagged by Claude.)
- .claude/skills/consult/SKILL.md: drop bogus 'tick' protocol from the --protocol
  list so the repo copy matches the skeleton (skeleton ↔ codev consistency).
- codev/DEPENDENCIES.md: replace the stale 'Gemini CLI' section + summary row with
  the Antigravity (agy) section, mirroring the skeleton (docs reference only the
  supported setup).
- agy-integration.e2e.test.ts: add a front-door case driving the real 'consult -m
  gemini' binary (arg-parse → alias/MODEL_CONFIGS → dispatch → agy → agentic read),
  guarded; the prior case exercised only the internal runAgyConsultation.
- agy-porch-progression.test.ts (new): porch-orchestrated test driving next() with
  on-disk review files — a real agySkipContent gemini artifact + codex/claude
  APPROVE advances the phase (gate_pending), while a genuine REQUEST_CHANGES still
  blocks (rebuttal). Complements the unit-level allApprove/parseVerdict test.

…dev/ doc copies to agy

Addresses iter-2 Codex REQUEST_CHANGES + Claude COMMENT (all agy-relevant
current-doc divergences; gemini APPROVE):
- codev/resources/commands/consult.md: synced from skeleton (was stale: gemini-cli,
  --yolo, TICK). Now byte-identical to the skeleton twin. (Claude finding.)
- codev/DEPENDENCIES.md: add the 'Terminal connection issues' section so it is
  byte-identical to the skeleton twin. (Codex #1.)
- CLAUDE.md / AGENTS.md: consultation-checkpoint prose 'Gemini Pro' → 'Gemini (via
  agy)' — agy uses its default model, no Pro pin. (Codex #2.)
- codev/ + codev-skeleton/ resources/commands/codev.md: doctor dependency list
  'Gemini (gemini-cli)' → 'Gemini (Antigravity CLI, agy)'.
- codev/resources/arch.md: Consult Architecture section — gemini lane now spawns agy
  (--print --sandbox --add-dir), role folded into prompt, OAuth (no API key).
- README.md: feature-description 'Gemini Pro' → 'Gemini (via agy)' (consult prose).

Out of scope (unchanged, by design): historical specs/plans/analysis/comparison
artifacts (rewriting falsifies the record); the gemini *builder* harness refs
(README CLI-flag table + architect/builder config — spec leaves harness.ts
untouched); the separate generate-image skill (Gemini image API, not consult).

- codev/reviews/778-*.md: full SPIR review: summary, spec compliance, deviations,
  per-phase consultation feedback, Architecture Updates (arch.md consult section →
  agy), Lessons Learned Updates, flaky tests (none), follow-ups (builder harness).
- lessons-learned.md: self-hosted codev/ ↔ skeleton doc-copy drift lesson.

Addresses CMAP (PR) Codex #1: spec/plan now carry the approval frontmatter
documenting the human gate approvals (spec-approval 2026-06-02, plan-approval
2026-06-02; validated by gemini/codex/claude across the specify/plan consults).
Status flipped draft → approved.

…sult.md

Addresses the re-consult REQUEST_CHANGES (both verified by the architect):
- SECURITY: agy was granted --add-dir $(tmpdir()) — the entire OS temp dir. Add
  consultSandboxDir(): a per-process mkdtemp subdir holding the PR diff (buildPRQuery)
  and the large-prompt temp file; agy is now granted ONLY workspaceRoot + that subdir,
  never the whole tmpdir(). New test pins that the grant is scoped.
- DOCS: the origin/main merge pulled #985's 'Claude auth: subscription vs metered API'
  section into codev/resources/commands/consult.md but not the skeleton copy. Synced the
  section into codev-skeleton/ so both copies are byte-identical again.
- Review doc updated: PR-CMAP rounds recorded; scoped-sandbox noted in spec compliance;
  consult.md consistency claim made accurate (merge-drift re-synced).
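
The scoped-sandbox fix described above can be sketched like this: create one private subdirectory per process with mkdtemp, stage the PR diff and any large-prompt file inside it, and grant agy only the workspace root plus that subdirectory. The directory prefix, the helper names stageInSandbox and agyGrantArgs, and the idea that --add-dir can be repeated once per directory are assumptions, not confirmed behaviour of agy or of consult/index.ts.

```typescript
import { mkdtempSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import * as path from "node:path";

let sandboxDir: string | undefined;

// One private subdirectory per process, created lazily on first use.
function consultSandboxDir(): string {
  if (sandboxDir === undefined) {
    // The "codev-consult-" prefix is hypothetical.
    sandboxDir = mkdtempSync(path.join(tmpdir(), "codev-consult-"));
  }
  return sandboxDir;
}

// Write a staged file (for example a PR diff) inside the sandbox and return its path.
function stageInSandbox(fileName: string, content: string): string {
  const target = path.join(consultSandboxDir(), path.basename(fileName));
  writeFileSync(target, content, "utf8");
  return target;
}

// Assumption: --add-dir may be repeated, once per granted directory.
// The whole OS temp dir is never part of the grant.
function agyGrantArgs(workspaceRoot: string): string[] {
  return ["--add-dir", workspaceRoot, "--add-dir", consultSandboxDir()];
}
```

Using path.basename when staging keeps a caller-supplied name from escaping the sandbox through path separators.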

Development

Successfully merging this pull request may close these issues.

Gemini CLI > Antigravity CLI - June 18, 2026 DEADLINE