[Spec 778] Swap the gemini consult lane to the Antigravity CLI (agy)#988
Open
waleedkadous wants to merge 45 commits into
Address iter-1 3-way consultation:
- Gemini (fatal): consult prompts rely on filesystem access; API lane must inline review content (A1) or run a tool-use loop (A2). Removed wrong single-shot framing.
- Codex (fatal): define porch-safe non-blocking skip (verdict.ts defaults to REQUEST_CHANGES); relax doctor unrestricted-key detection; scope other surfaces.
- Both: resolve enterprise contradiction (API default, CLI retained as optional backend).
- Claude: @google/genai already a dep; keep gemini in defaults; add pro-alias test.
Architect rejected the Gemini Developer API pivot at the spec-approval gate and directed swapping the gemini consult lane to the Antigravity CLI (agy), with: preserve agentic file-reading, keep the Pro model, subscription/OAuth (AI Ultra) auth, keep the porch-safe non-blocking skip + graceful cost degradation, stay lean.
Spec rewritten around the empirically-verified agy contract (v1.0.4):
- headless via 'agy --print'; file-reading via '--sandbox --add-dir' (no --dangerously-skip-permissions needed); OAuth/subscription auth.
- Footgun documented: PATH 'agy' is an IDE symlink, not the CLI -> Codev must resolve the real binary deterministically.
- Open: Pro-pinning mechanism (no --model flag); gate re-presentation mechanics.
Docs research: Antigravity CLI defaults to Gemini 3.5 Flash (High); Pro is selected via the interactive /model slash command (no --model flag, no obvious --print equivalent). A naive 'agy --print' would silently use Flash, violating the architect's keep-Pro requirement. Sharpened the Pro-pinning open question (default = wrong model) and raised the corresponding risk to High probability; acceptance must positively confirm Pro served the review.
Architect decided against Pro-pinning to keep the swap lean. The gemini lane will use agy's default model (currently Gemini 3.5 Flash). Removes the only critical open question (no --model handling). Documented as an explicit accepted tradeoff (Flash < Pro for review depth); risk downgraded to Low-impact accepted; success criteria / test scenarios adjusted accordingly.
Codex REQUEST_CHANGES, Gemini COMMENT, Claude REQUEST_CHANGES; all addressed:
- Fix stale 'lane uses Pro model class' in Desired State (unanimous; missed when the don't-pin decision was applied).
- State the model identifier stays 'gemini' (no rename) across all config surfaces.
- Adopt the concrete non-blocking skip: lane emits 'VERDICT: COMMENT' when agy is unavailable (verdict.ts treats COMMENT as non-blocking; verified).
- Fast auth-skip (kill child on OAuth-URL detection), binary-resolution rejection rule (reject IDE stub -> skip, never launch IDE), Codev-owned timeout.
- Adapt extractReviewText gemini branch (JSON.parse -> raw output for agy plain text).
- Follow the hermes precedent (role inlined, temp-file >100k -> also handles E2BIG).
- Keep 'pro' alias as-is; note harness.ts GEMINI_HARNESS is untouched/distinct.
- Add porch-orchestrated progression test.
Rebuttal in iter2-rebuttals.md.
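The "role inlined, temp-file >100k" item can be pictured with a minimal sketch. It is not the PR's code: `preparePrompt`, the `PreparedPrompt` shape, the separator, the pointer sentence, and the choice to count characters are all assumptions made for illustration. Only the idea (fold the reviewer role into the prompt, spill oversized prompts to a file so argv never hits E2BIG) comes from the commit message.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// ASSUMPTION: the commit only says ">100k"; this sketch counts characters.
const LARGE_PROMPT_CHARS = 100_000;

interface PreparedPrompt {
  prompt: string;    // text that would be passed on the command line
  tempFile?: string; // set only when the prompt was spilled to disk
}

// Hypothetical helper: fold the reviewer role into the prompt, and spill the
// combined text to a temp file when it is too large to pass via argv.
function preparePrompt(role: string, query: string, baseDir: string = os.tmpdir()): PreparedPrompt {
  const full = `${role}\n\n---\n\n${query}`;
  if (full.length <= LARGE_PROMPT_CHARS) {
    return { prompt: full };
  }
  // mkdtempSync creates a fresh, uniquely named directory under baseDir.
  const dir = fs.mkdtempSync(path.join(baseDir, "consult-"));
  const tempFile = path.join(dir, "prompt.md");
  fs.writeFileSync(tempFile, full, "utf8");
  return {
    // ASSUMPTION: the wording of the pointer prompt is invented here.
    prompt: `Read the full review instructions from ${tempFile} and follow them.`,
    tempFile,
  };
}
```

With this shape the caller always passes a short argument when the prompt is large, which is why the same fallback also covers the E2BIG case mentioned in the commit.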
Architect approved the spec-approval gate and added a requirement (reverses 'API out of scope'): the gemini lane supports BOTH backends co-equally:
- agy/OAuth: agentic file-reading, default model (Flash), cheap (keeps iter-2 work).
- Gemini Developer API/GEMINI_API_KEY: inline content, Pro model (gemini-3.1-pro-preview), real cost rows from usageMetadata, CI-friendly.
- Selector consult.gemini.backend: agy|api|auto; mechanism + auto-precedence (cost-vs-quality tradeoff) to be designed in the Plan and flagged for architect.
Updated Out-of-Scope, Desired State, Success Criteria, Test Scenarios; added the Amendment A1 section. HOW (dispatch paths + selector) deferred to the Plan.
4 lean phases for the approved spec + Amendment A1:
1. agy_backend - agy --print --sandbox --add-dir, verified binary resolution, role inlined (hermes precedent), plain-text/graceful-cost, Codev-owned timeout, fast COMMENT-skip; agy doctor check.
2. api_backend - @google/genai gemini-3.1-pro-preview, inlined content, usageMetadata cost rows, GEMINI_API_KEY env auth (CI-friendly), COMMENT-skip.
3. backend_selector - consult.gemini.backend agy|api|auto; auto precedence proposed + flagged for architect (cost-vs-quality), not hard-coded.
4. docs_skeleton_e2e - doctor, docs/skeleton (model id stays gemini), e2e both backends + porch-progression test.
Passes plan_exists / has_phases_json / min_two_phases.
Gemini APPROVE, Codex REQUEST_CHANGES, Claude COMMENT; all addressed:
- Added Cross-Cutting Implementation Contracts: backend-aware dispatch + parsing (both backends are 'gemini' -> thread backend through extractReviewText/extractUsage; old stats.models path removed); consult.gemini.backend is a NEW top-level config key (not under porch.consultation); no config migration (missing -> default auto); dual-dispatch sub-branching.
- Phase 4: doctor operational-model counting treats API-only Gemini as operational.
- Fixed test paths to packages/codev/src/__tests__/ (+ /cli/ e2e); noted packages/codev/tests/e2e|unit don't exist; protocol-schema enum is skeleton-only.
Rebuttal: 778-plan-iter1-rebuttals.md.
Architect correction: agy v1.0.4 is OAuth-only (no API-key auth; verified: no login subcommand, no key flag, no *_api_key env var; upstream feature request open, not shipped). The separate Gemini Developer API backend is unwanted AND unbuildable, so:
- Spec: removed Amendment A1 (dual-backend); restored the clean single-agy Approach-B spec (state at 92527c5). Amendment preserved in history at 7ef541f.
- Plan: rewritten to single-backend with 2 phases (agy_backend, docs_skeleton_e2e); dropped api_backend + backend_selector. Kept the agy-relevant iter-1 fixes (plain-text extractReviewText adaptation, corrected src/__tests__ paths, doctor operational counting, COMMENT non-blocking skip = the CI/headless story).
No API key anywhere.
… the Antigravity CLI (agy)
Replaces the retiring Gemini CLI with agy (OAuth/subscription, agentic file-reading):
- runAgyConsultation: agy --print --sandbox --add-dir <root>, reviewer role folded into the prompt (hermes precedent), large-prompt temp-file fallback, plain-text output. Codev-owned timeout; fast non-blocking COMMENT skip on OAuth prompt / missing-or-invalid binary (verdict.ts treats COMMENT as non-blocking) so porch runs still advance; this is the CI/headless story.
- resolveAgyBin/isRealAgyCli: deterministic binary resolution that rejects the IDE symlink by realpath (never launches the IDE); CODEV_AGY_BIN override escape hatch.
- usage-extractor: gemini lane is plain text -> usage degrades to null (no NaN cost rows); removed the dead Gemini-CLI stats.models JSON path.
- doctor: agy presence (resolveAgyBin) + OAuth-aware auth probe + operational-model counting so an agy-only setup counts as operational; official install hint.
- Model identifier stays 'gemini' (no rename); 'pro' alias kept.
Tests: agy dispatch/sandbox-safety/role-fold/large-prompt/plain-text/OAuth-skip, binary-resolution unit tests, updated metrics gemini tests, doctor agy timeout. consult 46/46, doctor 16/16, metrics-gemini 8/8, config 26/26 green; type-clean.
Note: better-sqlite3 MetricsDB tests are blocked locally by a missing Xcode CLT (node-gyp can't build the native binding). This is pre-existing/environmental and unrelated to this change; they pass where the CLT is installed.
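The realpath-based rejection described for resolveAgyBin/isRealAgyCli could look roughly like the sketch below. The function names and the CODEV_AGY_BIN variable come from the commit message; everything else is an assumption, in particular the rule that an IDE stub is recognized by its real path lying inside a `.app` bundle, and the `candidates` parameter, which stands in for whatever search order the real code uses.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// ASSUMPTION: the IDE stub is recognizable because its resolved path lies
// inside an application bundle (a path segment ending in ".app"). The actual
// rule used by the PR's isRealAgyCli is not shown in the commit text.
function isRealAgyCli(candidate: string): boolean {
  try {
    const real = fs.realpathSync(candidate); // follows symlinks
    if (!fs.statSync(real).isFile()) return false;
    return !real.split(path.sep).some((segment) => segment.endsWith(".app"));
  } catch {
    return false; // missing file or dangling symlink
  }
}

// Hypothetical resolver: an explicit override wins, otherwise the first
// candidate that passes the realpath check. Nothing is ever executed here,
// so the IDE cannot be launched by accident.
function resolveAgyBin(
  env: Record<string, string | undefined>,
  candidates: string[],
): string | null {
  const override = env.CODEV_AGY_BIN;
  if (override) return isRealAgyCli(override) ? override : null;
  for (const candidate of candidates) {
    if (isRealAgyCli(candidate)) return candidate;
  }
  return null;
}
```

Returning `null` rather than falling back to a rejected path keeps the caller on the skip branch, which matches the "reject IDE stub -> skip" rule stated earlier. Treating an invalid override as a hard miss, instead of falling through to the candidates, is a design choice of this sketch only.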
Codex REQUEST_CHANGES, Gemini COMMENT, Claude APPROVE — all addressed:
- resolveAgyBin: add agyRespondsToVersion(--version) behavioral verification for
an untrusted PATH candidate (Codex CX1); canonical path + CODEV_AGY_BIN override
stay realpath-trusted.
- verifyAgy: rewrite as async streaming so the OAuth URL is detected on the early
stream and the probe terminates promptly ('needs login') instead of stalling
for the full timeout (Codex CX2).
- Tests: agyRespondsToVersion unit test; doctor fast-'needs login' OAuth test
(replaces obsolete spawnSync-timeout test) (Codex CX3).
- doctor: remove dead VERIFY_CONFIGS['Gemini'] (Gemini G1 / Claude).
- consult.test: assert real _MODEL_CONFIGS (gemini.cli==='agy') instead of a fake
hardcoded config (Gemini G2).
- Dedup double agySkipContent() call (Claude).
Full suite green: 152 files, 3209 passed, 0 failed. Rebuttal in
778-agy_backend-iter1-rebuttals.md.
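The streaming probe described for verifyAgy (detect the OAuth URL on the early stream and stop instead of waiting out the timeout) might be sketched as follows. `probeCli`, the `ProbeResult` type, the extra `"unavailable"` outcome, and the URL pattern are assumptions for illustration; the PR text only states the three outcomes authed / needs-login / timeout and does not show the marker it matches.

```typescript
import { spawn } from "node:child_process";

type ProbeResult = "authed" | "needs-login" | "timeout" | "unavailable";

// ASSUMPTION: an OAuth prompt is recognized by a Google accounts URL in the
// output. The real marker used by verifyAgy is not shown in the PR text.
const OAUTH_MARKER = /https:\/\/accounts\.google\.com\/\S+/;

// Hypothetical streaming probe: resolves as soon as the outcome is known and
// kills the child, rather than buffering output until the process exits.
function probeCli(cmd: string, args: string[], timeoutMs: number): Promise<ProbeResult> {
  return new Promise((resolve) => {
    const child = spawn(cmd, args, { stdio: ["ignore", "pipe", "pipe"] });
    let seen = "";
    let settled = false;

    const finish = (result: ProbeResult): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      child.kill(); // no-op if the child has already exited
      resolve(result);
    };

    const timer = setTimeout(() => finish("timeout"), timeoutMs);

    const onData = (chunk: Buffer): void => {
      seen += chunk.toString("utf8");
      if (OAUTH_MARKER.test(seen)) finish("needs-login");
    };
    child.stdout.on("data", onData);
    child.stderr.on("data", onData);

    child.on("error", () => finish("unavailable")); // e.g. binary not found
    child.on("close", (code) => finish(code === 0 ? "authed" : "unavailable"));
  });
}
```

Accumulating chunks into `seen` before testing the pattern matters because a URL can be split across two `data` events; matching each chunk in isolation could miss it.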
Codex REQUEST_CHANGES (iter-2) addressed:
- Add a guarded real-agy integration smoke
(src/__tests__/cli/agy-integration.e2e.test.ts): runs the real agy (no mock), plants a file, invokes the gemini
lane, asserts the review contains the marker (agentic file-reading); skips
cleanly when agy is unavailable/unauthed. Acceptance evidence (authed run):
agy read planted.txt and returned the codeword (gemini (agy) completed 14.1s).
- pro alias now exercised through the real execution path (consult({model:'pro'})
-> agy bin spawn); standalone alias test asserts the real exported _MODEL_ALIASES.
- Export _MODEL_ALIASES + _runAgyConsultation for the tests.
Default suite 3210 passed; cli-e2e 84 passed. Rebuttal in
778-agy_backend-iter2-rebuttals.md.
… as a non-blocking skip
Dogfooding surfaced a real gap: on a heavy agentic review that outruns agy's --print-timeout, agy exits 0 emitting 'timed out waiting for response' (a non-response, not a review). runAgyConsultation now detects that marker and emits the non-blocking COMMENT skip instead of writing the timeout text as a 'review'. This is the correct degraded-gemini behavior (the run proceeds 2-way). Test added.
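A minimal sketch of that marker check, assuming a pure post-processing step over agy's stdout. The marker string and the `VERDICT: COMMENT` line are taken from the commit messages; `finalizeAgyOutput`, `skipContent`, and the wording of the skip body are hypothetical.

```typescript
// Marker quoted in the commit message: agy exits 0 but prints this instead of a review.
const PRINT_TIMEOUT_MARKER = "timed out waiting for response";

// ASSUMPTION: the body of the skip artifact is invented; only the
// `VERDICT: COMMENT` line is taken from the PR text.
function skipContent(reason: string): string {
  return `gemini lane skipped: ${reason}\n\nVERDICT: COMMENT\n`;
}

// Hypothetical post-processing step: turn a timeout non-response into the
// non-blocking skip artifact; pass genuine review text through unchanged.
function finalizeAgyOutput(stdout: string): string {
  if (stdout.includes(PRINT_TIMEOUT_MARKER)) {
    return skipContent("agy timed out before producing a review");
  }
  return stdout;
}
```

A plain substring test would also catch a genuine review that happens to quote the marker; a real implementation may want a stricter match, for example on short outputs only.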
…ng-skip progression test
- consult.md / DEPENDENCIES.md / CLAUDE.md / AGENTS.md / README.md / both consult SKILL.md: document the gemini lane now dispatching to the Antigravity CLI (agy), OAuth login, agentic file access, and the non-blocking skip.
- Model identifier audit: 'gemini' stays the lane id everywhere (MODEL_CONFIGS, VALID_MODELS, pro alias, skeleton protocol-schema enum, protocol-JSON defaults); backend swap only, no rename.
- Add agy-skip-progression test: pins that the real agySkipContent artifact parses as COMMENT and that allApprove treats a skipped gemini lane as non-blocking (phase still advances 2-way), while a genuine REQUEST_CHANGES from another reviewer still blocks. Exports agySkipContent as _agySkipContent.
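The contract that test pins can be illustrated with a small sketch. The names `parseVerdict` and `allApprove` appear in the PR text, as does the default to REQUEST_CHANGES and the non-blocking treatment of COMMENT; the parsing rule (last `VERDICT:` line wins) and the signatures are assumptions, not the contents of verdict.ts.

```typescript
type Verdict = "APPROVE" | "COMMENT" | "REQUEST_CHANGES";

// Hypothetical parser: the last `VERDICT:` line wins. A review with no
// verdict line defaults to REQUEST_CHANGES, as the PR says verdict.ts does.
function parseVerdict(review: string): Verdict {
  let verdict: Verdict = "REQUEST_CHANGES";
  for (const line of review.split("\n")) {
    const match = /^\s*VERDICT:\s*(APPROVE|COMMENT|REQUEST_CHANGES)\b/i.exec(line);
    if (match) verdict = match[1].toUpperCase() as Verdict;
  }
  return verdict;
}

// Only REQUEST_CHANGES blocks; COMMENT (including a skipped lane) does not.
function allApprove(reviews: string[]): boolean {
  return reviews.every((review) => parseVerdict(review) !== "REQUEST_CHANGES");
}
```

This is why the skip artifact has to carry an explicit `VERDICT: COMMENT` line: an empty or verdict-less file would fall into the REQUEST_CHANGES default and stall the phase.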
…istency + stronger e2e/progression tests
Codex REQUEST_CHANGES (iter 1), all addressed:
- consult.md Performance table: 'File access via --yolo' (retired Gemini CLI flag) → Antigravity CLI (agy), --sandbox, plain text. (Also flagged by Claude.)
- .claude/skills/consult/SKILL.md: drop bogus 'tick' protocol from the --protocol list so the repo copy matches the skeleton (skeleton ↔ codev consistency).
- codev/DEPENDENCIES.md: replace the stale 'Gemini CLI' section + summary row with the Antigravity (agy) section, mirroring the skeleton (docs reference only the supported setup).
- agy-integration.e2e.test.ts: add a front-door case driving the real 'consult -m gemini' binary (arg-parse → alias/MODEL_CONFIGS → dispatch → agy → agentic read), guarded; the prior case exercised only the internal runAgyConsultation.
- agy-porch-progression.test.ts (new): porch-orchestrated test driving next() with on-disk review files. A real agySkipContent gemini artifact + codex/claude APPROVE advances the phase (gate_pending), while a genuine REQUEST_CHANGES still blocks (rebuttal). Complements the unit-level allApprove/parseVerdict test.
…dev/ doc copies to agy
Addresses iter-2 Codex REQUEST_CHANGES + Claude COMMENT (all agy-relevant current-doc divergences; gemini APPROVE):
- codev/resources/commands/consult.md: synced from skeleton (was stale: gemini-cli, --yolo, TICK). Now byte-identical to the skeleton twin. (Claude finding.)
- codev/DEPENDENCIES.md: add the 'Terminal connection issues' section so it is byte-identical to the skeleton twin. (Codex #1.)
- CLAUDE.md / AGENTS.md: consultation-checkpoint prose 'Gemini Pro' → 'Gemini (via agy)'; agy uses its default model, no Pro pin. (Codex #2.)
- codev/ + codev-skeleton/ resources/commands/codev.md: doctor dependency list 'Gemini (gemini-cli)' → 'Gemini (Antigravity CLI, agy)'.
- codev/resources/arch.md: Consult Architecture section; gemini lane now spawns agy (--print --sandbox --add-dir), role folded into prompt, OAuth (no API key).
- README.md: feature-description 'Gemini Pro' → 'Gemini (via agy)' (consult prose).
Out of scope (unchanged, by design): historical specs/plans/analysis/comparison artifacts (rewriting falsifies the record); the gemini *builder* harness refs (README CLI-flag table + architect/builder config; spec leaves harness.ts untouched); the separate generate-image skill (Gemini image API, not consult).
- codev/reviews/778-*.md: full SPIR review covering summary, spec compliance, deviations, per-phase consultation feedback, Architecture Updates (arch.md consult section → agy), Lessons Learned Updates, flaky tests (none), follow-ups (builder harness).
- lessons-learned.md: self-hosted codev/ ↔ skeleton doc-copy drift lesson.
Addresses CMAP (PR) Codex #1: spec/plan now carry the approval frontmatter documenting the human gate approvals (spec-approval 2026-06-02, plan-approval 2026-06-02; validated by gemini/codex/claude across the specify/plan consults). Status flipped draft → approved.
…sult.md
Addresses the re-consult REQUEST_CHANGES (both verified by the architect):
- SECURITY: agy was granted --add-dir $(tmpdir()), i.e. the entire OS temp dir. Add consultSandboxDir(): a per-process mkdtemp subdir holding the PR diff (buildPRQuery) and the large-prompt temp file; agy is now granted ONLY workspaceRoot + that subdir, never the whole tmpdir(). New test pins that the grant is scoped.
- DOCS: the origin/main merge pulled #985's 'Claude auth: subscription vs metered API' section into codev/resources/commands/consult.md but not the skeleton copy. Synced the section into codev-skeleton/ so both copies are byte-identical again.
- Review doc updated: PR-CMAP rounds recorded; scoped-sandbox noted in spec compliance; consult.md consistency claim made accurate (merge-drift re-synced).
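The scoped grant in the SECURITY item can be sketched as below. `consultSandboxDir` is named in the commit; the directory prefix, the module-level cache, and the `grantedDirs` helper are assumptions. The sketch deliberately stops at the list of directories and does not build an agy argument vector, since how multiple directories are passed to `--add-dir` is not stated in the PR text.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

let cachedSandboxDir: string | undefined;

// Per-process scratch directory: created once with mkdtemp, then reused.
// ASSUMPTION: the "codev-consult-" prefix is invented for this sketch.
function consultSandboxDir(): string {
  if (!cachedSandboxDir) {
    cachedSandboxDir = fs.mkdtempSync(path.join(os.tmpdir(), "codev-consult-"));
  }
  return cachedSandboxDir;
}

// Hypothetical helper: the only directories the reviewer CLI may read are the
// workspace root and the scratch subdir, never the whole OS temp directory.
function grantedDirs(workspaceRoot: string): string[] {
  return [path.resolve(workspaceRoot), consultSandboxDir()];
}
```

Anything the lane needs the CLI to read (the PR diff, a spilled prompt file) would be written under `consultSandboxDir()`, so unrelated files that other processes leave in the temp directory stay outside the grant.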
Summary
Google retires Gemini-CLI subscription serving (Pro/Ultra/free) on 2026-06-18. This swaps the gemini consult lane's backend from the retired Gemini CLI to the Antigravity CLI (agy): a lean backend swap, not a redesign. agy cannot take an API key (verified), so there is no separate Gemini Developer API backend.
- … agy --print --sandbox --add-dir <dir> (proven live: the e2e reads a planted file through the consult -m gemini front door).
- … gemini everywhere and the pro alias is retained; only the backend binary changed.
- … agy emits VERDICT: COMMENT, so porch's allApprove treats it as non-blocking and phases still advance 2-way (the core failure this defends).
- resolveAgyBin() rejects the Antigravity IDE's agy symlink (by realpath) and prefers the real headless CLI, with a CODEV_AGY_BIN override.
Closes #778
Changes
- consult/index.ts: gemini lane dispatches to runAgyConsultation (--print --sandbox --add-dir, role folded into the prompt); OAuth-marker and timeout detection → non-blocking COMMENT skip; resolveAgyBin / isRealAgyCli / agyRespondsToVersion.
- consult/usage-extractor.ts: graceful cost/usage degradation for agy's plain-text output (no NaN).
- doctor.ts: agy presence + streaming verifyAgy() (authed / needs-login / timeout) with current install guidance.
- … CLAUDE.md, AGENTS.md, README.md, consult.md, DEPENDENCIES.md, codev.md, arch.md, consult SKILL.md); codev/ copies are byte-identical to their skeleton twins.
- … agy e2e (front-door binary + agentic read), a porch-orchestrated progression test driving next() (skip → advance 2-way; real REQUEST_CHANGES still blocks), and unit-level skip-contract tests.
Testing
- … agy e2e cases).
- … agy: it read a planted file and returned the codeword through consult -m gemini.
- … consult -m gemini --type spec|plan returned real reviews (COMMENT / APPROVE).
- … gemini unchanged in MODEL_CONFIGS, VALID_MODELS, the skeleton protocol-schema.json enum, and all protocol-JSON defaults.
Out of scope (by design)
The Gemini-CLI builder harness (harness.ts + README CLI-flag/config examples) stays on the retired CLI per the approved spec; tracked as a follow-up.
Spec / Plan / Review
- codev/specs/778-gemini-cli-antigravity-cli-jun.md
- codev/plans/778-gemini-cli-antigravity-cli-jun.md
- codev/reviews/778-gemini-cli-antigravity-cli-jun.md