feat(cf): cross-env reconcile plan, dry-run (POST /admin/cf/reconcile) #284
Open
posix4e wants to merge 2 commits into
Computes, read-only, what a reconcile WOULD do across the whole CF map, env-labelled, in three buckets:

- adopt: live (healthy) CF agent tunnels in the serving env the CP store is missing → fill-only rebuild from CF.
- prune: a serving-env agent tunnel that's unclaimed AND not healthy, an unexpected serving-env CNAME, every resource of an env with no live control plane (closed PR), and the whole (unattributed) leak bucket.
- refill: hostnames the serving CP expects but CF has no CNAME for.

A live foreign env (another CP's, store not held here) is left untouched with a note. A degraded map yields an empty plan + refusal note.

Adds tunnel `status`/`created_at` to CfTunnel (populated in both the per-env snapshot and the map) so adopt-vs-prune can tell a live agent from a dead/leaked tunnel; exposes `build_cp_state` for reuse.

The endpoint is dry-run ONLY: `?apply=true` is acknowledged but performs no mutations (the guarded, operator-gated apply lands next). Same auth as the other /admin/cf/* surfaces.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
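The tunnel side of the bucket rules above can be sketched as follows. This is a minimal illustration, not the PR's code: the `CfTunnel` shape here is reduced to three fields, and `Plan`, `plan_tunnels`, the `claimed` and `live_envs` sets, and the note strings are hypothetical names. CNAME pruning and the refill bucket are left out.

```rust
use std::collections::HashSet;

// Hypothetical, reduced shape; the real CfTunnel carries more fields
// (the commit adds `status` and `created_at` to it).
#[derive(Debug, Clone, PartialEq)]
pub struct CfTunnel {
    pub id: String,
    pub env: String,
    pub status: String,
}

// Hypothetical plan type: tunnel ids per bucket plus free-form notes.
#[derive(Debug, Default, PartialEq)]
pub struct Plan {
    pub adopt: Vec<String>,
    pub prune: Vec<String>,
    pub notes: Vec<String>,
}

pub fn plan_tunnels(
    tunnels: &[CfTunnel],
    serving_env: &str,
    claimed: &HashSet<String>,   // tunnel ids the CP store already holds
    live_envs: &HashSet<String>, // envs that have a live control plane
    degraded: bool,
) -> Plan {
    let mut plan = Plan::default();
    // A degraded map yields an empty plan plus a refusal note.
    if degraded {
        plan.notes.push("map degraded: refusing to plan".to_string());
        return plan;
    }
    for t in tunnels {
        let healthy = t.status == "healthy";
        if t.env == serving_env {
            if claimed.contains(&t.id) {
                continue; // already in the CP store: nothing to do
            }
            if healthy {
                plan.adopt.push(t.id.clone()); // live but missing from the store
            } else {
                plan.prune.push(t.id.clone()); // unclaimed AND not healthy
            }
        } else if t.env == "(unattributed)" || !live_envs.contains(&t.env) {
            // Leak bucket, or an env with no live control plane (closed PR).
            plan.prune.push(t.id.clone());
        } else {
            // Live foreign env: left untouched, recorded once as a note.
            let note = format!("env {} is live and foreign: left untouched", t.env);
            if !plan.notes.contains(&note) {
                plan.notes.push(note);
            }
        }
    }
    plan
}
```

Because the function only builds a `Plan` value and never calls out to CF, it is read-only by construction, which matches the dry-run intent of this commit.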
DD preview ready
URL: https://pr-284.devopsdefender.com
Browser login: visit https://pr-284.devopsdefender.com; DD redirects you to
Machine-to-machine: GitHub Actions workflows in the
Register endpoint for a local agent: |
The dry-run plan flagged 3 live pr-N agent CNAMEs (its -api-/oracle/shell vanity hosts) for prune: 'expected hostnames' was derived from the CP store's extras, but the CP creates more CNAMEs than it records there (and agent-api uses a different name format), so live records looked orphaned. At apply time that would have been a delete-a-healthy-agent bug.

Rewrite as two passes: decide tunnel actions first (recording pruned tunnel ids), then prune a CNAME only if its target tunnel is gone (unattributed) or is itself being pruned. A CNAME pointing at a live/kept tunnel is always kept, regardless of whether we can re-derive its name. Refill is limited to the reliably-known primary agent hostname.
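A minimal sketch of the two-pass rule, under stated assumptions: `Tunnel`, `Cname`, the `keep` flag, and `plan_cname_prunes` are hypothetical names, and the tunnel decision from pass 1 is taken as an input rather than computed here.

```rust
use std::collections::HashSet;

// Hypothetical shapes for illustration only.
pub struct Tunnel {
    pub id: String,
    pub keep: bool, // outcome of the tunnel decision (live/kept vs pruned)
}

pub struct Cname {
    pub name: String,
    pub target_tunnel: String, // id of the tunnel the record points at
}

/// Returns the names of the CNAMEs that would be pruned.
pub fn plan_cname_prunes(tunnels: &[Tunnel], cnames: &[Cname]) -> Vec<String> {
    // Pass 1: record which tunnel ids exist and which are being pruned.
    let mut known: HashSet<&str> = HashSet::new();
    let mut pruned: HashSet<&str> = HashSet::new();
    for t in tunnels {
        known.insert(t.id.as_str());
        if !t.keep {
            pruned.insert(t.id.as_str());
        }
    }
    // Pass 2: prune a CNAME only if its target tunnel is gone or is itself
    // being pruned. The hostname is never compared against an expected list.
    cnames
        .iter()
        .filter(|c| {
            let target = c.target_tunnel.as_str();
            !known.contains(target) || pruned.contains(target)
        })
        .map(|c| c.name.clone())
        .collect()
}
```

Keying the decision on the target tunnel instead of the hostname means a record whose name cannot be re-derived is still kept as long as its tunnel is kept, which is the failure mode this commit describes.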
PR-5 of the CF-reconcile arc — the read-only plan. Operator-gated apply lands in the next PR.
What
`POST /admin/cf/reconcile` computes, read-only, what a reconcile would do across the whole CF map, env-labelled:

- adopt: live (`status==healthy`) CF agent tunnels in the serving env the CP store is missing → fill-only rebuild from CF (recovery).
- prune: a serving-env agent tunnel that's unclaimed and not healthy, an unexpected serving-env CNAME, every resource of an env with no live control plane (closed PR), and the whole `(unattributed)` leak bucket.
- refill: hostnames the serving CP expects but CF has no CNAME for.

A live foreign env (another CP's, store not held here) is left untouched with a note. A `degraded` map yields an empty plan + refusal note.

Adds `status`/`created_at` to `CfTunnel` (populated in both the per-env snapshot and the map) so adopt-vs-prune can distinguish a live agent from a dead/leaked tunnel; exposes `build_cp_state` for reuse.

Dry-run ONLY: `?apply=true` is acknowledged (`apply_requested` echoed) but performs no mutations. Same auth as the other `/admin/cf/*` surfaces.

Validation

- `cargo fmt` clean; compiles locally (macOS `sessiond.rs` noise only; CI builds musl).
- `POST /admin/cf/reconcile` on the preview returns `dry_run:true, applied:false` and a plan whose `prune` bucket lists the real `(unattributed)` leaks (the ~121 stale CNAMEs the map surfaced), with zero CF mutations.

Next: PR-6 adds `?apply=true` with the guards (skip in-flight-deploy env, TTL, zero-conn), fill-only adopt, audit log.

🤖 Generated with Claude Code
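For illustration, the dry-run contract described above (the apply flag is echoed, nothing is ever applied) might look like the sketch below. The field names `dry_run`, `apply_requested`, and `applied` come from the description; the `ReconcileResponse` type, the `respond` function, and the hand-rolled query parsing are assumptions, not the PR's handler.

```rust
// Hypothetical response shape; field names follow the PR description.
#[derive(Debug, PartialEq)]
pub struct ReconcileResponse {
    pub dry_run: bool,
    pub apply_requested: bool,
    pub applied: bool,
}

/// `query` is the raw query string, with or without a leading '?'.
pub fn respond(query: &str) -> ReconcileResponse {
    // Acknowledge the flag, but take no action on it.
    let apply_requested = query
        .trim_start_matches('?')
        .split('&')
        .any(|kv| kv == "apply=true");
    ReconcileResponse {
        dry_run: true,  // this PR never mutates CF
        apply_requested,
        applied: false, // stays false whatever was requested
    }
}
```

Echoing `apply_requested` separately from `applied` lets a caller confirm the flag was parsed while still seeing that no mutation happened.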