Skip to content

Round display width to realizable grid; never taint on resolution mismatch#271

Open
masnwilliams wants to merge 4 commits into
mainfrom
hypeship/fix-xroot-even-rounding
Open

Round display width to realizable grid; never taint on resolution mismatch#271
masnwilliams wants to merge 4 commits into
mainfrom
hypeship/fix-xroot-even-rounding

Conversation

@masnwilliams
Copy link
Copy Markdown
Contributor

@masnwilliams masnwilliams commented Jun 3, 2026

Summary

PATCH /display to a non-multiple-of-8 width was returning 500 and marking the browser instance vm_unrecoverable, because the X-root post-check (added in #262) required the realized resolution to exactly match the request.

The realized X root width is quantized to a multiple of 8 by libxcvt (the CVT timing generator neko uses to create the mode) — so a width like 1365 can never land exactly. CI confirmed it on the built image: a 1365-wide request lands at 1360 (rounded down to the grid). Heights are not gridded.

This PR:

  • Rounds the requested width up to the next multiple of 8 before applying, so neko creates a mode that matches what we ask for and the response reports the size the client actually gets.
  • Downgrades the X-root post-check from fatal to a logged guard. A resolution mismatch must never taint an instance — treating it as fatal is what turned a cosmetic pixel rounding into a fleet-wide outage. The check still logs if a resize genuinely didn't apply.
  • Keeps Skip chromium restart on headful display resize #262's no-chromium-restart behavior; the CDP maximize re-assert stays fatal (a failure there means a broken devtools connection, not a resolution rounding).

This supersedes the earlier even-rounding (+1) assumption in this branch, which was wrong: the rounding is to an 8px grid, downward, not a single pixel up.

Tests

  • Unit: TestRoundUpToWidthGrid — boundary cases for the grid rounding.
  • e2e: TestDisplayResizeOddDimensions — boots the headful neko image, requests an odd width, asserts 200, the response reports the rounded width, and the X root lands there exactly. Runs against the freshly-built image in CI.

Could not run the e2e locally (headful neko image too heavy to boot here); it is CI-validated.


Note

Medium Risk
Changes production PATCH /display success semantics on Xorg (rounding + no longer failing on X-root mismatch); mitigated by unit/e2e tests and keeping CDP failures fatal.

Overview
PATCH /display on headful Xorg no longer fails when libxcvt realizes a width on an 8px grid that differs from the client request. Before applying the resize, requested width is rounded up to the next multiple of 8 via roundUpToWidthGrid, so Neko/xrandr modes align with what the API returns; height is unchanged.

The post-resize X root poll (waitForXRootSize) is now a non-fatal guard: mismatches are warn-logged only, so cosmetic quantization does not surface as 500 or mark instances unrecoverable. CDP maximize re-assert after resize remains fatal (devtools failure, not display rounding).

Adds TestRoundUpToWidthGrid and headful neko e2e TestDisplayResizeOddDimensions (e.g. width 1365 → 200 with response width 1368, height preserved).

Reviewed by Cursor Bugbot for commit 621dacc. Bugbot is set up for automated code reviews on this repo. Configure here.

The Xorg dummy driver only realizes even-numbered framebuffer
dimensions, so an odd requested width/height lands one pixel larger
(e.g. 1365 -> 1366). waitForXRootSize required an exact match, so these
resizes failed verification, returned a configure 500, and tainted the
browser instance as unrecoverable -- failing session create and dropping
CDP for live sessions on the host.

Accept the next-even-up value per axis. Even targets stay strict and a
resize that never took effect is still rejected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@masnwilliams masnwilliams marked this pull request as ready for review June 3, 2026 23:09
sjmiller609
sjmiller609 approved these changes Jun 3, 2026
@firetiger-agent
Copy link
Copy Markdown

Created a monitoring plan for this PR.

What this PR does: Fixes a silent false-failure in browser VM display resizing — odd pixel dimensions (e.g. 1365 wide) were always rejected by the resize post-condition check because the Xorg dummy driver rounds them up to the next even number, causing resize operations to spuriously time out.

Intended effect:

  • PATCH /display 500 errors: baseline 0 (display resizes with odd dimensions were falsely failing, not surfacing as 500s at the kernel API level); confirmed if no new "X root did not reach requested size" errors appear post-deploy.
  • Railway HTTP 5xx rate: baseline 0.005–0.07% during typical active hours; confirmed if rate stays within this band.

Risks:

  • Over-acceptance of wrong sizessizeAxisSatisfied could accept a size that never took effect; alert if "X root did not reach requested size" appears in logs (would indicate incorrect acceptance logic), or if Railway 5xx rate rises above 1% for 2+ consecutive hours.
  • Even-dimension regression — strict equality must still hold for even-pixel resize requests; alert if Railway 5xx rate for display-bearing endpoints rises post-deploy with no other concurrent change.
  • Xvfb path unaffected — fix is Xorg-only; headless VM resizes use a separate code path and are not changed by this PR.

View monitor

masnwilliams and others added 2 commits June 3, 2026 23:51
Boots the headful neko image (the production config that creates modes
dynamically) and PATCHes /display to an odd width/height. Before the
waitForXRootSize fix this returned 500 and tainted the instance; it must
now return 200 with the X root settling within a pixel of the request.

Asserts proximity rather than an exact match so it doesn't hardcode the
driver's even-rounding behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The realizable X root width is quantized to a multiple of 8 by libxcvt
(CVT timing), so a non-multiple-of-8 width can never match the request
exactly. Round the request up to that grid before applying so neko
creates a matching mode, and downgrade the X-root post-check from fatal
(500 + instance taint) to a logged guard — a resolution mismatch must
never taint an instance.

Replaces the earlier even-rounding (+1) assumption, which was wrong:
CI showed 1365 lands at 1360 (rounded down to the grid), not 1366.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@masnwilliams masnwilliams changed the title Accept dummy-driver even rounding in X root resize verification Round display width to realizable grid; never taint on resolution mismatch Jun 4, 2026
The standalone test re-read the X root after the resize, which is flaky on
the heavy headful image: neko's startup-mode init can lose the race and
leave the root at the dummy driver's 3840x2160 default, so the physical
resize doesn't take effect even though the PATCH succeeds. Assert the
regression that actually matters instead — 200 (not 500/taint) and a
response reporting the grid-rounded width. Physical convergence of a clean
resize is already covered by TestDisplayResizeChromiumWindow.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sjmiller609 sjmiller609 self-requested a review June 4, 2026 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants