explorer: report real Samples-in-View count when cap is reached (closes #206) by rdhyee · Pull Request #210 · isamplesorg/isamplesorg.github.io

rdhyee · 2026-05-11T23:09:17Z

Closes #206. First item from the #203 retrospective Codex review (cluster D).

Problem

In point mode, the viewport sample query is capped at POINT_BUDGET = 5000 with no ORDER BY. The "Samples in View" stat box displayed cachedData.length, which equals the cap in any region dense enough to exceed it. Cyprus deep-link (#v=1&lat=34.9957&lng=33.6798&alt=15212&mode=point) showed exactly 5,000 — real count from a direct DuckDB query against the lite parquet: 23,421 samples in the same viewport. ~5x understatement.

Fix

Two changes in loadViewportSamples:

Real count: after the LIMIT POINT_BUDGET data fetch, run a count(*) against the same WHERE clause only when the cap is reached. Stat box sSamples now displays the true count; sPoints shows the rendered count under a "Samples Rendered" label so the user sees both numbers. Phase message becomes "23,422 samples in view (showing 5,000 — zoom in for more). Click one for details." when capped. Stale-guard applied between the two queries.

Below the cap the second query is skipped (data.length == real count by definition), so no extra round-trip in the common case.
ORDER BY pid on the data query, so the POINT_BUDGET-worth of rows the LIMIT returns is deterministic across browsers and sessions. (Codex recommendation 4 from the explorer: fix URL state round-trip (closes #201 parts A+B) #203 retrospective.)

Verification

Smoke test against localhost with this branch (Quarto preview):

Cyprus (alt=15212) point-mode display:
  mode: point
  Samples Rendered: 5,000
  Samples in View: 23,422
  phase: 23,422 samples in view (showing 5,000 — zoom in for more). Click one for details.

The 23,422 real count matches the OC investigation's DuckDB result (23,421 at ±0.1°; 23,481 at ±0.13° — explorer's 30%-padded viewport sits between).

url_roundtrip_investigation.js still passes all 6 iterations against localhost, so no regression in #203/#205 URL state work.

Risks

One extra DuckDB-WASM query per point-mode load when capped only. count(*) on a filtered parquet scan is comparable cost to the data fetch but returns a single number — under 100ms on warm cache in typical viewports. Acceptable for the UX win, especially since dense viewports are exactly the case where users were getting wrong information.
ORDER BY pid adds a sort step. Affects performance only in the capped case (sort happens on the result of the WHERE-filtered scan). pid is the primary key on the parquet — sort is on small string keys.
Stat box label change ("Samples Rendered" appears when capped). Could be jarring if a label-stability test exists; none in the current spec set.

Test plan

Smoke: Cyprus deep-link shows real count = 23,422, rendered = 5,000
url_roundtrip_investigation.js — all 6 iterations green against localhost
Manual smoke on deploy preview: Cyprus + a non-capped viewport (small island somewhere with <5k samples)
Codex review (recommended)

🤖 Generated with Claude Code

isamplesorg#206) In point mode, the viewport sample query was capped at POINT_BUDGET = 5000 with no ORDER BY. The "Samples in View" stat box and phase message both displayed `cachedData.length`, which equals the budget in any region dense enough to exceed it. Cyprus deep-link (`#v=1&lat=34.9957&lng=33.6798&alt=15212&mode=point`) showed exactly 5,000 — real count from DuckDB-WASM against the lite parquet: 23,421 samples in the same viewport. Counter understated the density by ~5x. Two changes, both localized to `loadViewportSamples`: 1. **Real count**: after the LIMIT POINT_BUDGET data fetch, run a `count(*)` against the same WHERE clause when we hit the cap. Stat box `sSamples` now displays the true count; `sPoints` shows the rendered count under a "Samples Rendered" label so the user sees both numbers at a glance. Phase message becomes "23,422 samples in view (showing 5,000 — zoom in for more). Click one for details." when capped; unchanged when not. Stale-guard applied between the two queries. The count query only runs when `data.length >= POINT_BUDGET`. Below the cap the displayed and real counts are identical, so a second round-trip would be pure waste. 2. **`ORDER BY pid`** on the data query, so the POINT_BUDGET-worth of rows the LIMIT returns is deterministic across browsers and sessions. Codex's recommendation 4 from the isamplesorg#203 retrospective. Verification: - Smoke test against localhost confirms "Samples Rendered: 5,000 / Samples in View: 23,422" for the Cyprus URL (cf. matching ±0.13° DuckDB count from the OC investigation). - url_roundtrip_investigation.js still passes all 6 iterations against localhost (no regression in isamplesorg#203/isamplesorg#205 URL state work). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ats, stable labels) Three changes from the Codex review of PR isamplesorg#210: 1. **Stale guard in count-query catch**: the `try/catch` around the real-count query could mutate state on a stale request if the count failed AFTER a newer request started. Added `if (myReqId !== requestId) return` at the top of the catch, before the warning log. 2. **Cache-hit stats**: when `isWithinCache(bounds)` short-circuits the fetch, the stat boxes were left whatever they were from the previous load (the original loose behavior, but now noticeable because the count info matters). Cached `totalCount` and `capReached` alongside bounds + data, and re-apply on cache hits. 3. **Always-on stable labels**: replaced the conditional label flip ("Samples in View" ⇄ "Samples Rendered") with stable labels — sPoints always says "Samples Rendered", sSamples always says "Samples in View". When the cap isn't reached, both boxes show the same number; Codex flagged the label flip as more confusing than the redundancy. All four `cachedBounds = null` reset sites also clear the new `cachedTotalCount` / `cachedCapReached` fields so they don't leak between mode transitions. Verified locally: - Cyprus smoke test still reads `Samples Rendered: 5,000 / Samples in View: 23,422`. - url_roundtrip_investigation.js all 6 iterations green. Follow-up issue isamplesorg#211 filed (Codex's other suggestion: tighten count to the visible viewport rather than the 30%-padded fetch box). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee · 2026-05-11T23:28:50Z

Codex review applied (commit `ca8407c`)

Posted the Codex review on this PR. Acted on it. Three changes:

Stale guard in count-query catch. The Codex-flagged race (count fails after a newer request starts → stale catch path mutates state) is closed: if (myReqId !== requestId) return at the top of the catch.
Cache-hit stats. cachedTotalCount + cachedCapReached now cached alongside cachedBounds/cachedData. loadViewportSamples's cache short-circuit now re-applies them to the stat boxes, so panning inside the padded cache window doesn't leave stale figures on screen. All four cachedBounds = null reset sites also clear the new fields.
Always-on stable labels. Dropped the conditional label flip (Samples in View ⇄ Samples Rendered as cap is crossed) in favor of stable labels: sPoints always says Samples Rendered, sSamples always says Samples in View. Redundancy when uncapped is intentional per Codex's preference.

Filed Codex's other suggestion as #211: tighten the count to the visible viewport rather than the 30%-padded fetch box. Out of scope here; folds naturally into #208 if scope grows.

Verified locally:

Cyprus smoke still reads Samples Rendered: 5,000 / Samples in View: 23,422
url_roundtrip_investigation.js all 6 iterations green

Codex review verbatim (for the record):

Race bug in count-query failure path [...] If the count query fails after a newer camera/filter request starts, the stale request can still update cachedBounds, cachedData, points, stats, and phase text. Add the same guard inside the catch before logging/fallback.

Cache-hit stats remain stale [...] Panning inside the padded cached area can leave the previous 'Samples in View' count on screen.

'Samples in View' is still padded-area count [...] If the goal is strict honesty, count against actual bounds; if the goal is only 'not silently capped,' this is acceptable but should probably be called out as a follow-up. [→ explorer: point-mode counts use padded fetch box, not visible viewport #211]

UX: I would prefer always-stable labels: Samples Rendered and Samples in View, even when equal. [✅]

I would not defer this to #208. Add the stale guard in the count catch, then ship. [✅]

Ready to merge from my side.

) (#214) * tests: Playwright spec for explorer URL state round-trip (closes #209) CI-safe spec set extracted from `tests/playwright/url_roundtrip_investigation.js` (which remains as a manual diagnostic). Five tests covering the URL state contract slices fixed in PRs #203, #205, #210, #212: 1. **Point-mode deep-link with `mode=point` enters point mode** — regression for Bug B fix in #203 (when heading is omitted, the URL must still hydrate into point mode). 2. **Low-alt deep-link WITHOUT `mode=point` enters point mode** — regression for #207 item 4 (boot should enter point mode based on altitude alone, not just the URL's mode flag). 3. **Sub-threshold pan updates URL hash via `moveEnd`** — regression for #205 (Cesium's `camera.changed` suppresses small moves below `percentageChanged` threshold; `moveEnd` is the backstop). 4. **URL round-trips across browser contexts** — combined regression for #203 + #205. Context A pans + zooms after settle, captures URL; Context B opens the captured URL and arrives at the same camera + mode within tight tolerance. 5. **`h3` hashchange with invalid cell clears `_globeState.selectedH3`** — regression for #207 item 6 (asymmetry between boot and hashchange null-result branches). Search-result row click selection (#207 item 8) deferred — needs working FTS data and depends on substrate the spec shouldn't have to set up. Tests use `waitForPointModeSettled` (waits for the phase message's "Click one for details." text) instead of `waitForMode` for the no-`mode=point` cases — `waitForMode` can match a transient mode flip during the boot transition (the dual-mode-state anomaly being addressed structurally in #208). `test.setTimeout(360000)` per describe block — cold-cache point-mode boot can take 60–90s per #190, and the round-trip test opens multiple fresh contexts each paying the cost. Verified: all 5 tests pass against localhost in ~1m 24s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tests: address Codex round-1 review on URL roundtrip specs Per Codex review on PR #214: - Replace fixed `waitForTimeout` sleeps with `waitForFunction` polling URL hash in tests 3 (sub-threshold pan) and 4 (cross-context round-trip). Fixes thin timing margin (0.5s) over flight + moveEnd debounce. - Wrap context-creation in try/finally so ctxA/ctxB get closed on mid-test assertion failure (test 4). - Drop unused `before` snapshot in test 3. - Move 6-minute `test.setTimeout(360000)` from describe-level default (180s) to per-test override on the round-trip test only. Single broken test no longer waits 6 minutes to fail in CI. - Add `waitForBootSettled(page)` helper waiting for `viewer._suppressHashWrite === false` — `_globeState.mode` is initialized early (explorer.qmd:871) while the hashchange listener is registered late (explorer.qmd:2210); waitForMode alone isn't enough. - Strengthen test 5 with the boot-settle wait so the hashchange listener exists before driving the test hashchange. - Add comment in `snapshot()` noting `_globeState.selectedH3`/`selectedPid` field coupling. - Update header: change `closes #209` to `addresses #209` (PR explicitly defers several #209 checklist items to a follow-up). Remove the internally-inconsistent "don't try cold-cache deep-links" sentence that contradicted the timeout comment's cold-cache rationale. Kept (with rationale in code comments): - `waitForPointModeSettled` matches "Click one for details" rather than the count phrase Codex suggested — count phrase misses the cap-reached done-state branch (explorer.qmd:1611) which says "samples in view ..." with no "individual" word. "Click one for details" is the common denominator across both done-state branches. - Test 5 h3-hashchange remains a hashchange-driven test rather than the valid→invalid round-trip Codex proposed. The original assertion DOES catch the regression: line 2272 unconditionally overwrites selectedH3 from the URL on hashchange, so the only way the post-handler value is null is the explicit null-result branch at line 2285. Codex's "trivially true" framing missed that line 2272 always fires first. (Tried seeding selectedH3 before the hashchange to make the test even stronger; the seed survived the handler unexpectedly — suspected OJS reactivity around mainModule.value('viewer') returning a different reference than the handler's closure. Documented in test comment.) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * tests: address Codex round-2 review on URL roundtrip specs Codex round 2 sharpened the round-1 concern on Test 5 and was right: booting with `&h3=<invalid>` runs the boot deep-link's own null-result branch (explorer.qmd:2728) which sets `selectedH3=null` BEFORE the test even drives a hashchange. So the post-mutation `selectedH3 === null` wait is satisfied by boot, not by the hashchange handler's null-clear branch at line 2285 (the regression we want to guard). Fix: gate the assertion on a handler-only side effect. The hashchange handler's `camera.flyTo` (explorer.qmd:2220-2227) rotates `camera.heading` to the URL's heading value — only the handler produces this rotation. Wait for heading to reach ~5° before asserting `selectedH3 === null`. This proves the handler executed past line 2272 (which would write a non-null selectedH3 from the URL) AND reached line 2285 (the null-clear). Also addresses Codex's altitude nit on Test 4: `waitForHashLatLng` now takes an optional `alt` so the round-trip assertion checks lat + lng + alt together, matching what `buildHash` actually writes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

rdhyee mentioned this pull request May 11, 2026

explorer: point-mode counts use padded fetch box, not visible viewport #211

Open

rdhyee merged commit 3950827 into isamplesorg:main May 11, 2026
1 check passed

This was referenced May 12, 2026

tests: Playwright spec for explorer URL state round-trip (addresses #209) #214

Merged

explorer: round-trip specs for remaining URL state surfaces (follow-up to #209) #215

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

explorer: report real Samples-in-View count when cap is reached (closes #206)#210

explorer: report real Samples-in-View count when cap is reached (closes #206)#210
rdhyee merged 2 commits into
isamplesorg:mainfrom
rdhyee:fix/206-samples-in-view-honest-count

rdhyee commented May 11, 2026

Uh oh!

rdhyee commented May 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rdhyee commented May 11, 2026

Problem

Fix

Verification

Risks

Test plan

Uh oh!

rdhyee commented May 11, 2026

Codex review applied (commit ca8407c)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Codex review applied (commit `ca8407c`)