Add UK rowwise dataset build driver #143

Merged: MaxGhenis merged 1 commit into main from codex/uk-rowwise-build-driver-20260619 on Jun 19, 2026
@MaxGhenis

Contributor

Summary

  • add tools/build_uk_rowwise_dataset.py to build a row-wise UK local-geography H5 from a compact Populace UK H5
  • support generated or supplied official geography crosswalks, optional constituency/LA coverage checks, manifest hashes, weight-preservation diagnostics, and local-support summaries
  • document the local build command in packages/populace-build/README.md
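
The manifest hashes mentioned above could be produced by streaming each input and output file through SHA-256. The following is a minimal sketch only: `file_sha256`, `build_manifest`, and the manifest layout are hypothetical names chosen for illustration and are not taken from tools/build_uk_rowwise_dataset.py.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Map each file name to its size and hash (hypothetical manifest layout)."""
    return {
        Path(p).name: {
            "bytes": Path(p).stat().st_size,
            "sha256": file_sha256(p),
        }
        for p in paths
    }
```

Chunked reading matters here because the row-wise output can run to several gigabytes, so loading the whole file into memory just to hash it would be wasteful.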

Real build smoke

Using the current compact UK 2023 H5, a supplied official crosswalk, and --n-clones 2, the driver produced a 2.62 GB row-wise H5 with:

  • person rows: 2,314,200; benunit rows: 1,237,960; household rows: 1,070,160
  • household weight delta: 0.0
  • missing geography rows: 0
  • constituency coverage: 650/650; local-authority coverage: 360/360 PE target codes
  • assigned geographies: 650 constituencies and 361 local authorities, including the extra NI LA present in the official source data
  • duplicate source-household/constituency pairs: 0
  • weakest support: 551 household rows in the least-supported constituency; 35 in the least-supported local authority
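
The diagnostics listed above can be illustrated with a small pure-Python sketch. It makes several assumptions that are not confirmed by the PR: that cloning splits each source household's weight across its clones, that "weakest support" means the minimum number of household rows assigned to any one geography, and that rows carry the hypothetical keys `source_household_id`, `weight`, and `constituency`. The function name `rowwise_diagnostics` is likewise illustrative.

```python
from collections import Counter


def rowwise_diagnostics(source_weights, rows, target_codes):
    """Summarise a row-wise build (hypothetical helper, not the driver's API).

    source_weights: {source_household_id: weight} from the compact dataset.
    rows: dicts with keys source_household_id, weight, constituency
          (constituency is None when no geography was assigned).
    target_codes: set of constituency codes that must be covered.
    """
    assigned = [r for r in rows if r["constituency"] is not None]
    pairs = Counter((r["source_household_id"], r["constituency"]) for r in assigned)
    support = Counter(r["constituency"] for r in assigned)
    return {
        # Zero when the clones together carry exactly the source weight.
        "household_weight_delta": sum(r["weight"] for r in rows)
        - sum(source_weights.values()),
        "missing_geography_rows": len(rows) - len(assigned),
        "coverage": (len(target_codes & set(support)), len(target_codes)),
        # Extra occurrences of the same (household, constituency) pair.
        "duplicate_pairs": sum(n - 1 for n in pairs.values()),
        "weakest_support": min(support.values()) if support else 0,
    }


# Two source households, each cloned twice with its weight split across clones.
example = rowwise_diagnostics(
    source_weights={1: 100.0, 2: 50.0},
    rows=[
        {"source_household_id": 1, "weight": 60.0, "constituency": "A"},
        {"source_household_id": 1, "weight": 40.0, "constituency": "B"},
        {"source_household_id": 2, "weight": 25.0, "constituency": "A"},
        {"source_household_id": 2, "weight": 25.0, "constituency": "B"},
    ],
    target_codes={"A", "B"},
)
```

In this toy case the weight delta is 0.0, no rows lack a geography, coverage is 2 of 2, there are no duplicate pairs, and the weakest support is 2 rows. The same checks run over local-authority codes would follow the same pattern with a different key.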

Checks

  • uv run ruff check tools/build_uk_rowwise_dataset.py packages/populace-build/tests/test_uk_rowwise_build_driver.py
  • uv run ruff format --check tools/build_uk_rowwise_dataset.py packages/populace-build/tests/test_uk_rowwise_build_driver.py
  • git diff --check
  • env -u UV_FROZEN uv lock --check
  • uv run --project packages/populace-build --extra uk python -m pytest packages/populace-build/tests/test_uk_rowwise_build_driver.py -q
  • uv run --project packages/populace-build --extra uk python -m pytest packages/populace-build/tests/test_uk_rowwise_build_driver.py packages/populace-build/tests/test_uk_rowwise_dataset.py packages/populace-build/tests/test_uk_rowwise_geography.py -q
  • uv run --all-packages python -m pytest -q

Review cycle: final read-only review reported no actionable findings.

MaxGhenis force-pushed the codex/uk-rowwise-build-driver-20260619 branch from 59004aa to 27555a7 on June 19, 2026 18:29
MaxGhenis merged commit 2b0ac2e into main on Jun 19, 2026
4 checks passed
MaxGhenis deleted the codex/uk-rowwise-build-driver-20260619 branch on June 19, 2026 18:34