Add UK row-wise dataset cloning #137

Merged
MaxGhenis merged 1 commit into main from codex/uk-rowwise-dataset-20260619
Jun 19, 2026

@MaxGhenis

Contributor

Summary

  • add a UK single-year dataset wrapper that clones person, benunit, and household tables with consistent ID remapping
  • assign row-wise OA/LSOA/MSOA/LA/constituency/region geography to cloned households using the Populace-owned row-wise primitives
  • validate person-household and person-benunit links after cloning and support writing a valid PolicyEngine-UK single-year H5
  • export the dataset helper API and add PyTables to the UK extra because HDFStore is required for UK H5 read/write
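
A minimal sketch of the clone-and-validate idea from the summary, not the wrapper added in this PR. Plain lists of dicts stand in for the person, benunit, and household tables so the example has no dependencies. The function names `clone_tables` and `validate_links`, the column names, the per-entity ID stride, and the even split of household weight across clones are all assumptions made for illustration.

```python
from typing import Any, Dict, List, Tuple

Table = List[Dict[str, Any]]


def clone_tables(
    person: Table, benunit: Table, household: Table, n_clones: int
) -> Tuple[Table, Table, Table]:
    """Return n_clones copies of each table with consistent, non-colliding IDs."""
    # One stride per entity, larger than any existing ID, so clone k's IDs
    # (original + k * stride) cannot collide with any other clone's IDs.
    p_stride = max(r["person_id"] for r in person) + 1
    b_stride = max(r["benunit_id"] for r in benunit) + 1
    h_stride = max(r["household_id"] for r in household) + 1

    new_person: Table = []
    new_benunit: Table = []
    new_household: Table = []
    for k in range(n_clones):
        for r in household:
            new_household.append({
                **r,
                "household_id": r["household_id"] + k * h_stride,
                # Assumption: weight is split evenly so the total is unchanged.
                "household_weight": r["household_weight"] / n_clones,
            })
        for r in benunit:
            new_benunit.append({**r, "benunit_id": r["benunit_id"] + k * b_stride})
        for r in person:
            new_person.append({
                **r,
                "person_id": r["person_id"] + k * p_stride,
                # Foreign keys get the same offsets as the tables they point to.
                "person_benunit_id": r["person_benunit_id"] + k * b_stride,
                "person_household_id": r["person_household_id"] + k * h_stride,
            })
    return new_person, new_benunit, new_household


def validate_links(person: Table, benunit: Table, household: Table) -> None:
    """Raise ValueError on duplicate IDs or a person pointing at a missing row."""
    benunit_ids = {r["benunit_id"] for r in benunit}
    household_ids = {r["household_id"] for r in household}
    if len(benunit_ids) != len(benunit) or len(household_ids) != len(household):
        raise ValueError("duplicate benunit or household IDs")
    for r in person:
        if r["person_benunit_id"] not in benunit_ids:
            raise ValueError(f"person {r['person_id']} points at a missing benunit")
        if r["person_household_id"] not in household_ids:
            raise ValueError(f"person {r['person_id']} points at a missing household")
```

Applying the same offset to a table's primary key and to every foreign key that references it is what keeps each cloned person attached to the cloned copy of their own benunit and household rather than to the original.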

Real Populace UK 2023 pilot

Using the current Populace UK 2023 H5 and the official crosswalk sources from #136:

  • base shapes: person 1,157,100; benunit 618,980; household 535,080
  • crosswalk coverage: 650/650 constituencies and 360/360 local authorities covered/sampleable
  • n_clones=1: household weight preserved exactly; 0 blank geographies; 100% FRS region-code match; 650/650 constituencies and 360/360 LAs covered
  • n_clones=2: household weight preserved exactly; 0 blank geographies; 100% FRS region-code match; 650/650 constituencies and 360/360 LAs covered
  • weakest n_clones=2 constituency support: Na h-Eileanan an Iar, 555 rows/source households
  • weakest n_clones=2 LA support: Isles of Scilly, 26 rows/source households; City of London, 80 rows/source households
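
The pilot figures above (weight preservation, blank geographies, coverage, weakest support) could be computed along these lines. This is a hedged sketch rather than the script used for the pilot: the function `summarise_clones`, the `household_weight` column, and the `constituency` key are hypothetical names, and rows are plain dicts to keep the example dependency-free.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List

Table = List[Dict[str, Any]]


def summarise_clones(
    base: Table,
    cloned: Table,
    expected_areas: Iterable[str],
    area_key: str = "constituency",
) -> Dict[str, Any]:
    """Summarise weight drift, blank geographies, coverage, and weakest area."""
    areas = sorted(set(expected_areas))
    base_weight = sum(r["household_weight"] for r in base)
    cloned_weight = sum(r["household_weight"] for r in cloned)

    # Rows per area, ignoring rows whose geography is missing or empty.
    counts = Counter(r[area_key] for r in cloned if r.get(area_key))
    blanks = sum(1 for r in cloned if not r.get(area_key))

    # The area with the fewest cloned rows; ties go to the alphabetically first.
    weakest = min(areas, key=lambda a: counts.get(a, 0))
    return {
        "weight_diff": abs(cloned_weight - base_weight),
        "blank_geographies": blanks,
        "areas_covered": sum(1 for a in areas if counts.get(a, 0) > 0),
        "areas_expected": len(areas),
        "weakest_area": (weakest, counts.get(weakest, 0)),
    }
```

The sketch reports the absolute weight difference instead of a pass/fail flag, since whether a floating-point sum matches exactly depends on the weights and the summation order.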

Checks

  • uv run --frozen --package populace-build --group dev python -m pytest packages/populace-build/tests/test_uk_rowwise_dataset.py packages/populace-build/tests/test_uk_rowwise_geography.py -q
  • uv run --frozen --package populace-build --extra uk --group dev python -m pytest packages/populace-build/tests/test_uk_rowwise_dataset.py -q
  • uv run --frozen --package populace-build --extra uk --group dev python -m pytest packages/populace-build/tests/test_uk_rowwise_dataset.py packages/populace-build/tests/test_uk_rowwise_geography.py packages/populace-build/tests/test_uk_geography_sources.py -q
  • uv run --frozen ruff check .
  • git diff --check
  • uv run --frozen --all-packages --group dev python -m pytest -q
  • env -u UV_FROZEN uv lock --check
  • real Populace UK 2023 1x/2x pilot described above

Review cycle

  • Read-only review found no actionable findings after inspecting the wrapper, tests, exports, dependency update, row-wise primitive contract, and installed UKSingleYearDataset H5/simulation loading contract.

@MaxGhenis MaxGhenis merged commit 23789ba into main Jun 19, 2026
4 checks passed
@MaxGhenis MaxGhenis deleted the codex/uk-rowwise-dataset-20260619 branch June 19, 2026 16:30