This repository was archived by the owner on Jun 19, 2026. It is now read-only.
Calibrate bus fare and bus subsidy spending to DfT totals #431
generate_lcfs_table is unit-tested to compute bus_fare_spending, but nothing checked it survives the QRF predict + enhanced-dataset assembly/save into the published dataset — and it currently doesn't (issue #430): every other consumption output lands, bus_fare_spending is dropped downstream. Add an end-to-end test asserting the enhanced dataset carries a populated bus_fare_spending column. Marked xfail so it is mergeable and documents the gap; it will XPASS once the pipeline is fixed. Refs #430. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Anchor both bus variables to official DfT Annual Bus Statistics (y/e March 2025, England): passenger fare receipts £3.4bn (BUS05aii) and net government support £3.0bn (BUS05bii). Adds calibrate_bus_fare_spending (consumption) and calibrate_bus_subsidy_spending (services), mirroring calibrate_rail_subsidy_spending, called after weight calibration in create_datasets. Unanchored, imputed bus fare inherited the broader transport-consumption over-estimate (~£10bn, ~3x) and bus subsidy drifted low (~£1.5bn). Updates the bus_subsidy_spending smoke target to the official £3.0bn and de-xfails the end-to-end bus_fare_spending dataset test (the column is present in the current release; the earlier "drop" was a stale-file misread, not a pipeline bug). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
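The post-calibration scaling described in this commit can be sketched roughly as follows. This is an illustration only: the real functions operate on the repository's dataset objects, which are not shown here, so the helper names, the year-keyed layout of BUS_FARE_TARGETS, and the sample households are all assumptions.

```python
# Minimal sketch of scaling a column so its weighted total matches a target.
# All names and numbers below are hypothetical stand-ins, not the PR's code.

# England passenger fare receipts from DfT BUS05aii, in GBP.
# Keying the target by year is an assumption about the layout.
BUS_FARE_TARGETS = {2025: 3.4e9}


def weighted_total(values, weights):
    """Sum of per-household value times household weight."""
    return sum(v * w for v, w in zip(values, weights))


def scale_to_target(values, weights, target):
    """Return values rescaled so their weighted total equals the target."""
    actual = weighted_total(values, weights)
    factor = target / actual
    return [v * factor for v in values]


# Made-up imputed annual bus fares per household and calibrated weights.
values = [300.0, 0.0, 900.0]
weights = [1.0e7, 1.2e7, 0.6e7]

scaled = scale_to_target(values, weights, BUS_FARE_TARGETS[2025])
print(weighted_total(values, weights), weighted_total(scaled, weights))
```

Because every household is multiplied by the same factor, the relative distribution of spending across households is unchanged; only the aggregate moves.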
DfT bus-finance figures are England-only; scale to UK by the ONS mid-2023 population ratio (UK 68.3M / England 57.7M ≈ 1.18) as a documented best approximation. Targets: bus fare £3.4bn→~£4.0bn, bus subsidy £3.0bn→~£3.5bn. Indicative (bus use per head varies by nation); refine with Scotland/Wales/NI sources if a direct UK figure is needed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the presence-only check with an active 20% total test for both bus_fare_spending and bus_subsidy_spending against the DfT Annual Bus Statistics targets (England, population-uplifted to UK). Uses the enhanced FRS dataset, which make data builds but make download does not fetch, so the baseline fixture skips it in PR CI and runs it on the post-merge build against the freshly calibrated data (same pattern as test_energy_calibration). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
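A 20% tolerance check of this kind might look like the sketch below. The function names and the dictionary of totals are hypothetical; the actual test reads weighted totals from the enhanced FRS dataset and is skipped when that dataset is unavailable, which this sketch does not reproduce.

```python
# Sketch of a relative-tolerance check against the uplifted UK targets.
# Target values are the approximate figures quoted in the commits (GBP).
BUS_TARGETS_UK = {
    "bus_fare_spending": 4.0e9,
    "bus_subsidy_spending": 3.5e9,
}


def within_tolerance(actual, target, tolerance=0.20):
    """True if actual is within the given relative tolerance of target."""
    return abs(actual / target - 1.0) <= tolerance


def failing_variables(totals):
    """Names of variables whose total misses its target by more than 20%."""
    return [
        name
        for name, target in BUS_TARGETS_UK.items()
        if not within_tolerance(totals[name], target)
    ]


# The uncalibrated totals quoted in the commits would fail both checks.
print(failing_variables({"bus_fare_spending": 10e9, "bus_subsidy_spending": 1.5e9}))
```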
Drop the 'or "<col>" not in dataset.household' guard from calibrate_bus_fare_spending / calibrate_bus_subsidy_spending so they match the rail calibration (if target is None: return None) and fail loudly if the imputed column is unexpectedly absent, rather than silently skipping. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
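The behavioural difference can be illustrated with a small sketch. It assumes the household table behaves like a mapping from column name to values, which is a simplification; the function names calibrate_guarded and calibrate_strict and the "household_weight" column name are invented for the comparison.

```python
# Sketch contrasting the old guarded behaviour with the new strict one.
# A plain dict stands in for the household table (an assumption).

def _scale(household, column, target):
    values = household[column]  # raises KeyError if the column is absent
    weights = household["household_weight"]
    actual = sum(v * w for v, w in zip(values, weights))
    factor = target / actual
    household[column] = [v * factor for v in values]
    return factor


def calibrate_guarded(household, column, target):
    # Old behaviour: a missing column is skipped without any signal.
    if target is None or column not in household:
        return None
    return _scale(household, column, target)


def calibrate_strict(household, column, target):
    # New behaviour: only a missing target is skipped; a missing column
    # propagates as KeyError, so the build fails visibly.
    if target is None:
        return None
    return _scale(household, column, target)
```

With the guard removed, a dataset that unexpectedly lacks the imputed column stops the build instead of shipping uncalibrated.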
What
Anchors both bus variables to official DfT Annual Bus Statistics (year ending March 2025) totals, uplifted England → UK by population, and adds an end-to-end test that bus fare reaches the dataset.
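Targets (England figure → UK after the ~1.18 population uplift):
- bus_fare_spending: £3.4bn → ~£4.0bn (passenger fare receipts, BUS05aii)
- bus_subsidy_spending: £3.0bn → ~£3.5bn (net government support, BUS05bii)
Sources: DfT Annual Bus Statistics (tables BUS05aii and BUS05bii); ONS mid-2023 population estimates.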
Why
Both bus variables were imputed but unanchored (no calibration target, not in uprating_factors.csv), so they drifted badly in the dataset:
- bus_fare_spending: ~£10bn (~3× the DfT total)
- bus_subsidy_spending: ~£1.5bn (drifted low)
The fare over-estimate came from inheriting a broader transport-consumption inflation (transport_consumption itself ~2.6× high); the subsidy drifted low.
How
Mirrors the existing calibrate_rail_subsidy_spending pattern: post-calibration scaling that computes the actual weighted total and scales the column to the target.
- calibrate_bus_fare_spending (consumption.py) + BUS_FARE_TARGETS
- calibrate_bus_subsidy_spending (services.py) + BUS_SUBSIDY_TARGETS
- Both are called in create_datasets.py alongside the rail/fuel calibration.
Coverage / uplift caveat
DfT publishes bus finance for England only. There's no single official GB/UK total, so I scale England → UK by the ONS population ratio (~1.18) as a documented best approximation. It's indicative: bus use per head varies by nation (London lifts England's per-capita use), so the true UK factor is likely a touch below the population ratio. Can be refined with Transport Scotland / StatsWales / DfI NI figures if a direct UK total is wanted.
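The uplift arithmetic can be checked with a few lines. The population figures (UK 68.3M, England 57.7M, ONS mid-2023) and the England targets are the ones quoted in the commit messages; the variable names are illustrative.

```python
# Population-ratio uplift of the England targets to approximate UK targets.
UK_POPULATION_M = 68.3
ENGLAND_POPULATION_M = 57.7

# England totals from DfT Annual Bus Statistics, in GBP billions.
ENGLAND_TARGETS_BN = {
    "bus_fare_spending": 3.4,     # BUS05aii passenger fare receipts
    "bus_subsidy_spending": 3.0,  # BUS05bii net government support
}

uplift = UK_POPULATION_M / ENGLAND_POPULATION_M  # roughly 1.18

uk_targets_bn = {
    name: value * uplift for name, value in ENGLAND_TARGETS_BN.items()
}

for name, value in uk_targets_bn.items():
    print(f"{name}: {value:.2f}bn")
```

The unrounded ratio puts the subsidy target slightly above £3.5bn, while the rounded factor of 1.18 gives £3.54bn. Since the factor is itself indicative, that difference is well within the approximation.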
Tests
- bus_subsidy_spending smoke target set to the uplifted ~£3.5bn.
- bus_fare_spending smoke target recorded (commented); enable once a calibrated dataset is published (the released dataset predates this).
- bus_fare_spending is present in the current release (enhanced_frs_2024_25.h5). The earlier "missing" was a stale-file misread (enhanced_frs_2023_24.h5), not a pipeline drop (issue #430, "bus_fare_spending dropped between LCFS imputation and the published enhanced dataset", closed with that correction).
Scaling is deterministic, so calibrated totals hit the targets by construction; takes effect on the next dataset rebuild.
🤖 Generated with Claude Code