feat(lake): per-environment retention policy settings #510
Merged
Lake hot/cold retention windows were hardcoded defaults (7/90) with no way to change them short of a raw DB write. The LakeRetentionPolicy model and the daily retention sweep already supported per-dataset windows; this wires a settings surface to drive them.

- lake-retention-policy.ts: manage a dedicated per-env policy (named __env:<environmentId>), attaching every dataset in the environment. coldDays is the delete horizon enforced per dataset by the sweep; hotDays is the hot->cold move stored for the shared table TTL. Bounds + validation included.
- environment router: getLakeRetention (VIEWER) / setLakeRetention, clearLakeRetention (ADMIN, audited) env-scoped procedures.
- upsertLakeDataset: a new dataset inherits its environment's policy so the sweep enforces it without a manual re-save.
- UI: Lake retention card in the environment Lake Storage tab, beside the cold-tier bucket config.

Tests: service unit tests + router gating/audit/behaviour tests; existing lake-catalog suite updated for the new policy lookup.
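For reviewers who want the shape of the change at a glance, here is a minimal sketch of the policy naming and the inherit-on-create step. Only the __env:<environmentId> naming convention comes from this PR. The in-memory store and the helper names (envPolicyName, setEnvPolicy, createDataset) are hypothetical stand-ins, not the actual service or database calls in lake-retention-policy.ts.

```typescript
// Hypothetical in-memory stand-ins for the real models; illustrative only.
interface RetentionPolicy {
  name: string;
  hotDays: number;
  coldDays: number;
}

interface Dataset {
  id: string;
  environmentId: string;
  policyName: string | null;
}

const policies = new Map<string, RetentionPolicy>();
const datasets: Dataset[] = [];

// Naming convention from this PR: one dedicated policy per environment.
function envPolicyName(environmentId: string): string {
  return `__env:${environmentId}`;
}

// Upsert the environment policy and attach every dataset in that environment.
// Returns the number of datasets attached (hypothetical return shape).
function setEnvPolicy(environmentId: string, hotDays: number, coldDays: number): number {
  const name = envPolicyName(environmentId);
  policies.set(name, { name, hotDays, coldDays });
  let attached = 0;
  for (const d of datasets) {
    if (d.environmentId === environmentId) {
      d.policyName = name;
      attached++;
    }
  }
  return attached;
}

// A newly created dataset inherits its environment's policy when one exists,
// so no manual re-save is needed before the sweep picks it up.
function createDataset(id: string, environmentId: string): Dataset {
  const name = envPolicyName(environmentId);
  const dataset: Dataset = {
    id,
    environmentId,
    policyName: policies.has(name) ? name : null,
  };
  datasets.push(dataset);
  return dataset;
}
```

In this sketch a dataset created before the policy exists stays unattached until setEnvPolicy runs, while one created afterwards is attached immediately.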
Summary
Lake hot/cold retention windows were hardcoded defaults (7 hot / 90 cold days) with no way to change them short of a raw DB write. The LakeRetentionPolicy model and the daily retention sweep already supported per-dataset windows; this PR wires a settings surface to drive them, placed beside the existing cold-tier bucket config in Environment → Lake Storage.

What changed
- lake-retention-policy.ts (new): manages a dedicated per-environment policy (__env:<environmentId>) and keeps every dataset in the environment attached to it. Includes bounds (1–3650 days) + validation (coldDays >= hotDays).
- environment router: getLakeRetention (VIEWER), setLakeRetention / clearLakeRetention (ADMIN, audited, demo-gated, system-env rejected).
- upsertLakeDataset: a newly created dataset inherits its environment's policy, so the sweep enforces it without a manual re-save.

Enforcement semantics (intentional asymmetry)
- coldDays (delete horizon): enforced per dataset by the sweepLakeRetention() DELETE, which works at runtime with no DDL. Can shorten retention below the shared lake_events table default.
- hotDays (hot->cold move): stored for the shared table TTL, … effectiveRetention's clamp and surfaced in the UI; a per-dataset move is not issued here. The UI copy states this.

No schema change (the model already existed) → no migration.
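The bounds, the ordering check, and the coldDays delete horizon described above can be sketched as follows. This is a sketch under assumptions: validateRetention and coldCutoff are hypothetical names, the error strings and the integer requirement are invented for illustration, and the real sweepLakeRetention() issues a DELETE rather than computing a date in memory.

```typescript
// Bounds stated in this PR: 1 to 3650 days.
const MIN_DAYS = 1;
const MAX_DAYS = 3650;

interface RetentionWindows {
  hotDays: number;
  coldDays: number;
}

// Returns a list of validation problems; an empty list means the input is acceptable.
// Assumption: whole days only (the PR does not say whether fractions are allowed).
function validateRetention({ hotDays, coldDays }: RetentionWindows): string[] {
  const errors: string[] = [];
  const fields = [
    ["hotDays", hotDays],
    ["coldDays", coldDays],
  ] as const;
  for (const [label, value] of fields) {
    if (!Number.isInteger(value) || value < MIN_DAYS || value > MAX_DAYS) {
      errors.push(`${label} must be an integer between ${MIN_DAYS} and ${MAX_DAYS}`);
    }
  }
  // Inverted windows are rejected: data cannot be deleted before it would go cold.
  if (coldDays < hotDays) {
    errors.push("coldDays must be >= hotDays");
  }
  return errors;
}

// Delete horizon: rows older than the returned instant would be eligible for deletion.
function coldCutoff(now: Date, coldDays: number): Date {
  const msPerDay = 24 * 60 * 60 * 1000;
  return new Date(now.getTime() - coldDays * msPerDay);
}
```

For example, validateRetention({ hotDays: 30, coldDays: 7 }) returns a single error for the inverted window, which is the case the router tests expect to surface as BAD_REQUEST.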
Test plan
- lake-retention-policy.test.ts: service unit tests (get/set/clear/resolve, validation, attach counts)
- environment-lake-retention.test.ts: router gating (VIEWER/ADMIN), audit wiring, NOT_FOUND, system-env reject, BAD_REQUEST on inverted window
- lake-catalog suite updated for the new policy lookup on dataset create