Skip to content

Bug: INDIRECT_NON_BREAKING snapshots in dev environments write to a table nobody reads, causing repeated re-backfills #5793

@gandeevanraghuramandd

Description

@gandeevanraghuramandd

When a snapshot is categorized as INDIRECT_NON_BREAKING, dev environment runs (sqlmesh run dev) repeatedly backfill the same intervals on every invocation. Completed work is never recognized.

The root cause is an inconsistency in how three subsystems treat INDIRECT_NON_BREAKING snapshots. These snapshots are is_representative=True but is_deployable=False, and different parts of the codebase use different predicates:

Decision Predicate used Result Effect
Which table does the dev view point to? is_representative points to prod table i.e sqlmesh__<schema>.<model>__<version> View reads from prod table
Which intervals are checked for missing work? is_representative intervals (prod ledger) Missing work computed against prod
Where is the evaluated data written? is_deployable sqlmesh__<schema>.<model>__<dev_version>__dev (dev table, __dev suffix) Data written to dev table
Where is completed work recorded? is_deployable dev_intervals (dev ledger) Completions written to dev ledger

Because missing-interval detection reads intervals but successful evaluation writes to dev_intervals, completed work is never visible to the next run.

Reproduction

  1. Have a model B that depends on model A.
  2. Make a non-breaking change to A and apply a plan. B gets categorized as INDIRECT_NON_BREAKING.
  3. Both prod and dev environments now point to the same snapshot of B.
  4. Run sqlmesh run dev --ignore-cron. It backfills some intervals for B.
  5. Run sqlmesh run dev --ignore-cron again immediately. It backfills the same intervals.
  6. Repeat indefinitely -- same result every time.

Code references

INDIRECT_NON_BREAKING is unconditionally non-deployable - DeployabilityIndex.create() in sqlmesh/core/snapshot/definition.py:

if (
    snapshot.is_forward_only
    or snapshot.is_indirect_non_breaking   # <-- unconditional
    or is_forward_only_model
    or has_auto_restatement
    or not is_valid_start
):
    this_deployable = False

But it is marked as representative - same method, immediately below:

if not snapshot.is_paused or (
    snapshot.is_indirect_non_breaking and snapshot.intervals
):
    representative_shared_version_ids.add(node)

Missing-interval detection uses is_representative - Snapshot.missing_intervals() in sqlmesh/core/snapshot/definition.py:

intervals = (
    self.intervals if deployability_index.is_representative(self) else self.dev_intervals
)

Interval recording uses is_deployable - PlanEvaluator in sqlmesh/core/plan/evaluator.py:

is_deployable = deployability_index.is_deployable(snapshot)
# ...
intervals=intervals if is_deployable else [],
dev_intervals=intervals if not is_deployable else [],

View promotion uses is_representative - SnapshotEvaluator._promote_snapshot() in sqlmesh/core/snapshot/evaluator.py:

table_name = snapshot.table_name(deployability_index.is_representative(snapshot))

Impact

  • Dev runs are never idempotent for affected models. Every run re-processes the same intervals.
  • The dev table (__dev suffix) receives data that no view ever points to. The work is wasted.
  • CI pipelines using sqlmesh run dev will always perform full backfills regardless of prior runs.
  • This is not version-specific; the same code pattern exists across multiple SQLMesh releases.

Expected behavior

For a given snapshot, these three decisions should be consistent:

  1. The table the dev view points to,
  2. The interval ledger checked for missing work, and
  3. The interval ledger updated after successful work

should all refer to the same physical table (either prod or dev), not a mix of both.

Environment

  • SQLMesh version: 0.234.1 (also confirmed in 0.231.1 source)
  • Engine: DuckDB (but the issue is engine-independent; it's in the core snapshot/plan logic)

Note: AI was used to both triage this bug and to document the issue. However, the code paths and conclusions were manually inspected and validated.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions