databricks-solutions/lakeflow_framework

Databricks Lakeflow Framework

Documentation | Sample Data Bundles

Project Description

The Lakeflow Framework is a metadata-driven framework for building Databricks Lakeflow Spark Declarative Pipelines. It uses a configuration-driven, pattern-based approach to support both batch and streaming workloads across the medallion architecture.

The framework supports centralized and domain-oriented operating models, and accommodates multiple modelling paradigms (including dimensional, Data Vault, and enterprise canonical models). It is designed for simplicity, performance, maintainability, and extensibility as the Databricks product evolves.

Why use Lakeflow Framework

  • Configuration-driven, pattern-based pipeline delivery with reusable implementation patterns
  • Support for batch and streaming pipelines across Bronze/Silver/Gold, aligned to your chosen modelling pattern
  • Flexible for centralized and domain-oriented operating models

Quick start

git clone https://github.com/databricks-solutions/lakeflow_framework.git
cd lakeflow_framework
pip install -r requirements-dev.txt

Then:

  1. Open the hosted docs: https://databricks-solutions.github.io/lakeflow_framework/
  2. Deploy the framework using the Deploy Framework guide
  3. Deploy samples from samples/ using the documentation walkthroughs
  4. Build your first pipeline bundle using the Build a Pipeline Bundle guide

Prerequisites

  • Access to a Databricks workspace
  • Databricks CLI installed and configured
  • Python environment with project dependencies installed
  • Familiarity with Databricks Lakeflow Spark Declarative Pipelines concepts
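
The command-line prerequisites above can be checked with a short script before starting. This is a minimal sketch and not part of the framework: the helper name check_cmd is hypothetical, and the command names git, python3, and databricks are assumptions about how the tools are installed on your machine (for example, Python may be exposed as python rather than python3). It only confirms the commands are on PATH; it does not verify workspace access or that the CLI is configured.

```shell
#!/usr/bin/env sh
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed command names; adjust to match your environment.
missing=0
for cmd in git python3 databricks; do
  check_cmd "$cmd" || missing=$((missing + 1))
done
echo "$missing prerequisite command(s) missing"
```

If a command is reported missing, install or configure it before running the quick start steps.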

Repository structure

  • docs/ - Sphinx documentation and versioned docs build tooling
  • samples/ - example framework and pipeline bundles
  • src/ - framework source code and runtime components

Version compatibility

This project tracks Databricks Lakeflow Spark Declarative Pipelines capabilities and evolves with platform changes. Validate runtime, feature, and API compatibility against your target Databricks workspace and the latest project documentation before production rollout.

Project status and support

The framework is actively maintained. Databricks support does not cover this repository; help is provided on a best-effort basis through GitHub issues.

Releases and changelog

Documentation

Please refer to the documentation for further details and an explanation of the samples. The documentation needs to be deployed as HTML or Markdown within your org before it can be used.

Local docs development (optional)

pip install -r requirements-docs.txt
make -C docs html

How to get help

Databricks support doesn't cover this content. For questions or bugs, please open a GitHub issue and the team will help on a best-effort basis.

License

© 2025 Databricks, Inc. All rights reserved. The source in this notebook is provided subject to the Databricks License [https://databricks.com/db-license-source]. All included or referenced third party libraries are subject to the licenses set forth below.

About

Metadata-driven framework for Databricks Spark Declarative Pipelines. Config-driven, pattern-based approach to batch and streaming across the medallion architecture. Deploys via Declarative Automation Bundles. Built for simplicity, extensibility, and alignment with the Databricks product roadmap.