CDC Pipeline With Delta Live Tables
Demo Type: Product Tutorial
Duration: Self-paced
What you’ll learn
This demo highlights how Delta Live Tables simplifies CDC (change data capture).
CDC is typically done by ingesting changes from external systems (ERP, SQL databases) with tools such as Fivetran or Debezium.
In this demo, we’ll show you how to re-create your table by consuming CDC information.
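To make the idea concrete, here is a minimal plain-Python sketch of what re-creating a table from CDC information means. It is not DLT code and not the demo's implementation: the event fields (`id`, `operation`, `sequence`, `data`) and the function name `apply_cdc` are assumptions chosen for illustration.

```python
from typing import Any


def apply_cdc(events: list[dict[str, Any]]) -> dict[Any, Any]:
    """Rebuild the current table state from a list of CDC events.

    Assumed (hypothetical) event shape:
      id        - primary key of the row
      operation - "APPEND", "UPDATE" or "DELETE"
      sequence  - value that orders changes for a given key
      data      - row payload (ignored for deletes)
    """
    table: dict[Any, Any] = {}
    latest_seq: dict[Any, int] = {}
    for event in events:
        key, seq = event["id"], event["sequence"]
        # Skip events older than the last change already applied for this key.
        if key in latest_seq and seq <= latest_seq[key]:
            continue
        latest_seq[key] = seq
        if event["operation"] == "DELETE":
            table.pop(key, None)
        else:
            table[key] = event["data"]
    return table


events = [
    {"id": 1, "operation": "APPEND", "sequence": 1, "data": {"name": "Ann"}},
    {"id": 2, "operation": "APPEND", "sequence": 2, "data": {"name": "Bob"}},
    {"id": 1, "operation": "UPDATE", "sequence": 4, "data": {"name": "Anna"}},
    {"id": 1, "operation": "UPDATE", "sequence": 3, "data": {"name": "Annie"}},  # arrives late
    {"id": 2, "operation": "DELETE", "sequence": 5, "data": None},
]
current_table = apply_cdc(events)
```

Tracking the last applied sequence per key, including for deletes, is what lets a late event be ignored instead of overwriting newer data.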
We’ll also implement an SCD2 (Slowly Changing Dimension table of type 2). This can be really tricky to implement when data arrives out of order, but DLT makes it simple with just one keyword.
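As an illustration of what an SCD2 table contains, the sketch below builds the history in plain Python. It does not call DLT; in the DLT Python API this behaviour is selected through the `stored_as_scd_type` argument of `dlt.apply_changes`. The event shape and the `start_at` / `end_at` column names used here are assumptions for the example only.

```python
from typing import Any


def build_scd2(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return one history row per version of each key (SCD type 2).

    Assumed (hypothetical) event shape: id, operation, sequence, data.
    A row with end_at=None is the currently active version.
    """
    by_key: dict[Any, list[dict[str, Any]]] = {}
    for event in events:
        by_key.setdefault(event["id"], []).append(event)

    history: list[dict[str, Any]] = []
    for key, key_events in by_key.items():
        # Sorting by sequence makes the arrival order irrelevant.
        ordered = sorted(key_events, key=lambda e: e["sequence"])
        for current, following in zip(ordered, ordered[1:] + [None]):
            if current["operation"] == "DELETE":
                continue  # a delete only closes the previous version
            history.append({
                "id": key,
                "data": current["data"],
                "start_at": current["sequence"],
                "end_at": following["sequence"] if following else None,
            })
    return history


# Events deliberately listed out of order.
out_of_order = [
    {"id": 1, "operation": "UPDATE", "sequence": 3, "data": {"city": "Lyon"}},
    {"id": 1, "operation": "APPEND", "sequence": 1, "data": {"city": "Paris"}},
    {"id": 1, "operation": "DELETE", "sequence": 7, "data": None},
]
scd2_rows = build_scd2(out_of_order)
```

Each version is closed by the sequence value of the change that follows it, so the history is the same whatever order the events arrive in.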
Finally, we’ll show you how to programmatically scan multiple incoming folders and trigger N streams (one for each CDC table), leveraging DLT with Python.
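The folder-scanning pattern can be sketched without DLT as follows. The landing layout (one sub-folder per CDC table), the `_cdc` suffix and all function names are assumptions for illustration; in a real pipeline the body of `stream` would declare a DLT table reading that folder.

```python
import tempfile
from pathlib import Path
from typing import Callable


def discover_cdc_tables(root: Path) -> list[str]:
    """Return one table name per sub-folder of the landing directory."""
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def make_stream(root: Path, table: str) -> Callable[[], tuple[Path, str]]:
    """Factory that binds `table` at creation time.

    Defining the inner function directly inside a loop would make every
    stream see the last value of the loop variable (late binding).
    """
    def stream() -> tuple[Path, str]:
        # Placeholder for the real work: (source folder, target table name).
        return root / table, f"{table}_cdc"
    return stream


def build_streams(root: Path) -> dict[str, Callable[[], tuple[Path, str]]]:
    """Create one stream definition per discovered folder."""
    return {table: make_stream(root, table) for table in discover_cdc_tables(root)}


# Usage with a temporary landing directory.
with tempfile.TemporaryDirectory() as tmp:
    landing = Path(tmp)
    (landing / "customers").mkdir()
    (landing / "transactions").mkdir()
    stream_names = list(build_streams(landing))
```

Using a factory function rather than a closure written inline in the loop keeps each of the N streams tied to its own folder.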
To install the demo, get a free Databricks workspace and execute the following two commands in a Python notebook:
%pip install dbdemos
import dbdemos
dbdemos.install('dlt-cdc')
Dbdemos is a Python library that installs complete Databricks demos in your workspaces. Dbdemos will load and start notebooks, Delta Live Tables pipelines, clusters, Databricks SQL dashboards, warehouse models … See how to use dbdemos
Dbdemos is distributed as a GitHub project.
For more details, please view the GitHub README.md file and follow the documentation.
Dbdemos is provided as is. See the License and Notice for more information.
Databricks does not offer official support for dbdemos and the associated assets.
For any issue, please open a ticket and the demo team will have a look on a best-effort basis.