Monitor Your Data Quality With Lakehouse Monitoring

What You’ll Learn

Lakehouse Monitoring allows you to easily profile, diagnose, and enforce quality directly in the Databricks Data Intelligence Platform. Without any additional tools or complexity, Lakehouse Monitoring helps teams proactively discover quality issues before downstream processes are impacted.

This tutorial shows how easy it is to create a monitor on any table in Unity Catalog and get insights into data trends and anomalies. It illustrates a retail use case monitoring transaction data and walks through best practices for configuring a monitor (a short programmatic sketch follows the list below). At the end of the demo, you will get an auto-generated dashboard that surfaces quality issues and anomalies in:

  • Data volume
  • Data integrity
  • Numerical distribution change
  • Categorical distribution change
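
The demo configures the monitor through the Unity Catalog UI, but the same setup can be done programmatically. Below is a minimal sketch using the Databricks SDK for Python: the table name, timestamp column, and assets directory are hypothetical, and the quality_monitors API and MonitorTimeSeries class assume a recent databricks-sdk release, so check the exact names against the product documentation for your SDK version.

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import MonitorTimeSeries

w = WorkspaceClient()

# Create a time-series monitor on a hypothetical transactions table so that volume,
# integrity, and distribution metrics are computed per daily window.
w.quality_monitors.create(
    table_name="main.dbdemos_lhm.transactions",      # hypothetical monitored table
    assets_dir="/Workspace/Shared/lhm_monitoring",   # where the monitor's dashboard assets are stored
    output_schema_name="main.dbdemos_lhm",           # schema that receives the metric tables
    time_series=MonitorTimeSeries(
        timestamp_col="transaction_date",            # hypothetical event-time column
        granularities=["1 day"],
    ),
)

Configuring the monitor as a time series with a daily granularity is what lets the dashboard compare consecutive windows and flag volume or distribution changes over time.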

Because Lakehouse Monitoring is built into Unity Catalog, this demo also shows how quality metrics combined with metadata and lineage let you perform root cause analysis and impact analysis.

For more information on Lakehouse Monitoring, please see our product documentation (AWS | Azure) to get started today.
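
Beyond the auto-generated dashboard, a monitor also writes its metrics to Delta tables in the output schema you choose, following the <table>_profile_metrics and <table>_drift_metrics naming convention described in the documentation. Here is a minimal sketch of querying them from a Databricks notebook, reusing the hypothetical transactions table from the sketch above; the column names follow the documented schema but may vary by release.

# Profile metrics hold per-window summary statistics (row counts, null counts, ...).
profile = spark.table("main.dbdemos_lhm.transactions_profile_metrics")
display(profile.select("window", "column_name", "count", "num_nulls").limit(20))

# Drift metrics compare windows against each other and back the distribution-change views.
drift = spark.table("main.dbdemos_lhm.transactions_drift_metrics")
display(drift.limit(20))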

 

To install the demo, get a free Databricks workspace and run the following commands in a Python notebook:

%pip install dbdemos
import dbdemos
# Install the demo assets into the chosen Unity Catalog catalog and schema
dbdemos.install('lakehouse-monitoring', catalog='main', schema='dbdemos_lhm')

Dbdemos is a Python library that installs complete Databricks demos in your workspaces. Dbdemos will load and start notebooks, Delta Live Tables pipelines, clusters, Databricks SQL dashboards, and warehouse models … See how to use dbdemos

 

Dbdemos is distributed as a GitHub project.

For more details, please view the GitHub README.md file and follow the documentation.
Dbdemos is provided as is. See the License and Notice for more information.
Databricks does not offer official support for dbdemos and the associated assets.
For any issue, please open a ticket and the demo team will have a look on a best-effort basis.