What is Delta Live Tables?
Delta Live Tables (DLT) is the first ETL framework to use a simple declarative approach for building reliable data pipelines while automatically managing infrastructure at scale, so data analysts and engineers can spend less time on tooling and more time getting value from data. With DLT, engineers treat their data as code and apply modern software engineering best practices like testing, error handling, monitoring and documentation to deploy reliable pipelines at scale.
Accelerate ETL development
DLT natively enables modern software engineering best practices: developing in environments separate from production, testing before deployment, deploying and managing environments through parameterization, unit testing and documentation. As a result, you can simplify the development, testing, deployment, operation and monitoring of ETL pipelines with first-class constructs for expressing transformations, CI/CD, SLAs and quality expectations, and for seamlessly handling batch and streaming in a single API.
Automatically manage your infrastructure
DLT was built from the ground up to automatically manage your infrastructure and to automate complex and time-consuming activities. You set the minimum and maximum number of instances, and DLT sizes the compute cluster up or down according to utilization. In addition, tasks like orchestration, error handling and recovery are all done automatically — as is performance optimization. With DLT, you can focus on data transformation instead of operations.
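The clamping behavior described above can be sketched as a small sizing rule. This is a hypothetical illustration, not Databricks' actual enhanced-autoscaling algorithm: pick a target worker count from current utilization, then clamp it to the user-configured bounds (as DLT does with its min/max worker pipeline settings).

```python
import math

def target_workers(current_workers, utilization, min_workers, max_workers):
    """Scale proportionally to utilization (1.0 = fully busy), aiming
    for roughly 70% busy, then clamp to the configured min/max."""
    desired = math.ceil(current_workers * utilization / 0.7)
    return max(min_workers, min(desired, max_workers))

print(target_workers(4, 0.9, 2, 8))  # 6: busy cluster, scale up
print(target_workers(4, 0.2, 2, 8))  # 2: idle cluster, shrink to the floor
```

The 70% target and proportional rule are illustrative assumptions; the point is that the user supplies only the bounds, and the system owns the sizing decision.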
Have confidence in your data
Deliver reliable data with built-in quality controls, testing, monitoring and enforcement to ensure accurate and useful BI, data science and ML. DLT makes it easy to create trusted data sources by including first-class support for data quality management and monitoring tools. It helps prevent bad data from flowing into tables, tracks data quality over time and provides tools to troubleshoot bad data with granular pipeline observability. This gives you a high-fidelity lineage diagram of your pipeline and lets you track dependencies and aggregate data quality metrics across all of your pipelines.
Simplify batch and streaming
Optimize price/performance by providing up-to-date, self-optimized data for apps and by creating auto-scaling data pipelines for batch or streaming processing. Unlike other products that force you to handle streaming and batch workloads separately, DLT supports any type of data workload with a single API, so data engineers and analysts alike can build cloud-scale data pipelines faster and without needing advanced data engineering skills.
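A toy way to picture the single-API claim (in DLT this is one declaration over Spark batch and Structured Streaming sources; the names below are invented for illustration): the transformation is written once and applied unchanged to a finite batch or to an unbounded, lazily consumed stream.

```python
def transform(record):
    """The pipeline logic, written exactly once."""
    return {**record, "amount_usd": record["amount_cents"] / 100}

def run_batch(records):
    # Finite input: materialize the whole result.
    return [transform(r) for r in records]

def run_stream(record_iter):
    # Unbounded input: consume lazily, one record at a time.
    for r in record_iter:
        yield transform(r)

batch = run_batch([{"amount_cents": 250}])
stream = run_stream(iter([{"amount_cents": 999}]))
print(batch[0]["amount_usd"], next(stream)["amount_usd"])  # 2.5 9.99
```

Only the execution mode differs; the declared logic does not, which is what lets one team maintain one pipeline for both cases.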
Meet regulatory requirements
Capture all information about your table for analysis and auditing automatically with the event log. Understand how data flows through your organization and meet compliance requirements.
Simplify data pipeline deployment and testing
With different copies of data isolated and updated through a single code base, data lineage information can be captured and used to keep data fresh anywhere, so the same set of query definitions can run in development, staging and production.
Reduce operational complexity with unified batch and streaming
Build and run both batch and streaming pipelines in one place with controllable and automated refresh settings, saving time and reducing operational complexity.
“At ADP, we are migrating our human resource management data to an integrated data store on the lakehouse. Delta Live Tables has helped our team build in quality controls, and because of the declarative APIs, support for batch and real-time using only SQL, it has enabled our team to save time and effort in managing our data.”
– Jack Berkowitz, CDO, ADP
“At Shell, we are aggregating all our sensor data into an integrated data store. Delta Live Tables has helped our teams save time and effort in managing data at [the multi-trillion-record scale] and continuously improving our AI engineering capability. With this capability augmenting the existing lakehouse architecture, Databricks is disrupting the ETL and data warehouse markets, which is important for companies like ours. We are excited to continue to work with Databricks as an innovation partner.”
– Dan Jeavons, General Manager — Data Science, Shell
“Delta Live Tables enables collaboration and removes data engineering resource blockers, allowing our analytics and BI teams to self-serve without needing to know Spark or Scala. In fact, one of our data analysts — with no prior Databricks or Spark experience — was able to build a DLT pipeline to turn file streams on S3 into usable exploratory data sets within a matter of hours using mostly SQL.”
– Christina Taylor, Data Engineering, Bread Finance