Over the past several months, we’ve made DLT pipelines faster, more intelligent, and easier to manage at scale. DLT now delivers a streamlined, high-performance foundation for building and operating reliable data pipelines of any size.
First, we’re thrilled to announce that DLT pipelines now integrate fully with Unity Catalog (UC). This allows users to read from and write to multiple catalogs and schemas while consistently enforcing Row-Level Security (RLS) and Column Masking (CM) across the Databricks Data Intelligence Platform.
Additionally, we’re excited to present a slate of recent enhancements covering performance, observability, and ecosystem support that make DLT the pipeline tool of choice for teams seeking agile development, automated operations, and reliable performance.
Read on to explore these updates, or click on individual topics to dive deeper.
"Integrating DLT with Unity Catalog has revolutionized our data engineering, providing a robust framework for ingestion and transformation. Its declarative approach enables scalable, standardized workflows in a decentralized setup while maintaining a centralized overview. Enhanced governance, fine-grained access control, and data lineage ensure secure, efficient pipeline management. The new capability to publish to multiple catalogs and schemas from a single DLT pipeline further streamlines data management and cuts costs."— Maarten de Haas, Product Architect, Heineken International
The integration of DLT with UC ensures that data is managed consistently across various stages of the data pipeline, providing more efficient pipelines, better lineage and compliance with regulatory requirements, and more reliable data operations. The key enhancements in this integration include:
To streamline data management and optimize pipeline development, Databricks now enables publishing tables to multiple catalogs and schemas within a single DLT pipeline. This enhancement simplifies syntax by eliminating the need for the LIVE keyword, and it reduces infrastructure costs, development time, and monitoring burden by letting users consolidate multiple pipelines into one. Learn more in the detailed blog post.
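For illustration, here is a hedged sketch of a Python pipeline definition that publishes to two different catalogs and schemas from one DLT pipeline; all catalog, schema, table, and path names below are hypothetical.

```python
import dlt
from pyspark.sql.functions import col

# Hypothetical names: a bronze table in the "main" catalog...
@dlt.table(name="main.bronze.raw_orders")
def raw_orders():
    # Ingest raw JSON files with Auto Loader (the volume path is a placeholder).
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/landing/orders")
    )

# ...and a gold table published to a different catalog and schema,
# defined in the same pipeline.
@dlt.table(name="analytics.gold.daily_orders")
def daily_orders():
    # Reference the upstream table by its fully qualified name -- no LIVE keyword.
    return spark.read.table("main.bronze.raw_orders").where(
        col("order_status") == "complete"
    )
```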
The integration of DLT with Unity Catalog also includes fine-grained access control, with row-level security and column masking for datasets published by DLT pipelines. Administrators can define row filters to restrict data visibility at the row level and column masks to dynamically protect sensitive information, ensuring strong data governance, security, and compliance.
The documentation includes several SQL user-defined function (UDF) examples showing how to define these policies.
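To give a flavor of what such policy functions look like, here is a minimal sketch (issued via spark.sql from Python); the function names, catalog, schema, and group names are hypothetical.

```python
# Row filter: members of the (hypothetical) sales_admins group see every row,
# everyone else only sees rows where region = 'US'.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.governance.us_only_filter(region STRING)
  RETURN IS_ACCOUNT_GROUP_MEMBER('sales_admins') OR region = 'US'
""")

# Column mask: only members of the (hypothetical) pii_readers group see the raw value.
spark.sql("""
  CREATE OR REPLACE FUNCTION main.governance.mask_email(email STRING)
  RETURN CASE
    WHEN IS_ACCOUNT_GROUP_MEMBER('pii_readers') THEN email
    ELSE '***REDACTED***'
  END
""")
```

These functions are then attached to the datasets published by the pipeline as described in the documentation.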
Moving DLT pipelines from the Hive Metastore (HMS) to Unity Catalog (UC) streamlines governance, enhances security, and enables multi-catalog support. The migration process is straightforward—teams can clone existing pipelines without disrupting operations or rebuilding configurations. The cloning process copies pipeline settings, updates materialized views (MVs) and streaming tables (STs) to be UC-managed, and ensures that STs resume processing without data loss. Best practices for this migration are fully documented here.
Once migration is complete, both the original and new pipelines can run independently, allowing teams to validate UC adoption at their own pace. This is the recommended approach for migrating DLT pipelines today. While it does require copying data, later this year we plan to introduce an API for copy-free migration, so stay tuned for updates.
We’ve made significant improvements to performance in DLT in the last few months, enabling faster development and more efficient pipeline execution.
First, we sped up the validation phase of DLT by 80%*. During validation, DLT checks schemas, data types, table access, and more to catch problems before execution begins. Second, we reduced the time it takes to initialize serverless compute for serverless DLT pipelines.
As a result, iterative development and debugging of DLT pipelines is faster than before.
*On average, according to internal benchmarks
Building on the DLT Sink API, we’re further expanding the flexibility of Delta Live Tables with foreachBatch support. This enhancement allows users to write streaming data to any batch-compatible sink, unlocking new integration possibilities beyond Kafka and Delta tables.
With foreachBatch, each micro-batch of a streaming query can be processed using batch transformations, enabling powerful use cases like MERGE INTO operations in Delta Lake and writing to systems that lack native streaming support, such as Cassandra or Azure Synapse Analytics. This extends the reach of DLT Sinks, ensuring that users can seamlessly route data across their entire ecosystem. You can review more details in the documentation here.
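To illustrate the pattern, here is a minimal sketch of a foreachBatch flow that merges each micro-batch into a Delta target using the standard Structured Streaming API; the table names and checkpoint path are hypothetical, and the exact DLT sink wiring should follow the documentation linked above.

```python
from delta.tables import DeltaTable

def upsert_to_delta(microbatch_df, batch_id):
    """Merge one micro-batch into a Delta target table (hypothetical name)."""
    target = DeltaTable.forName(microbatch_df.sparkSession, "main.gold.customers")
    (
        target.alias("t")
        .merge(microbatch_df.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute()
    )

# Wire the batch function into a streaming query.
(
    spark.readStream.table("main.silver.customer_updates")
    .writeStream
    .foreachBatch(upsert_to_delta)
    .option("checkpointLocation", "/Volumes/main/checkpoints/customers_merge")
    .start()
)
```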
Users can now access query history for DLT pipelines, making it easier to debug queries, identify performance bottlenecks, and optimize pipeline runs. Available in Public Preview, this feature allows users to review query execution details through the Query History UI, notebooks, or the DLT pipeline interface. By filtering for DLT-specific queries and viewing detailed query profiles, teams can gain deeper insights into pipeline performance and improve efficiency.
The event log can now be published to UC as a Delta table, providing a powerful way to monitor and debug pipelines with greater ease. By storing event data in a structured format, users can leverage SQL and other tools to analyze logs, track performance, and troubleshoot issues efficiently.
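For example, once the event log is published as a Delta table, recent warnings and errors can be pulled with a simple query; the table name below is hypothetical and depends on where your pipeline publishes its event log.

```python
# Query the published event log for recent warnings and errors.
errors = spark.sql("""
  SELECT timestamp, event_type, level, message
  FROM main.ops.dlt_event_log          -- hypothetical location of the published event log
  WHERE level IN ('WARN', 'ERROR')
  ORDER BY timestamp DESC
  LIMIT 100
""")
errors.show(truncate=False)
```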
We have also introduced Run As for DLT pipelines, allowing users to specify the service principal or user account under which a pipeline runs. Decoupling pipeline execution from the pipeline owner enhances security and operational flexibility.
Finally, users can now filter pipelines by various criteria, including run-as identity and tags. These filters make pipeline management and tracking more efficient, ensuring that users can quickly find and manage the pipelines they are interested in.
These improvements collectively enhance the observability and manageability of pipelines, making it easier for organizations to ensure their pipelines are operating as intended and aligned with their operational criteria.
We are now introducing the capability to read Streaming Tables (STs) and Materialized Views (MVs) in dedicated access mode. This feature allows pipeline owners and users with the necessary SELECT privileges to query STs and MVs directly from their personal dedicated clusters.
This update simplifies workflows by opening ST and MV access to dedicated (assigned) clusters that have not yet been upgraded to shared clusters. With access to STs and MVs in dedicated access mode, users can work in an isolated environment that is ideal for debugging, development, and personal data exploration.
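For example (object names are hypothetical), an ST or MV published by a pipeline can be read like any other table from a dedicated cluster, given SELECT privileges:

```python
# Read a DLT-published materialized view from a dedicated (single-user) cluster.
mv_df = spark.read.table("analytics.gold.daily_orders")  # hypothetical name
mv_df.limit(10).show()
```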
Users can now read a change data feed (CDF) from STs targeted by the APPLY CHANGES command. This improvement simplifies the tracking and processing of row-level changes, ensuring that all data modifications are captured and handled effectively.
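As a minimal sketch, the change feed can be consumed with the standard Delta readChangeFeed option; the target table name is hypothetical, and CDF must be enabled as described in the documentation.

```python
# Stream row-level changes from an APPLY CHANGES target table (hypothetical name).
changes = (
    spark.readStream
    .option("readChangeFeed", "true")
    .option("startingVersion", 1)
    .table("main.silver.customers_scd1")
)
# Each row carries the _change_type, _commit_version, and _commit_timestamp metadata columns.
```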
Additionally, Liquid Clustering is now supported for both STs and MVs within Databricks. This feature enhances data organization and querying by dynamically managing data clustering according to specified columns, which are optimized during DLT maintenance cycles, typically conducted every 24 hours.
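As an illustrative sketch, clustering columns can be declared on a DLT-managed table in Python; the cluster_by argument and the names below should be confirmed against the current DLT documentation for your release.

```python
import dlt

# Declare clustering columns on a DLT-managed table (illustrative names).
@dlt.table(
    name="main.gold.events_clustered",
    cluster_by=["event_date", "customer_id"],
)
def events_clustered():
    return spark.read.table("main.silver.events")
```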
By bringing best practices for intelligent data engineering into full alignment with unified lakehouse governance, the DLT/UC integration simplifies compliance, enhances data security, and reduces infrastructure complexity. Teams can now manage data pipelines with stronger access controls, improved observability, and greater flexibility, without sacrificing performance. If you’re using DLT today, this is the best way to ensure your pipelines are future-proof. If not, we hope this update demonstrates our continued commitment to improving the DLT experience for data teams.
Explore our documentation to get started, and stay tuned for the roadmap enhancements listed above. We’d love your feedback!