Delta automatically indexes, compacts, and caches data, helping achieve up to 100x improved performance over Apache Spark. It delivers these performance optimizations by automatically capturing statistics and applying various techniques to the data for efficient querying.
Delta provides full ACID-compliant transactions and enforces schema on write, giving data teams the controls to ensure data reliability. Delta's upsert capability provides a simple way to clean data and apply new business logic without reprocessing data.
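To make the upsert and schema-on-write ideas concrete, here is a minimal plain-Python sketch of the semantics only. It is not Delta's API: `SCHEMA`, `validate`, and `upsert` are hypothetical names invented for illustration, and the table is modeled as a list of dictionaries. In Delta itself, an upsert is typically expressed with a `MERGE INTO` statement rather than code like this.

```python
# Illustrative only: a plain-Python model of upsert (merge) semantics with
# schema enforcement on write. None of these names are Delta APIs.

SCHEMA = {"id": int, "email": str}  # hypothetical table schema


def validate(row, schema=SCHEMA):
    """Reject rows whose columns or types do not match the schema."""
    if set(row) != set(schema):
        raise ValueError(f"unexpected columns: {sorted(row)}")
    for column, expected_type in schema.items():
        if not isinstance(row[column], expected_type):
            raise ValueError(f"column {column!r} must be {expected_type.__name__}")


def upsert(table, updates, key="id"):
    """Return a new table: rows with matching keys are updated, others inserted."""
    for row in updates:
        validate(row)  # check every row first so a bad row changes nothing
    merged = {row[key]: row for row in table}
    for row in updates:
        merged[row[key]] = row
    return [merged[k] for k in sorted(merged)]


table = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
updates = [
    {"id": 2, "email": "b@new.example.com"},  # existing key: update
    {"id": 3, "email": "c@example.com"},      # new key: insert
]
result = upsert(table, updates)
```

In this sketch only the rows named in `updates` are touched, the original `table` is left unchanged, and a row that violates the schema is rejected before anything is written, which mirrors the all-or-nothing behavior the paragraph above describes.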
Delta dramatically simplifies data pipelines by providing a common API to transactionally store large historical and streaming datasets in cloud blob stores and making these massive datasets available for high-performance analytics.
Databricks Delta, a key component of Databricks Runtime, enables data scientists to explore and visualize data and to combine it seamlessly with various ML frameworks (TensorFlow, Keras, scikit-learn, etc.) to build models. As a result, Delta can be used not only to run SQL queries but also to do machine learning in Databricks Workspace on large amounts of streaming data.