Delta Engine

High-performance query execution for Delta Lake

Accelerate all your workloads on your data lake with Delta Engine, a new query engine designed for speed and flexibility. It’s built from the ground up to deliver fast performance on modern cloud hardware across data engineering, data science, machine learning, and data analytics. Use it to bring even better performance to your Delta Lake on Databricks.

Benefits

Real-world Performance: Architected for fast performance on real-world data and applications, not just synthetic tests

Open, Compatible APIs: Fully compatible with Apache Spark™ APIs to ensure workloads run seamlessly without code changes

Broad Language Support: Best-in-class performance on your data lake for streaming and batch workloads using SQL, Python, R, Scala, and Java

Features

Native Execution Engine (Photon): 100% Apache Spark-compatible vectorized query engine designed to take advantage of modern CPU architecture for extremely fast parallel processing of data

Caching Layer: Automatically caches and transcodes data into a CPU-efficient format to better leverage the increased storage speeds of NVMe SSDs, delivering up to 5x faster scan performance for virtually all workloads

Improved Query Optimizer: Extends Spark’s capabilities with optimizations that accelerate star-schema workloads on data lakes by up to 18x

  • Cost-Based Optimizer: Produces faster query plans by using more advanced statistics to choose better join types and shuffle sizes
  • Adaptive Query Execution: Dynamically re-plans queries during execution, improving performance as data is read
  • Dynamic Runtime Filters: Skips data at a finer granularity, so queries read less and finish faster
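
To make the idea behind runtime filtering and data skipping concrete, here is a minimal toy sketch in plain Python. It is not Delta Engine's implementation: the DataFile class, the function names, and the sample tables are all hypothetical, and it only assumes that each data file carries min/max statistics for its join key. In a star-schema join, the keys that survive the filter on the small dimension table are collected at run time and used to skip fact-table files that cannot contain a match.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class DataFile:
    """Hypothetical fact-table file with min/max statistics for its join key."""
    name: str
    key_min: int
    key_max: int
    rows: List[dict]


def build_runtime_filter(dim_rows: Iterable[dict], key: str) -> Set[int]:
    """Collect the join keys that survive the dimension-side predicate."""
    return {row[key] for row in dim_rows}


def files_to_scan(files: List[DataFile], runtime_keys: Set[int]) -> List[DataFile]:
    """Keep only files whose [key_min, key_max] range contains a wanted key."""
    return [
        f for f in files
        if any(f.key_min <= k <= f.key_max for k in runtime_keys)
    ]


# Small dimension table (illustrative data only).
stores = [
    {"store_id": 1, "region": "EU"},
    {"store_id": 2, "region": "US"},
    {"store_id": 7, "region": "EU"},
    {"store_id": 12, "region": "APAC"},
]

# Large fact table, split into files with per-file key statistics.
sales_files = [
    DataFile("part-0", 1, 3, [{"store_id": 1, "amount": 10}, {"store_id": 3, "amount": 5}]),
    DataFile("part-1", 4, 6, [{"store_id": 4, "amount": 8}, {"store_id": 6, "amount": 2}]),
    DataFile("part-2", 7, 9, [{"store_id": 7, "amount": 20}, {"store_id": 9, "amount": 1}]),
    DataFile("part-3", 10, 12, [{"store_id": 10, "amount": 4}, {"store_id": 12, "amount": 6}]),
]

# The filter is only known at run time, after the dimension predicate is applied.
eu_keys = build_runtime_filter((s for s in stores if s["region"] == "EU"), "store_id")

# Files whose key range cannot match are never read.
selected = files_to_scan(sales_files, eu_keys)

# Finish the join on the files that were actually scanned.
eu_total = sum(
    row["amount"]
    for f in selected
    for row in f.rows
    if row["store_id"] in eu_keys
)

print([f.name for f in selected], eu_total)
```

In this toy run only two of the four files are scanned. A real engine would apply the same principle with its own statistics and filter structures.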
