Real-World Performance: Architected for fast performance on real-world data and applications, not just synthetic tests
Open, Compatible APIs: Fully compatible with Apache Spark™ APIs to ensure workloads run seamlessly without code changes
Broad Language Support: Best-in-class performance on your data lake for streaming and batch workloads using SQL, Python, R, Scala, and Java
Native Execution Engine (Photon): 100% Apache Spark-compatible vectorized query engine designed to take advantage of modern CPU architectures for extremely fast parallel data processing
Caching Layer: Automatically caches and transcodes data into a CPU-efficient format to better leverage the increased storage speeds of NVMe SSDs, delivering up to 5x faster scan performance for virtually all workloads