Session
Spark 4.0 and Delta 4.0 For Streaming Data
Overview
Experience | In Person |
---|---|
Type | Breakout |
Track | Data Engineering and Streaming |
Industry | Energy and Utilities, Manufacturing |
Technologies | Apache Spark, Delta Lake, Databricks SQL |
Skill Level | Intermediate |
Duration | 40 min |
Real-time data is among the most important data for any data and AI platform, in any industry.
Spark 4.0 and Delta 4.0 include new features that improve the ingestion and querying of real-time data, such as:
- Python custom data sources, for simple ingestion of streaming and batch time series data using Spark
- Variant types, for managing the variable data types and JSON payloads that are common in the real-time domain
- Delta liquid clustering, for simple data clustering without the overhead or complexity of partitioning
In this presentation, you will learn how data teams can use these features to build industry-leading real-time data products with Spark and Delta. The session includes real-world examples and metrics showing how these features improve performance and data processing in the real-time space.
Session Speakers
Bryce Bartmann
Chief Digital Technology Advisor
Shell