eBook

Unlock the potential of your data

Learn how Apache Spark™ and Delta Lake unify all your data — big data and business data — on one platform for BI and ML.

Apache Spark 3.x is a monumental shift toward greater ease of use, higher performance and smarter unification of APIs across Spark components. And for the data being processed, Delta Lake brings reliability and performance to data lakes, with capabilities like ACID transactions, schema enforcement, DML commands and time travel.

In this eBook, we offer a step-by-step guide to the technical content and related assets that will help you learn Apache Spark and Delta Lake. Whether you’re just getting started or you’re already an accomplished developer, you can explore the benefits of these open source projects.

Here are the 8 steps we’ll cover:

Why Apache Spark and Delta Lake
Apache Spark concepts, key terms and keywords
Advanced Apache Spark internals and core
DataFrames, Datasets and Spark SQL essentials
Graph processing with GraphFrames
Continuous applications with Structured Streaming
Machine learning for humans
Reliable data lakes and data pipelines
