Steps for a Developer to Learn Apache Spark with Delta Lake


Learn Apache Spark™ with Delta Lake

Learn how Apache Spark and Delta Lake unify all your data (big data and business data) on one platform for BI and ML

Data without limits


What’s holding you back from unlocking the full potential of your data? You need a platform that can process and hold all your data — both raw data and business data — and deliver it to all your downstream users for BI and ML.

Apache Spark™ 2.x marks a monumental shift: greater ease of use, higher performance and smarter unification of APIs across Spark components. For the data being processed, Delta Lake brings reliability and performance to data lakes, with capabilities such as ACID transactions, schema enforcement, DML commands and time travel.

In this eBook, we offer a step-by-step guide to the technical content and related assets that will help you learn Apache Spark and Delta Lake. Whether you’re just getting started or are already an accomplished developer, these steps will let you explore the benefits of these open source projects.

Here are the topics we will cover:

  • Why Apache Spark and Delta Lake
  • Apache Spark and Delta Lake concepts, key terms and keywords
  • Advanced Apache Spark internals and core
  • DataFrames, Datasets and Spark SQL essentials
  • Graph processing with GraphFrames
  • Continuous applications with Structured Streaming
  • Machine learning for humans
  • Data reliability challenges for data lakes
  • Delta Lake for ACID transactions, schema enforcement and more
  • Unifying batch and streaming data pipelines