
"Learning Spark" book available from O'Reilly

Holden Karau
Andy Konwinski
Patrick Wendell
Matei Zaharia

Today we are happy to announce that the complete Learning Spark book is available from O’Reilly in e-book form, with the print copy expected to be available February 16th. At Databricks, as the creators behind Apache Spark, we have witnessed explosive growth in the interest in and adoption of Spark, which has quickly become one of the most active software projects in Big Data. To continue fostering the developer and user communities around Spark, we created a book to help engineers and data scientists learn Spark and use it to solve their most challenging problems.

Learning Spark covers Spark’s rich collection of data programming APIs and libraries (e.g., MLlib), which make it easy for data scientists to apply cutting-edge statistical approaches to data of unprecedented scale. Engineers, meanwhile, will learn how to write general-purpose distributed programs in Spark, as well as how to configure and operate production deployments of Spark.

The Learning Spark book does not require any existing Spark or distributed systems knowledge, though some knowledge of Scala, Java, or Python might be helpful.

The topics covered include Spark’s core general-purpose distributed computing engine, as well as some of Spark’s most popular components, including Spark SQL, Spark Streaming, and Spark's machine learning library, MLlib. Both new and existing Spark practitioners will be able to learn Spark best practices as well as important tuning tricks and debugging skills.

The book is available today in e-book form from O’Reilly, Amazon, and others, and for print pre-order (expected availability February 16th) from O’Reilly and Amazon. The code examples from the book are available on the book's GitHub repository and as notebooks in the “learning_spark” folder in Databricks Cloud.

We are also excited to share the discount code BWORM AUTHD. This discount is for 40% off print books or 50% off e-books when you buy directly from O'Reilly.

The authors, Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia, will attend Strata San Jose (February 17-20, 2015). We will be giving talks, and on Thursday morning we will be signing books. Please stop by booth 1021 and visit us.
