New eBook Released: Mastering Advanced Analytics with Apache Spark

Published: April 27, 2016

We are excited to announce that the second eBook in our technical blog book series, Mastering Advanced Analytics with Apache Spark, has been released today!

You can download the eBook here.

We focused on the topic of “Advanced Analytics” because of the challenges created by the continued growth in data. This growth, coupled with increasingly complex use cases, demands much more than running queries against a data set. Whether you’re scrutinizing the clickstream from millions of visitors to optimize online ad placements or sifting through billions of transactions to identify signs of fraud, more sophisticated approaches to automatically gleaning insights from enormous volumes of data, such as machine learning and graph processing, are more important than ever.

This eBook offers a collection of the most popular technical blog posts that provide an introduction to machine learning and other advanced techniques on Spark, including:

An introduction to machine learning in Apache Spark
Using Spark for advanced topics such as clustering, trees, and graph processing
How you can use SparkR to analyze data at scale with the R language

We’ve also augmented the blog posts with new code examples in Databricks notebooks, which are freely available with the eBook download. A sample of the new notebooks includes:

Scalable Decision Trees with MLlib
ML Import, Export, and Simple Operations
Generalized Linear Models in SparkR
Random Forests and Boosting in MLlib

Download the eBook to get started on your next advanced analytics project today. To try out the code examples, get on the waitlist for the Databricks Community Edition. If you have not read the first eBook in the series, be sure to check out Apache Spark Analytics Made Simple, which offers technical content and code examples that introduce data analytics with Apache Spark.
