A Look Back at Spark Summit East 2016: Thank you NYC!

UPDATE: All slides and videos have been posted on the Spark Summit website. Check them out!


Like all previous Spark Summits, the first Summit of 2016 was a resounding success. As a reflection of Apache Spark’s surging popularity, the Summit in New York grew from 900 people last year to well over 1,300 people representing 500+ companies. This year many large organizations, such as Bloomberg, Novartis, and Comcast, were eager to share their production Spark use cases with the community.

We had two important product announcements at the event: Community Edition Beta and Dashboards. As the company behind Spark Summits, we are excited to bring you the highlights from each day. The PDFs and videos of the talks mentioned in this blog will be posted in the coming weeks. Check back here if you could not make it in person.

Day One Training

The Summit remains a place for many to acquire Spark expertise. Over 500 people, from beginners to advanced users, participated in our Spark training, where topics ranging from Spark basics to advanced data science with Spark were taught in a hands-on lab setting.

Day Two Highlights

The second day of the Spark Summit kicked off with a keynote featuring Matei Zaharia, the creator of Spark and co-founder / CTO of Databricks, who provided an overview of the upcoming Spark 2.0. Databricks co-founder and CEO Ali Ghodsi took the stage after Matei to announce the beta release of Databricks Community Edition, a free version of our cloud-based Spark platform with the goal of making Spark easy to learn and accessible to everyone. The announcement ended with an exciting demo of the Community Edition by Michael Armbrust of Databricks, which you can watch on-demand here. The other prominent industry keynotes from Hortonworks, IBM, and SAP highlighted the increased level of Spark adoption they are seeing and how it is accelerating innovation and growth in the enterprise.

Slides from the keynotes from Day Two:

The theme of Spark establishing itself as the enterprise data processing engine of choice was echoed throughout the day as presenters shared examples of utilizing Spark to achieve measurable business outcomes.

Day Three Highlights

Up first on day three was Reynold Xin, co-founder and Chief Architect at Databricks, who spoke about the rise of continuous applications that act on real-time data and the future of Spark Streaming. Following Reynold were three presentations from industry leaders who are paving the way for mass Spark adoption, including heads of technology and analytics from Synchronoss and eBay.

Selected slides from the keynotes from Day Three:

The momentum around Spark innovation continued as attendees spilled into various speaking sessions for developers, data scientists, researchers, and enterprise users. Highlights included BlackRock describing its Spark-based framework for managing data quality tests, Viacom sharing its experience building a just-in-time data warehouse with Databricks, and Novartis presenting how it leverages Spark for distributed analytics and interactive visualization of large, high-dimensional screening data.

If you weren’t able to catch these great talks and the others from this week, the recordings will be available on the Spark Summit website in the coming weeks.

Coming Soon: Spark Summit San Francisco

For those of you who could not make it to New York this year, Spark Summit will be in San Francisco from June 6 to 8. If you are interested in sharing your Spark experiences, the call for papers is now open. The deadline for submissions is February 29, 2016.

We will be updating this blog with links to videos and presentations of the sessions in the coming weeks. Please check back here regularly. In the meantime, subscribe to our newsletter to keep up with the latest Databricks and Spark news.
