UPDATE: All slides and videos have been posted on the Spark Summit website. Check them out!
Like all previous Spark Summits, the first Summit of 2016 was a resounding success. Reflecting Apache Spark’s surging popularity, the Summit in New York grew from 900 attendees last year to well over 1,300, representing 500+ companies. This year many large organizations, such as Bloomberg, Novartis, and Comcast, were eager to share their production Spark use cases with the community.
We had two important product announcements at the event: Community Edition Beta and Dashboards. As the company behind Spark Summits, we are excited to bring you the highlights from each day. The PDFs and videos of the talks mentioned in this blog will be posted in the coming weeks. Check back here if you could not make it in person.
Day One Training
The Summit remains a place for many to acquire Spark expertise. Over 500 people – from beginners to advanced users – participated in our Spark training, where topics ranging from Spark basics to advanced data science with Spark were taught in a hands-on lab setting.
Day Two Highlights
The second day of the Spark Summit kicked off with a keynote featuring Matei Zaharia, the creator of Spark and co-founder / CTO of Databricks, who provided an overview of the upcoming Spark 2.0. Databricks co-founder and CEO Ali Ghodsi took the stage after Matei to announce the beta release of Databricks Community Edition, a free version of our cloud-based Spark platform with the goal of making Spark easy to learn and accessible to everyone. The announcement ended with an exciting demo of the Community Edition by Michael Armbrust of Databricks, which you can watch on demand here (the demo notebook is here). The other prominent industry keynotes from Hortonworks, IBM, and SAP highlighted the increased level of Spark adoption they are seeing and how it is accelerating innovation and growth in the enterprise.
Slides from the keynotes from Day Two:
- Matei Zaharia, Co-Founder & CTO, Databricks: Spark 2.0
- Ali Ghodsi, Co-Founder & CEO, Databricks: Democratizing Access to Data
- Shaun Connolly, VP of Business Strategy, Hortonworks: Accelerating Enterprise Spark
- Anjul Bhambhri, VP of Big Data Engineering, IBM: Apache Spark, the Analytics Operating System
- Ken Tsai, Head of Cloud Platform & Data Management, SAP: Spark Usage in Enterprise Business Operations
The theme of Spark establishing itself as the enterprise data processing engine of choice was echoed throughout the day as presenters shared examples of using Spark to achieve measurable business outcomes.
Day Three Highlights
Up first on day three was Reynold Xin, co-founder and Chief Architect at Databricks, who spoke about the rise of continuous applications that act on real-time data and the future of Spark Streaming. Following Reynold were three presentations from industry leaders who are paving the way for mass Spark adoption, including heads of technology and analytics from Synchronoss and eBay.
Selected slides from the keynotes from Day Three:
- Reynold Xin, Co-Founder & Chief Architect, Databricks: The Future of Real-Time in Spark
- Suren Nathan, Head of Big Data Analytics, Razorsight: Data Profiling and Pipeline Processing with Spark
- Seshu Adunuthula, Head of Analytics Infrastructure, eBay: Role of Spark in transforming eBay’s Enterprise Data Platform
The momentum around Spark innovation continued as attendees spilled into various speaking sessions for developers, data scientists, researchers, and enterprise users. Highlights included BlackRock describing its Spark-based framework for managing data quality tests, Viacom sharing its experience building a just-in-time data warehouse with Databricks, and Novartis presenting how it leverages Spark for distributed analytics and interactive visualization of large, high-dimensional screening data.
If you weren’t able to catch these great talks and the others from this week, the recordings will be available on the Spark Summit website in the coming weeks. Slides from Databricks presenters are already available on our website here.
Coming Soon: Spark Summit San Francisco
For those of you who could not make it to New York this year, Spark Summit will be in San Francisco from June 6-8. If you are interested in sharing your Spark experiences, the call for papers is now open. The deadline for submissions is February 29, 2016.
We will be updating this blog with links to videos and presentations of the sessions in the coming weeks. Please check back here regularly. In the meantime, subscribe to our newsletter to keep up with the latest Databricks and Spark news.