This February, join the Apache Spark community in New York City at the New York Midtown Hilton for the second annual Spark Summit East on February 16th-18th! We are happy to announce that the community talks agenda has been finalized and you can find the full list of 60 community talks available at the Spark Summit East 2016 Schedule. Those looking to get hands-on with Spark are encouraged to sign up for one of our Databricks Training workshops.
The phenomenal growth of the global Spark community is reflected in the development of the Spark Summit events. In 2015, we had over 4,000 combined attendees across Spark Summits in San Francisco, New York, and Amsterdam. With the continued rapid growth, we are expecting 1,800 attendees for the first Spark Summit of 2016!
For Spark Summit East 2016, we will continue to offer a wide variety of content catering to different interests, including our popular developer, data science, application, and research tracks. This year, we are also introducing new enterprise tracks featuring enterprise customer case studies from Bloomberg, Viacom, Comcast, Huawei, IBM, Microsoft, EMC, and Netflix. We will also have sessions featuring analysts and thought leaders Mike Gualtieri (Forrester), Tony Baer (Ovum), Nik Rouda (ESG), and Thomas Dinsmore (Big Analytics Blog).
Session Highlights
We would like to thank everyone who submitted a presentation, and congratulate the selected community talk presenters. On both conference days (February 17th and 18th), the sessions will kick off with keynotes from leaders of the Spark community. Matei Zaharia will lead off on Wednesday, and Reynold Xin on Thursday.
Check out a sample of topics (see the full schedule here):
- Distributed Time Travel for Feature Generation: DB Tsai, Prasanna Padmanabhan, and Mohammed H. Taghavi (Netflix)
- Using Spark to Power the Office 365 Delve Organization Analytics: Yi Wang and Paavany Jayanty (Microsoft)
- Structuring Spark: DataFrames, Datasets, and Streaming: Michael Armbrust (Databricks)
- Petabyte Scale Anomaly Detection Using R & Spark: Sridhar Alla and Kiran Muglurmath (Comcast)
- Generalized Linear Models in Spark MLlib and SparkR: Xiangrui Meng (Databricks)
- Implementing Near-Realtime Datacenter Health Analytics using Model-driven Vertex-centric Programming on Spark Streaming and GraphX: David Ohsie and Cheuk Lam (EMC)
- Lambda at Weather Scale: Robbie Strickland (The Weather Company)
- Spark at Bloomberg: Sudarshan Kadambi and Partha Nageswaran (Bloomberg)
Training
For people who are interested in becoming Spark experts, there are three workshops that cater to different interests:
- Apache Spark Essentials will help you get productive with the core capabilities of Spark, as well as provide an overview and examples for some of Spark’s more advanced features.
- Data Science with Apache Spark will show how to use Apache Spark to perform exploratory data analysis (EDA), develop machine learning pipelines, and use the APIs and algorithms available in Spark ML and Spark MLlib. It is designed for software developers, data analysts, data engineers, and data scientists.
- Advanced: Exploring Wikipedia with Spark (Tackling a unified use case): The real power and value proposition of Apache Spark lie in building a unified use case that combines ETL, batch analytics, real-time stream analysis, machine learning, graph processing, and visualizations. In class, we will explore various Wikipedia datasets while applying the ideal programming paradigm for each analysis.
How to get tickets
Tickets are available online now. Register before they sell out! Use promo code “Databricks20” to receive 20% off your registration fee.
Thanks to Our Sponsors
Our esteemed sponsors are instrumental in bringing Spark Summit East 2016 to life. You’ve heard this before, but without our sponsors the Summits wouldn’t happen.