The Strata + Hadoop World Conference in San Jose last week was abuzz with "putting data to work," in keeping with this year's conference theme. This was a significant shift from last year's event, where organizations were focused on getting their arms around their big data projects and were steeped in evaluating the multitude of tools and new technologies available. Last week's event highlighted what is top of mind for enterprises and developers alike: how to turn their big data initiatives and projects into real business results.
One theme was loud and clear - Apache Spark's flame shone bright! Derrick Harris from GigaOM summed this up aptly in his article "For now, Spark looks like the future of big data". To quote Derrick, "Titles can be misleading. For example, the O’Reilly Strata + Hadoop World conference took place in San Jose, California, this week but Hadoop wasn’t the star of the show. Based on the news I saw coming out of the event, it’s another Apache project — Spark — that has people excited."
To quote David Ramel from his article yesterday, "Spark Continues Big Data Ascension": "The flurry of Spark-related news and product releases further cements the project as the darling of the open source Big Data movement, showing a 'hockey stick-like' growth in a chart measuring Spark awareness, according to a recent survey. It has been recognized as the most active Apache Software Foundation project and, indeed, most active Big Data open source project of any kind."
The show was also jam-packed with announcements from our partners in the Spark ecosystem, with MemSQL and Tableau announcing new Spark connectors. At the same time, we announced a collaboration effort with Intel to optimize Spark-based analytics on Intel architecture.
Looking beyond the four walls of the show, Donnie Berkholz of RedMonk reflected on the incredible hockey-stick-like surge of Spark interest among users, as observed on Stack Overflow.
As the team behind Spark, we at Databricks are thrilled to have the opportunity to respond to this intense interest in Spark and to connect with users. In line with this year's conference theme of turning data initiatives into value, we had the opportunity to interact with enterprise users who were keen to share their experiences and to learn about running Spark in production on Databricks Cloud.
For those of you who missed our sessions at Strata San Jose last week, here are pointers to some of the presentations and training material:
The material from the Spark Camp training class, which was attended by over 300 students, can be found here. Information on future Spark training classes can be found on the training page of our website.
Our Strata San Jose 2015 presentations can be found on our SlideShare account:
- Lessons from Running Large Scale Spark Workloads - Reynold Xin, Matei Zaharia
- Spark Streaming — The State of the Union, and Beyond - Tathagata Das
- New Directions for Spark in 2015 - Matei Zaharia
- Tuning and Debugging in Apache Spark - Patrick Wendell
- Everyday I’m Shuffling – Tips for Writing Better Spark Programs - Vida Ha, Holden Karau
For all of you East Coast Spark enthusiasts, we will be holding the inaugural Spark Summit East in New York City on March 18th and 19th. The agenda has been released, and there will be many more informative sessions for all. Register today if you have not done so!
To keep up with Databricks news, don't forget to sign up for our monthly newsletter here.