Spark Summit Talks and Apache Spark Roundup
- Databricks and partners set a new world record in the CloudSort 2016 Benchmark using Apache Spark, wrote Reynold Xin, chief architect.
- Databricks Chief Technologist Matei Zaharia delivered a keynote, “Simplifying Big Data Applications with Apache Spark 2.0,” at Spark Summit 2016 EU in Brussels, followed by a demo of continuous applications by Databricks software engineer Greg Owen.
- Databricks CEO Ali Ghodsi shared his vision of “Democratizing AI with Apache Spark” in his keynote at Spark Summit 2016 EU in Brussels.
- Executive Chairman of Databricks Ion Stoica announced “The Next AMPLab: Real-time, Intelligent, and Secure Computing,” in his keynote at Spark Summit 2016 EU in Brussels.
- Sameer Agarwal, software engineer at Databricks, presented “Apache Spark’s Performance: Project Tungsten and Beyond,” at Spark Summit 2016 EU in Brussels.
- Herman Van Hovell, software engineer at Databricks, gave a “Deep Dive into the Catalyst Optimizer” talk and a hands-on lab at Spark Summit 2016 EU in Brussels.
- Echoing Ali Ghodsi’s keynote above, Tim Hunter, Databricks software engineer, showed how to use Apache Spark with TensorFlow: “TensorFrames: Deep Learning with TensorFlow on Apache Spark,” at Spark Summit 2016 EU in Brussels.
- Databricks Solution Architect Miklos Christine shared challenges and pitfalls you can avoid with Spark Streaming in his talk “Paddling up the Stream,” at Spark Summit 2016 EU in Brussels.
- Facebook’s Big Compute Team software engineer Sital Kedia described how Apache Spark scales in production in his talk: “Apache Spark at Scale: A 60 TB+ Production Use Case” at Spark Summit 2016 EU in Brussels.
- Morning Paper blogger Adrian Colyer commented on the article by Michael Armbrust et al., “Scaling Spark in the Real World: Performance and Usability.”
- Matei Zaharia, Reynold Xin et al. contributed to Communications of the ACM: “Apache Spark: A Unified Engine for Big Data Processing.”
- Tim Hunter, Databricks software engineer, participated in the panel “Modern Software Architectures and Data Pipelines” at Scala by the Bay.
- GraphFrames 0.3.0 was released as a Spark package. Find out more at graphframes.github.io.
- Apache Spark 1.6.3 Released. Try it on Databricks Community Edition.
- Apache Spark 2.0.2 Released. Kafka 0.10 support and runtime metrics are the two notable features in Structured Streaming in this release. Try it on Databricks Community Edition.
- Databricks released the spark-redshift v3.0.0-preview1 Spark package with usability improvements. Learn more at Redshift Data Source for Apache Spark.