Databricks Blog

Application Spotlight: Trifacta

October 9, 2014 by Sean Kandel in Company Blog

This post is guest authored by our friends at Trifacta after having their data transformation platform “Certified on Spark.” Today we announced v2...

Sharethrough Uses Apache Spark Streaming to Optimize Advertisers' Return on Marketing Investment

October 7, 2014 by Russell Cardullo in Company Blog

This is a guest blog post from our friends at Sharethrough providing an update on how their use of Apache Spark has continued...

Apache Spark as a platform for large-scale neuroscience

October 1, 2014 by Jeremy Freeman in Engineering Blog

The brain is the most complicated organ of the body, and probably one of the most complicated structures in the universe. It’s millions...

Scalable Decision Trees in MLlib

September 29, 2014 by Manish Amde and Joseph Bradley in Engineering Blog

This is a post written together with one of our friends at Origami Logic. Origami Logic provides a Marketing Intelligence Platform that uses...

Guavus Embeds Apache Spark into its Operational Intelligence Platform Deployed at the World’s Largest Telcos

September 25, 2014 by Eric Carr in Partners

This is a guest blog post from our friends at Guavus - now a Certified Apache Spark Distribution - outlining how they leverage...

Apache Spark Improves the Economics of Video Distribution at NBC Universal

September 24, 2014 by Christopher Burdorf in Company Blog

This is a guest blog post from our friends at NBC Universal outlining their Apache Spark use case. Business Challenge: NBC Universal is...

Databricks Reference Applications

September 23, 2014 by Vida Ha in Company Blog

At Databricks, we are often asked how to go beyond the basic Apache Spark tutorials and start building real applications with Spark. As...

Apache Spark 1.1: MLlib Performance Improvements

September 22, 2014 by Burak Yavuz in Engineering Blog

With an ever-growing community, Apache Spark has had its 1.1 release. MLlib has had its fair share of contributions and now supports...

Apache Spark 1.1: Bringing Hadoop Input/Output Formats to PySpark

September 17, 2014 by Nick Pentreath and Kan Zhang in Engineering Blog

This is a guest post by Nick Pentreath of Graphflow and Kan Zhang of IBM, who contributed Python input/output format support to Apache Spark 1.1. Two powerful features of Apache Spark are its native APIs in Scala, Java and Python, and its compatibility with any Hadoop-based input or output source. This language support means that users can quickly become proficient in Spark even without experience in Scala, and furthermore can leverage...

Apache Spark 1.1: The State of Spark Streaming

September 16, 2014 by Tathagata Das and Patrick Wendell in Engineering Blog

With Apache Spark 1.1 recently released, we’d like to take this occasion to feature one of the most popular Spark components - Spark...