Databricks Blog

Improvements to Kafka integration of Spark Streaming

March 30, 2015 by Cody Koeninger, Davies Liu and Tathagata Das in Engineering

Apache Kafka is rapidly becoming one of the most popular open source stream ingestion platforms. We see the same trend among the users...

Topic modeling with LDA: MLlib meets GraphX

March 25, 2015 by Joseph Bradley in Engineering

Topic models automatically infer the topics discussed in a collection of documents. These topics can be used to summarize and organize documents, or...

What's new for Spark SQL in Apache Spark 1.3

March 24, 2015 by Michael Armbrust in Engineering

Using MongoDB with Apache Spark

March 20, 2015 by Matt Kalan in Engineering

Update August 4th 2016: Since this original post, MongoDB has released a new Databricks-certified connector for Apache Spark. See the updated blog post...

PanTera Big Data Visualization Leverages the Power of Databricks

March 18, 2015 by Cyrus Handy (Uncharted Software), Kaya Ellis (Uncharted Software) and Robert Harper (Uncharted Software) in Company

This is a guest blog from one of our partners: Uncharted, formerly known as Oculus Info, Inc. About PanTera™: PanTera was...

Databricks Launches "Jobs" Feature for Production Workloads

March 17, 2015 by Ali Ghodsi in Product

Databricks now includes a new feature called Jobs, which supports running production pipelines consisting of standalone Spark applications. Jobs includes a scheduler...

Spark’ing an Anti Money Laundering Revolution

March 16, 2015 by Abhishek Mehta (Tresata) in Company

This is a guest blog from one of our partners: Tresata. Tresata and Databricks announced a real-time, Apache Spark and Hadoop-powered Anti-Money...

Announcing Apache Spark 1.3!

March 13, 2015 by Patrick Wendell in Engineering

Today I’m excited to announce the general availability of Apache Spark 1.3! Apache Spark 1.3 introduces the widely anticipated DataFrame API, an evolution...

Sharethrough Selects Databricks to Discover Hidden Patterns in Ad Serving Platform

March 9, 2015 by Kavitha Mariappan and Dave Wang in Company

We’re really excited to announce that Sharethrough has selected Databricks to discover hidden patterns in customer behavior data. Sharethrough builds software for delivering...

Radius Intelligence implements Databricks for real-time insights on targeted marketing campaigns

March 4, 2015 by Kavitha Mariappan and Dave Wang in Company

We’re thrilled to share that Radius Intelligence has selected Databricks as its preferred big data processing platform, to deliver real-time insights in support...