Introducing Streaming k-means in Apache Spark 1.2

January 28, 2015 by Jeremy Freeman
Many real world data are acquired sequentially over time, whether messages from social media users, time series from wearable sensors, or — in...

Big data projects are hungry for simpler and more powerful tools: Survey validates Apache Spark is gaining developer traction!

January 26, 2015 by Kavitha Mariappan
In partnership with Typesafe, we are excited to see the publication of the survey report representing the largest poll of Apache Spark...

Random Forests and Boosting in MLlib

January 21, 2015 by Joseph Bradley and Manish Amde
This is a post written together with Manish Amde from Origami Logic. Apache Spark 1.2 introduces Random Forests and Gradient-Boosted Trees (GBTs) into...

Apache Spark Certified Developer exams available online!

January 16, 2015 by Kavitha Mariappan
Complementing our ongoing direct and partner-led Apache Spark training efforts, Databricks has teamed up with O'Reilly to offer the industry's first standard for...

Improved Fault-tolerance and Zero Data Loss in Apache Spark Streaming

January 15, 2015 by Tathagata Das
Real-time stream processing systems must be operational 24/7, which requires them to recover from all kinds of failures in the system. Since its...

Databricks Expands Bay Area Presence, Moves HQ to San Francisco

January 13, 2015 by Databricks Press Office
Highlights: Databricks Expands Bay Area Presence, Moves HQ to San Francisco; Company Names Kavitha Mariappan as Marketing Vice President. Press Release: https://finance.yahoo.com/news/databricks-expands-bay-area-presence-140000610.html San...

Spark SQL Data Sources API: Unified Data Access for the Apache Spark Platform

January 9, 2015 by Michael Armbrust

ML Pipelines: A New High-Level API for MLlib

MLlib’s goal is to make practical machine learning (ML) scalable and easy. Besides new algorithms and performance improvements that we have seen in...

Announcing Apache Spark Packages

December 22, 2014 by Patrick Wendell
Today, we are happy to announce Apache Spark Packages (http://spark-packages.org), a community package index to track the growing number of open source packages and libraries that work with Apache Spark. Spark Packages makes it easy for users to find, discuss, rate, and install packages for any version of Spark, and makes it easy for developers to contribute packages.

Announcing Apache Spark 1.2

December 18, 2014 by Patrick Wendell
We at Databricks are thrilled to announce the release of Apache Spark 1.2! Apache Spark 1.2 introduces many new features along with scalability...