
Simplify Machine Learning on Apache Spark with Databricks

June 3, 2015 by Denny Lee
As many data scientists and engineers can attest, the majority of the time is spent not on the models themselves but on the...

Statistical and Mathematical Functions with DataFrames in Apache Spark

We introduced DataFrames in Apache Spark 1.3 to make Apache Spark much easier to use. Inspired by data frames in R and Python...

Databricks Launches MOOC: Data Science on Apache Spark

For the past several months, we have been working in collaboration with professors from the University of California Berkeley and University of California...

Tuning Java Garbage Collection for Apache Spark Applications

May 28, 2015 by Daoyuan Wang and Jie Huang
This is a guest post from our friends in the SSG STO Big Data Technology group at Intel. Join us at the Spark...

NTT DATA: Operating Apache Spark clusters at thousands-core scale and use cases for Telco and IoT

This is a guest blog from one of our partners: NTT DATA Corporation. About NTT DATA Corporation: NTT DATA Corporation is a...

Project Tungsten: Bringing Apache Spark Closer to Bare Metal

April 28, 2015 by Reynold Xin and Josh Rosen
In a previous blog post, we looked back and surveyed performance improvements made to Apache Spark in the past year. In this...

Recent performance improvements in Apache Spark: SQL, Python, DataFrames, and More

April 24, 2015 by Reynold Xin
Read Rise of the Data Lakehouse to explore why lakehouses are the data architecture of the future with the father of the data...

Big Graph Analytics with LynxKite & Apache Spark

April 23, 2015 by Daniel Darabos
This is a guest blog from one of our partners: Lynx Analytics. About Lynx Analytics: Lynx Analytics is a data analytics consultancy...

Analyzing Apache Access Logs with Databricks

April 21, 2015 by Ion Stoica and Vida Ha
Databricks provides a powerful platform to process, analyze, and visualize big and small data in one place. In this blog, we will illustrate...

New MLlib Algorithms in Apache Spark 1.3: FP-Growth and Power Iteration Clustering

This is a guest blog post from Huawei’s big data global team. Huawei, a Fortune Global 500 private company, has put together a...