SQL Subqueries in Apache Spark 2.0

In the upcoming Apache Spark 2.0 release, we have substantially expanded the SQL standard capabilities. In this brief...

Apache Spark 2.0: An Anthology of Technical Assets

June 1, 2016 by Jules Damji
Older anthologies collated contributions from various authors around a theme, bound at the time as a journal or periodical. Newer anthologies include multiple...

Apache Spark 2.0 Preview: Machine Learning Model Persistence

May 31, 2016 by Joseph Bradley
Consider these Machine Learning (ML) use cases: A data scientist produces an ML model and hands it over to an engineering team...

Genome Sequencing in a Nutshell

May 24, 2016 by Deborah Siegel
This is a guest post from Deborah Siegel from the Northwest Genome Center and the University of Washington with Denny Lee from Databricks...

Parallelizing Genome Variant Analysis

May 24, 2016 by Deborah Siegel
This is a guest post from Deborah Siegel from the Northwest Genome Center and the University of Washington with Denny Lee from Databricks...

Predicting Geographic Population using Genome Variants and K-Means

May 24, 2016 by Deborah Siegel
Spark Summit 2016 will be held in San Francisco on June 6–8. Check out the full agenda and get your ticket. This is...

Apache Spark as a Compiler: Joining a Billion Rows per Second on a Laptop

When our team at Databricks planned our contributions to the upcoming Apache Spark 2.0 release, we set out with an ambitious goal by...

Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles

Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...

Technical Preview of Apache Spark 2.0 Now on Databricks

May 11, 2016 by Reynold Xin
For the past few months, we have been busy contributing to the next major release of the big data open source software we...

New Content in Databricks Community Edition

April 12, 2016 by Ion Stoica
At the Spark Summit New York, we announced Databricks Community Edition (CE) beta. CE is a free version of the Databricks service...