
Data-driven Software: Towards the Future of Programming in Data Science

This is a guest-authored post by Tim Hunter, data scientist, and Rocío Ventura Abreu, data scientist, of ABN AMRO Bank...

Koalas: Easy Transition from pandas to Apache Spark

April 24, 2019 by Tony Liu, Tim Hunter and Cyrielle Simeone
Today at Spark + AI Summit, we announced Koalas, a new open source project that augments PySpark’s DataFrame API to make it compatible...

Build, Scale, and Deploy Deep Learning Pipelines with Ease

September 6, 2017 by Sue Ann Hong and Tim Hunter
At the Spark Summit in San Francisco in June, we announced an open-source project, Deep Learning Pipelines. Deep Learning Pipelines provides...

A Vision for Making Deep Learning Simple

When MapReduce was introduced 15 years ago, it showed the world a glimpse into the future. For the...

On-Demand Webinar and FAQ: Deep Learning and Apache Spark: Workflows and Best Practices

May 23, 2017 by Tim Hunter and Jules Damji
On May 4th, we hosted a live webinar, Deep Learning and Apache Spark: Workflows and Best Practices. Rather than comparing deep...

Deep Learning on Databricks

December 21, 2016 by Joseph Bradley and Tim Hunter
We are excited to announce the general availability of Graphics Processing Unit (GPU) and deep learning support on Databricks! This blog post will...

GPU Acceleration in Databricks

Databricks is adding support for Apache Spark clusters with Graphics Processing Units (GPUs), ready to accelerate Deep Learning workloads. With Spark deployments tuned...

Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles

Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...

Introducing GraphFrames

We would like to thank Ankur Dave from UC Berkeley AMPLab for his contribution to this blog post. Databricks is excited to announce...

Auto-scaling scikit-learn with Apache Spark

February 8, 2016 by Tim Hunter and Joseph Bradley
Data scientists often spend hours or days tuning models to get the highest accuracy. This tuning typically involves running a large number of...