Data-driven Software: Towards the Future of Programming in Data Science. May 4, 2021 by Tim Hunter and Rocío Ventura Abreu in Engineering Blog. This is a guest-authored post by Tim Hunter, data scientist, and Rocío Ventura Abreu, data scientist, of ABN AMRO Bank...
Koalas: Easy Transition from pandas to Apache Spark. April 24, 2019 by Tony Liu, Tim Hunter and Cyrielle Simeone in Solutions. Today at Spark + AI Summit, we announced Koalas, a new open source project that augments PySpark’s DataFrame API to make it compatible...
Build, Scale, and Deploy Deep Learning Pipelines with Ease. September 6, 2017 by Sue Ann Hong and Tim Hunter in Announcements. At the Spark Summit in San Francisco in June, we announced an open-source project, Deep Learning Pipelines. Deep Learning Pipelines provides...
A Vision for Making Deep Learning Simple. June 6, 2017 by Sue Ann Hong, Tim Hunter and Reynold Xin in Engineering Blog. When MapReduce was introduced 15 years ago, it showed the world a glimpse into the future. For the...
On-Demand Webinar and FAQ: Deep Learning and Apache Spark: Workflows and Best Practices. May 23, 2017 by Tim Hunter and Jules Damji in Engineering Blog. On May 4th, we hosted a live webinar, Deep Learning and Apache Spark: Workflows and Best Practices. Rather than comparing deep...
Deep Learning on Databricks. December 21, 2016 by Joseph Bradley and Tim Hunter in Engineering Blog. We are excited to announce the general availability of Graphics Processing Unit (GPU) and deep learning support on Databricks! This blog post will...
GPU Acceleration in Databricks. October 27, 2016 by Joseph Bradley, Tim Hunter and Yandong Mao in Engineering Blog. Databricks is adding support for Apache Spark clusters with Graphics Processing Units (GPUs), ready to accelerate Deep Learning workloads. With Spark deployments tuned...
Approximate Algorithms in Apache Spark: HyperLogLog and Quantiles. May 19, 2016 by Tim Hunter, Hossein Falaki and Joseph Bradley in Solutions. Apache Spark is fast, but applications such as preliminary data exploration need to be even faster and are willing to sacrifice some...
Introducing GraphFrames. March 3, 2016 by Ankur Dave, Joseph Bradley and Tim Hunter in Engineering Blog. We would like to thank Ankur Dave from UC Berkeley AMPLab for his contribution to this blog post. Databricks is excited to announce...
Auto-scaling scikit-learn with Apache Spark. February 8, 2016 by Tim Hunter and Joseph Bradley in Engineering Blog. Data scientists often spend hours or days tuning models to get the highest accuracy. This tuning typically involves running a large number of...