10 Things I Wish I Knew Before Using Apache SparkR
December 28, 2016 by Neil Dewar in Engineering
This is a guest post from Neil Dewar, a senior data science manager at a global asset management firm. In this blog...

Deep Learning on Databricks
December 21, 2016 by Joseph Bradley and Tim Hunter in Engineering
We are excited to announce the general availability of Graphics Processing Unit (GPU) and deep learning support on Databricks! This blog post will...

Scalable Partition Handling for Cloud-Native Architecture in Apache Spark 2.1
December 15, 2016 by Eric Liang, Michael Allman and Wenchen Fan in Engineering
Apache Spark 2.1 is just around the corner: the community is going through the voting process for the release candidates. This blog post discusses...

On Demand Webinar and FAQ: Apache Spark MLlib 2.x: Migrating ML Workloads to DataFrames
December 14, 2016 by Joseph Bradley and Jules Damji in Company
Last week, we held a live webinar, Apache Spark MLlib 2.x: Migrating ML Workloads to DataFrames, to demonstrate the ease with which...

Apache Spark Scala Library Development with Databricks
December 12, 2016 by Jason Pohl in Company
The movie Toy Story was released in 1995 by Pixar as the first feature-length computer animated film. Even...

Integrating Apache Airflow and Databricks: Building ETL pipelines with Apache Spark
December 8, 2016 by Peyman Mohajerian in Product
This is one of a series of blogs on integrating Databricks with commonly used software packages. See the “What’s Next” section at the...

On-Demand Webinar and FAQ: How to Evaluate Cloud-based Apache Spark Platforms
November 23, 2016 by Wayne Chan in Company
Last week, we held a live webinar, How to Evaluate Cloud-based Apache Spark Platforms, to help those who are currently evaluating various...

Oil and Gas Asset Optimization with AWS Kinesis, RDS, and Databricks
November 16, 2016 by Don Hillborn in Product
The key to success is consistently making good decisions, and the key to making good decisions is having good information. This belief is...

Databricks Bi-Weekly Apache Spark Digest: 11/16/16
November 15, 2016 by Jules Damji in Engineering
Spark Summit Talks and Apache Spark Roundup: Databricks and partners set a new world record for the CloudSort 2016 Benchmark using Apache Spark...

$1.44 per terabyte: setting a new world record with Apache Spark
November 14, 2016 by Reynold Xin in Engineering
We are excited to share with you that a joint effort by Nanjing University, Alibaba Group, and Databricks set a new world record...