Databricks Adds Deep Learning Support to Cloud-Based Apache Spark Platform

Company Delivers Comprehensive Deep Learning Toolkit for Big Data with GPUs Alongside CPUs

October 27, 2016

SAN FRANCISCO, CA--(Marketwired - Oct 27, 2016) - Databricks®, the company founded by the creators of the Apache® Spark™ project, today announced the addition of deep learning support to its cloud-based Apache Spark platform. This enhancement adds GPU support and integrates popular deep learning libraries into the Databricks big data platform, extending its capabilities to enable the rapid development of deep learning models. Data scientists looking to combine deep learning with big data -- whether for recognizing handwriting, translating speech between languages, or distinguishing between malignant and benign tumors -- can now use Databricks for every stage of their workflow, from data wrangling to model tuning. Databricks is the first to integrate these diverse workloads in a fast, secure, and easy-to-use Apache Spark platform in the cloud.

Apache Spark and Deep Learning
The 2016 Spark Survey found that machine learning usage in production has increased 38 percent since 2015, making it one of Spark's key growth areas. Many leaders in machine learning, such as Yahoo, are choosing Spark for deep learning to achieve groundbreaking results with big data.

In March 2016, Databricks created and open sourced TensorFrames, a software library that enables the popular deep learning framework TensorFlow to run on Spark. The enhancements announced today simplify deep learning on Spark by adding out-of-the-box support for using TensorFrames with GPUs -- specialized hardware that can perform a large number of deep learning-specific computations in parallel. With Databricks, data teams can easily conduct deep learning on highly optimized hardware with a few clicks or API calls.

"We are proud to enable organizations to achieve better results in their mission-critical applications and are always looking ahead at the latest technologies -- such as deep learning -- to provide the Spark community with the most flexible, approachable big data toolset," said Ali Ghodsi, CEO and Cofounder at Databricks.

End-to-End Deep Learning with Databricks
Databricks allows organizations to perform data wrangling, interactive exploration, stream data processing, and other advanced analytics techniques alongside deep learning in a comprehensive platform. By seamlessly combining these techniques on Databricks, organizations can avoid unwanted system complexities and simplify the development of deep learning applications such as:

  • More timely and accurate cancer detection for healthcare providers: To read and interpret pathology images with higher accuracy than humans;
  • Faster drug discovery for pharma: To predict therapeutic uses of drugs at earlier stages to speed up the development and sales pipelines;
  • More capable artificial intelligence, such as language translation: To translate spoken speech with computers at an accuracy that rivals human performance.

"Today's dynamic data teams are applying a broad range of analytic tools to more data, but requiring insights and faster ROI," said Tony Baer, Principal Analyst at Ovum. "With the Databricks' platform, they can easily utilize the latest innovations, whether it's Spark Streaming or deep learning, enabling them to build and deploy sophisticated business applications, in a simpler and faster way."

Read the blog to learn more: https://www.databricks.com/blog/2016/10/27/gpu-acceleration-in-databricks.html

Contact Databricks to get started: https://www.databricks.com/company/contact

About Databricks

Databricks' vision is to empower anyone to easily build and deploy advanced analytics solutions. The company was founded by the team that created Apache® Spark™, a powerful open source data processing engine built for sophisticated analytics, ease of use, and speed. Databricks is the largest contributor to the open source Apache Spark project. The company has also trained over 20,000 users on Apache Spark and has the largest number of customers deploying Spark to date. Databricks provides a just-in-time data platform to simplify data integration, real-time experimentation, and robust deployment of production applications. Databricks is venture-backed by Andreessen Horowitz and NEA. For more information, contact [email protected].

TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.
