Deep Learning with Databricks

Complex Data, Tremendous Opportunity

Deep learning is an ideal way to deliver predictive analytics on big data as data volume and complexity continue to grow, creating a need for increased processing power and more advanced graphics processors.

With deep learning, organizations are able to harness the power of unstructured data such as images, text, and voice to deliver transformative use cases that leverage techniques like AI, image interpretation, automatic translation, natural language processing, and more.

Common Use Cases

Recognize and categorize images for easier sorting and more accurate search.
Detect objects quickly to make autonomous cars and face recognition a reality.
Understand spoken words accurately to power new tools like speech-to-text and home automation.

Challenges of Deep Learning

While Big Data and AI offer a great deal of potential, extracting actionable insights from Big Data is no ordinary task. The large and rapidly growing body of information hidden in unstructured data (images, sound, text, etc.) requires both the development of advanced technologies and interdisciplinary teams (data engineering, data science, and business) working in close collaboration.

Deep learning relies on separate frameworks and tools (TensorFlow, Keras, PyTorch, MXNet, Caffe, CNTK, Theano) that offer low-level APIs with steep learning curves.
Providing the infrastructure to support deep learning can require significant amounts of costly resources and computational power to scale.
Training an accurate deep learning model can be labor-intensive for data scientists, often requiring data labeling and parameter tuning.

Democratizing Deep Learning

The Databricks Unified Analytics Platform powered by Apache Spark™ allows you to build reliable, performant, and scalable deep learning pipelines that enable data scientists to build, train, and deploy deep learning applications with ease.

Fully managed, serverless cloud infrastructure for isolation, cost control, and elasticity. An interactive environment makes it easy to work with major frameworks such as TensorFlow, Keras, PyTorch, MXNet, Caffe, CNTK, and Theano.
A single platform to handle data preparation, exploration, model training, and large-scale prediction. High-level APIs and example applications let you easily leverage state-of-the-art models.
A highly performant Databricks Runtime powered by Apache Spark and built to run on powerful GPU hardware at scale.
Collaborate with your team across multiple programming languages to explore data and train deep learning models against real-time data sets.

Customer Use Cases

One customer classifies images to improve engagement and conversions, increasing processing capacity by 20x while modeling against 100% of its dataset.

Giphy uses Databricks to understand various image properties (scene, label, colors, etc.) against tens of millions of GIFs to provide better search results and recommendations.

Voicebox leverages natural language processing to identify context in human conversations and deliver smarter AI applications such as voice-controlled devices and personal assistants.

With Databricks, Riot Games can understand and detect abusive language within actual gameplay in real time, which has helped increase customer satisfaction, retention, and lifetime value.