Introducing Databricks Runtime 5.1 for Machine Learning

Last week, we released Databricks Runtime 5.1 Beta for Machine Learning. As part of our commitment to provide developers with the latest deep learning frameworks, this release adds PyTorch and updates TensorFlow and other machine learning packages. In particular, with PyTorch built in, a developer can import the appropriate Python torch modules and start coding without installing its many dependencies. In this blog post, we briefly cover these additions.


PyTorch is a popular deep learning Python package that provides GPU-accelerated tensor computation and high-level functionality for building deep learning networks. Its flexible Tensor API is similar to NumPy arrays, but tensors can also be accelerated on GPUs.
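As a rough illustration of the NumPy-like Tensor API, here is a minimal sketch. It assumes only that the torch and numpy modules are importable on the cluster; the variable names are ours, and the code falls back to the CPU when no GPU is visible.

```python
import numpy as np
import torch

# Pick a GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build a tensor from a NumPy array and place it on the chosen device.
a_np = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
a = torch.from_numpy(a_np).to(device)
b = torch.ones(2, 2, device=device)

# Element-wise and matrix operations read much like their NumPy counterparts.
c = a + b      # element-wise addition
d = a.mm(b)    # matrix multiplication

# Copy the result back to the CPU before converting it to a NumPy array.
d_np = d.cpu().numpy()
print(d_np)
```

The same code runs unchanged on a CPU-only or a GPU cluster, because the only device-specific choice is made once when `device` is created.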

Several Databricks customers asked for built-in support for PyTorch, both for single-node applications and for distributed deep learning applications using HorovodRunner. With this release, we are including PyTorch version 0.4.1 along with tensorboardX version 1.4. In future releases, we plan to keep PyTorch support up to date.

To get started quickly, we have included a few examples of how to use PyTorch on Databricks for single-node and distributed deep learning in our user guide (see documentation below).

Updated TensorFlow

To keep abreast of the fast-moving TensorFlow project and provide our customers with its latest features, we have included TensorFlow 1.12, the latest stable version, as part of Databricks Runtime 5.1 ML Beta.
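To confirm which framework versions a cluster actually provides, a small check like the following can be run in a notebook cell. This is a sketch under assumptions: `installed_versions` is a hypothetical helper of ours, the module names are the usual import names, and a module that does not expose `__version__` is reported as "unknown".

```python
import importlib


def installed_versions(module_names):
    """Map each module name to its __version__, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module is not installed in this environment.
            versions[name] = None
            continue
        # Not every package exposes __version__, so fall back to a placeholder.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Assumed import names for the frameworks mentioned in this post.
report = installed_versions(["torch", "tensorboardX", "tensorflow"])
for name, version in sorted(report.items()):
    print("{}: {}".format(name, version))
```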

Other Machine Learning Packages

We updated the following packages