With the Databricks Runtime 7.2 release, we are introducing a new magic command, %tensorboard. This brings the interactive TensorBoard experience that Jupyter notebook users expect to their Databricks notebooks. The %tensorboard command starts a TensorBoard server and embeds the TensorBoard user interface inside the Databricks notebook, so data scientists and machine learning engineers can visualize and debug their machine learning projects. We’ve made it much easier to use TensorBoard in Databricks.
In 2017, we released the dbutils.tensorboard.start() API to manage and use TensorBoard inside Databricks Python notebooks. This API permits only one active TensorBoard process on a cluster at any given time, which hinders multi-tenant use cases. Early last year, TensorBoard released its own API for notebooks via the %tensorboard Python magic command. This API not only starts TensorBoard processes but also exposes TensorBoard’s command-line arguments in the notebook environment. In addition, it embeds the TensorBoard UI inside notebooks, whereas the dbutils.tensorboard.start API prints a link to open TensorBoard in a new tab.
Upgrading to the %tensorboard magic command in Databricks has allowed us to take advantage of TensorBoard’s new API features. It is now possible to have multiple concurrent TensorBoard processes on a cluster as well as to interact with a TensorBoard UI inline in a notebook.
We’ve built upon the TensorBoard experience to better integrate it into the Databricks workflow:
- A link on top of the embedded TensorBoard UI to open TensorBoard in a new browser tab.
- Notebook-scoped process re-use to improve performance.
- The ability to stop a notebook’s TensorBoard servers and free up cluster resources by detaching the notebook or clearing its state.
With the introduction of the %tensorboard magic command, we are deprecating dbutils.tensorboard.start and plan to remove it in a future major Databricks Runtime release.
How to get started
Here’s how you can quickly start using %tensorboard in your machine learning project. Inside your Databricks notebook:
- Run %load_ext tensorboard to enable the %tensorboard magic command.
- Start and view your TensorBoard by running %tensorboard --logdir $experiment_log_dir, where experiment_log_dir is the path to a directory in DBFS dedicated to TensorBoard logs.
- Use TensorBoard callbacks or TensorFlow or PyTorch file writers to generate logs during your training process. To keep your logs separated by run, set the log directory of the callback or writer to a run-specific subdirectory in DBFS. For TensorFlow, this is as simple as:

    import datetime
    import os
    from tensorflow.keras import callbacks

    # Give each run its own timestamped subdirectory under the experiment directory.
    log_dir = os.path.join(experiment_log_dir, datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
    tensorboard_callback = callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

    # In your model.fit call, pass the callback in the callbacks list.
    model.fit(
        …
        callbacks=[
            …
            tensorboard_callback
        ]
    )
- Refresh your TensorBoard user interface to visualize your training process using the data you just generated.
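The run-specific log directory from the steps above can be sketched with the standard library alone. This is a minimal illustration, not part of the Databricks or TensorBoard APIs: the helper name `make_run_log_dir`, its `now` parameter, and the path `/dbfs/tmp/tensorboard_logs` are all assumptions made for this example. It joins the experiment directory and a timestamp with an explicit path separator, so each run lands in its own subdirectory whether or not the base path ends with a slash.

```python
import datetime
import posixpath


def make_run_log_dir(experiment_log_dir, now=None):
    """Return a timestamped, run-specific subdirectory path.

    Hypothetical helper for illustration; `now` can be fixed to make
    the result reproducible.
    """
    if now is None:
        now = datetime.datetime.now()
    run_name = now.strftime("%Y%m%d-%H%M%S")
    # posixpath.join inserts exactly one "/" between the two parts.
    return posixpath.join(experiment_log_dir, run_name)


# Hypothetical experiment directory, used only for this example.
experiment_log_dir = "/dbfs/tmp/tensorboard_logs"
fixed_time = datetime.datetime(2020, 8, 13, 9, 30, 0)

log_dir = make_run_log_dir(experiment_log_dir, now=fixed_time)
print(log_dir)  # /dbfs/tmp/tensorboard_logs/20200813-093000
```

Pointing %tensorboard --logdir at the parent experiment directory, rather than at a single run's subdirectory, should then let TensorBoard show the runs side by side.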
For an end-to-end example, check out this notebook using TensorBoard in a TensorFlow project.
For more details on using %tensorboard in Databricks, you can read our official documentation.