Jupyter Notebook
What is a Jupyter Notebook?
A Jupyter Notebook is an open source web application that allows data scientists to create and share documents that include live code, equations, and other multimedia resources.
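Under the hood, a notebook is just a JSON document saved as an .ipynb file. As a rough illustration, the open source nbformat library can build one programmatically; the file name and cell contents below are made up purely for the example.

```python
# A minimal sketch: an .ipynb file is plain JSON, and nbformat can
# create and save one. "example.ipynb" and the cell text are hypothetical.
import nbformat
from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell

nb = new_notebook()
nb.cells = [
    new_markdown_cell("# Sales analysis\nThe mean is $\\bar{x} = \\frac{1}{n}\\sum x_i$."),
    new_code_cell("totals = [120, 95, 143]\nsum(totals) / len(totals)"),
]

nbformat.write(nb, "example.ipynb")  # writes the notebook as JSON
```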
What are Jupyter Notebooks used for?
Jupyter notebooks are used for all sorts of data science tasks such as exploratory data analysis (EDA), data cleaning and transformation, data visualization, statistical modeling, machine learning, and deep learning.
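For instance, a typical workflow might look like the sketch below, with each step in its own cell so intermediate results can be inspected; the file name and column names (sales.csv, region, amount) are hypothetical and used only for illustration.

```python
# A typical exploratory-analysis flow; in a notebook, each step would
# usually live in its own cell. File and column names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("sales.csv")                    # load the raw data
df = df.dropna(subset=["amount"])                # basic cleaning
summary = df.groupby("region")["amount"].sum()   # transformation

summary.plot(kind="bar", title="Sales by region")  # visualization
plt.show()
```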
What are the benefits of using Jupyter Notebooks?
Jupyter notebooks are especially useful for "showing the work" that your data team has done through a combination of code, markdown, links, and images. They are easy to use and can be run cell by cell to better understand what the code does.
Jupyter notebooks can also be converted to a number of standard output formats (HTML, PowerPoint, LaTeX, PDF, reStructuredText, Markdown, Python) through the web interface. This flexibility makes it easy for data scientists to share their work with others.
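The same conversions can also be scripted with the open source nbconvert library that powers these exports. Here is a minimal sketch that assumes a hypothetical analysis.ipynb file:

```python
# Convert a notebook to standalone HTML with nbconvert, the library
# behind the web interface's export options. "analysis.ipynb" is a
# hypothetical input file.
from nbconvert import HTMLExporter

body, resources = HTMLExporter().from_filename("analysis.ipynb")

with open("analysis.html", "w", encoding="utf-8") as f:
    f.write(body)
```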
How do Jupyter Notebooks work?
A Jupyter notebook has two components: a front-end web page and a back-end kernel. The front-end web page allows data scientists to enter programming code or text in rectangular "cells." The browser then passes the code to the back-end kernel which runs the code and returns the results.
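As a rough illustration of that request/response loop, the sketch below uses the open source jupyter_client library to start a kernel, send it a "cell" of code the way the front end does, and read back the result. Error handling and timeouts are simplified for brevity.

```python
# A minimal sketch of the front-end/kernel split: start a kernel,
# send it code as the notebook web page does, and read the result.
from jupyter_client import KernelManager

km = KernelManager(kernel_name="python3")
km.start_kernel()
kc = km.client()
kc.start_channels()
kc.wait_for_ready()

kc.execute("2 + 2")  # the "cell" contents sent to the kernel

# Drain IOPub messages until the execution result comes back.
while True:
    msg = kc.get_iopub_msg(timeout=10)
    if msg["msg_type"] == "execute_result":
        print(msg["content"]["data"]["text/plain"])  # -> 4
        break

kc.stop_channels()
km.shutdown_kernel()
```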
What are the downsides of using Jupyter Notebooks?
- Difficult to maintain and keep in sync when working on code collaboratively.
- Difficult to operationalize your code: Jupyter notebooks don't include any built-in integrations or tools for putting your machine learning models into production.
- Difficult to scale — Jupyter notebooks are designed for single-node data science. If your data is too big to fit in your computer's memory, using Jupyter notebooks becomes significantly more difficult.
Are Jupyter Notebooks available on Databricks?
Looking for a powerful data science collaboration tool? Look no further than Databricks. Databricks notebooks let you work with colleagues across engineering, data science, and machine learning teams in multiple languages, with built-in data visualizations and the ability to operationalize your work with jobs. Sign up for a free trial.
Does Databricks offer support for Jupyter Notebooks?
Yes. Databricks clusters can be configured to use the IPython kernel in order to take advantage of the Jupyter ecosystem's open source tooling (display and output tools, for example). Databricks also offers support for importing and exporting .ipynb files, so you can easily pick up right where you left off in your Jupyter notebook, on Databricks — and vice versa. Finally, Databricks has long supported the core open source Jupyter libraries within the Databricks Machine Learning Runtime.
How to use the IPython kernel on Databricks
It's easy: see the Databricks documentation for setup instructions, then sign up for a free trial of Databricks.