Databricks Plans and Pricing

Databricks on AWS

One platform for all your workloads.

 

Data Engineering Light

Run jobs on Databricks Automated Clusters.

Standard Plan

This is pricing for the Databricks Standard Plan only. Please visit the Databricks pricing page for more details, including pricing by instance type.

$0.07 / DBU

A Databricks Unit (DBU) is a unit of processing capability per hour, billed on per-second usage. View the instance types Databricks supports.

Standard Plan with Operational Security

This is pricing for the Standard Plan with Operational Security SKU only. Pricing for other applicable AWS resources will also apply.

$0.10 / DBU

 

Data Engineering

Run jobs on Databricks Automated Clusters, with an optimized runtime for better performance.

Standard Plan

This is pricing for the Databricks Standard Plan only. Pricing for other applicable resources will also apply. Please visit the Databricks pricing page for more details, including pricing by instance type.

$0.15 / DBU

Standard Plan with Operational Security

This is pricing for the Standard Plan with Operational Security SKU only. Pricing for other applicable AWS resources will also apply.

$0.20 / DBU

 

Data Analytics

Collaborate on projects, notebooks, and experiments with Databricks Interactive Clusters.

Standard Plan

This is pricing for the Databricks Standard Plan only. Pricing for other applicable resources will also apply. Please visit the Databricks pricing page for more details, including pricing by instance type.

$0.40 / DBU

Standard Plan with Operational Security

This is pricing for the Standard Plan with Operational Security SKU only. Pricing for other applicable AWS resources will also apply.

$0.55 / DBU


The pricing shown above is for Databricks services only. It does not include pricing for any required AWS resources (e.g. compute instances).

Standard Plan Features

(✓ = included with this workload type, ✗ = not included)

Data Engineering Light

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries
✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring
✗ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows
✗ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters
✗ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support
✗ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.
✗ Robust pipelines serving clean, quality data supporting high-performance batch and streaming analytics at scale
  • ACID transactions
  • Schema management
  • Batch/stream read/write support
  • Data versioning
  • Performance optimizations
  If you are an existing Databricks customer, please reach out to your account executive about Delta pricing.
✗ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High-concurrency clusters for multi-user sharing
✗ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (GitHub, Bitbucket)
✗ Ability to work with RStudio® and a range of third-party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC

Data Engineering

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries
✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring
✓ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows
✓ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters
✓ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support
✓ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.
✗ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High-concurrency clusters for multi-user sharing
✗ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (GitHub, Bitbucket)
✗ Ability to work with RStudio® and a range of third-party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC

Data Analytics

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries
✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring
✓ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows
✓ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters
✓ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support
✓ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.
✓ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High-concurrency clusters for multi-user sharing
✓ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (GitHub, Bitbucket)
✓ Ability to work with RStudio® and a range of third-party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC

Standard Plan with Operational Security Features

 

All three workload types (Data Engineering Light, Data Engineering, and Data Analytics) include:

✓ All Standard Plan features
✓ Single sign-on with SAML 2.0 support
✓ Role-based access control for notebooks, clusters, jobs, and tables

Custom Deployment Add-on

For those requiring additional customization

Custom pricing

You can choose any or all of the following:

  • Single tenant deployment
  • AWS GovCloud
  • HIPAA compliant
  • Audit logs
  • Restricted network access for end users
  • Customized CIDR range
  • No public IPs for worker nodes

Pricing Estimator

Use this estimator to understand how Databricks charges for different workloads. You supply:

  • Operational Security (yes or no)
  • Cluster type
  • AWS instance type (view the instance types that Databricks supports)
  • Number of instances: the number of AWS instances used for running driver and worker nodes (must be greater than or equal to 2)
  • Hours per day and days per month

From these inputs, the estimator computes:

  • Instance Hours subtotal: the monthly hours for the given instance
  • Usage (DBUs) subtotal: the monthly DBU consumption for the given cluster and instance
  • Price subtotal: the monthly price for the given cluster and instance
  • Monthly grand total: the total monthly consumption rate

Please note that the specific instances chosen and the amount of usage are very specific to your workload (ETL, streaming, ad-hoc queries, etc.). Engaging with your Databricks contact will help you select appropriate values. The estimator calculates your Databricks usage charges only; it does not include other fees that may apply, such as platform charges. Please discuss single-tenant use with your Databricks contact.
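The estimator's arithmetic can be sketched in a few lines. This is a minimal sketch, not official Databricks tooling: the per-DBU rates come from the pricing shown on this page, while the DBUs-per-instance-hour figure for any given EC2 instance type is an assumed input you would look up in the Databricks supported-instance-types list.

```python
# Sketch of the estimator arithmetic: instance hours -> DBUs -> monthly price.
# DBU rates are the Standard Plan rates from this page; DBUs per instance hour
# is an assumed input (look it up for your actual EC2 instance type).

DBU_RATES = {  # $ per DBU: (Standard Plan, Standard Plan with Operational Security)
    "Data Engineering Light": (0.07, 0.10),
    "Data Engineering": (0.15, 0.20),
    "Data Analytics": (0.40, 0.55),
}

def monthly_cost(workload, instances, hours_per_day, days_per_month,
                 dbu_per_instance_hour, operational_security=False):
    """Estimate the monthly Databricks usage charge (AWS EC2 fees not included)."""
    if instances < 2:
        raise ValueError("need at least 2 instances (driver + worker)")
    instance_hours = instances * hours_per_day * days_per_month
    dbus = instance_hours * dbu_per_instance_hour
    rate = DBU_RATES[workload][1 if operational_security else 0]
    return round(dbus * rate, 2)  # rounded to cents

# Example: 4 instances rated at 1 DBU/hour, 8 h/day, 20 days/month,
# Data Analytics on the plain Standard Plan:
# 4 * 8 * 20 = 640 instance hours -> 640 DBUs -> 640 * $0.40 = $256.00
print(monthly_cost("Data Analytics", 4, 8, 20, 1.0))  # → 256.0
```

The same workload on Data Engineering Light would cost 640 × $0.07 = $44.80, which is why choosing the right workload type matters as much as instance sizing.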


FAQs

 

What is a DBU?

A Databricks Unit (“DBU”) is a unit of processing capability per hour, billed on per-second usage. Databricks supports many AWS EC2 instance types; the larger the instance, the more DBUs you consume per hour. For example, 1 DBU is the equivalent of Databricks running on a c4.2xlarge machine for an hour. See the full list of supported instances and details.
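The consumption rule above can be sketched as follows. The c4.2xlarge rating of 1 DBU/hour is stated on this page; the second instance rating is a placeholder for illustration only, not a real value.

```python
# Sketch of DBU consumption: each supported EC2 instance type has a
# DBUs-per-hour rating, and billing is per second, so partial hours prorate.

DBU_PER_HOUR = {
    "c4.2xlarge": 1.0,     # stated on this page
    "example.large": 0.5,  # hypothetical rating for illustration
}

def dbus_consumed(instance_type, seconds):
    """DBUs consumed by one instance running for `seconds` (per-second billing)."""
    return DBU_PER_HOUR[instance_type] * seconds / 3600

# A c4.2xlarge running for 90 minutes consumes 1.5 DBUs:
print(dbus_consumed("c4.2xlarge", 90 * 60))  # → 1.5
```

Multiply the result by the per-DBU rate of your plan and workload type to get the dollar charge.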

What is the difference between Automated Clusters and Interactive Clusters?

Automated workloads are workloads running on automated clusters. Automated clusters are clusters that are both started and terminated by the same job; for isolation purposes, only one job can run on an automated cluster.

Interactive workloads are workloads running on interactive clusters. Interactive clusters are clusters that are not classified as automated clusters. They can be used for various purposes, such as running commands within Databricks notebooks, connecting via JDBC/ODBC for BI workloads, or running MLflow experiments on Databricks. Multiple users can share an interactive cluster for doing interactive analysis in a collaborative way.

There are two cluster options for production jobs – Data Engineering Light and Data Engineering. How do I decide which one to use?

Data Engineering Light is Databricks’ equivalent of open source Apache Spark. It targets simple, non-critical workloads that don’t need the performance, reliability, or autoscaling benefits provided by Databricks’ proprietary technologies. In comparison, the Data Engineering cluster provides you with all of the aforementioned benefits to boost your team productivity and reduce your total cost of ownership.

What’s the difference between production and interactive analysis workloads?

Production workloads (automated workloads) are defined as jobs that both start and terminate the clusters on which they run. For example, a workload may be triggered by the Databricks Job Scheduler which launches a new Apache Spark cluster solely for the job and automatically terminates the cluster after the job is complete.

Interactive analysis workloads are workloads that are not automated workloads, e.g., running a command within Databricks notebooks. These commands run on Apache Spark clusters that may persist until manually terminated. Multiple users can share a cluster for doing interactive analysis in a collaborative way.

What does the free trial include?

The 14-day free trial gives you access to all Databricks features except the Databricks Operational Security Package and Custom Deployment options. Contact us if you are interested in the Databricks Operational Security and/or Custom Deployment options.

Note that during trial, AWS will bill you directly for the EC2 instances created in Databricks.

What happens after the free trial?

At the end of the trial, you are automatically subscribed to Databricks without the Operational Security Package. You can cancel your subscription at any time.

What is Databricks Community Edition?

Databricks Community Edition is a free, limited functionality platform designed for anyone who wants to learn Spark. Sign up here.

How will I be billed?

By default, you will be billed monthly to your credit card, based on per-second usage. Contact us for more billing options, such as billing by invoice or an annual plan.

Do you provide technical support?

We offer technical support with annual commitments. Contact us to learn more or get started.