Databricks Plans and Pricing

Databricks on AWS

One platform for all your workloads.

Unified Analytics Platform

Data Engineering Light

Run jobs on Databricks Automated Clusters

$0.07/DBU

A Databricks Unit (DBU) is a unit of processing capability per hour, billed on per-second usage. View the instance types Databricks supports.

Data Engineering

Run jobs on Databricks Automated Clusters, with optimized runtime for better performance

$0.15/DBU

Data Analytics

Collaborate on projects, notebooks, and experiments with Databricks Interactive Clusters

$0.40/DBU

The pricing shown above is for Databricks services only. It does not include pricing for any required AWS resources (e.g. compute instances).
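As a minimal illustration of how per-second DBU billing adds up, here is a small sketch. It assumes the 1 DBU/hour figure for a c4.2xlarge instance given in the FAQ below; other instance types carry different DBU rates.

```python
# Minimal sketch of per-second DBU billing. This is the Databricks charge only;
# the underlying EC2 instances are billed separately by AWS.
dbu_per_hour = 1.0          # c4.2xlarge ~ 1 DBU per hour (see the DBU FAQ below)
runtime_seconds = 30 * 60   # a 30-minute job
rate_per_dbu = 0.15         # Data Engineering plan, $/DBU

dbus_consumed = dbu_per_hour * runtime_seconds / 3600
print(f"Databricks charge: ${dbus_consumed * rate_per_dbu:.3f}")  # -> $0.075
```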
 

Try Databricks Free

Feature Comparison

 

✓ = included in the plan   ✗ = not included

Data Engineering Light

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries

✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring

✗ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows

✗ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters

✗ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support

✗ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.

✗ Robust pipelines serving clean, quality data supporting high performance batch and streaming analytics at scale
  • ACID transactions
  • Schema management
  • Batch/stream read/write support
  • Data versioning
  • Performance optimizations
  If you are an existing Databricks customer, please reach out to your account executive about Delta pricing.

✗ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High concurrency clusters for multi-user sharing

✗ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (Github, Bitbucket)

✗ Ability to work with RStudio® and a range of third party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC
 

Data Engineering

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries

✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring

✓ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows

✓ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters

✓ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support

✓ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.

✗ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High concurrency clusters for multi-user sharing

✗ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (Github, Bitbucket)

✗ Ability to work with RStudio® and a range of third party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC
 

Data Analytics

✓ Apache Spark on Databricks platform
  • Clusters for running production jobs
  • Alerting and monitoring with retries

✓ Easy to run production jobs including streaming with monitoring
  • Scheduler for running libraries
  • Production streaming with monitoring

✓ Ability to use Scala, Python, R and SQL notebooks and notebook workflows
  • Schedule Scala, Python, R, SQL notebooks
  • Notebook workflows

✓ Easy to manage and cost-effective clusters
  • Optimized autoscaling of compute
  • Autoscaling of instance storage
  • Automatic start and termination of clusters

✓ Out-of-the-box ML frameworks
  • Apache Spark / Horovod integration
  • XGBoost support
  • TensorFlow, PyTorch and Keras support

✓ Run MLflow on Databricks platform to simplify the end-to-end ML lifecycle
  • MLflow remote execution on Databricks platform
  • Databricks managed tracking server
  • Run MLflow from outside of Databricks (usage may be subject to a limit)
  If you want to call Managed MLflow from outside of Databricks, please contact us to get started.

✓ High-concurrency mode for multiple users
  • Persistent clusters for analytics
  • High concurrency clusters for multi-user sharing

✓ Highly productive work among analysts and with other colleagues
  • Scala, Python, SQL and R notebooks
  • One-click visualization
  • Interactive dashboards
  • Collaboration
  • Revision history
  • Version control systems integration (Github, Bitbucket)

✓ Ability to work with RStudio® and a range of third party BI tools
  • RStudio integration
  • BI integration through JDBC/ODBC

 

Add-Ons

Operational Security

$0.15/DBU

For those requiring enterprise security capabilities

This add-on includes all of the following:

  • Role-based access control for notebooks, clusters, jobs, tables
  • Single sign-on with SAML 2.0 support

Custom Deployment

Custom pricing

For those requiring additional customization

You can choose one or more of the following:

  • Single tenant deployment
  • AWS GovCloud
  • HIPAA compliant
  • Audit logs
  • No public IPs for worker nodes
  • Customized CIDR range
  • Restricted network access for end users

Managed Delta Lake
(for existing customers on the legacy model)

$0.15/DBU

Robust pipelines serving clean, quality data supporting high performance batch and streaming analytics

This add-on includes all of the following (illustrated in the brief sketch after the list):

  • ACID transactions
  • Schema management
  • Batch/Stream read/write support
  • Data versioning
  • Performance optimizations
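For readers unfamiliar with these terms, here is a minimal sketch of what the capabilities above look like in practice. It assumes a Databricks notebook where a SparkSession named `spark` is already provided; the table path `/delta/events` is purely illustrative.

```python
# Hypothetical Delta table location; any DBFS/S3 path would do.
path = "/delta/events"

# ACID batch write: the overwrite either fully succeeds or leaves the old data intact.
spark.range(100).withColumnRenamed("id", "event_id") \
    .write.format("delta").mode("overwrite").save(path)

# Schema management: appends with a mismatched schema are rejected by default.
# Batch and streaming reads work against the same table.
batch_df = spark.read.format("delta").load(path)
stream_df = spark.readStream.format("delta").load(path)

# Data versioning ("time travel"): read the table as of an earlier version.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)
```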

Pricing Estimator

Use this estimator to understand how Databricks charges for different workloads

Estimator options:

  • Operational Security: No ($0.00/DBU) or Yes ($0.15/DBU)

For each cluster, the estimator takes the following inputs:

  • Cluster Type
  • AWS Instance Type (view the instance types that Databricks supports)
  • # Instances: the number of AWS instances used for running the driver and worker nodes; the value must be greater than or equal to 2
  • Hours / Day
  • Days / Month

and computes:

  • Instance Hours: the monthly hours for the given instance
  • Usage (DBUs): the monthly DBU consumption for the given cluster and instance
  • Price/month: the monthly price for the given cluster and instance

The monthly total (grand total) is the total monthly consumption rate across all clusters.



Please note that the instance types you choose and the amount of usage depend heavily on your workload (ETL, streaming, ad-hoc queries, etc.); engaging with your Databricks contact will help you select appropriate values. The estimator calculates your Databricks usage charges only and does not include other fees that may be applicable (such as AWS platform charges). Please discuss single-tenant use with your Databricks contact.
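To make the estimator's arithmetic explicit, here is a rough sketch of the calculation it performs for a single cluster. The DBU-per-instance-hour figure is something you supply (the FAQ below notes that a c4.2xlarge consumes roughly 1 DBU per hour; other instance types differ):

```python
# Rough sketch of the per-cluster estimate. Databricks charges only; EC2 cost
# is billed separately by AWS.
def monthly_databricks_cost(num_instances, hours_per_day, days_per_month,
                            dbu_per_instance_hour, plan_rate_per_dbu,
                            operational_security=False):
    # num_instances covers the driver and workers, and should be >= 2.
    instance_hours = num_instances * hours_per_day * days_per_month
    usage_dbus = instance_hours * dbu_per_instance_hour
    rate = plan_rate_per_dbu + (0.15 if operational_security else 0.0)
    return instance_hours, usage_dbus, usage_dbus * rate

# Example: 4 c4.2xlarge instances (1 DBU/hour each), 8 hours/day, 20 days/month,
# on the Data Engineering plan ($0.15/DBU), without Operational Security.
hours, dbus, price = monthly_databricks_cost(4, 8, 20, 1.0, 0.15)
print(hours, dbus, f"${price:.2f}")  # -> 640 instance hours, 640 DBUs, $96.00
```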


FAQs

 

What is a DBU?

A Databricks Unit ("DBU") is a unit of processing capability per hour, billed on per-second usage. Databricks supports many AWS EC2 instance types; the larger the instance, the more DBUs it consumes per hour. For example, 1 DBU is the equivalent of Databricks running on a c4.2xlarge machine for an hour. See the full list of supported instances and details.

What is the difference between Automated Clusters and Interactive Clusters?

Automated workloads are workloads running on automated clusters. Automated clusters are clusters that are both started and terminated by the same job. Only one job can be run on an automated cluster, for isolation purposes.

Interactive workloads are workloads running on interactive clusters. Interactive clusters are clusters that are not classified as automated clusters. They can be used for various purposes, such as running commands within Databricks notebooks, connecting via JDBC/ODBC for BI workloads, and running MLflow experiments on Databricks. Multiple users can share an interactive cluster for collaborative interactive analysis.
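As a concrete illustration of the distinction, the sketch below submits the same notebook task two ways via the Databricks Jobs REST API: once with a new_cluster spec (an automated cluster created for the run and terminated when it finishes) and once against an existing_cluster_id (a long-running interactive cluster). This is a minimal sketch, not an official example: the host, token, notebook paths, and cluster id are placeholders, and the runtime and instance labels are assumptions chosen for illustration.

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"   # placeholder
TOKEN = "<personal-access-token>"                         # placeholder

# Automated cluster: created for this run and terminated automatically afterwards.
automated_run = {
    "run_name": "nightly-etl",
    "new_cluster": {
        "spark_version": "5.3.x-scala2.11",   # example runtime label
        "node_type_id": "c4.2xlarge",          # ~1 DBU/hour per the DBU FAQ
        "num_workers": 2,
    },
    "notebook_task": {"notebook_path": "/Jobs/nightly_etl"},
}

# Interactive cluster: the same task attached to a persistent, shared cluster.
interactive_run = {
    "run_name": "adhoc-analysis",
    "existing_cluster_id": "<interactive-cluster-id>",
    "notebook_task": {"notebook_path": "/Users/me/adhoc_analysis"},
}

resp = requests.post(f"{HOST}/api/2.0/jobs/runs/submit",
                     headers={"Authorization": f"Bearer {TOKEN}"},
                     json=automated_run)
print(resp.json())
```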

There are two cluster options for production jobs – Data Engineering Light and Data Engineering. How do I decide which one to use?

Data Engineering Light is Databricks’ equivalent of open source Apache Spark. It targets simple, non-critical workloads that don’t need the performance, reliability, or autoscaling benefits provided by Databricks’ proprietary technologies. In comparison, the Data Engineering cluster provides you with all of the aforementioned benefits to boost your team productivity and reduce your total cost of ownership.

What’s the difference between production and interactive analysis workloads?

Production workloads (automated workloads) are defined as jobs that both start and terminate the clusters on which they run. For example, a workload may be triggered by the Databricks Job Scheduler which launches a new Apache Spark cluster solely for the job and automatically terminates the cluster after the job is complete.

Interactive analysis workloads are workloads that are not automated workloads, e.g., running a command within Databricks notebooks. These commands run on Apache Spark clusters that may persist until manually terminated. Multiple users can share a cluster for doing interactive analysis in a collaborative way.

Databricks Operational Security add-on package has an additional charge for DBU usage. Does it apply to all workload types?

Yes, Databricks Operational Security applies to all types of clusters offered. When you choose to deploy Databricks with Databricks Operational Security, an additional charge of $0.15 per DBU is applied on top of the cluster price; for example, Data Analytics usage would be billed at $0.40 + $0.15 = $0.55 per DBU. Please contact us for details.

What does the free trial include?

The 14-day free trial gives you access to all Databricks features except the Databricks Operational Security package and Custom Deployment options. Contact us if you are interested in Databricks Operational Security and/or Custom Deployment.

Note that during trial, AWS will bill you directly for the EC2 instances created in Databricks.

What happens after the free trial?

At the end of the trial, you are automatically subscribed to Databricks without the Databricks Operational Security package. You can cancel your subscription at any time.

What is Databricks Community Edition?

Databricks Community Edition is a free, limited functionality platform designed for anyone who wants to learn Spark. Sign up here.

How will I be billed?

By default, you will be billed monthly based on per-second usage on your credit card. Contact us for more billing options, such as billing by invoice or an annual plan.

Do you provide technical support?

We offer technical support with annual commitments. Contact us to learn more or get started.