AWS Pricing

One platform for all your workloads.

Standard

$0.07/DBU for Jobs Light Compute
$0.15/DBU for Jobs Compute
$0.40/DBU for All-Purpose Compute

A Databricks Unit (DBU) is a unit of processing capability per hour, billed on per-second usage. See the FAQ below for details and the instance types Databricks supports.


  • Managed Apache Spark
  • Optimized Delta Lake
  • Cluster autopilot
  • Notebooks & collaboration
  • Connectors & Integration
  • ML Runtime
  • Managed MLflow
  • Single Sign-On (SSO)

Premium

$0.10/DBU for Jobs Light Compute
$0.20/DBU for Jobs Compute
$0.55/DBU for All-Purpose Compute


  • Standard Features +
  • Optimized autoscaling
  • Federated IAM
  • Role-based Access Control
  • Audit Log delivery
  • Customer Managed VPC (preview)
  • Token Management API (preview)
  • Policies (preview)
  • Secure Cluster Connectivity (preview)

Dedicated

Custom Pricing

Pricing solutions tailored to your company’s needs

  • Enterprise Features +
  • PVC (Private Cloud)
  • Enterprise Customization

The pricing shown above is for Databricks services only. It does not include pricing for any required AWS resources (e.g. compute instances).

Workload Types Feature Comparison

Jobs Light Compute: run Databricks jobs on Jobs Light clusters with Databricks' open source Spark runtime.
Jobs Compute: run Databricks jobs on Jobs clusters with Databricks' optimized runtime for massive performance and scalability improvements.
All-Purpose Compute: run any workload on All-Purpose clusters, including interactive data science and analysis, BI workloads via JDBC/ODBC, MLflow experiments, and Databricks jobs.

| Feature | Jobs Light Compute | Jobs Compute | All-Purpose Compute |
| --- | --- | --- | --- |
| Apache Spark on Databricks platform: clusters for running production jobs; alerting and monitoring with retries | ✓ | ✓ | ✓ |
| Easy to run production jobs including streaming with monitoring: scheduler for running libraries; production streaming with monitoring | ✓ | ✓ | ✓ |
| Scala, Python, R and SQL notebooks and notebook workflows: schedule Scala, Python, R, SQL notebooks; notebook workflows | – | ✓ | ✓ |
| Easy-to-manage, cost-effective clusters: autoscaling of compute; autoscaling of instance storage; automatic start and termination of clusters | – | ✓ | ✓ |
| Out-of-the-box ML frameworks: Apache Spark / Horovod integration; XGBoost support; TensorFlow, PyTorch and Keras support | – | ✓ | ✓ |
| Managed MLflow to simplify the end-to-end ML lifecycle: MLflow remote execution on the Databricks platform; Databricks-managed tracking server; run MLflow from outside of Databricks (usage may be subject to a limit)* | – | ✓ | ✓ |
| Managed Delta Lake: robust pipelines serving clean, quality data for high-performance batch and streaming analytics at scale; ACID transactions; schema management; batch/stream read/write support; data versioning; performance optimizations** | – | ✓ | ✓ |
| High-concurrency mode for multiple users: persistent clusters for analytics; high-concurrency clusters for multi-user sharing | – | – | ✓ |
| Highly productive work among analysts and other colleagues: Scala, Python, SQL and R notebooks; one-click visualization; interactive dashboards; collaboration; revision history; version control systems integration (GitHub, Bitbucket) | – | – | ✓ |
| RStudio® and a range of third-party BI tools: RStudio integration; BI integration through JDBC/ODBC | – | – | ✓ |

* If you want to call Managed MLflow from outside of Databricks, please contact us to get started.
** If you are an existing Databricks customer, please reach out to your account executive about Delta pricing.

AWS Pricing FAQs

 

What is a DBU?

A Databricks Unit (“DBU”) is a unit of processing capability per hour, billed on per-second usage. Databricks supports many AWS EC2 instance types; the larger the instance, the more DBUs it consumes per hour. For example, 1 DBU is the equivalent of Databricks running on a c4.2xlarge machine for an hour. See the full list of supported instances and details.
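As a rough sketch of the arithmetic, the snippet below estimates the Databricks fee for a hypothetical job. The $0.15/DBU rate is the Standard-plan Jobs Compute price from the table above, the 1 DBU/hour figure is the c4.2xlarge example, and the cluster size and runtime are made up.

```python
# Sketch: estimating the Databricks fee for a hypothetical job.
# Assumptions: c4.2xlarge ~ 1 DBU/hour (example above), Standard plan
# Jobs Compute rate of $0.15/DBU, 10 nodes, 30-minute runtime.
DBU_PER_NODE_HOUR = 1.0   # c4.2xlarge consumes ~1 DBU per hour
PRICE_PER_DBU = 0.15      # $/DBU, Standard plan, Jobs Compute
num_nodes = 10            # hypothetical cluster size
runtime_seconds = 1800    # 30-minute job; usage is billed per second

dbus = num_nodes * DBU_PER_NODE_HOUR * runtime_seconds / 3600
fee = dbus * PRICE_PER_DBU
print(f"{dbus:.2f} DBUs -> ${fee:.2f} in Databricks fees")  # 5.00 DBUs -> $0.75

# Note: the EC2 instances themselves are billed separately by AWS.
```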

What is the difference between Jobs workloads and All-Purpose workloads?

Jobs workloads are workloads running on Jobs clusters. A Jobs cluster is started and terminated by the job itself, and for isolation only one job can run on a given Jobs cluster.

All-Purpose workloads are workloads running on All-Purpose clusters, that is, any clusters not classified as Jobs clusters. They can be used for various purposes, such as running commands within Databricks notebooks, connecting via JDBC/ODBC for BI workloads, or running MLflow experiments on Databricks. Multiple users can share an All-Purpose cluster for collaborative, interactive analysis.
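For a concrete illustration, here is a minimal sketch using the Databricks Jobs API (REST 2.0): passing new_cluster defines a Jobs cluster that exists only for the run, while existing_cluster_id points the job at a long-lived All-Purpose cluster. The workspace URL, token, job names, notebook paths, and cluster id are all placeholders.

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder workspace URL
TOKEN = "<personal-access-token>"                       # placeholder API token

# Jobs cluster: defined inline with "new_cluster". Databricks starts it for
# the run and terminates it afterwards, so it is billed at Jobs Compute rates.
jobs_cluster_job = {
    "name": "nightly-etl",                          # placeholder job name
    "new_cluster": {
        "spark_version": "6.4.x-scala2.11",         # example runtime version
        "node_type_id": "c4.2xlarge",
        "num_workers": 2,
    },
    "notebook_task": {"notebook_path": "/Production/etl"},  # placeholder path
}

# All-Purpose cluster: referenced by "existing_cluster_id". The cluster
# outlives the job and is billed at All-Purpose Compute rates.
all_purpose_job = {
    "name": "ad-hoc-analysis",                      # placeholder job name
    "existing_cluster_id": "1234-567890-abcde123",  # placeholder cluster id
    "notebook_task": {"notebook_path": "/Users/me/analysis"},
}

for payload in (jobs_cluster_job, all_purpose_job):
    resp = requests.post(
        f"{HOST}/api/2.0/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=payload,
    )
    print(resp.json())  # e.g. {"job_id": 42}
```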

There are two cluster options for jobs – Jobs cluster and Jobs Light cluster. How do I decide which one to use?

The Jobs Light cluster is Databricks' equivalent of open source Apache Spark. It targets simple, non-critical workloads that don't need the performance, reliability, or autoscaling benefits of Databricks' proprietary technologies. In comparison, the Jobs cluster provides all of those benefits, boosting your team's productivity and reducing your total cost of ownership.

What does the free trial include?

The 14-day free trial gives you access to either the Standard or the Premium feature set, depending on the plan you choose. Contact us if you are interested in the Databricks Enterprise or Dedicated plans for custom deployments and other enterprise customizations.

Note that during the trial, AWS will bill you directly for the EC2 instances created in Databricks.

What happens after the free trial?

At the end of the trial, you are automatically subscribed to the plan you were on during the free trial. You can cancel your subscription at any time.

What is Databricks Community Edition?

Databricks Community Edition is a free platform with limited functionality, designed for anyone who wants to learn Spark. Sign up here.

How will I be billed?

By default, you will be billed monthly to your credit card, based on per-second usage. Contact us for other billing options, such as billing by invoice or an annual plan.

Do you provide technical support?

We offer technical support with annual commitments. Contact us to learn more or get started.

I want to process protected health information (PHI) within Databricks / I want a HIPAA-compliant deployment. Is there anything I need to know to get started?

You must contact us for a HIPAA-compliant deployment. Please note that before processing any PHI data in Databricks, a signed business associate agreement (BAA) must be in place between your organization and (a) Databricks, Inc. and (b) Amazon Web Services, since you must have your own AWS account to deploy Databricks on AWS. Please see here for more details.

For features marked as “(Preview)”, what does that mean? Will these features be automatically turned on?

Preview features are not turned on automatically; please contact us to get access to them.

For intra-node encryption for Spark, is there anything I need to do to turn it on?

Yes. We let customers decide for themselves whether the tradeoffs of additional encryption are worthwhile for the workloads being processed. Please contact us to enable it.