The Databricks Lakehouse Platform

One open, simple platform to store and manage all of your data for all of your analytics workloads

Delta Lake

Reliability and performance for data lakes

Reliable Data Lakes

Delta Lake brings data reliability and scalability to your existing data lake, with an open-source transactional storage layer designed for the full data lifecycle. Learn more about Delta Lake.

Fast & Efficient Data Pipelines

Simple data processing on autoscaling infrastructure, powered by highly optimized Apache Spark™ for up to 50x performance gains.
Learn more about Apache Spark.

Data science and machine learning

Collaboration across the full data science and machine learning lifecycle

Collaborative Notebooks

Quickly access and explore data, find and share new insights, and build models collaboratively, with languages and tools of choice.
Learn more about Notebooks.

Optimized ML Environments

One-click access to preconfigured ML environments for augmented machine learning with state-of-the-art and popular ML frameworks.
Learn more about ML Runtime.

Complete ML Lifecycle

Track and share experiments, reproduce runs, and manage models collaboratively from a central repository, from experimentation to production. Learn more about MLflow.

Business analytics

More complete and recent data to drive insights for every team

Query data lakes with SQL

Run SQL workloads directly on your data lake to query and analyze your freshest data with up to 9x better price/performance than traditional cloud data warehouses.

Visualize and share insights

Quickly and easily visualize query results and organize visualizations into rich dashboards to share live insights with your team, with automatic alerts for critical changes.

Broad BI tool integration

Use your preferred BI tools, like Tableau and Microsoft Power BI, with optimized connectors that provide fast performance, low latency, and high user concurrency on your data lake.

Enterprise security and administration

A secure, massively scalable multi-cloud platform running millions of machines every day

Platform Security

Give all your users the right access to the right data, with comprehensive audit trails. Use your existing cloud security policies and identity management system to create compliant, private, and isolated workspaces. Learn more about Platform Security.

360° Administration

Quickly spin collaborative workspaces up and down for any project, with the right tools to manage user access, control spend, audit usage, and analyze activity across every workspace, all while seamlessly enforcing user and data governance.
Learn more about 360° Administration.

Elastic Scalability

Use fully configured data environments and APIs to quickly take initiatives from development to production. Once in production, data teams can use on-demand autoscaling to optimize performance and reduce downtime of data pipelines and ML models by efficiently matching resources to demand. Learn more about Elastic Scalability.

Multi-cloud Management

Securely integrate a single platform into each cloud so your data teams can do data analytics and machine learning without having to learn cloud-specific tools and processes. Learn more about Databricks for Microsoft Azure and Amazon Web Services.

Customer Success Story

Comcast’s Journey to Building an Agile Data and AI Platform at Scale

In this talk, Jim Forsythe and Jan Neumann describe Comcast’s data and machine learning infrastructure, built on the Databricks Unified Data Analytics Platform. Comcast uses Databricks to train and fuel the machine learning models at the heart of its products and to gain deeper insights into how its users use them.

Ready to Get Started?