One cloud platform for massive-scale data engineering and collaborative data science
Collaboration across the full data and machine learning lifecycle
Quickly access and explore data, find and share new insights, and build models collaboratively, with the languages and tools of your choice. Learn more about Notebooks.
One-click access to preconfigured ML environments for augmented machine learning with state-of-the-art and popular ML frameworks.
Learn more about ML Runtime.
Track and share experiments, reproduce runs, and manage models collaboratively from a central repository, from experimentation to production. Learn more about MLflow.
High-quality data with great performance
Delta Lake brings data reliability and scalability to your existing data lake, with an open source transactional storage layer designed for the full data lifecycle. Learn more about Delta Lake.
Simple data processing on autoscaling infrastructure, powered by highly optimized Apache Spark™ for up to 50x performance gains. Learn more about Apache Spark.
Leverage your entire data lake, including streaming data, for the most complete BI reporting and visualizations.
A massively scalable and secure multi-cloud service running millions of machines every day
Give all your users the right access to the right data with comprehensive audit trails by using your existing cloud security policies and identity management system to create compliant, private, and isolated workspaces. Learn more about Platform Security.
Quickly spin up and down collaborative workspaces for any project while being equipped with the right tools to manage user access, control spend, audit usage, and analyze activity across every workspace, all while seamlessly enforcing user and data governance. Learn more about 360° Administration.
Use fully configured data environments and APIs to quickly take initiatives from development to production. Once in production, data teams can use on-demand autoscaling to optimize performance and reduce downtime of data pipelines and ML models by efficiently matching resources to demand. Learn more about Elastic Scalability.
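As a rough illustration of what "matching resources to demand" can mean, the sketch below sizes a worker pool from a backlog of pending tasks and clamps the result to configured bounds. It is a simplified, hypothetical model and not the platform's actual autoscaling logic; the function name `target_workers` and all of its parameters are assumptions made for this example.

```python
import math


def target_workers(pending_tasks, tasks_per_worker, min_workers, max_workers):
    """Return a worker count that covers current demand within fixed bounds.

    Hypothetical sizing rule for illustration only: every name here is an
    assumption, not part of any real autoscaling API.
    """
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")

    # Workers needed to absorb the backlog, rounded up to whole machines.
    needed = math.ceil(pending_tasks / tasks_per_worker)

    # Clamp to the configured range: never below the floor (keeps the
    # cluster responsive), never above the ceiling (controls spend).
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for backlog in (0, 17, 100):
        print(backlog, "pending tasks ->", target_workers(backlog, 4, 2, 10), "workers")
```

The lower bound keeps a small pool running when the pipeline is idle, and the upper bound caps cost during spikes; a production autoscaler would also weigh factors such as scale-down delays and task duration, which this sketch leaves out.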
More complete and recent data to drive insights for every team
In this talk, Jim Forsythe and Jan Neumann describe Comcast’s data and machine learning infrastructure, built on the Databricks Unified Data Analytics Platform. Comcast uses Databricks to train and fuel the machine learning models at the heart of its products and to gain deeper insights into how its users use them.