Mosaic AI

Build and deploy production-quality ML and GenAI applications

Databricks Mosaic AI provides unified tooling to build, deploy and monitor AI and ML solutions — from predictive models to the latest GenAI and large language models (LLMs). Built on the Databricks Data Intelligence Platform, Mosaic AI enables organizations to securely and cost-effectively integrate their enterprise data into the AI lifecycle.

Complete Control

Maintain ownership over both the models and the data

Production Quality

Deliver accurate, safe and governed AI applications

Lower Cost

Train and serve your own custom LLMs at 10x lower cost

Start building your generative AI solution

There are four architectural patterns to consider when building a large language model–based solution: prompt engineering, retrieval augmented generation (RAG), fine-tuning and pretraining. Databricks is the only provider that enables all four generative AI architectural patterns, ensuring you have the most options and can evolve as your business requirements change.

Complete ownership over your models and data

Mosaic AI is part of the Databricks Data Intelligence Platform, which unifies data, model training and production environments in a single solution. You can securely use your enterprise data to augment, fine-tune or build your own machine learning and generative AI models, powering them with a semantic understanding of your business without sending your data and IP outside your walls.

Deploy and govern all your AI models centrally

Model Serving is a unified service for deploying, governing and querying AI models, making it easy to experiment with and productionize them (a minimal query example follows the list below). This includes:

  • Custom ML models like PyFunc, scikit-learn and LangChain
  • Foundation models (FMs) on Databricks like Llama 2, MPT, Mistral and BGE
  • Foundation models hosted elsewhere like ChatGPT, Claude 2, Cohere and Stable Diffusion
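
Once an endpoint is live, it can be queried programmatically as well as from the UI. Below is a minimal sketch using the MLflow Deployments client; the endpoint name "my-llm-endpoint" and the chat-style payload are hypothetical and depend on the model behind the endpoint.

# Minimal sketch: query a Databricks Model Serving endpoint with the
# MLflow Deployments client. "my-llm-endpoint" is a hypothetical placeholder
# for an endpoint you have already created.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")

# The payload shape depends on the served model (custom PyFunc, an open
# foundation model, an external model, etc.); this assumes a chat interface.
response = client.predict(
    endpoint="my-llm-endpoint",
    inputs={"messages": [{"role": "user", "content": "Summarize our returns policy."}]},
)
print(response)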

Monitor data, features and AI models in one place

Lakehouse Monitoring provides a single, unified monitoring solution inside the Databricks Data Intelligence Platform. It monitors the statistical properties and quality of all tables with a single click. For applications powered by generative AI, it can scan outputs for toxic and unsafe content as well as diagnose errors.
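
Monitors can also be created programmatically. The sketch below assumes the databricks-sdk quality-monitors API (class and parameter names may differ across SDK versions); the catalog, schema and table names are hypothetical placeholders.

# Minimal sketch, assuming the databricks-sdk quality-monitors API; names
# below are placeholders and the exact signature may vary by SDK version.
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import MonitorSnapshot

w = WorkspaceClient()

# Attach a snapshot-style monitor to a Unity Catalog table; computed metric
# tables land in the output schema and back an auto-generated dashboard.
w.quality_monitors.create(
    table_name="main.sales.orders",
    assets_dir="/Workspace/Shared/monitoring/orders",
    output_schema_name="main.monitoring",
    snapshot=MonitorSnapshot(),
)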

Govern and track lineage across the full AI lifecycle — from data to models

Enforce proper permissions, set rate limits and track lineage to meet stringent security and governance requirements. All ML assets from data to models can be governed with a single tool, Unity Catalog, to help ensure consistent oversight and control at every stage of the ML lifecycle through development, deployment and maintenance.
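
As an illustration, a model logged with MLflow can be registered directly to Unity Catalog so the same permissions and lineage tracking apply to it; the run URI and the three-level model name below are hypothetical placeholders.

# Minimal sketch: register an MLflow-logged model to Unity Catalog so it is
# governed alongside the data it was trained on. Names are placeholders.
import mlflow

# Point the MLflow model registry at Unity Catalog.
mlflow.set_registry_uri("databricks-uc")

# Register a previously logged model under a catalog.schema.model name.
mlflow.register_model(
    model_uri="runs:/<run_id>/model",
    name="main.ml_models.churn_classifier",
)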

Train and serve your own custom LLMs at 10x lower cost

With Mosaic AI, you can build your own custom large language model from scratch, so the model's foundational knowledge is tailored to your specific domain. Training on your organization's data and IP produces a customized model that is uniquely differentiated. Databricks Mosaic AI Training is an optimized training solution that can build new multibillion-parameter LLMs in days with up to 10x lower training costs.

Collaborative Notebooks

Databricks Notebooks natively support Python, R, SQL and Scala so practitioners can work together with the languages and libraries of their choice to discover, visualize and share insights.

Runtime for Machine Learning

One-click access to preconfigured ML-optimized clusters, powered by a scalable and reliable distribution of the most popular ML frameworks (such as PyTorch, TensorFlow and scikit-learn), with built-in optimizations for unmatched performance at scale.

Feature Store

Facilitate the reuse of features with a data lineage–based feature search that leverages automatically logged data sources. Make features available for training and serving with simplified model deployment that doesn’t require changes to the client application.
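
A minimal sketch of that workflow, assuming the databricks-feature-engineering client (API details can vary by release); the catalog, table and column names are hypothetical, and spark is the session available in a Databricks notebook.

# Minimal sketch, assuming the databricks-feature-engineering client; table
# and column names are placeholders, and `spark` is the notebook session.
from databricks.feature_engineering import FeatureEngineeringClient, FeatureLookup

fe = FeatureEngineeringClient()

# Toy feature and label DataFrames keyed by customer_id.
features_df = spark.createDataFrame(
    [(1, 12, 340.5), (2, 3, 99.0)],
    "customer_id INT, purchases_30d INT, spend_30d DOUBLE",
)
labels_df = spark.createDataFrame(
    [(1, 0), (2, 1)],
    "customer_id INT, churned INT",
)

# Publish the features, then assemble a training set by joining on the key.
fe.create_table(
    name="main.features.customer_features",
    primary_keys=["customer_id"],
    df=features_df,
    description="Aggregated customer behavior features",
)

training_set = fe.create_training_set(
    df=labels_df,
    feature_lookups=[
        FeatureLookup(
            table_name="main.features.customer_features",
            lookup_key="customer_id",
        )
    ],
    label="churned",
)
train_df = training_set.load_df()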

AutoML

Empower everyone from ML experts to citizen data scientists with a “glass box” approach to AutoML that delivers not only the highest performing model, but also generates code for further refinement by experts.
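
For instance, a classification experiment can be launched with a few lines of the AutoML Python API; the training table and target column below are hypothetical, and the returned summary points back to the generated code for expert refinement.

# Minimal sketch of the AutoML Python API; the table and target column are
# placeholders, and `spark` is the session available in a Databricks notebook.
from databricks import automl

df = spark.table("main.sales.churn_training")

summary = automl.classify(
    dataset=df,
    target_col="churned",
    timeout_minutes=30,
)

# The best trial exposes the logged model for registration or further tuning.
print(summary.best_trial.model_path)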

Managed MLflow

Built on top of MLflow — the world’s leading open source platform for the ML lifecycle — Managed MLflow helps ML models quickly move from experimentation to production, with enterprise security, reliability and scale.
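
A minimal sketch of the standard MLflow tracking flow that Managed MLflow runs with enterprise security, reliability and scale; the experiment path and the toy dataset are illustrative only.

# Minimal sketch of MLflow experiment tracking: log parameters, metrics and
# the trained model so it can move from experimentation toward production.
import mlflow
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

mlflow.set_experiment("/Shared/churn-experiments")  # hypothetical workspace path

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")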

Production-Grade Model Serving

Serve models at any scale with one-click simplicity, and optionally leverage serverless compute.

Model Monitoring

Monitor model performance and how it affects business metrics in real time. Databricks delivers end-to-end visibility and lineage from models in production back to source data systems, helping you analyze model and data quality across the full ML lifecycle and pinpoint issues before they have a damaging impact.

Repos

Repos allows engineers to follow Git workflows in Databricks, enabling data teams to leverage automated CI/CD workflows and code portability.

Large Language Models

Databricks makes it simple to deploy, govern, query and monitor LLMs and integrate them into your workflows, and provides platform capabilities for augmenting LLMs with retrieval augmented generation (RAG) or fine-tuning them on your own data, resulting in better domain performance. We also provide optimized tools to pretrain your own LLMs in days — at 10x lower cost.
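
For example, an LLM served on Databricks can be queried through its OpenAI-compatible serving endpoints; the workspace URL, token environment variable and endpoint name below are hypothetical placeholders.

# Minimal sketch: call a Databricks-served LLM through the OpenAI-compatible
# interface. The workspace URL, token variable and endpoint name are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url="https://<your-workspace>.cloud.databricks.com/serving-endpoints",
)

response = client.chat.completions.create(
    model="my-llama-endpoint",  # name of a serving endpoint in your workspace
    messages=[{"role": "user", "content": "Answer using only our product documentation."}],
)
print(response.choices[0].message.content)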
