
Mosaic AI

Build and deploy production-quality ML and GenAI applications

Databricks Mosaic AI provides unified tooling to build, deploy, evaluate and govern AI and ML solutions — from building predictive ML models to the latest GenAI apps. Built on the Databricks Data Intelligence Platform, Mosaic AI enables organizations to securely and cost-effectively build production-quality AI apps integrated with their enterprise data.


Production Quality

Deliver accurate, safe and governed AI applications


Complete Control

Maintain ownership over both the models and the data


Lower Cost

Train and serve your own custom LLMs at 10x lower cost

Start building your generative AI solution

There are four architectural patterns to consider when building a large language model (LLM)–based solution: prompt engineering, retrieval augmented generation (RAG), fine-tuning and pretraining. Databricks is the only provider that enables all four generative AI architectural patterns, ensuring you have the most options and can evolve as your business requirements change.
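
As an illustrative sketch (plain Python, not Databricks-specific code), the difference between the first two patterns — prompt engineering alone versus RAG — comes down to whether retrieved context is prepended to the prompt. The toy "retriever" below scores documents by word overlap; a real system would use a vector index.

```python
# Illustrative sketch of two of the four patterns: prompt engineering vs. RAG.
# The "retriever" here is a toy word-overlap scorer, not a real vector index.
import string

def tokenize(text: str) -> set:
    """Lowercase, split, and strip punctuation for crude matching."""
    return {w.strip(string.punctuation) for w in text.lower().split()}

def retrieve(query: str, documents: list) -> str:
    """Return the document sharing the most words with the query."""
    q_words = tokenize(query)
    return max(documents, key=lambda d: len(q_words & tokenize(d)))

def build_prompt(query: str, context=None) -> str:
    """Prompt engineering alone vs. retrieval-augmented prompting."""
    if context is None:                       # plain prompt engineering
        return f"Answer concisely: {query}"
    return f"Use this context: {context}\nAnswer concisely: {query}"

docs = [
    "Our refund policy allows returns within 30 days.",
    "Shipping is free on orders over $50.",
]
query = "What is the refund policy?"
rag_prompt = build_prompt(query, retrieve(query, docs))
```

Fine-tuning and pretraining differ in that they change the model's weights rather than its inputs, which is why they are covered separately below.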

Complete ownership over your models and data

Mosaic AI is part of the Databricks Data Intelligence Platform, which unifies data, model training and production environments in a single solution. You can securely use your enterprise data to augment, fine-tune or build your own machine learning and generative AI models, powering them with a semantic understanding of your business without sending your data and IP outside your walls.

Deploy and govern all your AI models centrally

Model Serving is a unified service for deploying, governing and querying AI models. Our unified approach makes it easy to experiment with and productionize models. This includes:

  • Custom ML models like PyFunc, scikit-learn and LangChain
  • Foundation models (FMs) on Databricks like Llama 3, MPT, Mistral and BGE
  • Foundation models hosted elsewhere like ChatGPT, Claude 3, Cohere and Stable Diffusion
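
The unified-serving idea — one query interface in front of heterogeneous model types — can be sketched generically. This is illustrative Python, not the Model Serving API; the class and method names are invented for the example.

```python
# Illustrative sketch of a unified serving layer: one query() entry point
# routing to heterogeneous backends (a custom model, an external FM).
# This mimics the idea of unified model serving, not Databricks' actual API.
from typing import Any, Callable, Dict

class ModelRouter:
    def __init__(self):
        self._endpoints: Dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, predict_fn: Callable[[Any], Any]) -> None:
        """Register any callable model behind a named endpoint."""
        self._endpoints[name] = predict_fn

    def query(self, name: str, payload: Any) -> Any:
        """Single query interface regardless of backend type."""
        return self._endpoints[name](payload)

router = ModelRouter()
router.register("custom-ml", lambda x: x * 2)               # e.g. a custom ML model
router.register("chat", lambda prompt: f"echo: {prompt}")   # e.g. an external FM

result_ml = router.query("custom-ml", 21)
result_chat = router.query("chat", "hello")
```

The point of the sketch: callers depend only on the endpoint name and a uniform query call, so a custom scikit-learn model and a hosted foundation model are interchangeable from the application's perspective.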

Monitor data, features and AI models in one place

Lakehouse Monitoring provides a single, unified monitoring solution inside the Databricks Data Intelligence Platform. It monitors the statistical properties and quality of all tables with a single click. For applications powered by generative AI, it can scan outputs for toxic and unsafe content as well as diagnose errors.
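
The statistical checks involved can be illustrated with a minimal drift metric — here, flagging a column whose current mean has moved too far from its baseline. Lakehouse Monitoring computes its metrics for you; this sketch only shows the underlying idea.

```python
# Illustrative drift check: flag a column whose current mean has moved more
# than `threshold` baseline standard deviations from the baseline mean.
# Lakehouse Monitoring computes richer statistics automatically; this is
# only a sketch of the concept.
from statistics import mean, stdev

def mean_drift(baseline: list, current: list, threshold: float = 2.0) -> bool:
    """Return True if the current mean drifted beyond `threshold` baseline stdevs."""
    sigma = stdev(baseline) or 1e-9   # guard against a zero-variance baseline
    return abs(mean(current) - mean(baseline)) / sigma > threshold

baseline = [10.0, 11.0, 9.0, 10.5, 9.5]   # training-time distribution
stable   = [10.2, 9.8, 10.1]              # similar distribution: no alert
shifted  = [25.0, 26.0, 24.0]             # clearly drifted: alert
```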

Govern and track lineage across the full AI lifecycle — from data to models

Enforce proper permissions, set rate limits and track lineage to meet stringent security and governance requirements. All ML assets from data to models can be governed with a single tool, Unity Catalog, to help ensure consistent oversight and control at every stage of the ML lifecycle through development, deployment and maintenance.
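
One of the controls mentioned above — rate limiting — can be illustrated with a minimal token-bucket limiter. This is a generic sketch of the mechanism, not Unity Catalog's implementation, which enforces limits server-side.

```python
# Illustrative token-bucket rate limiter, sketching the "set rate limits"
# governance control. Tokens refill continuously at a fixed rate; each
# request consumes one token, and requests beyond the burst are rejected.
import time

class TokenBucket:
    def __init__(self, rate_per_sec: float, capacity: int):
        self.rate = rate_per_sec        # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate_per_sec=1.0, capacity=2)
burst = [bucket.allow() for _ in range(3)]   # third call exceeds the burst
```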

Train and serve your own custom LLMs at 10x lower cost

With Mosaic AI, you can build your own custom large language model from scratch, so the model's foundational knowledge is tailored to your specific domain. Training on your organization's data and IP produces a customized model that is uniquely differentiated. Databricks Mosaic AI Training is an optimized training solution that can build new multibillion-parameter LLMs in days with up to 10x lower training costs.

Databricks Notebooks

Databricks Notebooks natively support Python, R, SQL and Scala so practitioners can work together with the languages and libraries of their choice to discover, visualize and share insights.

Runtime for Machine Learning

One-click access to preconfigured ML-optimized clusters, powered by a scalable and reliable distribution of the most popular ML frameworks (such as PyTorch, TensorFlow and scikit-learn), with built-in optimizations for unmatched performance at scale.

Feature Store

Facilitate the reuse of features with a data lineage–based feature search that leverages automatically logged data sources. Make features available for training and serving with simplified model deployment that doesn’t require changes to the client application.
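
The reuse idea — register a feature once, then serve the same values to both training and inference — can be sketched with a tiny in-memory store. This is a generic illustration, not the Databricks Feature Store API; the class and method names are invented for the example.

```python
# Illustrative in-memory feature store: features are written once and the
# same lookup path serves both training-set construction and online
# inference, mirroring the reuse idea. Not the Databricks Feature Store API.

class FeatureStore:
    def __init__(self):
        self._tables = {}   # table name -> {entity_id: {feature: value}}

    def write_table(self, name: str, rows: dict) -> None:
        """Register a feature table keyed by entity id."""
        self._tables[name] = rows

    def lookup(self, name: str, entity_id, features: list) -> dict:
        """Same lookup used for training sets and serving requests."""
        row = self._tables[name][entity_id]
        return {f: row[f] for f in features}

fs = FeatureStore()
fs.write_table("user_features", {
    "u1": {"avg_order_value": 42.0, "orders_30d": 3},
})

training_row = fs.lookup("user_features", "u1", ["avg_order_value"])
serving_row  = fs.lookup("user_features", "u1", ["avg_order_value", "orders_30d"])
```

Because training and serving read from the same table, the client application needs no changes when features are added or refreshed — which is the simplification the paragraph above describes.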

Repos

Repos allows engineers to follow Git workflows in Databricks, enabling data teams to leverage automated CI/CD workflows and code portability.

Large Language Models

Databricks makes it simple to deploy, govern, query and monitor access to LLMs and to integrate them into your workflows. The platform provides capabilities for retrieval augmented generation (RAG) and for fine-tuning LLMs with your own data, resulting in better domain performance. We also provide optimized tools to pretrain your own LLMs in days — at 10x lower cost.
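
Fine-tuning on your own data typically starts from prompt/response pairs serialized as JSONL. The sketch below prepares such a file in a common generic format; the exact schema a given training service expects may differ, and the example records are invented.

```python
# Illustrative preparation of fine-tuning data as JSONL prompt/response
# pairs — a common format for LLM fine-tuning. The exact field names a
# given trainer expects may differ; the records here are made up.
import json

examples = [
    {"prompt": "Summarize our refund policy.",
     "response": "Returns are accepted within 30 days of purchase."},
    {"prompt": "What is the threshold for free delivery?",
     "response": "Orders over $50 ship free."},
]

# One JSON object per line — the JSONL convention.
jsonl = "\n".join(json.dumps(e) for e in examples)

# Round-trip to verify each line parses back to a record.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```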
