Over the last year, we have seen a surge of commercial and open-source foundation models showing strong reasoning abilities on general knowledge tasks. While general models are an important building block, production AI applications often employ Compound AI Systems, which leverage multiple components such as tuned models, retrieval, tool use, and reasoning agents. These AI agent systems augment foundation models to drive much better quality and help customers confidently take these GenAI apps to production.
Today at the Data and AI Summit, we announced several new capabilities that make Databricks Mosaic AI the best platform for building production-quality AI agent systems. These features are based on our experience working with thousands of companies to put AI-powered applications into production. Today’s announcements include support for fine-tuning foundation models, an enterprise catalog for AI tools, a new SDK for building, deploying, and evaluating AI agents, and a unified AI gateway for governing deployed AI services.
With this announcement, Databricks has entirely integrated and substantially expanded the model-building capabilities first included in our MosaicML acquisition one year ago.
The evolution from monolithic AI models to compound systems is an active area of both academic and industry research. Recent results have found that “state-of-the-art AI results are increasingly obtained by compound systems with multiple components, not just monolithic models.” These findings are reinforced by what we see in our customer base. Take, for example, financial research firm FactSet: when they deployed a commercial LLM for their Text-to-Financial-Formula use case, they could only get 55% accuracy in the generated formula. Modularizing their model into a compound system, however, allowed them to specialize each task and achieve 85% accuracy. Databricks Mosaic AI supports building AI systems through the following products:
Users only have to select a task and a base model and provide training data (as a Delta table or a .jsonl file) to get a fully fine-tuned model that they own for their specialized task.
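To make the .jsonl option concrete, here is a minimal sketch of preparing training data as a JSON Lines file, where each line is one standalone JSON object. The field names `prompt` and `response`, the sample records, and the file name `train.jsonl` are assumptions for illustration only; the schema a given fine-tuning task actually expects may differ, so check the product documentation before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, which is the JSON Lines layout."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical training examples. The "prompt"/"response" keys are an
# assumed schema for illustration, not a confirmed required format.
examples = [
    {"prompt": "Classify the sentiment: 'Great product'", "response": "positive"},
    {"prompt": "Classify the sentiment: 'Arrived broken'", "response": "negative"},
]

# Write to a temporary location so the sketch runs anywhere.
out_path = Path(tempfile.gettempdir()) / "train.jsonl"
write_jsonl(examples, out_path)

# Round-trip check: the file should parse back to the same records.
loaded = read_jsonl(out_path)
print(f"Wrote {len(loaded)} training examples to {out_path}")
```

Reading the file back before submitting it is a cheap way to catch malformed lines early, since a single invalid JSON object can otherwise invalidate the whole dataset.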