by Ahmed Bilal and Kasey Uhlenhuth
Last year, we launched foundation model support in Databricks Model Serving to enable enterprises to build secure and custom GenAI apps on a unified data and AI platform. Since then, thousands of organizations have used Model Serving to deploy GenAI apps customized to their unique datasets.
Today, we're excited to announce new updates that make it easier to experiment, customize, and deploy GenAI apps. These updates include access to new large language models (LLMs), easier discovery, simpler customization options, and improved monitoring. Together, these improvements help you develop and scale GenAI apps more quickly and at a lower cost.
"Databricks Model Serving is accelerating our AI-driven projects by making it easy to securely access and manage multiple SaaS and open models, whether they are hosted on Databricks or elsewhere. Its centralized approach simplifies security and cost management, allowing our data teams to focus more on innovation and less on administrative overhead." - Greg Rokita, VP of Technology at Edmunds.com
We’re continually adding new open-source and proprietary models to Model Serving, giving you access to a broader range of options via a unified interface.
All models can be accessed via a unified OpenAI-compatible API and SQL interface, making it easy to compare, experiment with, and select the best model for your needs.
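As an illustration, here is a minimal sketch of querying a served model through the OpenAI-compatible API using the standard OpenAI Python client. The workspace URL, token environment variable, and model name below are placeholders, not values from this announcement:

```python
import os
from openai import OpenAI

# A Databricks serving endpoint can be queried with the standard OpenAI client
# by pointing base_url at the workspace's serving-endpoints path.
# The workspace URL and model name are placeholders; substitute your own.
client = OpenAI(
    api_key=os.environ["DATABRICKS_TOKEN"],
    base_url="https://<your-workspace>.cloud.databricks.com/serving-endpoints",
)

response = client.chat.completions.create(
    model="databricks-mixtral-8x7b-instruct",  # any model served on your workspace
    messages=[{"role": "user", "content": "Summarize our Q3 support tickets."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```

Because the interface is uniform, swapping the model name is all it takes to compare candidates; the same models can also be invoked from SQL via the ai_query function.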
"At Experian, we're developing GenAI models with the lowest rates of hallucination while preserving core functionality. Utilizing the Mixtral 8x7b model on Databricks has facilitated rapid prototyping, revealing its superior performance and quick response times." - James Lin, Head of AI/ML Innovation at Experian
As we continue to expand the list of models on Databricks, many of you have shared that discovering them has become more challenging. We're excited to introduce new capabilities that simplify model discovery.
Most GenAI applications require chaining multiple LLM calls together or integrating LLMs with external systems. With Databricks Model Serving, you can deploy custom orchestration logic using LangChain or arbitrary Python code, which lets you manage and deploy an end-to-end application entirely on Databricks. We're introducing updates to make building these compound systems even easier on the platform, as sketched below.
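As a rough sketch of what such orchestration logic can look like, the following wraps custom Python code in an MLflow PyFunc model that can be logged and then deployed on Model Serving. The retrieval helper and the endpoint name are hypothetical stand-ins, not part of this announcement:

```python
import mlflow
import mlflow.pyfunc


def retrieve_docs(question: str) -> str:
    # Placeholder retrieval step; swap in vector search or a database lookup.
    return "relevant context for: " + question


class SupportAssistant(mlflow.pyfunc.PythonModel):
    """Hypothetical compound app: retrieve context, then call a served LLM."""

    def predict(self, context, model_input):
        from mlflow.deployments import get_deploy_client

        client = get_deploy_client("databricks")
        answers = []
        for question in model_input["question"]:
            docs = retrieve_docs(question)
            resp = client.predict(
                endpoint="databricks-mixtral-8x7b-instruct",  # placeholder endpoint
                inputs={
                    "messages": [
                        {"role": "system", "content": f"Context: {docs}"},
                        {"role": "user", "content": question},
                    ]
                },
            )
            answers.append(resp["choices"][0]["message"]["content"])
        return answers


# Log the orchestration logic so it can be served like any other model.
with mlflow.start_run():
    mlflow.pyfunc.log_model(artifact_path="compound_app", python_model=SupportAssistant())
```

The point of the pattern is that the retrieval step, prompt assembly, and LLM call all live in one deployable unit, so the whole application is versioned, governed, and served on the same platform as the data it depends on.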
More updates are coming soon, including streaming support for LangChain and PyFunc models and playground integration to further simplify building production-grade compound AI apps on Databricks.
"By bringing model serving and monitoring together, we can ensure deployed models are always up-to-date and delivering accurate results. This streamlined approach allows us to focus on maximizing the business impact of AI without worrying about availability and operational concerns." - Don Scott, VP of Product Development at Hitachi Solutions
Monitoring LLMs and other AI models is just as crucial as deploying them. We're excited to announce that Inference Tables now support all endpoint types, including GPU-deployed and externally hosted models. Inference Tables continuously capture inputs and predictions from Databricks Model Serving endpoints and log them into a Unity Catalog Delta table. You can then use your existing data tools to evaluate, monitor, and fine-tune your AI models.
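As a minimal sketch of what that monitoring can look like: because the logs land in a Delta table, inspecting recent traffic is an ordinary query. The table name below is a placeholder, and the column names are assumptions to verify against your own inference table's schema:

```python
# Runs in a Databricks notebook, where `spark` is predefined.
# "main.genai.chat_endpoint_payload" is a placeholder for your endpoint's
# inference table in Unity Catalog.
payloads = spark.table("main.genai.chat_endpoint_payload")

# Surface recent failed requests for review. The column names used here
# (request_time, status_code, request, response) are assumptions; check
# them against the actual schema of your inference table.
(payloads
    .where("status_code != 200")
    .select("request_time", "status_code", "request", "response")
    .orderBy("request_time", ascending=False)
    .limit(20)
    .show(truncate=False))
```

The same table can feed evaluation jobs or dashboards, or be sampled to build fine-tuning datasets from real production traffic.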
To join the preview, go to Account > Previews and enable "Inference Tables for External Models and Foundation Models."
Visit the Databricks AI Playground to try Foundation Models directly from your workspace, and see the Databricks Model Serving documentation for more information.