Skip to main content

Last year, we launched foundation model support in Databricks Model Serving to enable enterprises to build secure and custom GenAI apps on a unified data and AI platform. Since then, thousands of organizations have used Model Serving to deploy GenAI apps customized to their unique datasets.

Today, we're excited to announce new updates that make it easier to experiment, customize, and deploy GenAI apps. These updates include access to new large language models (LLMs), easier discovery, simpler customization options, and improved monitoring. Together, these improvements help you develop and scale GenAI apps more quickly and at a lower cost.

Databricks Model Serving is accelerating our AI-driven projects by making it easy to securely access and manage multiple SaaS and open models, including those hosted on or outside Databricks. Its centralized approach simplifies security and cost management, allowing our data teams to focus more on innovation and less on administrative overhead - Greg Rokita, VP, Technology at Edmunds.com  

Access New Open and Proprietary Models Through Unified Interface

We’re continually adding new open-source and proprietary models to Model Serving, giving you access to a broader range of options via a unified interface.

  • New Open Source Models: Recent additions, such as DBRX and Llama-3, set a new benchmark for open language models, delivering capabilities that rival the most advanced closed model offerings. These models are instantly accessible on Databricks via Foundation Model APIs with optimized GPU inference, keeping your data secure within Databricks' security perimeter.
  • New External Models Support: The External Models feature now supports latest proprietary state-of-the-art models, including Gemini Pro and Claude 3. External models allow you to securely manage 3rd-party model provider credentials and provide rate limiting and permission support. 

All models can be accessed via a unified OpenAI-compatible API and SQL interface, making it easy to compare, experiment with, and select the best model for your needs.

At Experian, we’re developing Gen AI models with the lowest rates of hallucination while preserving core functionality. Utilizing the Mixtral 8x7b model on Databricks has facilitated rapid prototyping, revealing its superior performance and quick response times." - James Lin, Head of AI/ML Innovation at Experian.

Discover Models and Endpoints Through New Discovery Page and Search Experience

As we continue to expand the list of models on Databricks, many of you have shared that discovering them has become more challenging. We're excited to introduce new capabilities to simplify model discovery:

  • Personalized Homepage: The new homepage personalizes your Databricks experience based on your common actions and workloads. The 'Mosaic AI' tab on the Databricks homepage showcases state-of-the-art models for easy discovery. To enable this Preview feature, visit your account profile and navigate to Settings > Developer > Databricks Homepage.
  • Universal Search: The search bar now supports models and endpoints, providing a faster way to find existing models and endpoints, reducing discovery time, and facilitating model reuse. 
homepage

Build Compound AI Systems with Chain Apps and Function Calling

Most GenAI applications require combining LLMs or integrating them with external systems. With Databricks Model Serving, you can deploy custom orchestration logic using LangChain or arbitrary Python code. This enables you to manage and deploy an end-to-end application entirely on Databricks. We're introducing updates to make compound systems even easier on the platform.

  • Vector Search (now GA): Databricks Vector Search seamlessly integrates with Model Serving, providing accurate and contextually relevant responses. Now generally available, it's ready for large-scale, production-ready deployments.
  • Function Calling (Preview): Currently, in private preview, function calling allows LLMs to generate structured responses more reliably. This capability allows you to use an LLM as an agent that can call functions by outputting JSON objects and mapping arguments. Common function calling examples are: calling external services like DBSQL, translating natural language into API calls, and extracting structured data from text. Join the preview
  • Guardrails (Preview): In private preview, guardrails provide request and response filtering for harmful or sensitive content. Join the preview
  • Secrets UI: The new Secrets UI streamlines the addition of environment variables and secrets to endpoints, facilitating seamless communication with external systems (API is also available). 
The search results are a mix of articles, tutorials, and community discussions related to Databricks, a data and AI platform. Here's a summary of the content:1. The first result is a search result for an image file, which appears to be a screenshot or an image related to Databricks.2. The second result is an article from Databricks' documentation on how to use the image data source in Spark. It explains the structure of image files, how to read and write image data, and provides examples of how to use the image data source in notebooks.3. The third result is the Databricks website, which showcases the company's data intelligence platform and its capabilities in AI, data engineering, and data science.4. The fourth result is a community discussion on how to show an image in a Databricks notebook using HTML. The discussion provides several solutions, including using the `displayHTML` function, adding a preceding slash to the image path, and using the IPython library.5. The fifth result is another community discussion on rendering markdown images hard-coded as data image PNG base64 in Databricks. The discussion provides a solution using base64 encoding and constructing a data URI.6. The sixth result is a sample notebook from Databricks' documentation on how to use the image data source. The notebook provides an example of how to read and write image data using the image data source.Overall, the search results provide a mix of technical information, tutorials, and community discussions related to Databricks and its capabilities in data engineering, AI, and data science.

More updates are coming soon, including streaming support for LangChain and PyFunc models and playground integration to further simplify building production-grade compound AI apps on Databricks.

By bringing model serving and monitoring together, we can ensure deployed models are always up-to-date and delivering accurate results. This streamlined approach allows us to focus on maximizing the business impact of AI without worrying about availability and operational concerns. -  Don Scott, VP Product Development at Hitachi Solutions

Monitor All Types of Endpoints with Inference Tables

Monitoring LLMs and other AI models is just as crucial as deploying them. We're excited to announce that Inference Tables now supports all endpoint types, including GPU-deployed and externally hosted models. Inference Tables continuously capture inputs and predictions from Databricks Model Serving endpoints and log them into a Unity Catalog Delta Table. You can then utilize existing data tools to evaluate, monitor, and fine-tune your AI models.

To join the preview, go to your Account > Previews > Enable Inference Tables For External Models And Foundation Models.

It appears that you've shared a link to an image, but the image itself is not visible in this chat platform. The text you've shared is likely the HTML code for the image, which is not human-readable.If you'd like to share the image, you can try uploading it to a hosting platform like Imgur or Dropbox and sharing the link here. Alternatively, you can describe the image and its contents, and I'll do my best to help you with your question.

Get Started Today!

Visit the Databricks AI Playground to try Foundation Models directly from your workspace. For more information, please refer to the following resources:

Try Databricks for free

Related posts

Better LLMs with Better Data using Cleanlab Studio

June 1, 2023 by Anish Athalye in
This post and accompanying notebook and tutorial video demonstrate how to use Cleanlab Studio to improve the performance of Large Language Models (LLMs...

Automating ML, Scoring, and Alerting for Detecting Criminals and Nation States Through DNS Analytics

August 2, 2022 by Arun Pamulapati in
This blog is part two of our DNS Analytics blog, where you learned how to detect a remote access trojan using passive DNS...

Offline LLM Evaluation: Step-by-Step GenAI Application Assessment on Databricks

Background In an era where Retrieval-Augmented Generation (RAG) is revolutionizing the way we interact with AI-driven applications, ensuring the efficiency and effectiveness of...
See all Generative AI posts