We are pleased to announce the General Availability of AI Model Sharing in Databricks Delta Sharing and the Databricks Marketplace. This milestone follows the Public Preview announcement in January 2024. Since the Public Preview launch, we have worked with AI model sharing customers and providers such as Bitext, AI21 Labs, and Ripple to further simplify AI model sharing.
You can easily and securely share and serve AI models using Delta Sharing, whether within your organization or externally across clouds, platforms, and regions. In addition, Databricks Marketplace now offers over 75 AI models, including new industry-specific models from John Snow Labs, OLA Krutrim, and Bitext, as well as foundation models such as Databricks DBRX, Llama 3, and models from AI21 Labs and Mistral, among others. In this blog, we will review the business need for AI model sharing and take a deeper dive into use cases driven by AI21 Labs' Jamba 1.5 Mini foundation model and Bitext models.
AI models are also now available out of the box in Unity Catalog, streamlining how users access and deploy them. This not only simplifies the user experience but also makes AI models more accessible, supporting seamless integration and deployment across platforms and regions.
Here are three benefits of AI Model Sharing with Databricks that we saw with early adopters and launch partners:
AI Model Sharing is powered by Delta Sharing. Providers can share AI models with customers either directly using Delta Sharing or by listing them on the Databricks Marketplace, which also uses Delta Sharing.
Delta Sharing makes it easy to use AI models wherever you need them. You can train a model anywhere and then serve it anywhere, without manually moving it around. The model weights (the parameters the model learned during training) are automatically pulled into the serving endpoint (the place where the model "lives"). This eliminates cumbersome model movement after each training or fine-tuning run, ensures a single source of truth, and streamlines the serving process. For example, customers can train models in the cloud and region with the cheapest training infrastructure, and then serve the model in another region closer to end users to minimize inference latency (the time it takes for an AI model to process a request and return results).
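As an illustration of this pattern, a serving endpoint in the destination workspace can reference the shared model in Unity Catalog directly. Below is a minimal sketch of the request body for the Databricks Model Serving REST API (`POST /api/2.0/serving-endpoints`); the endpoint name and catalog path are hypothetical placeholders:

```python
# Sketch: build the request body for creating a Databricks Model Serving
# endpoint that serves a Unity Catalog model received via Delta Sharing.
# All names below are hypothetical examples, not real resources.

def serving_endpoint_payload(endpoint_name: str, uc_model: str, version: str) -> dict:
    """Return the JSON body for POST /api/2.0/serving-endpoints."""
    return {
        "name": endpoint_name,
        "config": {
            "served_entities": [
                {
                    "entity_name": uc_model,       # e.g. "shared_catalog.models.rl_model"
                    "entity_version": version,
                    "workload_size": "Small",
                    "scale_to_zero_enabled": True,  # scale down when idle
                }
            ]
        },
    }

payload = serving_endpoint_payload("rl-model-us-east", "shared_catalog.models.rl_model", "1")
```

Because the served entity points at the shared Unity Catalog model, the weights are pulled into the endpoint automatically; no manual copy between workspaces is needed.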
Databricks Marketplace, powered by Delta Sharing, lets you easily find and use over 75 AI models. You can set up these models as if they're on your local system, and Delta Sharing automatically updates them during deployment or upgrades. You can also customize models with your data for tasks like managing a knowledge base. As a provider, you only need one copy of your model to share it with all your Databricks clients.
Since the Public Preview of AI Model Sharing was announced in January 2024, we've worked with several customers and partners to ensure that AI Model Sharing delivers significant cost savings for organizations:
"We use reinforcement learning (RL) models in some of our products. Compared to supervised learning models, RL models have longer training times and many sources of randomness in the training process. These RL models need to be deployed in 3 workspaces in separate AWS regions. With model sharing we can have one RL model available in multiple workspaces without having to retrain it again or without any cumbersome manual steps to move the model."— Mihir Mavalankar, Machine Learning Engineer, Ripple
AI21 Labs, a leader in generative AI and large language models, has published Jamba 1.5 Mini, part of the Jamba 1.5 Model Family, on the Databricks Marketplace. Jamba 1.5 Mini by AI21 Labs introduces a novel approach to AI language models for enterprise use. Its innovative hybrid Mamba-Transformer architecture enables a 256K token effective context window, along with exceptional speed and quality. With Mini’s optimization for efficient use of computing, it can handle context lengths of up to 140K tokens on a single GPU.
"AI21 Labs is pleased to announce that Jamba 1.5 Mini is now on the Databricks Marketplace. With Delta Sharing, enterprises can access our Mamba-Transformer architecture, featuring a 256K context window, ensuring exceptional speed and quality for transformative AI solutions"— Pankaj Dugar, SVP & GM, AI21 Labs
A 256K token effective context window means the model can process and consider 256,000 tokens of text at once. This is significant because it allows the model to handle large and complex inputs, making it particularly useful for tasks that require understanding and analyzing extensive information, such as lengthy documents or intricate data-heavy workflows, and for enhancing the retrieval stage of any RAG-based workflow. Jamba's hybrid architecture ensures the model's quality does not degrade as context grows, unlike what is typically seen with Transformer-based LLMs' claimed context windows.
Check out this video tutorial that demonstrates how to obtain the AI21 Jamba 1.5 Mini model from the Databricks Marketplace, fine-tune it, and serve it.
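Once the model is served, the endpoint can be queried over REST with a chat-style request. The sketch below builds such a request for a Model Serving invocations call; the workspace host and endpoint name are hypothetical placeholders, and the `<workspace-host>` value would be your own workspace URL:

```python
import json

# Sketch: construct a chat request for a Databricks Model Serving endpoint
# hosting Jamba 1.5 Mini. Host and endpoint name are hypothetical.
HOST = "https://<workspace-host>"
ENDPOINT = "jamba-1-5-mini"

def chat_request(prompt: str, max_tokens: int = 512) -> tuple[str, str]:
    """Return (url, json_body) for a serving-endpoint invocations call."""
    url = f"{HOST}/serving-endpoints/{ENDPOINT}/invocations"
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })
    return url, body

url, body = chat_request("Summarize the key obligations in the attached contract.")
```

In practice you would POST this body with a bearer token for authentication; the long context window means the prompt can include hundreds of pages of source text directly.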
Jamba 1.5 Mini's 256K context window means the model can efficiently handle the equivalent of 800 pages of text in a single prompt. Here are a few examples of how Databricks customers in different industries can use these models:
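The 800-page figure follows from simple arithmetic; the sketch below makes the assumption explicit (a tokens-per-page density of ~320 is an assumed average for English prose, and real documents vary with formatting):

```python
# Back-of-the-envelope: pages of text that fit in a 256K-token context window.
# TOKENS_PER_PAGE is an assumed average density; actual values vary.
CONTEXT_WINDOW_TOKENS = 256_000
TOKENS_PER_PAGE = 320  # roughly one dense page of English text

pages = CONTEXT_WINDOW_TOKENS // TOKENS_PER_PAGE
print(pages)  # 800
```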
Bitext offers pre-trained verticalized models on the Databricks Marketplace. These models are versions of the Mistral-7B-Instruct-v0.2 model fine-tuned for building chatbots, virtual assistants, and copilots for the retail banking domain, providing customers with fast and accurate answers about their banking needs. These models can be produced for any family of foundation models: GPT, Llama, Mistral, Jamba, OpenELM, and others.
A leading social trading app was experiencing high dropout rates during user onboarding. It leveraged Bitext's pre-trained verticalized banking models to revamp its onboarding process, transforming static forms into a conversational, intuitive, and personalized user experience.
Bitext shared the verticalized AI model with the customer. Using that model as a base, a data scientist did the initial fine-tuning with customer-specific data, such as common FAQs. This step ensured that the model understood the unique requirements and language of the user base. This was followed by advanced fine-tuning with Databricks Mosaic AI.
Once the Bitext model was fine-tuned, it was deployed using Databricks AI Model Serving.
The collaboration set a new standard in user interaction within the social finance sector, significantly improving customer engagement and retention. Thanks to the jump-start provided by the shared AI model, the entire implementation was completed within 2 weeks.
Take a look at the demo that shows how to install and fine-tune a Bitext verticalized AI model from Databricks Marketplace here.
"Unlike generic models that need a lot of training data, starting with a specialized model for a specific industry reduces the data needed to customize it. This helps customers quickly deploy tailored AI models. We're thrilled about AI Model Sharing. Our customers have experienced up to a 60% reduction in resource costs (fewer data scientists and lower computational requirements) and up to 50% savings in operational disruptions (quicker testing and deployment) with our specialized AI models available on the Databricks Marketplace."— Antonio S. Valderrábanos , Founder & CEO, Bitext
| Cost Components | Generic LLM Approach | Bitext's Verticalized Model on Databricks Marketplace | Cost Savings (%) |
| --- | --- | --- | --- |
| Verticalization | High: extensive fine-tuning for sector & use case | Low: start with pre-fine-tuned vertical LLM | 60% |
| Customization with Company Data | Medium: further fine-tuning required | Low: specific customization needed | 30% |
| Total Training Time | 3-6 months | 1-2 months | 50-60% reduction |
| Resource Allocation | High: more data scientists and computational power | Low: less intensive | 40-50% |
| Operational Disruption | High: longer integration and testing phases | Low: faster deployment | 50% |
Now that AI model sharing is generally available (GA) for both Delta Sharing and new AI models on the Databricks Marketplace, we encourage you to: