Harnessing Enterprise AI: Innovations & Wins at Databricks

Published: July 1, 2024

Generative AI (GenAI) can unlock immense value. Organizations recognize the potential but know they must make smart choices about how and where to adopt the technology. The number of models, vendors, and approaches is overwhelming. Budget holders understandably need to see viable return on investment (ROI) strategies that can justify the investment and reorganization that GenAI adoption entails.

Databricks has a long history of harnessing the power of enterprise AI internally for everything from fraud detection to financial forecasts. Our GenAI platform ingests data from several sources, including Salesforce and Metronome, and channels it into our central logfood architecture, where it is extracted and transformed so it can be leveraged by different personas, including our data scientists and software engineers. This process involves 10+ petabytes of data across 60 multi-cloud and multi-geographical regions and helps us handle over 100,000 daily tasks for more than 2,000 weekly users. As we collaborate with our customers on their AI strategy and journey, it's useful to explore how we ourselves harness AI in business, and the tools, strategies, and heuristics we employ.

Our AI strategy begins with establishing a robust AI governance regime that involves collaboration with legal, engineering, and security teams. Once that regime is established, we adopt a hybrid approach that combines mature third-party solutions with internally built GenAI programs, using rigorous A/B testing to compare their performance against traditional approaches. This framework and decision methodology can be instructive for a wide range of AI practitioners, as it highlights clear successes that allow us to establish footholds for further use case development. Below are some examples of clear wins and experimental approaches that highlight how Databricks puts its multi-step GenAI vision into practice.

Clear Wins

The use of GenAI for internal and external support teams has been a clear win for Databricks, and indeed for many organizations that have sought to leverage the technology. Strengthening an organization's support function is often the first step in an AI strategy, and in our case, we focused on giving our support teams better documentation, knowledge, an increased ability to drive velocity or reduce support cases, automated functionality, and more self-service for our customers. Our internal Slackbot support function is currently used in over 40 engineering channels and has 3,000 active users. In total, we have been able to automate responses to around 40,000 questions internally, related to areas such as issue resolution, script and SQL assistance, error code explanation, and architecture or implementation guidance.

When it comes to external use, the same Slackbot, which has hundreds of active users, has managed to answer more than 1,200 questions. On the IT support side, we combined GenAI with existing technologies to help with our support and learning function. Together, support and AI chatbots are set up to handle common queries, which has delivered a 30% deflection rate, up from zero two years ago. Our eventual goal is to reach 60% by the end of 2024. Meanwhile, our BrickNuggets chatbot (which is folded into Field Sidekick) has provided microlearning for our sales team. Our overall third-party chatbot is leveraged globally by our teams to collaborate and get specific answers to common questions, and is used by more than 4,700 monthly active users within the organization.

The second clear use case success relates to the use of GenAI in software development. By leveraging copilots, we have improved the productivity of our engineers, including the development of engineering IP. Copilot capability brings enormous efficiency and productivity benefits; a survey of early access users found that 70% claimed they were more productive, 73% said they could complete tasks faster and 67% said the platform saved them time to focus on more important tasks.

At Databricks, we leverage GenAI copilots to build tools, dashboards and machine learning (ML) models at a faster rate, including models that would traditionally have proved harder to create or required more specific engineering expertise. We are extensive users of DatabricksIQ and assistant copilots to speed up data engineering, data ingestion, reporting, and other data tasks. Additional uses of copilots extend to language migration, test case development, and code explanation. The productivity gains make a noticeable difference to our business, with increases of up to 30% in some cases.

A spirit of experimentation

As well as recognizing clear wins, Databricks has shown a willingness to adopt an experimental approach towards our AI strategy, with appropriate guardrails. Many ideas that morphed into pilots or eventually went into production emerged from Databricks hackathons, which reflect a culture of idea generation and a recognition that we are not solely infusing our products with AI but building AI-centered infrastructure.

One example relates to email generation for our inside sales team. Automating email generation is a convenient and efficient way of managing sales team workloads, but can be difficult to execute because of the need for context regarding a specific industry, product, and customer base. Our approach has been to combine the intelligence in our data, which is managed and governed in our lakehouse, with the power of LLMs. This means we are able to combine open-source AI models with our data intelligence platform (which integrates data warehouse data sets, the Databricks Unity Catalog governance platform, a model-serving endpoint for model execution, our retrieval augmented generation (RAG) Studio platform and Mosaic AI) to fine-tune on structured and unstructured data and deliver strong response rates. RAG is a crucial component in our approach, as it not only allows us to combine LLMs with enterprise data, but offers the right balance of quality and speed to expedite the learning process.
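To make the RAG pattern concrete, here is a minimal sketch of the two steps it rests on: retrieving relevant enterprise context and placing it in the prompt. It is an illustration under stated assumptions, not our production pipeline. The documents, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all hypothetical; a real deployment would use a governed vector index and a model-serving endpoint instead.

```python
import re
from collections import Counter

# Hypothetical enterprise snippets; a real system would query a governed index.
DOCUMENTS = [
    "Retail customers use the lakehouse to unify point-of-sale and inventory data.",
    "Financial services teams rely on fraud detection models with strict governance.",
    "Healthcare providers migrate data warehouses to reduce reporting latency.",
]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents that share the most word tokens with the query."""
    query_tokens = tokenize(query)

    def overlap(document: str) -> int:
        # Counter intersection keeps the minimum count of each shared token.
        return sum((tokenize(document) & query_tokens).values())

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved snippets ahead of the question in a single prompt."""
    context_block = "\n".join(f"- {snippet}" for snippet in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )


question = "Which fraud detection references suit a financial services prospect?"
context = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, context)  # this string would be sent to an LLM
```

Word overlap stands in for semantic search only to keep the sketch free of dependencies. The point is the shape of the flow: the model answers from retrieved enterprise data rather than from its training data alone.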

The result is an intelligent email generation capability, which combines contextual information such as the role of the contact, the industry they represent, and similar customer references with email generation assistance, including word count, tone and syntax, and effective email guidelines. We worked closely with our business development SMEs to develop the right prompts to train the models. This approach has proved invaluable; the reply and response rates on AI-generated emails from our model are comparable to those of a sales/business development representative sending those emails for the first time (namely a 30% to 60% click-through rate, and a 3-5% reply rate). Cost per email, meanwhile, decreased from US$0.07 to US$0.005 with the use of a fine-tuned open-source model. Our Sales Development Reps (SDRs) have full editorial rights on these emails before they are sent to a prospect. Both the automated technology and our editorial process are infused with safeguards to ensure we eliminate hallucinations and irrelevant data, making sure our email campaigns are focused and effective.
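The per-email cost figures above can be turned into a quick worked calculation. The two unit costs come from the text. The campaign volume is a hypothetical number chosen only to show how the savings scale.

```python
# Unit costs quoted in the text (US dollars per generated email).
COST_BEFORE_USD = 0.07
COST_AFTER_USD = 0.005

# Hypothetical campaign size, used only for illustration.
HYPOTHETICAL_VOLUME = 100_000


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the unit cost fell."""
    return (before - after) / before


def total_savings(before: float, after: float, email_count: int) -> float:
    """Absolute savings across a given number of emails."""
    return (before - after) * email_count


reduction = relative_reduction(COST_BEFORE_USD, COST_AFTER_USD)
cost_ratio = COST_BEFORE_USD / COST_AFTER_USD
savings = total_savings(COST_BEFORE_USD, COST_AFTER_USD, HYPOTHETICAL_VOLUME)

print(f"Reduction per email: {reduction:.1%}")
print(f"Cost ratio: {cost_ratio:.0f}x")
print(f"Savings on {HYPOTHETICAL_VOLUME:,} emails: ${savings:,.2f}")
```

In other words, the move to a fine-tuned open-source model cut the unit cost by roughly 93%, making each email about 14 times cheaper to generate.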

Another promising tool for internal sales representatives is our sales-focused LLM agent. This leverages 'hover' chatbot functionality to provide information for sales teams about possible opportunities and use cases for a particular company. For instance, users in Salesforce can use the tool to understand any recent changes at a company in advance of a meeting, or use structured data from similar companies to identify potentially beneficial interventions, such as cloud platform migration or the construction of a new data warehouse. The key element in the agent's functionality is the way it combines both structured Salesforce data and unstructured data from internal and external sources, in a way that preserves access control and meets thresholds around data confidentiality.
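One way to picture the access-control requirement is to filter unstructured content by the user's permissions before it is merged with structured account data. The sketch below assumes a simple group-based permission model. The `Note` class, the group names, and the sample account are hypothetical, not a description of how the agent is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    """An unstructured snippet plus the groups allowed to read it."""
    text: str
    allowed_groups: frozenset[str]


def visible_notes(notes: list[Note], user_groups: frozenset[str]) -> list[Note]:
    """Keep only notes that share at least one group with the user."""
    return [note for note in notes if note.allowed_groups & user_groups]


def build_context(
    account: dict[str, str], notes: list[Note], user_groups: frozenset[str]
) -> str:
    """Combine structured fields with the notes this user is permitted to see."""
    structured = ", ".join(f"{key}={value}" for key, value in account.items())
    unstructured = " ".join(note.text for note in visible_notes(notes, user_groups))
    return f"Account: {structured}\nNotes: {unstructured}"


# Hypothetical data for illustration only.
account = {"name": "Example Corp", "industry": "retail"}
notes = [
    Note("Announced a cloud migration program.", frozenset({"sales", "legal"})),
    Note("Contract dispute under review.", frozenset({"legal"})),
]

context = build_context(account, notes, user_groups=frozenset({"sales"}))
```

Filtering before the context is assembled means restricted content never reaches the prompt, so the model cannot surface it to a user who lacks permission.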

We are also experimenting with new approaches in contract management, building a GenAI tool to help with contract summarization. It can evaluate non-standard terms and conditions against validated data in Salesforce and determine the level of indemnity and legal risk associated with a particular agreement. This move towards auto-summarization enables faster processing of contracts, lightening the workload for our in-house legal teams, and is supported by a broader AI governance and safety framework designed in collaboration with our security and privacy teams.

Key considerations

Whether developing experimental use cases or building on successes, several common strands must be heeded when working on GenAI.

While sophisticated platforms have advantages, some projects have emerged from foundational and open-source models such as DBRX and Llama 3, and RAG approaches can reduce and mitigate risk. We use a combination of structured and unstructured data with RAG-based models to deliver actionable insights and minimize hallucinations; increasingly, we use our own Databricks RAG Studio platform to check the efficacy of models, which is key to ensuring ROI and minimizing costs. Specialized prompts that guide LLM behavior can be combined with enterprise data using the Databricks Data Intelligence Platform to optimize and learn quickly from experiments. These approaches offer a good balance of speed and quality and can be fine-tuned or incorporated into an LLM pretraining procedure. Measuring performance against different campaigns, as well as models, highlights the benefit for the company and other stakeholders.

Any GenAI tool should seek to recognize and quantify employee satisfaction as well as efficiency. Monitoring employee experience early in implementation and throughout the lifecycle ensures employees are getting the most from the technology and helps embed its use. This should happen across the board through continuous feedback from different teams. Protocols can ensure technology is used consistently and effectively.

The process of experimentation is not easy, and the route to production is fraught with data and testing challenges. As organizations scale their use of AI, challenges grow in complexity, but they are far from insurmountable. While it is true that data is messy and testing is difficult, there are many steps organizations can take to ease the strain. Leveraging lakehouse capability, adopting an iterative approach to database expansion, and developing a plan to measure business impact when undergoing testing are all crucial steps. Moving cleanly between MLOps stages, planning for focused sessions to deliver high-quality prompts, and ensuring that answers deliver actionable insights are also critical.

Experiments can be enabled without extensive coordination, especially when costs are low, but moving from experimentation to production needs a centralized approach. This involves IT and governance functions, both of which can help evaluate ROI.
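Measuring performance across campaigns or models, as described above, often comes down to comparing two rates. A minimal sketch of one common option, a two-proportion z-test, is shown here using only the Python standard library. The reply counts are hypothetical and the choice of test is an assumption for illustration, not a statement of the method used internally.

```python
import math


def two_proportion_z_test(
    successes_a: int, n_a: int, successes_b: int, n_b: int
) -> tuple[float, float]:
    """Return the z statistic and two-sided p-value for the difference in rates."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both variants perform the same.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    variance = pooled * (1 - pooled) * (1 / n_a + 1 / n_b)
    if variance == 0:
        raise ValueError("Rates of exactly 0% or 100% leave the test undefined.")
    z = (rate_a - rate_b) / math.sqrt(variance)
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical reply counts: variant A is human-written, variant B is AI-generated.
z, p_value = two_proportion_z_test(successes_a=40, n_a=1000, successes_b=42, n_b=1000)
comparable = p_value > 0.05  # no significant difference at the 5% level
```

A large p-value does not prove the two variants are identical. It only says that the observed gap is consistent with chance at this sample size, which is why the business-impact plan should fix the sample size and the threshold before testing starts.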

Looking ahead, Databricks is pursuing a plethora of innovative and high-value internal use cases for GenAI, across business operations (including the deal desk and IT support), field productivity (account alerts, content discovery and meeting preparation), marketing (content generation and outbound prospecting), HR (ticket deflection and recruiting efficiency), legal (contract data extraction) and business analytics (self-serve, ad-hoc queries). However, we are not ignoring the value of GenAI for our external customer base.

US airline JetBlue built a chatbot using a combination of our data intelligence platform and sophisticated open-source LLMs that allows employees to gain access to KPIs and information that is specific to their role. The impact of this solution has been to reduce training requirements and the turnaround time for feedback, as well as simplify access to insights for the entire organization. European carrier easyJet built a similar GenAI solution, intended as a tool for non-technical users to pose voice-based questions in their natural language and receive insights that can feed into the decision-making process. This solution has not only helped improve the organization's data strategy and provided users with easier access to data and LLM-driven insights but has also sparked new ideas around other innovative GenAI use cases, including resource optimization, chatbots centered on operational processes and compliance, and personal assistants that offer tailored travel recommendations.

While GenAI projects need to be delivered with security, governance, and ROI in mind, our experience makes clear that when organizations embrace GenAI's cross-functional potential through iteration and experimentation, the resulting efficiency gains can give both them and their customers a competitive advantage.