Announcing the General Availability of Databricks Assistant Autocomplete

Published: October 10, 2024

Platform & Products & Announcements / 6 min read

by Jason Messer, Matthew Hayes, Evion Kim, Linqing Liu, Varun Nambikrishnan and Beishao Cao

Summary

We are excited to announce the general availability of Assistant Autocomplete. Assistant Autocomplete provides AI-powered real-time code suggestions as you type.

Today, we are excited to announce the general availability of Databricks Assistant Autocomplete on all cloud platforms. Assistant Autocomplete provides personalized, AI-powered code suggestions as you type for both Python and SQL.

Assistant Autocomplete

Directly integrated into the notebook, SQL editor, and AI/BI Dashboards, Assistant Autocomplete suggestions blend seamlessly into your development flow, allowing you to stay focused on your current task.

“While I’m generally a bit of a GenAI skeptic, I’ve found that the Databricks Assistant Autocomplete tool is one of the very few actually great use cases for the technology. It is generally fast and accurate enough to save me a meaningful number of keystrokes, allowing me to focus more fully on the reasoning task at hand instead of typing. Additionally, it has almost entirely replaced my regular trips to the internet for boilerplate-like API syntax (e.g. plot annotation, etc).” - Jonas Powell, Staff Data Scientist, Rivian

We are excited to bring these productivity improvements to everyone. Over the coming weeks, we’ll be enabling Databricks Assistant Autocomplete across eligible workspaces.

A compound AI system

Compound AI refers to AI systems that combine multiple interacting components to tackle complex tasks, rather than relying on a single monolithic model. These systems integrate various AI models, tools, and processing steps to form a holistic workflow that is more flexible, performant, and adaptable than traditional single-model approaches.

Assistant Autocomplete is a compound AI system that intelligently leverages context from related code cells, relevant queries and notebooks using similar tables, Unity Catalog metadata, and DataFrame variables to generate accurate and context-aware suggestions as you type.

Our Applied AI team utilized Databricks and Mosaic AI frameworks to fine-tune, evaluate, and serve the model, targeting accurate domain-specific suggestions.

Leveraging Table Metadata and Recent Queries

Consider a scenario where you've created a simple metrics table with the following columns:

date (STRING)
click_count (INT)
show_count (INT)

Assistant Autocomplete makes it easy to compute the click-through rate (CTR) without needing to manually recall the structure of your table. The system uses retrieval-augmented generation (RAG) to provide contextual information on the table(s) you're working with, such as its column definitions and recent query patterns.
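As a rough illustration of the retrieval step described above, the sketch below gathers table metadata and recent queries into the text that precedes a partial query. Everything in it is an assumption made for illustration: the TableInfo type, the build_context helper, and the naive name-matching retrieval are hypothetical and are not the production implementation.

```python
from dataclasses import dataclass


@dataclass
class TableInfo:
    """Hypothetical stand-in for table metadata from a catalog."""
    name: str
    columns: dict[str, str]  # column name -> column type


def build_context(partial_query: str, tables: list[TableInfo],
                  recent_queries: list[str], max_queries: int = 2) -> str:
    """Prepend metadata and recent queries for tables named in the partial query."""
    lines = []
    for table in tables:
        if table.name not in partial_query:
            continue  # naive retrieval: keep only tables the user referenced
        cols = ", ".join(f"{col} {typ}" for col, typ in table.columns.items())
        lines.append(f"-- table {table.name}({cols})")
        related = [q for q in recent_queries if table.name in q]
        for query in related[:max_queries]:
            lines.append(f"-- recent query: {query}")
    lines.append(partial_query)
    return "\n".join(lines)


metrics = TableInfo(
    "main.product_metrics.client_side_metrics",
    {"date": "STRING", "click_count": "INT", "show_count": "INT"},
)
context = build_context(
    "SELECT date, FROM main.product_metrics.client_side_metrics",
    [metrics],
    ["SELECT date, click_count*100.0/show_count AS click_pct "
     "FROM main.product_metrics.client_side_metrics"],
)
print(context)
```

A real system would likely rank candidates by relevance rather than match on table names, but the shape is the same: retrieved facts go in front of the code being completed.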

For example, with table metadata, a simple query like this would be suggested:

If you've previously computed click rate using a percentage, the model may suggest the following:
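The suggested queries themselves are not reproduced in this text. As a hedged sketch of what the two variants might look like, the snippet below runs a ratio-style and a percentage-style CTR query against an in-memory SQLite copy of the table. SQLite, the sample rows, and the single-level table name are assumptions made so the example can run locally; they are not part of the product.

```python
import sqlite3

# Hypothetical stand-in for the metrics table. SQLite has no three-level
# catalog.schema.table names, so a single-level name is used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client_side_metrics (date TEXT, click_count INT, show_count INT)"
)
conn.executemany(
    "INSERT INTO client_side_metrics VALUES (?, ?, ?)",
    [("2024-10-01", 25, 100), ("2024-10-02", 30, 120)],
)

# With table metadata only: CTR as a plain ratio.
# The 1.0 factor avoids SQLite's integer division.
ratio_query = (
    "SELECT date, click_count * 1.0 / show_count AS ctr "
    "FROM client_side_metrics ORDER BY date"
)

# With a recent query that used percentages: CTR as a percentage.
pct_query = (
    "SELECT date, click_count * 100.0 / show_count AS click_pct "
    "FROM client_side_metrics ORDER BY date"
)

ratio_rows = conn.execute(ratio_query).fetchall()
pct_rows = conn.execute(pct_query).fetchall()
print(ratio_rows)
print(pct_rows)
```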

Using RAG for additional context keeps responses grounded and helps prevent model hallucinations.

Leveraging runtime DataFrame variables

Let’s analyze the same table using PySpark instead of SQL. By utilizing runtime variables, Assistant Autocomplete detects the schema of the DataFrame and knows which columns are available.

For example, you may want to compute the average click count per day:

In this case, the system uses the runtime schema to offer suggestions tailored to the DataFrame.
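One way to picture this is a helper that scans the notebook's variables and summarizes anything that looks like a DataFrame. The sketch below is an assumption-laden illustration: describe_dataframes and FakeDataFrame are hypothetical names, and the stand-in class only mimics the dtypes attribute of a PySpark DataFrame (a list of column name and type pairs) so the example runs without a Spark session.

```python
class FakeDataFrame:
    """Hypothetical stand-in exposing `dtypes` like a PySpark DataFrame."""

    def __init__(self, dtypes):
        self.dtypes = dtypes  # list of (column name, type name) pairs


def describe_dataframes(namespace: dict) -> list[str]:
    """Summarize every DataFrame-like variable found in a notebook namespace."""
    summaries = []
    for name, value in sorted(namespace.items()):
        dtypes = getattr(value, "dtypes", None)
        if not isinstance(dtypes, list):
            continue  # skip variables that are not DataFrame-like
        cols = ", ".join(f"{col}: {typ}" for col, typ in dtypes)
        summaries.append(f"{name}({cols})")
    return summaries


namespace = {
    "df": FakeDataFrame(
        [("date", "string"), ("click_count", "int"), ("show_count", "int")]
    ),
    "threshold": 0.5,
}
print(describe_dataframes(namespace))
```

With a summary like this in context, a completion such as df.groupBy("date").avg("click_count") becomes a natural suggestion for the average click count per day.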

Domain-Specific Fine-Tuning

While many code completion LLMs excel at general coding tasks, we specifically fine-tuned the model for the Databricks ecosystem. This involved continued pre-training of the model on publicly available notebook/SQL code to focus on common patterns in data engineering, analytics, and AI workflows. By doing so, we've created a model that understands the nuances of working with big data in a distributed environment.

Benchmark-Based Model Evaluation

To ensure the quality and relevance of our suggestions, we evaluate the model using a suite of commonly used coding benchmarks such as HumanEval, DS-1000, and Spider. However, while these benchmarks are useful in assessing general coding abilities and some domain knowledge, they don’t capture all the Databricks capabilities and syntax. To address this, we developed a custom benchmark with hundreds of test cases covering some of the most commonly used packages and languages in Databricks. This evaluation framework goes beyond general coding metrics to assess performance on Databricks-specific tasks as well as other quality issues that we encountered while using the product.

If you are interested in learning more about how we evaluate the model, check out our recent post on evaluating LLMs for specialized coding tasks.

Knowing when (not) to generate

There are often cases when the context is sufficient as is, making it unnecessary to provide a code suggestion. As shown in the following examples from an earlier version of our coding model, when the queries are already complete, any additional completions generated by the model could be unhelpful or distracting.

Example 1

Initial code (with cursor represented by <here>):

-- get the click percentage per day across all time
SELECT date, click_count<here>*100.0/show_count as click_pct
from main.product_metrics.client_side_metrics

Completed code (from an earlier model; the suggested text is ", show_count, click_count"):

-- get the click percentage per day across all time
SELECT date, click_count, show_count, click_count*100.0/show_count as click_pct
from main.product_metrics.client_side_metrics

Example 2

Initial code (with cursor represented by <here>):

-- get the click percentage per day across all time
SELECT date, click_count*100<here>.0/show_count as click_pct
from main.product_metrics.client_side_metrics

Completed code (from an earlier model; the suggested text runs from the first ".0/show_count" through the first "client_side_metrics"):

-- get the click percentage per day across all time
SELECT date, click_count*100.0/show_count as click_pct
from main.product_metrics.client_side_metrics.0/show_count as click_pct
from main.product_metrics.client_side_metrics

In all of the examples above, the ideal response is actually an empty string. While the model would sometimes generate an empty string, cases like the ones above were common enough to be a nuisance. The problem here is that the model should know when to abstain – that is, produce no output and return an empty completion.
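A test for this behavior can be quite small: give the model the text before and after the cursor and check that it returns nothing. The sketch below is only an illustration of that idea. The empty_response_pass_rate helper and the toy noisy_model are hypothetical, and the two cases are the queries above split at the cursor (comment line omitted for brevity).

```python
from typing import Callable


def empty_response_pass_rate(cases: list[tuple[str, str]],
                             complete: Callable[[str, str], str]) -> float:
    """Fraction of (prefix, suffix) cases where the model abstains."""
    if not cases:
        return 0.0
    passed = sum(
        1 for prefix, suffix in cases if complete(prefix, suffix).strip() == ""
    )
    return passed / len(cases)


TABLE = "main.product_metrics.client_side_metrics"
cases = [
    ("SELECT date, click_count", f"*100.0/show_count as click_pct\nfrom {TABLE}"),
    ("SELECT date, click_count*100", f".0/show_count as click_pct\nfrom {TABLE}"),
]


def noisy_model(prefix: str, suffix: str) -> str:
    """Toy model that over-generates after a bare column name."""
    return ", show_count, click_count" if prefix.endswith("click_count") else ""


rate = empty_response_pass_rate(cases, noisy_model)
print(rate)
```

The toy model abstains on one of the two cases, so the pass rate printed here is 0.5.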

To achieve this, we introduced a fine-tuning trick, where we forced 5-10% of the cases to consist of an empty middle span at a random location in the code. The thinking was that this would teach the model to recognize when the code is complete and a suggestion isn’t necessary. This approach proved to be highly effective. For the SQL empty response test cases, the pass rate went from 60% up to 97% without impacting the other coding benchmark performance. More importantly, once we deployed the model to production, there was a clear step increase in code suggestion acceptance rate. This fine-tuning enhancement directly translated into noticeable quality gains for users.
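The post does not spell out the training data format, so the following is a minimal sketch under stated assumptions: a fill-in-the-middle setup where each training example is a prefix, middle, and suffix, with a fraction of examples forced to have an empty middle. The function names, the 8% default rate (chosen from within the 5-10% range mentioned above), and the sentinel tokens are all hypothetical.

```python
import random


def make_fim_example(code: str, rng: random.Random,
                     empty_rate: float = 0.08) -> dict:
    """Split code into prefix/middle/suffix; sometimes force an empty middle."""
    start = rng.randint(0, len(code))
    if rng.random() < empty_rate:
        end = start  # forced empty middle span at a random location
    else:
        end = rng.randint(start, len(code))  # may also be empty by chance
    return {
        "prefix": code[:start],
        "middle": code[start:end],
        "suffix": code[end:],
    }


def format_fim(example: dict) -> str:
    """Render one example with hypothetical sentinel tokens (prefix-suffix-middle)."""
    return (
        f"<PRE>{example['prefix']}<SUF>{example['suffix']}"
        f"<MID>{example['middle']}<EOT>"
    )


rng = random.Random(0)
query = "SELECT date, click_count*100.0/show_count as click_pct"
examples = [make_fim_example(query, rng) for _ in range(1000)]
empty_share = sum(1 for ex in examples if ex["middle"] == "") / len(examples)
print(f"share of empty-middle examples: {empty_share:.2%}")
```

For an empty-middle example, the training target after the middle sentinel is just the end token, which is what teaches the model that producing nothing is a valid answer.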

Fast Yet Cost-Efficient Model Serving

Given the real-time nature of code completion, efficient model serving is crucial. We leveraged Databricks' optimized GPU-accelerated model serving endpoints to achieve low-latency inference while controlling GPU usage cost. This setup allows us to deliver suggestions quickly, ensuring a smooth and responsive coding experience.

Assistant Autocomplete is built for your enterprise needs

As a data and AI company focused on helping enterprise customers extract value from their data to solve the world’s toughest problems, we firmly believe that both the companies developing the technology and the companies and organizations using it need to act responsibly in how AI is deployed.

We designed Assistant Autocomplete from day one to meet the demands of enterprise workloads. Assistant Autocomplete respects Unity Catalog governance and meets compliance standards for certain highly regulated industries. It also respects geo restrictions and can be used in workspaces that process Protected Health Information (PHI). Your data is never shared across customers and is never used to train models. For more detailed information, see Databricks Trust and Safety.

Getting started with Databricks Assistant Autocomplete

Databricks Assistant Autocomplete is available across all clouds at no additional cost and will be enabled in workspaces in the coming weeks. Users can enable or disable the feature in developer settings:

1. Navigate to Settings.
2. Under Developer, toggle Automatic Assistant Autocomplete.

As you type, suggestions automatically appear. Press Tab to accept a suggestion. To manually trigger a suggestion, press Option + Shift + Space (on macOS) or Control + Shift + Space (on Windows). You can manually trigger a suggestion even if automatic suggestions are disabled.

For more information on getting started and a list of use cases, check out the documentation page and public preview blog post.
