Step-by-Step Guide: AI-Powered Customer Sentiment Analysis
Summary
In this blog post, we’ll follow a step-by-step guide to building a review analysis pipeline. We’ll leverage the Databricks Marketplace to import sample data, use ai_query() to build a pipeline that analyzes product reviews, and serve that analysis in an AI/BI dashboard. The final result is an interactive way for users to understand customer opinions from product reviews.
Analyzing data from free-form text reviews can provide critical insights into customer feedback. What do customers think about your business? What particular aspects can be improved?
Traditionally, understanding large volumes of unstructured text data like user reviews requires a significant investment: ML engineers must train and deploy classification and/or named entity recognition models purpose-built for each task. With AI Functions, however, Databricks makes it possible for anyone comfortable working in SQL to get answers to these questions. No bespoke modeling is required; it’s as simple as writing a few lines of SQL.
In this blog post, we’ll follow a step-by-step guide for a SQL analyst to understand trends in free-form text reviews. We’ll leverage the Databricks Marketplace to import sample data, then use ai_query() to build a pipeline that analyzes those reviews. We’ll then display and share that analysis using an AI/BI dashboard. The final result is an interactive way for users to understand customer opinions from product reviews.
Step-by-step process
Let’s walk through the process of mining opinions in Databricks SQL.
- Importing data from the Databricks Marketplace
- Building an opinion mining pipeline
- Building a dashboard
Step 1: Importing data from the Databricks Marketplace
For this example, we’ll use a sample Amazon reviews dataset from BrightData, one of the partners on the Databricks Marketplace. To access this data, navigate to the Databricks Marketplace via the left-hand navigation menu.
Databricks Marketplace is an open marketplace for data, analytics, and AI assets, powered by the open source Delta Sharing standard. It lets you discover shared datasets and other assets and use them directly within your Databricks environment to advance your analytics and AI initiatives.
To find the dataset, go to the 'Search for products' search box and type in 'amazon - best seller products'. Click the first tile to see details about the dataset and provider.
From there, click the blue 'Get instant access' button in the top right corner and accept the terms and conditions.
Instant access adds the shared dataset to Unity Catalog. Within Unity Catalog, click the 'amazon_reviews' table to see some example reviews.
Step 2: Building an opinion mining pipeline
Databricks AI Functions provide an easy way to perform analytical tasks that would be difficult or impossible with traditional SQL. The following functions are currently supported out of the box:
- ai_analyze_sentiment
- ai_classify
- ai_extract
- ai_fix_grammar
- ai_gen
- ai_mask
- ai_similarity
- ai_summarize
- ai_translate
These functions are helpful for quickly testing AI capabilities on top of text data. For instance, we can call ai_analyze_sentiment on the review text to find each review’s sentiment. This assigns a positive, negative, neutral, or mixed label to the text.
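As a sketch, assuming the shared table is reachable as amazon_reviews and the review text lives in a REVIEW_TEXT column:

```sql
-- Quick sentiment check on a few reviews using the built-in AI function.
SELECT
  REVIEW_TEXT,
  ai_analyze_sentiment(REVIEW_TEXT) AS sentiment
FROM amazon_reviews
LIMIT 10;
```

Each row comes back with one of the four sentiment labels, so you can eyeball the results before building anything more elaborate.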
For more customized use cases, Databricks recommends using ai_query(). This function lets users call any model served by Databricks Model Serving, whether a foundation, custom, or external model. ai_query() also enables prompt customization and delivers better throughput when used with a provisioned throughput endpoint.
Set up a provisioned throughput endpoint
We’ll first set up a provisioned throughput endpoint for our use case to ensure optimal throughput. If you don’t have permissions to set up an endpoint, you may need to request help from your workspace admin with these next steps. Navigate to the Serving tab in the navigation menu on the left-hand side. Click the Create serving endpoint button in the top right corner to create a new endpoint.
On the setup page, we’ll fill out the relevant fields. Name the endpoint llama-v3_1_70b-pt. For the entity, choose system.ai.meta_llama_v3_1_70b_instruct. Because we’re prototyping our solution, under the advanced configuration we’ll check the Scale to zero box so the endpoint scales down to 0 tokens/second when it’s not in use. For production use, we recommend unchecking this option to ensure low-latency responses.
Create the endpoint to begin provisioning. This may take some time to set up, so we’ll work on creating our query itself. Navigate to the SQL Editor and create a new blank draft.
Review the sample data
Now, let’s examine the sample data, specifically the REVIEW_TEXT column, which contains the free-form review text we’ll analyze.
Define SQL Query
Let’s decide what we want in the output. Ideally, we’d extract the critical opinions, record the sentiment of each, and classify each mention into a relevant category.
To produce this output, we’ll construct a SQL query that uses ai_query() to call an LLM. We’ll create a custom prompt that instructs the LLM to classify reviews into our desired categories and provide a sample review and output to improve the quality of the response.
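The exact prompt is up to you; below is a minimal sketch, assuming the llama-v3_1_70b-pt endpoint created earlier and a hypothetical category list (Price, Quality, Shipping, Usability, Other) that you should adapt to your own data:

```sql
-- Sketch: extract structured opinions from raw review text with ai_query().
-- The category list and JSON field names here are illustrative assumptions.
SELECT
  REVIEW_ID,
  REVIEW_TEXT,
  ai_query(
    'llama-v3_1_70b-pt',
    CONCAT(
      'Extract every distinct opinion from the product review below. ',
      'Respond with only a JSON array of objects, each with the fields ',
      '"Classification" (one of: Price, Quality, Shipping, Usability, Other), ',
      '"Comment" (a short quote or paraphrase), and ',
      '"Sentiment" (Positive, Negative, Mixed, or Neutral). ',
      'Review: ', REVIEW_TEXT
    )
  ) AS opinions
FROM amazon_reviews
LIMIT 10;
```

Adding a worked example review and its expected JSON to the prompt (few-shot prompting) tends to make the model’s output format more reliable.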
Copy and paste the query into the SQL editor. Once the model serving endpoint from the previous step has finished provisioning, run the query on ten reviews to see the results.
Optimize the process
Now that we know this works, we can make a few improvements.
- Extract the opinions from JSON into columns.
- Filter out any hallucinated values that don’t match our desired classifications.
- Create a materialized view to store these results so we don’t have to recompute the LLM outputs each time we want to look at our dashboard. You can refresh the materialized view whenever you want to incrementally compute more review opinions without needing to rescan the entire dataset.
To implement these improvements, copy and paste the following code into a new query.
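As a sketch (the prompt, category list, and view name are illustrative assumptions; adjust them to match the query you ran above):

```sql
-- Sketch: persist parsed opinions in a materialized view so LLM outputs
-- are computed once and can be refreshed incrementally later.
CREATE MATERIALIZED VIEW review_opinions AS
WITH raw AS (
  SELECT
    REVIEW_ID, PRODUCT_NAME, RATING, AUTHOR_NAME, REVIEW_TEXT,
    ai_query(
      'llama-v3_1_70b-pt',
      CONCAT(
        'Extract every distinct opinion from the product review below. ',
        'Respond with only a JSON array of objects, each with the fields ',
        '"Classification" (one of: Price, Quality, Shipping, Usability, Other), ',
        '"Comment", and "Sentiment" (Positive, Negative, Mixed, or Neutral). ',
        'Review: ', REVIEW_TEXT
      )
    ) AS opinions_json
  FROM amazon_reviews
)
SELECT
  REVIEW_ID, PRODUCT_NAME, RATING, AUTHOR_NAME, REVIEW_TEXT,
  opinion.Classification, opinion.Comment, opinion.Sentiment
FROM raw
  -- Parse the JSON array and fan each opinion out to its own row.
  LATERAL VIEW explode(
    from_json(
      opinions_json,
      'ARRAY<STRUCT<Classification:STRING, Comment:STRING, Sentiment:STRING>>'
    )
  ) opinions_tbl AS opinion
-- Drop any hallucinated labels that fall outside our taxonomy.
WHERE opinion.Sentiment IN ('Positive', 'Negative', 'Mixed', 'Neutral')
  AND opinion.Classification IN ('Price', 'Quality', 'Shipping', 'Usability', 'Other');
```

The WHERE clause is the hallucination filter from the list above: any opinion whose labels don’t match the taxonomy is silently dropped rather than polluting the dashboard.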
Execute this query. It may take a while to complete, so grab a cup of coffee or tea while it runs.
Step 3: Building a dashboard
Finally, we want to display these results on a shared dashboard.
Create a new dashboard
Let’s create a new AI/BI dashboard by clicking on +New > Dashboard.
Double-click the dashboard title to give it a name; we’ll call it Reviews Dashboard.
Choose the materialized view
Next, let's pull data from the materialized view we created. Go to the Data tab and click Select a table. Find the materialized view you created and import it into the dashboard.
Create visualizations
Now that we have the dataset loaded, we can create some visualizations. We’ll want to answer the following questions:
- What is the distribution of sentiment across different products?
- What product attributes are considered positive? What attributes are negative?
- Show me which reviews are considered positive vs. negative.
We’ll start by adding a text widget. Go back to the Canvas tab, add a Text Box, and place it at the top of the canvas. Use Markdown to give our dashboard a header: # Reviews Dashboard.
Now, we’re going to add some visualizations. Let’s add a row of KPIs across the top of the dashboard to show critical metrics. Click the Add a visualization tool in the bottom toolbar, and place a widget on a 1-wide by 3-tall grid. Choose Counter as the visualization type from the right-hand dropdown menu, and select count(*) as the Value. Name this widget Total Opinions.
Clone this first counter and place it to the right. Rename it Positive. Next, in the Filter fields menu, add a filter on Sentiment and choose Positive as the value.
Continue doing this to get a row of 6 widgets across the top of the dashboard for these KPIs:
- Total Opinions
- Positive
- Negative
- Mixed
- Neutral
- Avg Rating
You can optionally style these counters to make them look more appealing.
Next, we’re going to add three more charts to our dashboard:
- A bar chart showing sentiment per classification.
- X-axis: COUNT(*)
- Y-axis: Classification
- Color: Sentiment
- Layout (in X-axis kebab): Grouped
- A table showing raw results to review the actual text, with the following fields:
- Classification
- Comment
- Sentiment
- PRODUCT_NAME
- RATING
- AUTHOR_NAME
- REVIEW_ID
- REVIEW_TEXT
- A heatmap showing the distribution of sentiment per product
- X-axis: PRODUCT_NAME
- Y-axis: Sentiment
- Color: COUNT(*)
Finally, we’ll add filters to slice our dashboard by different dimensions. Place three filter widgets at the top of the canvas and set each one’s filter type to Multiple Values. On filter 1, select the PRODUCT_NAME field to filter on; for filter 2, select Sentiment; for filter 3, select Classification. Optionally rename the filters to give them friendlier names.
We’ve finished updating our dashboard! To see the final result, click the “Publish” icon at the top right corner of the dashboard. From there, you can share your dashboard with additional users to ensure anyone can see your review insights.
Conclusion
Unstructured text data like customer reviews is a treasure trove of information about your business. Previously, analyzing text required advanced knowledge of NLP techniques. With AI Functions and AI/BI dashboards, SQL users can seamlessly create engaging dashboards without building an AI or ML model. You can extend this pattern to any unstructured text analysis in SQL: classification, entity extraction, translation, and more.
Thank you for reading along. Our team would love to hear from you if you have any questions or comments. We can’t wait to see what you do with AI functions and AI/BI dashboards!
Get started with sentiment analysis
Learn more about batch inference by reading the blog post here. Also, check out other AI functions for applying AI to your data from within SQL.
To learn more about Databricks SQL, visit our website. You can also check out the product tour for Databricks SQL. If you want to migrate your existing warehouse to a high-performance, serverless data warehouse with a great user experience and lower total cost, Databricks SQL is the solution: try it for free.