Databricks SQL offers a scalable, fully managed, interactive query engine for all BI and SQL workloads, one that outperforms leading data warehouses on a price/performance basis. It offers best-in-class open connectors to the BI and reporting tools you already use, like Power BI, Looker and Tableau, as well as a built-in SQL IDE.
By the end of this demo, we’ll show you how easy it is to deploy, manage and administer Databricks SQL, to enable self-serve analytics directly on the Databricks Lakehouse Platform.
So let’s get started.
Click to expand the transcript →
We’ll enter each user’s email that we want to invite.
Here, we’ll give them access to the Databricks Workspace, as well as access to Databricks SQL.
Once users have access to Databricks SQL, they’ll be able to log into the Lakehouse, and we can start giving them permissions to access data and run queries.
Configure Serverless SQL Endpoints + data access
Now that we’ve invited some users, let’s return to the Databricks Homepage.
From here, we can jump over to Databricks SQL by clicking the sidebar, and selecting the Databricks SQL lens. You’ll see the brand new Databricks SQL guided onboarding experience in the panel on the left hand side.
Go ahead and click “Configure data access” as the first step in our journey.
Before we can start querying data, as administrators, we’ll want to modify a couple of settings in the SQL Admin Console.
From this screen in the SQL Admin Console, you can turn on serverless SQL endpoints, and add an Instance Profile to enable users of your cluster to access data in the lakehouse.
Next, let’s jump over to the Data Explorer to view some data and manage permissions. From here, data analysts and users can discover all the data sets in the lakehouse, browse schemas, and see sample data.
Let’s take a look at the “loans_delta” table. When we select it, we can view its schema on the right hand side.
Clicking through the tabs, we can view some sample data from this table, see details about where the data files themselves are actually located, and whether it is a fully managed table or not.
Data Permissions – UI
On the Permissions tab, we can grant access to groups or individual users. Let’s go ahead and add the data scientists group, as well as our friend Amit.
Perfect. As you can see, those permissions are now recorded here on the right hand side.
SQL GRANT statements
Finally, when you need to automate your permission management, you can grant or revoke access to data objects in the lakehouse using standard SQL GRANT statements, like the ones seen here.
Simply run these statements using the built-in SQL Query Editor, and Databricks SQL updates the permissions immediately.
And if you want, you can also synchronize these access controls directly with Azure Active Directory, or your cloud identity service of choice.
Create SQL Endpoint
Jumping back to the SQL Endpoints tab.
From here, we can view and monitor the health of all of the endpoints that are currently running, or start up a new one. Let’s go ahead and create a new endpoint now.
As you can see, we have some really simple options here, which is great! Databricks SQL abstracts away all of the complexity of cluster creation, leaving only the most essential options. For example, Databricks SQL endpoints offer t-shirt sizing for clusters. Choose small, medium or large, and Databricks automatically determines instance types and configuration for the best price performance.
In addition, multi-cluster load balancing allows you to automatically launch new clusters to meet demand when the load on your endpoints becomes too high.
And finally, the Databricks Photon Engine is a native vectorized query engine that provides faster, more efficient execution of all your SQL workloads.
Connect Databricks SQL Endpoint to Tableau
Now that we have an endpoint running, we’re ready to start querying data. We can use the built-in Databricks SQL query engine and IDE, or we can connect our Endpoint to an external BI tool. For this demo, we’ll use Databricks SQL as the backend for Tableau.
For now, let’s use Databricks SQL as a powerful SQL backend for Tableau, and we’ll demo the built-in BI and visualization tools later.
To set up the Tableau connection, first we need to copy the SQL Endpoint credentials from the Connection Details tab.
Then, we’ll switch over to the user settings to create a new personal access token that will be used to authenticate to Tableau.
Now we’re ready to switch over to Tableau Desktop.
Let’s go ahead and create a new Workbook, and then select Databricks as our data source.
Next, just paste in the credentials from the previous steps.
Once we sign in, in the upper left hand corner, you’ll see that Tableau is now connected to our Databricks SQL Endpoint. We can search for the database we’ve been working with so far, “deltadb”, and select the “loans_delta” table.
When we click the “Update Now” Button, Databricks SQL returns query results almost instantly.
The next-gen Photon query engine runs your data warehousing workloads as fast and efficiently as possible, so creating new rollups, or building new auto updating dashboards like this one, is a breeze.
Now that we have a direct connection to Tableau, we can run interactive queries, and build beautiful visualizations and dashboards using Databricks SQL on the back end.
Monitoring SQL Endpoints
Returning now to Databricks SQL. As data admins, we’d like to be able to monitor the health and performance of our endpoints to minimize downtime.
We can view the status and load on our endpoints by navigating to Monitoring, and viewing the load balancing chart.
As good stewards of our data, we’d also like to be able to know what data sets people accessed, when they accessed them, and what queries they ran, for purposes of auditability, lineage, and data governance. We can view this information by clicking the Query History tab.
SQL Query History
From here, we can view a central log of all of the timestamped queries that users have run on a particular SQL endpoint — regardless of whether it originated from a B.I. tool like Looker, or from a user running a query directly in the built-in SQL Query Editor.
We can click on any individual query to get a general overview with some summary statistics.
Or, we can choose the Execution Details tab for even more granular metrics.
If we want to dive down even deeper, we can view the Execution Summary. The information shown here allows you to identify bottlenecks for long-running queries, understand the physical plan, or view detailed execution information for debugging purposes.
SQL Analyst Experience
Now that we’ve touched on the admin experience, let’s briefly switch gears and explore Databricks SQL from the data analyst’s perspective.
If you want a quick way to get started, the starter Databricks SQL Endpoint comes preloaded with sample datasets, queries, and dashboards. You can copy these as templates, or just play around with them to discover features and get comfortable with the built-in SQL IDE.
Tabbed SQL Query Editor
The new SQL Query editor allows you to run queries in tabs, that persist even if you close your browser and return to Databricks SQL later on.
From here, you can quickly convert the results of your queries into visualizations, and build those into dashboards.
Finally, the Past Executions tab offers a quick look at your personal query history, so that you can easily go back and review interesting queries.
Thanks for watching this demo. Visit databricks.com/try, to get started with Databricks SQL today.
Ready to get started?