Announcing Multi-Cloud support for Security Analysis Tool (SAT)

Monitor the security health of your Databricks workspaces

Published: February 3, 2023

by Anindita Mahapatra, Arun Pamulapati and Ramdas Murali

Last November, we announced the availability of the Security Analysis Tool (SAT) for AWS on our blog. Today we are excited to announce that SAT is available for Databricks customers on Azure and GCP. SAT helps our customers harden their Databricks environments by reviewing current deployments against our security best practices. It uses a checklist that prioritizes observed deviations by severity and provides links to resources that help resolve outstanding issues. SAT can be run as a routine scan for all workspaces in your environment to help establish continuous adherence to best practices, and health reports can be scheduled to provide continual confidence in the security of all data, including your sensitive datasets.

At Databricks, we build security into every layer of the Databricks Lakehouse Platform. Databricks has worked with thousands of customers to deploy the platform securely with security features that fit their architecture requirements. Security Best Practices documents for AWS, Azure, and GCP provide a checklist of the recommended security practices, considerations, and patterns you can apply to your deployment. SAT is built keeping these best practices in mind and helps our customers to analyze and harden their Databricks deployments by reviewing current workspace deployments against our security best practices. See the current list of checks SAT supports.

SAT builds on Databricks's multi-cloud experience. It evaluates your Databricks deployment against the same set of controls on every cloud and automatically applies cloud-specific checks where necessary, abstracting the cloud-specific APIs involved.

How to install & run SAT?

SAT is designed to be installed and configured in a single workspace per account. It runs in the customer's account as an automated workflow and uses the Databricks REST APIs to collect details about the account and about the clusters, jobs, etc., of all other workspaces in that account. An administrator can choose which workspaces to include in or exclude from routine scans.
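As a rough illustration of this kind of collection (not SAT's actual implementation), the sketch below reads cluster and job listings from one workspace over the Databricks REST API. The helper names `http_get_json` and `collect_workspace_details` are hypothetical, the host and token are placeholders, and pagination of the jobs listing is deliberately ignored.

```python
import json
import urllib.request
from typing import Callable, Dict, List

# Databricks REST API listing endpoints; the dictionary key doubles as the
# top-level key of the JSON response ("clusters" / "jobs").
ENDPOINTS: Dict[str, str] = {
    "clusters": "/api/2.0/clusters/list",
    "jobs": "/api/2.1/jobs/list",
}


def http_get_json(url: str, token: str) -> dict:
    """Issue an authenticated GET request and decode the JSON response body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def collect_workspace_details(
    host: str,
    token: str,
    fetch: Callable[[str, str], dict] = http_get_json,
) -> Dict[str, List[dict]]:
    """Hypothetical collector: return the listed objects for one workspace.

    `fetch` is injectable so the function can be exercised without a network.
    Only the first page of each listing is read in this sketch.
    """
    details: Dict[str, List[dict]] = {}
    for name, path in ENDPOINTS.items():
        payload = fetch(host.rstrip("/") + path, token)
        # A workspace with no objects of this kind may omit the key entirely.
        details[name] = payload.get(name, [])
    return details
```

Passing a stub as `fetch` makes it possible to try the function offline before pointing it at a real workspace URL and access token.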

Scan results are persisted in Delta tables so that security health trends can be analyzed over time. Findings are grouped into five security categories - Network Security, Identity & Access, Data Protection, Governance, and Informational - that are displayed on a Databricks SQL Dashboard. Security teams can set up alerts that notify them when SAT detects insecure configurations and policy deviations. SAT also provides additional details on individual checks that fail so that an admin can quickly pinpoint and remediate the issue. For more details on deployment, please refer to the setup docs and the AWS and Azure checklists.
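The grouping of findings can be pictured with a small sketch. The five category names come from the text above. The `Finding` record, the severity labels ("High", "Medium", "Low"), and the check names are assumptions made for illustration and do not reflect SAT's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# The five security categories shown on the SAT dashboard.
CATEGORIES = (
    "Network Security",
    "Identity & Access",
    "Data Protection",
    "Governance",
    "Informational",
)


@dataclass(frozen=True)
class Finding:
    """Hypothetical record for the outcome of one check in one run."""
    check_name: str
    category: str
    severity: str  # assumed labels such as "High", "Medium", "Low"
    passed: bool


def summarize_failures(findings: Iterable[Finding]) -> Dict[Tuple[str, str], int]:
    """Count failed checks per (category, severity) pair."""
    counts: Counter = Counter()
    for finding in findings:
        if finding.category not in CATEGORIES:
            raise ValueError(f"unknown category: {finding.category!r}")
        if not finding.passed:
            counts[(finding.category, finding.severity)] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = [
        Finding("check_a", "Network Security", "High", passed=False),
        Finding("check_b", "Network Security", "High", passed=False),
        Finding("check_c", "Governance", "Low", passed=False),
        Finding("check_d", "Governance", "Low", passed=True),
    ]
    for (category, severity), count in summarize_failures(sample).items():
        print(f"{category} / {severity}: {count} failing")
```

A summary keyed by category and severity mirrors the high-level view a security team would scan first before drilling into individual checks.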

Figure 2. Deployment and Run steps of SAT

Deploy & Run SAT

To deploy and run SAT:

Import the SAT GitHub repository into your Databricks environment.
Configure the access rights that SAT requires, based on your cloud environment.
Run the "Initializer" notebook to set up SAT. The Initializer notebook collects the list of all accessible workspaces, verifies access to each workspace, and uses the data to set up the reporting dashboard and alert framework.
By default, all the workspaces where the connection test succeeds are enabled for analysis. An administrator can change the config to indicate which set of checks to run, which workspaces should be analyzed, and where alerts should be sent when a check is violated.
It is recommended to run the driver daily to ensure all checks are in place as expected.
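The default workspace selection described in these steps can be sketched as follows. This is an illustration under stated assumptions rather than SAT's configuration format: the `Workspace` record, the `select_workspaces` function, and the `excluded_ids` parameter are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Workspace:
    """Hypothetical record produced by a connection test."""
    workspace_id: str
    connection_ok: bool


def select_workspaces(
    workspaces: Iterable[Workspace],
    excluded_ids: Optional[Iterable[str]] = None,
) -> List[str]:
    """Return the IDs of the workspaces to analyze.

    Every workspace whose connection test succeeded is enabled by default;
    an administrator can opt specific workspaces out via `excluded_ids`.
    """
    excluded = set(excluded_ids or ())
    return [
        ws.workspace_id
        for ws in workspaces
        if ws.connection_ok and ws.workspace_id not in excluded
    ]


if __name__ == "__main__":
    discovered = [
        Workspace("ws-1", connection_ok=True),
        Workspace("ws-2", connection_ok=False),  # unreachable, never scanned
        Workspace("ws-3", connection_ok=True),
    ]
    print(select_workspaces(discovered))                         # ['ws-1', 'ws-3']
    print(select_workspaces(discovered, excluded_ids=["ws-3"]))  # ['ws-1']
```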

How to use SAT insights?

The SAT dashboard showcases your workspace's security posture and provides a historic view of your security health over time. There is also a provision to go back in time and check the details of a previous run. For critical checks, it is recommended to configure email alerts that notify your administrators when a violation occurs.

The following list provides a high-level guide on how to navigate the SAT dashboard and what each of the display sections conveys:

Choose the workspace to analyze from the dropdown list at the top of the dashboard
By default, the latest run information is displayed. You can choose a specific run date using the Date Picker dropdown at the top of the dashboard.
The security checks are divided into the following sections:
- High-level summary by category and severity
- General Workspace usage stats
- Detailed security checks by category
- Informational Section for information nuggets to aid an investigation
- Drill down section to look into additional details of a check to identify root cause
- The Security Deviation Trend section takes a date range and displays the count of deviations over time
- The Security Deviation Comparison section takes two dates and lists the checks whose results differ between them. It also plots the count of checks for each day in the range to show whether things have degraded or improved in that period.
Each check has a hyperlink that takes you to details on what the security feature is, why it is important, and guidance on how to resolve it.

Apart from additional checks in each category since the last release, the feature enhancements to the main dashboard include:

The ability to track the trend of security best practice deviations over a date range. This helps identify the inflection point where improvements or degradations started to aid the investigation and remediation.

Figure 3. Security Deviation Trend Dashboard

For example, the diagram above shows a count of deviations in various categories by run date. The expectation is that over time the height of these bars should shrink or, at the very least, remain the same. A sudden increase warrants immediate investigation, as it may indicate an inadvertent human error.
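The reasoning behind the trend view can be expressed in a few lines. The sketch below assumes failed checks are available as `(run_date, check_name)` pairs; the function names and input shape are hypothetical and are not taken from SAT's tables.

```python
from collections import Counter
from datetime import date
from typing import Dict, Iterable, List, Tuple


def deviations_per_run(failed_checks: Iterable[Tuple[date, str]]) -> Dict[date, int]:
    """Count failed checks per run date, ordered by date.

    Runs with zero failures do not appear in the input, so callers that want
    them in the trend must add those dates with a count of 0 themselves.
    """
    counts = Counter(run_date for run_date, _check_name in failed_checks)
    return dict(sorted(counts.items()))


def runs_with_increase(counts_by_date: Dict[date, int]) -> List[date]:
    """Return the run dates whose deviation count rose versus the previous run."""
    ordered = sorted(counts_by_date.items())
    return [
        current_date
        for (_prev_date, prev_count), (current_date, current_count)
        in zip(ordered, ordered[1:])
        if current_count > prev_count
    ]


if __name__ == "__main__":
    trend = {
        date(2023, 1, 1): 5,
        date(2023, 1, 2): 5,
        date(2023, 1, 3): 8,  # rise: worth investigating
        date(2023, 1, 4): 6,
    }
    print(runs_with_increase(trend))  # [datetime.date(2023, 1, 3)]
```

Flagging any rise, rather than only large ones, is a deliberately strict choice for this sketch; a real alert rule might apply a threshold.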

The ability to compare two runs side by side along each of the security dimensions. This drill-down option helps pinpoint the checks that have either been rectified or degraded, so that security folks can address them speedily.

Figure 4. Security Deviation Comparison Dashboard

For example, the diagram above shows the individual checks in various categories for each run. The red rectangle in the diagram shows an improvement in "Enforce User Isolation" but a degradation in the "Admin Count" best practice. The expectation is that over time the cross marks should change to tick marks. The opposite indicates a degradation and warrants immediate investigation. An email alert is also triggered when detrimental changes are detected.
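The comparison amounts to a diff between two runs. The following sketch assumes each run is represented as a mapping from check name to a pass/fail flag. That representation and the `compare_runs` function are illustrative assumptions, and the third check name is made up.

```python
from typing import Dict, List


def compare_runs(earlier: Dict[str, bool], later: Dict[str, bool]) -> Dict[str, List[str]]:
    """Classify checks present in both runs as improved or degraded.

    Each mapping goes from check name to True (passed) or False (failed).
    Checks that exist in only one of the two runs are ignored.
    """
    common = earlier.keys() & later.keys()
    improved = sorted(name for name in common if not earlier[name] and later[name])
    degraded = sorted(name for name in common if earlier[name] and not later[name])
    return {"improved": improved, "degraded": degraded}


if __name__ == "__main__":
    first_run = {
        "Enforce User Isolation": False,
        "Admin Count": True,
        "Hypothetical Unchanged Check": True,
    }
    second_run = {
        "Enforce User Isolation": True,
        "Admin Count": False,
        "Hypothetical Unchanged Check": True,
    }
    print(compare_runs(first_run, second_run))
    # {'improved': ['Enforce User Isolation'], 'degraded': ['Admin Count']}
```

A non-empty `degraded` list corresponds to the situation that calls for immediate investigation.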

Conclusion

The Security Analysis Tool (SAT) for the Databricks Lakehouse Platform is easy to set up. It observes and reports on the security health of your Databricks workspaces over time across all three major clouds: AWS, Azure, and GCP. We invite you to set up SAT in your Databricks deployments or ask for help from your Databricks account team. Stay tuned for more posts and video content on Databricks Security Best Practices!

If you are curious about how Databricks approaches security, please review our Security & Trust Center. We encourage you to review Databricks Security Best Practices documents. If you have questions or suggestions about SAT, please feel free to reach us at [email protected].
