Automate Azure Databricks Platform Provisioning and Configuration
Table of Contents Introduction Automation options Common workflow Pre-Requisites Create Azure Resource Group and Virtual Network Provision Azure Application / Service Principal Assign Role to Service Principal Configure Postman Environment Provision Azure Databricks Workspace Generate AAD Access Token Deploy Workspace using the ARM template Get workspace URL Generate Access Token for Auth Generate AAD Access...
Data Exfiltration Protection with Azure Databricks
In the previous blog, we discussed how to securely access Azure Data Services from Azure Databricks using Virtual Network Service Endpoints or Private Link. Given a baseline of those best practices, in this article we walkthrough detailed steps on how to harden your Azure Databricks deployment from a network security perspective in order to prevent...
Securely Accessing Azure Data Sources from Azure Databricks
Azure Databricks is a Unified Data Analytics Platform that is a part of the Microsoft Azure Cloud. Built upon the foundations of Delta Lake, MLFlow , Koalas and Apache Spark, Azure Databricks is a first party service on Microsoft Azure cloud that provides one-click setup, native integrations with other Azure services, interactive workspace, and enterprise-grade...
Simplify Market Basket Analysis using FP-growth on Databricks
When providing recommendations to shoppers on what to purchase, you are often looking for items that are frequently purchased together (e.g. peanut butter and jelly). A key technique to uncover associations between different items is known as market basket analysis. In your recommendation engine toolbox, the association rules generated by market basket analysis (e.g. if...