Measuring Advertising Effectiveness with Sales Forecasting and Attribution
Notebooks for this solution accelerator: Campaign Effectiveness -- ETL and Campaign Effectiveness -- Machine Learning. How do you connect your marketing and ad spend to the sales they drive? As the advertising landscape continues to evolve, advertisers are finding it increasingly challenging to efficiently pinpoint the impact of various revenue-generating...
Diving Into Delta Lake: DML Internals (Update, Delete, Merge)
In the previous blogs "Diving Into Delta Lake: Unpacking The Transaction Log" and "Diving Into Delta Lake: Schema Enforcement & Evolution", we described how the Delta Lake transaction log works and the internals of schema enforcement and evolution. Delta Lake supports DML (data manipulation language) commands including DELETE, UPDATE, and MERGE. These commands simplify change data...
Automate Azure Databricks Platform Provisioning and Configuration
Table of Contents: Introduction; Automation options; Common workflow; Pre-Requisites; Create Azure Resource Group and Virtual Network; Provision Azure Application / Service Principal; Assign Role to Service Principal; Configure Postman Environment; Provision Azure Databricks Workspace; Generate AAD Access Token; Deploy Workspace using the ARM template; Get workspace URL; Generate Access Token for Auth; Generate AAD Access...
Easily Clone your Delta Lake for Testing, Sharing, and ML Reproducibility
Introducing Clones: an efficient way to make copies of large datasets for testing, sharing, and reproducing ML experiments. We are excited to introduce a new capability in Databricks Delta Lake: table cloning. Creating copies of tables in a data lake or data warehouse has several practical uses. However, given the volume of data in...
Announcing Databricks Labs Terraform integration on AWS and Azure
We are pleased to announce integration for deploying and managing Databricks environments on Microsoft Azure and Amazon Web Services (AWS) with HashiCorp Terraform. Terraform is a popular open source tool for creating safe and predictable cloud infrastructure across several cloud providers. With this release, our customers can manage their entire Databricks workspaces along with the...
It’s an ESG World and We’re Just Living in it
The future of finance goes hand in hand with socially responsible investing, environmental stewardship, and corporate ethics. In order to stay competitive, Financial Services Institutions (FSI) are increasingly disclosing more information about their environmental, social, and corporate governance (ESG) performance. Hence the increasing importance of ESG ratings and ESG scores to investment managers and institutional...
An Update on Project Zen: Improving Apache Spark for Python Users
Apache Spark™ has reached its 10th anniversary with Apache Spark 3.0, which brings many significant improvements and new features, including type hint support in pandas UDFs, better error handling in UDFs, and Spark SQL adaptive query execution. It has grown to be one of the most successful open-source projects as the...
Introducing the Databricks Web Terminal
We're excited to introduce the public preview of the Databricks Web Terminal in the 3.25 platform release. Any user with "Can Attach To" cluster permissions can now use the Web Terminal to interactively run Bash commands on the driver node of their cluster. The new Databricks Web Terminal provides a fully interactive shell that...
Improving Public Health Surveillance During COVID-19 with Data Analytics and AI
As the leader of the State and Local Government business at Databricks, I get to see what governments all over the U.S. are doing to address the Novel Coronavirus and COVID-19 crisis. I am continually inspired by the work of public servants as they go about their business to save lives and address this crisis....
Enabling Spark SQL DDL and DML in Delta Lake on Apache Spark 3.0
Last week, we had a fun Delta Lake 0.7.0 + Apache Spark 3.0 AMA where Burak Yavuz, Tathagata Das, and Denny Lee provided a recap of Delta Lake 0.7.0 and answered your Delta Lake questions. The theme for this AMA was that the release of Delta Lake 0.7.0 coincided with the release of Apache Spark 3.0...