Bilal Obeidat is a solution architect working for Databricks.
June 23, 2020 05:00 PM PT
Columbia is a data-driven enterprise, integrating data from all line-of-business-systems to manage its wholesale and retail businesses. This includes integrating real-time and batch data to better manage purchase orders and generate accurate consumer demand forecasts. It also includes analyzing product reviews to increase customer satisfaction. In this presentation, we'll walk through how we achieved a 70% reduction in pipeline creation time and reduced ETL workload times from four hours with previous data warehouses to minutes using Azure Databricks, hence enabling near real-time analytics. We migrated from multiple legacy data warehouses, run by individual lines of business, to a single scalable, reliable, performant data lake on top of Azure and Delta Lake.