Denis Dubeau is a Partner Solution Architect providing guidance and enablement on modernizing data lake strategies using Databricks on AWS. Denis is a seasoned professional with significant industry experience in Data Engineering and Data Warehousing with previous stops at Greenplum, Hortonworks, IBM and AtScale.
How are customers building enterprise data lakes on AWS with Databricks? Learn how Databricks complements the AWS data lake strategy and how HP has succeeded in transforming business with this approach.
You might have an existing data lake or be considering AWS as the place to centralize your data assets in the cloud. In this session, we will demonstrate how to use the AWS Glue data cataloging feature to crawl various S3 data sources, discover their schemas, and populate your centralized metadata repository. Then, cleanse your data with the reliability of Databricks Delta Lake to produce fast and accurate business outcomes.
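As a rough illustration of the crawl step, the sketch below builds the parameters for a Glue crawler over a set of S3 paths and, optionally, submits them with boto3. It is a minimal sketch, not the session's actual demo: the crawler name, IAM role ARN, database name, and bucket paths are all hypothetical, and the helper function names are made up for this example.

```python
from typing import Any, Dict, List


def build_crawler_request(name: str, role_arn: str, database: str,
                          s3_paths: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for the AWS Glue CreateCrawler API call."""
    return {
        "Name": name,
        "Role": role_arn,
        # Glue Data Catalog database that receives the discovered tables.
        "DatabaseName": database,
        # One S3 target per data source to crawl.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(request: Dict[str, Any]) -> None:
    """Register the crawler and start it (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical names and paths, for illustration only.
request = build_crawler_request(
    name="example-sales-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="example_datalake",
    s3_paths=[
        "s3://example-bucket/raw/sales/",
        "s3://example-bucket/raw/customers/",
    ],
)
```

Calling `create_and_start_crawler(request)` against a real account would register and run the crawler; the builder is kept separate so the request can be inspected without touching AWS.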
Next, we will use Amazon Athena, a serverless interactive query service, to analyze the curated data, which resides on highly available and durable Amazon S3 storage and is registered in the unified AWS Glue metadata repository.
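To make the query step concrete, here is a minimal sketch of submitting a SQL statement to Athena with boto3. The database name, the `curated_sales` table and its columns, and the S3 results location are assumptions made for illustration; the abstract does not name them.

```python
from typing import Any, Dict


def build_athena_query_request(sql: str, database: str,
                               output_location: str) -> Dict[str, Any]:
    """Build keyword arguments for the Athena StartQueryExecution API call."""
    return {
        "QueryString": sql,
        # Database in the Glue Data Catalog that holds the curated tables.
        "QueryExecutionContext": {"Database": database},
        # S3 prefix where Athena writes the query results.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query and return its execution ID (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder works without AWS access

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical table, database, and bucket, for illustration only.
query_request = build_athena_query_request(
    sql=(
        "SELECT region, SUM(amount) AS total_amount "
        "FROM curated_sales GROUP BY region "
        "ORDER BY total_amount DESC LIMIT 10"
    ),
    database="example_datalake",
    output_location="s3://example-bucket/athena-results/",
)
```

Athena runs queries asynchronously, so a real caller would poll the returned execution ID for completion before fetching results.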
Finally, learn how to quickly enhance your application with rich, interactive data visualization and analytics capabilities, with minimal effort, using Amazon QuickSight.