Denis Dubeau is a Partner Solution Architect providing guidance and enablement on modernizing data lake strategies using Databricks on AWS. Denis is a seasoned professional with significant industry experience in Data Engineering and Data Warehousing with previous stops at Greenplum, Hortonworks, IBM and AtScale.
Ever wanted to get the low cost of a data lake combined with the performance of a data warehouse? Welcome to the Lakehouse. In this session, learn how you can build a Lakehouse on your AWS cloud platform using Amazon S3 and Delta Lake. Integrate with AWS Glue and make the content available to all of your AWS services, such as Athena and Redshift. Learn how other companies have created an affordable, high-performance Lakehouse to drive all their analytics efforts.
How are customers building enterprise data lakes on AWS with Databricks? Learn how Databricks complements the AWS data lake strategy and how HP has succeeded in transforming business with this approach.
You might have an existing data lake, or you might be considering AWS to centralize your data assets in the cloud. In this session, we will demonstrate how to use the AWS Glue data cataloging feature to crawl various S3 data sources, discover their schemas, and populate your centralized metadata repository. Then, cleanse your data with the reliability of Databricks Delta Lake to produce fast and accurate business outcomes.
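As a rough illustration of the crawling step, the sketch below builds the request for a Glue crawler that scans a set of S3 paths and registers the discovered schemas in a Glue database. It is a minimal sketch, not the session's actual demo code: the crawler name, IAM role ARN, database name, and bucket paths are all hypothetical, and it assumes the `boto3` package and AWS credentials are available when the crawler is actually created.

```python
def build_crawler_request(name, role_arn, database, s3_paths):
    """Build the keyword arguments for Glue's create_crawler call.

    All values passed in are placeholders chosen for illustration.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # One S3 target per data source to crawl.
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(request):
    """Create the crawler and start it (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical names, for illustration only.
example_request = build_crawler_request(
    name="sales-raw-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="example_lake_db",
    s3_paths=["s3://example-bucket/raw/orders/", "s3://example-bucket/raw/customers/"],
)
```

Calling `create_and_start_crawler(example_request)` would submit the crawler to AWS. Separating the request builder from the call keeps the configuration easy to inspect before anything is created.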
Next, we will use Amazon Athena, a serverless interactive query service, to analyze the curated data, which resides on highly available and durable Amazon S3 storage and is integrated with the unified AWS Glue metadata repository.
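To make the Athena step more concrete, here is a minimal sketch of submitting a SQL query against a table registered in the Glue catalog. The database, table, query, and results bucket are hypothetical placeholders rather than anything from the session, and the sketch assumes `boto3` and AWS credentials are available when the query is actually submitted.

```python
def build_athena_query_request(sql, database, output_location):
    """Build the keyword arguments for Athena's start_query_execution call.

    The database name refers to a Glue catalog database; Athena writes
    query results to the given S3 output location.
    """
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def submit_query(request):
    """Submit the query and return its execution ID (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the request builder works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical table and bucket names, for illustration only.
example_query = build_athena_query_request(
    sql="SELECT region, SUM(amount) AS total FROM curated_orders GROUP BY region",
    database="example_lake_db",
    output_location="s3://example-bucket/athena-results/",
)
```

Athena runs queries asynchronously, so the ID returned by `submit_query(example_query)` would then be used to poll for completion and fetch the results.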
Finally, learn how to quickly enhance your application with rich, interactive data visualization and analytics capabilities, with minimal effort, using Amazon QuickSight.