Implementing a Reliable Data Lake with Databricks Delta and the AWS Ecosystem - Databricks

You might have an existing data lake, or you may be considering AWS to centralize your data assets in the cloud. In this session, we will demonstrate how to use the AWS Glue data cataloging feature to crawl various S3 data sources, discover their schemas, and populate your centralized metadata repository. Then we will cleanse your data with the reliability of Databricks Delta Lake to produce fast and accurate business outcomes.

Next, we will use Amazon Athena, a serverless interactive query service, to analyze the curated data, which resides on highly available and durable Amazon S3 storage and is integrated with the unified AWS Glue metadata repository.

Finally, you will learn how to quickly enhance your application with rich, interactive data visualization and analytics capabilities, with minimal effort, using Amazon QuickSight.


See More Spark + AI Summit Europe 2019 Videos

About Denis Dubeau


Denis Dubeau is a Partner Solution Architect providing guidance and enablement on modernizing data lake strategies using Databricks on AWS. Denis is a seasoned professional with significant industry experience in Data Engineering and Data Warehousing with previous stops at Greenplum, Hortonworks, IBM and AtScale.

About Jordan Martz


Jordan Martz is a Partner Solutions Architect for Databricks, supporting projects run by systems integrators and providing them with enablement and tools. Previously, Jordan was the Director of Technology Solutions at Attunity, where his role focused on managing the Database Migrations Services within Azure and AWS. His blog and company, DataMartz, has delivered training for Microsoft, Hortonworks, Cloudera, IBM, and AWS for over 10 years.