A Guide to the Databricks + AWS Cloud Data Lake Dev Day Workshop

Published: August 24, 2020

by Kristen Stewart and Andreana Phillips

The Databricks team has been working hard to rebuild our content and improve the experience as we move all our events online. Along the way, we’ve learned a thing or two about what you want to learn: how to integrate Databricks with AWS services. So we’re excited to present our Cloud Data Lake Dev Day Workshop in partnership with AWS and Onica.

Organizations want to use the wealth of data accumulated in their data lakes for deep analytics insights. However, most struggle to prepare that data for analytics and to automate the pipelines that pick up new data as the lake is constantly updated. Making the shift to automated data pipelines can be challenging, but it has become more urgent as the COVID-19 pandemic accelerates the move to a completely virtual workforce and collaborative problem solving.

Learn how to move from manual management of data pipelines to seamless automation in this collaborative workshop, where experienced partners and customers pave the way. Join us Thursday, August 27, at 9:00 AM PDT for a deep dive into the technology that makes up a modern cloud-based data and analytics platform. The sessions will include live chat with our system architects, who will answer your questions.

Meet our speakers

Arsalan Tavakoli-Shiraji, Co-Founder and SVP of Field Engineering, Databricks
Prior to Databricks, Arsalan was an associate principal at McKinsey & Company, where he advised enterprises, vendors and the public sector on a broad spectrum of strategic topics, including next-generation IT, cloud computing and big data initiatives as well as general IT and corporate strategy. Arsalan received a Ph.D. in computer science from UC Berkeley in the area of Networking and Distributed Systems and a B.Eng. from the University of Virginia.

Kevin Miller, General Manager, Amazon S3, AWS
With more than 8 years of experience on the Amazon Web Services team, Kevin has a deep understanding of the storage technologies and options available to a wide range of industries. Kevin can also speak to customer experiences with S3 and Databricks. Prior to Amazon, Kevin was an assistant director at Duke University and a member of the Technology Architecture Group, where he was charged with establishing technical strategies and developing organizational technical maturity.

Sally Hoppe, Big Data System Architect, HP
Sally is a big data system architect at HP. With a background in math and computer science, she is a versatile software engineering professional with experience developing enterprise software solutions and managing cross-functional teams. While working for a large corporation, she has sought out opportunities in new businesses to learn new technologies and work with passionate coworkers. Because she likes to make order out of chaos, she frequently finds herself in positions that require deep technical knowledge and management skills.

Daniel Ferrante, Director of Platform Engineering, Digital Turbine
For the past 4+ years, Daniel has been leading the Platform and Data Engineering teams at Digital Turbine. He is skilled in data engineering techniques using Apache Spark™, Scala, Python, Java and Spring Boot. His interests and focus lie in helping businesses succeed in the automation of business analytics and data mastery.

Traey Hatch, Practice Manager, Onica
For the past year, Traey has been at Onica, a cloud consulting and managed services company, helping businesses enable, operate and innovate on the cloud. From migration strategy to operational excellence and immersive transformation, Onica is a full spectrum AWS integrator, helping hundreds of companies realize the value, efficiency and productivity of the cloud. Traey has experience in creating greenfield data lakes for multiple customers ranging from POC projects to full production deployments.

Denis Dubeau, AWS Partner Solution Architect Manager, Databricks
Denis is a partner solution architect providing guidance and enablement on modernizing data lake strategies using Databricks on AWS. Denis is a seasoned professional with significant industry experience in data engineering and data warehousing with previous stops at Greenplum, Hortonworks, IBM and AtScale.

An overview of what you’ll learn:

How to build highly scalable and reliable data pipelines for analytics
How you can make your existing S3 data lake analytics-ready with open-source Delta Lake technology
How to evaluate options for migrating current on-premises data lakes (Hadoop, etc.) to AWS with Databricks Delta
How to integrate that data with services such as Amazon SageMaker, Amazon Redshift, AWS Glue and Amazon Athena as well as how to leverage your AWS security and roles without moving your data out of your account
How open-source technologies like Delta Lake and Apache Spark stay portable and powerful at any organization and for any analytics use case

Get ready

Register: If you have not registered for the event, you can do so here.
Training: If you are new to Databricks and want to learn more, check out our free online training course here.
Learn more about Databricks on AWS at www.databricks.com/aws
