Join the Day 1 keynote to hear from Ali Ghodsi, Matei Zaharia, and Reynold Xin, Databricks co-founders and original creators of Apache Spark, on how the open source community is taking on the biggest challenges in data.
This talk will take a deep dive into the latest updates to the Delta Lake project and how it’s realizing the vision of the lakehouse architecture to help data teams tackle their toughest challenges. The keynote will also cover the latest data management innovations on the Databricks platform.
Keynotes this morning include:
Ali Ghodsi – Databricks
Rohan Dhupelia – Atlassian
Michael Armbrust – Databricks
Matei Zaharia – Databricks
Reynold Xin – Databricks
Matt Garman – AWS
Matei Zaharia is an Assistant Professor of Computer Science at Stanford University and Chief Technologist at Databricks. He started the Apache Spark project during his PhD at UC Berkeley in 2009, and has worked broadly in datacenter systems, co-starting the Apache Mesos project and contributing as a committer on Apache Hadoop. Today, Matei tech-leads the MLflow development effort at Databricks in addition to other aspects of the platform. Matei’s research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best PhD dissertation in computer science, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).
Malala was enrolled in her father’s school at the age of 4. Truly her father’s daughter, while other children fantasized about playing with toys, Malala fantasized about giving lectures. Her brother jokingly shares that Malala is “addicted to books.” But it was in the hallways of her father’s school that Malala found her voice and her vision. Her father, Ziauddin Yousafzai, is a Pakistani educator, activist, and humanitarian who established a thriving school in their rural home in Swat Valley, which aimed to provide educational opportunities for all children. Ziauddin’s dedication to education and peaceful resistance against the Taliban made the world take notice. Inspired by her father’s activism, Malala began her campaign for girls’ education at age 11 with her anonymous blog for the BBC, Diary of a Pakistani Schoolgirl, about life under the Taliban. Malala soon began advocating publicly for girls’ education. She would join her father on his visits to neighboring villages to recruit for the school. While he spoke to the men, she would speak to the women. Their crusade was the subject of a New York Times short documentary in 2009. Independently, Malala began attracting international media attention and awards. Due to her increased prominence, at age 15 she was attacked by the Taliban for speaking out. Malala recovered in the United Kingdom and has continued her fight for girls ever since. In 2013, she founded Malala Fund with her father. A year later, Malala received the Nobel Peace Prize in recognition of her efforts to see every girl complete 12 years of free, safe, quality education. Malala is currently completing her undergraduate degree at Oxford University, with a focus on Philosophy, Politics, and Economics. Malala is the author of three books, I Am Malala: The Girl Who Stood Up for Education and Was Shot by the Taliban, Malala’s Magic Pencil, and We Are Displaced: My Journey and Stories from Refugee Girls Around the World. 
Malala used to be known as her father’s daughter, but now Ziauddin is known as his daughter’s father – and he is proud to have it that way.
Rohan Dhupelia leads the analytics platform at Atlassian, which acts as the backbone for all analytical applications across the company, providing the capabilities necessary to power both first-party (internal data science and analytics) and third-party (in-product and direct customer) uses of analytical data. Rohan has spent the last 10+ years of his career in the data space across a variety of industries, including FMCG, property, and technology, doing everything from BI report development to data warehousing and data engineering.
Matt Garman joined Amazon in 2006. He is currently the Senior Vice President of the Worldwide Sales and Marketing organization in Amazon Web Services (AWS) and also sits on Amazon’s executive leadership S-Team. He has held several leadership positions in AWS over that time. Matt previously served as Vice President of the Amazon EC2 and Compute Services businesses for AWS for over 10 years, where he was responsible for P&L, product management, and engineering and operations for all compute and storage services in AWS. He started at Amazon when AWS first launched in 2006 and served as one of the first Product Managers, helping to launch the initial set of AWS services. Prior to Amazon, he spent time in product management roles at early-stage Internet startups. Matt earned a BS and MS in Industrial Engineering from Stanford University, and an MBA from the Kellogg School of Management at Northwestern University.
Ali Ghodsi is the CEO and co-founder of Databricks, responsible for the growth and international expansion of the company. He previously served as the VP of Engineering and Product Management before taking the role of CEO in January 2016. In addition to his work at Databricks, Ali serves as an adjunct professor at UC Berkeley and is on the board at UC Berkeley’s RISELab. Ali was one of the original creators of the open source project Apache Spark, and ideas from his academic research in resource management, scheduling, and data caching have been applied to Apache Mesos and Apache Hadoop. Ali received his MBA from Mid-Sweden University in 2003 and his PhD in Distributed Computing from KTH/Royal Institute of Technology in Sweden in 2006.
Michael Armbrust is a committer and PMC member of Apache Spark and the original creator of Spark SQL. He currently leads the team at Databricks that designed and built Structured Streaming and Databricks Delta. He received his PhD from UC Berkeley in 2013, and was advised by Michael Franklin, David Patterson, and Armando Fox. His thesis focused on building systems that allow developers to rapidly build scalable interactive applications, and specifically defined the notion of scale independence. His interests broadly include distributed systems, large-scale structured storage, and query optimization.
Reynold Xin is an Apache Spark PMC member and the top contributor to the project. He initiated and led efforts such as DataFrames and Project Tungsten. He is also a co-founder and Chief Architect at Databricks.