Catalin Toda is a software engineer on the Data Platform team at Lyft. He has been defining the Spark infrastructure Lyft uses to support a growing number of jobs, ranging from SQL queries to ML workloads, and is also working on deprecating Hive in favor of Spark. Previously, he worked as a Production Engineer supporting Spark infrastructure at Facebook. Before that, he supported Hortonworks Data Flow in the Asia Pacific region.
May 28, 2021 10:30 AM PT
Lyft is on a mission to improve people’s lives with the world’s best transportation. Since 2019, Lyft has been running both batch ETL and ML Spark workloads primarily on Kubernetes with the Apache Spark on k8s operator. However, as workloads grew in both frequency and resource requirements, we started hitting numerous reliability issues related to IP allocation, container images, IAM role assignment, and the Kubernetes control plane.
To continue supporting growing Spark usage at Lyft, the team came up with a hybrid architecture, based on Kubernetes and YARN, that is optimized for both containerized and non-containerized workloads. In this talk, we will also cover a dynamic runtime controller that handles per-environment config overrides and makes it easy to switch between resource managers.
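As a rough illustration of what per-environment overrides and resource-manager switchover could look like, here is a minimal Python sketch. It is not Lyft's controller: the environment names, the `resolve_conf` and `to_submit_args` helpers, the Kubernetes API URL, and the container image are all hypothetical. Only the Spark property names (`spark.master`, `spark.submit.deployMode`, `spark.executor.memory`, `spark.kubernetes.container.image`) are real Spark configuration keys.

```python
from typing import Dict, List, Optional

# Defaults shared by every job (values are illustrative assumptions).
BASE_CONF: Dict[str, str] = {
    "spark.submit.deployMode": "cluster",
    "spark.executor.memory": "4g",
}

# Settings that depend on the resource manager. The URL and image are hypothetical.
RESOURCE_MANAGER_CONF: Dict[str, Dict[str, str]] = {
    "yarn": {"spark.master": "yarn"},
    "kubernetes": {
        "spark.master": "k8s://https://k8s.example.internal:6443",
        "spark.kubernetes.container.image": "example/spark:latest",
    },
}

# Per-environment defaults: which resource manager to use, plus config overrides.
ENV_OVERRIDES: Dict[str, dict] = {
    "staging": {
        "resource_manager": "kubernetes",
        "conf": {"spark.executor.memory": "2g"},
    },
    "production": {"resource_manager": "yarn", "conf": {}},
}


def resolve_conf(env: str, resource_manager: Optional[str] = None) -> Dict[str, str]:
    """Merge base, resource-manager, and environment settings (later layers win).

    Passing `resource_manager` explicitly switches a job over without
    touching the environment's other overrides.
    """
    env_cfg = ENV_OVERRIDES[env]
    rm = resource_manager or env_cfg["resource_manager"]
    if rm not in RESOURCE_MANAGER_CONF:
        raise ValueError(f"unknown resource manager: {rm}")
    conf = dict(BASE_CONF)
    conf.update(RESOURCE_MANAGER_CONF[rm])
    conf.update(env_cfg["conf"])
    return conf


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render a resolved config as `--conf key=value` pairs for spark-submit."""
    args: List[str] = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args
```

In this sketch, `resolve_conf("production")` yields a YARN configuration, while `resolve_conf("production", "kubernetes")` moves the same job to Kubernetes by changing a single argument. Layering the dictionaries keeps the switchover decision separate from environment-specific tuning.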