Kirby graduated from the University of Exeter, UK, with a Master's in Physics. She now works for Mars Petcare, and has led the design and implementation of the first Delta Lake engine on the Mars Petcare Data Platform.
June 24, 2020 05:00 PM PT
At Mars Petcare (in a division known as Kinship Data & Analytics) we are building out the Petcare Data Platform, a cloud-based Data Lake solution. Building on Microsoft Azure, we faced important decisions around tools and design. We chose Delta Lake as the storage layer for our platform, to bring insight to the science community across Mars Petcare. Migrating away from Azure Data Factory completely, we used Spark and Databricks to build 'Kyte', a bespoke pipeline tool which has massively accelerated our ability to ingest, cleanse and process new data sources from across our large and complicated organisation. Building on this, we have started to use Delta Lake for our ETL configurations and have built a bespoke UI for monitoring and scheduling our Spark pipelines. Find out more about why we chose a Spark-heavy ETL design and a Delta Lake-driven platform, the advantages (and difficulties) of migrating away from Azure Data Factory, and why we are committing to Spark and Delta Lake as the core of our platform to support our mission: Making a Better World for Pets!
Key Takeaways: