Yuliana Havryshchuk

Developer, Zillow

Yuliana is a software engineer on Zillow’s Data Governance team. After experiencing continuous data quality challenges in critical business pipelines, she built a proof of concept that grew into a platform that today serves all data engineering teams across Zillow. She is currently focused on expanding tools and services to bring data quality monitoring to Zillow’s large-scale data pipelines in a way that is accessible to all data users, from engineers to product managers. Yuliana holds a Bachelor of Mathematics in Computer Science and Statistics from the University of Waterloo.

Past sessions

Summit 2021 Democratizing Data Quality Through a Centralized Platform

May 27, 2021 03:15 PM PT

Bad data leads to bad decisions and broken customer experiences. Organizations depend on complete and accurate data to power their business, maintain efficiency, and uphold customer trust. With thousands of datasets and pipelines running, how do we ensure that all data meets quality standards, and that expectations are clear between producers and consumers? Investing in shared, flexible components and practices for monitoring data health is crucial for a complex data organization to rapidly and effectively scale.

At Zillow, we built a centralized platform to meet our data quality needs across stakeholders. The platform is accessible to engineers, scientists, and analysts, and seamlessly integrates with existing data pipelines and data discovery tools. In this presentation, we will provide an overview of our platform’s capabilities, including:

  • Giving producers and consumers the ability to define and view data quality expectations using a self-service onboarding portal 
  • Performing data quality validations using libraries built to work with Spark
  • Dynamically generating pipelines that can be abstracted away from users
  • Flagging data that doesn’t meet quality standards at the earliest stage and giving producers the opportunity to resolve issues before use by downstream consumers
  • Exposing data quality metrics alongside each dataset to provide producers and consumers with a comprehensive picture of health over time
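
To make the idea of declared expectations and early flagging concrete, here is a minimal sketch in plain Python. It is not Zillow's platform or its Spark libraries: the `Expectation` class, the `validate` function, the `max_failure_rate` threshold, and the sample `listings` rows are all hypothetical names invented for illustration. In a Spark-based setting the same checks would run over DataFrames rather than lists of dictionaries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Expectation:
    """A hypothetical data quality expectation on a single column."""
    name: str
    column: str
    check: Callable[[object], bool]   # returns True when a value is acceptable
    max_failure_rate: float = 0.0     # tolerated share of failing rows


def validate(rows: Iterable[dict], expectations: List[Expectation]) -> Dict[str, dict]:
    """Evaluate each expectation and report its failure rate and pass/fail status."""
    rows = list(rows)
    report = {}
    for exp in expectations:
        failures = sum(1 for row in rows if not exp.check(row.get(exp.column)))
        rate = failures / len(rows) if rows else 0.0
        report[exp.name] = {
            "failure_rate": rate,
            "passed": rate <= exp.max_failure_rate,
        }
    return report


# Illustrative sample data (made up for this sketch).
listings = [
    {"listing_id": 1, "price": 350000},
    {"listing_id": 2, "price": None},
    {"listing_id": 3, "price": -10},
    {"listing_id": 4, "price": 420000},
]

expectations = [
    Expectation("listing_id_not_null", "listing_id", lambda v: v is not None),
    Expectation("price_positive", "price",
                lambda v: v is not None and v > 0, max_failure_rate=0.25),
]

report = validate(listings, expectations)

# Flag failing expectations so a producer could act before consumers read the data.
for name, result in report.items():
    status = "OK" if result["passed"] else "FLAGGED"
    print(f"{name}: {status} (failure rate {result['failure_rate']:.2f})")
```

Keeping each expectation as a small declarative record is one way a self-service portal could store the definitions, and the per-expectation failure rate is the kind of metric that could be shown alongside a dataset over time.
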

In this session watch:
Yuliana Havryshchuk, Developer, Zillow
Smit Shah, Senior Software Engineer, Big Data, Zillow