Keerthika Thiyagaran, currently a Software Development Engineer 3, has worked on Flipkart's Financial Data Engineering team for the past 5 years.
May 27, 2021 11:35 AM PT
The availability of high-quality data is central to the success of any organization in the current era. As every organization ramps up its collection and storage of data, the usefulness of that data largely depends on confidence in its quality. In the Financial Data Engineering team at Flipkart, where the bar for data quality is 100% correctness and completeness, this problem takes on a wholly different dimension. Currently, a large number of data analysts and engineers search the financial data for issues to maintain that bar. We wanted an approach that is less manual, more scalable, and more cost-effective.
As we evaluated various solutions available in the public domain, we found quite a few gaps.
In this presentation, we discuss how we developed a comprehensive data quality framework. The framework was built on the assumption that the people interested in and involved in fixing these issues are not necessarily data engineers. It is largely config-driven, with pluggable logic for categorisation and cleaning. We will then talk about how it helped us fix data quality issues at scale and reduce many recurring issues.
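To make the idea of a config-driven framework with pluggable categorisation and cleaning logic more concrete, here is a minimal Python sketch. It is not the Flipkart framework: every name in it (`CATEGORISERS`, `CLEANERS`, `CONFIG`, `run_checks`, and the individual rules) is hypothetical, and the rules themselves are purely illustrative. The point is only that the checks to run can live in configuration that a non-engineer could edit, while the logic is registered separately by name.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]

# Hypothetical registries of pluggable logic, keyed by the name used in config.
CATEGORISERS: Dict[str, Callable[[Record, str], Optional[str]]] = {}
CLEANERS: Dict[str, Callable[[Record, str], Record]] = {}


def categoriser(name: str):
    """Register a function that labels an issue, or returns None if there is none."""
    def register(fn):
        CATEGORISERS[name] = fn
        return fn
    return register


def cleaner(name: str):
    """Register a function that returns a repaired copy of the record."""
    def register(fn):
        CLEANERS[name] = fn
        return fn
    return register


@categoriser("missing_value")
def missing_value(record: Record, field: str) -> Optional[str]:
    return "MISSING" if record.get(field) is None else None


@categoriser("negative_amount")
def negative_amount(record: Record, field: str) -> Optional[str]:
    value = record.get(field)
    return "NEGATIVE" if isinstance(value, (int, float)) and value < 0 else None


@cleaner("default_zero")
def default_zero(record: Record, field: str) -> Record:
    return {**record, field: 0}


@cleaner("absolute_value")
def absolute_value(record: Record, field: str) -> Record:
    return {**record, field: abs(record[field])}


# Illustrative config: which check runs on which field, and the optional fix.
CONFIG: List[Dict[str, str]] = [
    {"field": "amount", "categoriser": "missing_value", "cleaner": "default_zero"},
    {"field": "amount", "categoriser": "negative_amount", "cleaner": "absolute_value"},
]


def run_checks(
    records: List[Record], config: List[Dict[str, str]]
) -> Tuple[List[Record], List[Dict[str, object]]]:
    """Apply each configured rule to each record; return cleaned records and issues."""
    cleaned: List[Record] = []
    issues: List[Dict[str, object]] = []
    for index, record in enumerate(records):
        for rule in config:
            field = rule["field"]
            category = CATEGORISERS[rule["categoriser"]](record, field)
            if category is None:
                continue
            issues.append({"row": index, "field": field, "category": category})
            cleaner_name = rule.get("cleaner")
            if cleaner_name:
                record = CLEANERS[cleaner_name](record, field)
        cleaned.append(record)
    return cleaned, issues


records = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -5.0},
]
cleaned, issues = run_checks(records, CONFIG)
```

In this sketch, adding a new check means registering one function and adding one entry to `CONFIG`; the runner itself does not change. A real deployment would presumably load the config from a file and run against a distributed data store rather than an in-memory list, which this example does not attempt to show.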