Jan completed his BA and MCS at Trinity College Dublin. During his studies, he worked as an intern at SAP, where he gained valuable experience with in-memory database systems, which led to his interest in big data technologies. In 2014, he started a PhD in FMG, TCD, focusing on optimising the resource utilisation of big data frameworks, namely MapReduce. In 2015, Jan started working as a big data engineer for Barclays Africa in Prague. He is now in charge of building internal big data engineering expertise and developing new tools and products, including Spline.
Data lineage tracking is one of the significant problems that financial institutions face when using modern big data tools. This presentation describes Spline, a data lineage tracking and visualization tool for Apache Spark. Spline captures and stores lineage information from internal Spark execution plans and visualizes it in a user-friendly manner. Session hashtag: #EUent3