I have been a Data Engineer at Runtastic (Linz, Austria) since January 2020. Previously, I worked as a researcher at Fondazione Bruno Kessler (Trento, Italy) on the topic of security testing.
November 17, 2020 04:00 PM PT
Data & ML projects bring many new complexities beyond the traditional software development lifecycle. Unlike traditional software projects, they cannot be left alone once they have been successfully delivered and deployed; they must be continuously monitored to verify that model performance still satisfies all requirements. New data with new statistical characteristics can arrive at any time and break our pipelines or affect model performance.
All these qualities of data & ML projects make continuous testing and monitoring of our models and pipelines a necessity. In this talk we will show how CI/CD Templates can simplify these tasks: bootstrapping a new data project within a minute, setting up a CI/CD pipeline using GitHub Actions, and implementing integration tests on Databricks. All this is possible because of the conventions introduced by CI/CD Templates, which help automate the deployment and testing of abstract data pipelines and ML models.
Runtastic uses CI/CD Templates to automate the deployment of its Databricks pipelines. During this webinar, Emanuele Viglianisi, Data Engineer at Runtastic, will show how Runtastic uses CI/CD Templates in day-to-day development to run, test, and deploy pipelines directly from the PyCharm IDE to Databricks. Emanuele will present the challenges Runtastic has faced and how the team solved them by integrating CI/CD Templates into its workflow.
Speakers: Michael Shtelma and Emanuele Viglianisi