Taylor Hess is currently a senior quantitative analyst on Morningstar’s Quantitative Research team. His work consists of architecting large statistical models, putting quantitative research and machine learning models into production, and exploring new technologies. He has extensive experience designing and building solutions with cloud computing technologies.
Morningstar's Risk Model project stitches together statistical and machine learning models to produce risk and performance metrics for millions of financial securities. Previously, we ran a single version of this application, but we needed to expand it to allow for customizations based on client demand. With the goal of running hundreds of custom Risk Model runs at once, each producing around 1 TB of output data, we had a challenging technical problem on our hands! In this presentation, we'll talk about the challenges we faced replatforming this application to Spark, how we solved them, and the benefits we saw.
Some things we'll touch on include how we created customized models, the architecture of our machine learning application, how we maintain an audit trail of data transformations (for rigorous third-party audits), and how we validate both the input data our model takes in and the output data it produces. We want attendees to walk away with some key ideas about what worked for us when productizing a large-scale machine learning platform.