Patrick Caldon is a director in the Quantitative Research team at Morningstar. He leads the research effort on Morningstar’s Risk Model platform. His main research interest is using statistical techniques to uncover latent patterns in financial markets data. He previously developed research methodologies for the Australian hybrids market and US municipal bonds. Patrick holds a PhD from the University of New South Wales, where he worked on applications of logic in artificial intelligence.
Morningstar's Risk Model project stitches together statistical and machine learning models to produce risk and performance metrics for millions of financial securities. Previously, we ran a single version of this application, but we needed to expand it to support customizations driven by client demand. With the goal of executing hundreds of custom Risk Model runs at once, each producing around 1 TB of output data, we had a challenging technical problem on our hands! In this presentation, we'll talk about the challenges we faced replatforming this application to Spark, how we solved them, and the benefits we saw.
Topics we'll touch on include how we created customized models, the architecture of our machine learning application, how we maintain an audit trail of data transformations (for rigorous third-party audits), and how we validate both the input data our model takes in and the output data it produces. We want attendees to walk away with some key ideas about what worked for us when productizing a large-scale machine learning platform.