VEGAS: The Missing Matplotlib for Scala/Apache Spark

In this talk, we’ll present techniques for visualizing large-scale machine learning systems in Spark. Netflix employs these techniques to understand and refine the machine learning models behind its famous recommender systems, which personalize the Netflix experience for its 99 million members around the world. Essential to these techniques is Vegas, a new OSS Scala library that aims to be the “missing Matplotlib” for Spark/Scala. We’ll talk about the design of Vegas and its usage in Scala notebooks to visualize machine learning models.
Session hashtag: #EUds0

About Roger Menezes

Roger works as a Senior Research Engineer at Netflix, where he uses large-scale machine learning algorithms to improve movie recommendations for Netflix's 100 million subscribers. Prior to Netflix, he applied machine learning and information retrieval algorithms to improve the user experience at Yahoo! and Microsoft Bing. Beyond ML, he is very interested in distributed computing, having had a brief stint at Amazon AWS.

About DB Tsai

DB Tsai is an Apache Spark PMC member and committer and a Senior Research Engineer working on personalized recommendation algorithms at Netflix. He implemented several algorithms in Apache Spark, including linear models with Elastic-Net (L1/L2) regularization using L-BFGS/OWL-QN optimizers. Prior to joining Netflix, DB was a Lead Machine Learning Engineer at Alpine Data Labs, where he led a team that developed innovative large-scale distributed learning algorithms and contributed them back to the open source Apache Spark project. DB was a Ph.D. candidate in Applied Physics at Stanford University. He holds a Master's degree in Electrical Engineering from Stanford.