Jan Neumann leads the Comcast Applied Artificial Intelligence Research group, with team members in Washington, DC, Philadelphia, Chicago, Denver and Silicon Valley. His team combines large-scale machine learning, deep learning, NLP and computer vision to develop novel algorithms and product concepts that improve the experience of Comcast’s customers, including the X1 voice remote and personalization features, virtual assistants and predictive intelligence for customer service, and smart video and sensor analytics.
Before Comcast, he worked for Siemens Corporate Research on various computer vision projects such as driver assistance systems and video surveillance. He has published over 20 papers in scientific conferences and journals, and is a frequent speaker on machine learning and data science. He holds a Ph.D. in Computer Science from the University of Maryland, College Park.
"Comcast is the largest cable and internet provider in the US, reaching more than 30 million customers. Over the last couple years, Comcast has transformed the customer experience using machine learning. For example, Comcast uses machine learning to power the X1 voice remote, which was used over 8B times in 2018 by our customers to find something they love to watch, get the latest sports statistics, control their home, or check their bill and troubleshoot their service using natural language. What all these different applications have in common is that to create and operate the machine learning models powering these applications we need to ingest many TBs of data on daily basis in an efficient and resilient manner, and need a machine learning platform that allows for fast exploration of new ideas while at the same time automatic deployment of the resulting machine learning models into a production environment that can handle Comcast scale. In this talk we describe our data and machine learning infrastructure built on Databricks Unified Analytics Platform including how Databricks Delta is used for the ingest and initial processing of the raw telemetry from our video and voice applications and devices. We then explain how this data can be used by both the product organizations to gain deeper insights into how our products are being used, as well as by our research and engineering teams to train and fuel the machine learning models at the heart of of these products. This keynote will also include an end-to-end demonstration of our machine learning platform that is centered around Databricks and MLFlow and how it integrates with other open source machine learning frameworks such as Tensorflow, PyTorch, Sklearn, H20 and Kubeflow to name a few."