We will discuss what feature engineering is all about, the various techniques available, and how to scale to 20,000-column datasets using random forests, SVD, and PCA. We will also demonstrate how to build a service around these techniques to save time and effort when building hundreds of models, and we will share how we did all this with Spark ML to build logistic regression, neural networks, Bayesian networks, and more.
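To illustrate the kind of dimensionality reduction the talk covers, here is a minimal NumPy sketch of PCA via SVD on a toy wide dataset (a stand-in for the 20,000-column case; the talk itself uses Spark ML, and the array sizes and variable names below are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy wide dataset: 100 rows x 50 feature columns
# (a small stand-in for a 20,000-column dataset).
X = rng.normal(size=(100, 50))

# Center each column, then take the SVD; the rows of Vt are
# the principal directions, ordered by singular value.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Keep the top-k components and project the data onto them,
# shrinking 50 columns down to k engineered features.
k = 10
X_reduced = Xc @ Vt[:k].T

print(X_reduced.shape)  # (100, 10)
```

In Spark ML the same step would typically use the built-in `PCA` transformer on an assembled feature vector column, but the linear algebra is the same.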
Session hashtag: #EUds12
At Comcast NBCUniversal, Nabeel Sarwar operationalizes machine learning pipelines aimed at improving customer experience, operations, field services, and everything in between. In the process, he oversees data ingest, feature engineering, and the generation and deployment of AI models. He has a BA in astrophysics from Princeton University.