Baburao is a Principal Data Scientist at GoDaddy Inc. and is passionate about using large-scale distributed systems for customer analytics and machine learning. He works on fundamental technologies in areas such as customer segmentation, feature development, natural language processing, and machine learning. Baburao is the author or co-author of over 20 published papers in analytics, geospatial machine learning
GoDaddy’s vision is to radically shift the global economy toward fulfilling independent ventures. At GoDaddy, we have developed an Apache Spark-based architecture that enables automatic feature generation from 61M domains of 16M customers using machine- and human-generated data. We analyze the signals from the generated features using ML pipelines to calculate the “GoDaddy Small Business Success Index”, churn, and LTV. In this presentation, we will walk through the development of the “GoDaddy Small Business Success Index” and the lessons we learned in building and running large-scale Spark ML applications successfully in production.
GoDaddy powers the world’s largest cloud platform dedicated to small, independent ventures. With more than 14 million customers worldwide and more than 63 million domain names under management, GoDaddy is the place people come to name their idea, build a professional website, attract customers, and manage their work. At GoDaddy, the Advanced Analytics team developed the Apache Spark-based Customer Success Dashboard to help customer and marketing teams. It is built on Apache Spark-based analytics and ETL pipelines that extract insights from customer data collected from the Internet, machine logs, and communications such as voice and text. In this session, GoDaddy will discuss how it uses Apache Spark as a distributed framework on which to build its own algorithms, generating features for Customer Success Dashboard recommendations for each of its 14 million customers. Learn about the specific techniques GoDaddy uses to scale, and the various pitfalls the team has found along the way. Session hashtag: #SFent1