GoDaddy Small Business Success Index Using Apache Spark

GoDaddy’s vision is to radically shift the global economy toward fulfilling independent ventures. At GoDaddy, we have developed an Apache Spark-based architecture that automatically generates features from 61M domains of 16M customers, using both machine-generated and human-generated data. We analyze the signals from these features with ML pipelines to calculate the “GoDaddy Small Business Success Index”, churn, and lifetime value (LTV). In this presentation, we will walk through the development of the “GoDaddy Small Business Success Index” and the lessons we learned in building and running large-scale Spark ML applications successfully in production.

About Baburao Kamble

Baburao is a Principal Data Scientist at GoDaddy Inc. and is passionate about using large-scale distributed systems for customer analytics and machine learning. He works on fundamental technologies in areas such as customer segmentation, feature development, natural language processing, and machine learning. Baburao is the author or co-author of over 20 published papers in analytics, geospatial machine learning