Eric is a data scientist on the T-Mobile Marketing Solutions team. Eric has a background in mathematics and experience working on a diverse set of problems including computer vision, predictive maintenance, and marketing analytics.
June 23, 2020 05:00 PM PT
The development of big data products and solutions - at scale - brings many challenges to the teams of platform architects, data scientists, and data engineers. While it is easy to find ourselves working in silos, successful organizations intensively collaborate across disciplines such that problems can be understood, a proposed model and solution can be scaled and optimized on multi-terabytes of data. In this session, the T-Mobile Marketing Solutions (TMS) Data Science team will present a platform architecture and production framework supporting TMS internal products and services. Powered by Apache Spark technologies, these services operate in a hybrid of on-premises and cloud environments. As a showcase example, we will discuss key lessons learned and best practices from our Advertising Fraud Detection service. An important focus is on how we scaled data science algorithms outside of the Spark MLlib framework. We will also demonstrate various Spark optimization tips to improve product performance and utilization of MLflow for tracking and reporting. We hope to show the best practices we've learned from our journey of building end-to-end Big Data products.