My name is Ulukbek. I work as a data scientist on the recommendation team of HepsiBurada, the largest e-commerce platform in Turkey, where I am mostly responsible for analytics and modeling tasks. I have been with the recommendation team for one year. Previously, I worked at companies such as Vodafone, DenizBank and Insider.
I received my Master's degree from Istanbul Technical University; my thesis was on multi-document summarization.
My main research area is applying NLP algorithms to recommendation systems.
November 17, 2020 04:00 PM PT
We are the recommendation team at "hepsiburada.com", the largest e-commerce platform in Turkey and in the Middle East, and our work combines data engineering, machine learning and software engineering practices. Our aim is to serve our users relevant recommendations that are appropriate in terms of time, context and products.
One of the many recommendations we serve to our customers is "frequently bought together" products. Generating "frequently bought together" recommendations for millions of products and millions of customers is a challenging process that requires specific approaches, and the recommendation development team must take many steps to achieve this goal.
In this talk, we plan to explain the problems we have overcome, from training a model to productionizing it, tracking the metrics of the model in production and keeping the model updated.
The tips and tricks we will share with the community are as follows:
1. Embedding-based recommendation
- Context and arithmetic operation problems
- Pros and cons
- In which cases you may need dimensionality reduction
2. Offline Metrics
- Hyperparameter tuning and pre-production checks with MLflow
- ETL (PySpark + Oozie) and serving layer based on a continuous delivery mindset
4. Experimental UI
- Why do you need a manual control mechanism for such a product
5. Embedding Serving Layer
- kNN search
- hnswlib vs. others: pros and cons
- Programming language and environment selection for the serving layer
- Post-processing needs: metadata, filtering and sorting options
6. Online Metrics
- Why are online metrics better than offline ones
7. Do we need a more complex model or better tricks
- Time and position tricks can be better than a much more complex model
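
To make the embedding serving layer in the outline above more concrete, here is a minimal sketch of a nearest-neighbor lookup over product embeddings followed by a metadata-based post-processing filter. It is not the team's implementation: it uses exact brute-force cosine similarity in plain Python as a stand-in for an approximate index such as hnswlib, and the product names, vectors, the `in_stock` field and the `recommend` helper are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_id, embeddings, metadata, k=2):
    """Return up to k product ids most similar to query_id.

    Post-processing (hypothetical rule): skip the query product itself
    and any product whose metadata marks it as out of stock.
    """
    query = embeddings[query_id]
    candidates = []
    for pid, vec in embeddings.items():
        if pid == query_id or not metadata[pid]["in_stock"]:
            continue
        candidates.append((cosine(query, vec), pid))
    # Sort by similarity, highest first, and keep the top k ids.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in candidates[:k]]


# Toy, made-up embeddings and metadata for illustration only.
EMBEDDINGS = {
    "phone": [1.0, 0.0, 0.0],
    "phone_case": [0.9, 0.1, 0.0],
    "charger": [0.8, 0.2, 0.1],
    "screen_protector": [0.95, 0.05, 0.0],
    "blender": [0.0, 1.0, 0.0],
}
METADATA = {
    "phone": {"in_stock": True},
    "phone_case": {"in_stock": True},
    "charger": {"in_stock": True},
    "screen_protector": {"in_stock": False},
    "blender": {"in_stock": True},
}

print(recommend("phone", EMBEDDINGS, METADATA, k=2))
```

In this toy data the out-of-stock `screen_protector` is the closest vector to `phone`, so the filter is what keeps it out of the result. With a catalog of millions of products the linear scan would presumably be replaced by an approximate index, with the same filtering and sorting applied to the candidates it returns.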
Speakers: Mehmet Selman Sezgin and Ulukbek Attokurov