Mehmet Selman Sezgin

Senior Data Engineer, HepsiBurada

Multi-talented Senior Software & Data Engineer with a record of successfully completing simultaneous projects. Willing to jump in to develop “outside the box” solutions. Talented project leader and complex-problem solver with a results-focused, driven approach. Data enthusiast.

Past sessions

Summit Europe 2020 Frequently Bought Together Recommendations Based on Embeddings

November 17, 2020 04:00 PM PT

We are the recommendation team at hepsiburada.com, the largest e-commerce platform in Turkey and the Middle East, where we combine Data Engineering, Machine Learning and Software Engineering practices. Our aim is to give our users relevant recommendations that are appropriate in terms of time, context and products.

One of the many recommendations we serve to our clients is "frequently bought together" products. Generating these recommendations for millions of products and millions of customers is a challenging process that requires specific approaches, and the recommendation development team must take many steps to achieve this goal.

In this talk, we plan to explain the problems we have overcome, from training a model to productionizing it, tracking its metrics in production and keeping it updated.

The tips and tricks we will share with the community are as follows:
1. Embedding based recommendation
- Context and arithmetic operation problems
- Pros and Cons
- Cases in which you may need dimensionality reduction
2. Offline Metrics
- Hyperparameter tuning and pre-production checks with MLflow
3. Pipeline
- ETL (PySpark + Oozie) and a serving layer built with a continuous delivery mindset
4. Experimental UI
- Why do you need a manual control mechanism for such a product
5. Embedding Serving Layer
- kNN search
- Hnswlib vs. others: pros and cons
- Programming language and environment selection for the serving layer
- Post-processing needs: metadata, filtering and sorting options
6. Online Metrics
- Why are online metrics better than offline ones
7. Do we need a more complex model or better tricks
- Time and position tricks can be better than a much more complex model
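
To make the embedding-based idea in items 1 and 5 concrete, here is a minimal sketch of nearest-neighbour retrieval over product embeddings. It is an illustration under stated assumptions, not the team's implementation: the product IDs, the 3-dimensional vectors and the function names are all hypothetical, and an exact brute-force cosine search stands in for an approximate index such as Hnswlib. The `exclude` argument is a stand-in for the post-processing filter mentioned in item 5.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def bought_together(query_id, embeddings, k=2, exclude=frozenset()):
    """Return the k product IDs whose embeddings are closest to the query.

    Exact (brute-force) search: every candidate is scored, so cost grows
    linearly with catalogue size. A production system would likely swap
    this for an approximate nearest-neighbour index.
    """
    query = embeddings[query_id]
    scored = [
        (cosine(query, vec), pid)
        for pid, vec in embeddings.items()
        # Skip the query product itself and anything filtered out
        # (hypothetical filter, e.g. out-of-stock items).
        if pid != query_id and pid not in exclude
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [pid for _, pid in scored[:k]]


# Toy embeddings; IDs and values are made up for illustration.
EMBEDDINGS = {
    "phone": [0.9, 0.1, 0.0],
    "phone_case": [0.8, 0.2, 0.1],
    "charger": [0.7, 0.3, 0.0],
    "blender": [0.0, 0.1, 0.9],
}

print(bought_together("phone", EMBEDDINGS, k=2))
print(bought_together("phone", EMBEDDINGS, k=2, exclude={"phone_case"}))
```

With these toy vectors the first call ranks the phone case and the charger ahead of the blender, and the second call shows how filtering changes the result after similarity scoring.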

Speakers: Mehmet Selman Sezgin and Ulukbek Attokurov