Building an open-source online Learn-to-rank engine

Roman Grebennikov • Location: TUECHTIG • Haystack EU 2022

Relevancy is subjective. The same items in search results for a “jeans” query may have completely opposite value for you and me, since we differ in size, shape, and taste.

But leveraging past visitor behavior for LTR tasks often turns into a hard data engineering challenge once you want to use complex ML features in your ranking. To implement advanced things like sliding window counters, per-item conversion and click-through rates, and customer profile tracking, all working both online and offline, you need a whole team of DS/DE/MLE people and a lot of time to glue things together!

We got tired of reinventing the LTR wheel again and again, so we present Metarank, an open-source personalization service that handles the most typical data and feature engineering tasks. It takes an event stream describing your visitor behavior, maps it to the most common ML features, and reorders items in real time to maximize a goal such as CTR. All you need is a YAML config and a bit of JSON I/O.
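To make the event-stream-to-ranking idea concrete, here is a minimal sketch of one feature mentioned above: a per-item CTR computed over a sliding time window, then used to reorder candidates. It is not Metarank's API or implementation. The event shape, the `WindowedCTR` class, the `rerank` function, and the one-hour window are all hypothetical names and values chosen for illustration. A real learn-to-rank setup would feed many such features into a trained model, whereas this sketch sorts by the single feature directly.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # hypothetical one-hour sliding window


class WindowedCTR:
    """Per-item click-through rate over a sliding time window (illustrative only)."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        # item id -> deque of (timestamp, event_type) pairs, oldest first
        self.events = defaultdict(deque)

    def observe(self, event):
        """Consume one event dict with assumed keys "type", "item", and "ts"."""
        self.events[event["item"]].append((event["ts"], event["type"]))

    def ctr(self, item, now):
        """Clicks divided by impressions inside the window; 0.0 with no impressions."""
        q = self.events[item]
        # Evict events that have fallen out of the window.
        while q and q[0][0] <= now - self.window:
            q.popleft()
        impressions = sum(1 for _, kind in q if kind == "impression")
        clicks = sum(1 for _, kind in q if kind == "click")
        return clicks / impressions if impressions else 0.0


def rerank(items, counter, now):
    """Order candidate items by windowed CTR, highest first; ties keep input order."""
    return sorted(items, key=lambda item: counter.ctr(item, now), reverse=True)


# Usage: replay a tiny event stream, then reorder two candidates.
counter = WindowedCTR()
stream = [
    {"type": "impression", "item": "jeans-a", "ts": 100},
    {"type": "impression", "item": "jeans-b", "ts": 100},
    {"type": "click", "item": "jeans-b", "ts": 105},
]
for event in stream:
    counter.observe(event)

ranked = rerank(["jeans-a", "jeans-b"], counter, now=200)
```

Keeping raw events in a deque makes eviction simple but costs memory proportional to traffic; a production system would more likely keep bucketed counters per time slice.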

Roman Grebennikov

An independent search engineer, working on relevancy, personalization and recommendations. A pragmatic fan of open-source software, functional programming, learn-to-rank models and performance tuning.