Haystack EU 2022

Talks from the Search & Relevance Community at the Haystack Conference!

Please read the Event Safety, Code of Conduct and COVID-19 Policy.

Our venue is TUECHTIG Berlin, Oudenarder Straße 16, 13347 Berlin

Tuesday, September 27th, 2022

Time Track 1
8:00-9:00am Registration
9:00-9:15am Introduction and Welcome
9:15-10:00am Keynote

Dmitry Kan

10:00am-10:45am Fine-tuning for Vector Search

The "hard part" of applied vector search with dense retrieval is often building an embedding model that works for the user's target domain. Unfortunately, it's not always clear where to start. This talk summarizes the most popular fine-tuning methods for semantic search and QA applications: when and where each should be used, its intended purpose, and its data requirements. It covers methods such as:

* MSE-loss, MNR-loss where labeled data is available.
* Multilingual knowledge distillation for transferring semantic knowledge into new languages using translation pairs data.
* Unsupervised semantic-similarity methods like TSDAE.
* Augmentation of small datasets with AugSBERT.
* Unsupervised QA methods like GenQ and GPL.

At the end of this, the audience should have a good grasp of when and where to use the different training methods for their embedding models based on their use-case and training data.
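As a rough illustration of the MNR loss (multiple negatives ranking loss) named in the list above, the sketch below computes it in plain Python for a batch of (anchor, positive) embedding pairs. It is only a sketch under stated assumptions: the function names, the cosine similarity, and the `scale` temperature are illustrative choices, not taken from the talk, and a real fine-tuning run would use a deep learning framework with gradients.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Assumes a non-zero vector.
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the match for anchors[i]
    and every other positive in the batch acts as a negative.
    `scale` is a hypothetical temperature applied to cosine similarity."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * dot(anchor, cand) for cand in p]
        # Numerically stable log-sum-exp over all candidates in the batch.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        total += log_sum_exp - logits[i]
    return total / len(a)

# Toy batch: aligned pairs should score a much lower loss than mismatched ones.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
mismatched = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

The loss only needs positive pairs, because the negatives come for free from the rest of the batch.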

James Briggs

10:45-11:00am Break
11:00am-11:45am Building an open-source online Learn-to-rank engine

Relevancy is subjective. The same items in the search results for a “jeans” query may have completely opposite value for you and me, as we differ in size, shape, and taste. But leveraging past visitor behavior for LTR tasks often becomes a difficult data engineering challenge when you want to use complex ML features in your ranking. Implementing advanced things like sliding window counters, per-item conversion and CTR rates, and customer profile tracking that work both online and offline takes a whole team of DS/DE/MLE people and a lot of time to glue things together! We got tired of reinventing the LTR wheel again and again, and present Metarank, an open-source personalization service that handles the most typical data and feature engineering tasks. It takes an event stream describing your visitor behavior, maps it to the most common ML features, and reorders items in real time to maximize a goal such as CTR. All you need is a YAML config and a bit of JSON I/O.
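To make the "sliding window counter" and per-item CTR features mentioned above concrete, here is a minimal Python sketch. It is not Metarank's implementation or API: the class name, the event kinds, and the assumption that events arrive in non-decreasing timestamp order are all hypothetical choices for illustration.

```python
from collections import deque

class SlidingWindowCTR:
    """Click-through rate for one item over the last `window_seconds`.
    Assumes events arrive with non-decreasing timestamps."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, kind) in arrival order
        self.counts = {"impression": 0, "click": 0}

    def add(self, timestamp, kind):
        self.events.append((timestamp, kind))
        self.counts[kind] += 1
        self._evict(timestamp)

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, kind = self.events.popleft()
            self.counts[kind] -= 1

    def ctr(self, now):
        self._evict(now)
        impressions = self.counts["impression"]
        return self.counts["click"] / impressions if impressions else 0.0

window = SlidingWindowCTR(window_seconds=60)
window.add(0, "impression")
window.add(10, "impression")
window.add(20, "click")
```

Keeping running counts next to the queue means each event is added and evicted once, so a feature lookup stays cheap even under a steady event stream.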

Roman Grebennikov

11:45am-12:30pm A practical approach to measuring the relevance and preventing regressions

In this session we present a practical approach to implementing a component that automatically measures the relevance of an e-commerce search engine, helps configure it, and prevents relevance regressions. Our component is based on the Normalized Discounted Cumulative Gain (NDCG) metric and is integrated into a real-world, large-scale e-commerce search engine. We share our experience and present the practical details of its implementation (log collection, integration with the search engine administration interface, and deployment automation) and give an idea of its return on investment (ROI).
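For readers unfamiliar with NDCG, the sketch below shows one common formulation (exponential gain, logarithmic discount) and a simple regression check built on it. It is an illustration only: the talk does not say which NDCG variant the speakers use, and `is_regression` with its `tolerance` parameter is a hypothetical helper.

```python
import math

def dcg(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg(relevances, k):
    """DCG divided by the DCG of the ideal ordering of the same labels."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

def is_regression(baseline, candidate, k, tolerance=0.01):
    """Hypothetical check: flag a candidate ranking whose NDCG@k drops
    by more than `tolerance` compared with the baseline ranking."""
    return ndcg(candidate, k) < ndcg(baseline, k) - tolerance

# Graded labels (0 = irrelevant, 3 = perfect) in ranked order for one query.
baseline_ranking = [3, 2, 0, 1]
candidate_ranking = [0, 2, 3, 1]
flagged = is_regression(baseline_ranking, candidate_ranking, k=4)
```

In practice the score would be averaged over many logged queries before comparing two engine configurations.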

Aline Paponaud & Roudy Khoury

12:30pm-1:30pm Lunch
1:30pm-2:15pm Lowering the entry threshold for Neural Vector Search by applying Similarity Learning

Neural search is the hottest trend in the area of search engines. It is not debatable. Training a neural network, however, is thought to require a lot of annotated data, and pretrained models used without any further fine-tuning may struggle in some specific domains. Similarity learning addresses most of these issues and allows training your models in hours or days, not months, with much lower data requirements, even in a rapidly changing environment. This talk will show how to design an end-to-end pipeline with open-source tools and describe some common pitfalls and ways to avoid them.

Kacper Łukawski

2:15pm-3:00pm Women of Search

A diverse workforce means happier and healthier employees and helps access the huge range of talents and skills required for organisations to thrive. The search & information retrieval sector, like many other areas of technology, can struggle to attract and retain women despite their visible achievements as pioneers, creators and maintainers of leading search technologies. In this session we’ll lay out the issues and present the results of a brief survey. We’ll then hear from women working in search and celebrate their amazing achievements, before discussing next steps towards a more inclusive future. Hosted by Atita Arora of OpenSource Connections and the Women of Search group in Relevance Slack.

Atita Arora

3:00-3:15pm Break
3:15pm-4:00pm Increasing relevant product recall through smart use of customer behavioral data

Kramp is a B2B wholesale company in the agricultural sector, selling over 1.7 million technical products across 26 countries. Most of our business runs through our digital channels, where our customers need to find and order the right parts and products as quickly as possible. Our key challenge is poor product data quality, which results in customers not being able to find the products they need, even though Kramp sells them. To improve recall for specific queries, we use customer behavioral data to enrich search results. In short, we track search behavior and query refinements during a 'search session', which lets us identify patterns that relate the products customers interact with to their earlier unsuccessful search queries. We ran A/B tests with this feature and saw a clear positive impact on key metrics such as product interactions, add-to-carts, and zero-result pages. But we also found some interesting negative side effects, which we plan to tackle in the near future.
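The session-based idea described above could be sketched roughly as follows. This is not Kramp's implementation: the event format, the definition of an "unsuccessful" query as one with zero results, and the `min_support` threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def learn_query_product_links(sessions, min_support=2):
    """Relate zero-result queries to products interacted with later in the
    same session. Each event is ("search", query, result_count) or
    ("interact", product_id); this event shape is hypothetical."""
    counts = defaultdict(Counter)
    for events in sessions:
        failed_queries = set()
        seen_pairs = set()  # count each (query, product) once per session
        for event in events:
            if event[0] == "search":
                _, query, result_count = event
                if result_count == 0:
                    failed_queries.add(query)
            elif event[0] == "interact":
                product_id = event[1]
                for query in failed_queries:
                    if (query, product_id) not in seen_pairs:
                        seen_pairs.add((query, product_id))
                        counts[query][product_id] += 1
    # Keep only links seen in at least `min_support` sessions.
    links = {}
    for query, products in counts.items():
        kept = [pid for pid, n in products.most_common() if n >= min_support]
        if kept:
            links[query] = kept
    return links

sessions = [
    [("search", "hydrolic hose", 0), ("search", "hydraulic hose", 12), ("interact", "P1")],
    [("search", "hydrolic hose", 0), ("search", "hose", 40), ("interact", "P1"), ("interact", "P2")],
    [("search", "hydraulic hose", 12), ("interact", "P3")],
]
links = learn_query_product_links(sessions)
```

A support threshold like this is one simple guard against linking a query to products that a single customer happened to browse afterwards.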

Eric Rongen & Jelmer Krauwer

4:00pm-4:45pm An unbiased Neural Ranking Model for Product Search

At Otto, we are currently testing neural networks for the LTR task due to their success in other machine learning areas. In order to serve diverse customer needs and satisfy the big data requirements of deep neural networks, we leverage multiple implicit customer feedback signals, like clicks and orders, from our tracklogs. Since such signals exhibit a strong bias towards the current ranking, we train a separate bias estimator to cancel out this so-called position bias. Moreover, our architecture comprises an encoder, which generates embeddings of user queries and product descriptions to compare their similarity beyond exact text matches. In the end, a transformer-based scoring function that models competition between products in the SERP provides the final ranking score. To learn from the different relevance signals prevalent in our tracklogs, the ranking function is embedded in a multitask learning framework.
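To illustrate why position bias matters, the sketch below corrects click rates with inverse propensity weighting. This is a simpler, swapped-in technique and not the trained bias estimator described in the abstract: the function name, the data format, and the fixed per-position examination probabilities are assumptions for illustration.

```python
from collections import defaultdict

def debiased_click_rates(impressions, propensity):
    """impressions: iterable of (product_id, position, clicked).
    propensity[position]: assumed probability that a user examines that
    position. Each click is up-weighted by 1 / propensity so that products
    shown in rarely examined positions are not penalized."""
    weighted_clicks = defaultdict(float)
    shown = defaultdict(int)
    for product_id, position, clicked in impressions:
        shown[product_id] += 1
        if clicked:
            weighted_clicks[product_id] += 1.0 / propensity[position]
    return {pid: weighted_clicks[pid] / shown[pid] for pid in shown}

# Hypothetical propensities: position 2 is examined half as often as position 1.
propensity = {1: 1.0, 2: 0.5}
impressions = (
    [("A", 1, True)] * 2 + [("A", 1, False)] * 2    # raw CTR 0.50
    + [("B", 2, True)] * 1 + [("B", 2, False)] * 3  # raw CTR 0.25
)
rates = debiased_click_rates(impressions, propensity)
```

After the correction both products get the same estimated rate, which suggests the raw CTR gap came from position rather than relevance.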

Laurin Luttman