Ranking article comments using reinforcement learning

Lester Solbakken • Haystack Europe 2019


When you allow users to comment on your content pages, not every comment is worth showing, and deciding which ones to show is a difficult problem. In this talk we’ll show how this problem was attacked using reinforcement learning at serving time on Yahoo content sites, using the Vespa open source platform to create a scalable production solution.

Yahoo properties such as Yahoo Finance, News and Sports allow users to comment on articles, as many other apps and websites do. To support this, the team needed a system that can add, find, count and serve comments at scale in real time. Not all comments are equally interesting or relevant, though, and some articles can have hundreds of thousands of comments, so a good commenting system must also choose the right comments to show to users viewing the article. To accomplish this, the system must observe what users are doing and learn how to pick comments that are interesting.

Here we’ll explain how this problem was solved for Yahoo properties by using Vespa, the open source big data serving engine. We’ll start with the basics and then show how comment selection using neural nets and reinforcement learning was implemented, which increased the average time spent per user on the sites.
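To make the idea of learning from user behaviour at serving time concrete, here is a minimal sketch of an epsilon-greedy selection loop in Python. It is a deliberate simplification and not the implementation presented in the talk: it replaces the neural net with simple per-comment engagement counters, and the names `CommentStats`, `rank_comments`, `record_feedback` and the `epsilon` parameter are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class CommentStats:
    """Hypothetical per-comment feedback counters (an assumption, not from the talk)."""
    shown: int = 0    # how many times the comment was displayed
    engaged: int = 0  # how many times a user interacted with it

    def estimated_value(self) -> float:
        # Smoothed engagement rate; the +1/+2 prior avoids division by zero
        # and gives unseen comments a neutral starting value of 0.5.
        return (self.engaged + 1) / (self.shown + 2)


def rank_comments(stats, k, epsilon=0.1, rng=random):
    """Return k comment ids: the best-known ones, with occasional exploration."""
    by_value = sorted(
        stats, key=lambda cid: stats[cid].estimated_value(), reverse=True
    )
    chosen = by_value[:k]
    rest = by_value[k:]
    # Explore: with probability epsilon, replace the last slot with a comment
    # that would otherwise not be shown, so the system can learn about it.
    if chosen and rest and rng.random() < epsilon:
        chosen[-1] = rng.choice(rest)
    return chosen


def record_feedback(stats, comment_id, engaged):
    """Update the counters after a comment has been shown to a user."""
    entry = stats[comment_id]
    entry.shown += 1
    if engaged:
        entry.engaged += 1
```

The exploration step is what lets such a system discover good comments it has little data on; without it, comments that happen to rank low early would never be shown again and their estimates would never improve.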

Lester Solbakken is a Principal Software Engineer at Verizon Media (previously Yahoo) working on the vespa.ai platform, the open source big data serving engine. His primary focus areas are machine learning engineering with emphasis on serving and search system ranking. Lester previously pursued a PhD within Artificial Intelligence and Machine Learning with neural networks, exploratory data analysis and self-organizing systems as main research topics.