Click-based Hot Fixes for Underperforming Torso Queries
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F16%3A00306241" target="_blank" >RIV/68407700:21230/16:00306241 - isvavai.cz</a>
Výsledek na webu
<a href="http://dx.doi.org/10.1145/2911451.2911500" target="_blank" >http://dx.doi.org/10.1145/2911451.2911500</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1145/2911451.2911500" target="_blank" >10.1145/2911451.2911500</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Click-based Hot Fixes for Underperforming Torso Queries
Popis výsledku v původním jazyce
Ranking documents using their historical click-through rate (CTR) can improve relevance for frequently occurring queries, i.e., so-called head queries. It is difficult to use such click signals on non-head queries as they receive fewer clicks. In this paper, we address the challenge of dealing with torso queries on which the production ranker is performing poorly. Torso queries are queries that occur frequently enough so that they are not considered as tail queries and yet not frequently enough to be head queries either. They comprise a large portion of most commercial search engines' traffic, so the presence of a large number of underperforming torso queries can harm the overall performance significantly. We propose a practical method for dealing with such cases, drawing inspiration from the literature on learning to rank (LTR). Our method requires relatively few clicks from users to derive a strong re-ranking signal by comparing document relevance between pairs of documents instead of using absolute numbers of clicks per document. By infusing a modest amount of exploration into the ranked lists produced by a production ranker and extracting preferences between documents, we obtain substantial improvements over the production ranker in terms of page-level online metrics. We use an exploration dataset consisting of real user clicks from a large-scale commercial search engine to demonstrate the effectiveness of the method. We conduct further experimentation on public benchmark data using simulated clicks to gain insight into the inner workings of the proposed method. Our results indicate a need for LTR methods that make more explicit use of the query and other contextual information.
Název v anglickém jazyce
Click-based Hot Fixes for Underperforming Torso Queries
Popis výsledku anglicky
Ranking documents using their historical click-through rate (CTR) can improve relevance for frequently occurring queries, i.e., so-called head queries. It is difficult to use such click signals on non-head queries as they receive fewer clicks. In this paper, we address the challenge of dealing with torso queries on which the production ranker is performing poorly. Torso queries are queries that occur frequently enough so that they are not considered as tail queries and yet not frequently enough to be head queries either. They comprise a large portion of most commercial search engines' traffic, so the presence of a large number of underperforming torso queries can harm the overall performance significantly. We propose a practical method for dealing with such cases, drawing inspiration from the literature on learning to rank (LTR). Our method requires relatively few clicks from users to derive a strong re-ranking signal by comparing document relevance between pairs of documents instead of using absolute numbers of clicks per document. By infusing a modest amount of exploration into the ranked lists produced by a production ranker and extracting preferences between documents, we obtain substantial improvements over the production ranker in terms of page-level online metrics. We use an exploration dataset consisting of real user clicks from a large-scale commercial search engine to demonstrate the effectiveness of the method. We conduct further experimentation on public benchmark data using simulated clicks to gain insight into the inner workings of the proposed method. Our results indicate a need for LTR methods that make more explicit use of the query and other contextual information.
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
IN - Informatika
OECD FORD obor
—
Návaznosti výsledku
Projekt
—
Návaznosti
S - Specificky vyzkum na vysokych skolach
Ostatní
Rok uplatnění
2016
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval
ISBN
978-1-4503-4069-4
ISSN
—
e-ISSN
—
Počet stran výsledku
10
Strana od-do
195-204
Název nakladatele
ACM Press
Místo vydání
New York
Místo konání akce
Pisa
Datum konání akce
17. 7. 2016
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
—