Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search

Result identifiers

Result code in IS VaVaI
RIV/00216224:14330/16:00088811 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F16%3A00088811)
Result on the web
http://dx.doi.org/10.1145/2983323.2983815
DOI - Digital Object Identifier
10.1145/2983323.2983815

Alternative languages

Result language
English
Title in the original language
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Result description in the original language
Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.
Title in English
Off the Beaten Path: Let's Replace Term-Based Retrieval with k-NN Search
Result description in English
Retrieval pipelines commonly rely on a term-based search to obtain candidate records, which are subsequently re-ranked. Some candidates are missed by this approach, e.g., due to a vocabulary mismatch. We address this issue by replacing the term-based search with a generic k-NN retrieval algorithm, where a similarity function can take into account subtle term associations. While an exact brute-force k-NN search using this similarity function is slow, we demonstrate that an approximate algorithm can be nearly two orders of magnitude faster at the expense of only a small loss in accuracy. A retrieval pipeline using an approximate k-NN search can be more effective and efficient than the term-based pipeline. This opens up new possibilities for designing effective retrieval pipelines. Our software (including data-generating code) and derivative data based on the Stack Overflow collection are available online.
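
The contrast the description draws between term-based candidate generation and exact brute-force k-NN search can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `ASSOCIATIONS` table, the toy `similarity` function, and the sample documents are hypothetical and are not the similarity function, data, or software of the paper. The approximate search algorithm mentioned in the description is not shown.

```python
import heapq

# Hypothetical term-association weights (illustrative only, not from the paper).
ASSOCIATIONS = {
    ("car", "automobile"): 0.75,
    ("repair", "fix"): 0.5,
}

def term_affinity(q_term, d_term):
    """Return 1.0 for identical terms, else a symmetric association weight."""
    if q_term == d_term:
        return 1.0
    return ASSOCIATIONS.get((q_term, d_term),
                            ASSOCIATIONS.get((d_term, q_term), 0.0))

def similarity(query, doc):
    """Toy similarity: for each query term, take its best affinity to any document term."""
    q_terms, d_terms = query.split(), doc.split()
    if not d_terms:
        return 0.0
    return sum(max(term_affinity(q, d) for d in d_terms) for q in q_terms)

def brute_force_knn(query, docs, k):
    """Exact k-NN: score every document and keep the k most similar ones."""
    scored = ((similarity(query, doc), i) for i, doc in enumerate(docs))
    return heapq.nlargest(k, scored)

def term_match(query, docs):
    """Term-based candidates: documents sharing at least one exact query term."""
    q_terms = set(query.split())
    return [i for i, doc in enumerate(docs) if q_terms & set(doc.split())]

DOCS = [
    "automobile fix manual",
    "cooking pasta at home",
    "bicycle repair shop",
]

if __name__ == "__main__":
    print(term_match("car repair", DOCS))          # exact-term candidates
    print(brute_force_knn("car repair", DOCS, 2))  # (score, index) pairs
```

In this toy setup the exact-term filter never sees the first document, because it shares no literal term with the query, while the brute-force scan ranks it first through the association weights. The scan touches every document, which is the cost that motivates an approximate k-NN algorithm.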

Classification

Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
GBP103/12/G084: Centrum pro multi-modální interpretaci dat velkého rozsahu (Center for Large Scale Multi-modal Data Interpretation)
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Year of publication
2016
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Article name in the proceedings
CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT
ISBN
9781450340731
ISSN
—
e-ISSN
—
Number of pages
10
Pages from-to
1099-1108
Publisher name
ASSOC COMPUTING MACHINERY
Place of publication
NEW YORK
Event location
IUPUI, Indianapolis, IN
Event date
24 October 2016
Event type by nationality
WRD - Worldwide event
UT WoS code of the article
000390890800113

Similar results (10)

Approximate Best Bin First k-d Tree All Nearest Neighbor Search with Incremental Updates
Rank Aggregation of Candidate Sets for Efficient Similarity Search
PPP-Codes: Similarity Search Index
