CUNI team: CLEF eHealth Consumer Health Search Task 2018
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F18%3A10390207" target="_blank" >RIV/00216208:11320/18:10390207 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
CUNI team: CLEF eHealth Consumer Health Search Task 2018
Result description in the original language
In this paper, we present our participation in the CLEF Consumer Health Search Task 2018, mainly in its monolingual and multilingual subtasks, IRTask1 and IRTask4. In IRTask1, we use a language-model-based retrieval model, a vector-space model, and a Kullback-Leibler divergence query expansion mechanism to build our runs. In IRTask4, we submitted four runs for each of Czech, French, and German. We follow a query-translation approach in which we employ a Statistical Machine Translation (SMT) system to get a ranked list of translation hypotheses in English. We use this list for two systems: the first uses the 1-best translation to construct queries, and the second uses a hypothesis reranker to select the best translation (in terms of retrieval performance) to construct queries. We also present our term reranking model for query expansion, in which we deploy a feature set drawn from different resources (the document collection, Wikipedia articles, translation hypotheses). These features are used to train a logisti
Title in English
CUNI team: CLEF eHealth Consumer Health Search Task 2018
Result description in English
In this paper, we present our participation in the CLEF Consumer Health Search Task 2018, mainly in its monolingual and multilingual subtasks, IRTask1 and IRTask4. In IRTask1, we use a language-model-based retrieval model, a vector-space model, and a Kullback-Leibler divergence query expansion mechanism to build our runs. In IRTask4, we submitted four runs for each of Czech, French, and German. We follow a query-translation approach in which we employ a Statistical Machine Translation (SMT) system to get a ranked list of translation hypotheses in English. We use this list for two systems: the first uses the 1-best translation to construct queries, and the second uses a hypothesis reranker to select the best translation (in terms of retrieval performance) to construct queries. We also present our term reranking model for query expansion, in which we deploy a feature set drawn from different resources (the document collection, Wikipedia articles, translation hypotheses). These features are used to train a logisti
Classification
Type
O - Other results
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
<a href="/cs/project/GBP103%2F12%2FG084" target="_blank" >GBP103/12/G084: Centrum pro multi-modální interpretaci dat velkého rozsahu</a>
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations