K-best Viterbi Semi-supervized Active Learning in Sequence Labelling
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F17%3A00314191" target="_blank" >RIV/68407700:21240/17:00314191 - isvavai.cz</a>
Výsledek na webu
<a href="http://ceur-ws.org/Vol-1885/144.pdf" target="_blank" >http://ceur-ws.org/Vol-1885/144.pdf</a>
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
K-best Viterbi Semi-supervized Active Learning in Sequence Labelling
Popis výsledku v původním jazyce
In application domains where there exists a large amount of unlabelled data but obtaining labels is expensive, active learning is a useful way to select which data should be labelled. In addition to its traditional successful use in classification and regression tasks, active learning has been also applied to sequence labelling. According to the standard active learning approach, sequences for which the labelling would be the most informative should be labelled. However, labelling the entire sequence may be inefficient as for some its parts, the labels can be predicted using a model. Labelling such parts brings only a little new information. Therefore in this paper, we investigate a sequence labelling approach in which in the sequence selected for labelling, the labels of most tokens are predicted by a model and only tokens that the model can not predict with sufficient confidence are labelled. Those tokens are identified using the k-best Viterbi algorithm.
Název v anglickém jazyce
K-best Viterbi Semi-supervized Active Learning in Sequence Labelling
Popis výsledku anglicky
In application domains where there exists a large amount of unlabelled data but obtaining labels is expensive, active learning is a useful way to select which data should be labelled. In addition to its traditional successful use in classification and regression tasks, active learning has been also applied to sequence labelling. According to the standard active learning approach, sequences for which the labelling would be the most informative should be labelled. However, labelling the entire sequence may be inefficient as for some its parts, the labels can be predicted using a model. Labelling such parts brings only a little new information. Therefore in this paper, we investigate a sequence labelling approach in which in the sequence selected for labelling, the labels of most tokens are predicted by a model and only tokens that the model can not predict with sufficient confidence are labelled. Those tokens are identified using the k-best Viterbi algorithm.
Klasifikace
Druh
J<sub>ost</sub> - Ostatní články v recenzovaných periodicích
CEP obor
—
OECD FORD obor
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)
Návaznosti výsledku
Projekt
—
Návaznosti
S - Specificky vyzkum na vysokych skolach
Ostatní
Rok uplatnění
2017
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název periodika
CEUR workshop proceedings
ISSN
1613-0073
e-ISSN
—
Svazek periodika
2017
Číslo periodika v rámci svazku
9
Stát vydavatele periodika
DE - Spolková republika Německo
Počet stran výsledku
9
Strana od-do
144-152
Kód UT WoS článku
—
EID výsledku v databázi Scopus
—