Active Learning Efficiency Benchmark for Coreference Resolution Including Advanced Uncertainty Representations
Result identifiers
Result code in IS VaVaI
RIV/68407700:21230/23:00374834 (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F23%3A00374834)
Result on the web
https://doi.org/10.1109/CISDS61173.2023.00016
DOI - Digital Object Identifier
10.1109/CISDS61173.2023.00016
Alternative languages
Result language
English
Title in original language
Active Learning Efficiency Benchmark for Coreference Resolution Including Advanced Uncertainty Representations
Result description in original language
Active learning accelerates model training by iteratively expanding the training data based on the model’s feedback. The approach is particularly relevant in natural language processing and other machine learning domains. While active learning has been studied extensively for conventional classification tasks, its application to more specialized tasks such as neural coreference resolution leaves room for improvement. We apply active learning to neural coreference resolution and set a benchmark of a 39% reduction in the annotations required for training data, while preserving the performance of the original model trained on the full data. We compare various uncertainty sampling techniques along with Bayesian modifications of coreference resolution models and analyze the annotation effort in detail. The results show that the best-performing techniques seek to maximize label annotation within previously chosen documents.
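As an illustration of the uncertainty sampling idea mentioned in the description, the following is a minimal sketch of entropy-based selection. It is not the paper's method: the function names (`entropy`, `select_most_uncertain`), the toy probability vectors, and the per-item selection are all assumptions made for this example, and a real coreference model would score mentions or spans within documents.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, budget):
    """Return indices of the `budget` unlabeled items with the highest entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled items (illustrative values only).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]

# Items the annotator would be asked to label in this round.
chosen = select_most_uncertain(pool, budget=2)
print(chosen)
```

In an active learning loop, the chosen items would be annotated, added to the training set, and the model retrained before the next selection round.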
Classification
Type
D - Article in proceedings
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result linkages
Project
TL05000057: Signál a šum v éře Žurnalistiky 5.0 - komparativní perspektiva novinářských žánrů automatizovaných obsahů (Signal and Noise in the Era of Journalism 5.0 - A Comparative Perspective of Journalistic Genres of Automated Content)
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Publication year
2023
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
2023 2nd International Conference on Frontiers of Communications, Information System and Data Science
ISBN
979-8-3503-8147-4
ISSN
—
e-ISSN
—
Number of pages
8
Pages from-to
40-47
Publisher name
IEEE Computer Society
Place of publication
Los Alamitos
Event location
Xi’an
Event date
24 November 2023
Event type by nationality
WRD - Worldwide event
UT code for WoS article
—