A Dutch coreference resolution system with an evaluation on literary fiction

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F19%3A10427060" target="_blank" >RIV/00216208:11320/19:10427060 - isvavai.cz</a>
Result on the web
<a href="https://clinjournal.org/clinj/article/view/91" target="_blank" >https://clinjournal.org/clinj/article/view/91</a>
DOI - Digital Object Identifier
—

Result language
angličtina
Original language name
A Dutch coreference resolution system with an evaluation on literary fiction
Original language description
Coreference resolution is the task of identifying descriptions that refer to the same entity. In this paper we consider the task of entity coreference resolution for Dutch with a particular focus on literary texts. We make three main contributions. First, we propose a simplified annotation scheme to reduce annotation effort. This scheme is used for the annotation of a corpus of 107k tokens from 21 contemporary works of literature. Second, we present a rule-based coreference resolution system for Dutch based on the Stanford deterministic multi-sieve coreference architecture and heuristic rules for quote attribution. Our system (dutchcoref) forms a simple but strong baseline and improves on previous systems in shared task evaluations. Finally, we perform an evaluation and error analysis on literary texts which highlights difficult cases of coreference in general, and the literary domain in particular. The code of our system is made available at https://github.com/andreasvc/dutchcoref/
Czech name
—
Czech description
—

Type
O - Miscellaneous
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Publication year
2019
Confidentiality
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Similar results(10)