Acquiring Data for Textual Entailment Recognition
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F13%3A00070350" target="_blank" >RIV/00216224:14330/13:00070350 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Acquiring Data for Textual Entailment Recognition
Original language description
Language resources are hardly ever large enough. Building language resources that can be used as a gold standard for semantic analysis requires effort and investment. We present a prototype for acquiring language resources by means of a language game, which is a cheap but long-term method. Games employed to acquire language resources are not new; for example, games with a purpose are used for collecting common-sense knowledge. The game presented in this paper is a work in progress. It collects annotated text-hypothesis pairs suitable for recognizing textual entailment in Czech. The game narrative is based on dialogues between Sherlock Holmes and Dr. Watson. To generate the dialogue lines, we use rule-based approaches such as syntactic analysis, anaphora resolution, synonym and hypernym replacement, word order rearrangement, and verb-frame-based inference. To generate natural-sounding sentences, we added a language model score (based on n-gram frequencies in a corpus).
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LM2010013" target="_blank" >LM2010013: LINDAT-CLARIN: Institute for analysis, processing and distribution of linguistic data</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)<br>S - Specific research at universities<br>I - Institutional support for the long-term conceptual development of a research organisation
Others
Publication year
2013
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Seventh Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2013
ISBN
9788026305200
ISSN
—
e-ISSN
—
Number of pages
9
Pages from-to
29-37
Publisher name
Tribun EU
Place of publication
Brno
Event location
Brno
Event date
Jan 1, 2013
Type of event by nationality
CST - Nationwide event
UT code for WoS article
—