Slovak Dataset for Multilingual Question Answering
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AMRRQJR8T" target="_blank" >RIV/00216208:11320/23:MRRQJR8T - isvavai.cz</a>
Result on the web
<a href="https://ieeexplore.ieee.org/document/10082887/" target="_blank" >https://ieeexplore.ieee.org/document/10082887/</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ACCESS.2023.3262308" target="_blank" >10.1109/ACCESS.2023.3262308</a>
Alternative languages
Result language
English
Original language name
Slovak Dataset for Multilingual Question Answering
Original language description
"SK-QuAD is the first manually annotated dataset of questions and answers in Slovak. It consists of more than 91k factual questions and answers from various fields. Each question has an answer marked in the corresponding paragraph. It also contains negative examples in the form of “unanswered questions” and “plausible answers”. The dataset is published free of charge for scientific use. We aim to contribute to the creation of Slovak or multilingual systems for generating an answer to a question in a natural language. The paper provides an overview of the existing datasets for question answering. It describes the annotation process and statistically analyzes the created content. The dataset expands the possibilities of training and evaluation of multilingual language models. Experiments show that the dataset achieves state-of-the-art results for Slovak and improves question answering for other languages in zero-shot learning. We compare the effect of machine-translated data with manually annotated. Additional data improve the modeling for low-resourced languages."
Czech name
—
Czech description
—
Classification
Type
J<sub>ost</sub> - Miscellaneous article in a specialist periodical
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
"IEEE Access"
ISSN
2169-3536
e-ISSN
—
Volume of the periodical
11
Issue of the periodical within the volume
2023-7-19
Country of publishing house
US - UNITED STATES
Number of pages
13
Pages from-to
32869-32881
UT code for WoS article
000967620900001
EID of the result in the Scopus database
2-s2.0-85151539273