Large-Scale Colloquial Persian 0.5
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F20%3A10424543" target="_blank" >RIV/00216208:11320/20:10424543 - isvavai.cz</a>
Result on the web
<a href="https://iasbs.ac.ir/~ansari/lscp/" target="_blank" >https://iasbs.ac.ir/~ansari/lscp/</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Large-Scale Colloquial Persian 0.5
Original language description
The "Large Scale Colloquial Persian Dataset" (LSCP) is hierarchically organized in a semantic taxonomy that focuses on multi-task informal Persian language understanding as a comprehensive problem. LSCP includes 120M sentences from 27M casual Persian tweets, with their dependency relations in syntactic annotation, part-of-speech tags, sentiment polarity, and automatic translations of the original Persian sentences into five languages (EN, CS, DE, IT, HI).
Czech name
—
Czech description
—
Classification
Type
R - Software
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/GX19-26934X" target="_blank" >GX19-26934X: Neural Representations in Multi-modal and Multi-lingual Modeling</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2020
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Internal product ID
[http://hdl.handle.net/11234/1-3
Technical parameters
The result is freely available at https://iasbs.ac.ir/~ansari/lscp/.
Economic parameters
Supported by GAČR grant 19-26934X (NEUREM3), 47,572 thousand CZK
Owner IČO
00216208
Owner name
Univerzita Karlova