EnTam: An English-Tamil Parallel Corpus
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F13%3A10194841" target="_blank" >RIV/00216208:11320/13:10194841 - isvavai.cz</a>
Result on the web
<a href="http://ufal.mff.cuni.cz/~ramasamy/parallel/html/" target="_blank" >http://ufal.mff.cuni.cz/~ramasamy/parallel/html/</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
EnTam: An English-Tamil Parallel Corpus
Original language description
We collected English-Tamil bilingual data from several publicly available websites for NLP research involving Tamil. A standard set of processing steps was applied to the raw web data before it was released as a sentence-aligned English-Tamil parallel corpus suitable for various NLP tasks. The parallel corpora cover texts from the Bible, cinema, and news domains.
Czech name
—
Czech description
—
Classification
Type
R - Software
CEP classification
AI - Linguistics
OECD FORD branch
—
Result continuities
Project
—
Continuities
R - EC Framework Programme project
Others
Publication year
2013
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Internal product ID
UFAL-DATA-EnTam-2.0
Technical parameters
http://ufal.mff.cuni.cz/~ramasamy/parallel/html/
Economic parameters
The data is available for research.
Owner IČO
00216208
Owner name
Univerzita Karlova v Praze