A New Version of Corpus Corporum, the Latin Full-Text Database and Tool

Result identifiers

Result code in IS VaVaI
RIV/00216208:11210/24:10486726 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F24%3A10486726)
Result on the web
https://verso.is.cuni.cz/pub/verso.fpl?fname=obd_publikace_handle&handle=Y-zFo05UUV
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in original language
A New Version of Corpus Corporum, the Latin Full-Text Database and Tool
Result description in original language
The article provides background information on the freely accessible online project Corpus Corporum, the largest structured Latin text meta-corpus in existence. A completely reworked version of the software and presentation was launched in 2021. The main purpose of the project is to give readers access to Latin texts and to help them peruse the texts and find passages. The new version is more stable, more easily extensible, and provides a number of new features, some of them still under construction. The project uses exclusively free and open software, most importantly BaseX, TreeTagger, and Sphinx. TEI XML files are used as input; they are automatically PoS-tagged and lemmatised. Users can then read the texts and, by clicking on words, display lemma entries in some of the most important Latin dictionaries. The major novelties are searches that can ignore orthographic and medieval spelling variation, the automatic identification of possible text reuse, and metrical analyses.
Title in English
A New Version of Corpus Corporum, the Latin Full-Text Database and Tool
Result description in English
The article provides background information on the freely accessible online project Corpus Corporum, the largest structured Latin text meta-corpus in existence. A completely reworked version of the software and presentation was launched in 2021. The main purpose of the project is to give readers access to Latin texts and to help them peruse the texts and find passages. The new version is more stable, more easily extensible, and provides a number of new features, some of them still under construction. The project uses exclusively free and open software, most importantly BaseX, TreeTagger, and Sphinx. TEI XML files are used as input; they are automatically PoS-tagged and lemmatised. Users can then read the texts and, by clicking on words, display lemma entries in some of the most important Latin dictionaries. The major novelties are searches that can ignore orthographic and medieval spelling variation, the automatic identification of possible text reuse, and metrical analyses.

Classification

Type
Jost - Other articles in peer-reviewed periodicals
CEP field
—
OECD FORD field
60500 - Other Humanities and the Arts

Result continuities

Project
—
Continuities
S - Specific research at universities

Others

Year of implementation
2024
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Name of the periodical
Bulletin Du Cange
ISSN
2662-8198
e-ISSN
2662-8198
Volume of the periodical
80
Issue of the periodical within the volume
Not stated
Country of the publisher
BE - Kingdom of Belgium
Number of pages
16
Pages from-to
251-266
UT code for WoS article
—
EID of the result in the Scopus database
—
—

