Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach

Result identifiers

Result code in IS VaVaI
RIV/49777513:23330/21:43962988 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23330%2F21%3A43962988)
Result on the web
http://ceur-ws.org/Vol-2989/short_paper12.pdf
DOI - Digital Object Identifier
—

Alternative languages

Result language
English
Title in original language
Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach
Result description in original language
Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on training, testing and application of a machine learning classification model using inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees algorithm (ET) and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
Title in English
Classifying Latin Inscriptions of the Roman Empire: A Machine-Learning Approach
Result description in English
Large-scale synthetic research in ancient history is often hindered by the incompatibility of taxonomies used by different digital datasets. Using the example of enriching the Latin Inscriptions from the Roman Empire dataset (LIRE), we demonstrate that machine-learning classification models can bridge the gap between two distinct classification systems and make comparative study possible. We report on training, testing and application of a machine learning classification model using inscription categories from the Epigraphic Database Heidelberg (EDH) to label inscriptions from the Epigraphic Database Claus-Slaby (EDCS). The model is trained on a labeled set of records included in both sources (N=46,171). Several different classification algorithms and parametrizations are explored. The final model is based on the Extremely Randomized Trees algorithm (ET) and employs 10,055 features, based on several attributes. The final model classifies two thirds of a test dataset with 98% accuracy and 85% of it with 95% accuracy. After model selection and evaluation, we apply the model to inscriptions covered exclusively by EDCS (N=83,482) in an attempt to adopt one consistent system of classification for all records within the LIRE dataset.
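
The paired figures in the description (a share of the test set classified at a given accuracy) can be read as a trade-off between coverage and accuracy. The sketch below illustrates one common way to compute such pairs: keep only predictions whose top class probability reaches a threshold, then measure accuracy on the kept subset. This is an assumption about how such figures may be obtained, not a statement of the authors' procedure; the function name `coverage_and_accuracy`, the threshold values, and the toy data are all hypothetical.

```python
from typing import Sequence, Tuple


def coverage_and_accuracy(
    probabilities: Sequence[Sequence[float]],
    true_labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) over predictions whose highest
    class probability is at least `threshold`.

    Hypothetical helper for illustration only.
    """
    kept = 0
    correct = 0
    for row, truth in zip(probabilities, true_labels):
        # The predicted class is the index of the largest probability.
        predicted = max(range(len(row)), key=row.__getitem__)
        if row[predicted] >= threshold:
            kept += 1
            correct += int(predicted == truth)
    coverage = kept / len(true_labels) if true_labels else 0.0
    accuracy = correct / kept if kept else 0.0
    return coverage, accuracy


# Toy class probabilities for six records and three categories (invented data).
probs = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
    [0.20, 0.20, 0.60],
    [0.50, 0.45, 0.05],
    [0.05, 0.05, 0.90],
]
labels = [0, 1, 1, 2, 1, 2]

for t in (0.0, 0.6):
    cov, acc = coverage_and_accuracy(probs, labels, t)
    print(f"threshold={t:.2f} coverage={cov:.2f} accuracy={acc:.2f}")
```

Raising the threshold shrinks the share of records that receive a label while typically raising accuracy on that share, which is the pattern the two reported pairs follow.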

Classification

Type
D - Article in proceedings
CEP field
—
OECD FORD field
60102 - Archaeology

Result continuities

Project
—
Continuities
N - Research activity supported from non-public sources

Others

Publication year
2021
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Title of the article in the proceedings
Proceedings of the Conference on Computational Humanities Research 2021
ISBN
—
ISSN
1613-0073
e-ISSN
1613-0073
Number of pages
13
Pages from-to
123-135
Publisher name
CEUR-WS
Place of publication
Amsterdam
Event location
Amsterdam
Event date
17 November 2021
Type of event by nationality
EUR - European event
UT code for WoS article
—
