Authorship and Time Attribution of Arabic Texts Using JGAAP

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11210%2F18%3A10365864" target="_blank" >RIV/00216208:11210/18:10365864 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1007/978-3-319-67056-0_16" target="_blank" >http://dx.doi.org/10.1007/978-3-319-67056-0_16</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/978-3-319-67056-0_16" target="_blank" >10.1007/978-3-319-67056-0_16</a>

Alternative languages

Result language
English
Title in original language
Authorship and Time Attribution of Arabic Texts Using JGAAP
Result description in original language
One basic task in natural language processing is text classification, such as sorting documents by their content. A less well-known variant of this task is classifying documents by inferred metadata, such as the document's (inferred) language, date of composition, or authorship. Authorship attribution is a well-studied problem, but most of the work done has been in major European languages such as English. [Notable exceptions who have studied Arabic, in particular, include. We present a study selected from a new corpus (CLAUDia) containing nearly half a billion words of Arabic text, using a standard authorship analysis tool (JGAAP) to study the effects of author, genre, and time of composition on writing style and, by extension, on classification. We have selected a subcorpus balanced to permit comparisons between genres as well as between time periods, to see how the best-performing methods change with genre and time. We also provide an analysis of a larger variety of feature sets than has previously been done for Arabic.

Classification

Type
C - Chapter in a specialist book
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
<a href="/cs/project/GA13-28220S" target="_blank" >GA13-28220S: Struktury kultury: Arabsko-islámská kultura prismatem korpusové lingvistiky</a> (Structures of Culture: Arabic-Islamic Culture through the Prism of Corpus Linguistics)
Continuities
I - Institutional support for the long-term conceptual development of a research organisation

Others

Publication year
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

Book or collection title
Intelligent Natural Language Processing: Trends and Applications
ISBN
978-3-319-67056-0
Number of pages of the result
25
Pages from-to
325-349
Number of pages of the book
776
Publisher name
Springer
Place of publication
Cham
UT WoS code of the chapter
—

Similar results (10)

Multilingual Stylometry. The Influence of Language on the Performance of Authorship Attribution using Corpora from the European Literary Text Collection (ELTeC)
Simple rules for syllabification of arabic texts
Crowd Sourcing as an Improvement of N-Grams Text Document Classification Algorithm
