Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F12%3A00060279" target="_blank" >RIV/00216224:14330/12:00060279 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Similarity Ranking as Attribute for Machine Learning Approach to Authorship Identification
Original language description
In the authorship identification task, we are given examples of short writings by N authors and an anonymous document written by one of these N authors. The task is to determine the author of the anonymous text. Practically all approaches solve this problem with machine learning methods. The input attributes for the machine learning process are usually formed from stylistic or grammatical properties of individual documents or from a defined similarity between a document and an author. In this paper, we present the results of an experiment that extends the machine learning attributes with a ranking of the similarity between a document and an author: we transform the similarity between an unknown document and one of the N authors into the rank of that author when all N authors are ordered by their similarity to the document. Similarity probability and similarity ranking were compared using the Support Vector Machines algorithm.
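The rank transformation described above can be illustrated with a minimal sketch. The function name, author labels, and scores are hypothetical, and breaking ties by author name is an assumption made only for this example; the record does not say how the paper computes similarity or handles ties.

```python
from typing import Dict


def similarity_to_ranks(similarities: Dict[str, float]) -> Dict[str, int]:
    """Map each author's similarity score to its rank among all N authors.

    Rank 1 is the author most similar to the document. Ties are broken by
    author name, an arbitrary choice made for this sketch.
    """
    ordered = sorted(
        similarities,
        key=lambda author: (-similarities[author], author),
    )
    return {author: position for position, author in enumerate(ordered, start=1)}


# Hypothetical similarity scores between one anonymous document and N = 4 authors.
scores = {"author_a": 0.42, "author_b": 0.87, "author_c": 0.15, "author_d": 0.63}
ranks = similarity_to_ranks(scores)
print(ranks)
```

Each rank could then be supplied to a classifier as an attribute alongside, or instead of, the raw similarity value.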
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
AI - Linguistics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/VF20102014003" target="_blank" >VF20102014003: Natural Language Analysis in the Internet Environment</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2012
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the Eighth International Conference on Language Resources and Evaluation
ISBN
9782951740877
ISSN
—
e-ISSN
—
Number of pages
4
Pages from-to
—
Publisher name
European Language Resources Association
Place of publication
Istanbul (Turkey)
Event location
Istanbul (Turkey)
Event date
May 23, 2012
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—