Language complexity across sub-styles and genres in legal Russian
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AUZ62N7J4" target="_blank" >RIV/00216208:11320/23:UZ62N7J4 - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85169335305&doi=10.18413%2f2313-8912-2023-9-2-0-5&partnerID=40&md5=c60cdc2650a34264b3f78e4d84b9783b" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85169335305&doi=10.18413%2f2313-8912-2023-9-2-0-5&partnerID=40&md5=c60cdc2650a34264b3f78e4d84b9783b</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.18413/2313-8912-2023-9-2-0-5" target="_blank" >10.18413/2313-8912-2023-9-2-0-5</a>
Alternative languages
Result language
English
Original language name
Language complexity across sub-styles and genres in legal Russian
Original language description
"The purpose of the paper is to find out the differences in linguistic complexity between legal documents, opposed by domain, sub-style and genre. The authors explore the large and diverse corpus of Russian legal texts and compare (1) international documents and documents of national law, (2) documents of the three sub-styles (administrative, legislative and justiciary), and (3) texts of different genres within sub-styles. To obtain complexity scores, an automatic model is used whose modules are capable of predicting complexity either by using the fine-tuned ruBERT model, or by using 133 language metrics, or in a hybrid way. The paper analyzes a dataset consisting of 43,804 documents, 118,768,028 words. National law documents are classified into three sub-styles. In addition, each document is characterized according to the genre and to the issuing body. Thus, 68 genres were identified. All documents were assigned complexity scores ranging from “0” to “12”. The vast majority of all documents were scored as maximally complex. The hybrid model assigned a complexity level of “12” to 97.1% of administrative sub-style documents, 94.5% of legislative sub-style documents, and 99.7% of judicial sub-style documents of national law. For all international law documents, the proportion of documents with a level of complexity of “12” is 94.1%. The set of legislative sub-style texts is the most varied in complexity. On average, the most complex documents in the dataset are of justiciary sub-style ones. Linguistic features successfully contrast international and national documents, as well as legislative and justiciary sub-styles. When comparing documents by genre, the authors interpreted only the average values of the 22 syntactic metrics. In general, a comparison of the genre-based document groups showed that it is not the genre itself that may be decisive for the complexity score, but the issuing body. © 2023 Belgorod State National Research University. All rights reserved."
Czech name
—
Czech description
—
Classification
Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
—
Continuities
—
Others
Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
"Research Result. Theoretical and Applied Linguistics"
ISSN
2313-8912
e-ISSN
—
Volume of the periodical
9
Issue of the periodical within the volume
2
Country of publishing house
US - UNITED STATES
Number of pages
24
Pages from-to
73-96
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85169335305