Vše

Co hledáte?

Vše
Projekty
Výsledky výzkumu
Subjekty

Rychlé hledání

  • Projekty podpořené TA ČR
  • Významné projekty
  • Projekty s nejvyšší státní podporou
  • Aktuálně běžící projekty

Chytré vyhledávání

  • Takto najdu konkrétní +slovo
  • Takto z výsledků -slovo zcela vynechám
  • “Takto můžu najít celou frázi”

Language complexity across sub-styles and genres in legal Russian

Identifikátory výsledku

  • Kód výsledku v IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AUZ62N7J4" target="_blank" >RIV/00216208:11320/23:UZ62N7J4 - isvavai.cz</a>

  • Výsledek na webu

    <a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85169335305&doi=10.18413%2f2313-8912-2023-9-2-0-5&partnerID=40&md5=c60cdc2650a34264b3f78e4d84b9783b" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85169335305&doi=10.18413%2f2313-8912-2023-9-2-0-5&partnerID=40&md5=c60cdc2650a34264b3f78e4d84b9783b</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.18413/2313-8912-2023-9-2-0-5" target="_blank" >10.18413/2313-8912-2023-9-2-0-5</a>

Alternativní jazyky

  • Jazyk výsledku

    angličtina

  • Název v původním jazyce

    Language complexity across sub-styles and genres in legal Russian

  • Popis výsledku v původním jazyce

    "The purpose of the paper is to find out the differences in linguistic complexity between legal documents, opposed by domain, sub-style and genre. The authors explore the large and diverse corpus of Russian legal texts and compare (1) international documents and documents of national law, (2) documents of the three sub-styles (administrative, legislative and justiciary), and (3) texts of different genres within sub-styles. To obtain complexity scores, an automatic model is used whose modules are capable of predicting complexity either by using the fine-tuned ruBERT model, or by using 133 language metrics, or in a hybrid way. The paper analyzes a dataset consisting of 43,804 documents, 118,768,028 words. National law documents are classified into three sub-styles. In addition, each document is characterized according to the genre and to the issuing body. Thus, 68 genres were identified. All documents were assigned complexity scores ranging from “0” to “12”. The vast majority of all documents were scored as maximally complex. The hybrid model assigned a complexity level of “12” to 97.1% of administrative sub-style documents, 94.5% of legislative sub-style documents, and 99.7% of judicial sub-style documents of national law. For all international law documents, the proportion of documents with a level of complexity of “12” is 94.1%. The set of legislative sub-style texts is the most varied in complexity. On average, the most complex documents in the dataset are of justiciary sub-style ones. Linguistic features successfully contrast international and national documents, as well as legislative and justiciary sub-styles. When comparing documents by genre, the authors interpreted only the average values of the 22 syntactic metrics. In general, a comparison of the genre-based document groups showed that it is not the genre itself that may be decisive for the complexity score, but the issuing body. © 2023 Belgorod State National Research University. All rights reserved."

  • Název v anglickém jazyce

    Language complexity across sub-styles and genres in legal Russian

  • Popis výsledku anglicky

    "The purpose of the paper is to find out the differences in linguistic complexity between legal documents, opposed by domain, sub-style and genre. The authors explore the large and diverse corpus of Russian legal texts and compare (1) international documents and documents of national law, (2) documents of the three sub-styles (administrative, legislative and justiciary), and (3) texts of different genres within sub-styles. To obtain complexity scores, an automatic model is used whose modules are capable of predicting complexity either by using the fine-tuned ruBERT model, or by using 133 language metrics, or in a hybrid way. The paper analyzes a dataset consisting of 43,804 documents, 118,768,028 words. National law documents are classified into three sub-styles. In addition, each document is characterized according to the genre and to the issuing body. Thus, 68 genres were identified. All documents were assigned complexity scores ranging from “0” to “12”. The vast majority of all documents were scored as maximally complex. The hybrid model assigned a complexity level of “12” to 97.1% of administrative sub-style documents, 94.5% of legislative sub-style documents, and 99.7% of judicial sub-style documents of national law. For all international law documents, the proportion of documents with a level of complexity of “12” is 94.1%. The set of legislative sub-style texts is the most varied in complexity. On average, the most complex documents in the dataset are of justiciary sub-style ones. Linguistic features successfully contrast international and national documents, as well as legislative and justiciary sub-styles. When comparing documents by genre, the authors interpreted only the average values of the 22 syntactic metrics. In general, a comparison of the genre-based document groups showed that it is not the genre itself that may be decisive for the complexity score, but the issuing body. © 2023 Belgorod State National Research University. All rights reserved."

Klasifikace

  • Druh

    J<sub>SC</sub> - Článek v periodiku v databázi SCOPUS

  • CEP obor

  • OECD FORD obor

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Návaznosti výsledku

  • Projekt

  • Návaznosti

Ostatní

  • Rok uplatnění

    2023

  • Kód důvěrnosti údajů

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

  • Název periodika

    "Research Result. Theoretical and Applied Linguistics"

  • ISSN

    2313-8912

  • e-ISSN

  • Svazek periodika

    9

  • Číslo periodika v rámci svazku

    2

  • Stát vydavatele periodika

    US - Spojené státy americké

  • Počet stran výsledku

    24

  • Strana od-do

    73-96

  • Kód UT WoS článku

  • EID výsledku v databázi Scopus

    2-s2.0-85169335305