A Language Framework for Measuring Semantic and Syntactic Similarity for Arabic Texts

The result's identifiers

Result code in IS VaVaI
RIV/00216208:11320/25:3F85CGXQ - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F25%3A3F85CGXQ)
Result on the web
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85188898036&doi=10.1007%2fs42979-024-02691-x&partnerID=40&md5=b78c4d0c2a44025a094611d2030a6de4
DOI - Digital Object Identifier
10.1007/s42979-024-02691-x (http://dx.doi.org/10.1007/s42979-024-02691-x)

Alternative languages

Result language
English
Original language name
A Language Framework for Measuring Semantic and Syntactic Similarity for Arabic Texts
Original language description
A language framework for determining the similarity of two snipped texts is proposed. The edit distance concept is employed as a frame algorithm to capture syntactic and semantic similarities. In the proposed work, syntax level distances between lemma-form words are calculated, while partial edit costs are allowed to embed semantic similarity measurements. Many knowledge resources have been used, such as words’ synonyms, negation rules, and word semantic spaces. A researchable Arabic thesaurus dictionary is built in two forms, surface form and lemma form. Semantic word spaces are generated from one of the word embedding models, which represents the words in vector spaces. The algorithm is enhanced to overcome problems with different word orders between sentences by a word permutation technique that elects the best alignment of the snipped text words to yield the best matching score. The algorithm also studied the effect of negation words on textual similarity. The proposed approach was implemented to find the similarity between Arabic language texts. Results are compared with other state-of-the-art algorithms using two benchmark datasets. The experimental results show that the proposed approach achieves a higher Pearson correlation coefficient compared to other works. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2024.
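The idea of an edit distance with partial edit costs, as outlined in the description above, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the rule that synonyms cost nothing, the cost of 1 minus cosine similarity for other word pairs, and the normalisation by the longer text are all assumptions made for illustration. Tokens are assumed to be already lemmatised, and the word-permutation and negation handling mentioned in the description are not covered.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def substitution_cost(w1, w2, synonyms, vectors):
    """Hypothetical partial cost in [0, 1] for replacing w1 with w2."""
    if w1 == w2:
        return 0.0
    # Assumption: listed synonyms are treated as a free substitution.
    if w2 in synonyms.get(w1, set()) or w1 in synonyms.get(w2, set()):
        return 0.0
    # Assumption: otherwise the cost shrinks as embedding similarity grows.
    if w1 in vectors and w2 in vectors:
        return 1.0 - max(0.0, cosine(vectors[w1], vectors[w2]))
    return 1.0


def weighted_edit_distance(a, b, synonyms, vectors):
    """Word-level Levenshtein distance with partial substitution costs."""
    m, n = len(a), len(b)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)  # delete all words of a[:i]
    for j in range(1, n + 1):
        d[0][j] = float(j)  # insert all words of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # deletion
                d[i][j - 1] + 1.0,  # insertion
                d[i - 1][j - 1]
                + substitution_cost(a[i - 1], b[j - 1], synonyms, vectors),
            )
    return d[m][n]


def similarity(a, b, synonyms, vectors):
    """Map the distance to a score in [0, 1], normalised by the longer text."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - weighted_edit_distance(a, b, synonyms, vectors) / longest
```

With a toy synonym table such as `{"big": {"large"}}`, the token lists `["big", "house"]` and `["large", "house"]` have distance zero under these assumptions, whereas a plain edit distance would charge one full substitution.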
Czech name
—
Czech description
—

Classification

Type
Jsc - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
—

Others

Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

Name of the periodical
SN Computer Science
ISSN
2662-995X
e-ISSN
—
Volume of the periodical
5
Issue of the periodical within the volume
4
Country of publishing house
US - UNITED STATES
Number of pages
14
Pages from-to
1-14
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85188898036

Similar results(10)

Linear transformations for cross-lingual semantic textual similarity
Attempting to separate inflection and derivation using vector space representations
The representation of some phrases in Arabic word semantic vector spaces
