A Novel Regression Approach: Analyzing Textual Data in Similarity Space
The result's identifiers
Result code in IS VaVaI
RIV/00216275:25530/24:39921485 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216275%3A25530%2F24%3A39921485)
Result on the web
https://ieeexplore.ieee.org/document/10516346
DOI - Digital Object Identifier
10.23919/FRUCT61870.2024.10516346 (http://dx.doi.org/10.23919/FRUCT61870.2024.10516346)
Alternative languages
Result language
English
Original language name
A Novel Regression Approach: Analyzing Textual Data in Similarity Space
Original language description
The proliferation of textual data, notably in the form of database records, calls for innovative methods of analysis that go beyond traditional numerical techniques. While least squares regression has been a cornerstone in quantitative data analysis, its applicability to textual data remains largely unexplored. This study aims to bridge this gap by introducing a similarity-based least squares method tailored for textual data. Drawing on the principles of similarity measures in text, such as semantic and syntactic closeness, we propose an extension to the conventional least squares framework. Our approach incorporates word-based similarity metrics into the least squares objective function, enabling the analysis of textual data in a manner coherent with its qualitative nature. The developed methodology is rigorously evaluated using both synthetic and real-world database records, demonstrating its efficacy in uncovering intricate relationships within textual data. Our findings open new avenues for textual data analysis, blending the precision of classical statistical methods with the subtleties of text similarity.
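As a rough illustration of the idea in the description above, the following minimal sketch regresses a numeric target on the word-level similarity between each text record and a reference text. It is not the paper's method: the Jaccard measure, the single reference text, the toy records, and the names `jaccard`, `fit_line`, `reference`, and `predict` are all assumptions made for this example.

```python
def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings (1.0 = same word set)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # assumes the similarities are not all identical
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy records: a free-text field paired with a numeric target.
records = [
    ("red cotton shirt", 10.0),
    ("red cotton shirt long sleeve", 14.0),
    ("blue wool coat", 40.0),
    ("red shirt", 9.0),
]
reference = "red cotton shirt"  # hypothetical anchor text defining the similarity axis

# Map each text into "similarity space" (here a single coordinate), then fit.
sims = [jaccard(text, reference) for text, _ in records]
targets = [y for _, y in records]
intercept, slope = fit_line(sims, targets)


def predict(text: str) -> float:
    """Predict the target for a new text from its similarity to the reference."""
    return intercept + slope * jaccard(text, reference)
```

With several reference texts, each record would get one similarity coordinate per reference and the fit would become a multiple least squares problem; the single-reference case is used here only to keep the sketch short.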
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10200 - Computer and information sciences
Result continuities
Project
—
Continuities
R - Project of the EC Framework Programme
Others
Publication year
2024
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
Proceedings of the 35th Conference of Open Innovations Association FRUCT
ISBN
979-8-3503-4947-4
ISSN
2305-7254
e-ISSN
2305-7254
Number of pages
8
Pages from-to
596-603
Publisher name
IEEE (Institute of Electrical and Electronics Engineers)
Place of publication
New York
Event location
Tampere
Event date
May 24, 2024
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
—