TwIdw—A Novel Method for Feature Extraction from Unstructured Texts

The result's identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AFUR2H2ZI" target="_blank" >RIV/00216208:11320/23:FUR2H2ZI - isvavai.cz</a>
Result on the web
<a href="https://www.scopus.com/inward/record.uri?eid=2-s2.0-85161544753&doi=10.3390%2fapp13116438&partnerID=40&md5=e95ec9fb96be72f2f9485421db49d986" target="_blank" >https://www.scopus.com/inward/record.uri?eid=2-s2.0-85161544753&doi=10.3390%2fapp13116438&partnerID=40&md5=e95ec9fb96be72f2f9485421db49d986</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.3390/app13116438" target="_blank" >10.3390/app13116438</a>

Alternative languages

Result language
English
Original language name
TwIdw—A Novel Method for Feature Extraction from Unstructured Texts
Original language description
"Featured Application: The research has a potential application in the field of fake news detection. By using the feature extraction technique, TwIdw, proposed in this paper, more relevant and informative features can be extracted from the text data, which can lead to an enhancement in the accuracy of the classification models employed in these tasks. This research proposes a novel technique for fake news classification using natural language processing (NLP) methods. The proposed technique, TwIdw (Term weight–inverse document weight), is used for feature extraction and is based on TfIdf, with the term frequencies replaced by the depth of the words in documents. The effectiveness of the TwIdw technique is compared to another feature extraction method—basic TfIdf. Classification models were created using the random forest and feedforward neural networks, and within those, three different datasets were used. The feedforward neural network method with the KaiDMML dataset showed an increase in accuracy of up to 3.9%. The random forest method with TwIdw was not as successful as the neural network method and only showed an increase in accuracy with the KaiDMML dataset (1%). The feedforward neural network, on the other hand, showed an increase in accuracy with the TwIdw technique for all datasets. Precision and recall measures also confirmed good results, particularly for the neural network method. The TwIdw technique has the potential to be used in various NLP applications, including fake news classification and other NLP classification problems. © 2023 by the authors."
Czech name
—
Czech description
—

Classification

Type
J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

Project
—
Continuities
—

Others

Publication year
2023
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

Name of the periodical
"Applied Sciences (Switzerland)"
ISSN
2076-3417
e-ISSN
—
Volume of the periodical
13
Issue of the periodical within the volume
11
Country of publishing house
US - UNITED STATES
Number of pages
15
Pages from-to
1-15
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-85161544753

Similar results(10)

Feature extraction from unstructured texts as a combination of the morphological and the syntactic analysis and its usage in fake news classification tasks
Improving fake news classification using dependency grammar
A natural language processing approach to Malware classification
