TwIdw—A Novel Method for Feature Extraction from Unstructured Texts

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216208:11320/23:FUR2H2ZI - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F23%3AFUR2H2ZI)

  • Result on the web

    https://www.scopus.com/inward/record.uri?eid=2-s2.0-85161544753&doi=10.3390%2fapp13116438&partnerID=40&md5=e95ec9fb96be72f2f9485421db49d986

  • DOI - Digital Object Identifier

    10.3390/app13116438 (http://dx.doi.org/10.3390/app13116438)

Alternative languages

  • Result language

    English

  • Original language name

    TwIdw—A Novel Method for Feature Extraction from Unstructured Texts

  • Original language description

    Featured Application: This research has potential application in fake news detection. The feature extraction technique proposed in this paper, TwIdw, can extract more relevant and informative features from text data, which can improve the accuracy of the classification models used in these tasks. This research proposes a novel technique for fake news classification using natural language processing (NLP) methods. The proposed technique, TwIdw (Term weight–inverse document weight), is used for feature extraction and is based on TfIdf, with the term frequencies replaced by the depth of the words in documents. The effectiveness of TwIdw is compared with that of a baseline feature extraction method, basic TfIdf. Classification models were built with random forests and feedforward neural networks, and each was evaluated on three different datasets. The feedforward neural network improved in accuracy with TwIdw on all three datasets, by up to 3.9% on the KaiDMML dataset. The random forest was less successful with TwIdw, improving in accuracy only on the KaiDMML dataset (by 1%). Precision and recall measures also confirmed good results, particularly for the neural network. TwIdw has the potential to be used in various NLP applications, including fake news classification and other NLP classification problems. © 2023 by the authors.

  • Czech name

  • Czech description
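
Illustrative note on the original language description: the abstract says TwIdw is based on TfIdf with the term frequencies replaced by the depth of the words in documents, but this record does not define "depth" or give the exact formula. The Python sketch below is therefore only a hypothetical illustration of that substitution, not the authors' method. Its assumptions: each token arrives with a caller-supplied numeric depth, a depth is turned into a weight with the placeholder function 1 / (1 + depth), and the inverse-document part is the ordinary logarithmic idf. The function names `tfidf` and `depth_weighted_idf` are invented for this example.

```python
import math
from collections import defaultdict


def document_frequencies(token_lists):
    """Count in how many documents each term occurs."""
    df = defaultdict(int)
    for tokens in token_lists:
        for term in set(tokens):
            df[term] += 1
    return df


def tfidf(docs):
    """Baseline: docs is a list of token lists; returns one {term: weight} dict per doc."""
    n = len(docs)
    df = document_frequencies(docs)
    result = []
    for doc in docs:
        counts = defaultdict(int)
        for term in doc:
            counts[term] += 1
        # Relative term frequency times the ordinary logarithmic idf.
        result.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return result


def depth_weighted_idf(docs, weight=lambda depth: 1.0 / (1.0 + depth)):
    """Hypothetical variant: docs is a list of lists of (token, depth) pairs.

    ASSUMPTION: the term-frequency factor is replaced by a sum of
    depth-derived weights. Both the meaning of "depth" and the weight
    function are placeholders, not taken from the paper.
    """
    n = len(docs)
    df = document_frequencies([[t for t, _ in doc] for doc in docs])
    result = []
    for doc in docs:
        term_weight = defaultdict(float)
        for term, depth in doc:
            term_weight[term] += weight(depth)
        result.append({t: w * math.log(n / df[t]) for t, w in term_weight.items()})
    return result


# Toy corpus with made-up depths, for illustration only.
toy_docs = [
    [("fake", 0), ("news", 1)],
    [("news", 0), ("report", 2)],
]
baseline = tfidf([[t for t, _ in doc] for doc in toy_docs])
variant = depth_weighted_idf(toy_docs)
```

In this toy corpus, "news" occurs in both documents, so its idf is zero under either scheme. The two schemes differ only in how much weight "fake" and "report" receive, which is where a depth-based term weight would change the extracted features.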

Classification

  • Type

    J<sub>SC</sub> - Article in a specialist periodical, which is included in the SCOPUS database

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

  • Continuities

Others

  • Publication year

    2023

  • Confidentiality

    S - Complete and truthful data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Applied Sciences (Switzerland)

  • ISSN

    2076-3417

  • e-ISSN

  • Volume of the periodical

    13

  • Issue of the periodical within the volume

    11

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    15

  • Pages from-to

    1-15

  • UT code for WoS article

  • EID of the result in the Scopus database

    2-s2.0-85161544753