Improvement of text compression using subset of words

The result's identifiers

  • Result code in IS VaVaI

    RIV/61989100:27240/14:86092523 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F14%3A86092523)

  • Result on the web

    http://dx.doi.org/10.1166/asl.2014.5282

  • DOI - Digital Object Identifier

    10.1166/asl.2014.5282

Alternative languages

  • Result language

    English

  • Original language name

    Improvement of text compression using subset of words

  • Original language description

    This paper describes a novel approach to text compression that combines the character-based and word-based approaches. The new approach uses a subset of words to improve text compression. The number of words used in the algorithm depends on the size and the content of the compressed texts. The ideal number of words with respect to the compression algorithm used and the compressed data is also investigated. Several source files are evaluated, and different numbers of words are combined with the characters to achieve better compression. Moreover, three different compression algorithms are evaluated. The experiments investigate the effect of combining words with characters on different text files from the standard compression corpora and with different compression algorithms. The results show that these combinations are always better than the pure word or the pure character approach. Moreover a few ideas about necessary numbers of words for

  • Czech name

  • Czech description
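
The idea summarized in the original language description above (treating a subset of words as single symbols and spelling everything else out character by character) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenization rule, the choice of the `k` most frequent words as the subset, the order-0 entropy estimate standing in for a real compressor, and the names `tokenize`, `mixed_symbols` and `order0_bits` are all assumptions made for illustration. The estimate also ignores the cost of transmitting the word list itself.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed rule: runs of word characters are "words"; every other
    # character is its own token. Joining the tokens restores the text.
    return re.findall(r"\w+|\W", text)


def mixed_symbols(text, k):
    """Build a symbol stream mixing whole words and single characters.

    The k most frequent multi-character words become single symbols
    (hypothetical selection rule); all other tokens are emitted
    character by character.
    """
    tokens = tokenize(text)
    counts = Counter(t for t in tokens if len(t) > 1)
    vocab = {word for word, _ in counts.most_common(k)}
    symbols = []
    for token in tokens:
        if token in vocab:
            symbols.append(token)   # whole word as one symbol
        else:
            symbols.extend(token)   # fall back to individual characters
    return symbols, vocab


def order0_bits(symbols):
    # Order-0 entropy of the stream in bits: a rough stand-in for the
    # output size of an entropy coder, excluding any dictionary cost.
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    sample = "the cat and the dog and the bird " * 50
    for k in (0, 1, 2, 5):
        symbols, vocab = mixed_symbols(sample, k)
        print(k, len(symbols), round(order0_bits(symbols)))
```

With `k = 0` the stream is the pure character approach, and a large enough `k` approaches the pure word approach; sweeping `k` on a given file is one simple way to explore how many words are worth adding, under the assumptions stated above.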

Classification

  • Type

    Jx - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)

  • CEP classification

    IN - Informatics

  • OECD FORD branch

Result continuities

  • Project

    GPP202/11/P142: Optimization and parallelization of compression methods

  • Continuities

    P - Research and development project financed from public funds (with a link to CEP)

Others

  • Publication year

    2014

  • Confidentiality

    S - Complete and true data about the project are not subject to protection under special legal regulations

Data specific for result type

  • Name of the periodical

    Advanced Science Letters

  • ISSN

    1936-6612

  • e-ISSN

  • Volume of the periodical

    20

  • Issue of the periodical within the volume

    1

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    5

  • Pages from-to

    312-316

  • UT code for WoS article

  • EID of the result in the Scopus database