Improvement of text compression using subset of words

Result identifiers

Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F14%3A86092523" target="_blank" >RIV/61989100:27240/14:86092523 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1166/asl.2014.5282" target="_blank" >http://dx.doi.org/10.1166/asl.2014.5282</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1166/asl.2014.5282" target="_blank" >10.1166/asl.2014.5282</a>

Alternative languages

Result language
English
Title in original language
Improvement of text compression using subset of words
Result description in original language
This paper describes a novel approach to text compression that combines the character-based and word-based approaches. The new approach uses a subset of words to improve text compression. The number of words used in the algorithm depends on the size and content of the compressed texts. The ideal number of words with respect to the compression algorithm used and the compressed data is also investigated in this paper. Several source files are evaluated, and different numbers of words are combined with the characters to achieve better compression. Moreover, three different compression algorithms are evaluated. The experiments investigate the effect of combining words with characters on different text files from the standard compression corpora and on different compression algorithms. The results show that these combinations are always better than the pure word or the pure character approach. Moreover a few ideas about necessary numbers of words for
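The idea of mixing word and character symbols can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the definition of a word as a run of ASCII letters, the choice of the subset by raw frequency, and the zero-order size estimate are all assumptions made for illustration. The estimate also ignores the cost of storing the word list itself.

```python
import math
import re
from collections import Counter


def mixed_symbols(text, k):
    """Turn text into a symbol sequence (hypothetical helper).

    The k most frequent words become single word symbols ("w", word);
    everything else is emitted one character at a time as ("c", char).
    """
    # A token is either a run of ASCII letters or one non-letter character.
    tokens = re.findall(r"[A-Za-z]+|[^A-Za-z]", text)
    word_counts = Counter(t for t in tokens if t[0].isalpha())
    subset = {word for word, _ in word_counts.most_common(k)}

    symbols = []
    for token in tokens:
        if token in subset:
            symbols.append(("w", token))
        else:
            symbols.extend(("c", ch) for ch in token)
    return symbols


def restore(symbols):
    """Inverse of mixed_symbols: concatenate the symbol payloads."""
    return "".join(payload for _, payload in symbols)


def zero_order_bits(symbols):
    """Estimate the coded size in bits from the empirical symbol entropy."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    sample = "the cat and the dog and the bird"
    for k in (0, 2, 5):
        seq = mixed_symbols(sample, k)
        assert restore(seq) == sample  # the mapping is lossless
        print(k, len(seq), round(zero_order_bits(seq), 1))
```

With `k = 0` the sequence is purely character-based, and with a large `k` it approaches a pure word model, so sweeping `k` on a real file is one way to explore the trade-off the abstract describes.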

Classification

Type
J<sub>x</sub> - Unclassified - Article in a specialist periodical (Jimp, Jsc and Jost)
CEP field
IN - Informatics
OECD FORD field
—

Result continuities

Project
<a href="/cs/project/GPP202%2F11%2FP142" target="_blank" >GPP202/11/P142: Optimization and parallelization of compression methods</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)

Others

Publication year
2014
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to result type

Name of the periodical
Advanced Science Letters
ISSN
1936-6612
e-ISSN
—
Volume of the periodical
20
Issue of the periodical within the volume
1
Country of the periodical's publisher
US - United States of America
Number of pages
5
Pages from-to
312-316
UT code of the article in WoS
—
EID of the result in the Scopus database
—

Similar results (10)

Improving Evolved Alphabet Using Tabu Set
Searching for optimal alphabet for data compression using simulated annealing
Simple rules for syllabification of Arabic texts
