On Implementation of Word-Based Compression Methods
This paper presents an implementation of dictionary and statistical word-based data compression methods. Data compression is one of the main techniques for reducing the time needed to transmit data over a network. Word-based text compression is a novel compression approach that exploits the high correlation between words in a sentence. The basic idea of word-based compression methods is to treat words, rather than characters, as the source units. These methods are especially efficient for natural language compression. Our results show that word-based methods achieve a better compression ratio than character-based methods. We also present a generalized concept of dense coding. This concept allows us to adjust the coding scheme to the data domain and so achieve a better compression ratio.
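The idea of treating words as source units can be illustrated with a minimal sketch. It assumes a simple regular-expression split into alternating word and separator tokens and a frequency-based ranking; the function names `tokenize` and `rank_tokens` are hypothetical and are not taken from the paper's implementation.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into alternating word and separator tokens (assumed scheme)."""
    # \w+ matches runs of letters, digits and underscores; \W+ matches the rest.
    return re.findall(r"\w+|\W+", text)

def rank_tokens(tokens):
    """Map each distinct token to a rank; rank 0 is the most frequent token."""
    counts = Counter(tokens)
    # Descending frequency, ties broken alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tok: rank for rank, tok in enumerate(ordered)}

text = "the cat and the dog and the bird"
tokens = tokenize(text)
ranks = rank_tokens(tokens)
# The text becomes a sequence of small integers, one per word or separator.
symbols = [ranks[t] for t in tokens]
```

A statistical or dictionary coder would then work on `symbols` instead of on individual characters; frequent words receive small ranks, which a later coding stage can map to short codewords.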
Implementation of Word-Based Compression Methods
In this paper we describe our implementations of word-based context methods: word-based statistical and word-based dictionary methods. As part of this work we also created a new coding system, Open Dense Coding, based on a generalization of End-Tagged Dense Coding. This system significantly speeds up coding in word-based compression methods.
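For orientation, End-Tagged Dense Coding, the scheme that Open Dense Coding generalizes, can be sketched as follows. This is a minimal illustration of the commonly described byte layout (the last byte of each codeword has its high bit set, all preceding bytes have it clear), not the authors' Open Dense Coding; the function names `etdc_encode` and `etdc_decode` are hypothetical.

```python
def etdc_encode(rank):
    """Return the End-Tagged Dense Code codeword (bytes) for a 0-based rank."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    out = [128 + rank % 128]      # last byte: high bit set marks the codeword end
    rank //= 128
    while rank > 0:
        rank -= 1                 # dense numbering: no codeword value is wasted
        out.append(rank % 128)    # continuation bytes keep the high bit clear
        rank //= 128
    return bytes(reversed(out))

def etdc_decode(data):
    """Decode a concatenation of ETDC codewords back into a list of ranks."""
    ranks, value, pending = [], 0, False
    for b in data:
        if b < 128:               # continuation byte
            value = value * 128 + b + 1
            pending = True
        else:                     # end-tagged byte closes the codeword
            ranks.append(value * 128 + (b - 128))
            value, pending = 0, False
    if pending:
        raise ValueError("input ends in the middle of a codeword")
    return ranks

# Ranks 0..127 take one byte, the next 128 * 128 ranks take two bytes, and so on.
encoded = b"".join(etdc_encode(r) for r in [0, 1, 128, 16512])
decoded = etdc_decode(encoded)
```

Because codeword boundaries are visible from the high bit alone, a decoder can find them without a code table; the most frequent words (lowest ranks) get one-byte codewords.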
GA201/06/1039: Text processing and analysis
4th Doctoral Workshop on Mathematical and Engineering Methods in Computer Science
Nov 14, 2008
