Using clustering to improve WLZ77 compression
The result's identifiers
Result code in IS VaVaI
RIV/61989100:27240/08:00021071 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F08%3A00021071)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Using clustering to improve WLZ77 compression
Original language description
Many types of Information Retrieval Systems (IRS) are being created, and more and more documents are stored in them. A fundamental process in an IRS is building a textual database and compressing the documents stored in it. One possibility for compressing textual data is word-based compression. Several word-based compression algorithms based on Huffman encoding, LZW, or the BWT algorithm have been proposed. In this paper, we describe a word-based compression method based on the LZ77 algorithm. An IRS can also perform cluster analysis of the textual database to improve the quality of answers to users' queries. The information obtained from the clustering can be very helpful in compression. Word-based compression using information about the cluster hierarchy is presented in this paper. The experimental results provided at the end of the paper were achieved using not only the well-known word-based compression algorithms WBW and WLZW but also the relatively new WLZ77 algorithm.
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
GA201/06/0756: Development of a native storage for XML data
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2008
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
FIRST INTERNATIONAL CONFERENCE ON THE APPLICATIONS OF DIGITAL INFORMATION AND WEB TECHNOLOGIES
ISBN
978-1-4244-2623-2
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
—
Publisher name
IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA
Place of publication
NEW YORK
Event location
Ostrava
Event date
Aug 4, 2008
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000263224700054