Using clustering to improve WLZ77 compression
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F08%3A00021071" target="_blank" >RIV/61989100:27240/08:00021071 - isvavai.cz</a>
Výsledek na webu
—
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Using clustering to improve WLZ77 compression
Popis výsledku v původním jazyce
Many types of Information Retrieval Systems (IRS) are created and more and more documents are stored in them too. The fundamental process of IRS is building of textual database, and compression of the documents stored in the database. One possibility forcompression of textual data is word-based compression. Several algorithms for word-based compression algorithms based on Huffman encoding, LZW or BWT algorithm was proposed. In this paper, we describe word-based compression method based on LZ77 algorithm. IRS can also perform cluster analysis of textual database to improve quality of answers to users? queries. The information retrieved from the clustering can be very helpful in compression. Word-based compression using information about cluster hierarchy is presented in this paper. Experimental results which are provided at the end of the paper were achieved not only using well-known word-based compression algorithms WBW and WLZW but also using quite new WLZ77 algorithm.
Název v anglickém jazyce
Using clustering to improve WLZ77 compression
Popis výsledku anglicky
Many types of Information Retrieval Systems (IRS) are created and more and more documents are stored in them too. The fundamental process of IRS is building of textual database, and compression of the documents stored in the database. One possibility forcompression of textual data is word-based compression. Several algorithms for word-based compression algorithms based on Huffman encoding, LZW or BWT algorithm was proposed. In this paper, we describe word-based compression method based on LZ77 algorithm. IRS can also perform cluster analysis of textual database to improve quality of answers to users? queries. The information retrieved from the clustering can be very helpful in compression. Word-based compression using information about cluster hierarchy is presented in this paper. Experimental results which are provided at the end of the paper were achieved not only using well-known word-based compression algorithms WBW and WLZW but also using quite new WLZ77 algorithm.
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
IN - Informatika
OECD FORD obor
—
Návaznosti výsledku
Projekt
<a href="/cs/project/GA201%2F06%2F0756" target="_blank" >GA201/06/0756: Vývoj nativního úložiště pro XML data</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2008
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
FIRST INTERNATIONAL CONFERENCE ON THE APPLICATIONS OF DIGITAL INFORMATION AND WEB TECHNOLOGIES
ISBN
978-1-4244-2623-2
ISSN
—
e-ISSN
—
Počet stran výsledku
6
Strana od-do
—
Název nakladatele
IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA
Místo vydání
NEW YORK
Místo konání akce
Ostrava
Datum konání akce
4. 8. 2008
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
000263224700054