Optimizing alphabet using genetic algorithms
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989100%3A27240%2F11%3A86081152" target="_blank" >RIV/61989100:27240/11:86081152 - isvavai.cz</a>
Result on the web
<a href="http://dx.doi.org/10.1109/ISDA.2011.6121705" target="_blank" >http://dx.doi.org/10.1109/ISDA.2011.6121705</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/ISDA.2011.6121705" target="_blank" >10.1109/ISDA.2011.6121705</a>
Alternative languages
Result language
English
Title in original language
Optimizing alphabet using genetic algorithms
Result description in original language
Data compression algorithms have usually been designed to process data symbol by symbol. The input symbols of these algorithms are usually taken from the ASCII table, i.e. the input alphabet has 256 symbols, each representable by an 8-bit number. Several other techniques have been developed: syllable-based compression, which uses the syllable as the basic compression symbol, and word-based compression, which uses words as basic symbols. These three approaches are strictly separated, and no overlap between them is allowed. This may be a problem, because it could be helpful to let them overlap and use a character-based approach in which a few symbols are sequences of characters. This paper describes an algorithm that looks for the optimal alphabet for different text files. The alphabet may contain characters and 2-grams.
Title in English
Optimizing alphabet using genetic algorithms
Result description in English
Data compression algorithms have usually been designed to process data symbol by symbol. The input symbols of these algorithms are usually taken from the ASCII table, i.e. the input alphabet has 256 symbols, each representable by an 8-bit number. Several other techniques have been developed: syllable-based compression, which uses the syllable as the basic compression symbol, and word-based compression, which uses words as basic symbols. These three approaches are strictly separated, and no overlap between them is allowed. This may be a problem, because it could be helpful to let them overlap and use a character-based approach in which a few symbols are sequences of characters. This paper describes an algorithm that looks for the optimal alphabet for different text files. The alphabet may contain characters and 2-grams.
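As an illustration only, the idea of an alphabet that mixes characters and 2-grams can be sketched in Python. This is not the method from the paper: the greedy left-to-right tokenization, the zero-order entropy cost, and the names `tokenize` and `entropy_bits` are all assumptions made for the example. A genetic algorithm could plausibly use a cost of this kind as its fitness function when comparing candidate alphabets.

```python
import math
from collections import Counter


def tokenize(text, bigrams):
    """Greedy left-to-right split (an assumed strategy, not the paper's):
    emit a 2-gram when it is in the alphabet, otherwise a single character."""
    tokens = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if len(pair) == 2 and pair in bigrams:
            tokens.append(pair)
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def entropy_bits(tokens):
    """Total zero-order entropy of the token stream in bits,
    used here as a rough stand-in for compressed size."""
    counts = Counter(tokens)
    total = len(tokens)
    return -sum(c * math.log2(c / total) for c in counts.values())


text = "abababab"
chars_only = tokenize(text, set())      # alphabet of single characters
with_bigram = tokenize(text, {"ab"})    # alphabet extended with one 2-gram

print(len(chars_only), entropy_bits(chars_only))
print(len(with_bigram), entropy_bits(with_bigram))
```

For this toy input, adding the 2-gram `ab` halves the number of tokens and lowers the entropy estimate. A realistic cost would also have to account for storing the alphabet itself.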
Classification
Type
D - Article in proceedings
CEP field
IN - Informatics
OECD FORD field
—
Result continuities
Project
<a href="/cs/project/GPP202%2F11%2FP142" target="_blank" >GPP202/11/P142: Optimization and parallelization of compression methods</a><br>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Year of implementation
2011
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Article name in the proceedings
11th International Conference on Intelligent Systems Design and Applications ISDA 2011 : proceedings
ISBN
978-1-4577-1676-8
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
498-503
Publisher name
IEEE
Place of publication
London
Event location
Cordoba
Event date
22 November 2011
Event type by nationality
WRD - Worldwide event
UT WoS article code
—