Document Engineering for Digital Libraries (invited talk 5.11.2010, Portsmouth University Computing Seminar, UK)
The result's identifiers
Result code in IS VaVaI
RIV/00216224:14330/10:00045289 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14330%2F10%3A00045289)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
Document Engineering for Digital Libraries (invited talk 5.11.2010, Portsmouth University Computing Seminar, UK)
Original language description
Several innovative document transformations and tools developed in the process of building the Digital Mathematical Library DML-CZ (http://dml.cz) are described. The main result is our new PDF re-compression tool, developed using an enhanced jbig2enc library. Together with pdfsizeopt.py by Péter Szabó, it has allowed us to decrease PDF storage size and transmission needs by 62%: using both programs, we reduced the size of the original, already compressed PDFs to 38%. We briefly describe the workflow and tools developed for creating the digital library. The batch digital signature stamper, the document similarity metrics, which use four different methods, a [meta]data validation process, and math OCR tools represent some of the main [by]products. Such document engineering, together with Google Scholar indexing optimization, has led to the success of serving digitized and born-digital scientific math documents to the public in DML-CZ, and is also being employed in The European Digital Mathematics
Czech name
—
Czech description
—
Classification
Type
O - Miscellaneous
CEP classification
AF - Documentation, librarianship, work with information
OECD FORD branch
—
Result continuities
Project
—
Continuities
S - Specific research at universities
Others
Publication year
2010
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations