All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Automatic Web Document Restructuring Based on Visual Information Analysis

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F10%3APU82674" target="_blank" >RIV/00216305:26230/10:PU82674 - isvavai.cz</a>

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    angličtina

  • Original language name

    Automatic Web Document Restructuring Based on Visual Information Analysis

  • Original language description

    Many documents available on the current web have quite a complex structure that allows to present various kinds of information. Apart from the main content, the documents usually contain headers and footers, navigation sections and other types of additional information. For many applications such as document indexing or browsing on special devices, it is desirable that the main document information should precede the additional information in the underlying HTML code. In this paper, we propose a methodof document preprocessing that automatically restructures the document code according to this criteria. Our method is based on rendered document analysis. A page segmentation algorithm is used for detecting the basic blocks on the page and the relevanceof the individual parts is estimated from the visual properties of the text content.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    JC - Computer hardware and software

  • OECD FORD branch

Result continuities

  • Project

  • Continuities

    Z - Vyzkumny zamer (s odkazem do CEZ)

Others

  • Publication year

    2010

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Article name in the collection

    Advances in Intelligent Web Mastering - 2, Proceedings of the 6th Atlantic Web Intelligence Conference - AWIC'2009

  • ISBN

    978-3-642-10686-6

  • ISSN

  • e-ISSN

  • Number of pages

    10

  • Pages from-to

    61-70

  • Publisher name

    Springer Verlag

  • Place of publication

    Prague

  • Event location

    Prague

  • Event date

    Sep 9, 2009

  • Type of event by nationality

    WRD - Celosvětová akce

  • UT code for WoS article