Visual Area Classification for Article Identification in Web Documents

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/10:PU89585 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F10%3APU89585)

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Visual Area Classification for Article Identification in Web Documents

  • Original language description

    On the World Wide Web, news and other articles are usually published in complex HTML documents containing many types of additional information that is not explicitly marked. In this paper, we propose a visual information analysis approach to article discovery in complex HTML documents. We use a classification approach to identify the important parts of the article within the page, and we propose an algorithm for detecting the article bounds within the page. Finally, we provide the results of an experimental evaluation.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    IN - Informatics

  • OECD FORD branch

Result continuities

  • Project

  • Continuities

    Z - Research plan (with a link to CEZ)
    S - Specific research at universities

Others

  • Publication year

    2010

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    21st International Workshop on Databases and Expert Systems Applications

  • ISBN

    978-0-7695-4174-7

  • ISSN

  • e-ISSN

  • Number of pages

    5

  • Pages from-to

  • Publisher name

    IEEE Computer Society

  • Place of publication

    Bilbao

  • Event location

    Bilbao

  • Event date

    Aug 31, 2010

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article