Two-Phase Categorization of Web Documents

The result's identifiers

  • Result code in IS VaVaI

    RIV/00216305:26230/10:PU89654 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F10%3APU89654)

  • Result on the web

  • DOI - Digital Object Identifier

Alternative languages

  • Result language

    English

  • Original language name

    Two-Phase Categorization of Web Documents

  • Original language description

    The number of pages on the World Wide Web is constantly growing, and there is a need to process pages efficiently and extract useful knowledge from them. Web page categorization is an important problem in this area. The method proposed here takes both visual and textual information into account and consists of two phases. In the first phase, web page areas obtained by segmentation are classified based on their visual properties. In the second phase, whole pages are classified based on the results of the first phase and on textual information. Several experiments with web pages taken from news websites are presented in the final part of the paper.

  • Czech name

  • Czech description

Classification

  • Type

    D - Article in proceedings

  • CEP classification

    JC - Computer hardware and software

  • OECD FORD branch

Result continuities

  • Project

  • Continuities

    Z - Research plan (with a link to CEZ)
    S - Specific research at universities

Others

  • Publication year

    2010

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Article name in the collection

    Proceedings of the International Conference on Knowledge Discovery and Information Retrieval

  • ISBN

    978-989-8425-28-7

  • ISSN

  • e-ISSN

  • Number of pages

    5

  • Pages from-to

  • Publisher name

    Institute for Systems and Technologies of Information, Control and Communication

  • Place of publication

    Valencia

  • Event location

    Valencia

  • Event date

    Oct 25, 2010

  • Type of event by nationality

    WRD - Worldwide event

  • UT code for WoS article