Accelerating the process of web page segmentation via template clustering
The result's identifiers
Result code in IS VaVaI
RIV/00216305:26230/16:PU121558 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26230%2F16%3APU121558)
Result on the web
http://www.fit.vutbr.cz/research/pubs/all.php?id=10530
DOI - Digital Object Identifier
10.1504/IJIIDS.2016.075424 (http://dx.doi.org/10.1504/IJIIDS.2016.075424)
Alternative languages
Result language
English
Original language name
Accelerating the process of web page segmentation via template clustering
Original language description
Segmenting a web page is often one of the first steps in mining data from that page. A large body of research addresses segmentation based on the visual perception of the web page. In this paper, we propose a method for improving the efficiency of virtually all vision-based segmentation algorithms. Our method, called Cluster-based Page Segmentation, exploits the widespread concept of web templates: it clusters web pages and performs the segmentation on the cluster instead of on each page in that cluster. To demonstrate the efficiency of our method, we present experimental results gathered using three different vision-based segmentation algorithms.
Czech name
—
Czech description
—
Classification
Type
Jsc - Article in a specialist periodical, which is included in the SCOPUS database
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
ED1.1.00/02.0070: IT4Innovations Centre of Excellence
Continuities
P - Research and development project financed from public funds (with a link to CEP)
S - Specific research at universities
Others
Publication year
2016
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
International Journal of Intelligent Information and Database Systems
ISSN
1751-5858
e-ISSN
1751-5866
Volume of the periodical
2016
Issue of the periodical within the volume
2
Country of publishing house
CH - SWITZERLAND
Number of pages
20
Pages from-to
134-153
UT code for WoS article
—
EID of the result in the Scopus database
2-s2.0-84962382995