Journal article

ZELENÝ Jan and BURGET Radek. Accelerating the process of web page segmentation via template clustering. International Journal of Intelligent Information and Database Systems. Geneva: Inderscience Publishers, 2016, vol. 2016, no. 2, pp. 134-153. ISSN 1751-5858.
Publication language: English
Original title: Accelerating the process of web page segmentation via template clustering
Title (cs): Zrychlení procesu segmentace webových stránek skrze shlukování šablon
Pages: 134-153
Place: CH
Year: 2016
Journal: International Journal of Intelligent Information and Database Systems, Vol. 2016, No. 2, Geneva, CH
ISSN: 1751-5858
Files: jzeleny.pdf (338 KB, last modified 2014-02-14 22:36:03)
Keywords
VIPS, page segmentation, vision-based page segmentation, web page segmentation, web page preprocessing, segmentation performance, clustering, template, template detection
Annotation
Segmenting a web page is often one of the first steps in mining data from that page. Segmentation based on the visual perception of a web page is already a well-researched area. In this paper we propose a method for improving the efficiency of virtually all vision-based segmentation algorithms. Our method, called Cluster-based Page Segmentation, exploits the widespread use of web templates: it clusters web pages and performs the segmentation once on each cluster instead of on every page in that cluster. To demonstrate the efficiency of our algorithm, we present experimental results gathered using three different vision-based segmentation algorithms.
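The idea of segmenting once per cluster can be illustrated with a minimal Python sketch. This is not the authors' implementation: the clustering criterion used here (exact equality of a DOM tag-path skeleton) is an assumption chosen for brevity, and the names `template_signature`, `segment_by_cluster`, and the `segment` callback are hypothetical. A real system would also need to map the shared segmentation back onto each individual page.

```python
from collections import defaultdict
from html.parser import HTMLParser

# HTML elements that never have a closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _SkeletonParser(HTMLParser):
    """Records the root-to-element tag path of every element, ignoring text and attributes."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = []

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass


def template_signature(html):
    """Assumed template fingerprint: the ordered tuple of tag paths in the page."""
    parser = _SkeletonParser()
    parser.feed(html)
    parser.close()
    return tuple(parser.paths)


def segment_by_cluster(pages, segment):
    """Group pages by signature and call the costly `segment` once per group.

    pages:   dict mapping URL -> HTML source
    segment: callable taking HTML and returning a segmentation (hypothetical)
    Returns (dict URL -> segmentation, number of `segment` calls made).
    """
    clusters = defaultdict(list)
    for url, html in pages.items():
        clusters[template_signature(html)].append(url)

    results = {}
    for urls in clusters.values():
        shared = segment(pages[urls[0]])  # first page acts as representative
        for url in urls:
            results[url] = shared
    return results, len(clusters)
```

With this sketch, the number of expensive segmentation calls equals the number of distinct templates rather than the number of pages, which is where the speed-up would come from whenever many pages share a template.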
BibTeX:
@ARTICLE{
   author = {Jan Zelen{\'{y}} and Radek Burget},
   title = {Accelerating the process of web page segmentation via
	template clustering},
   pages = {134--153},
   journal = {International Journal of Intelligent Information and
	Database Systems},
   volume = {2016},
   number = {2},
   year = {2016},
   ISSN = {1751-5858},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=10530}
}
