Thesis Details

Segmentace webových stránek s využitím shlukování

Master's Thesis
Student: Lengál Tomáš
Academic Year: 2016/2017
Supervisor: Burget Radek, doc. Ing., Ph.D.
English title
Web Page Segmentation Algorithms Based on Clustering
Language
Czech
Abstract

This report deals with the segmentation of web pages, an important discipline within information extraction. In the first part, we describe several general approaches to implementing it. We then introduce the Box Clustering Segmentation method, which takes a slightly different approach to segmentation. In the second half, we describe the implementation of this method as part of the FITLayout framework, followed by its final testing.

Keywords

Web page segmentation, information extraction, Box Clustering Segmentation algorithm, FITLayout framework

Department
Degree Programme
Information Technology, Field of Study Information Systems
Files
Status
defended, grade B
Date
22 June 2017
Reviewer
Committee
Hruška Tomáš, prof. Ing., CSc. (DIFS FIT BUT), chair
Burget Radek, doc. Ing., Ph.D. (DIFS FIT BUT), member
Holík Lukáš, doc. Mgr., Ph.D. (DITS FIT BUT), member
Očenášek Pavel, Mgr. Ing., Ph.D. (DIFS FIT BUT), member
Trenz Oldřich, doc. Ing., Ph.D. (Mendelu), member
Zendulka Jaroslav, doc. Ing., CSc. (DIFS FIT BUT), member
Citation
LENGÁL, Tomáš. Segmentace webových stránek s využitím shlukování. Brno, 2017. Master's Thesis. Brno University of Technology, Faculty of Information Technology. 2017-06-22. Supervised by Burget Radek. Available from: https://www.fit.vut.cz/study/thesis/19293/
BibTeX
@mastersthesis{FITMT19293,
    author = "Tom\'{a}\v{s} Leng\'{a}l",
    type = "Master's thesis",
    title = "Segmentace webov\'{y}ch str\'{a}nek s vyu\v{z}it\'{i}m shlukov\'{a}n\'{i}",
    school = "Brno University of Technology, Faculty of Information Technology",
    year = 2017,
    location = "Brno, CZ",
    language = "czech",
    url = "https://www.fit.vut.cz/study/thesis/19293/"
}