Conference paper

BURGET Radek. Extrakce informace z WWW na základě znalosti struktury dat. In: Sborník příspěvků 2. ročníku konference Znalosti 2003. Ostrava: Faculty of Electrical Engineering and Computer Science, VSB-TU Ostrava, 2003, pp. 271-280. ISBN 80-248-0229-5.
Publication language: Czech
Original title: Extrakce informace z WWW na základě znalosti struktury dat
Title (en): Information Extraction from WWW based on the data structure knowledge
Pages: 271-280
Proceedings: Sborník příspěvků 2. ročníku konference Znalosti 2003
Conference: Znalosti 2003
Place: Ostrava, CZ
Year: 2003
ISBN: 80-248-0229-5
Publisher: Faculty of Electrical Engineering and Computer Science, VSB-TU Ostrava
Keywords
Information Extraction, HTML, XML
Annotation
This paper deals with modelling the logical structure of a Web site and using such a model for information extraction. It proposes an algorithm for creating a site model based on analysis of the HTML code, and an XML/XSL-based system for extracting information from this model. It also discusses the possibility of using tree-matching algorithms to automate the extraction process.
BibTeX:
@INPROCEEDINGS{
   author = {Radek Burget},
   title = {Extrakce informace z WWW na z{\'{a}}klad{\v{e}} znalosti
	struktury dat},
   pages = {271--280},
   booktitle = {Sborn{\'{i}}k p{\v{r}}{\'{i}}sp{\v{e}}vk{\r{u}} 2.
	ro{\v{c}}n{\'{i}}ku konference Znalosti 2003},
   year = {2003},
   location = {Ostrava, CZ},
   publisher = {Faculty of Electrical Engineering and Computer Science,
	VSB-TU Ostrava},
   ISBN = {80-248-0229-5},
   language = {czech},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=7136}
}
