Conference paper

BURGET Radek. Hierarchies in HTML Documents: Linking Text to Concepts. In: 15th International Workshop on Database and Expert Systems Applications. Zaragoza: IEEE Computer Society, 2004, pp. 186-190. ISBN 0-7695-2195-9.
Publication language: English
Original title: Hierarchies in HTML Documents: Linking Text to Concepts
Title (cs): Hierarchie v HTML dokumentech: Přiřazování textu ke konceptům
Pages: 186-190
Proceedings: 15th International Workshop on Database and Expert Systems Applications
Conference: International Workshop on Web Semantics
Place: Zaragoza, ES
Year: 2004
ISBN: 0-7695-2195-9
Publisher: IEEE Computer Society
Keywords
HTML, Information extraction, Ontology, Logical document structure
Annotation
For the Semantic Web to succeed, tools are needed that link the large amounts of data currently available in HTML documents to Semantic Web ontologies. Because HTML code is enormously variable, defining direct bindings between HTML code patterns and concepts is very limiting. We propose an approach that models the visual presentation of the rendered document and describes the key characteristics of the data presentation in a general way. As a next step, we propose a way of using this model to locate instances of the concepts in the document by means of approximate tree matching algorithms and regular expressions.
BibTeX:
@INPROCEEDINGS{
   author = {Radek Burget},
   title = {Hierarchies in HTML Documents: Linking Text to Concepts},
   pages = {186--190},
   booktitle = {15th International Workshop on Database and Expert Systems Applications},
   year = {2004},
   location = {Zaragoza, ES},
   publisher = {IEEE Computer Society},
   ISBN = {0-7695-2195-9},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php?id=7549}
}
