Conference paper

MILIČKA Martin and BURGET Radek. Multi-aspect Document Content Analysis using Ontological Modelling. In: Proceedings of 9th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2014). Smolenice: Vydavateľstvo STU, 2014, pp. 9-12. ISBN 978-80-227-4267-2.
Publication language: English
Original title: Multi-aspect Document Content Analysis using Ontological Modelling
Title (cs): Analýza více aspektů obsahu dokumentu s využitím ontologií
Pages: 9-12
Proceedings: Proceedings of 9th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2014)
Conference: 9th Workshop on Intelligent and Knowledge Oriented Technologies
Place: Smolenice, SK
Year: 2014
ISBN: 978-80-227-4267-2
Publisher: Vydavateľstvo STU
Files: wikt_burget.pdf (142 KB, last modified 2014-09-18 15:24:54)
Keywords
document modeling, information extraction, page segmentation, content classification, ontology, RDF
Annotation
Existing methods of information extraction from web documents are usually based on a single aspect of the document or its contents, such as the code, textual features or visual features. Because the documents available online vary greatly, it seems reasonable to combine multiple kinds of analysis so that all the available knowledge can be used to identify a particular piece of information in the document. In this paper, we propose an ontological document model that integrates the results of analysing different document aspects. We propose a generic architecture of an information extraction system based on this model and demonstrate its applicability on a practical example.
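As a rough illustration of the idea in the annotation, the following minimal sketch stores the output of two separate analyses (visual and textual) as RDF-style subject-predicate-object triples and merges them into one shared model that can be queried per content area. All names here (the `EX` namespace, the `fontSize`, `isBold`, `text` and `matchesPattern` properties, and the `merge` and `describe` helpers) are assumptions made for illustration; they are not the ontology or the architecture defined in the paper.

```python
# Hypothetical namespace and property names; NOT taken from the paper.
EX = "http://example.org/doc#"


def triple(subject, predicate, obj):
    """Build one RDF-style (subject, predicate, object) statement."""
    return (EX + subject, EX + predicate, obj)


# Each analysis contributes statements about the same content areas.
visual_analysis = {
    triple("area1", "fontSize", 18.0),
    triple("area1", "isBold", True),
    triple("area2", "fontSize", 10.0),
}
text_analysis = {
    triple("area1", "text", "WIKT 2014"),
    triple("area2", "text", "9-12"),
    triple("area2", "matchesPattern", "page-range"),
}


def merge(*graphs):
    """Union the triple sets so all aspects live in one model."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Collect every property known about one area, whatever its source."""
    prefix_len = len(EX)
    return {p[prefix_len:]: o for s, p, o in graph if s == EX + subject}


model = merge(visual_analysis, text_analysis)
area1 = describe(model, "area1")
area2 = describe(model, "area2")
```

Because both analyses refer to the same area identifiers, an extraction rule could combine evidence from different aspects, for example by selecting areas whose text matches a pattern and whose font size falls in a given range.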
BibTeX:
@INPROCEEDINGS{
   author = {Martin Mili{\v{c}}ka and Radek Burget},
   title = {Multi-aspect Document Content Analysis using Ontological
	Modelling},
   pages = {9--12},
   booktitle = {Proceedings of 9th Workshop on Intelligent and Knowledge
	Oriented Technologies (WIKT 2014)},
   year = {2014},
   location = {Smolenice, SK},
   publisher = {Vydavate{\'{l}}stvo STU},
   ISBN = {978-80-227-4267-2},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=10724}
}
