| Milička, M.: Document Comparison Based on the Page Layout, In: Proceedings of the 18th Conference STUDENT EEICT 2012, Brno, CZ, VUT v Brně, 2012, p. 405-409, ISBN 978-80-214-4462-1 |
|---|
| Publication language: | English |
|---|
| Original title: | Document Comparison Based on the Page Layout |
|---|
| Title (cs): | Porovnávání dokumentů na základě struktury stránky |
|---|
| Pages: | 405-409 |
|---|
| Proceedings: | Proceedings of the 18th Conference STUDENT EEICT 2012 |
|---|
| Conference: | Student EEICT 2012 |
|---|
| Series: | vol. 3 |
|---|
| Place: | Brno, CZ |
|---|
| Year: | 2012 |
|---|
| ISBN: | 978-80-214-4462-1 |
|---|
| Publisher: | Brno University of Technology |
|---|
| Keywords |
|---|
| document comparison, document structure, visual features, layout detection |
| Annotation |
|---|
| The paper proposes a pre-processing method for document comparison based on visual features. The method extracts the basic layouts of web pages and compares them as the first step of the complete comparison process. The pre-processing phase rests on the observation that layout comparison is faster than the full document comparison using visual features. It is useful when the two documents being compared have different layouts. |
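
The two-stage idea in the annotation can be illustrated with a minimal sketch. It is not the paper's algorithm: the block representation (`Box`), the intersection-over-union score, the `layout_similarity` and `compare` functions, and the `threshold` value are all assumptions introduced here only to show how a cheap layout check might gate a costlier comparison.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical layout block: (x, y, width, height) in page coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def layout_similarity(a: List[Box], b: List[Box]) -> float:
    """Assumed cheap score: mean best-match IoU, averaged over both directions."""
    if not a or not b:
        return 0.0

    def one_way(src: List[Box], dst: List[Box]) -> float:
        return sum(max(iou(s, d) for d in dst) for s in src) / len(src)

    return (one_way(a, b) + one_way(b, a)) / 2


def compare(
    a: List[Box],
    b: List[Box],
    detailed: Callable[[List[Box], List[Box]], float],
    threshold: float = 0.6,  # assumed cut-off, not taken from the paper
) -> Optional[float]:
    """Run the costly comparison only when the layouts are similar enough."""
    if layout_similarity(a, b) < threshold:
        return None  # layouts differ: skip the expensive step
    return detailed(a, b)


# Example layouts: header plus two columns versus two full-height columns.
page_a = [(0, 0, 100, 20), (0, 20, 30, 80), (30, 20, 70, 80)]
page_c = [(0, 0, 50, 100), (50, 0, 50, 100)]
```

With these sample layouts, `compare(page_a, page_c, detailed)` returns `None` without calling `detailed`, while comparing `page_a` with itself passes the layout check and falls through to the detailed comparison.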
| BibTeX: |
|---|
@INPROCEEDINGS{
author = {Martin Milička},
title = {Document Comparison Based on the Page Layout},
pages = {405--409},
booktitle = {Proceedings of the 18th Conference STUDENT EEICT 2012},
series = {vol. 3},
year = {2012},
location = {Brno, CZ},
publisher = {Brno University of Technology},
ISBN = {978-80-214-4462-1},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=9933}
}