FITLayout Web Page Segmentation Framework

Authors:Burget Radek, Milička Martin
Type:software
Created:2014
Licence:required - no fee
Keywords:web page segmentation, document analysis, text classification, web page rendering
Description:
FitLayout is an extensible web page segmentation framework written in Java. It defines a generic Java API for representing a rendered web page and its division into visual areas, and it provides a base for implementing page segmentation algorithms behind a common application interface. As a sample segmentation method, it implements a previously published algorithm based on recursive visual area merging and separator detection. The framework includes tools for post-processing the segmentation result with various text or visual classification methods. It also provides a graphical user interface for controlling the segmentation process and examining its results. The segmentation result may be stored as RDF data for later analysis.
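The idea of merging visual areas can be illustrated with a minimal sketch. It is not the FitLayout API or the published algorithm: the names AreaMergeSketch, Box and merge, as well as the maxGap parameter, are hypothetical and introduced only for illustration. The sketch assumes each rendered element is reduced to a bounding box in pixels, and it uses a simple whitespace threshold in place of real separator detection: boxes that share a row and lie closer than maxGap are greedily merged into one area. It requires Java 16 or newer because it uses records.

```java
import java.util.ArrayList;
import java.util.List;

public class AreaMergeSketch {

    /** Axis-aligned bounding box of a rendered element, in pixels (hypothetical type). */
    record Box(int x1, int y1, int x2, int y2) {

        /** Smallest box that covers both this box and the other one. */
        Box union(Box o) {
            return new Box(Math.min(x1, o.x1), Math.min(y1, o.y1),
                           Math.max(x2, o.x2), Math.max(y2, o.y2));
        }

        /** True if the vertical extents of the two boxes overlap. */
        boolean sharesRowWith(Box o) {
            return y1 < o.y2 && o.y1 < y2;
        }

        /** Horizontal whitespace between the boxes; 0 if they overlap horizontally. */
        int horizontalGap(Box o) {
            return Math.max(0, Math.max(x1, o.x1) - Math.min(x2, o.x2));
        }
    }

    /**
     * Greedily merges boxes that share a row and are separated by at most
     * maxGap pixels, repeating until no further pair can be merged.
     * A wider gap is treated as a separator (an assumption of this sketch).
     */
    static List<Box> merge(List<Box> input, int maxGap) {
        List<Box> areas = new ArrayList<>(input);
        boolean merged = true;
        while (merged) {
            merged = false;
            outer:
            for (int i = 0; i < areas.size(); i++) {
                for (int j = i + 1; j < areas.size(); j++) {
                    Box a = areas.get(i);
                    Box b = areas.get(j);
                    if (a.sharesRowWith(b) && a.horizontalGap(b) <= maxGap) {
                        areas.set(i, a.union(b)); // replace a with the merged area
                        areas.remove(j);          // drop b (removal by index)
                        merged = true;
                        break outer;              // rescan from the start
                    }
                }
            }
        }
        return areas;
    }

    public static void main(String[] args) {
        List<Box> boxes = List.of(
            new Box(0, 0, 40, 10),     // two words on the same line, 5 px apart
            new Box(45, 0, 90, 10),
            new Box(300, 0, 350, 10),  // same line, but far to the right
            new Box(0, 50, 90, 60));   // a separate line further down
        System.out.println(merge(boxes, 10));
    }
}
```

With maxGap set to 10, the first two boxes are merged into a single area, while the distant box on the same row and the box on the lower row remain separate. A real segmentation method would operate on a hierarchy of areas and take visual properties such as fonts and colors into account.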
Location:
http://www.fit.vutbr.cz/~burgetr/FITLayout/
Research groups:
Departments:
Licence terms:
Free software under the terms of the GNU GPL license.
