PDF DOM Parser

Authors:Burget Radek
Type:software
Created:2011
Licence:required - no fee
Keywords:PDF DOM HTML parser converter java
Description:
Pdf2Dom is a PDF parser that converts documents to an HTML DOM representation. The resulting DOM tree may then be serialized to an HTML file or processed further. Inline CSS definitions in the resulting document make the HTML page as similar as possible to the PDF input. A command-line utility for converting PDF documents to HTML is included in the distribution package. Pdf2Dom may also be used as an independent Java library with a standard DOM interface for your DOM-based applications, or as an alternative parser for the CSSBox rendering engine, adding PDF processing capability to CSSBox.
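Because the library exposes a standard DOM interface, the serialization and further processing steps mentioned above can be done with the XML APIs bundled in the JDK. The following is a minimal sketch of that idea, not code from the Pdf2Dom distribution. The class name DomSerializationSketch and its methods are hypothetical, and sampleDocument() builds a small tree by hand as a stand-in for the DOM that Pdf2Dom would produce from a real PDF file.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class DomSerializationSketch {

    /** Serializes any W3C DOM document to an HTML string using the JDK transformer. */
    public static String toHtml(Document dom) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        transformer.setOutputProperty(OutputKeys.INDENT, "no");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(dom), new StreamResult(out));
        return out.toString();
    }

    /**
     * Assumption: a hand-built stand-in for a parser-produced tree.
     * It contains one text box positioned with inline CSS.
     */
    public static Document sampleDocument() throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element html = dom.createElement("html");
        Element body = dom.createElement("body");
        Element box = dom.createElement("div");
        box.setAttribute("style", "position:absolute;left:72pt;top:90pt;font-size:12pt");
        box.setTextContent("Hello PDF");
        body.appendChild(box);
        html.appendChild(body);
        dom.appendChild(html);
        return dom;
    }

    /** Example of further processing: counts elements that carry an inline style. */
    public static int countStyled(Document dom) {
        NodeList all = dom.getElementsByTagName("*");
        int styled = 0;
        for (int i = 0; i < all.getLength(); i++) {
            if (((Element) all.item(i)).hasAttribute("style")) {
                styled++;
            }
        }
        return styled;
    }

    public static void main(String[] args) throws Exception {
        Document dom = sampleDocument();
        System.out.println(toHtml(dom));
        System.out.println("Styled elements: " + countStyled(dom));
    }
}
```

In real use, the hand-built document would be replaced by the tree obtained from the parser; the serialization and traversal code would stay the same because it depends only on the org.w3c.dom interfaces.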
Location:
http://cssbox.sourceforge.net/pdf2dom
Research groups:
Departments:
Licence terms:
PDF DOM Parser is available free of charge under the LGPLv3 licence.