Extracting PDF Document Content

The Document class provides programmatic access to the content of a PDF document. Document meta-data, text, and images can all be extracted.

Extracting Meta-Data

ICEpdf exposes document meta-data through the document hierarchy classes in the org.icepdf.core.pobjects package. The main entry point for meta-data is the Document class.

See Content Extraction for an example that illustrates extracting meta-data from a document.

Also, see the API documentation for the org.icepdf.core.pobjects package for more information on what types of data are available.
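
As a rough illustration of the approach described above, the following minimal sketch reads the information dictionary of a document through the Document class. It assumes the ICEpdf core library is on the classpath and that Document.getInfo() returns a PInfo object with the accessors used here; method availability and the exceptions thrown may differ between ICEpdf versions, so check the API documentation for your release. The class name MetaDataExtractor, the helper orPlaceholder, and the default file name example.pdf are hypothetical and exist only for this example.

```java
import org.icepdf.core.pobjects.Document;
import org.icepdf.core.pobjects.PInfo;

public class MetaDataExtractor {

    // Hypothetical helper: substitute a placeholder for entries
    // that the PDF does not define.
    static String orPlaceholder(Object value) {
        return value == null ? "(not set)" : value.toString();
    }

    public static void main(String[] args) throws Exception {
        // "example.pdf" is a placeholder; pass a real path as the first argument.
        String filePath = args.length > 0 ? args[0] : "example.pdf";

        Document document = new Document();
        try {
            // Open the file; this is assumed to throw if the file is
            // missing, unreadable, or encrypted without a valid password.
            document.setFile(filePath);

            System.out.println("Pages:    " + document.getNumberOfPages());

            // The information dictionary is optional in a PDF, so guard against null.
            PInfo info = document.getInfo();
            if (info == null) {
                System.out.println("No document information dictionary found.");
                return;
            }

            System.out.println("Title:    " + orPlaceholder(info.getTitle()));
            System.out.println("Author:   " + orPlaceholder(info.getAuthor()));
            System.out.println("Subject:  " + orPlaceholder(info.getSubject()));
            System.out.println("Keywords: " + orPlaceholder(info.getKeywords()));
            System.out.println("Creator:  " + orPlaceholder(info.getCreator()));
            System.out.println("Producer: " + orPlaceholder(info.getProducer()));
            System.out.println("Created:  " + orPlaceholder(info.getCreationDate()));
            System.out.println("Modified: " + orPlaceholder(info.getModDate()));
        } finally {
            // Release the resources held by the document.
            document.dispose();
        }
    }
}
```

To try the sketch, compile and run it with the ICEpdf core jar on the classpath and pass the path to a PDF file as the first argument. Entries that the file does not define are printed as "(not set)".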
