The Document class provides functionality for rendering PDF content into other formats via a Java2D graphics context, which makes converting PDF content a relatively simple process with powerful results. ICEpdf also supports Java headless mode when rendering PDF content, which can be useful for server-side solutions.
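As a rough illustration of headless mode, the sketch below enables it through the standard java.awt.headless system property and then draws into an off-screen BufferedImage, which needs no display. It uses only JDK classes; the class name HeadlessCheck and the method renderOffscreen are hypothetical and are not part of ICEpdf.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;

public class HeadlessCheck {

    /** Fills an off-screen image with white; no display is required. */
    static BufferedImage renderOffscreen(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose(); // release the graphics context
        }
        return image;
    }

    public static void main(String[] args) {
        // Set before any AWT class is initialized; equivalent to passing
        // -Djava.awt.headless=true on the command line.
        System.setProperty("java.awt.headless", "true");
        System.out.println("headless: " + GraphicsEnvironment.isHeadless());
        BufferedImage image = renderOffscreen(200, 100);
        System.out.println("rendered " + image.getWidth() + "x" + image.getHeight());
    }
}
```

On a server, the property is usually supplied on the command line so that it is in place before any graphics code runs.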
An example of how to extract PDF document content to SVG is available in the SVG class found in the package org.icepdf.ri.util. The following is an example of how to save page captures in PNG format.
Building a Page Capturing Class
Create a file called PageCapture.java similar to the following:
import org.icepdf.core.exceptions.PDFException;
import org.icepdf.core.exceptions.PDFSecurityException;
import org.icepdf.core.pobjects.Document;
import org.icepdf.core.pobjects.Page;
import org.icepdf.core.util.GraphicsRenderingHints;
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
public class PageCapture {
    public static void main(String[] args) {
        String filePath = args[0];
        Document document = new Document();
        try {
            document.setFile(filePath);
        } catch (PDFException ex) {
            System.out.println("Error parsing PDF document " + ex);
        } catch (PDFSecurityException ex) {
            System.out.println("Error encryption not supported " + ex);
        } catch (FileNotFoundException ex) {
            System.out.println("Error file not found " + ex);
        } catch (IOException ex) {
            System.out.println("Error IOException " + ex);
        }
        float scale = 1.0f;
        float rotation = 0f;
        for (int i = 0; i < document.getNumberOfPages(); i++) {
            BufferedImage image = (BufferedImage) document.getPageImage(
                    i, GraphicsRenderingHints.PRINT, Page.BOUNDARY_CROPBOX, rotation, scale);
            RenderedImage rendImage = image;
            try {
                System.out.println(" capturing page " + i);
                File file = new File("imageCapture1_" + i + ".png");
                ImageIO.write(rendImage, "png", file);
            } catch (IOException e) {
                e.printStackTrace();
            }
            image.flush();
        }
        document.dispose();
    }
}
The Import Statements
The org.icepdf.core.* packages are always required. The java.* and javax.* packages are necessary for saving the page captures as image files.
The Static Main Method
The following lines create a new Document object and open the PDF document at the file path given as the first command-line argument.
Document document = new Document();
try {
    document.setFile(filePath);
} catch (PDFException ex) {
    System.out.println("Error parsing PDF document " + ex);
} catch (PDFSecurityException ex) {
    System.out.println("Error encryption not supported " + ex);
} catch (FileNotFoundException ex) {
    System.out.println("Error file not found " + ex);
} catch (IOException ex) {
    System.out.println("Error IOException " + ex);
}
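This loads the PDF document and catches any exception thrown in the process. Note that the example only reports the error and then continues, so a real application should stop at this point when loading fails.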
Before page content can be captured, it is necessary to set the zoom and rotation used to render the page's content. For this example, we are using a scale factor of 100% and will be using the default rotation of zero degrees.
float scale = 1.0f;
float rotation = 0f;
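The scale factor can also be derived from a target resolution. The sketch below assumes that a scale of 1.0 renders one PDF point (1/72 inch in default PDF user space) as one pixel; that mapping is an assumption about the renderer and is not stated in this guide. The names PageScale, scaleForDpi, and pixels are hypothetical helpers, not ICEpdf API.

```java
public class PageScale {

    /** Default PDF user space units (points) per inch. */
    static final float POINTS_PER_INCH = 72f;

    /** Scale factor that maps 72 points per inch to the requested DPI. */
    static float scaleForDpi(float dpi) {
        if (dpi <= 0) {
            throw new IllegalArgumentException("dpi must be positive");
        }
        return dpi / POINTS_PER_INCH;
    }

    /** Length in pixels of a page side given in points, rounded to the nearest pixel. */
    static int pixels(float points, float scale) {
        return Math.round(points * scale);
    }

    public static void main(String[] args) {
        float scale = scaleForDpi(144f); // hypothetical target resolution
        // A US Letter page measures 612 x 792 points.
        System.out.println("scale=" + scale
                + " size=" + pixels(612f, scale) + "x" + pixels(792f, scale));
    }
}
```

The resulting value could be passed as the scale argument when capturing pages, provided the assumption above holds for the ICEpdf version in use.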
The page content can now be saved to a file.
for (int i = 0; i < document.getNumberOfPages(); i++) {
    BufferedImage image = (BufferedImage) document.getPageImage(
            i, GraphicsRenderingHints.PRINT, Page.BOUNDARY_CROPBOX, rotation, scale);
    RenderedImage rendImage = image;
    try {
        System.out.println(" capturing page " + i);
        File file = new File("imageCapture1_" + i + ".png");
        ImageIO.write(rendImage, "png", file);
    } catch (IOException e) {
        e.printStackTrace();
    }
    image.flush();
}
The code iterates through all the pages in the document and gets an image rendering of each page, which is then written to a file. The last step, performed after the loop in the full listing, is to free the resources used by the Document class during rendering by calling the dispose() method.
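The write step can be made a little more defensive. ImageIO.write returns false when no suitable writer is found for the requested format, a result the loop above ignores. The following sketch checks that return value and uses zero-padded file names so that captures sort in page order. It relies only on JDK classes; the class CaptureWriter, its methods, and the naming scheme are hypothetical, and a blank image stands in for a rendered page.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Locale;

public class CaptureWriter {

    /** Builds a zero-padded name such as capture_0007.png (hypothetical naming scheme). */
    static String fileNameFor(int pageIndex) {
        return String.format(Locale.ROOT, "capture_%04d.png", pageIndex);
    }

    /** Writes the image as PNG and fails loudly if no PNG writer is available. */
    static File writePng(BufferedImage image, File directory, int pageIndex) throws IOException {
        File file = new File(directory, fileNameFor(pageIndex));
        // ImageIO.write returns false when no suitable writer is found.
        if (!ImageIO.write(image, "png", file)) {
            throw new IOException("No PNG writer available for " + file);
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        // A blank image stands in for a rendered page.
        BufferedImage page = new BufferedImage(612, 792, BufferedImage.TYPE_INT_RGB);
        File directory = new File(System.getProperty("java.io.tmpdir"));
        File written = writePng(page, directory, 7);
        System.out.println("wrote " + written.getAbsolutePath());
    }
}
```

In the page loop, a helper like writePng would replace the direct ImageIO.write call, with the page index passed through.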
You now have a simple class that can save PDF page captures to disk.