
Images can be extracted from any PDF document whose permissions allow content extraction.

{note:title=Note}If a document is encrypted, the document permissions should be checked to make sure that content extraction is allowed.{note}
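To illustrate what such a permission check looks at, here is a minimal sketch that decodes the extraction flag from a raw permissions value. It assumes the PDF standard security handler, where bit 5 of the encryption dictionary's /P integer (counted from 1 at the low-order bit) allows copying or extracting text and graphics. The class and method names are hypothetical and are not part of the library described here; in practice the library's own permission API should be preferred.

```java
public class ExtractionPermission {

    // Bit 5 of the /P flags, counted from 1 at the low-order bit (assumption:
    // standard security handler as defined in the PDF specification).
    private static final int EXTRACT_BIT = 1 << 4;

    /** Returns true when the raw /P value permits copying or extracting content. */
    public static boolean allowsExtraction(int permissionFlags) {
        return (permissionFlags & EXTRACT_BIT) != 0;
    }

    public static void main(String[] args) {
        // Two illustrative /P values, chosen only to show both outcomes.
        System.out.println(allowsExtraction(-44));
        System.out.println(allowsExtraction(-3904));
    }
}
```

A document that is not encrypted has no /P entry, so no check of this kind is needed before extracting its images.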
The following code demonstrates how to extract images from the first page of a PDF document. The Document {{{*}getPageImages(int pgNumber)*}} method returns the images on the given page as a vector of {{{*}Image{*}}} objects; page numbers are zero-based, so the first page is page 0. The code then iterates over the vector and saves each image to disk as a separate JPEG file.

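// Load the file
URL documentURL = new URL("your url");
Document document = new Document();
// Open the PDF at the URL; without this step the Document has no pages to read.
// (setUrl is assumed here to be the library's URL loader; it may throw checked exceptions.)
document.setUrl(documentURL);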

// Get the images for a single page
Enumeration tmpImages = document.getPageImages(0).elements();

// Save the images as JPEGs
int count = 0;
while (tmpImages.hasMoreElements()) {
    Image image = (Image) tmpImages.nextElement();
    // "this" must be an ImageObserver (for example, a Component)
    int width = image.getWidth(this);
    int height = image.getHeight(this);
    // Create a new buffered image to paint to
    BufferedImage bufferedImage = new BufferedImage(
            width, height, BufferedImage.TYPE_INT_RGB);
    Graphics2D g2d = bufferedImage.createGraphics();
    g2d.drawImage(image, 0, 0, width, height, this);
    g2d.dispose();
    try {
        // Save as JPEG
        File file = new File("newimage_" + count + ".jpg");
        ImageIO.write(bufferedImage, "jpg", file);
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Advance the counter so each image gets its own file name
    count++;
}
// Clean up document resources