Extracting Images

Image extraction is possible for all PDF documents.

If a document is encrypted, the document permissions should be checked to make sure that content extraction is allowed.

The following code demonstrates how to extract images from the first page of a PDF document. The Document getPageImages(int pgNumber) method returns the images on a page as a vector of Image objects; page numbers are zero-based, so the first page is page 0. The vector is then iterated and each image is saved to disk as a separate JPEG file.

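import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;
import javax.imageio.ImageIO;
import org.icepdf.core.pobjects.Document;

// load the file; setUrl can throw checked exceptions that the caller must handle
URL documentURL = new URL("your url");
Document document = new Document();
document.setUrl(documentURL);

// Get the images for a single page (page 0 is the first page)
Enumeration tmpImages = document.getPageImages(0).elements();

// Save the images as JPEGs
int count = 0;
while (tmpImages.hasMoreElements()) {
   Image image = (Image) tmpImages.nextElement();
   // create new buffered image to paint to; null is passed as the
   // ImageObserver, assuming the image is already fully loaded
   BufferedImage bufferedImage = new BufferedImage(
      image.getWidth(null), image.getHeight(null), BufferedImage.TYPE_INT_RGB);
   Graphics2D g2d = bufferedImage.createGraphics();
   g2d.drawImage(image, 0, 0, image.getWidth(null), image.getHeight(null), null);
   g2d.dispose();
   try {
      // Save as JPEG
      File file = new File("newimage_" + count + ".jpg");
      ImageIO.write(bufferedImage, "jpg", file);
   } catch (IOException e) {
      e.printStackTrace();
   }
   // release the image's resources before moving on to the next one
   image.flush();
   count++;
}

// Clean up document resources
document.dispose();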
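
The conversion and save step in the loop above can also be factored into a small helper. The following is a minimal sketch that uses only the JDK and does not touch the ICEpdf API; the class name ImageSaver and the methods toBufferedImage and saveAsJpeg are hypothetical names chosen for illustration. It assumes the image is fully loaded, so null is passed as the ImageObserver.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of ICEpdf: converts an Image and writes it as a JPEG.
class ImageSaver {

    // Paints the image onto a new TYPE_INT_RGB BufferedImage (no alpha channel).
    static BufferedImage toBufferedImage(Image image) {
        int width = image.getWidth(null);
        int height = image.getHeight(null);
        if (width <= 0 || height <= 0) {
            // getWidth/getHeight return -1 while the size is not yet known
            throw new IllegalArgumentException("Image size is not available");
        }
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = copy.createGraphics();
        try {
            g2d.drawImage(image, 0, 0, null);
        } finally {
            g2d.dispose();
        }
        return copy;
    }

    // Writes the image to directory/newimage_<index>.jpg and returns the file.
    static File saveAsJpeg(Image image, File directory, int index) throws IOException {
        File file = new File(directory, "newimage_" + index + ".jpg");
        // ImageIO.write returns false when no suitable writer is found
        if (!ImageIO.write(toBufferedImage(image), "jpg", file)) {
            throw new IOException("No JPEG writer available for " + file);
        }
        return file;
    }
}
```

Inside the loop, the body could then be reduced to a call such as ImageSaver.saveAsJpeg(image, new File("."), count). Copying to TYPE_INT_RGB first drops any alpha channel, which the JPEG writer may otherwise refuse to encode, and checking the boolean returned by ImageIO.write surfaces a failed write that would otherwise pass silently.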

© Copyright 2017 ICEsoft Technologies Canada Corp.