
\\
Image extraction is possible for all PDF documents.

{note:title=Note}If a document is encrypted, the document permissions should be checked to make sure that content extraction is allowed.{note}
The following code demonstrates how to extract images from the first page of a PDF document. The Document {{{*}getPageImages(int pgNumber){*}}} method returns the images on a page as a vector of {{{*}Image{*}}} objects; the example passes 0 to select the first page. The code then iterates over the vector and saves each image to disk as a separate, numbered image file.

{code}
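// JDK imports used by this snippet; Document comes from the PDF library
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;
import javax.imageio.ImageIO;

// load the file
URL documentURL = new URL("your url");
Document document = new Document();
document.setUrl(documentURL);

// Get the images for a single page (0 is the first page)
Enumeration tmpImages = document.getPageImages(0).elements();

// Save the images as JPEGs
int count = 0;
while (tmpImages.hasMoreElements()) {
    Image image = (Image) tmpImages.nextElement();
    // null is passed as the ImageObserver, assuming the image is already fully loaded
    int width = image.getWidth(null);
    int height = image.getHeight(null);
    // create new buffered image to paint to.
    BufferedImage bufferedImage = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    Graphics2D g2d = bufferedImage.createGraphics();
    g2d.drawImage(image, 0, 0, width, height, null);
    g2d.dispose();
    try {
        // Save as JPEG, one numbered file per image
        File file = new File("newimage_" + count + ".jpg");
        ImageIO.write(bufferedImage, "jpg", file);
    } catch (IOException e) {
        e.printStackTrace();
    }
    // Advance the counter so the next image does not overwrite this file
    count++;
    bufferedImage.flush();
}
// Clean up document resources
document.dispose();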

{code}
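The conversion and save step can also be isolated from the PDF library, which makes it easier to test on its own. The following is a minimal sketch using only JDK classes. The class name {{PageImageSaver}} and the methods {{toBufferedImage}} and {{saveAll}} are hypothetical names chosen for illustration and are not part of any PDF library API. It assumes the images are already fully loaded, so {{null}} can be passed as the {{ImageObserver}}.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

// Hypothetical helper for illustration; not part of any PDF library.
final class PageImageSaver {

    // Copy any java.awt.Image into an RGB BufferedImage that ImageIO can encode as JPEG.
    static BufferedImage toBufferedImage(Image image) {
        int width = image.getWidth(null);
        int height = image.getHeight(null);
        if (width <= 0 || height <= 0) {
            // getWidth/getHeight return -1 while an image is still loading
            throw new IllegalArgumentException("Image dimensions are not available");
        }
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g2d = copy.createGraphics();
        try {
            g2d.drawImage(image, 0, 0, null);
        } finally {
            g2d.dispose();
        }
        return copy;
    }

    // Write each image to its own numbered JPEG file and return the files written.
    static List<File> saveAll(List<? extends Image> images, File directory, String prefix) {
        List<File> written = new ArrayList<>();
        int count = 0;
        for (Image image : images) {
            File file = new File(directory, prefix + count + ".jpg");
            try {
                if (!ImageIO.write(toBufferedImage(image), "jpg", file)) {
                    throw new IOException("No JPEG writer available for " + file);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            written.add(file);
            count++;
        }
        return written;
    }

    public static void main(String[] args) {
        // Stand-in image; in practice the list would hold the images extracted from a page.
        List<Image> images = new ArrayList<>();
        images.add(new BufferedImage(16, 16, BufferedImage.TYPE_INT_ARGB));
        File directory = new File(System.getProperty("java.io.tmpdir"));
        for (File file : saveAll(images, directory, "newimage_")) {
            System.out.println("Wrote " + file.getAbsolutePath());
        }
    }
}
```

With the extraction code above, the vector returned by {{getPageImages(0)}} would need to be copied into a {{List}} of {{Image}} objects before being passed to {{saveAll}}. Returning the list of written files lets the caller confirm that every image produced its own file.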
\\
\\