Parse & Extract with REST API & Java SDK

Parse documents such as invoices, receipts or financial tables to extract text, images and metadata from 40+ popular document formats.

GroupDocs.Parser Cloud SDK for Java

GroupDocs.Parser Cloud for Java is a wrapper around the document data extraction REST API that parses over 50 document types. Parsing documents with predefined templates is one of the most valuable features of the GroupDocs.Parser Cloud API: you define a template once and then use it to extract data from invoices or other documents that share a typical layout.

The API provides methods to extract text, images and metadata from most common file formats. Along with regular documents, you can use the supported features in your Java applications on password-protected files and on containers such as ZIP archives, OST/PST mail data files, eBooks, markup files, and PDF portfolios.


Document Parser REST API Features

Parse PDF, word processing documents, spreadsheets & presentations

Parse Microsoft Word, Excel, PowerPoint and OpenDocument template files

Parse macro-enabled documents & templates for MS Word, Excel & PowerPoint

Extract text content from the whole document

Extract text and images from specific pages

Extract formatted text by setting the extraction mode

Extract text from documents inside ZIP archives or emails

Retrieve documents from emails, PDF portfolios & MS Outlook storage files

Get document page count and other information

Extract Images from a Document

The GroupDocs.Parser Cloud API provides several ways to extract images from documents. You can extract all the images from the whole document, or extract images from specific pages by setting a page range. The following example shows how to extract images from a document inside a container.

Extract images from a PDF Portfolio - Java


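// Imports assume the package layout used in the SDK's published samples
import com.groupdocs.cloud.parser.api.*;
import com.groupdocs.cloud.parser.client.*;
import com.groupdocs.cloud.parser.model.*;
import com.groupdocs.cloud.parser.model.requests.*;

// Get AppKey and AppSID from https://dashboard.groupdocs.cloud
String MyAppKey = "";
String MyAppSid = "";

// Authenticate and create the parsing API client
Configuration configuration = new Configuration(MyAppSid, MyAppKey);
ParseApi apiInstance = new ParseApi(configuration);

// The container file: a password-protected PDF portfolio
FileInfo fileInfo = new FileInfo();
fileInfo.setFilePath("pdf/PDF with attachements.pdf");
fileInfo.setPassword("password");

// The document inside the portfolio to extract images from
ContainerItemInfo containerItemInfo = new ContainerItemInfo();
containerItemInfo.setRelativePath("template-document.pdf");

ImagesOptions options = new ImagesOptions();
options.setFileInfo(fileInfo);
options.setContainerItemInfo(containerItemInfo);

// Send the request; the result describes the extracted images
ImagesRequest request = new ImagesRequest(options);
ImagesResult response = apiInstance.images(request);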

// For complete examples and data files, visit https://github.com/groupdocs-parser-cloud/groupdocs-parser-cloud-java-samples
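
The page-range variant mentioned above can be pictured at the REST level. The following is only a minimal sketch of what a request body for extracting images from a range of pages might look like, built with plain Java and no SDK. The field names `FileInfo`, `FilePath`, `StartPageNumber` and `CountPagesToExtract` are assumptions that mirror the SDK's option names, and the class `ImagesRequestSketch` and the path `pdf/sample.pdf` are hypothetical. Check the API reference for the exact schema and for whether page numbering starts at 0 or 1.

```java
public class ImagesRequestSketch {

    // Escapes backslashes and double quotes so the value is safe inside a JSON string.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a JSON body that asks for images from a range of pages.
    // Field names are assumptions, not confirmed against the API schema.
    static String buildBody(String filePath, int startPage, int pageCount) {
        if (startPage < 0 || pageCount < 1) {
            throw new IllegalArgumentException("invalid page range");
        }
        return "{"
            + "\"FileInfo\":{\"FilePath\":" + quote(filePath) + "},"
            + "\"StartPageNumber\":" + startPage + ","
            + "\"CountPagesToExtract\":" + pageCount
            + "}";
    }

    public static void main(String[] args) {
        // Hypothetical document path; prints the body for two pages starting at page 1.
        System.out.println(buildBody("pdf/sample.pdf", 1, 2));
    }
}
```

In the SDK itself the same idea would be expressed through setters on the options object rather than hand-built JSON; the sketch only illustrates which pieces of information a page-range request carries.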

Support and Learning Resources

 
  

GroupDocs.Parser Cloud also offers individual document parser SDKs for other popular languages as listed below: