GroupDocs.Parser Cloud for Java is a wrapper around the GroupDocs.Parser Cloud REST API for document data extraction, which parses over 50 document types. Parsing documents with predefined templates is one of the most valuable features of the API: you define a template once and use it to extract data from invoices and other documents that share a typical layout. The API also provides methods to extract text, images, and metadata from most common file formats. Along with regular documents, you can use these features in your Java applications on password-protected files and on containers such as ZIP archives, OST/PST mail data files, and PDF portfolios, as well as on eBooks and markup documents.

An overview of the features supported by the document parser Cloud API.

Features

Parse documents
Parse document by template
Extract text
Extract images
Extract metadata

Extract from Documents

Emails
eBooks
PDF files
PDF Portfolio
Files within ZIP archives
Text and Markup documents
Microsoft & OpenDocument

Document parser Cloud API supported formats.

Microsoft Office Formats

Word Processing: DOC, DOCX, DOCM, DOT, DOTX, DOTM
Spreadsheets: XLS, XLSX, XLSM, XLSB, XLT, XLTX, XLTM, XLA, XLAM
Presentations: PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT, POTX, POTM
OneNote: ONE

OpenDocument & Other Formats

OpenDocument Formats: ODT, OTT, ODS, OTS, ODP, OTP
Text: TXT, RTF
Markup: HTML, XHTML, MHTML, MD, XML
Portable Formats: PDF
Email: PST, OST, EML, EMLX, MSG
eBook Formats: EPUB, FB2, CHM
Other: NUMBERS, CSV, ZIP
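
Before sending a file to the API it can be handy to check its extension against the list above. The following is a minimal sketch of such a check; `SupportedFormats`, `categoryOf`, and `isSupported` are hypothetical names introduced here for illustration and are not part of the GroupDocs.Parser Cloud SDK. The extension table is copied from the list above and may lag behind the formats the service actually accepts.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper (not part of the GroupDocs SDK): looks up a file's
// extension in the supported-format list given on this page.
final class SupportedFormats {
    private static final Map<String, List<String>> FORMATS = new LinkedHashMap<>();

    static {
        FORMATS.put("Word Processing", List.of("DOC", "DOCX", "DOCM", "DOT", "DOTX", "DOTM"));
        FORMATS.put("Spreadsheets", List.of("XLS", "XLSX", "XLSM", "XLSB", "XLT", "XLTX", "XLTM", "XLA", "XLAM"));
        FORMATS.put("Presentations", List.of("PPT", "PPTX", "PPTM", "PPS", "PPSX", "PPSM", "POT", "POTX", "POTM"));
        FORMATS.put("OneNote", List.of("ONE"));
        FORMATS.put("OpenDocument Formats", List.of("ODT", "OTT", "ODS", "OTS", "ODP", "OTP"));
        FORMATS.put("Text", List.of("TXT", "RTF"));
        FORMATS.put("Markup", List.of("HTML", "XHTML", "MHTML", "MD", "XML"));
        FORMATS.put("Portable Formats", List.of("PDF"));
        FORMATS.put("Email", List.of("PST", "OST", "EML", "EMLX", "MSG"));
        FORMATS.put("eBook Formats", List.of("EPUB", "FB2", "CHM"));
        FORMATS.put("Other", List.of("NUMBERS", "CSV", "ZIP"));
    }

    private SupportedFormats() {
    }

    // Returns the format category for a file name, or empty if the
    // extension is missing or not in the table. Matching is case-insensitive.
    static Optional<String> categoryOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        String extension = fileName.substring(dot + 1).toUpperCase(Locale.ROOT);
        for (Map.Entry<String, List<String>> entry : FORMATS.entrySet()) {
            if (entry.getValue().contains(extension)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    static boolean isSupported(String fileName) {
        return categoryOf(fileName).isPresent();
    }

    public static void main(String[] args) {
        System.out.println(categoryOf("pdf/PDF with attachments.pdf"));
        System.out.println(isSupported("photo.jpeg"));
    }
}
```

A check like this only filters by file name. The service still decides from the file content whether a document can be parsed.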

Supported Operating Systems and Frameworks

Frequently Asked Questions

How can I create my own Java application that parses files online?

Check our Cloud SDK for Java on GitHub if you are looking for source code to parse files in the cloud.

Can I try GroupDocs.Parser REST APIs on Java for free?

You can try GroupDocs.Parser Low-Code Java APIs without any limitations.

I do not want to upload my confidential files anywhere. What are my options?

GroupDocs.Parser Cloud is also available as a Docker image, which you can use to self-host the service. Alternatively, you can build your own services with the GroupDocs.Parser High-code APIs, which currently drive both our Free Apps and REST APIs.

Document Parser REST API Features

Parse PDF, word processing documents, spreadsheets & presentations

Parse Microsoft Word, Excel, PowerPoint and OpenDocument template files

Parse macro-enabled documents & templates for MS Word, Excel & PowerPoint

Extract text content from the whole document

Extract text and images from specific pages

Extract formatted text by setting extraction mode

Extract text from the documents in ZIP archives or Emails

Retrieve documents from Emails, PDF portfolios & MS Outlook storages

Get document page count and other information

Extract Images from a Document

GroupDocs.Parser Cloud API provides several ways to extract images from documents. You can extract all the images from a whole document, or only the images on specific pages by setting a page range. The following example shows how to extract images from a document inside a container.

Extract images from a PDF Portfolio - Java

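// Package names follow the SDK's published samples; adjust them if your SDK version differs
import com.groupdocs.cloud.parser.api.*;
import com.groupdocs.cloud.parser.client.*;
import com.groupdocs.cloud.parser.model.*;
import com.groupdocs.cloud.parser.model.requests.*;

// Get your AppKey and AppSID from https://dashboard.groupdocs.cloud
String myAppKey = "";
String myAppSid = "";

Configuration configuration = new Configuration(myAppSid, myAppKey);
ParseApi apiInstance = new ParseApi(configuration);

// The container file: a password-protected PDF portfolio
FileInfo fileInfo = new FileInfo();
fileInfo.setFilePath("pdf/PDF with attachments.pdf");
fileInfo.setPassword("password");

// The document inside the container to extract images from
ContainerItemInfo containerItemInfo = new ContainerItemInfo();
containerItemInfo.setRelativePath("template-document.pdf");

ImagesOptions options = new ImagesOptions();
options.setFileInfo(fileInfo);
options.setContainerItemInfo(containerItemInfo);

// Send the request and receive the image extraction result
ImagesRequest request = new ImagesRequest(options);
ImagesResult response = apiInstance.images(request);

// For complete examples and data files, visit https://github.com/groupdocs-parser-cloud/groupdocs-parser-cloud-java-samples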
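
Because the Java SDK is a wrapper around the REST API, the same extraction can in principle be expressed as a plain HTTP request. The sketch below only builds such a request with the JDK's `java.net.http` classes (Java 11+) and does not send it. Everything specific to the service here is an assumption rather than something stated on this page: the endpoint URL, the JSON field names (`FileInfo`, `FilePath`, `Password`, `ContainerItemInfo`, `RelativePath`), and the bearer-token header. `ImagesRequestSketch` and its methods are hypothetical names used for illustration; check the REST API reference before relying on any of them.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch (not SDK code): builds, but does not send, a raw
// REST request equivalent to the SDK call shown above.
final class ImagesRequestSketch {
    // ASSUMPTION: endpoint URL; verify against the REST API reference.
    static final String IMAGES_ENDPOINT = "https://api.groupdocs.cloud/v1.0/parser/images";

    private ImagesRequestSketch() {
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    // Control characters are not handled; use a JSON library in real code.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // ASSUMPTION: JSON field names mirror the SDK model classes used above.
    static String buildBody(String filePath, String password, String relativePath) {
        return "{"
                + "\"FileInfo\":{\"FilePath\":" + quote(filePath)
                + ",\"Password\":" + quote(password) + "},"
                + "\"ContainerItemInfo\":{\"RelativePath\":" + quote(relativePath) + "}"
                + "}";
    }

    // ASSUMPTION: the service accepts a bearer access token in the Authorization header.
    static HttpRequest buildRequest(String accessToken, String body) {
        return HttpRequest.newBuilder(URI.create(IMAGES_ENDPOINT))
                .header("Content-Type", "application/json")
                .header("Accept", "application/json")
                .header("Authorization", "Bearer " + accessToken)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = buildBody("pdf/PDF with attachments.pdf", "password", "template-document.pdf");
        HttpRequest request = buildRequest("<access token>", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

In practice the SDK call is the simpler route, since it handles authentication and serialization for you. A raw request like this is mainly useful for debugging or for environments where the SDK is not available.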

Parse & Extract with REST API & Java SDK

GroupDocs.Parser Cloud SDK Java

GroupDocs.Parser Cloud also offers individual document parsing SDKs for other languages as listed below:

GroupDocs.Parser Cloud cURL
GroupDocs.Parser Cloud SDK .NET
GroupDocs.Parser Cloud SDK PHP
GroupDocs.Parser Cloud SDK Python
GroupDocs.Parser Cloud SDK Ruby
GroupDocs.Parser Cloud SDK Node.js
