GroupDocs.Parser Cloud for Python lets you parse data from over 50 popular document types. Text extraction, image extraction, and parsing data by template are the most valuable features of GroupDocs.Parser, and all of them are available in our Cloud REST API service. With our SDK you can solve many document processing automation tasks and extract data simply by applying a predefined template. Along with regular documents, you can use the supported features on password-protected files and on containers such as ZIP archives, OST/PST mail data files, eBooks, markup documents, and PDF portfolios in your Python applications.

An overview of the features supported by the document parser Cloud API.

Features

Parse documents
Parse document by template
Extract text
Extract images
Extract metadata

Extract from Documents

Emails
eBooks
PDF files
PDF Portfolio
Files within ZIP archives
Text and Markup documents
Microsoft & OpenDocument

Document parser Cloud API supported formats.

Microsoft Office Formats

Word Processing: DOC, DOCX, DOCM, DOT, DOTX, DOTM
Spreadsheets: XLS, XLSX, XLSM, XLSB, XLT, XLTX, XLTM, XLA, XLAM
Presentations: PPT, PPTX, PPTM, PPS, PPSX, PPSM, POT, POTX, POTM
OneNote: ONE

OpenDocument & Other Formats

OpenDocument Formats: ODT, OTT, ODS, OTS, ODP, OTP
Text: TXT, RTF
Markup: HTML, XHTML, MHTML, MD, XML
Portable Formats: PDF
Email: PST, OST, EML, EMLX, MSG
eBook Formats: EPUB, FB2, CHM
Other: NUMBERS, CSV, ZIP
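
As a rough illustration, the format lists above can be turned into a local pre-check that runs before a file is sent to the service. This is only a sketch and not part of the SDK: `FORMAT_FAMILIES`, `EXTENSION_TO_FAMILY`, and `format_family` are hypothetical names, and the extensions are copied from the lists above.

```python
from pathlib import PurePosixPath

# Extensions copied from the supported-format lists above.
# The family names and this table are assumptions made for the sketch.
FORMAT_FAMILIES = {
    "Word Processing": "DOC DOCX DOCM DOT DOTX DOTM",
    "Spreadsheets": "XLS XLSX XLSM XLSB XLT XLTX XLTM XLA XLAM",
    "Presentations": "PPT PPTX PPTM PPS PPSX PPSM POT POTX POTM",
    "OneNote": "ONE",
    "OpenDocument": "ODT OTT ODS OTS ODP OTP",
    "Text": "TXT RTF",
    "Markup": "HTML XHTML MHTML MD XML",
    "Portable": "PDF",
    "Email": "PST OST EML EMLX MSG",
    "eBook": "EPUB FB2 CHM",
    "Other": "NUMBERS CSV ZIP",
}

# Reverse lookup: lowercase extension -> family name.
EXTENSION_TO_FAMILY = {
    ext.lower(): family
    for family, exts in FORMAT_FAMILIES.items()
    for ext in exts.split()
}


def format_family(path):
    """Return the format family for a file path, or None if the extension is not listed."""
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    return EXTENSION_TO_FAMILY.get(ext)
```

A call such as `format_family("words-processing/docx/companies.docx")` looks only at the file extension, so it says nothing about whether the file content is actually valid.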

Supported Operating Systems and Frameworks

Frequently Asked Questions

How can I create my own Python application that parses files online?

Check our Cloud SDKs for Python on GitHub if you are looking for the source code to parse files in the Cloud.

Can I try GroupDocs.Parser REST APIs on Python for free?

You can try GroupDocs.Parser Low-Code Python APIs without any limitations.

I do not want to upload my confidential files anywhere. What are my options?

GroupDocs.Parser Cloud is also available as a Docker image that you can use to self-host the service. Alternatively, you can build your own services using the GroupDocs.Parser High-code APIs, which currently drive both our Free Apps and REST APIs.

Document Parser REST API Features

Parse PDF, word processing documents, spreadsheets & presentations

Parse Microsoft Word, Excel, PowerPoint and OpenDocument template files

Parse Macro-Enabled document & templates for MS Word, Excel & PowerPoint

Extract text content from the whole document

Extract text and images from specific pages

Extract formatted text by setting extraction mode

Extract text from the documents in ZIP archives or Emails

Retrieve documents from Emails, PDF portfolios & MS Outlook storages

Get document page count and other information

Extract Data from a Document applying a template

GroupDocs.Parser Cloud API lets you extract data from documents by template. You can define a template for your typical documents, such as invoices, receipts, and reports, and then apply it to extract data for further processing. The Python SDK example below shows how to extract data by template.
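
Before the SDK call, the idea of a template can be pictured with a small local sketch. It is an illustration only: the dictionary-of-regular-expressions "template", the `INVOICE_TEMPLATE` and `extract_fields` names, and the sample text are all hypothetical, and this is not the template format the Cloud API uses.

```python
import re

# Hypothetical local "template": field name -> regular expression with one capture group.
# It mimics the idea of template-based extraction, not the Cloud API's template format.
INVOICE_TEMPLATE = {
    "invoice_number": r"Invoice\s+No\.?\s*:\s*(\S+)",
    "total": r"Total\s*:\s*([0-9]+(?:\.[0-9]{2})?)",
}


def extract_fields(text, template):
    """Apply each field pattern to the text; fields that do not match map to None."""
    result = {}
    for name, pattern in template.items():
        match = re.search(pattern, text)
        result[name] = match.group(1) if match else None
    return result


# Made-up sample text standing in for the extracted text of an invoice.
sample = "Invoice No: INV-0042\nDate: 2024-01-15\nTotal: 199.50\n"
fields = extract_fields(sample, INVOICE_TEMPLATE)
```

The point of the design is that the template is defined once and reused for every document of the same kind, so the extraction code itself does not change between documents.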

Extract data from documents by template - Python

```python
# For complete examples and data files, see
# https://github.com/groupdocs-parser-cloud/groupdocs-parser-cloud-python-samples
import groupdocs_parser_cloud

# Get your App SID and App Key from https://dashboard.groupdocs.cloud
app_sid = "XXXX-XXXX-XXXX-XXXX"
app_key = "XXXXXXXXXXXXXXXX"

# Create the parsing API client from the credentials
parse_api = groupdocs_parser_cloud.ParseApi.from_keys(app_sid, app_key)

# Set the path of the document to parse and the path of the template to apply
options = groupdocs_parser_cloud.ParseOptions()
options.file_info = groupdocs_parser_cloud.FileInfo()
options.file_info.file_path = "words-processing/docx/companies.docx"
options.template_path = "templates/companies.json"

# Send the request; the extracted data is returned in the result
request = groupdocs_parser_cloud.ParseRequest(options)
result = parse_api.parse(request)
```
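
In a real application you may prefer not to hard-code the credentials. The sketch below wraps the same SDK calls as the sample above in a function and reads the App SID and App Key from environment variables. The variable names `GROUPDOCS_APP_SID` and `GROUPDOCS_APP_KEY`, and the helper names `load_credentials` and `parse_by_template`, are assumptions made for this example and are not defined by the SDK.

```python
import os


def load_credentials(env=None):
    """Read the App SID and App Key from environment variables.

    GROUPDOCS_APP_SID and GROUPDOCS_APP_KEY are hypothetical names chosen for this sketch.
    """
    env = os.environ if env is None else env
    app_sid = env.get("GROUPDOCS_APP_SID")
    app_key = env.get("GROUPDOCS_APP_KEY")
    if not app_sid or not app_key:
        raise RuntimeError("Set GROUPDOCS_APP_SID and GROUPDOCS_APP_KEY first")
    return app_sid, app_key


def parse_by_template(file_path, template_path, env=None):
    """Parse a document with a template, using the same SDK calls as the sample above."""
    # Imported here so that load_credentials can be used without the SDK installed.
    import groupdocs_parser_cloud

    app_sid, app_key = load_credentials(env)
    parse_api = groupdocs_parser_cloud.ParseApi.from_keys(app_sid, app_key)

    options = groupdocs_parser_cloud.ParseOptions()
    options.file_info = groupdocs_parser_cloud.FileInfo()
    options.file_info.file_path = file_path
    options.template_path = template_path

    return parse_api.parse(groupdocs_parser_cloud.ParseRequest(options))
```

With the two variables set, `parse_by_template("words-processing/docx/companies.docx", "templates/companies.json")` would send the same request as the sample above.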

Document processing made easy with this document parsing REST API and Python SDK

GroupDocs.Parser Cloud SDK for Python

Extract images from Documents using GroupDocs.Parser Free App

Support and Learning Resources

Learning Resources

Product Support

GroupDocs.Parser Cloud also offers individual document parsing SDKs for other languages as listed below:

GroupDocs.Parser Cloud for cURL
GroupDocs.Parser Cloud SDK for .NET
GroupDocs.Parser Cloud SDK for Java
GroupDocs.Parser Cloud SDK for PHP
GroupDocs.Parser Cloud SDK for Ruby
GroupDocs.Parser Cloud SDK for Node.js
