API V2

Extract Text by Expression

Extracts text from a PDF using a regular expression.

  • Method: POST
  • Endpoint: /api/v2/ExtractTextByExpression

Parameters

File Content (base64, Required)

The content of the input file

File Name (string, Required)

Source PDF file name with .pdf extension

Expression (string, Required)

Example 1: The input is a 4-page PDF in which the word “US” and the character “%” each appear several times. The expression extracts every occurrence of “US” or “%” from the input PDF file.
Regular Expression - %: #%: [^$.|?*+()
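
To illustrate the kind of match described in Example 1, the sketch below runs an alternation pattern over a short string. It uses Python's re module as a stand-in: the regex dialect used by the service is not stated here, and sample_text is invented for the illustration.

```python
import re

# Illustration only: Python's `re` stands in for the service's regex engine,
# whose exact dialect is not stated in this document.
# `sample_text` is a made-up stand-in for text found on a PDF page.
sample_text = "Revenue in US grew 12% while US costs fell 3%."

# Alternation: match either the literal word "US" or a literal "%".
pattern = re.compile(r"US|%")

# findall returns every non-overlapping match, in order of appearance.
matches = pattern.findall(sample_text)
print(matches)
```

Characters such as . | ? * + ( ) have special meaning in most regex dialects, so they usually need a backslash when they should be matched literally.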

Page Sequence (string, Required)
  • Specify page indices as comma-separated values or ranges to process (e.g. “0, 1, 2-” or “1, 2, 3-7”).
  • If not specified, the default configuration processes all pages. The input must be in string format.
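
The page sequence format above can be sketched as a small parser. This is only an illustration of how such a string might expand. The helper name parse_page_sequence and the last_index parameter are hypothetical, and it assumes that an open-ended range such as “2-” runs through the last page.

```python
def parse_page_sequence(spec: str, last_index: int) -> list[int]:
    """Expand a spec such as "1, 2, 3-7" or "0, 1, 2-" into page indices.

    Assumption: an open-ended range like "2-" runs through `last_index`.
    This helper is hypothetical and not part of the PDF4me API.
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text)
            # An empty right-hand side means "through the last page".
            end = int(end_text) if end_text.strip() else last_index
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(parse_page_sequence("1, 2, 3-7", last_index=10))
print(parse_page_sequence("0, 1, 2-", last_index=4))
```

The API itself takes the sequence as a plain string, so a client does not need to expand it before sending.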

Output

Text List (array of string, Required)

The list of text fragments extracted from the PDF that match the expression.

Header
Content-Type: application/json
Authorization: Please copy key from the link.

Payload

{
  "docContent": "Please put PDF base64 content",
  "docName": "output.pdf",
  "expression": "%",
  "pageSequence": "1"
}
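
A minimal sketch of sending the payload above from Python, using only the standard library. BASE_URL is an assumption (the host is not given on this page), and passing the key directly as the Authorization header value is also an assumption. The function names are hypothetical. The response is returned as parsed JSON without assuming its field names.

```python
import base64
import json
import urllib.request

# Assumption: the host is not stated in this document; replace as needed.
BASE_URL = "https://api.pdf4me.com"
ENDPOINT = "/api/v2/ExtractTextByExpression"


def build_payload(pdf_bytes: bytes, doc_name: str,
                  expression: str, page_sequence: str) -> dict:
    """Build the JSON body using the field names from the Payload section."""
    return {
        "docContent": base64.b64encode(pdf_bytes).decode("ascii"),
        "docName": doc_name,
        "expression": expression,
        "pageSequence": page_sequence,
    }


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare the POST request with the two documented headers."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as-is in the Authorization header.
            "Authorization": api_key,
        },
        method="POST",
    )


def extract_text_by_expression(pdf_path: str, expression: str,
                               page_sequence: str, api_key: str):
    """Send the request and return the decoded JSON response."""
    with open(pdf_path, "rb") as handle:
        payload = build_payload(handle.read(), pdf_path, expression, page_sequence)
    with urllib.request.urlopen(build_request(payload, api_key)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Usage would look like extract_text_by_expression("sample.pdf", "%", "1", api_key="YOUR_KEY"), where the file name and key are placeholders.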

PDF4me API samples

CSharp(C#)
Java
JavaScript
Python
Salesforce
n8n
Google Script
AWS Lambda