Extract Text by Expression
Extracts text from a PDF using a regular expression.
- Method: POST
- Endpoint: /api/v2/ExtractTextByExpression
Parameters
- File Content: base64, Required
The content of the input file, base64-encoded.
- File Name: string, Required
Source PDF file name with .pdf extension
- Expression: string, Required
Example 1: A 4-page PDF contains the strings “US” and “%” several times. An expression that matches “US” or “%” extracts every occurrence of either string from the input PDF file.
Regular Expression - %: #%: [^$.|?*+()
- Page Sequence: string, Required
- Specify page indices as comma-separated values or ranges to process (e.g. “0, 1, 2-” or “1, 2, 3-7”).
- If not specified, the default configuration processes all pages. The input must be in string format.
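Before sending an expression to the service, it can help to preview what it matches on a sample string. The sketch below uses Python's standard `re` module and assumes the “US” or “%” case from Example 1 is written as the alternation `US|%`. The helper name `preview_matches` and the sample sentence are illustrative only, and the service's regular expression engine may differ from Python's in edge cases.

```python
import re
from typing import List


def preview_matches(text: str, expression: str) -> List[str]:
    """Return every non-overlapping match of `expression` in `text`, in order."""
    return [m.group(0) for m in re.finditer(expression, text)]


# Hypothetical sample text standing in for the text of a PDF page.
sample = "Revenue in the US grew 12% while non-US sales fell 3%."

# "US|%" matches either the literal string "US" or a percent sign.
print(preview_matches(sample, "US|%"))  # ['US', '%', 'US', '%']
```

To match characters that have a special meaning in regular expressions (such as `$`, `.`, `|`, `?`, `*`, `+`, `(`, `)`) literally, `re.escape` produces an escaped pattern in Python; whether the service needs the same escaping is an assumption to verify.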
Output
- Text List: array of string, Required
The list of text strings extracted from the PDF.
Header
Content-Type: application/json
Authorization: Please copy the key from the link.
Payload
{
"docContent": "Please put PDF base64 content",
"docName": "output.pdf",
"expression": "%",
"pageSequence": "1"
}
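The payload and headers above could be assembled and sent roughly as follows. This is a minimal sketch using only the Python standard library. The base URL `https://api.pdf4me.com` is an assumption (this page gives only the endpoint path), the function names are hypothetical, and the response is returned as parsed JSON without assuming its exact field names.

```python
import base64
import json
import urllib.request

# Assumption: replace with the base URL that applies to your PDF4me account.
BASE_URL = "https://api.pdf4me.com"
ENDPOINT = "/api/v2/ExtractTextByExpression"


def build_payload(pdf_bytes: bytes, file_name: str,
                  expression: str, page_sequence: str) -> dict:
    """Build the JSON body using the field names shown in the Payload section."""
    return {
        "docContent": base64.b64encode(pdf_bytes).decode("ascii"),
        "docName": file_name,
        "expression": expression,
        "pageSequence": page_sequence,  # must be a string, e.g. "1" or "1, 2, 3-7"
    }


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Create the POST request with the two headers listed in the Header section."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,
        },
        method="POST",
    )


def extract_text(pdf_path: str, expression: str,
                 page_sequence: str, api_key: str):
    """Read a local PDF, call the endpoint, and return the decoded JSON response."""
    with open(pdf_path, "rb") as fh:
        payload = build_payload(fh.read(), pdf_path, expression, page_sequence)
    with urllib.request.urlopen(build_request(payload, api_key)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `extract_text("input.pdf", "%", "1", "<your API key>")` would then send the same request as the sample payload, with the real file content in place of the placeholder.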
PDF4me API samples
- CSharp(C#)
- Java
- JavaScript
- Python
- Salesforce
- n8n
- Google Script
- AWS Lambda