POST /v1/pdf/find
Attributes
Attributes are case-sensitive and must be sent inside the JSON body of the POST request. For example:
{ "url": "https://example.com/file1.pdf" }

| Attribute | Type | Required | Default | Description |
|---|---|---|---|---|
url | string | Yes | - | URL of the source file. |
callback | string | No | - | The callback URL (or webhook) that receives the POST data. See Webhooks & Callbacks. Only applicable when async is set to true. |
httpusername | string | No | - | HTTP auth user name if required to access source URL. |
httppassword | string | No | - | HTTP auth password if required to access source URL. |
pages | string | No | all pages | Page indices to process, given as comma-separated values or ranges (e.g. “0, 1, 2-” or “1, 2, 3-7”). The first page has index 0. Prefix a number with “!” to count from the end (e.g. “!0” for the last page). If not specified, all pages are processed. The value must be a string. |
inline | boolean | No | false | Set to true to return results inside the response. Otherwise, the endpoint will return a URL to the output file generated. |
password | string | No | - | Password for the PDF file. |
async | boolean | No | false | Set to true to run long processes in the background. The API then returns a jobId that you can use with the Background Job Check endpoint. See also Webhooks & Callbacks. |
searchString | string | Yes | - | Text to search for. Regular expressions are supported when regexSearch is set to true. |
wordMatchingMode | string | No | None | WordMatchingMode defines how search terms match PDF text. Modes: None (exact string match only), SmartMatch (default; flexible word boundary match, includes letters/digits/punctuation), ExactMatch (strict word boundaries, whole-word match only). |
regexSearch | boolean | No | false | Set to true to enable regular expression search for the searchString(s) parameter. |
profiles | object | No | - | See Profiles for more information. |
ColumnDetectionMode | string | No | ContentGroupsAndBorders | Controls column detection/alignment in PDF table extraction. Modes: ContentGroupsAndBorders (default; text + lines), ContentGroups (text grouping only), Borders (lines only), BorderedTables (OCR-based for bordered tables), ContentGroupsAI (AI for dense/complex layouts). |
DetectionMinNumberOfRows | integer | No | 1 | Minimum number of rows to detect in a table |
DetectionMinNumberOfColumns | integer | No | 1 | Minimum number of columns to detect in a table |
DetectionMaxNumberOfInvalidSubsequentRowsAllowed | integer | No | 0 | Maximum number of invalid subsequent rows allowed in a table |
DetectionMinNumberOfLineBreaksBetweenTables | integer | No | 0 | Minimum number of line breaks between tables |
EnhanceTableBorders | boolean | No | true | Enhance table borders or not |
OCRDetectPageRotation | boolean | No | false | Controls whether page rotation is detected in the PDF document when OCR is applied. Set to true to detect page rotation. See Support page rotation for more information. |
DataEncryptionAlgorithm | string | No | - | Controls the encryption algorithm used for data encryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
DataEncryptionKey | string | No | - | Controls the encryption key used for data encryption. See User-Controlled Encryption for more information. |
DataEncryptionIV | string | No | - | Controls the encryption IV used for data encryption. See User-Controlled Encryption for more information. |
DataDecryptionAlgorithm | string | No | - | Controls the decryption algorithm used for data decryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
DataDecryptionKey | string | No | - | Controls the decryption key used for data decryption. See User-Controlled Encryption for more information. |
DataDecryptionIV | string | No | - | Controls the decryption IV used for data decryption. See User-Controlled Encryption for more information. |
requestParametersDocument | string | No | - | |
responseParameters | object | No | - | - |
error | boolean | No | - | Indicates whether an error occurred (false means success) |
status | string | No | - | Status code of the request (200, 404, 500, etc.). For more information, see Response Codes. |
message | string | No | - | Message of the request |
credits | integer | No | - | Number of credits consumed by the request |
remainingCredits | integer | No | - | Number of credits remaining in the account |
duration | integer | No | - | Time taken for the operation in milliseconds |
errorCode | integer | No | - | Error code of the request (400, 401, 402, 403, 404, 500, etc.) |
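
The pages attribute in the table above packs several conventions into one string: comma-separated indices, closed and open-ended ranges, and “!” for counting from the end. The following is a minimal client-side sketch of how such a string could be expanded, which can help when validating input before sending a request. The function name expand_pages is hypothetical, and the handling of “!1” (second-to-last page) and of out-of-range indices are assumptions; the server's exact behavior is not specified here.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a pages string into zero-based page indices (client-side sketch)."""

    def resolve(token: str) -> int:
        token = token.strip()
        if token.startswith("!"):
            # Inverted index: "!0" is the last page; "!1" is assumed to be
            # the page before it.
            return page_count - 1 - int(token[1:])
        return int(token)

    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = resolve(start_text)
            # An open-ended range such as "2-" runs to the last page.
            end = resolve(end_text) if end_text.strip() else page_count - 1
            indices.extend(range(start, end + 1))
        else:
            indices.append(resolve(part))
    # Assumption: indices outside the document are silently dropped.
    return [i for i in indices if 0 <= i < page_count]


print(expand_pages("0, 1, 2-", 5))
print(expand_pages("1, 2, 3-7", 10))
print(expand_pages("!0", 5))
```

The three calls use the example strings from the pages description; the page counts (5 and 10) are arbitrary.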
Support page rotation
This endpoint supports PDF page rotation as follows:
Find only bordered tables
You can limit search to bordered tables only by enabling the legacy table search mode with the following profiles config:
Example Payload
For request size limits, see Request Size Limits.
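
As a rough illustration of how a request body could be assembled from the attributes documented above, here is a small Python sketch. The helper build_find_payload and its argument names are hypothetical; only the JSON keys (url, searchString, regexSearch, pages, inline, async, callback) come from the attribute table, and the search pattern is a made-up example. The check that callback requires async reflects the note in the table, enforced here on the client side as an assumption.

```python
import json


def build_find_payload(url, search_string, *, regex_search=False, pages=None,
                       inline=False, run_async=False, callback=None):
    """Build the JSON body for POST /v1/pdf/find (hypothetical helper)."""
    # url and searchString are the two attributes marked as required.
    if not url or not search_string:
        raise ValueError("url and searchString are required")
    # The callback attribute only applies when async is true.
    if callback and not run_async:
        raise ValueError("callback is only applicable when async is true")

    payload = {
        "url": url,
        "searchString": search_string,
        "regexSearch": regex_search,
        "inline": inline,
        "async": run_async,
    }
    # pages must be a string such as "0-2"; omit it to process all pages.
    if pages is not None:
        payload["pages"] = str(pages)
    if callback:
        payload["callback"] = callback
    return payload


payload = build_find_payload(
    "https://example.com/file1.pdf",
    r"Invoice\s+\d+",        # illustrative pattern, not from the docs
    regex_search=True,
    pages="0-2",
    inline=True,
)
print(json.dumps(payload, indent=2))
```

Attribute names are case-sensitive, so the keys are written exactly as they appear in the table.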
Example Response
To see the main response codes, please refer to the Response Codes page.
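
The response fields listed in the attribute table (error, message, errorCode, credits, remainingCredits, duration) lend themselves to a small amount of client-side checking. The sketch below shows one possible way to do that on an already-parsed response body. The names FindRequestError, check_response, and usage_summary are hypothetical, and the two sample bodies are made up for illustration; real responses carry additional fields.

```python
class FindRequestError(Exception):
    """Raised when the response body reports error = true."""


def check_response(body: dict) -> dict:
    """Return the body unchanged on success, raise on a reported error."""
    if body.get("error"):
        code = body.get("errorCode", "unknown")
        raise FindRequestError(f"{code}: {body.get('message', 'no message')}")
    return body


def usage_summary(body: dict) -> str:
    """Format the credit and timing fields for logging."""
    return (f"used {body.get('credits')} credits, "
            f"{body.get('remainingCredits')} remaining, "
            f"{body.get('duration')} ms")


# Made-up bodies for illustration only.
ok_body = {"error": False, "credits": 2, "remainingCredits": 98, "duration": 350}
failed_body = {"error": True, "errorCode": 402, "message": "Not enough credits"}

print(usage_summary(check_response(ok_body)))
try:
    check_response(failed_body)
except FindRequestError as exc:
    print("request failed:", exc)
```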
Inconsistent URL Encoding in cURL Output: When using cURL to make API requests, the output JSON may show URL characters encoded as Unicode escape sequences. For example, the ampersand character (&) may appear as \u0026 in the cURL output. This is normal JSON encoding behavior and does not affect the validity of the URL. The URL will function correctly when used, as JSON parsers automatically decode these escape sequences. If you’re parsing the response programmatically, your JSON parser will handle this conversion automatically.
Code Samples
- CURL
- JavaScript/Node.js
- Python
- C#
- Java
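
A minimal Python sketch, using only the standard library, of how a request to this endpoint might be prepared and how a JSON response body is decoded. Several things here are assumptions not stated on this page: the base URL is supplied by the caller, the x-api-key header name is assumed for authentication, and the url key in the sample response is a hypothetical field name. The helper names build_find_request and decode_body are also hypothetical. The sample response bytes are made up to show that a \u0026 escape, as described in the note on cURL output above, decodes to a plain ampersand.

```python
import json
import urllib.request


def build_find_request(base_url: str, api_key: str,
                       payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to /v1/pdf/find."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/pdf/find",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name, check your account docs
        },
        method="POST",
    )


def decode_body(raw: bytes) -> dict:
    """Parse a JSON response body; escapes such as \\u0026 are decoded here."""
    return json.loads(raw.decode("utf-8"))


request = build_find_request(
    "https://api.example.com",  # placeholder base URL
    "YOUR_API_KEY",
    {"url": "https://example.com/file1.pdf", "searchString": "Invoice"},
)
print(request.get_method(), request.full_url)

# Made-up response fragment containing an escaped ampersand.
raw = b'{"error": false, "url": "https://example.com/out.json?a=1\\u0026b=2"}'
decoded = decode_body(raw)
print(decoded["url"])  # https://example.com/out.json?a=1&b=2
```

To actually send the request, pass it to urllib.request.urlopen and feed the bytes it returns into decode_body.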