POST /v1/pdf/info
Extracts basic information about an input PDF file, including its security permissions and other document properties. To extract information about fillable fields (checkboxes, radio buttons, list boxes) from a PDF, use /pdf/info/fields instead.
For a one-time check of PDF file information, or to find form fields, use the PDF Edit Add Helper.
Attributes
Attributes are case-sensitive and should be sent inside the JSON body of the POST request. For example: { "url": "https://example.com/file1.pdf" }
| Attribute | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | URL to the source file. |
| callback | string | No | - | The callback URL (or Webhook) used to receive the POST data. See Webhooks & Callbacks. This is only applicable when async is set to true. |
| httpusername | string | No | - | HTTP auth user name if required to access source URL. |
| httppassword | string | No | - | HTTP auth password if required to access source URL. |
| password | string | No | - | Password for the PDF file. |
| async | boolean | No | false | Set async to true to run long processes in the background. The API then returns a jobId, which you can use with the Background Job Check endpoint. Also see Webhooks & Callbacks. |
| profiles | object | No | - | See Profiles for more information. |
| OCRMode | string | No | Auto | Specifies how OCR (Optical Character Recognition) should process input content, offering various modes to tailor text extraction based on content type such as images, fonts, and vector graphics. For more information, see OCR Extraction Modes. |
| OCRResolution | integer | No | 300 | Use this parameter to change the OCR resolution from the default 300 dpi. The range is from 72 to 1200 dpi. |
| RotationAngle | integer | No | - | Use manual rotation to handle PDFs with vertically drawn text. Normally, OCR automatically detects page rotation in PDFs and extracts text accurately. However, in some cases the PDF does not have an actual rotated page; rather, the text itself is drawn vertically. In such scenarios, auto-detection may fail. You can use this parameter to manually set the page rotation. The available angles are: 0, 1, 2, 3. |
| LineGroupingMode | string | No | None | Controls line grouping in PDF text extraction. Modes: None (no grouping), GroupByRows (merge rows if all cells align), GroupByColumns (merge cells by column), JoinOrphanedRows (merge single-cell rows to above if no separator). |
| ConsiderFontColors | boolean | No | false | Controls whether font colors should be considered when detecting table structure and merging text objects during PDF extraction. Set to true to consider font colors. |
| DetectNewColumnBySpacesRatio | string | No | 1.2 | Controls how spaces between words are interpreted for column detection in PDF text extraction. It defines the ratio of space width that determines when text should be treated as being in separate columns. |
| AutoAlignColumnsToHeader | boolean | No | true | Controls how columns are detected and aligned during table extraction from PDF documents. It affects both table structure detection and text extraction with formatting preservation. When set to true (default), the row with the most columns is used as the header, and all other rows are aligned to this structure, which is ideal for well-structured tables. When set to false, columns are analyzed independently across all rows to build the structure, which works better for inconsistent or irregular tables. |
| OCRImagePreprocessingFilters.AddGammaCorrection() | array[string (float format)] | No | ["1.4"] | Adds a gamma correction filter to the image preprocessing pipeline used during OCR (Optical Character Recognition). This filter adjusts the brightness and contrast of an image by applying a non-linear gamma correction to improve text recognition quality. |
| OCRImagePreprocessingFilters.AddGrayscale() | boolean | No | false | Set to true to add a preprocessing filter that converts a color document or image to grayscale before performing OCR. |
| DataEncryptionAlgorithm | string | No | - | Controls the encryption algorithm used for data encryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
| DataEncryptionKey | string | No | - | Controls the encryption key used for data encryption. See User-Controlled Encryption for more information. |
| DataEncryptionIV | string | No | - | Controls the encryption IV used for data encryption. See User-Controlled Encryption for more information. |
| DataDecryptionAlgorithm | string | No | - | Controls the decryption algorithm used for data decryption. See User-Controlled Encryption for more information. The available algorithms are: AES128, AES192, AES256. |
| DataDecryptionKey | string | No | - | Controls the decryption key used for data decryption. See User-Controlled Encryption for more information. |
| DataDecryptionIV | string | No | - | Controls the decryption IV used for data decryption. See User-Controlled Encryption for more information. |
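Several constraints in the attribute table can be checked on the client before a request is sent. The following is a minimal sketch and not part of the API: `build_info_payload` is a hypothetical helper name, and it encodes only rules stated in the table (url is required, callback applies only when async is true, OCRResolution ranges from 72 to 1200, RotationAngle is one of 0, 1, 2, 3).

```python
import json


def build_info_payload(url, options=None):
    """Build the JSON body for POST /v1/pdf/info (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")

    payload = {"url": url}
    # Attribute names are case-sensitive, so keys are passed through unchanged.
    payload.update(options or {})

    # callback is only applicable when async is set to true.
    if "callback" in payload and not payload.get("async", False):
        raise ValueError("callback requires async to be true")

    # OCRResolution range from the table: 72 to 1200 dpi.
    resolution = payload.get("OCRResolution")
    if resolution is not None and not 72 <= resolution <= 1200:
        raise ValueError("OCRResolution must be between 72 and 1200")

    # RotationAngle accepts only the listed values.
    angle = payload.get("RotationAngle")
    if angle is not None and angle not in (0, 1, 2, 3):
        raise ValueError("RotationAngle must be 0, 1, 2, or 3")

    return json.dumps(payload)


body = build_info_payload(
    "https://example.com/file1.pdf",
    {"async": True, "callback": "https://example.com/hook"},
)
```

Options are passed as a dictionary rather than keyword arguments because `async` is a reserved word in Python.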
Query parameters
No query parameters accepted.
Responses
| Parameter | Type | Description |
|---|---|---|
| info | object | Info details. |
| error | boolean | Indicates whether an error occurred (false means success). |
| status | string | Status code of the request (200, 404, 500, etc.). For more information, see Response Codes. |
| credits | integer | Number of credits consumed by the request. |
| remainingCredits | integer | Number of credits remaining in the account. |
| duration | integer | Time taken for the operation in milliseconds. |
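A caller typically checks the error flag before reading info. The sketch below shows one way to do that on an already parsed response. `unwrap_info` is a hypothetical helper name. The `message` field is taken from the code samples on this page, which read it when error is true, and it is treated as optional here.

```python
def unwrap_info(response):
    """Return the `info` object, or raise if the service reported an error."""
    if response.get("error"):
        # `message` is the field the code samples read on failure; it may be absent.
        detail = response.get("message", "no message")
        raise RuntimeError(f"pdf/info failed (status {response.get('status')}): {detail}")
    return response["info"]


info = unwrap_info({"info": {"PageCount": 1}, "error": False, "status": 200})
```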
Example Payload
{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-info/sample.pdf",
"async": false
}
Example Response
To see the main response codes, please refer to the Response Codes page.
{
"info": {
"PageCount": 1,
"Author": "Alice V. Knox",
"Title": "Kid's News 1",
"Producer": "Acrobat Distiller 4.0 for Windows",
"Subject": "Kid's News 1",
"CreationDate": "8/15/2001 2:50:36 PM",
"Bookmarks": "",
"Keywords": "",
"Creator": "Adobe PageMaker 6.52",
"Encrypted": false,
"PageRectangle": {
"Location": {
"IsEmpty": true,
"X": 0,
"Y": 0
},
"Size": "612, 792",
"X": 0,
"Y": 0,
"Width": 612,
"Height": 792,
"Left": 0,
"Top": 0,
"Right": 612,
"Bottom": 792,
"IsEmpty": false
},
"ModificationDate": "9/20/2001 6:23:02 PM",
"EncryptionAlgorithm": 0,
"PermissionPrinting": true,
"PermissionModifyDocument": true,
"PermissionContentExtraction": true,
"PermissionModifyAnnotations": true,
"PermissionFillForms": true,
"PermissionAccessibility": true,
"PermissionAssemble": true,
"PermissionHighQualityPrint": true
},
"error": false,
"status": 200,
"remainingCredits": 77732
}
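The fields of info in the example response can be post-processed on the client. The sketch below rests on two assumptions that the response itself does not state: that PageRectangle values are PDF points (72 per inch, which matches 612 x 792 being US Letter), and that dates use the month/day/year 12-hour format seen in the example. `summarize_info` is a hypothetical helper name.

```python
from datetime import datetime


def summarize_info(info):
    """Derive page size, creation time, and denied permissions from `info`."""
    rect = info["PageRectangle"]
    # Assumption: Width and Height are in PDF points, 72 points per inch.
    page_size_in = (rect["Width"] / 72, rect["Height"] / 72)

    # Assumption: dates follow the format shown in the example response.
    created = datetime.strptime(info["CreationDate"], "%m/%d/%Y %I:%M:%S %p")

    # Collect every Permission* flag that is explicitly false.
    denied = sorted(
        key for key, value in info.items()
        if key.startswith("Permission") and value is False
    )
    return {"page_size_in": page_size_in, "created": created, "denied_permissions": denied}


summary = summarize_info({
    "CreationDate": "8/15/2001 2:50:36 PM",
    "PageRectangle": {"Width": 612, "Height": 792},
    "PermissionPrinting": True,
    "PermissionModifyDocument": True,
})
```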
Inconsistent URL Encoding in cURL Output: When you make API requests with cURL, the JSON output may show some URL characters as Unicode escape sequences. For example, the ampersand character (&) may appear as \u0026. This is normal JSON encoding behavior and does not affect the validity of the URL: any JSON parser decodes these escape sequences automatically, so the URL works correctly once the response is parsed.
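The decoding can be seen with any JSON parser. The short sketch below uses Python's standard `json` module. The URL is a made-up example, not a real API response.

```python
import json

# Raw text as it might appear in cURL output: the ampersand is escaped as \u0026.
raw = r'{"url": "https://example.com/file.pdf?a=1\u0026b=2"}'

decoded = json.loads(raw)
print(decoded["url"])  # prints https://example.com/file.pdf?a=1&b=2
```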
Code Samples
CURL
JavaScript/Node.js
Python
C#
Java
PHP
curl --location --request POST 'https://api.pdf.co/v1/pdf/info' \
--header 'x-api-key: *******************' \
--header 'Content-Type: application/json' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-info/sample.pdf",
"async": false
}'
var https = require("https");
var path = require("path");
var fs = require("fs");
// `request` module is required for file upload.
// Use "npm install request" command to install.
var request = require("request");
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";
// Source PDF file to get information
const SourceFile = "./sample.pdf";
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
getPresignedUrl(API_KEY, SourceFile)
    .then(([uploadUrl, uploadedFileUrl]) => {
        // 2. UPLOAD THE FILE TO CLOUD.
        // Return the inner promise so a single catch handles errors from every step.
        return uploadFile(API_KEY, SourceFile, uploadUrl)
            .then(() => {
                // 3. GET INFORMATION FROM UPLOADED FILE
                getPdfInfo(API_KEY, uploadedFileUrl);
            });
    })
    .catch(e => {
        console.log(e);
    });
function getPresignedUrl(apiKey, localFile) {
    return new Promise((resolve, reject) => {
        // Prepare request to `Get Presigned URL` API endpoint
        let queryPath = `/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=${path.basename(localFile)}`;
        let reqOptions = {
            host: "api.pdf.co",
            path: encodeURI(queryPath),
            headers: { "x-api-key": apiKey }
        };
        // Send request
        https.get(reqOptions, (response) => {
            // Collect the whole body first: a single "data" event may carry only part of it
            let body = "";
            response.setEncoding("utf8");
            response.on("data", (chunk) => { body += chunk; });
            response.on("end", () => {
                let data = JSON.parse(body);
                if (data.error == false) {
                    // Return presigned url we received
                    resolve([data.presignedUrl, data.url]);
                }
                else {
                    // Service reported error
                    reject(new Error("getPresignedUrl(): " + data.message));
                }
            });
        })
        .on("error", (e) => {
            // Request error
            reject(new Error("getPresignedUrl(): " + e));
        });
    });
}
function uploadFile(apiKey, localFile, uploadUrl) {
    return new Promise((resolve, reject) => {
        fs.readFile(localFile, (readErr, data) => {
            if (readErr) {
                // Could not read the local file
                reject(new Error("uploadFile() read error: " + readErr));
                return;
            }
            request({
                method: "PUT",
                url: uploadUrl,
                body: data,
                headers: {
                    "Content-Type": "application/octet-stream"
                }
            }, (err, res, body) => {
                if (!err) {
                    resolve();
                }
                else {
                    reject(new Error("uploadFile() request error: " + err));
                }
            });
        });
    });
}
function getPdfInfo(apiKey, uploadedFileUrl) {
    // Prepare URL for `PDF Info` API call
    var queryPath = `/v1/pdf/info`;
    // JSON payload for api request
    var jsonPayload = JSON.stringify({
        url: uploadedFileUrl
    });
    var reqOptions = {
        host: "api.pdf.co",
        method: "POST",
        path: queryPath,
        headers: {
            "x-api-key": apiKey,
            "Content-Type": "application/json",
            "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
        }
    };
    // Send request
    var postRequest = https.request(reqOptions, (response) => {
        // Set the encoding before reading, and parse only after the full body has arrived
        response.setEncoding("utf8");
        let body = "";
        response.on("data", (chunk) => { body += chunk; });
        response.on("end", () => {
            // Parse JSON response
            let data = JSON.parse(body);
            if (data.error == false) {
                // Display PDF document information (nested objects are printed as JSON)
                for (var key in data.info) {
                    const value = data.info[key];
                    console.log(`${key}: ${typeof value === "object" ? JSON.stringify(value) : value}`);
                }
            }
            else {
                // Service reported error
                console.log("getPdfInfo(): " + data.message);
            }
        });
    })
    .on("error", (e) => {
        // Request error
        console.log("getPdfInfo(): " + e);
    });
    // Write request data
    postRequest.write(jsonPayload);
    postRequest.end();
}
import os
import requests # pip install requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"
# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"
# Source PDF file (a forward slash works on both Windows and POSIX systems)
SourceFile = "./sample.pdf"

def main(args = None):
    uploadedFileUrl = uploadFile(SourceFile)
    if uploadedFileUrl is not None:
        getInfoFromPDF(uploadedFileUrl)

def getInfoFromPDF(uploadedFileUrl):
    """Get Information using PDF.co Web API"""
    # Prepare requests params as JSON
    # See documentation: https://developer.pdf.co/
    parameters = {}
    parameters["url"] = uploadedFileUrl
    # Prepare URL for 'PDF Info' API request
    url = "{}/pdf/info".format(BASE_URL)
    # Execute request and get response as JSON
    # (json= sends the parameters as a JSON body, as the Attributes section describes)
    response = requests.post(url, json=parameters, headers={ "x-api-key": API_KEY })
    if response.status_code == 200:
        json = response.json()
        if json["error"] == False:
            # Display information
            print(json["info"])
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

def uploadFile(fileName):
    """Uploads file to the cloud"""
    # 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.
    # Prepare URL for 'Get Presigned URL' API request
    url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
        BASE_URL, os.path.basename(fileName))
    # Execute request and get response as JSON
    response = requests.get(url, headers={ "x-api-key": API_KEY })
    if response.status_code == 200:
        json = response.json()
        if json["error"] == False:
            # URL to use for file upload
            uploadUrl = json["presignedUrl"]
            # URL for future reference
            uploadedFileUrl = json["url"]
            # 2. UPLOAD FILE TO CLOUD.
            with open(fileName, 'rb') as file:
                requests.put(uploadUrl, data=file, headers={ "x-api-key": API_KEY, "content-type": "application/octet-stream" })
            return uploadedFileUrl
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")
    return None

if __name__ == '__main__':
    main()
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
namespace PDFcoApiExample
{
class Program
{
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const String API_KEY = "***********************************";
// Source PDF file to get information
const string SourceFile = @".\sample.pdf";
static void Main(string[] args)
{
// Create standard .NET web client instance
WebClient webClient = new WebClient();
// Set API Key
webClient.Headers.Add("x-api-key", API_KEY);
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct file URL, skip to the step 3.
// Prepare URL for `Get Presigned URL` API call
string query = Uri.EscapeUriString(string.Format(
"https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name={0}",
Path.GetFileName(SourceFile)));
try
{
// Execute request
string response = webClient.DownloadString(query);
// Parse JSON response
JObject json = JObject.Parse(response);
if (json["error"].ToObject<bool>() == false)
{
// Get URL to use for the file upload
string uploadUrl = json["presignedUrl"].ToString();
string uploadedFileUrl = json["url"].ToString();
// 2. UPLOAD THE FILE TO CLOUD.
webClient.Headers.Add("content-type", "application/octet-stream");
webClient.UploadFile(uploadUrl, "PUT", SourceFile); // You can use UploadData() instead if your file is byte[] or Stream
// 3. GET INFORMATION FROM UPLOADED FILE
// URL for `PDF Info` API call
var url = "https://api.pdf.co/v1/pdf/info";
// Prepare requests params as JSON
Dictionary<string, object> parameters = new Dictionary<string, object>();
parameters.Add("url", uploadedFileUrl);
// Convert dictionary of params to JSON
string jsonPayload = JsonConvert.SerializeObject(parameters);
// Execute POST request with JSON payload
response = webClient.UploadString(url, jsonPayload);
// Parse JSON response
json = JObject.Parse(response);
if (json["error"].ToObject<bool>() == false)
{
// Display PDF document information
foreach (JToken token in json["info"])
{
JProperty property = (JProperty) token;
Console.WriteLine("{0}: {1}", property.Name, property.Value);
}
}
else
{
Console.WriteLine(json["message"].ToString());
}
}
else
{
Console.WriteLine(json["message"].ToString());
}
}
catch (WebException e)
{
Console.WriteLine(e.ToString());
}
webClient.Dispose();
Console.WriteLine();
Console.WriteLine("Press any key...");
Console.ReadKey();
}
}
}
package com.company;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;
import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
public class Main
{
// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
final static String API_KEY = "***********************************";
// Source file name
final static Path SourceFile = Paths.get(".\\sample.pdf");
public static void main(String[] args) throws IOException
{
// Create HTTP client instance
OkHttpClient webClient = new OkHttpClient();
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have a direct file URL, skip to the step 3.
// Prepare URL for `Get Presigned URL` API call
String query = String.format(
"https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
SourceFile.getFileName());
// Prepare request
Request request = new Request.Builder()
.url(query)
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Get URL to use for the file upload
String uploadUrl = json.get("presignedUrl").getAsString();
// Get URL of uploaded file to use with later API calls
String uploadedFileUrl = json.get("url").getAsString();
// 2. UPLOAD THE FILE TO CLOUD.
if (uploadFile(webClient, uploadUrl, SourceFile.toFile()))
{
// 3. GET INFORMATION FROM UPLOADED FILE
getPdfInfo(webClient, uploadedFileUrl);
}
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static void getPdfInfo(OkHttpClient webClient, String uploadedFileUrl) throws IOException {
// Prepare URL for `PDF Info` API call
String query = "https://api.pdf.co/v1/pdf/info";
// Make correctly escaped (encoded) URL
URL url = null;
try
{
url = new URI(null, query, null).toURL();
}
catch (URISyntaxException e)
{
e.printStackTrace();
}
// Create JSON payload
String jsonPayload = String.format("{\"url\": \"%s\"}",
uploadedFileUrl);
// Prepare request body
RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);
// Prepare request
Request request = new Request.Builder()
.url(url)
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.addHeader("Content-Type", "application/json")
.post(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
if (response.code() == 200)
{
// Parse JSON response
JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();
boolean error = json.get("error").getAsBoolean();
if (!error)
{
// Display PDF document information
JsonObject info = (JsonObject) json.get("info");
for (Map.Entry<String, JsonElement> entry : info.entrySet())
{
System.out.println(entry.getKey() + ": " + entry.getValue());
}
}
else
{
// Display service reported error
System.out.println(json.get("message").getAsString());
}
}
else
{
// Display request error
System.out.println(response.code() + " " + response.message());
}
}
public static boolean uploadFile(OkHttpClient webClient, String url, File sourceFile) throws IOException
{
// Prepare request body
RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile);
// Prepare request
Request request = new Request.Builder()
.url(url)
.addHeader("x-api-key", API_KEY) // (!) Set API Key
.addHeader("content-type", "application/octet-stream")
.put(body)
.build();
// Execute request
Response response = webClient.newCall(request).execute();
return (response.code() == 200);
}
}
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>PDF Information Results</title>
</head>
<body>
<?php
// Get submitted form data
$apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co
// 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
// * If you already have the direct PDF file link, go to the step 3.
// Create URL
$url = "https://api.pdf.co/v1/file/upload/get-presigned-url" .
"?name=" . urlencode($_FILES["file"]["name"]) .
"&contenttype=application/octet-stream";
// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
// Execute request
$result = curl_exec($curl);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
$json = json_decode($result, true);
// Get URL to use for the file upload
$uploadFileUrl = $json["presignedUrl"];
// Get URL of uploaded file to use with later API calls
$uploadedFileUrl = $json["url"];
// 2. UPLOAD THE FILE TO CLOUD.
$localFile = $_FILES["file"]["tmp_name"];
$fileHandle = fopen($localFile, "r");
curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
curl_setopt($curl, CURLOPT_PUT, true);
curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));
// Execute request
$result = curl_exec($curl); // keep the upload response so the error output below shows it
fclose($fileHandle);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
// 3. GET INFORMATION ABOUT UPLOADED PDF DOCUMENT
ExtractInfo($apiKey, $uploadedFileUrl);
}
else
{
// Display request error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
}
else
{
// Display service reported error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
curl_close($curl);
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
function ExtractInfo($apiKey, $uploadedFileUrl)
{
// Create URL
$url = "https://api.pdf.co/v1/pdf/info";
// Prepare requests params
$parameters = array();
$parameters["url"] = $uploadedFileUrl;
// Create Json payload
$data = json_encode($parameters);
// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
// Execute request
$result = curl_exec($curl);
if (curl_errno($curl) == 0)
{
$status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
if ($status_code == 200)
{
$json = json_decode($result, true);
if (!isset($json["error"]) || $json["error"] == false)
{
$documentInfo = $json["info"];
// Display the document info
echo "<div><h2>Document Info:</h2><p>";
foreach ($documentInfo as $key => $value)
{
if(is_array($value)){
echo $key . ' = ' . json_encode($value) . '<br/>';
}
else{
echo $key . ' = ' . $value . '<br/>';
}
}
echo "</p></div>";
}
else
{
// Display service reported error
echo "<p>Error: " . $json["message"] . "</p>";
}
}
else
{
// Display request error
echo "<p>Status code: " . $status_code . "</p>";
echo "<p>" . $result . "</p>";
}
}
else
{
// Display CURL error
echo "Error: " . curl_error($curl);
}
}
?>
</body>
</html>