PDF Search and Delete Text

`POST /v1/pdf/edit/delete-text`

When using regular expressions in JSON payloads, ensure that backslashes are properly escaped. For example, a single backslash `\` must be written as `\\`.
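As a small illustration of the escaping rule, the Python sketch below builds a payload whose search pattern contains backslashes and lets `json.dumps` do the escaping. The pattern and the URL are made up for the example. `searchStrings` and `regex` follow the attribute table on this page (the code samples on this page pass a single `searchString` instead), so treat the exact payload shape as an assumption.

```python
import json

# Hypothetical pattern: match dates such as 01/02/2023.
pattern = r"\d{2}/\d{2}/\d{4}"

payload = {
    "url": "https://example.com/file1.pdf",  # placeholder URL
    "searchStrings": [pattern],
    "regex": True,
}

# json.dumps escapes each backslash, so \d appears as \\d in the JSON text.
body = json.dumps(payload)
print(body)

# Decoding the JSON text gives back the original single-backslash pattern.
assert json.loads(body)["searchStrings"][0] == pattern
```

Building the payload as a dictionary and serializing it avoids hand-escaping mistakes that are easy to make when the JSON is typed out as a string literal.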

Attributes

Attributes are case-sensitive and must be passed in the JSON body of the POST request, for example: `{ "url": "https://example.com/file1.pdf" }`

Attribute	Type	Required	Default	Description
`url`	string	Yes	-	URL to the source file.
`callback`	string	No	-	The callback URL (or webhook) that receives the POST data. See Webhooks & Callbacks. Only applicable when `async` is set to `true`.
`httpusername`	string	No	-	HTTP auth user name, if required to access the source URL.
`httppassword`	string	No	-	HTTP auth password, if required to access the source URL.
`searchStrings`	array[string]	Yes	-	The array of strings to search for.
`replacementLimit`	integer	No	`0`	Limits how many occurrences of each search string are deleted. The value `0` means every occurrence found is deleted.
`caseSensitive`	boolean	No	`true`	Set to `false` for a case-insensitive search.
`regex`	boolean	No	`false`	Set to `true` to treat the search string(s) as regular expressions.
`name`	string	No	-	File name for the generated output. Must be a string.
`expiration`	integer	No	`60`	Expiration time for the output link, in minutes. After this duration, any generated output file(s) are automatically deleted from PDF.co Temporary Files Storage. The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, PDF templates, documents), consider using PDF.co Built-In Files Storage.
`pages`	string	No	all pages	Page indices to process, as comma-separated values or ranges (e.g. "0, 1, 2-" or "1, 2, 3-7"). The first page has index 0. Put "!" before a number for inverted page numbers (e.g. "!0" for the last page). If not specified, all pages are processed. Must be a string.
`password`	string	No	-	Password for the PDF file.
`async`	boolean	No	`false`	Set `async` to `true` to run long processes in the background. The API then returns a `jobId`, which you can use with the Background Job Check endpoint. See also Webhooks & Callbacks.
`profiles`	object	No	-	See Profiles for more information.
`outputDataFormat`	string	No	-	To receive the output in `base64` format, set this to `base64`.
`removeTextUnderPatch`	boolean	No	`true`	Controls whether the text under the patch is removed.
`usepatch`	boolean	No	`false`	Controls whether a patch is drawn over the matched text.
`patchColor`	string	No	`#000000`	Controls the color of the patch.
`DataEncryptionAlgorithm`	string	No	-	Controls the encryption algorithm used for data encryption. See User-Controlled Encryption for more information. The available algorithms are: `AES128`, `AES192`, `AES256`.
`DataEncryptionKey`	string	No	-	Controls the encryption key used for data encryption. See User-Controlled Encryption for more information.
`DataEncryptionIV`	string	No	-	Controls the encryption IV used for data encryption. See User-Controlled Encryption for more information.
`DataDecryptionAlgorithm`	string	No	-	Controls the decryption algorithm used for data decryption. See User-Controlled Encryption for more information. The available algorithms are: `AES128`, `AES192`, `AES256`.
`DataDecryptionKey`	string	No	-	Controls the decryption key used for data decryption. See User-Controlled Encryption for more information.
`DataDecryptionIV`	string	No	-	Controls the decryption IV used for data decryption. See User-Controlled Encryption for more information.
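To make the `pages` syntax concrete, here is a minimal Python sketch that expands a `pages` string into 0-based page indices. It is only a reading of the description in the table, not the service's own parser: the helper name `expand_pages` is hypothetical, and the handling of "!N" for N greater than 0 (counting back from the last page) and of out-of-range indices (silently skipped) are assumptions.

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a `pages` string into sorted 0-based page indices."""
    last = page_count - 1

    # An empty value means "all pages", matching the documented default.
    if not spec.strip():
        return list(range(page_count))

    def index(token: str) -> int:
        token = token.strip()
        # "!N" counts from the end: "!0" is the last page (assumed for N > 0).
        if token.startswith("!"):
            return last - int(token[1:])
        return int(token)

    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = index(start_text)
            # An open range such as "2-" runs to the last page.
            end = index(end_text) if end_text.strip() else last
        else:
            start = end = index(part)
        for i in range(start, end + 1):
            if 0 <= i <= last:
                selected.add(i)
    return sorted(selected)


print(expand_pages("0, 1, 2-", 5))
print(expand_pages("1, 2, 3-7", 10))
print(expand_pages("!0", 5))
```

For a 5-page document, "0, 1, 2-" selects every page and "!0" selects only index 4, the last page.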

Showing Redacted Text

By default, deleting text with `/v1/pdf/edit/delete-text` simply removes it, leaving an empty space where the text was.

If you need to black out the deleted text instead, this can be achieved with the following `profiles` parameters:

Set the `UsePatch` parameter to `true`.
Set the `PatchColor` parameter to the redaction color in hex format, for example: `'PatchColor': '#000000'`.

To black out the text without removing it, so that it can still be copied, set the `RemoveTextUnderPatch` parameter to `false`.

If `RemoveTextUnderPatch` is set to `false`, a user can still copy the text, which makes the redaction less secure than you might require!

{
 "profiles": "{'UsePatch': true, 'PatchColor': '#000000', 'RemoveTextUnderPatch': true}"
}
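The `profiles` value in the example above is a string that itself holds a single-quoted object. A short Python sketch of one way to build that string is shown below. The helper name `redaction_profiles` and the placeholder URL are hypothetical, and `searchStrings` follows the attribute table. The quote swap is only safe because none of the values contain quote characters.

```python
import json


def redaction_profiles(patch_color: str = "#000000", remove_text: bool = True) -> str:
    """Return a `profiles` string in the single-quoted form shown above."""
    settings = {
        "UsePatch": True,
        "PatchColor": patch_color,
        "RemoveTextUnderPatch": remove_text,
    }
    # json.dumps writes lowercase true/false; swapping the quotes matches the
    # single-quoted style of the example. Values here never contain quotes.
    return json.dumps(settings).replace('"', "'")


payload = {
    "url": "https://example.com/file1.pdf",  # placeholder URL
    "searchStrings": ["Invoice"],
    "profiles": redaction_profiles(),
}
print(json.dumps(payload, indent=2))
```

Keeping `remove_text` at its default of `True` matches the secure variant described above; pass `False` only when the covered text should remain copyable.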

Query parameters

No query parameters accepted.

Responses

Parameter	Type	Description
`url`	string	Direct URL to the final PDF file stored in S3.
`outputLinkValidTill`	string	Timestamp indicating when the output link will expire
`pageCount`	integer	Number of pages in the PDF document.
`error`	boolean	Indicates whether an error occurred (`false` means success)
`status`	string	Status code of the request (200, 404, 500, etc.). For more information, see Response Codes.
`name`	string	Name of the output file
`credits`	integer	Number of credits consumed by the request
`remainingCredits`	integer	Number of credits remaining in the account
`duration`	integer	Time taken for the operation in milliseconds

Example Payload

To see the request size limits, please refer to the Request Size Limits.

{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
  "name": "pdfWithTextDeleted",
  "caseSensitive": false,
  "searchString": "Invoice",
  "replacementLimit": 0,
  "async": false
}

Example Response

To see the main response codes, please refer to the Response Codes page.

{
  "url": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/ZOSEQZFNVCYLD5N5CJFVIYQKBVLR8OKD/pdfWithTextDeleted.pdf?X-Amz-Expires=3600&X-Amz-Security-Token=FwoGZXIvYXdzECYaDKOO4WmO5C5shyOYYSKCAVsAo6VkB5HQjTBd9dMlJujQdEkPfNdPeLfq2mF54s2ESZBmIAJ5UgDUo3J9R475CCS4M3nuuo%2FSJwRy5gNiJdb1ZY0uCtP87x83nH%2B%2BSDu5JK%2F%2BEOrd3MREt8KE3BsQOrv%2FKMdnK%2BT5nJ2x2hC87vHue%2FudY7%2FWX54vx4tfFobEyhEozLbPnwYyKOdEsYYWH7e8tm7XV4UeKxCoKMaXSEPvOod80hR62qXnEI42fOsON3M%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=ASIA4NRRSZPHLUVIAIPX/20230220/us-west-2/s3/aws4_request&X-Amz-Date=20230220T205521Z&X-Amz-SignedHeaders=host&X-Amz-Signature=9f79c1a30d4f373e495e735e908375dad2ae6dcafcee761a477748c2b8298605",
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "pdfWithTextDeleted.pdf",
  "credits": 21,
  "duration": 189,
  "remainingCredits": 96235635
}
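As a rough sketch of how a client might consume this response, the Python snippet below parses the body, raises when `error` is true, and exposes the fields from the Responses table. The class and function names are hypothetical. The `message` key is not listed in the table; it is assumed here because the code samples on this page read the error text from it.

```python
import json
from dataclasses import dataclass


@dataclass
class DeleteTextResult:
    url: str
    name: str
    page_count: int
    credits: int
    remaining_credits: int


def parse_delete_text_response(text: str) -> DeleteTextResult:
    """Turn the JSON response body into a result object, or raise on error."""
    data = json.loads(text)
    if data.get("error"):
        # Assumed: the service puts the error description in "message".
        raise RuntimeError(data.get("message", "PDF.co reported an error"))
    return DeleteTextResult(
        url=data["url"],
        name=data["name"],
        page_count=data["pageCount"],
        credits=data["credits"],
        remaining_credits=data["remainingCredits"],
    )


# Shortened copy of the example response (the signed query string is omitted).
sample = """{
  "url": "https://pdf-temp-files.s3.us-west-2.amazonaws.com/ZOSEQZFNVCYLD5N5CJFVIYQKBVLR8OKD/pdfWithTextDeleted.pdf",
  "pageCount": 1,
  "error": false,
  "status": 200,
  "name": "pdfWithTextDeleted.pdf",
  "credits": 21,
  "duration": 189,
  "remainingCredits": 96235635
}"""

result = parse_delete_text_response(sample)
print(result.name, result.page_count, result.credits)
```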

Code Samples

curl --location --request POST 'https://api.pdf.co/v1/pdf/edit/delete-text' \
--header 'Content-Type: application/json' \
--header 'x-api-key: *******************' \
--data-raw '{
"url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-text/sample.pdf",
"name": "pdfWithTextDeleted",
"caseSensitive": false,
"searchString": "Invoice",
"replacementLimit": 0,
"async": false
}'

var https = require("https");
var path = require("path");
var fs = require("fs");


// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
const API_KEY = "***********************************";


// Direct URL of source PDF file.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/    
const SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
const Password = "";
// Destination PDF file name
const DestinationFile = "./result.pdf";


// Prepare request to `Delete Text from PDF` API endpoint
var queryPath = `/v1/pdf/edit/delete-text`;
// JSON payload for api request
var jsonPayload = JSON.stringify({
    name: path.basename(DestinationFile), password: Password, url: SourceFileUrl, searchString: 'conspicuous'
});

var reqOptions = {
    host: "api.pdf.co",
    method: "POST",
    path: queryPath,
    headers: {
        "x-api-key": API_KEY,
        "Content-Type": "application/json",
        "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
    }
};

// Send request
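var postRequest = https.request(reqOptions, (response) => {
    // Collect the whole body first; it may arrive in several chunks
    var body = "";
    response.on("data", (chunk) => { body += chunk; });
    response.on("end", () => {
        // Parse JSON response
        var data = JSON.parse(body);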
        if (data.error == false) {
            // Download PDF file
            var file = fs.createWriteStream(DestinationFile);
            https.get(data.url, (response2) => {
                response2.pipe(file)
                    .on("close", () => {
                        console.log(`Generated PDF file saved as "${DestinationFile}" file.`);
                    });
            });
        }
        else {
            // Service reported error
            console.log(data.message);
        }
    });
}).on("error", (e) => {
    // Request error
    console.log(e);
});

// Write request data
postRequest.write(jsonPayload);
postRequest.end();

import os
import requests # pip install requests

# The authentication key (API Key).
# Get your own by registering at https://app.pdf.co
API_KEY = "******************************************"

# Base URL for PDF.co Web API requests
BASE_URL = "https://api.pdf.co/v1"

# Direct URL of source PDF file.
# You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/    
SourceFileURL = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf"
# PDF document password. Leave empty for unprotected documents.
Password = ""
# Destination PDF file name
DestinationFile = ".\\result.pdf"

def main(args = None):
    deleteTextFromPdf(SourceFileURL, DestinationFile)


def deleteTextFromPdf(uploadedFileUrl, destinationFile):
    """Delete Text from PDF using PDF.co Web API"""

    # Prepare requests params as JSON
    parameters = {}
    parameters["name"] = os.path.basename(destinationFile)
    parameters["password"] = Password
    parameters["url"] = uploadedFileUrl
    parameters["searchString"] = "conspicuous"

    # Prepare URL for 'Delete Text from PDF' API request
    url = "{}/pdf/edit/delete-text".format(BASE_URL)

    # Execute request and get response as JSON
    response = requests.post(url, data=parameters, headers={ "x-api-key": API_KEY })
    if (response.status_code == 200):
        json = response.json()

        if json["error"] == False:
            #  Get URL of result file
            resultFileUrl = json["url"]            
            # Download result file
            r = requests.get(resultFileUrl, stream=True)
            if (r.status_code == 200):
                with open(destinationFile, 'wb') as file:
                    for chunk in r:
                        file.write(chunk)
                print(f"Result file saved as \"{destinationFile}\" file.")
            else:
                print(f"Request error: {r.status_code} {r.reason}")
        else:
            # Show service reported error
            print(json["message"])
    else:
        print(f"Request error: {response.status_code} {response.reason}")

if __name__ == '__main__':
    main()

using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

namespace PDFcoApiExample
{
  class Program
  {
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const String API_KEY = "***********************************";
    
    // Direct URL of source PDF file.
    // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/
    const string SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    const string Password = "";
    // Destination PDF file name
    const string DestinationFile = @".\result.pdf";

    static void Main(string[] args)
    {
      // Create standard .NET web client instance
      WebClient webClient = new WebClient();

      // Set API Key
      webClient.Headers.Add("x-api-key", API_KEY);

      // Prepare requests params as JSON
      Dictionary<string, string> parameters = new Dictionary<string, string>();
      parameters.Add("name", Path.GetFileName(DestinationFile));
      parameters.Add("password", Password);
      parameters.Add("url", SourceFileUrl);
      parameters.Add("searchString", "conspicuous");

      // Convert dictionary of params to JSON
      string jsonPayload = JsonConvert.SerializeObject(parameters);

      // URL of `Delete Text from PDF` API call
      string url = "https://api.pdf.co/v1/pdf/edit/delete-text";

      try
      {
        // Execute POST request with JSON payload
        string response = webClient.UploadString(url, jsonPayload);

        // Parse JSON response
        JObject json = JObject.Parse(response);

        if (json["error"].ToObject<bool>() == false)
        {
          // Get URL of generated PDF file
          string resultFileUrl = json["url"].ToString();

          // Download PDF file
          webClient.DownloadFile(resultFileUrl, DestinationFile);

          Console.WriteLine("Generated PDF file saved as \"{0}\" file.", DestinationFile);
        }
        else
        {
          Console.WriteLine(json["message"].ToString());
        }
      }
      catch (WebException e)
      {
        Console.WriteLine(e.ToString());
      }

      webClient.Dispose();


      Console.WriteLine();
      Console.WriteLine("Press any key...");
      Console.ReadKey();
    }
  }
}

package com.company;

import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import okhttp3.*;

import java.io.*;
import java.net.*;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Main
{
    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    final static String API_KEY = "**********************************";

    // Direct URL of source PDF file.
    // You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/    
    final static String SourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
    // PDF document password. Leave empty for unprotected documents.
    final static String Password = "";
    // Destination PDF file name
    final static Path DestinationFile = Paths.get(".\\result.pdf");


    public static void main(String[] args) throws IOException
    {
        // Create HTTP client instance
        OkHttpClient webClient = new OkHttpClient();

        // Prepare URL for `Delete Text from PDF` API call
        String query = "https://api.pdf.co/v1/pdf/edit/delete-text";

        // Make correctly escaped (encoded) URL
        URL url = null;
        try
        {
            url = new URI(null, query, null).toURL();
        }
        catch (URISyntaxException e)
        {
            e.printStackTrace();
        }

        // Create JSON payload
        String jsonPayload = String.format("{\"name\": \"%s\", \"password\": \"%s\", \"url\": \"%s\", \"searchString\": \"conspicuous\"}",
                DestinationFile.getFileName(),
                Password,
                SourceFileUrl);

        // Prepare request body
        RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);

        // Prepare request
        Request request = new Request.Builder()
            .url(url)
            .addHeader("x-api-key", API_KEY) // (!) Set API Key
            .addHeader("Content-Type", "application/json")
            .post(body)
            .build();

        // Execute request
        Response response = webClient.newCall(request).execute();

        if (response.code() == 200)
        {
            // Parse JSON response
            JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

            boolean error = json.get("error").getAsBoolean();
            if (!error)
            {
                // Get URL of generated PDF file
                String resultFileUrl = json.get("url").getAsString();

                // Download PDF file
                downloadFile(webClient, resultFileUrl, DestinationFile.toFile());

                System.out.printf("Generated PDF file saved as \"%s\" file.", DestinationFile.toString());
            }
            else
            {
                // Display service reported error
                System.out.println(json.get("message").getAsString());
            }
        }
        else
        {
            // Display request error
            System.out.println(response.code() + " " + response.message());
        }
    }

    public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
    {
        // Prepare request
        Request request = new Request.Builder()
                .url(url)
                .build();
        // Execute request
        Response response = webClient.newCall(request).execute();

        byte[] fileBytes = response.body().bytes();

        // Save downloaded bytes to file
        OutputStream output = new FileOutputStream(destinationFile);
        output.write(fileBytes);
        output.flush();
        output.close();

        response.close();
    }
}

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Cloud API asynchronous "Delete Text from PDF" job example (helps avoid timeout errors).</title>
</head>
<body>

<?php 

// Cloud API asynchronous "Delete Text from PDF" job example.

// The authentication key (API Key).
// Get your own by registering at https://app.pdf.co
$apiKey = "***********************************";

// Direct URL of source PDF file. Check another example if you need to upload a local file to the cloud.
// You can also upload your own file into PDF.co and use it as url. Check "Upload File" samples for code snippets: https://github.com/bytescout/pdf-co-api-samples/tree/master/File%20Upload/    
$sourceFileUrl = "https://bytescout-com.s3.amazonaws.com/files/demo-files/cloud-api/pdf-split/sample.pdf";
// PDF document password. Leave empty for unprotected documents.
$password = "";

// Prepare URL for `Delete Text from PDF` API call
$url = "https://api.pdf.co/v1/pdf/edit/delete-text";

// Prepare requests params
$parameters = array();
$parameters["password"] = $password;
$parameters["url"] = $sourceFileUrl;
$parameters["searchString"] = "conspicuous";
$parameters["async"] = true; // (!) Make asynchronous job

// Create Json payload
$data = json_encode($parameters);

// Create request
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_POST, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

// Execute request
$result = curl_exec($curl);

if (curl_errno($curl) == 0)
{
    $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    
    if ($status_code == 200)
    {
        $json = json_decode($result, true);
        
        if (!isset($json["error"]) || $json["error"] == false)
        {
            // URL of generated PDF file that will be available after the job completion
            $resultFileUrl = $json["url"];
            // Asynchronous job ID
            $jobId = $json["jobId"];
            
            // Check the job status in a loop
            do
            {
                $status = CheckJobStatus($jobId, $apiKey); // Possible statuses: "working", "failed", "aborted", "success".
                
                // Display timestamp and status (for demo purposes)
                echo "<p>" . date(DATE_RFC2822) . ": " . $status . "</p>";
                
                if ($status == "success")
                {
                    // Display link to the file with conversion results
                    echo "<div><h2>Conversion Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
                    break;
                }
                else if ($status == "working")
                {
                    // Pause for a few seconds
                    sleep(3);
                }
                else 
                {
                    echo $status . "<br/>";
                    break;
                }
            }
            while (true);
        }
        else
        {
            // Display service reported error
            echo "<p>Error: " . $json["message"] . "</p>"; 
        }
    }
    else
    {
        // Display request error
        echo "<p>Status code: " . $status_code . "</p>"; 
        echo "<p>" . $result . "</p>"; 
    }
}
else
{
    // Display CURL error
    echo "Error: " . curl_error($curl);
}

// Cleanup
curl_close($curl);


function CheckJobStatus($jobId, $apiKey)
{
    $status = null;
    
  // Create URL
  $url = "https://api.pdf.co/v1/job/check";
    
  // Prepare requests params
  $parameters = array();
  $parameters["jobid"] = $jobId;

  // Create Json payload
  $data = json_encode($parameters);

  // Create request
  $curl = curl_init();
  curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
  curl_setopt($curl, CURLOPT_URL, $url);
  curl_setopt($curl, CURLOPT_POST, true);
  curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
    
    // Execute request
    $result = curl_exec($curl);
    
    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
        
        if ($status_code == 200)
        {
            $json = json_decode($result, true);
        
            if (!isset($json["error"]) || $json["error"] == false)
            {
                $status = $json["status"];
            }
            else
            {
                // Display service reported error
                echo "<p>Error: " . $json["message"] . "</p>"; 
            }
        }
        else
        {
            // Display request error
            echo "<p>Status code: " . $status_code . "</p>"; 
            echo "<p>" . $result . "</p>"; 
        }
    }
    else
    {
        // Display CURL error
        echo "Error: " . curl_error($curl);
    }
    
    // Cleanup
    curl_close($curl);
    
    return $status;
}

?>

</body>
</html>
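The polling loop from the PHP sample can be sketched in Python as follows. The endpoint `https://api.pdf.co/v1/job/check`, the `jobid` field, and the statuses "working", "failed", "aborted", and "success" are taken from that sample. The function names, the 3-second interval, and the `max_checks` limit are assumptions made for the example, and the status check is passed in as a callable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable

JOB_CHECK_URL = "https://api.pdf.co/v1/job/check"


def check_job_status(job_id: str, api_key: str) -> str:
    """Ask the job check endpoint for the current status of a background job."""
    request = urllib.request.Request(
        JOB_CHECK_URL,
        data=json.dumps({"jobid": job_id}).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["status"]


def wait_for_job(
    job_id: str,
    check: Callable[[str], str],
    interval: float = 3.0,
    max_checks: int = 100,
) -> str:
    """Poll until the job leaves the "working" state and return the final status."""
    for _ in range(max_checks):
        status = check(job_id)
        if status != "working":
            return status  # "success", "failed" or "aborted"
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} still working after {max_checks} checks")
```

A real call would look like `wait_for_job(job_id, lambda j: check_job_status(j, API_KEY))`, where `job_id` comes from the `jobId` field of the initial response and `API_KEY` is your own key.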
