# PDF to CSV

> Convert PDF and scanned images into CSV representation with layout, columns, rows, and tables.

## `POST /v1/pdf/convert/to/csv`

## Attributes

<Note>Attributes are case-sensitive and must be sent inside the JSON body of the POST request, for example: `{ "url": "https://example.com/file1.pdf" }`</Note>

| Attribute                              | Type                          | Required | Default                    | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| -------------------------------------- | ----------------------------- | -------- | -------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `url`                                  | string                        | *Yes*    | -                          | URL to the source file. See [`url` attribute](/api-reference/url-input-and-request-limits#supported-file-sources). |
| `callback`                             | string                        | *No*     | -                          | The callback URL (or Webhook) used to receive the POST data. See [Webhooks & Callbacks](/api-reference/webhooks). This is only applicable when `async` is set to `true`. |
| `httpusername`                         | string                        | *No*     | -                          | HTTP auth user name if required to access source URL.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| `httppassword`                         | string                        | *No*     | -                          | HTTP auth password if required to access source URL.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| `pages`                                | string                        | *No*     | all pages                  | Specify page indices as comma-separated values or ranges to process (e.g. "0, 1, 2-" or "1, 2, 3-7"). The first-page index is 0. Use "!" before a number for inverted page numbers (e.g. "!0" for the last page). If not specified, the default configuration processes all pages. The input must be in string format.                                                                                                                                                                                                                                                             |
| `unwrap`                               | boolean                       | *No*     | `false`                    | Unwrap lines into a single line within table cells in provided PDF documents. This is only applicable when `lineGrouping` is set to `1`.                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `rect`                                 | string                        | *No*     | -                          | Defines coordinates for extraction. Use `PDF Edit Add Helper` to get or measure PDF coordinates. The format is `{x} {y} {width} {height}`. |
| `lang`                                 | string                        | *No*     | `eng`                      | Sets the OCR (text from image) language used when extracting text from scanned PDF, PNG, and JPG input documents. See [Language Support](/api-reference/language-support). You can also use 2 languages simultaneously like this: `eng+deu` (any combination). |
| `inline`                               | boolean                       | *No*     | `false`                    | Set to true to return results inside the response. Otherwise, the endpoint will return a URL to the output file generated.                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `lineGrouping`                         | string                        | *No*     | -                          | Controls how lines of text are grouped within table cells when extracting data from a PDF. The available modes are: `1`, `2`, `3`. For more information, see [Line Grouping](#line-grouping-options). |
| `password`                             | string                        | *No*     | -                          | Password for the PDF file.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `async`                                | boolean                       | *No*     | `false`                    | Set `async` to `true` to run long processes in the background. The API then returns a `jobId` that you can use with the [Background Job Check endpoint](/api-reference/job-check). See also [Webhooks & Callbacks](/api-reference/webhooks). |
| `name`                                 | string                        | *No*     | -                          | File name for the generated output. The value must be a string. |
| `expiration`                           | integer                       | *No*     | `60`                       | Set the expiration time for the output link in minutes. After this specified duration, any generated output file(s) will be automatically deleted from [PDF.co Temporary Files Storage](/api-reference/file-upload/overview). The maximum duration for link expiration varies based on your current subscription plan. To store permanent input files (e.g. re-usable images, pdf templates, documents) consider using [PDF.co Built-In Files Storage](https://app.pdf.co/tools/files).                                                                                            |
| `profiles`                             | object                        | *No*     | -                          | See [Profiles](/api-reference/profiles) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|     `outputDataFormat`                 | string                        | *No*     | -                          | If you require your output as `base64` format, set this to `base64`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
|     `ColumnDetectionMode`              | string                        | *No*     | Content Groups And Borders | Controls column detection/alignment in PDF table extraction. See [Column Detection Mode](#column-detection-mode) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                                             |
|     `OCRMode`                          | string                        | *No*     | `Auto`                     | Specifies how OCR (Optical Character Recognition) should process input content, offering various modes to tailor text extraction based on content type such as images, fonts, and vector graphics. For more information, see [OCR Extraction Modes](/api-reference/profiles#ocr-extraction-modes).                                                                                                                                                                                                                                                                                 |
|     `OCRResolution`                    | integer                       | *No*     | `300`                      | Use this parameter to change the OCR resolution from the default 300 dpi. The range is from `72` to `1200` dpi.                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|     `RotationAngle`                    | integer                       | *No*     | -                          | Use manual rotation to handle PDFs with vertically drawn text. Normally, OCR automatically detects page rotation in PDFs and extracts text accurately. However, in some cases the page itself is not rotated; rather, the text is drawn vertically. In such scenarios, auto-detection may fail. You can use this parameter to manually set the page rotation. The available angles are: `0`, `1`, `2`, `3`. |
|     `LineGroupingMode`                 | string                        | *No*     | `None`                     | Controls line grouping in PDF text extraction. Modes: `None` (no grouping), `GroupByRows` (merge rows if all cells align), `GroupByColumns` (merge cells by column), `JoinOrphanedRows` (merge single-cell rows to above if no separator).                                                                                                                                                                                                                                                                                                                                         |
|     `ConsiderFontColors`               | boolean                       | *No*     | `false`                    | Controls whether font colors should be considered when detecting table structure and merging text objects during PDF extraction. Set to true to consider font colors.                                                                                                                                                                                                                                                                                                                                                                                                              |
|     `DetectNewColumnBySpacesRatio`     | string                        | *No*     | `1.2`                      | Controls how spaces between words are interpreted for column detection in PDF text extraction. It defines the ratio of space width that determines when text should be treated as being in separate columns.                                                                                                                                                                                                                                                                                                                                                                       |
|     `AutoAlignColumnsToHeader`         | boolean                       | *No*     | `true`                     | Controls how columns are detected and aligned during table extraction from PDF documents. It affects both table structure detection and text extraction with formatting preservation. When set to true (default), the row with the most columns is used as the header, and all other rows are aligned to this structure, which is ideal for well-structured tables. When set to false, columns are analyzed independently across all rows to build the structure, which works better for inconsistent or irregular tables. |
|     `OCRImagePreprocessingFilters`     | object                        | *No*     | -                          | Image preprocessing filters for OCR. Refer to [OCRImagePreprocessingFilters](#ocrimagepreprocessingfilters) for usage examples.                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|         `.AddGrayscale`                | boolean                       | *No*     | `false`                    | Converts to grayscale before OCR.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|         `.AddGammaCorrection`          | array\[string (float format)] | *No*     | \["1.4"]                   | Adds a gamma correction filter.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|     `OCRAutoModeMinExistingTextLength` | integer                       | *No*     | `8`                        | The minimum number of characters a page must have to skip OCR. If a page has fewer, OCR will run. For example, if set to 8, OCR is skipped on pages with more than 8 characters.                                                                                                                                                                                                                                                                                                                                                                                                   |
|     `SaveVectors`                      | boolean                       | *No*     | `false`                    | Controls whether to save vector graphics during PDF to HTML conversion. Set to true to save vector graphics.                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
|     `SaveImages`                       | string                        | *No*     | `None`                     | Controls how images are saved during PDF to HTML conversion. Modes: `None` (no images), `OuterFile` (save to sub-folder), `Embed` (embed as Base64 data:URI).                                                                                                                                                                                                                                                                                                                                                                                                                      |
|     `ConsiderFontSizes`                | boolean                       | *No*     | `false`                    | Set to true to make the converter consider font size differences in document text when detecting and parsing table structures. This can help when tables use different font sizes to distinguish between headers, data cells, or other structural elements. |
|     `ExtractionArea`                   | array\[number]               | *No*     | -                          | Extracts text from a specific area, defined in points in the format \[x, y, width, height]. |
|     `ExtractShadowLikeText`            | boolean                       | *No*     | `true`                     | Controls whether to extract invisible text from a PDF document. Set to false to skip over invisible text during extraction. This is particularly useful when dealing with PDFs that contain hidden text layers or when you only want to extract visible content. When this value is set to false, OCRMode must be set to `Auto` to properly apply the shadow text filtering effect.                                                                                                                                                                                                |
|     `DataEncryptionAlgorithm`          | string                        | *No*     | -                          | Controls the encryption algorithm used for data encryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information. The available algorithms are: `AES128`, `AES192`, `AES256`.                                                                                                                                                                                                                                                                                                                                                          |
|     `DataEncryptionKey`                | string                        | *No*     | -                          | Controls the encryption key used for data encryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                            |
|     `DataEncryptionIV`                 | string                        | *No*     | -                          | Controls the encryption IV used for data encryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                             |
|     `DataDecryptionAlgorithm`          | string                        | *No*     | -                          | Controls the decryption algorithm used for data decryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information. The available algorithms are: `AES128`, `AES192`, `AES256`.                                                                                                                                                                                                                                                                                                                                                          |
|     `DataDecryptionKey`                | string                        | *No*     | -                          | Controls the decryption key used for data decryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                            |
|     `DataDecryptionIV`                 | string                        | *No*     | -                          | Controls the decryption IV used for data decryption. See [User-Controlled Encryption](/knowledgebase/user-controlled-encryption) for more information.                                                                                                                                                                                                                                                                                                                                                                                                                             |

<Note>You can use [profiles](/api-reference/profiles#converting-pdfs) to control the convert process and output of the CSV file.</Note>
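
To make the `pages` syntax concrete, here is a minimal Python sketch that expands a `pages` string into zero-based page indices. The helper `expand_pages` is hypothetical and not part of the API. It assumes that ranges are inclusive, that an open-ended range such as `2-` runs to the last page, and that `!N` counts back from the last page (so `!1` would be the second-to-last page); only the `!0` case is confirmed by the table above.

```python
from typing import List


def expand_pages(spec: str, page_count: int) -> List[int]:
    """Expand a `pages` string into zero-based indices (illustrative only)."""

    def resolve(token: str) -> int:
        token = token.strip()
        # "!N" counts from the end: "!0" is the last page (assumed for N > 0).
        if token.startswith("!"):
            return page_count - 1 - int(token[1:])
        return int(token)

    result: List[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = resolve(start_text)
            # An open-ended range such as "2-" is assumed to run to the last page.
            end = resolve(end_text) if end_text.strip() else page_count - 1
            result.extend(range(start, end + 1))
        else:
            result.append(resolve(part))
    return result


print(expand_pages("0, 1, 2-", 5))
print(expand_pages("1, 2, 3-7", 10))
print(expand_pages("!0", 5))
```

The service resolves the string itself; a helper like this is only useful for checking a page specification locally before sending the request.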

### Column Detection Mode

Some documents contain overlapping invisible text and vector objects that interfere with column detection. In such cases, you may need to adjust the detection mode to fix wrongly positioned data.

Set the options for your column detection via the following `profiles` parameters:

`ColumnDetectionMode` - available values:

* `ContentGroupsAndBorders` (default, no need to specify)
* `ContentGroups`
* `Borders`
* `BorderedTables`
* `ContentGroupsAI`

```json theme={null}
{
 "profiles": "{ 'ColumnDetectionMode': 'ContentGroups' }"
}
```
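
Because `profiles` is passed as a string, client code typically serializes the nested options separately from the outer payload. The Python sketch below shows one way to do that. The helper `build_payload` and the constant `COLUMN_DETECTION_MODES` are hypothetical names, and the sketch assumes the API accepts standard double-quoted JSON inside the `profiles` string (the example above writes the nested options with single quotes).

```python
import json

# Values listed on this page for ColumnDetectionMode.
COLUMN_DETECTION_MODES = {
    "ContentGroupsAndBorders",
    "ContentGroups",
    "Borders",
    "BorderedTables",
    "ContentGroupsAI",
}


def build_payload(url: str, column_detection_mode: str) -> dict:
    """Build a request body with `profiles` serialized as a string."""
    if column_detection_mode not in COLUMN_DETECTION_MODES:
        raise ValueError(f"unknown ColumnDetectionMode: {column_detection_mode}")
    profiles = {"ColumnDetectionMode": column_detection_mode}
    # Assumption: double-quoted JSON inside the string is accepted.
    return {"url": url, "profiles": json.dumps(profiles)}


payload = build_payload("https://example.com/file1.pdf", "ContentGroups")
print(json.dumps(payload, indent=2))
```

Serializing with `json.dumps` avoids hand-escaping quotes when the profile grows beyond a single option.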

### `OCRImagePreprocessingFilters`

To set image preprocessing filters, please use:

```json theme={null}
{
 "profiles": "{ 'ExtractShadowLikeText': false, 'OCRMode': 'Auto', 'OCRImagePreprocessingFilters.AddGrayscale()': [], 'OCRImagePreprocessingFilters.AddGammaCorrection()': [1.4] }"
}
```

### Line Grouping Options

* `"1"`: GroupByRows – Each row is checked against the next row to see if they can be grouped together. Rows will only be grouped if all cells in the current row can be grouped with all cells in the next row. Useful when merging related content that spans multiple lines but belongs to the same logical row.
* `"2"`: GroupByColumns – Each cell is checked against the cell below it in the next row to determine if they can be grouped. Cells are grouped within the same column even if others can't be grouped. Useful for columnar data where content in each column might span multiple lines.
* `"3"`: JoinOrphanedRows – Joins a row with a single cell to the previous row if there is no separator between them. Useful for handling cases with orphaned or misaligned content.
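
The mapping between mode names and `lineGrouping` values can be kept in client code so that requests use readable names. The following Python sketch is illustrative only: `LINE_GROUPING_MODES` and `line_grouping_options` are hypothetical names, and the `unwrap` check reflects the attribute table's statement that `unwrap` only applies when `lineGrouping` is `1`.

```python
# Mode names and values as listed above.
LINE_GROUPING_MODES = {
    "GroupByRows": "1",
    "GroupByColumns": "2",
    "JoinOrphanedRows": "3",
}


def line_grouping_options(mode: str, unwrap: bool = False) -> dict:
    """Return the request attributes for a named line grouping mode."""
    value = LINE_GROUPING_MODES[mode]  # raises KeyError for unknown names
    if unwrap and value != "1":
        # Per the attribute table, `unwrap` only applies with lineGrouping "1".
        raise ValueError("unwrap only applies when lineGrouping is '1'")
    options = {"lineGrouping": value}
    if unwrap:
        options["unwrap"] = True
    return options


print(line_grouping_options("GroupByRows", unwrap=True))
print(line_grouping_options("JoinOrphanedRows"))
```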

## Query parameters

*No query parameters accepted.*

## Responses

| Parameter          | Type    | Description                                                                                                                  |
| ------------------ | ------- | ---------------------------------------------------------------------------------------------------------------------------- |
| `body`             | string  | Stringified CSV content                                                                                                      |
| `pageCount`        | integer | Number of pages in the PDF document.                                                                                         |
| `error`            | boolean | Indicates whether an error occurred (`false` means success)                                                                  |
| `status`           | string  | Status code of the request (200, 404, 500, etc.). For more information, see [Response Codes](/api-reference/response-codes). |
| `name`             | string  | Name of the output file                                                                                                      |
| `credits`          | integer | Number of credits consumed by the request                                                                                    |
| `remainingCredits` | integer | Number of credits remaining in the account                                                                                   |
| `duration`         | integer | Time taken for the operation in milliseconds                                                                                 |

## Example Payload

<Note>To see the request size limits, please refer to the [Request Size Limits](/api-reference/url-input-and-request-limits#pdf-co-request-size).</Note>

```json theme={null}
{
  "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-csv/sample.pdf",
  "lang": "eng",
  "inline": "true",
  "unwrap": "",
  "pages": "0-",
  "rect": "",
  "async": "false",
  "name": "result.csv",
  "password": "",
  "lineGrouping": "",
  "profiles": ""
}
```

## `Example` Response

<Note>To see the main response codes, please refer to the [Response Codes](/api-reference/response-codes) page.</Note>

```json theme={null}
{
  "body": "\"Your Company Name\",\"\",\"\",\"\",\r\n\"Your Address\",\"\",\"\",\"\",\r\n\"City, State Zip\",\"\",\"\",\"\",\r\n\"\",\"\",\"\",\"Invoice No. 123456\",\r\n\"\",\"\",\"\",\"Invoice Date 01/01/2016\",\r\n\"Client Name\",\"\",\"\",\"\",\r\n\"Address\",\"\",\"\",\"\",\r\n\"City, State Zip\",\"\",\"\",\"\",\r\n\"Notes\",\"\",\"\",\"\",\r\n\"Item\",\"Quantity\",\"Price\",\"Total\",\r\n\"Item 1\",\"1\",\"40.00\",\"40.00\",\r\n\"Item 2\",\"2\",\"30.00\",\"60.00\",\r\n\"Item 3\",\"3\",\"20.00\",\"60.00\",\r\n\"Item 4\",\"4\",\"10.00\",\"40.00\",\r\n\"\",\"\",\"TOTAL\",\"200.00\",\r\n",
  "pageCount": 2,
  "error": false,
  "status": 200,
  "name": "result.csv",
  "remainingCredits": 616411,
  "credits": 56
}
```
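
When `inline` is enabled, the CSV arrives as a string in `body`. The sketch below shows one way to turn that string into rows with Python's standard `csv` module. `parse_csv_body` is a hypothetical helper, and dropping the last cell assumes every row ends with a trailing comma, as in the example above.

```python
import csv
import io

# Hypothetical helper for illustration only.
def parse_csv_body(body):
    """Turn the `body` string into a list of rows (lists of cell strings)."""
    # newline="" leaves the \r\n row terminators for the csv module to handle
    reader = csv.reader(io.StringIO(body, newline=""))
    rows = []
    for row in reader:
        # A trailing comma is read as one extra empty cell; drop it
        if row and row[-1] == "":
            row = row[:-1]
        rows.append(row)
    return rows

# Two rows taken from the example `body` above, after JSON decoding
sample = '"Item","Quantity","Price","Total",\r\n"Item 1","1","40.00","40.00",\r\n'
rows = parse_csv_body(sample)
print(rows[1])  # ['Item 1', '1', '40.00', '40.00']
```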

<Note>
  **Inconsistent URL Encoding in cURL Output:** When you call the API with cURL, URL characters in the output JSON may appear as Unicode escape sequences. For example, the ampersand (`&`) may appear as `\u0026`. This is normal JSON encoding and does not affect the validity of the URL: any JSON parser decodes these escape sequences automatically, so the URL works correctly once the response is parsed.
</Note>
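
A short sketch of that decoding step, using Python's standard `json` module. The URL below is a made-up placeholder, not a real result link.

```python
import json

# Raw response text as cURL might print it; the URL is a hypothetical placeholder
raw = '{"url": "https://example.com/result.csv?a=1\\u0026b=2"}'

data = json.loads(raw)
print(data["url"])  # https://example.com/result.csv?a=1&b=2
```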

## Code Samples

<Tabs>
  <Tab title="CURL">
    ```bash theme={null}
    curl --location --request POST 'https://api.pdf.co/v1/pdf/convert/to/csv' \
    --header 'Content-Type: application/json' \
    --header 'x-api-key: *******************' \
    --data-raw '{
    "url": "https://pdfco-test-files.s3.us-west-2.amazonaws.com/pdf-to-csv/sample.pdf",
    "lang": "eng",
    "inline": "true",
    "unwrap": "",
    "pages": "0-",
    "rect": "",
    "async": "false",
    "name": "result.csv",
    "password": "",
    "lineGrouping": "",
    "profiles": ""
    }'
    ```
  </Tab>

  <Tab title="JavaScript/Node.js">
    ```javascript theme={null}
    var https = require("https");
    var path = require("path");
    var fs = require("fs");

    // `request` module is required for file upload.
    // Use "npm install request" command to install.
    var request = require("request");

    // The authentication key (API Key).
    // Get your own by registering at https://app.pdf.co
    const API_KEY = "*********************************";

    // Source PDF file
    const SourceFile = "./sample.pdf";
    // Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
    const Pages = "";
    // PDF document password. Leave empty for unprotected documents.
    const Password = "";
    // Destination CSV file name
    const DestinationFile = "./result.csv";


    // 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.
    getPresignedUrl(API_KEY, SourceFile)
        .then(([uploadUrl, uploadedFileUrl]) => {
            // 2. UPLOAD THE FILE TO CLOUD.
            uploadFile(API_KEY, SourceFile, uploadUrl)
                .then(() => {
                    // 3. CONVERT UPLOADED PDF FILE TO CSV
                    convertPdfToCsv(API_KEY, uploadedFileUrl, Password, Pages, DestinationFile);
                })
                .catch(e => {
                    console.log(e);
                });
        })
        .catch(e => {
            console.log(e);
        });


    function getPresignedUrl(apiKey, localFile) {
        return new Promise((resolve, reject) => {
            // Prepare request to `Get Presigned URL` API endpoint
            let queryPath = `/v1/file/upload/get-presigned-url?name=${path.basename(localFile)}`;
            let reqOptions = {
                host: "api.pdf.co",
                path: encodeURI(queryPath),
                headers: { "x-api-key": apiKey }
            };
            // Send request
            https.get(reqOptions, (response) => {
                response.on("data", (d) => {
                    let data = JSON.parse(d);
                    if (data.error == false) {
                        // Return presigned url we received
                        resolve([data.presignedUrl, data.url]);
                    }
                    else {
                        // Service reported error; reject so the caller's catch() logs it
                        reject("getPresignedUrl(): " + data.message);
                    }
                });
            })
                .on("error", (e) => {
                    // Request error
                    reject("getPresignedUrl(): " + e);
                });
        });
    }

    function uploadFile(apiKey, localFile, uploadUrl) {
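        return new Promise((resolve, reject) => {
            fs.readFile(localFile, (readErr, data) => {
                if (readErr) {
                    // Local file could not be read
                    return reject("uploadFile() read error: " + readErr);
                }
                request({
                    method: "PUT",
                    url: uploadUrl,
                    body: data,
                }, (err, res, body) => {
                    if (!err) {
                        resolve();
                    }
                    else {
                        // Reject so the caller's catch() logs the error
                        reject("uploadFile() request error: " + err);
                    }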
                });
            });
        });
    }

    function convertPdfToCsv(apiKey, uploadedFileUrl, password, pages, destinationFile) {
        // Prepare request to `PDF To CSV` API endpoint
        var queryPath = `/v1/pdf/convert/to/csv`;

        // JSON payload for api request
        var jsonPayload = JSON.stringify({
            name: path.basename(destinationFile), password: password, pages: pages, url: uploadedFileUrl, async: true
        });

        var reqOptions = {
            host: "api.pdf.co",
            method: "POST",
            path: queryPath,
            headers: {
                "x-api-key": apiKey,
                "Content-Type": "application/json",
                "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
            }
        };
        // Send request
        var postRequest = https.request(reqOptions, (response) => {
            // Decode chunks as UTF-8 text before any data arrives
            response.setEncoding("utf8");
            response.on("data", (d) => {
                // Parse JSON response
                let data = JSON.parse(d);

                if (data.error == false) {
                    console.log(`Job #${data.jobId} has been created!`);
                    checkIfJobIsCompleted(data.jobId, data.url, destinationFile);
                }
                else {
                    // Service reported error
                    console.log("convertPdfToCsv(): " + data.message);
                }
            });
        })
            .on("error", (e) => {
                // Request error
                console.log("convertPdfToCsv(): " + e);
            });

        // Write request data
        postRequest.write(jsonPayload);
        postRequest.end();
    }

    function checkIfJobIsCompleted(jobId, resultFileUrl, destinationFile) {
        let queryPath = `/v1/job/check`;

        // JSON payload for api request
        let jsonPayload = JSON.stringify({
            jobid: jobId
        });

        let reqOptions = {
            host: "api.pdf.co",
            path: queryPath,
            method: "POST",
            headers: {
                "x-api-key": API_KEY,
                "Content-Type": "application/json",
                "Content-Length": Buffer.byteLength(jsonPayload, 'utf8')
            }
        };

        // Send request
        var postRequest = https.request(reqOptions, (response) => {
            // Decode chunks as UTF-8 text before any data arrives
            response.setEncoding("utf8");
            response.on("data", (d) => {

                // Parse JSON response
                let data = JSON.parse(d);
                console.log(`Checking Job #${jobId}, Status: ${data.status}, Time: ${new Date().toLocaleString()}`);

                if (data.status == "working") {
                    // Check again after 3 seconds
                    setTimeout(function () { checkIfJobIsCompleted(jobId, resultFileUrl, destinationFile); }, 3000);
                }
                else if (data.status == "success") {
                    // Download CSV file
                    var file = fs.createWriteStream(destinationFile);
                    https.get(resultFileUrl, (response2) => {
                        response2.pipe(file)
                            .on("close", () => {
                                console.log(`Generated CSV file saved as "${destinationFile}" file.`);
                            });
                    });
                }
                else {
                    console.log(`Operation ended with status: "${data.status}".`);
                }
            })
        });

        // Write request data
        postRequest.write(jsonPayload);
        postRequest.end();
    }
    ```
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    import os
    import requests # pip install requests

    # The authentication key (API Key).
    # Get your own by registering at https://app.pdf.co
    API_KEY = "******************************************"

    # Base URL for PDF.co Web API requests
    BASE_URL = "https://api.pdf.co/v1"

    # Source PDF file
    SourceFile = ".\\sample.pdf"
    # Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
    Pages = ""
    # PDF document password. Leave empty for unprotected documents.
    Password = ""
    # Destination CSV file name
    DestinationFile = ".\\result.csv"


    def main(args = None):
        uploadedFileUrl = uploadFile(SourceFile)
        if (uploadedFileUrl != None):
            convertPdfToCSV(uploadedFileUrl, DestinationFile)


    def convertPdfToCSV(uploadedFileUrl, destinationFile):
        """Converts PDF To CSV using PDF.co Web API"""

        # Prepare requests params as JSON
        # See documentation: https://docs.pdf.co/
        parameters = {}
        parameters["name"] = os.path.basename(destinationFile)
        parameters["password"] = Password
        parameters["pages"] = Pages
        parameters["url"] = uploadedFileUrl

        # Prepare URL for 'PDF To CSV' API request
        url = "{}/pdf/convert/to/csv".format(BASE_URL)

        # Execute request and get response as JSON
        response = requests.post(url, json=parameters, headers={ "x-api-key": API_KEY })
        if (response.status_code == 200):
            json = response.json()

            if json["error"] == False:
                #  Get URL of result file
                resultFileUrl = json["url"]            
                # Download result file
                r = requests.get(resultFileUrl, stream=True)
                if (r.status_code == 200):
                    with open(destinationFile, 'wb') as file:
                        for chunk in r:
                            file.write(chunk)
                    print(f"Result file saved as \"{destinationFile}\" file.")
                else:
                    print(f"Request error: {r.status_code} {r.reason}")
            else:
                # Show service reported error
                print(json["message"])
        else:
            print(f"Request error: {response.status_code} {response.reason}")


    def uploadFile(fileName):
        """Uploads file to the cloud"""
        
        # 1. RETRIEVE PRESIGNED URL TO UPLOAD FILE.

        # Prepare URL for 'Get Presigned URL' API request
        url = "{}/file/upload/get-presigned-url?contenttype=application/octet-stream&name={}".format(
            BASE_URL, os.path.basename(fileName))
        
        # Execute request and get response as JSON
        response = requests.get(url, headers={ "x-api-key": API_KEY })
        if (response.status_code == 200):
            json = response.json()
            
            if json["error"] == False:
                # URL to use for file upload
                uploadUrl = json["presignedUrl"]
                # URL for future reference
                uploadedFileUrl = json["url"]

                # 2. UPLOAD FILE TO CLOUD.
                with open(fileName, 'rb') as file:
                    requests.put(uploadUrl, data=file, headers={ "x-api-key": API_KEY, "content-type": "application/octet-stream" })

                return uploadedFileUrl
            else:
                # Show service reported error
                print(json["message"])    
        else:
            print(f"Request error: {response.status_code} {response.reason}")

        return None


    if __name__ == '__main__':
        main()
    ```
  </Tab>

  <Tab title="C#">
    ```csharp theme={null}
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Net;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    namespace PDFcoApiExample
    {
      class Program
      {
        // The authentication key (API Key).
        // Get your own by registering at https://app.pdf.co
        const String API_KEY = "**************************************";
        
        // Source PDF file
        const string SourceFile = @".\sample.pdf";
        // Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
        const string Pages = "";
        // PDF document password. Leave empty for unprotected documents.
        const string Password = "";
        // Destination CSV file name
        const string DestinationFile = @".\result.csv";

        static void Main(string[] args)
        {
          // Create standard .NET web client instance
          WebClient webClient = new WebClient();

          // Set API Key
          webClient.Headers.Add("x-api-key", API_KEY);

          // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
          // * If you already have a direct file URL, skip to step 3.
          
          // Prepare URL for `Get Presigned URL` API call
          string query = Uri.EscapeUriString(string.Format(
                    "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name={0}", 
            Path.GetFileName(SourceFile)));

          try
          {
            // Execute request
            string response = webClient.DownloadString(query);

            // Parse JSON response
            JObject json = JObject.Parse(response);

            if (!json["error"].ToObject<bool>())
            {
              // Get URL to use for the file upload
              string uploadUrl = json["presignedUrl"].ToString();
              string uploadedFileUrl = json["url"].ToString();

              // 2. UPLOAD THE FILE TO CLOUD.

              webClient.Headers.Add("content-type", "application/octet-stream");
              webClient.UploadFile(uploadUrl, "PUT", SourceFile); // You can use UploadData() instead if your file is byte[] or Stream
              webClient.Headers.Remove("content-type");

              // 3. CONVERT UPLOADED PDF FILE TO CSV

              // URL for `PDF To CSV` API call
              var url = "https://api.pdf.co/v1/pdf/convert/to/csv";

              // Prepare requests params as JSON
              Dictionary<string, object> parameters = new Dictionary<string, object>();
              parameters.Add("name", Path.GetFileName(DestinationFile));
              parameters.Add("password", Password);
              parameters.Add("pages", Pages);
              parameters.Add("url", uploadedFileUrl);

              // Convert dictionary of params to JSON
              string jsonPayload = JsonConvert.SerializeObject(parameters);

              // Execute POST request with JSON payload
              response = webClient.UploadString(url, jsonPayload);

              // Parse JSON response
              json = JObject.Parse(response);

              if (!json["error"].ToObject<bool>())
              {
                // Get URL of generated CSV file
                string resultFileUrl = json["url"].ToString();

                // Download CSV file
                webClient.DownloadFile(resultFileUrl, DestinationFile);

                Console.WriteLine("Generated CSV file saved as \"{0}\" file.", DestinationFile);
              }
              else
              {
                Console.WriteLine(json["message"].ToString());
              }
            }
            else
            {
              Console.WriteLine(json["message"].ToString());
            }
          }
          catch (WebException e)
          {
            Console.WriteLine(e.ToString());
          }

          webClient.Dispose();

          Console.WriteLine();
          Console.WriteLine("Press any key...");
          Console.ReadKey();
        }
      }
    }
    ```
  </Tab>

  <Tab title="Java">
    ```java theme={null}
    package com.company;

    import com.google.gson.JsonObject;
    import com.google.gson.JsonParser;
    import okhttp3.*;

    import java.io.*;
    import java.net.*;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class Main
    {
        // The authentication key (API Key).
        // Get your own by registering at https://app.pdf.co
        final static String API_KEY = "***********************************";

        // Source PDF file
        final static Path SourceFile = Paths.get(".\\sample.pdf");
        // Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
        final static String Pages = "";
        // PDF document password. Leave empty for unprotected documents.
        final static String Password = "";
        // Destination CSV file name
        final static Path DestinationFile = Paths.get(".\\result.csv");


        public static void main(String[] args) throws IOException
        {
            // Create HTTP client instance
            OkHttpClient webClient = new OkHttpClient();

            // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
            // * If you already have a direct file URL, skip to step 3.

            // Prepare URL for `Get Presigned URL` API call
            String query = String.format(
                    "https://api.pdf.co/v1/file/upload/get-presigned-url?contenttype=application/octet-stream&name=%s",
                    SourceFile.getFileName());

            // Prepare request
            Request request = new Request.Builder()
                    .url(query)
                    .addHeader("x-api-key", API_KEY) // (!) Set API Key
                    .build();
            // Execute request
            Response response = webClient.newCall(request).execute();

            if (response.code() == 200)
            {
                // Parse JSON response
                JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

                boolean error = json.get("error").getAsBoolean();
                if (!error)
                {
                    // Get URL to use for the file upload
                    String uploadUrl = json.get("presignedUrl").getAsString();
                    // Get URL of uploaded file to use with later API calls
                    String uploadedFileUrl = json.get("url").getAsString();

                    // 2. UPLOAD THE FILE TO CLOUD.

                    if (uploadFile(webClient, API_KEY, uploadUrl, SourceFile))
                    {
                        // 3. CONVERT UPLOADED PDF FILE TO CSV

                        PdfToCsv(webClient, API_KEY, DestinationFile, Password, Pages, uploadedFileUrl);
                    }
                }
                else
                {
                    // Display service reported error
                    System.out.println(json.get("message").getAsString());
                }
            }
            else
            {
                // Display request error
                System.out.println(response.code() + " " + response.message());
            }
        }

        public static void PdfToCsv(OkHttpClient webClient, String apiKey, Path destinationFile,
            String password, String pages, String uploadedFileUrl) throws IOException
        {
            // Prepare URL for `PDF To CSV` API call
            String query = "https://api.pdf.co/v1/pdf/convert/to/csv";

            // Make correctly escaped (encoded) URL
            URL url = null;
            try
            {
                url = new URI(null, query, null).toURL();
            }
            catch (URISyntaxException e)
            {
                e.printStackTrace();
            }

            // Create JSON payload
            String jsonPayload = String.format("{\"name\": \"%s\", \"password\": \"%s\", \"pages\": \"%s\", \"url\": \"%s\"}",
                    destinationFile.getFileName(),
                    password,
                    pages,
                    uploadedFileUrl);

            // Prepare request body
            RequestBody body = RequestBody.create(MediaType.parse("application/json"), jsonPayload);
            
            // Prepare request
            Request request = new Request.Builder()
                .url(url)
                .addHeader("x-api-key", apiKey) // (!) Set API Key
                .addHeader("Content-Type", "application/json")
                .post(body)
                .build();
            
            // Execute request
            Response response = webClient.newCall(request).execute();
            

            if (response.code() == 200)
            {
                // Parse JSON response
                JsonObject json = new JsonParser().parse(response.body().string()).getAsJsonObject();

                boolean error = json.get("error").getAsBoolean();
                if (!error)
                {
                    // Get URL of generated CSV file
                    String resultFileUrl = json.get("url").getAsString();

                    // Download CSV file
                    downloadFile(webClient, resultFileUrl, destinationFile.toFile());

                    System.out.printf("Generated CSV file saved as \"%s\" file.", destinationFile.toString());
                }
                else
                {
                    // Display service reported error
                    System.out.println(json.get("message").getAsString());
                }
            }
            else
            {
                // Display request error
                System.out.println(response.code() + " " + response.message());
            }
        }

        public static boolean uploadFile(OkHttpClient webClient, String apiKey, String url, Path sourceFile) throws IOException
        {
            // Prepare request body
            RequestBody body = RequestBody.create(MediaType.parse("application/octet-stream"), sourceFile.toFile());

            // Prepare request
            Request request = new Request.Builder()
                    .url(url)
                    .addHeader("x-api-key", apiKey) // (!) Set API Key
                    .addHeader("content-type", "application/octet-stream")
                    .put(body)
                    .build();

            // Execute request
            Response response = webClient.newCall(request).execute();

            return (response.code() == 200);
        }

        public static void downloadFile(OkHttpClient webClient, String url, File destinationFile) throws IOException
        {
            // Prepare request
            Request request = new Request.Builder()
                    .url(url)
                    .build();
            // Execute request
            Response response = webClient.newCall(request).execute();

            byte[] fileBytes = response.body().bytes();

            // Save downloaded bytes to file
            OutputStream output = new FileOutputStream(destinationFile);
            output.write(fileBytes);
            output.flush();
            output.close();

            response.close();
        }
    }
    ```
  </Tab>

  <Tab title="PHP">
    ```php theme={null}
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>PDF To CSV Extraction Results</title>
    </head>
    <body>

    <?php 
    // Note: If your input files are larger than 200 KB, we highly recommend using the "async" mode example instead.

    // Get submitted form data
    $apiKey = $_POST["apiKey"]; // The authentication key (API Key). Get your own by registering at https://app.pdf.co
    $pages = $_POST["pages"];


    // 1. RETRIEVE THE PRESIGNED URL TO UPLOAD THE FILE.
    // * If you already have a direct PDF file link, go to step 3.

    // Create URL
    $url = "https://api.pdf.co/v1/file/upload/get-presigned-url" . 
        "?name=" . urlencode($_FILES["file"]["name"]) .
        "&contenttype=application/octet-stream";
        
    // Create request
    $curl = curl_init();
    curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey));
    curl_setopt($curl, CURLOPT_URL, $url);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
    // Execute request
    $result = curl_exec($curl);

    if (curl_errno($curl) == 0)
    {
        $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
        
        if ($status_code == 200)
        {
            $json = json_decode($result, true);
            
            // Get URL to use for the file upload
            $uploadFileUrl = $json["presignedUrl"];
            // Get URL of uploaded file to use with later API calls
            $uploadedFileUrl = $json["url"];
            
            // 2. UPLOAD THE FILE TO CLOUD.
            
            $localFile = $_FILES["file"]["tmp_name"];
            $fileHandle = fopen($localFile, "r");
            
            curl_setopt($curl, CURLOPT_URL, $uploadFileUrl);
            curl_setopt($curl, CURLOPT_HTTPHEADER, array("content-type: application/octet-stream"));
            curl_setopt($curl, CURLOPT_PUT, true);
            curl_setopt($curl, CURLOPT_INFILE, $fileHandle);
            curl_setopt($curl, CURLOPT_INFILESIZE, filesize($localFile));

            // Execute request
            curl_exec($curl);
            
            fclose($fileHandle);
            
            if (curl_errno($curl) == 0)
            {
                $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
                
                if ($status_code == 200)
                {
                    // 3. CONVERT UPLOADED PDF FILE TO CSV
                    
                    ExtractCSV($apiKey, $uploadedFileUrl, $pages);
                }
                else
                {
                    // Display request error
                    echo "<p>Status code: " . $status_code . "</p>"; 
                    echo "<p>" . $result . "</p>"; 
                }
            }
            else
            {
                // Display CURL error
                echo "Error: " . curl_error($curl);
            }
        }
        else
        {
            // Display service reported error
            echo "<p>Status code: " . $status_code . "</p>"; 
            echo "<p>" . $result . "</p>"; 
        }
        
        curl_close($curl);
    }
    else
    {
        // Display CURL error
        echo "Error: " . curl_error($curl);
    }

    function ExtractCSV($apiKey, $uploadedFileUrl, $pages) 
    {
        // Create URL
        $url = "https://api.pdf.co/v1/pdf/convert/to/csv";
        
        // Prepare requests params
        $parameters = array();
        $parameters["url"] = $uploadedFileUrl;
        $parameters["pages"] = $pages;

        // Create Json payload
        $data = json_encode($parameters);

        // Create request
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_HTTPHEADER, array("x-api-key: " . $apiKey, "Content-type: application/json"));
        curl_setopt($curl, CURLOPT_URL, $url);
        curl_setopt($curl, CURLOPT_POST, true);
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($curl, CURLOPT_POSTFIELDS, $data);

        // Execute request
        $result = curl_exec($curl);
        
        if (curl_errno($curl) == 0)
        {
            $status_code = curl_getinfo($curl, CURLINFO_HTTP_CODE);
            
            if ($status_code == 200)
            {
                $json = json_decode($result, true);
                
                if (!isset($json["error"]) || $json["error"] == false)
                {
                    $resultFileUrl = $json["url"];
                    
                    // Display link to the file with conversion results
                    echo "<div><h2>Conversion Result:</h2><a href='" . $resultFileUrl . "' target='_blank'>" . $resultFileUrl . "</a></div>";
                }
                else
                {
                    // Display service reported error
                    echo "<p>Error: " . $json["message"] . "</p>"; 
                }
            }
            else
            {
                // Display request error
                echo "<p>Status code: " . $status_code . "</p>"; 
                echo "<p>" . $result . "</p>"; 
            }
        }
        else
        {
            // Display CURL error
            echo "Error: " . curl_error($curl);
        }
        
        // Cleanup
        curl_close($curl);
    }

    ?>

    </body>
    </html>
    ```
  </Tab>
</Tabs>
