# Extract Table From PDF

This activity extracts tables from a user-specified PDF file and returns them as a list of tables in the selected output format.

### Input

* **Filename:** [`Object Argument`](/getting-started/rpa-studio/arguments.md#types-of-arguments) <mark style="color:red;">`Required`</mark>\
  The full path of the input PDF file.<br>
* **Output Format:**\
  The output format of the parsed tables.
  * `DataTable (default)` - Extracts each table from the PDF as a DataTable
  * `JSON` - Extracts each table from the PDF as JSON
  * `CSV` - Extracts each table from the PDF as CSV<br>
* **Configuration:**

  Specifies the selector that contains the user-selected area from which the table is parsed.<br>
* **Extraction Method:**

  Specifies the method used to detect cells in the input PDF. The available options are:

  * `Lattice` - Uses gridlines to identify cells in the PDF document. This method cannot be applied to scanned documents.
  * `Stream` - Uses whitespace to detect cells in the PDF document.
  * `Custom` - Uses user-specified regular expressions to extract data from each PDF page. Configure the regular expressions in the **Build DataTable Window**, which is launched with the configure table button. Each page yields one row of the extracted table.<br>
* **Page Range:** [`String Argument`](/getting-started/rpa-studio/arguments.md#types-of-arguments) <mark style="color:red;">`Required`</mark>\
  Specifies the page number, or the range of page numbers, to be processed.
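
To illustrate the `Custom` extraction method described above, the following Python sketch shows one way regular expressions can turn each page into a single row. It is not the activity's implementation: the function `pages_to_rows`, the column names, the patterns, and the sample page text are all hypothetical, and the sketch assumes each pattern has exactly one capture group.

```python
import re

# Hypothetical column definitions: column name -> regex with one capture group.
COLUMN_PATTERNS = {
    "invoice_no": r"Invoice No:\s*(\S+)",
    "total": r"Total:\s*([\d.]+)",
}

def pages_to_rows(page_texts, patterns):
    """Build one row per page by applying every column pattern to that page."""
    rows = []
    for text in page_texts:
        row = {}
        for column, pattern in patterns.items():
            match = re.search(pattern, text)
            # Assumption: a column with no match on a page is left empty (None).
            row[column] = match.group(1) if match else None
        rows.append(row)
    return rows

# Hypothetical text of a two-page PDF, one string per page.
pages = [
    "Invoice No: A-100\nTotal: 25.50",
    "Invoice No: A-101\nTotal: 99.00",
]
rows = pages_to_rows(pages, COLUMN_PATTERNS)
```

Here `rows` holds two entries, one per page, which mirrors the "one row per page" behavior of the `Custom` method.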

{% hint style="info" %}
The page range can be specified in any of the following forms:

"0" - **All pages**\
"1" - **Page 1**\
"1-5" - **Pages 1 to 5**\
"1,4,6" - **Pages 1, 4 and 6**\
"1,3,5-9,12" - **Pages 1, 3, 5 to 9 and 12**\
"^1" - **Last page of the given PDF**\
"^2" - **Second-to-last page of the given PDF**\
"1,^1" - **Page 1 and the last page of the given PDF**
{% endhint %}
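
The page range notation above can be read as a small grammar. The following Python sketch shows one possible interpretation of it, not the activity's actual parser: the function name `parse_page_range` is hypothetical, only the forms listed in the hint are handled, and behavior for out-of-range or malformed input is left undefined.

```python
def parse_page_range(spec: str, total_pages: int) -> list[int]:
    """Expand a page range string into a list of 1-based page numbers."""
    spec = spec.strip()
    if spec == "0":
        # "0" selects every page of the document.
        return list(range(1, total_pages + 1))

    pages = []
    for part in spec.split(","):
        part = part.strip()
        if part.startswith("^"):
            # "^n" counts from the end: "^1" is the last page.
            pages.append(total_pages - int(part[1:]) + 1)
        elif "-" in part:
            # "a-b" is an inclusive range of pages.
            start, end = part.split("-")
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(part))
    return pages
```

For example, under these assumptions `parse_page_range("1,^1", 10)` selects page 1 and page 10 of a ten-page PDF.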

### Output

* **Output:**

  Saves the output as a list of tables in the specified variable. Each item in the list is a DataTable, CSV, or JSON value, according to the selected Output Format.

