Run Extractor

Extracts data from a document

The Run Extractor activity takes an extractor definition file and a preprocessed document as its inputs. It extracts the required data from the document according to the extractor definition and outputs an ExtractionResult variable containing the extracted data.

If the user provides the ClassificationResult produced for the document by the Classify Document activity, and the extractor definition is multi-class, the activity extracts only the data defined under the class predicted in the ClassificationResult.
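
The class filtering described above can be sketched in Python. This is an illustrative model only, not the activity's implementation: the dictionary layout of the definition, the `select_fields` function, and the class and field names are all hypothetical. The fallback used when no classification result is given is also an assumption, since this page does not describe it.

```python
from typing import Dict, List, Optional

# Hypothetical multi-class extractor definition: class name -> fields to extract.
# The real *.extractor.vdai format is not described on this page.
DEFINITION: Dict[str, List[str]] = {
    "invoice": ["invoice_number", "total"],
    "receipt": ["merchant", "amount"],
}


def select_fields(
    definition: Dict[str, List[str]], predicted_class: Optional[str]
) -> Dict[str, List[str]]:
    """Return the part of the definition that would be applied to the document."""
    is_multi_class = len(definition) > 1
    if predicted_class is not None and is_multi_class:
        # Only the data defined under the predicted class is extracted.
        return {predicted_class: definition[predicted_class]}
    # Assumption: without a classification result (or with a single-class
    # definition) the whole definition is applied.
    return dict(definition)
```

Calling `select_fields(DEFINITION, "invoice")` keeps only the `invoice` entry, which mirrors the behaviour described for a multi-class definition combined with a classification result.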

The user should preprocess the document using a Document AI Client with the appropriate extraction enabled, depending on the extractor used:

  • Text : Regex Extractor

  • Skill : Skill Extractor

  • Form : Form Extractor
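
The pairing in the list above can be held in a small lookup table. The sketch below assumes that the first word of each entry names the extraction to enable on the Document AI Client and that the rest names the extractor. The strings are plain labels taken from the list, not product identifiers, and `required_extraction` is a hypothetical helper.

```python
# Assumed pairing: extractor name -> extraction to enable during preprocessing.
REQUIRED_EXTRACTION = {
    "Regex Extractor": "Text",
    "Skill Extractor": "Skill",
    "Form Extractor": "Form",
}


def required_extraction(extractor: str) -> str:
    """Look up which extraction the preprocessing step needs to enable."""
    try:
        return REQUIRED_EXTRACTION[extractor]
    except KeyError:
        # Fail loudly instead of guessing a default for an unknown extractor.
        raise ValueError(f"Unknown extractor: {extractor!r}") from None
```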

The extractor definition can be created using the Create Extractor Window.

Input

  • Processed Document : ProcessedDocument Variable Required The preprocessed document from the preprocess document activity.

  • Extractor Definition : String Argument Required Path to the extractor definition file (*.extractor.vdai).

  • Document AI Client : DocAIClient Variable Required The configured Document AI Client.

  • Classification Result : ClassificationResult Variable Required The ClassificationResult from the Classify Document activity, used when the document has been classified and the extractor definition is multi-class.
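
Because the Extractor Definition argument is a plain string path, a workflow can check the file name before the activity runs. A minimal sketch, assuming only the `*.extractor.vdai` naming pattern given above; `is_extractor_definition` is a hypothetical helper and the check is case-sensitive by choice.

```python
from pathlib import Path

# Suffix taken from the pattern *.extractor.vdai in the input description.
EXTRACTOR_SUFFIX = ".extractor.vdai"


def is_extractor_definition(path: str) -> bool:
    """Return True if the file name matches the *.extractor.vdai pattern."""
    name = Path(path).name
    # Require at least one character before the suffix, as the "*" implies.
    return name.endswith(EXTRACTOR_SUFFIX) and len(name) > len(EXTRACTOR_SUFFIX)
```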

Options

  • Split Pages : Required If true and the document has multiple pages, the extraction is applied to each page independently. Available values are:

    • True

    • False
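
The effect of Split Pages can be illustrated with a short Python sketch. It is a model of the option, not the activity's code: `run_extraction`, the `extract` callback, and the representation of pages as strings are hypothetical. Treating the document as a single unit when the option is false is an assumption.

```python
from typing import Callable, Dict, List


def run_extraction(
    pages: List[str],
    extract: Callable[[str], Dict[str, int]],
    split_pages: bool,
) -> List[Dict[str, int]]:
    """Apply `extract` per page or to the whole document, depending on the option."""
    if split_pages and len(pages) > 1:
        # Split Pages is true and the document has multiple pages:
        # each page is extracted independently.
        return [extract(page) for page in pages]
    # Assumption: otherwise the document is processed as one unit.
    return [extract("\n".join(pages))]


def count_words(text: str) -> Dict[str, int]:
    """Toy stand-in for an extractor, used only to make the sketch runnable."""
    return {"words": len(text.split())}
```

With `pages = ["a b", "c"]`, `run_extraction(pages, count_words, True)` returns one result per page, while passing `False` returns a single result for the whole document.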

Output
