Intelligent Document Processing

After a document is uploaded to ProcessMaker IDP, it undergoes the following steps for analysis and data extraction:

  1. Metadata Extraction: Metadata such as the author, creation date, and media type is extracted from the document. This metadata helps in organizing and managing the documents efficiently.

  2. OCR Processing: The OCR service processes every uploaded document to extract text from it. This step involves recognizing and digitizing text from images or scanned documents.

  3. Classification: The classification service analyzes the document and classifies it into one or more predefined document types based on the model being used.

  4. Model Analysis: The document is processed using a trained model to generate a Model Result. This result provides further insights and structured data extracted from the document.

    • Object Detection: The Model also identifies Objects in the document.

    • NER (Named Entity Recognition): Static models are used to identify information (address, organization, etc.) in the document. Anonymization also occurs at this step.

    • AI Model Gateway: If configured, custom AI Models can also be used for processing documents.

  5. Store Metadata: The extracted metadata is then stored in the database for further processing and easy retrieval.
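
The five steps above can be pictured as a simple sequential pipeline. The following is a minimal sketch for illustration only, not ProcessMaker IDP's actual implementation or API: the `Document` class, every function name, and the dictionary used as a "database" are hypothetical, and each stage is a trivial placeholder standing in for the real service (OCR engine, classifier, trained model).

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical in-memory record for one uploaded document."""
    filename: str
    content: bytes
    metadata: dict = field(default_factory=dict)
    text: str = ""
    doc_types: list = field(default_factory=list)
    model_result: dict = field(default_factory=dict)


def extract_metadata(doc: Document) -> None:
    # Step 1 (placeholder): a real service would also read author, creation date, etc.
    media_type = "application/pdf" if doc.filename.endswith(".pdf") else "unknown"
    doc.metadata = {"media_type": media_type, "size_bytes": len(doc.content)}


def run_ocr(doc: Document) -> None:
    # Step 2 (placeholder): treat the raw bytes as text instead of running real OCR.
    doc.text = doc.content.decode("utf-8", errors="ignore")


def classify(doc: Document) -> None:
    # Step 3 (placeholder): a keyword rule standing in for a classification model.
    doc.doc_types = ["invoice"] if "invoice" in doc.text.lower() else ["unknown"]


def analyze_with_model(doc: Document) -> None:
    # Step 4 (placeholder): an empty result shaped like the sub-steps described above
    # (object detection and NER); no real model is invoked here.
    doc.model_result = {"doc_types": list(doc.doc_types), "objects": [], "entities": []}


def store_metadata(doc: Document, database: dict) -> None:
    # Step 5 (placeholder): a plain dict stands in for the database.
    database[doc.filename] = dict(doc.metadata)


def process_document(doc: Document, database: dict) -> Document:
    """Run the five stages in the order the list above describes."""
    extract_metadata(doc)
    run_ocr(doc)
    classify(doc)
    analyze_with_model(doc)
    store_metadata(doc, database)
    return doc
```

Calling `process_document(Document("invoice-001.pdf", b"Invoice total: 42 EUR"), {})` would run each placeholder stage in turn. The point of the sketch is the ordering: later stages (classification, model analysis) depend on the text produced by OCR, and metadata is persisted only after analysis completes.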

Watch the following product tour to get a quick overview of how IDP processes documents.
