Intelligent Document Processing
After a document is uploaded to ProcessMaker IDP, it undergoes the following steps for analysis and data extraction:
Metadata Extraction: Metadata such as the author, creation date, and media type is extracted from the document. This metadata helps in organizing and managing the documents efficiently.
OCR Processing: The OCR service processes every uploaded document to extract text from it. This step involves recognizing and digitizing text from images or scanned documents.
Classification: The classification service analyzes the document and classifies it into one or more predefined document types based on the model being used.
Model Analysis: The document is processed using a trained model to generate a Model Result. This result provides further insights and structured data extracted from the document.
Object Detection: The Model also identifies Objects in the document.
NER (Named Entity Recognition): Static models are used to identify information (address, organization, etc.) in the document. Anonymization also occurs at this step.
AI Model Gateway: If configured, custom AI Models can also be used for processing documents.
Store Metadata: The extracted metadata is then stored in the database for further processing and easy retrieval.
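The steps above can be pictured as a simple sequential pipeline. The following Python sketch is illustrative only and is not ProcessMaker IDP's API: every name in it (`DocumentRecord`, `process`, the step functions, the in-memory `store`) is hypothetical, each step body is a placeholder for the real service, and the strict ordering is an assumption taken from the order of the list.

```python
import mimetypes
from dataclasses import dataclass, field


@dataclass
class DocumentRecord:
    """Hypothetical container for an uploaded document and its results."""
    filename: str
    content: bytes
    results: dict = field(default_factory=dict)
    steps_run: list = field(default_factory=list)


def extract_metadata(doc):
    # Placeholder: a real service would also read author, creation date, etc.
    doc.results["metadata"] = {
        "media_type": mimetypes.guess_type(doc.filename)[0],
        "size_bytes": len(doc.content),
    }


def run_ocr(doc):
    # Placeholder for OCR: decode the bytes instead of recognizing text in images.
    doc.results["text"] = doc.content.decode("utf-8", errors="replace")


def classify(doc):
    # Placeholder classifier: a keyword match stands in for the classification model.
    text = doc.results["text"].lower()
    doc.results["document_types"] = ["invoice"] if "invoice" in text else ["unknown"]


def analyze_with_model(doc):
    # Placeholder for the trained model that produces the Model Result.
    doc.results["model_result"] = {"fields": {}}


def detect_objects(doc):
    # Placeholder: no objects are detected in this sketch.
    doc.results["objects"] = []


def recognize_entities(doc):
    # Placeholder for NER and anonymization: no entities are extracted here.
    doc.results["entities"] = []


# Assumed order, mirroring the list of steps in the text.
PIPELINE = [
    ("metadata_extraction", extract_metadata),
    ("ocr_processing", run_ocr),
    ("classification", classify),
    ("model_analysis", analyze_with_model),
    ("object_detection", detect_objects),
    ("ner", recognize_entities),
]


def process(doc, store, custom_model=None):
    """Run every step in order, then store the extracted metadata."""
    for name, step in PIPELINE:
        step(doc)
        doc.steps_run.append(name)
    # The AI Model Gateway step runs only if a custom model is configured.
    if custom_model is not None:
        doc.results["custom_model"] = custom_model(doc)
        doc.steps_run.append("ai_model_gateway")
    # A plain dict stands in for the database.
    store[doc.filename] = doc.results["metadata"]
    doc.steps_run.append("store_metadata")
    return doc
```

Calling `process(DocumentRecord("invoice.txt", b"Invoice 42 from Acme"), store={})` runs the six fixed steps and then stores the metadata. Passing a callable as `custom_model` adds the optional gateway step before storage.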
Watch the following product tour to get a quick overview of how IDP processes documents.