Process Intelligence Documentation

Introduction to Data Collection


This chapter describes the data collection options and the method used to identify source computers. The Process Intelligence Platform combines accurate capture of business process data with advanced anonymization techniques. Data collection is justified, minimized, and explicitly defined.

Source-computer anonymity is achieved through three measures: (1) team-level source-computer identification, (2) opting in business process applications, and (3) versatile configuration of business process data collection for each opted-in business application. With this approach, no Personally Identifiable Information (PII) is collected.

The Process Intelligence Agent collects data on Windows platforms using several technologies, which are grouped under the term Work API. Examples include Windows COM (Component Object Model), DOM (Document Object Model), and Windows DLLs (Dynamic Link Libraries).

Defining data collection settings

The Process Intelligence platform collects no data until the Customer has defined data collection settings using the Work API Configuration functionality of the Process Intelligence Dashboard:

  • Define opt-in of applications: choose which applications are part of the analysis.

  • Configure collected data for each opted-in application: define what data is collected from those applications.

  • Apply configurations to Agents: data collection settings are automatically updated on all computers. Different teams can have different settings.

The business process data to be collected is clearly defined, and it is defined by the customer. Process Intelligence Agents then observe the use of opted-in business applications and collect business data according to the settings. All other applications are ignored and are out of scope of the analysis.
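
The opt-in model above can be illustrated with a minimal sketch. The class names, field names, and the `collect` function are hypothetical and chosen only for illustration; they are not the product's actual configuration schema or API. The sketch assumes that settings are grouped per team and that an application is identified by its process name.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AppConfig:
    """Hypothetical per-application settings (not the product's real schema)."""
    collect_window_titles: bool = False


@dataclass
class TeamSettings:
    """Hypothetical settings bundle applied to every Agent of one team."""
    team: str
    opted_in: dict[str, AppConfig] = field(default_factory=dict)


def collect(settings: TeamSettings, process_name: str, window_title: str) -> Optional[dict]:
    """Build an event for an opted-in application, or None if it is out of scope."""
    name = process_name.lower()
    config = settings.opted_in.get(name)
    if config is None:
        return None  # applications that are not opted in are ignored entirely
    event = {"application": name}
    if config.collect_window_titles:
        event["window_title"] = window_title
    return event


# Example: one team opts in a single desktop application.
finance = TeamSettings(
    team="finance",
    opted_in={"sapgui.exe": AppConfig(collect_window_titles=True)},
)
```

With these settings, `collect(finance, "sapgui.exe", "Invoice 12345")` yields an event, while any application that is not opted in yields `None`.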

Identification of Agents

Each Process Intelligence Agent is linked to a specific customer organization and a team. Unique team tokens are used for that purpose. This way the Platform does not need personal information to identify and separate different source-computer users.

Agents create random session IDs, which they use to separate repetitive workflows on source computers. The session IDs change over time, so the data sent to the Process Intelligence Platform cannot be assembled into data sets about individual source-computer users.

For the sake of clarity: because the Process Intelligence platform collects no unique identifiers such as computer names or usernames, individual computer users cannot be identified from the data stored in its databases.
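
The identification scheme can be sketched as follows. This is an illustration under stated assumptions, not the Agent's implementation: the class name `SessionIdRotator`, the event fields, and the rotation trigger (a fixed number of events per session) are all hypothetical. The source states only that session IDs are random and change.

```python
import secrets


class SessionIdRotator:
    """Hands out random session IDs and replaces them periodically.

    Rotating after a fixed number of events is an assumption for illustration.
    """

    def __init__(self, team_token: str, events_per_session: int = 100) -> None:
        self.team_token = team_token
        self.events_per_session = events_per_session
        self._count = 0
        self._session_id = secrets.token_hex(16)

    def next_event(self, payload: dict) -> dict:
        """Attach the team token and current session ID; no user or computer name."""
        if self._count >= self.events_per_session:
            self._session_id = secrets.token_hex(16)  # fresh ID, unrelated to the old one
            self._count = 0
        self._count += 1
        return {"team_token": self.team_token, "session_id": self._session_id, **payload}
```

Each event carries only the team token and a short-lived random ID, so events from different sessions cannot be joined into a per-user history.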

Opting-in business applications

The first step in allow-listing business process applications is to define which applications are used to perform business transactions. The supported application types are explained below.

| Application Type | Example | Notes |
| --- | --- | --- |
| Desktop Application | sapgui.exe | Native Windows applications |
| Web Application | organization.salesforce.com | The minimum part of the included domain. E.g., different.salesforce.com would be excluded. |
| Web Portal | invoices.organization.de | |
| Application on a virtual or remote desktop | Wfica32.exe (Citrix) | Depending on the design of the target application, some data collection might be limited. The Process Intelligence Agent can also be installed on the virtual machine for more granular data collection. |
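
The domain rule for web applications can be sketched as a host check. The function name `host_is_opted_in` is hypothetical, and treating subdomains of an opted-in domain as included is an assumption; the source confirms only that a sibling domain such as different.salesforce.com is excluded.

```python
from urllib.parse import urlsplit


def host_is_opted_in(url: str, opted_in_domains: set[str]) -> bool:
    """Return True if the URL's host equals, or is a subdomain of, an opted-in domain."""
    # hostname is None when the URL has no scheme; treat that as "not opted in".
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in opted_in_domains)


allowed = {"organization.salesforce.com", "invoices.organization.de"}
```

Here `https://organization.salesforce.com/reports` matches, while `https://different.salesforce.com/` does not, because only the dot-separated suffix is compared.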

Configurable data collection types

Data collection for opted-in applications can be defined in more detail using tags, identifiers, and salvaged data. These options are explained below.

| Data Type | Explanation | Examples and Notes |
| --- | --- | --- |
| Tag | Fixed keyword identified in titles, URLs, or the UI. | E.g., “Invoice” or “Report”. |
| Identifier | Variable or process identifier in a title, URL, or the UI. Values can optionally be hashed. | E.g., “Invoice number” = 12345, “Customer name” = Workfellow. |
| Salvaged Data | Data collected in its original format, used as training data to help define tags and identifiers. | E.g., enabling URL salvage for the domain app.sap.com would mean that visits to subpages are collected: app.sap.com/reports, app.sap.com/invoices… |

For the sake of clarity: the data collected through opt-in and allow-listing is defined by the customer.
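
As a rough illustration of tags and identifiers, the sketch below extracts both from a window title. Everything in it is hypothetical: the function name `extract`, the regular expression for invoice numbers, and the choice of SHA-256 for value hashing are assumptions made for the example, not the product's documented behavior.

```python
import hashlib
import re
from typing import Optional

TAGS = ("Invoice", "Report")  # fixed keywords, taken from the examples above
INVOICE_NUMBER = re.compile(r"Invoice\s+(\d+)")  # hypothetical identifier pattern


def extract(title: str, hash_values: bool = True) -> dict:
    """Return the tags found in a title and an optionally hashed identifier."""
    tags = [t for t in TAGS if t.lower() in title.lower()]
    match = INVOICE_NUMBER.search(title)
    identifier: Optional[str] = match.group(1) if match else None
    if identifier is not None and hash_values:
        # Hashing keeps cases distinguishable without storing the raw value.
        identifier = hashlib.sha256(identifier.encode("utf-8")).hexdigest()
    return {"tags": tags, "invoice_number": identifier}
```

For a title such as `Invoice 12345 - SAP`, the tag is `Invoice`; the identifier is `12345`, or its digest when hashing is enabled.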

Configurable data collection capabilities

| Data | Collection Status | Purpose |
| --- | --- | --- |
| Team-token | Yes | To identify the team and organization to which the computer belongs. |
| Session-ID | Yes for opt-in | To separate workflows. |
| Timestamps | Yes for opt-in | To identify time ranges and durations. |
| Application name | Yes for opt-in | To separate different applications. |
| Mouse click elements | Click events for opt-in; tagging element names is optional | To identify field changes in a process. E.g., identify address field edits in an invoicing application to find master data problems. |
| Typed keyboard length | Yes for opt-in, salvage option for specific application windows | |
| Keyboard shortcuts | Typical ones for opt-in | To identify manual data flows and activities. E.g., CTRL+C, CTRL+V. |
| Clipboard activity type: text, image or file | Yes for opt-in, salvage option for specific application windows | To identify manual data flows between applications. |
| Window titles | Yes for allow-listed | To separate different windows within applications. |
| Case identifiers | Yes for allow-listed | To identify process transactions. |
| File type | Yes for allow-listed | To identify the file formats used. |
| Business process related web URLs | Yes for allow-listed | Only specified business process-related web applications are included in data collection. |
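
Two rows of the table, typed keyboard length and clipboard activity type, describe collecting a property of the input and not the input itself. The sketch below shows that idea under stated assumptions: the function names and event fields are hypothetical, and mapping Python types (`Path`, `bytes`, `str`) to file, image, and text is only a stand-in for whatever the Agent inspects on Windows.

```python
from pathlib import Path
from typing import Union


def typed_length_event(typed_text: str) -> dict:
    """Keep only the number of typed characters, never the characters themselves."""
    return {"kind": "typed_length", "length": len(typed_text)}


def clipboard_event(content: Union[str, bytes, Path]) -> dict:
    """Record only whether the clipboard held text, an image, or a file."""
    if isinstance(content, Path):
        activity = "file"
    elif isinstance(content, bytes):
        activity = "image"  # assumption: raw bytes stand in for image data
    else:
        activity = "text"
    return {"kind": "clipboard", "activity_type": activity}
```

Neither event contains the typed or copied content, which matches the table's description of what is collected when the salvage option is not enabled.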
