# Training Data Studio

The Training Data Studio lets you create applications and define window rules from collected training data. By combining system discovery data, precise window rules, and filtering options, you can streamline data collection and application tracking.

This studio offers the following key features:

* **Create Applications from System Discovery:** When **System Discovery** is enabled, you can generate applications from the collected discovery data.
* **Create Window Rules:** If an application is in **Training Mode**, you can define new window rules for that application.

{% hint style="info" %}
An application is considered to be in Training Mode if `Capture full URLs or Titles` mode is enabled.
{% endhint %}

***

## Access the Training Data Studio

Follow these steps to navigate to the Training Data Studio:

1. Go to **Admin Panel → Configure Data Collection**.
2. Select an application:
   * Choose **System Discovery** to generate application rules.
   * Select a specific application to generate window rules.
3. In the **Data Collection Granularity** section, click **Go To Training Data Studio**.

   <div align="left"><figure><img src="/files/at6cRsO5Ngytk9EabPW4" alt="" width="563"><figcaption></figcaption></figure></div>

***

## Training Data List

The Training Data List appears at the bottom of the Training Data Studio and lists every unique **combination of process name, URL, and title** collected for the selected application. The **Total Visits** column shows how often each combination occurs, helping you assess data volume.

<figure><img src="/files/T79cjDJF8JNIM9kPAn8u" alt=""><figcaption></figcaption></figure>

This list helps you configure applications/windows based on detected patterns in process names, URLs, and titles.

***

## Add a New Application Rule

Follow these steps to create a new application rule:

1. Review the **Training Data List** to identify patterns.
2. Enter the **application or window name**.
3. Define the rule by entering values in the **Process Name, URL, and Title** fields.
4. Click **Preview matched training data for the rule**. A popup displays all matching rows from the training data.
5. Verify that the information is correct. If no data appears, adjust the rule.
6. Click **Create Rule**. The matched rows are removed from the list, leaving only unmatched data for further processing.
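The preview-then-create flow above can be pictured with a minimal Python sketch. The matching behaviour shown here (case-insensitive substring match, with empty fields matching anything) is an assumption made for illustration only; the page does not specify how the studio compares rule values, and the `matches` and `preview` helpers and the sample rows are hypothetical.

```python
# Hypothetical training data rows (not taken from a real system).
training_data = [
    {"process_name": "chrome.exe", "url": "sap.com/invoice/123/pay", "title": "Invoice 123"},
    {"process_name": "msedge.exe", "url": "sap.com/invoice/456/pay", "title": "Invoice 456"},
    {"process_name": "excel.exe", "url": "", "title": "Budget.xlsx"},
]


def matches(rule, row):
    """Assumed semantics: every non-empty rule field must occur in the row's field.

    The comparison is a case-insensitive substring test. A rule whose fields
    are all empty therefore matches every row.
    """
    return all(
        pattern.lower() in row[field].lower()
        for field, pattern in rule.items()
        if pattern
    )


def preview(rule, rows):
    """Split rows into those the rule matches and those left for later rules."""
    matched = [row for row in rows if matches(rule, row)]
    remaining = [row for row in rows if not matches(rule, row)]
    return matched, remaining


# A rule that only constrains the URL field.
rule = {"process_name": "", "url": "sap.com/invoice", "title": ""}
matched, remaining = preview(rule, training_data)
print(len(matched), len(remaining))  # 2 1
```

If `matched` is empty, the rule is too narrow, which corresponds to adjusting the rule in step 5.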

***

## Filter Training Data

Training data can be grouped and customized using advanced filtering options to streamline processing.

### **Select Columns**

You can choose which columns (Process Name, URL, and Title) to display, and the data will be grouped based only on the selected columns.

For example, if the same URL is accessed using different browsers, enabling **Process Name** and **URL** columns will display the following data:

| Process Name | URL                               | Total Visits |
| ------------ | --------------------------------- | ------------ |
| msedge.exe   | [http://sap.com](http://sap.com/) | 3            |
| chrome.exe   | [http://sap.com](http://sap.com/) | 2            |

If only the **URL** column is selected, the data will be combined into a single row:

| URL                               | Total Visits |
| --------------------------------- | ------------ |
| [http://sap.com](http://sap.com/) | 5            |
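The effect of column selection on grouping can be sketched in Python as a count over the selected fields only. The record layout, the `title` value, and the `group_visits` helper are hypothetical and only mirror the example tables above; this is not the studio's implementation.

```python
from collections import Counter

# Hypothetical raw visits, one record per captured window.
visits = (
    [{"process_name": "msedge.exe", "url": "http://sap.com", "title": "SAP"}] * 3
    + [{"process_name": "chrome.exe", "url": "http://sap.com", "title": "SAP"}] * 2
)


def group_visits(rows, columns):
    """Count visits per unique combination of the selected columns."""
    return dict(Counter(tuple(row[col] for col in columns) for row in rows))


by_process_and_url = group_visits(visits, ["process_name", "url"])
by_url = group_visits(visits, ["url"])

print(by_process_and_url)
# {('msedge.exe', 'http://sap.com'): 3, ('chrome.exe', 'http://sap.com'): 2}
print(by_url)
# {('http://sap.com',): 5}
```

Dropping a column from the selection merges every row that differs only in that column, which is why the two browser rows collapse into one.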

### **Best Practices for Filtering Training Data**

Follow these best practices when selecting columns for filtering training data.

{% tabs %}
{% tab title="Ignore IDs in Titles/URLs" %}
If your URLs or titles contain **variable IDs**, you can remove them to improve grouping. For example, if you have a URL grouping as follows:

| URL                     | Total Visits |
| ----------------------- | ------------ |
| sap.com/invoice/123/pay | 5            |
| sap.com/invoice/456/pay | 3            |
| sap.com/invoice/789/pay | 2            |

Set the **Remove IDs from URLs after keyword** field to the value `invoice/` to ignore the variable ID and get the following results:

| URL                     | Total Visits |
| ----------------------- | ------------ |
| sap.com/invoice/XXX/pay | 10           |

{% hint style="info" %}
The same logic can also be used when applying a filter on the **Title** column.
{% endhint %}
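As a rough illustration of this option, the sketch below masks the path segment that directly follows the keyword and then re-aggregates the visit counts. It assumes the ID is the run of characters up to the next `/`, `?`, `#`, or whitespace; the `mask_id_after` helper and the `XXX` placeholder handling are hypothetical and may differ from what the studio does internally.

```python
import re
from collections import Counter


def mask_id_after(value, keyword, placeholder="XXX"):
    """Replace the segment immediately after `keyword` with a placeholder."""
    pattern = re.escape(keyword) + r"[^/?#\s]+"
    # A function replacement avoids backslash handling in the placeholder.
    return re.sub(pattern, lambda _match: keyword + placeholder, value)


# Visit counts from the first table above.
url_visits = {
    "sap.com/invoice/123/pay": 5,
    "sap.com/invoice/456/pay": 3,
    "sap.com/invoice/789/pay": 2,
}

grouped = Counter()
for url, total in url_visits.items():
    grouped[mask_id_after(url, "invoice/")] += total

print(dict(grouped))  # {'sap.com/invoice/XXX/pay': 10}
```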
{% endtab %}

{% tab title="Shorten Titles/URLs" %}
If URLs or titles contain unnecessary suffixes (such as query parameters), you can remove them to group the data more clearly.

For example, the following data contains suffixes that are not required for grouping.

| URL                              | Total Visits |
| -------------------------------- | ------------ |
| sap.com/invoice?filter=a         | 5            |
| sap.com/invoice?filter=b         | 3            |
| sap.com/invoice?filter=c\&mode=d | 2            |

Set the **Cut the URLs starting from a keyword** field to **`?`** to remove the unnecessary suffixes and get the following results:

| URL             | Total Visits |
| --------------- | ------------ |
| sap.com/invoice | 10           |

{% hint style="info" %}
The same logic can also be used when applying a filter on the **Title** column.
{% endhint %}
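A minimal Python sketch of this truncation, assuming the value is cut at the first occurrence of the keyword and left unchanged when the keyword is absent. The `cut_from` helper is hypothetical, and the sample URLs assume all three rows share the `sap.com/invoice` path, as the combined result implies.

```python
from collections import Counter


def cut_from(value, keyword):
    """Drop everything from the first occurrence of `keyword` onward."""
    head, _sep, _tail = value.partition(keyword)
    return head


# Visit counts per raw URL (illustrative values).
url_visits = {
    "sap.com/invoice?filter=a": 5,
    "sap.com/invoice?filter=b": 3,
    "sap.com/invoice?filter=c&mode=d": 2,
}

grouped = Counter()
for url, total in url_visits.items():
    grouped[cut_from(url, "?")] += total

print(dict(grouped))  # {'sap.com/invoice': 10}
```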
{% endtab %}
{% endtabs %}


