Create Simulated Eventlog

From QPR ProcessAnalyzer Wiki

Revision as of 14:44, 15 November 2024

This article explains how to install, configure, and use eventlog simulations. The simulation creates a new model that contains both the source model data and the new simulated data. The case attribute Simulated indicates whether a case comes from the source data (false) or was generated by the simulation as a new simulated case (true).

Prerequisites for simulation

As a prerequisite, prediction must be installed in the Snowflake account in use, as described in Install prediction in Snowflake.

Create simulation script in QPR ProcessAnalyzer

1. Create the following example expression script (e.g., with name "Create simulation model - delete events"):

let sourceModel = ProjectByName("<project name>").ModelByName("<model name>");
let eventTypeColumnName = sourceModel.EventsDataTable.ColumnMappings["EventType"];
let flowFromEventType = "<from event type of a flow to modify>", flowToEventType = "<to event type of a flow to modify>";

let targetProject = Project;
_system.ML.ApplyTransformations(#{
  "Name": "My simulation model - delete",   // Name of the PA model to generate to the target project.
  "SourceModel": sourceModel,               // Snowflake-based PA model used for training the prediction model.
  "TargetProject": targetProject,           // Target project to create the model into.
  "Transformations": [#{                    // Transformation configurations.
    "type": "modify_flow_durations",
    "column": eventTypeColumnName,
    "flows": [#{
      "from": flowFromEventType,
      "to": flowToEventType,
      "probability": 1.0,
      "operation": #{
        "type": "set_value",
        "value": 0.0
      },
      "delete": true
    }]
  }]
});

2. Configure simulation for the previously created script as instructed in the next chapter. At minimum, replace the tags listed below with some suitable values:

  • <project name>: Name of the project in which the source model is located.
  • <model name>: Name of the model to be used as the source model. The data in this source model is used as the source data that the simulation transformations modify.
  • <from event type of a flow to modify>: From-event type name of the flows from which the from-event is to be deleted.
  • <to event type of a flow to modify>: To-event type name of the flows from which the from-event is to be deleted.

Configure simulation

Simulation script has the following settings in the ApplyTransformations call:

  • Name: Name of the QPR ProcessAnalyzer model that is created to the target project. The model will contain the source model content and the predictions.
  • SourceModel: Source model for which the simulation is made. Model can be selected for example based on id with ModelById function or by name with ModelByName function.
  • TargetProject: Target project to create the new model into.
  • Transformations: Array of transformation configuration objects. Each object supports the following parameters:
    • type: Defines the type of the transformation to perform. See below for more details on the supported transformations. Supported values are:
      • enforce_resource_limits: Used to modify given event log using given maximum resource limits.
      • extract_max_resource_usages: Used to extract, for every value of a specified column, the maximum number of concurrent cases in given event log that have that value.
      • generate: Used to generate a new event log using a trained ML model.
      • modify_flow_durations: Used to modify durations of flows and possibly remove events having specific flows.
      • modify_values: Used to modify values of a dictionary given as input (e.g., dictionary generated by extract_max_resource_usages).
      • resources_to_roles: Performs "organization mining" by trying to group together column values (e.g., resources) that are used in similar fashion in given event log (e.g., resources that are often present in similar set of activities).
    • input: Can be used to specify that given transformation input parameters get their values from the previous transformation result.
      • Value can be either direct mapping by just the name of the transformation result property, or it can be a value mapping configuration object that supports the following parameters:
        • input: Name of the parameter to get from the previous transformation result as the root object of the actual value to extract.
        • value_path: An array of property names to traverse into the root object.
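
The two forms of the input setting described above can be illustrated with a small Python sketch. It is only an interpretation of the description, not the product's implementation; the function name resolve_input and the sample data are hypothetical.

```python
from typing import Any


def resolve_input(mapping: Any, previous_result: dict) -> Any:
    """Resolve one entry of a transformation's "input" configuration.

    Hypothetical helper: models the documented behaviour, not the real code.
    """
    if isinstance(mapping, str):
        # Direct mapping: the string names a property of the previous result.
        return previous_result[mapping]
    # Value mapping object: start from the named root object,
    # then follow value_path one property name at a time.
    value = previous_result[mapping["input"]]
    for name in mapping.get("value_path", []):
        value = value[name]
    return value


# Sample result of a previous transformation (made-up values).
previous_result = {
    "role_column": "Role",
    "resource_to_role_map": {"Tina": "Role 1", "Bob": "Role 2"},
}

column = resolve_input("role_column", previous_result)
tinas_role = resolve_input(
    {"input": "resource_to_role_map", "value_path": ["Tina"]},
    previous_result,
)
```

Here column resolves through the direct form and tinas_role through the object form with a one-step value_path.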

Transformation: enforce_resource_limits

Using given input data, this transformation generates a new event log which does not exceed the concurrency limitations of specified column values.

Event rows are traversed in time order. If outputting an event would exceed a limit, a copy of that event (with the same event properties) is output instead to represent the time the actual event spends in the queue.

The actual event is generated only after another event leaves the column value that has the queue. Queued events are released in the order they arrived, so the event that has waited longest is generated first (FIFO principle).
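
The queueing behaviour can be approximated with the following Python sketch. It is a simplified model under stated assumptions, not the transformation's actual implementation: each event is reduced to a hypothetical (arrival time, duration, column value) tuple, and values without a limit (or with a None limit) are treated as unlimited.

```python
import heapq


def enforce_limits(events, limits):
    """Simulate FIFO queueing under per-value concurrency limits.

    events: list of (arrival_time, duration, column_value) tuples.
    limits: dict mapping a column value to its maximum concurrency (>= 1).
            Values missing from the dict, or mapped to None, are treated
            as unlimited (an assumption of this sketch).
    """
    running = {}  # column value -> min-heap of end times of running events
    output = []
    for arrival, duration, value in sorted(events):
        ends = running.setdefault(value, [])
        # Release every slot that has already been freed by this time.
        while ends and ends[0] <= arrival:
            heapq.heappop(ends)
        limit = limits.get(value)
        start = arrival
        if limit is not None and len(ends) >= limit:
            # Limit reached: wait for the earliest slot to free up and
            # emit a queue event covering the waiting time.
            start = heapq.heappop(ends)
            output.append(
                {"value": value, "queue": True, "start": arrival, "end": start}
            )
        heapq.heappush(ends, start + duration)
        output.append(
            {"value": value, "queue": False, "start": start, "end": start + duration}
        )
    return output


simulated = enforce_limits(
    [(0, 5, "Role 1"), (1, 5, "Role 1"), (2, 1, "Role 1")],
    {"Role 1": 1},
)
queue_events = [(e["start"], e["end"]) for e in simulated if e["queue"]]
actual_events = [(e["start"], e["end"]) for e in simulated if not e["queue"]]
```

Because events are handled in arrival order and each one takes the earliest slot that frees up, earlier arrivals never start later than later arrivals, which is the FIFO property described above.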

Supported parameters

  • column: Name of the column whose values have their concurrent usage limited by the specified limits.
  • limits: An object containing key-value pairs, where each key is a column value and each value is an integer specifying the maximum number of concurrent cases in the given event log that can contain that column value.
  • queue_event_activity_name: If set, specifies the name template used for queue-events. In this template, when a queue event is created, %s is replaced with the name of the activity this queue event is queuing to.
    • If not set, the activity name is not altered at all for the queue event.
  • queue_event_column:
    • If queue_event_activity_name is set:
      • If the event represents a queue-event, the value in this column specifies the name of the queue-activity.
      • Otherwise, the value is null.
    • If queue_event_activity_name is not set:
      • If the event represents a queue-event, the value in this column is True.
      • Otherwise, the value is False.

Inputs

Event log to operate on.

Outputs

Event log with resource limits enforced.

Example

#{
  "type": "enforce_resource_limits",
  "queue_event_column": "Queue",
  "queue_event_activity_name": "%s - Queue",
  "limits": #{
    "Role 3": null
  },
  "input": #{
    "limits": "role_limits",
    "column": "role_column"
  }
}

Transformation: extract_max_resource_usages

Extracts, for every value of a specified column, the maximum number of concurrent cases in the given event log that have that value.

Supported parameters

  • resource_column:
    • The name of the column representing the resources whose maximum concurrent case usages are to be calculated.

Inputs

Event log to operate on.

Outputs

  • max_resource_usages:
    • A dictionary object containing resource names as keys (unique resource_column values) and their maximum usage in the event log.
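
As an illustration of what "maximum number of concurrent cases" can mean, the following Python sketch computes it with a sweep over start and end times. It assumes each case occupies a resource for one known interval, given as a hypothetical (resource, start, end) tuple; how the actual transformation derives these intervals from the event log is not specified here.

```python
from collections import defaultdict


def max_resource_usages(intervals):
    """Return {resource: peak number of simultaneously open intervals}.

    intervals: iterable of (resource, start, end) tuples, assumed to be
    one interval per case that uses the resource.
    """
    changes_by_resource = defaultdict(list)
    for resource, start, end in intervals:
        changes_by_resource[resource].append((start, 1))   # case starts
        changes_by_resource[resource].append((end, -1))    # case ends
    result = {}
    for resource, changes in changes_by_resource.items():
        current = peak = 0
        # Sorting tuples puts -1 before +1 at equal times, so a case that
        # ends exactly when another starts is not counted as overlapping.
        for _, delta in sorted(changes):
            current += delta
            peak = max(peak, current)
        result[resource] = peak
    return result


usages = max_resource_usages([
    ("Tina", 0, 10),
    ("Tina", 5, 15),
    ("Tina", 10, 20),
    ("Bob", 0, 3),
])
```

In this made-up data, Tina never has more than two overlapping cases, and Bob has one.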

Example

#{
  "type": "extract_max_resource_usages",
  "resource_column": "SAP_User"
}

Transformation: generate

Generate a new event log using the configured model prediction generation parameters (GenerationConfiguration).

Supports all the same parameters as those supported by model prediction generation configuration.

Inputs

Does not support inputs.

Outputs

Generated event log.

Example

#{
  "type": "generate",
  "model_name": "ML model",
  "cases_to_generate": 100,
  "max_num_events": 20
}

Transformation: modify_flow_durations

Modify durations of flows and possibly remove events having specific flows.

Supported parameters

  • column: The name of the column based on which the flows are created. Usually this is the column containing activities, but could also be, e.g., organization units, users, …
  • flows: Flows to transform. Contains an array of flow transformation configuration objects. Each object defines transformations performed on one flow type defined by starting and ending column values. Supports the following properties:
    • delete: Same as delete_from.
    • delete_from: If defined, specifies whether the "from event" of the matched flow should be removed after applying the operation (if defined).
    • delete_to: If defined, specifies whether the "to event" of the matched flow should be removed after applying the operation (if defined).
    • from: Column value starting the flow.
      • If this and from_input are both undefined, any starting value is accepted.
    • from_input: If defined, specifies the name of the transformation-level parameter from which the actual column value starting the flow is read.
      • Overrides the value defined in from-parameter.
    • operation: Specifies the actual flow duration modification operation to perform, as a value modification configuration object where the value is the duration in seconds. Supports the following properties:
      • probability: If defined, specifies the probability of applying the operation to any matching instance of the flow.
        • Value should be a numeric value between 0 and 1.0.
        • This probability applies only to this operation.
        • The default value is 1.0.
      • type: Type of the operation. The following types are supported:
        • add: Sets the value to be the current value plus the number specified by the value.
        • multiply: Sets the value to be the current value multiplied by the number specified by the value.
        • set_value: Sets the value to be exactly the number specified by the value.
      • value: Value used by the operation.
    • probability: If defined, specifies the probability of applying the operation to any matching instance of the flow.
      • Value should be a numeric value between 0 and 1.0.
      • This probability applies not only to the operation specified by the operation parameter but also to any other transformations, such as event deletion.
      • The default value is 1.0.
    • to: Column value ending the flow.
      • If this and to_input are both undefined, any ending value is accepted.
    • to_input: If defined, specifies the name of the transformation-level parameter from which the actual column value ending the flow is read.
      • Overrides the value defined in to-parameter.
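
The operation object described above (type, value, probability) can be modelled with a short Python sketch. The function name apply_operation is hypothetical, and the way randomness is drawn is an assumption; only the three operation types and the default probability of 1.0 come from the description.

```python
import random


def apply_operation(value, operation, rng=random):
    """Apply one value modification operation to a duration in seconds."""
    probability = operation.get("probability", 1.0)
    # rng.random() returns a float in [0.0, 1.0), so probability 1.0 always
    # applies the operation and probability 0.0 never does.
    if rng.random() >= probability:
        return value
    kind = operation["type"]
    if kind == "add":
        return value + operation["value"]
    if kind == "multiply":
        return value * operation["value"]
    if kind == "set_value":
        return operation["value"]
    raise ValueError(f"Unsupported operation type: {kind}")


added = apply_operation(120.0, {"type": "add", "value": 30.0})
halved = apply_operation(120.0, {"type": "multiply", "value": 0.5})
zeroed = apply_operation(120.0, {"type": "set_value", "value": 0.0})
skipped = apply_operation(
    120.0, {"type": "set_value", "value": 0.0, "probability": 0.0}
)
```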

Inputs

Event log to operate on.

Outputs

Transformed event log.

Example

#{
  "type": "modify_flow_durations",
  "column": "Organization",
  "flows": [#{
    "from": "Delivery",
    "operation": #{
      "type": "set_value",
      "value": 0.0
    },
    "delete": true
  }]
}

Transformation: modify_values

Modify values of an object given as input (e.g., object generated by extract_max_resource_usages).

Due to the required inputs, this transformation can't be the first transformation to perform.

Supported parameters

  • values: Array of value configuration objects. Each object supports the following properties:
    • input: Name of the result to modify, where the result is the output of the previous transformation.
    • input_key_from: If defined, specifies the name of the property of an input object whose value contains the name of the property whose value is to be modified.
    • input_key_value_path: If input_key_from is defined and is represented as an object, this configuration should specify an array of property names to traverse into the object.
      • The value at the end of this path will be used as the name of the property to modify in the input.
    • operation: Specifies the actual value modification operation to perform, as a value modification configuration object. Supports the following properties:
      • probability: If defined, specifies the probability of applying the operation.
        • Value should be a numeric value between 0 and 1.0.
        • This probability applies only to this operation.
        • The default value is 1.0.
      • type: Type of the operation. The following types are supported:
        • add: Sets the value to be the current value plus the number specified by the value.
        • multiply: Sets the value to be the current value multiplied by the number specified by the value.
        • set_value: Sets the value to be exactly the number specified by the value.
      • value: Value used by the operation.
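
The indirection through input_key_from and input_key_value_path is easier to follow with a concrete sketch. The Python below is one reading of the parameter descriptions, not the actual implementation; the function name modify_value and the sample data are hypothetical, and the probability setting is left out for brevity.

```python
# The three documented operation types as plain functions.
OPERATIONS = {
    "add": lambda current, amount: current + amount,
    "multiply": lambda current, amount: current * amount,
    "set_value": lambda current, amount: amount,
}


def modify_value(previous_result, config):
    """Return a modified copy of the object named by config["input"]."""
    # Step 1: find the name of the property to modify.
    key = previous_result[config["input_key_from"]]
    for name in config.get("input_key_value_path", []):
        key = key[name]
    # Step 2: apply the operation to that property of the input object.
    target = dict(previous_result[config["input"]])
    operation = config["operation"]
    target[key] = OPERATIONS[operation["type"]](target[key], operation["value"])
    return target


previous_result = {
    "role_limits": {"Role 1": 4, "Role 2": 2},
    "resource_to_role_map": {"Tina": "Role 1", "Bob": "Role 2"},
}
new_limits = modify_value(previous_result, {
    "input": "role_limits",
    "input_key_from": "resource_to_role_map",
    "input_key_value_path": ["Tina"],
    "operation": {"type": "multiply", "value": 0.5},
})
```

With this sample data, the path ["Tina"] resolves to "Role 1", so the limit of the role Tina belongs to is halved while the other role is left unchanged.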

Inputs

Output of the previous transformation operation.

Outputs

The same output as that of the previously performed transformation, except with the specified value modifications applied.

Example

#{
  "type": "modify_values",
  "values": [#{
    "input": "role_limits",
    "input_key_from": "resource_to_role_map",
    "input_key_value_path": ["Tina"],
    "operation": #{
      "type": "multiply",
      "value": 0.5
    }
  }]
}

Transformation: resources_to_roles

Performs "organization mining" by grouping together column values (e.g., resources) that are used in a similar fashion in the given event log, such as resources that are often present in a similar set of activities.

Supported parameters

  • resource_column: The name of the column containing names of resources.
  • resource_limits: A dictionary object containing resource names with their maximum concurrent usages.
    • If set, when building role_limits output, these values will be summed for each resource into the resulting role-based usage limit.
    • If not set, each resource in a role will be counted as one, when calculating the role_limits.
  • role_column: The name of the column to be created, whose values indicate the role to which the resource belongs.
  • role_name_template: If set, specifies the name template used for role names. In this template, %d will be replaced by a numeric value starting from 1.
    • The default value is "Role %d".
  • similarity_threshold: The minimum value of Pearson correlation coefficient calculated between two resources in order for them to be considered as having the same role.
    • The default value is 0.7.
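
The following Python sketch shows one way the parameters above could interact. It is an illustration under several assumptions, not the actual algorithm: each resource is profiled by how many times it appears in each activity, a resource joins the first role whose first member correlates with it at or above similarity_threshold, and a resource missing from resource_limits counts as one. Only the Pearson threshold, the role name template, and the summing rule for role_limits come from the description.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def resources_to_roles(events, similarity_threshold=0.7,
                       role_name_template="Role %d", resource_limits=None):
    """events: list of (resource, activity) pairs (hypothetical input shape)."""
    counts = Counter(events)
    activities = sorted({activity for _, activity in events})
    resources = sorted({resource for resource, _ in events})
    profiles = {r: [counts[(r, a)] for a in activities] for r in resources}

    roles = []  # each role is a list of member resources
    for resource in resources:
        for members in roles:
            if pearson(profiles[resource], profiles[members[0]]) >= similarity_threshold:
                members.append(resource)
                break
        else:
            roles.append([resource])

    resource_to_role_map, role_limits = {}, {}
    limits = resource_limits or {}
    for number, members in enumerate(roles, start=1):
        role = role_name_template % number
        for resource in members:
            resource_to_role_map[resource] = role
        # Sum the given limits; count a resource as one when no limit is given.
        role_limits[role] = sum(limits.get(r, 1) for r in members)
    return resource_to_role_map, role_limits


events = (
    [("Tina", "A")] * 3 + [("Tina", "B")]
    + [("Bob", "A")] * 2 + [("Bob", "B")]
    + [("Carl", "B")] + [("Carl", "C")] * 3
)
role_map, counted_limits = resources_to_roles(events)
_, summed_limits = resources_to_roles(
    events, resource_limits={"Tina": 3, "Bob": 2, "Carl": 1}
)
```

In this made-up log, Tina and Bob work mostly on activity A and end up in the same role, while Carl, who works mostly on activity C, gets a role of his own.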

Inputs

Event log to operate on.

Outputs

  • Transformed event log.
  • Result dictionary object containing the following properties:
    • resource_column: The name of the column containing names of resources.
    • resource_to_role_map: An object containing resource names as property names and role names as value.
    • role_column: The name of the generated column whose values indicate the role to which the resource belongs.
    • role_limits: An object containing role names as property names and maximum usage for that role as value.

Example

#{
  "type": "resources_to_roles",
  "resource_column": "SAP_User",
  "role_column": "Role",
  "role_name_template": "Role %d",
  "input": #{
    "resource_limits": "max_resource_usages"
  }
}