Case Level Permissions: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
Line 3: Line 3:
== Calculated Attributes ==
== Calculated Attributes ==
Case and event attributes data can be originated either
Case and event attributes data can be originated either
* by '''importing''' them to QPR ProcessAnalyzer, or
* by '''importing''' them into QPR ProcessAnalyzer, or
* by '''calculating''' attribute values during model loading using an [[QPR ProcessAnalyzer Expressions|expression]].
* by '''calculating''' attribute values during model loading using an [[QPR ProcessAnalyzer Expressions|expression]].


Notes about calculated attributes:
Notes about calculated attributes:
* Calculated attributes are treated as the imported attributes, e.g. they can be used in the profiling or influence analyses.
* Calculated attributes are treated as the imported attributes, e.g. they can be used in the profiling or influence analyses.
* An attribute is a calculated attribute, when an expression is defined for it in the model settings.
* Calculated attributes are calculated when the model is loaded. Calculated attribute values don't change after the model loading during the time the model is in the memory. If attributes need to be calculated again, the model needs to be reloaded.
* Calculated attributes are calculated when the model is loaded. Calculated attribute values don't change after the model loading during the time the model is in the memory. If attributes need to be calculated again, the model needs to be reloaded.
* Filtering doesn't affect the values of calculated attributes, but filtering affects the aggregated results from calculated attributes, e.g. in the profiling analysis.
* Filtering doesn't affect the values of calculated attributes, but filtering affects the aggregated results from calculated attributes, e.g. in the profiling analysis.
* Calculated attribute values are stored into memory as the imported attributes, meaning the model requires more memory, if there are more new calculated attributes.
* Calculated attribute values are stored into the memory as the imported attributes, meaning the model requires more memory, if there are more new calculated attributes.
* For calculated case attributes, the expression is evaluated in each case's context, and for calculated event attributes, for each event's context. It's thus possible to use the entice model in the expressions, such as variations, event types, flow and flow occurrences.
* For calculated case attributes, the expression is evaluated in each case's context, and for calculated event attributes, for each event's context. It's thus possible to use the entice model in the expressions, such as variations, event types, flow and flow occurrences.
* Calculated event attributes are processed first, so it's possible to use calculated event attributes in calculated case attributes expressions.
* Calculated event attributes are processed first, so it's possible to use calculated event attributes in calculated case attributes expressions.
Line 33: Line 32:
Array of calculated case attributes with the following properties:
Array of calculated case attributes with the following properties:
* '''name''': Calculated attribute name.
* '''name''': Calculated attribute name.
*'''expression''': Expression to calculate the calculated attribute.
*'''expression''': Expression to calculate the calculated attribute. When an expression has been defined, the attribute is a calculated attribute.
|-
|-
||
||

Revision as of 21:40, 11 March 2019

QPR ProcessAnalyzer models have a Configuration field containing model related settings in a JSON format. When the model JSON configuration is changed, the model is dropped from the memory. The JSON settings are listed in this page.

Calculated Attributes

Case and event attributes data can be originated either

  • by importing them into QPR ProcessAnalyzer, or
  • by calculating attribute values during model loading using an expression.

Notes about calculated attributes:

  • Calculated attributes are treated as the imported attributes, e.g. they can be used in the profiling or influence analyses.
  • Calculated attributes are calculated when the model is loaded. Calculated attribute values don't change after the model loading during the time the model is in the memory. If attributes need to be calculated again, the model needs to be reloaded.
  • Filtering doesn't affect the values of calculated attributes, but filtering affects the aggregated results from calculated attributes, e.g. in the profiling analysis.
  • Calculated attribute values are stored into the memory as the imported attributes, meaning the model requires more memory, if there are more new calculated attributes.
  • For calculated case attributes, the expression is evaluated in each case's context, and for calculated event attributes, for each event's context. It's thus possible to use the entice model in the expressions, such as variations, event types, flow and flow occurrences.
  • Calculated event attributes are processed first, so it's possible to use calculated event attributes in calculated case attributes expressions.
  • Calculated case and event attributes are processed in the order they are defined, so it's possible to use an earlier defined calculated attribute in later ones.
  • Calculated attribute can replace an imported attribute. Note that it's not possible to override event timestamp, event type name or case id.
  • If there is an error in the calculated attribute expression, the model loading fails, so the error needs to be corrected before the model can be used.

Calculated attributes have following use cases:

  • Conversions: If an imported attribute data is not in a suitable format, the data can be overridden by defining a calculated attribute with the same name. The calculated attribute expression performs conversions for the data.
  • Aggregations: It's possible to aggregate e.g. event level information to case level using a calculated case attribute.
  • Performance improvement: If there is an expression to aggregate event level information to a case level, the aggregation can be done using a calculated case attribute to provide better performance. Then the expression can process case level only without going to the event level, which performs considerably faster, because usually there are much less cases than events.

Calculated case and event attributes are configured to the model JSON settings are follows:

Property Description

CaseAttributes

Array of calculated case attributes with the following properties:

  • name: Calculated attribute name.
  • expression: Expression to calculate the calculated attribute. When an expression has been defined, the attribute is a calculated attribute.

EventAttributes

Calculated event attributes are defined using the same structure as calculated case attributes.

Example 1: One calculated case attribute.

"CaseAttributes": [
  {
    "name": "Cost",
    "expression": "Cost * 1.5"
  },
}

Example 2: Two calculated event attributes:

"EventAttributes": [
  {
    "name": "Event Month",
    "expression": "Timestamp.Truncate(\"month\")"
  },
  {
    "name": "Duration to Next Event",
    "expression": "If(IsNull(_.NextInCase), null, (_.NextInCase.Timestamp-_.Timestamp))"
  }
]

Example 3: Several calculated case and event attributes:

{
"CaseAttributes": [
  {
    "name": "myBooleanAttribute1",
    "expression": "CURRENCYCODE == \"RUB\""
  },
  {
    "name": "Combined status",
    "expression": "PurchaseStatus+ \" \" + PurchaseType"
  },
  {
    "name": "Duration between \"Invoice Sent\" and \"Payment Received\"",
    "expression": "if(Count(_.EventsByType(\"Invoice Sent\")) == 0 || Count(_.EventsByType(\"Purchase Order Line Created\")) == 0, null, (GetAtReverse(0,_.EventsByType([\"Invoice Sent\", \"Payment Received\"])).Timestamp-GetAt(0,_.EventsByType([\"Invoice Sent\", \"Payment Received\"])).Timestamp))"
  },
  {
    "name": "Repeated Events Count",
    "expression": "Count(_.Events)-Count(Distinct(_.Events.Type))"
  }
],
"EventAttributes": [
  {
    "name": "Event type occurrence number",
    "expression": "Let(\"name\", _.Typename);Count(_.Recurse(_.PreviousInCase).Where(_.Typename==name))"
  }
]
}

Memory Usage Settings

The memory usage settings are used to manage, how long objects are kept in the memory, which affect both the memory usage and performance. The longer the object storing durations, the more memory is consumed, but on the other hand users perceive better performance as analyses are more likely to be found in the memory already calculated. Memory usage settings can be defined in the web.config file and for each model separately in the model settings.

Property Description

CacheUsage/ DropUnusedModelsAfter

Duration after which the unused model, is dropped from the memory. Defined in format HH:mm:ss or d.HH:mm:ss, for example 01:00:00 (one hour), 00:30:00 (30 minutes) or 1.00:00:00 (24 hours). When an analysis is requested for a model, the model's last used time is updated (also filter's last used time is updated).

If this setting is not defined, the server level default setting is used.

CacheUsage/ DropUnusedFiltersAfter

Duration after which the model's unused filters, are dropped from the memory. Defined in format HH:mm:ss or d.HH:mm:ss, for example 01:00:00 (one hour), 00:30:00 (30 minutes) or 1.00:00:00 (24 hours). When an analysis is requested for a filter , the filter's last used time is updated (also model's last used time is updated).

If this setting is not defined, the server level default setting is used.

Configuration Examples

The following configuration keeps the model in memory for 1 hour and filters for 30 minutes.

{
  "CacheUsage": {
    "DropUnusedModelsAfter": "1:00:00",
    "DropUnusedFiltersAfter": "00:30:00"
  }
}

The following configuration keeps the model in memory for 15 minutes and filters for 5 minutes.

{
  "CacheUsage": {
    "DropUnusedModelsAfter": "0:15:00",
    "DropUnusedFiltersAfter": "00:05:00"
  }
}

The following configuration keeps the model in memory for 100 days (practically it's never dropped) and filters for 2 hours.

{
  "CacheUsage": {
    "DropUnusedModelsAfter": "100.00:00:00",
    "DropUnusedFiltersAfter": "02:00:00"
  }
}

Memory Objects Summary

Data handled by QPR ProcessAnalyzer memory is stored in following types of objects:

Stored Objects Contents Unused objects dropped after Time to reload/calculate Memory consumption
Models Models objects contain all data in a QPR ProcessAnalyzer model, such as
  • cases
  • events
  • variations
  • event types
  • flows
  • flow occurrences
Primarily defined by a model setting CacheUsage/DropUnusedModelsAfter. If the model setting is not defined, the server level setting in the web.config file is used. If the server setting is not defined either, a default value of 1 hour is used. When the is a memory shortage in the server, models are not dropped from the memory. Slow

Models are loaded from the database requiring to transfer considerable amount of data, which takes much more time than e.g. calculating filters and analyses.

High
Filters Filters contain all filtered data in QPR ProcessAnalyzer model, which includes similar data and structures as the models. Primarily defined by a model setting. If the model setting is not defined, the server setting is used. If the server setting is not defined, a default value of 30 minutes is used. When the is a memory shortage in the server, filters may be dropped from the memory to free memory. Fast

Filters are calculated from the model data that already exists in the memory. Practically, filters are subsets of models.

Medium
Analysis results Results of the analysis (in tabular form). Unused analysis results are kept maximum of 15 minutes in the memory. When the is a memory shortage in the server, analysis results may be dropped from the memory to free memory. Fast

Analysis results are calculated from filters, which already exist in memory.

Low

If the QPR ProcessAnalyzer server doesn't have enough memory to store more objects, already stored objects are dropped from the memory starting from the one having the longest time from the last usage. Thus, when there is a short of memory, objects may be dropped earlier than their settings define. Only filters and analysis results are dropped during the memory shortage, i.e. model objects are not dropped prematurely. This is because recalculating filters and analysis results is usually faster than loading models from the database. That's why, when trying to load more models than there is available memory in the server, an out of memory error situation may occur.

Model Automatic Loading on Server Startup

QPR ProcessAnalyzer models can be loaded automatically, when QPR ProcessAnalyzer Server starts. When the model loading takes long time, it's useful to load it automatically beforehand. In the model JSON settings, when the LoadOnStartup property is set to true, the model is loaded automatically during QPR ProcessAnalyzer Server startup. The JSON configuration is as follows:

{
  "LoadOnStartup": true,
  "CacheUsage": {
    "DropUnusedModelsAfter": "100.00:00:00",
    "DropUnusedFiltersAfter": "00:30:00"
  }
}

Note: For models that are set to load automatically on server startup, you also need to set the CacheUsage/DropUnusedModelsAfter setting is large value (as shown by the previous example), so that the model is not dropped from memory even if it's not used.

Note that LoadOnStartup setting to work, IIS configurations related to QPR ProcessAnalyzer Server installation needs to be in place.

Loading Process Mining Data from ODBC Datasource

The DataSource section is used to define where the QPR ProcessAnalyzer model data is loaded. See below examples, how to construct the full JSON.

Property Description
DataSource/Cases/DataSourceType Datasource type to use, when fetching the Cases data. Currently the only supported value is odbc.
DataSource/Cases/OdbcConnectionString ODBC connection string to use to fetch the Cases data. Connection strings can be found in https://www.connectionstrings.com.
DataSource/Cases/OdbcQuery ODBC query to use to fetch the Cases data. Query syntax depends on the source system where the data is fetched.
DataSource/Cases/Columns Column name mappings for QPR ProcessAnalyzer data model. Only supported mapping is CaseId defining the case id (case name). Example:
{ 
  "CaseId": "SalesOrderHeaderId"
}

All unmapped columns are taken to the model as case attributes. If the fetched data doesn't contain a mapped column, the model loading fails and an error message is given.

DataSource/Events/DataSourceType Datasource type to use, when fetching the Events data. Currently the only supported value is odbc.
DataSource/Events/OdbcConnectionString ODBC connection string to use to fetch the Events data. Connection strings can be found in https://www.connectionstrings.com.
DataSource/Events/OdbcQuery ODBC query to use to fetch the Events data. Query syntax depends on the source system where the data is fetched.
DataSource/Events/Columns

Column name mappings for QPR ProcessAnalyzer data model. Supported mappings for Events are CaseId, EventType and Timestamp. Example:

{ 
  "CaseId": "SalesOrderHeaderId",
  "EventType": "EventType",
  "Timestamp": "CreatedDate"
}

All unmapped columns are taken to the model as event attributes. If the fetched data doesn't contain a mapped column, the model loading fails and an error message is given.

Configurations examples

ODBC Driver Installation

Microsoft Access Database Engine 2016 Redistributable is software package that contains ODBC drivers for

  • Microsoft SQL Server
  • Microsoft Access (*.mdb and *.accdb) files
  • Microsoft Excel (*.xls, *.xlsx, and *.xlsb) files
  • CSV text files

The package needs to be installed in the same computer where the QPR ProcessAnalyzer Server is running. Installation instructions:

  1. Go to https://www.microsoft.com/en-us/download/details.aspx?id=54920 and click Download.
  2. Select whether to use the 32bit or 64bit (x64) version (usually it's the 64bit version).
  3. Double-click the executable file on your hard disk to start the setup program.
  4. Follow the instructions on the screen to complete the installation.

If 32-bit office is installed in machine then 64-bit ODBC driver need install using /passive parameter. More information

Read data from CSV file

In this example, data is loaded from CSV files located in the file system using Microsoft Access Text Driver (*.txt, *.csv) driver. Loaded files are C:\ProcessMiningData\ModelCaseAttributes.csv and C:\ProcessMiningData\ModelEventData.csv.

{
  "DataSource": {
    "Cases": {    
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\\ProcessMiningData\\;Extensions=asc,csv,tab,txt",
      "OdbcQuery": "SELECT * FROM [ModelCaseAttributes.csv]",
      "Columns": { 
        "CaseId": "Name"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\\ProcessMiningData\\;Extensions=asc,csv,tab,txt",
      "OdbcQuery": "SELECT * FROM [ModelEventData.csv]",
      "Columns": { 
        "CaseId": "Case",
        "EventType": "Event Type",
        "Timestamp": "Start Time"
      }
    }
  }
}

When reading from CSV files, you may need to set the CSV file format for the ODBC driver using the Schema.ini file (more information: https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver?view=sql-server-2017).

Read data from Excel file

In this example, data is loaded from a Excel file that is accessible in the file system. In this example, the loaded file is C:\ProcessMiningData\ModelData.xlsx and cases are in sheet MyCases and events in sheet MyEvents.

{
  "DataSource": {
    "Cases": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=C:\\ProcessMiningData\\ModelData.xlsx",
      "OdbcQuery": "SELECT * FROM [MyCases$]",
      "Columns": {
        "CaseId": "Case ID"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=C:\\ProcessMiningData\\ModelData.xlsx",
      "OdbcQuery": "SELECT * FROM [MyEvents$]",
      "Columns": {
        "CaseId": "Case ID",
        "EventType": "Activity",
        "Timestamp": "Start Time"
      }
    }
  }
}

Read data from SQL Server table

Recommended driver: https://www.microsoft.com/en-us/download/details.aspx?id=56567

In this example, data is loaded from an SQL Server table. SQL Server hostname is myServerName, instance name is myInstanceName and database name is MyDatabase. The Trusted_Connection is for Windows authentication.

{
  "DataSource": {
    "Cases": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={ODBC Driver 17 for SQL Server};Server=myServerName\myInstanceName;DataBase=MyDatabase;Trusted_Connection=Yes;",
      "OdbcQuery": "SELECT CASENAME, COST, COUNTRY, PRODUCT FROM CASES_TABLE",
      "Columns": { 
        "CaseId": "CASENAME"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={ODBC Driver 17 for SQL Server};Server=myServerName\myInstanceName;DataBase=MyDatabase;Trusted_Connection=Yes;",
      "OdbcQuery": "SELECT CASENAME, CREATED_DATE, CREATED_BY FROM EVENTS_TABLE",
      "Columns": { 
        "CaseId": "CASENAME",
        "Timestamp": "CREATED_DATE",
        "EventType": "CREATED_BY"
      }
    }
  }
}

If SQL Server authentication is used, the connection string is

Driver={ODBC Driver 17 for SQL Server};Server=myServerName\myInstanceName;DataBase=MyDatabase;Trusted_Connection=No;Uid=myUsername;PWD=myPassword;

More about SQL Server connection strings: https://www.connectionstrings.com/sql-server/

Troubleshooting

The following error message may be encountered: System.Data.Odbc.OdbcException (0x80131937): ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified. Probable reason is that the ODBC driver is missing or driver name is not correct. To solve the issue:

  • Check ODBC drivers is installed by running C:\Windows\System32\odbcad32.exe.
  • Check the connection string.

The following error message may be encountered: ERROR [HY024] [Microsoft][ODBC Text Driver] '(unknown)' is not a valid path. Make sure that the path name is spelled correctly and that you are connected to the server on which the file resides.. Example error when specified directory is missing:

Error when CSV file is missing: System.Data.Odbc.OdbcException (0x80131937): ERROR [42S02] [Microsoft][ODBC Text Driver] The Microsoft Access database engine could not find the object 'CaseAttributes.csv'. Make sure the object exists and that you spell its name and the path name correctly. If 'CaseAttributes.csv' is not a local object, check your network connection or contact the server administrator.

Error codes for troubleshooting: ODBC Error Codes.

If the error message mention about Case or Event attributes named as F<column number> (e.g. F10), the data might be inconsistent, as there might be different amount of columns in different rows.

Case Permissions

The Permissions section specifies data security restrictions for objects within the QPR ProcessAnalyzer model (i.e. limit visibility). If the Permissions section hasn't been defined, all the model data is visible to all users having GenericRead permission for the project in which the model resides (more information about roles and permissions). Permissions defined in this section, are only available when using the In-Memory core.

Property Description
Permissions/Initialization Expression language expression used to make an initial calculation for all the other expressions within this same permissions context. This expression can be used to improve performance when part of the Case or EventLogKey expressions are common and thus they don't need to be calculated again for every Case separately. See the examples below of using the Initialization expression.
Permissions/Case Expression language expression determining which users can see each Cases. The expression is evaluated within the context of each Case. If the evaluation results true, the Case is visible for the user. Otherwise the Case, its Events and case and event attributes are not visible. This setting implements case level security restrictions.
Permissions/EventLogKey Expression language expression used to uniquely identify all the unique event logs created by case permission filters. If a cached EventLog with the same key is already in the system, that EventLog is used instead of creating a new. The new EventLog is created by applying the Case expression to filter the Cases users have rights to.

Example usecase for case permissions

Case level permissions (security control) can be implemented with the principle illustrated in the image below. Users already belong to certain groups in the user management, and cases have certain case attribute values which is part of the loaded process mining data. Additionally, the linkage between case attribute values (of a certain case attribute) and groups needs to be defined when this security feature is configured. The image below illustrates the chain between users and cases, how certain users are able to see certain cases when viewing analyses from a QPR ProcessAnalyzer model.

CasePermissions.png

Example: There are groups G1, G2 and G3. Case permissions have been set as follows:

  • group G1 can only see cases where (case attribute) Region is Dallas
  • group G2 can only see cases where Region is Austin
  • group G3 can only see cases where Region is either Austin or New York

QPR ProcessAnalyzer model contains the following cases:

Case name Region (case attribute) Groups can see
A Dallas G1
B Dallas G1
C Austin G2, G3
D New York G3
E New York G3
F New York G3

Thus, when viewing analyses, a user see that the model contains the following cases:

  • If the user belongs to group G1 only, the user can see cases A and B (2 cases)
  • If the user belongs to group G2 only, the user can see case C (1 case)
  • If the user belongs to group G3 only, the user can see cases C, D, E and F (4 cases)
  • If the user belongs to groups G1 and G2 only, the user can see cases A, B and C (3 cases)

There is no way for a user to be aware of the existence of cases that the user doesn't have rights to.

Configuration examples for case permissions

In this example, visibility of cases is limited in a way that only those users can see the cases belonging to a user group which name is same as the Region (case attribute).

{
  "Permissions": {
    "Initialization": "Let(\"groupNames\", OrderByValue(CurrentUser.GroupNames))", 
    "Case": "Region.In(groupNames)",
    "EventLogKey": "StringJoin(\"_\", groupNames)"
  }
}

In this example, cases are only visible for users whose user name is same as the Account Manager (case attribute).

{
  "Permissions": {
    "Initialization": "Let(\"userName\", CurrentUser.Name)", 
    "Case": "(Attribute(\"Account Manager\") == userName)",
    "EventLogKey": "CurrentUser.Id"
  }
}

In this example, cases having "Region" case attribute of "Dallas" will only be visible for users belonging to user group "GroupA" (and "New York" for group "GroupB").

{
  "Permissions": {
    "Initialization": "Let(\"groupNames\", CurrentUser.GroupNames)", 
    "Case": "(Region == \"Dallas\" && \"GroupA\".In(groupNames)) || (Region == \"New York\" && \"GroupB\".In(groupNames))",
    "EventLogKey": "If(\"GroupA\".In(groupNames), \"_A\", \"_\") + If(\"GroupB\".In(groupNames), \"_B\", \"_\")"
  }
}