QPR ProcessAnalyzer Model Datasources: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
No edit summary
No edit summary
Line 1: Line 1:
Data into QPR Processanalyzer models can be loaded from several datasources using several methods. The following datasource types can be used:
Data into QPR Processanalyzer models can be loaded from several datasources. The following datasource types can be used:
* [[#Loading Data from Datatables|Datatables]]: Data can be loaded from QPR ProcessAnalyzer [[Data_Tables_in_QPR_ProcessAnalyzer|Datatables]].
* [[#Loading Data from Datatables|Datatables]]: Data can be loaded from QPR ProcessAnalyzer [[Data_Tables_in_QPR_ProcessAnalyzer|Datatables]].
* [[#Loading Data using Loading Script|Loading Script]]: Data can be loaded using a loading script written in QPR ProcessAnalyzer [[QPR_ProcessAnalyzer_Expressions|expression language]]. The loading script can extract the data and make needed transformations to modify data suitable to be loaded into the model. The loading script is started when the model needs to be loaded into memory (pull principle). With the expression language, it's also possible to fetch data from other QPR ProcessAnalyzer models.
* [[#Loading Data using Loading Script|Loading Script]]: Data can be loaded using a loading script written in QPR ProcessAnalyzer [[QPR_ProcessAnalyzer_Expressions|expression language]]. The loading script can extract the data and perform needed transformations to the suitable eventlog format. The loading script is started when the model needs to be loaded into memory (pull principle). With the expression language, it's also possible to fetch data from other QPR ProcessAnalyzer models.
* [[#Loading Data from ODBC Datasource|ODBC datasources]]: Data can be loaded from any ODBC compliant source system, such as database systems, SAP HANA, ServiceNow, Excel files and CSV files.
* [[#Loading Data from ODBC Datasource|ODBC datasources]]: Data can be loaded from any ODBC compliant source system, such as database systems, SAP HANA, ServiceNow, Excel files and CSV files.


Two queries for the datasource needs to be defined: one for cases and one for events. It's also possible that cases and event are loaded from different types of datasources, e.g. cases from an ODBC datasource and events from a datatable.
Two queries for the datasource needs to be defined: one for cases and one for events. It's also possible that cases and event are loaded from different types of datasources, e.g. cases from an ODBC datasource and events from a datatable.


Mappings for the columns in the datasets can be defined freely, so the order of columns is not relevant for QPR ProcessAnalyzer.
Mappings for the columns in the datasets can be defined freely, so the order of columns is not relevant for QPR ProcessAnalyzer. There are the ''Cases'' and ''Events'' sections for defining a datasource for the respective data.


== Loading Data from Datatables ==
* ''Datasource'': The datasource can be ''Datatable'', ''ODBC'' or ''Loading Script''.
In the ''DataSource'' section, when the ''DataSourceType'' is defined as ''datatable'', data is loaded from a QPR ProcessAnalyzer [[Data_Tables_in_QPR_ProcessAnalyzer|datatable]]. Datatables can be created by importing data to them from a CSV file, or as a result of an [[Data_Extraction,_Transformation,_and_Loading|ETL script]].
* ''Datatable'': When datasource is a datatable, this settings selects the datatable.
* ''Column: Case ID'': Defines column containing case id. This mapping needs to be done for both the cases and events data. This column datatype can be either string or number.
* ''Column: Event Type'': Defines column containing event type name. This column datatype must be string.
* ''Column: Event Time'': Defines column containing event timestamp. The column datatype must be date.
* ''Connection String'':
* ''Connection String Key'':
* ''ODBC Query'':
* ''Loading Script'':


When a datatable model is moved to another project, also the linked datatables are moved with it. Note that there may be several models using a same datatable, and in that case the linkages to other models may be broken when moving a model to other project. Note also that if another datatable with the same name already exist in the target project, the datatable cannot be moved, because there cannot be two datatables having the same name in the same project.
== Model Datasource Dettings ==
Model datasource settings are in the ''Datasources'' tab in the ''Model Properties''.


{| class="wikitable"
== Loading Data from Datatables ==
!'''Property'''
When the ''DataSource'' is ''Datatable'', data is loaded from a QPR ProcessAnalyzer [[Data_Tables_in_QPR_ProcessAnalyzer|datatable]]. Datatables can be created by importing data to them from CSV files, or as a result of an [[Data_Extraction,_Transformation,_and_Loading|ETL script]].
! '''Description'''
|-
||DataSource/Cases/DataSourceType
||Datasource type to use, when fetching the Cases data. Supported values are '''datatable''' (default), '''expression''' and '''odbc'''.
|-
||DataSource/Cases/DataTableName
||Name of the datatable where to fetch the cases data.
|-
||DataSource/Cases/Columns
||Column name mappings for QPR ProcessAnalyzer data model. Only supported mapping is '''CaseId''' defining the case id (case name). Example:
<pre>
{
  "CaseId": "SalesOrderHeaderId"
}
</pre>
All unmapped columns are taken to the model as case attributes. If the fetched data doesn't contain a mapped column, the model loading fails and an error message is given.
|-
||DataSource/Events/DataSourceType
||Datasource type to use, when fetching the Events data. Supported values are '''datatable''' (default), '''expression''' and '''odbc'''.
|-
||DataSource/Events/DataTableName
||Name of the datatable where to fetch the events data.
|-
||DataSource/Events/Columns
||
Column name mappings for QPR ProcessAnalyzer data model. Supported mappings for Events are '''CaseId''', '''EventType''' and '''Timestamp'''. Example:
<pre>
{
  "CaseId": "SalesOrderHeaderId",
  "EventType": "EventType",
  "Timestamp": "CreatedDate"
}
</pre>
All unmapped columns are taken to the model as event attributes. If the fetched data doesn't contain a mapped column, the model loading fails and an error message is given.
|}


===Example===
When a model using datatables as a datasource, is moved to another project, also the linked datatables are moved with the model. Note that there may be several models using a same datatable, and in that case the linkages to other models may be broken when moving a model to other project. Note also that if another datatable with the same name already exist in the target project, the datatable cannot be moved, because there cannot be two datatables having the same name in the same project.
The following configuration loads cases from datatable ''Cases'' where there is a column ''Case ID'' containing case id's, and events from datatable ''events'' where there are columns ''Case ID'', ''Event name'' and ''Event time''.
<pre>
{
  "DataSource": {
    "Cases": {   
      "DataSourceType": "datatable",
      "DataTableName": "Cases",
      "Columns": {
        "CaseId": "Case ID"
      }
    },
    "Events": {
      "DataSourceType": "datatable",
      "DataTableName": "Events",
      "Columns": {
        "CaseId": "Case ID",
        "EventType": "Event name",
        "Timestamp": "Event time"
      }
    }
  }
}
</pre>


== Loading Data using Loading Script ==
== Loading Data using Loading Script ==
Data (cases and events) loaded to a QPR ProcessAnalyzer model can be defined using a loading script, which is written in the QPR ProcessAnalyzer expression language. The loading script produces a DataFrame containing the cases or events data. In addition, mappings from the DataFrame column names defining which column is case id, event type name and timestamp needs to be defined.
Data loaded to a QPR ProcessAnalyzer model can be defined using a loading script, which is written in the QPR ProcessAnalyzer expression language. The loading script produces a DataFrame containing the cases or events data. In addition, mappings from the DataFrame column names defining which column is case id, event type name and timestamp needs to be defined.


The loading script is run in a security context where there are [[Roles_and_Permissions_in_QPR_ProcessAnalyzer#Global_and_Project_Roles|GenericRead and GenericWrite]] permissions for the project, where the loaded model is located. This means that the loading script has access to all the data in the same project, and also a write access e.g. to the datatables. User permissions are not used by the loading script, so that the user who initiated the loading, wouldn't affect the result. In addition, the model may be loaded [[Automatic_Loading_on_Server_Startup|automatically on server startup]], and in that case there is no user who initiated the loading.
The loading script is run in a security context where there are [[Roles_and_Permissions_in_QPR_ProcessAnalyzer#Global_and_Project_Roles|GenericRead and GenericWrite]] permissions for the project, where the loaded model is located. This means that the loading script has access to all the data in the same project, and also a write access e.g. to the datatables. User permissions are not used by the loading script, so that the user who initiated the loading, wouldn't affect the result. In addition, the model may be loaded [[Automatic_Loading_on_Server_Startup|automatically on server startup]], and in that case there is no user who initiated the loading.
Line 83: Line 33:
When loading data from an ODBC datasource, note also the [[PA_Configuration_database_table_in_QPR_ProcessAnalyzer|AllowExternalDatasources setting]], which may prevent the loading.
When loading data from an ODBC datasource, note also the [[PA_Configuration_database_table_in_QPR_ProcessAnalyzer|AllowExternalDatasources setting]], which may prevent the loading.


{| class="wikitable"
=== Examples ===
!'''Property'''
! '''Description'''
|-
||DataSource/Cases/DataSourceType
||Use value '''expression''' when loading data using a loading script. More information about DataSourceType, see [[QPR_ProcessAnalyzer_Model_Datasources#Loading_Data_from_Datatables|Loading Data from Datatables]].
|-
||DataSource/Cases/Expression
||Loading script written in the expression language to get cases. The script should return a DataFrame object which there is one row per case and case attributes are as columns. The script is run in the [[QPR_ProcessAnalyzer_Objects_in_Expression_Language#Model|Model]] context.
|-
||DataSource/Cases/Columns
||For more information, see [[QPR_ProcessAnalyzer_Model_Datasources#Loading_Data_from_Datatables|Loading Data from Datatables]].
|-
||DataSource/Events/DataSourceType
||Use value '''expression''' when loading data using a loading script. More information about DataSourceType, see [[QPR_ProcessAnalyzer_Model_Datasources#Loading_Data_from_Datatables|Loading Data from Datatables]].
|-
||DataSource/Events/Expression
||Loading script written in the expression language to get events. The script should return a DataFrame object which there is one row per event, and event attributes are as columns. The script is run in the [[QPR_ProcessAnalyzer_Objects_in_Expression_Language#Model|Model]] context.
|-
||DataSource/Events/Columns
||For more information, see [[QPR_ProcessAnalyzer_Model_Datasources#Loading_Data_from_Datatables|Loading Data from Datatables]].
|-
|}


=== Examples ===
Loading script for events:
<pre>
ToDataFrame(
  EventLogById(1).Events.[Case.Name, TypeName, Timestamp],
  ["CaseId", "EventType", "Timestamp"]
)
</pre>


Create model from another model:
Loading script for events:
<pre>
<pre>
{
Let("maxEventsPerModel", 1000);Let("ms", ["SAP_OrderToCash", "BPIC13_incidents"].(Let("modelName", _), Models.Where(Name == modelName))[0]);Let("eventAttributes", ToDictionary());ms.EventAttributes.(Let("ea", _), eventAttributes.Set(ea.Name, ea));Let("orderedEventAttributes", OrderByValue(eventAttributes.Keys));ConcatTop([[Flatten(["Case", "EventType", "TimeStamp", orderedEventAttributes])],ConcatTop(ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Flatten([Case.Name, Type.Name, TimeStamp, (Let("evt", _), orderedEventAttributes.(Let("att", _), evt.Attribute(att)))]))])
  "DataSource": {
    "Events": {
      "DataSourceType": "expression",
      "Expression": "ToDataFrame(EventLogById(1).Events.[Case.Name, TypeName, Timestamp], [\"CaseId\", \"EventType\", \"Timestamp\"])",
      "Columns": {
        "CaseId": "Case",
        "Timestamp": "Timestamp",
        "EventType": "EventType"
      }
    }
  }
}
</pre>
</pre>


Loading script for cases:
<pre>
<pre>
{
Let("maxEventsPerModel", 1000);Let("ms", ["SAP_OrderToCash", "BPIC13_incidents"].(Let("modelName", _), Models.Where(Name == modelName))[0]);Let("caseAttributes", ToDictionary());ms.CaseAttributes.(Let("ca", _), caseAttributes.Set(ca.Name, ca));Let("orderedCaseAttributes", OrderByValue(caseAttributes.Keys));Let("cases", ToDictionary());ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Case.(Let("c", _), cases.Set(c, true));ConcatTop([[Flatten(["Case", "Model", orderedCaseAttributes])],cases.Keys.Flatten([Name, EventLog.Name, (Let("c", _), orderedCaseAttributes.(Let("att", _), c.Attribute(att)))])])
  "DataSource": {
    "Events": {
      "DataSourceType": "expression",
      "Expression": "Let(\"maxEventsPerModel\", 1000);Let(\"ms\", [\"SAP_OrderToCash\", \"BPIC13_incidents\"].(Let(\"modelName\", _), Models.Where(Name == modelName))[0]);Let(\"eventAttributes\", ToDictionary());ms.EventAttributes.(Let(\"ea\", _), eventAttributes.Set(ea.Name, ea));Let(\"orderedEventAttributes\", OrderByValue(eventAttributes.Keys));ConcatTop([[Flatten([\"Case\", \"EventType\", \"TimeStamp\", orderedEventAttributes])],ConcatTop(ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Flatten([Case.Name, Type.Name, TimeStamp, (Let(\"evt\", _), orderedEventAttributes.(Let(\"att\", _), evt.Attribute(att)))]))])",
      "Columns": {
        "CaseId": "Case",
        "Timestamp": "TimeStamp",
        "EventType": "EventType"
      }
    },
    "Cases": {
      "DataSourceType": "expression",
      "Expression": "Let(\"maxEventsPerModel\", 1000);Let(\"ms\", [\"SAP_OrderToCash\", \"BPIC13_incidents\"].(Let(\"modelName\", _), Models.Where(Name == modelName))[0]);Let(\"caseAttributes\", ToDictionary());ms.CaseAttributes.(Let(\"ca\", _), caseAttributes.Set(ca.Name, ca));Let(\"orderedCaseAttributes\", OrderByValue(caseAttributes.Keys));Let(\"cases\", ToDictionary());ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Case.(Let(\"c\", _), cases.Set(c, true));ConcatTop([[Flatten([\"Case\", \"Model\", orderedCaseAttributes])],cases.Keys.Flatten([Name, EventLog.Name, (Let(\"c\", _), orderedCaseAttributes.(Let(\"att\", _), c.Attribute(att)))])])",
      "Columns": {
        "CaseId": "Case"
      }
    }
  }
}
</pre>
</pre>


Create a model with eventlog data embedded into the expression.
Create a model with eventlog data embedded into the expression.
<pre>
<pre>
{
ToDataFrame([["a", "et1", DateTime(2019,1)],["a", "et2", DateTime(2019,2)],["a", "et3", DateTime(2019,3)],["b", "et1", DateTime(2019,2)],["c", "et3", DateTime(2019,2)],["c", "et2", DateTime(2019,3)],["c", "et2", DateTime(2019,4)],["c", "et3", DateTime(2019,5)],["c", "et4", DateTime(2019,6)]], ["Case","Timestamp","EventType"])
  "DataSource": {
    "Events": {
      "DataSourceType": "expression",
      "Expression": "ToDataFrame([["a", "et1", DateTime(2019,1)],["a", "et2", DateTime(2019,2)],["a", "et3", DateTime(2019,3)],["b", "et1", DateTime(2019,2)],["c", "et3", DateTime(2019,2)],["c", "et2", DateTime(2019,3)],["c", "et2", DateTime(2019,4)],["c", "et3", DateTime(2019,5)],["c", "et4", DateTime(2019,6)]], ["Case","Timestamp","EventType"])",
      "Columns": {
        "CaseId": "Case",
        "Timestamp": "TimeStamp",
        "EventType": "EventType"
      }
    }
  }
}
</pre>
</pre>


Line 189: Line 81:


== Loading Data from ODBC Datasource ==
== Loading Data from ODBC Datasource ==
In the ''DataSource'' section, when the ''DataSourceType'' is defined as ''odbc'' (for cases or events), data is loaded from an ODBC datasource. There needs to be a suitable ODBC driver installed in QPR ProcesAnalyzer server, and ODBC connection string and query needs to be defined (more below).  
When ''Datasource'' is ''ODBC'', data is loaded from an ODBC datasource. There needs to be a suitable ODBC driver installed in QPR ProcesAnalyzer server, and ODBC connection string and query needs to be defined (more below).  


[[PA_Configuration_database_table_in_QPR_ProcessAnalyzer|AllowExternalDatasources]] setting needs to be True to be able to use the ODBC datasource.
[[PA_Configuration_database_table_in_QPR_ProcessAnalyzer|AllowExternalDatasources]] setting needs to be True to be able to use the ODBC datasource.

Revision as of 15:17, 25 October 2020

Data into QPR Processanalyzer models can be loaded from several datasources. The following datasource types can be used:

  • Datatables: Data can be loaded from QPR ProcessAnalyzer Datatables.
  • Loading Script: Data can be loaded using a loading script written in QPR ProcessAnalyzer expression language. The loading script can extract the data and perform needed transformations to the suitable eventlog format. The loading script is started when the model needs to be loaded into memory (pull principle). With the expression language, it's also possible to fetch data from other QPR ProcessAnalyzer models.
  • ODBC datasources: Data can be loaded from any ODBC compliant source system, such as database systems, SAP HANA, ServiceNow, Excel files and CSV files.

Two queries for the datasource needs to be defined: one for cases and one for events. It's also possible that cases and event are loaded from different types of datasources, e.g. cases from an ODBC datasource and events from a datatable.

Mappings for the columns in the datasets can be defined freely, so the order of columns is not relevant for QPR ProcessAnalyzer. There are the Cases and Events sections for defining a datasource for the respective data.

  • Datasource: The datasource can be Datatable, ODBC or Loading Script.
  • Datatable: When datasource is a datatable, this settings selects the datatable.
  • Column: Case ID: Defines column containing case id. This mapping needs to be done for both the cases and events data. This column datatype can be either string or number.
  • Column: Event Type: Defines column containing event type name. This column datatype must be string.
  • Column: Event Time: Defines column containing event timestamp. The column datatype must be date.
  • Connection String:
  • Connection String Key:
  • ODBC Query:
  • Loading Script:

Model Datasource Dettings

Model datasource settings are in the Datasources tab in the Model Properties.

Loading Data from Datatables

When the DataSource is Datatable, data is loaded from a QPR ProcessAnalyzer datatable. Datatables can be created by importing data to them from CSV files, or as a result of an ETL script.

When a model using datatables as a datasource, is moved to another project, also the linked datatables are moved with the model. Note that there may be several models using a same datatable, and in that case the linkages to other models may be broken when moving a model to other project. Note also that if another datatable with the same name already exist in the target project, the datatable cannot be moved, because there cannot be two datatables having the same name in the same project.

Loading Data using Loading Script

Data loaded to a QPR ProcessAnalyzer model can be defined using a loading script, which is written in the QPR ProcessAnalyzer expression language. The loading script produces a DataFrame containing the cases or events data. In addition, mappings from the DataFrame column names defining which column is case id, event type name and timestamp needs to be defined.

The loading script is run in a security context where there are GenericRead and GenericWrite permissions for the project, where the loaded model is located. This means that the loading script has access to all the data in the same project, and also a write access e.g. to the datatables. User permissions are not used by the loading script, so that the user who initiated the loading, wouldn't affect the result. In addition, the model may be loaded automatically on server startup, and in that case there is no user who initiated the loading.

When loading data from an ODBC datasource, note also the AllowExternalDatasources setting, which may prevent the loading.

Examples

Loading script for events:

ToDataFrame(
  EventLogById(1).Events.[Case.Name, TypeName, Timestamp],
  ["CaseId", "EventType", "Timestamp"]
)

Loading script for events:

Let("maxEventsPerModel", 1000);Let("ms", ["SAP_OrderToCash", "BPIC13_incidents"].(Let("modelName", _), Models.Where(Name == modelName))[0]);Let("eventAttributes", ToDictionary());ms.EventAttributes.(Let("ea", _), eventAttributes.Set(ea.Name, ea));Let("orderedEventAttributes", OrderByValue(eventAttributes.Keys));ConcatTop([[Flatten(["Case", "EventType", "TimeStamp", orderedEventAttributes])],ConcatTop(ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Flatten([Case.Name, Type.Name, TimeStamp, (Let("evt", _), orderedEventAttributes.(Let("att", _), evt.Attribute(att)))]))])

Loading script for cases:

Let("maxEventsPerModel", 1000);Let("ms", ["SAP_OrderToCash", "BPIC13_incidents"].(Let("modelName", _), Models.Where(Name == modelName))[0]);Let("caseAttributes", ToDictionary());ms.CaseAttributes.(Let("ca", _), caseAttributes.Set(ca.Name, ca));Let("orderedCaseAttributes", OrderByValue(caseAttributes.Keys));Let("cases", ToDictionary());ms.EventLog.Events[NumberRange(0, Min(CountTop(_), maxEventsPerModel) - 1)].Case.(Let("c", _), cases.Set(c, true));ConcatTop([[Flatten(["Case", "Model", orderedCaseAttributes])],cases.Keys.Flatten([Name, EventLog.Name, (Let("c", _), orderedCaseAttributes.(Let("att", _), c.Attribute(att)))])])

Create a model with eventlog data embedded into the expression.

ToDataFrame([["a", "et1", DateTime(2019,1)],["a", "et2", DateTime(2019,2)],["a", "et3", DateTime(2019,3)],["b", "et1", DateTime(2019,2)],["c", "et3", DateTime(2019,2)],["c", "et2", DateTime(2019,3)],["c", "et2", DateTime(2019,4)],["c", "et3", DateTime(2019,5)],["c", "et4", DateTime(2019,6)]], ["Case","Timestamp","EventType"])


This loading script loads events using several odbc queries (each event type separately).

let("eventTypes", ["Invoice Created", "Requested Delivery", "Inventory Date", "Invoice Payment", "Actual Delivery"]);
let("data",
	eventTypes
	.(
		ImportODBC(
			"Driver={ODBC Driver 17 for SQL Server};Server=SERVER\\MSSQLSERVER2017;DataBase=Process Mining Data;Uid=user1;Pwd=pw1;",
			"SELECT [Case] AS \"CaseId\", Event_Type AS \"EventType\", Start_Time AS TimeStamp, Cost, Variable_cost, booleanAttribute1 FROM [OtC Model events] Where [Event_Type]=\'" + _ +"\'"
		)
	)
);
let("combinedData", data[0]);
For("i", 1, i < CountTop(data), i + 1,
	let("combinedData",
		combinedData.Append(data[i])
	)
);
combinedData;

Loading Data from ODBC Datasource

When Datasource is ODBC, data is loaded from an ODBC datasource. There needs to be a suitable ODBC driver installed in QPR ProcesAnalyzer server, and ODBC connection string and query needs to be defined (more below).

AllowExternalDatasources setting needs to be True to be able to use the ODBC datasource.

Property Description
DataSource/Cases/DataSourceType Use value odbc when loading data from an ODBC datasource. More information about DataSourceType, see Loading Data from Datatables.
DataSource/Cases/OdbcConnectionString ODBC connection string to use to fetch the Cases data. Connection strings can be found in https://www.connectionstrings.com.
DataSource/Cases/OdbcConnectionStringKey Secure string key of the ODBC connection string to use to fetch the Cases data. Alternative to the OdbcConnectionString property.
DataSource/Cases/OdbcQuery ODBC query to use to fetch the Cases data. Query syntax depends on the source system where the data is fetched.
DataSource/Cases/Columns For more information, see Loading Data from Datatables.
DataSource/Events/DataSourceType Use value odbc when loading data from an ODBC datasource. More information about DataSourceType, see Loading Data from Datatables.
DataSource/Events/OdbcConnectionString ODBC connection string to use to fetch the Events data. Connection strings can be found in https://www.connectionstrings.com.
DataSource/Events/OdbcConnectionStringKey Secure string key of the ODBC connection string to use to fetch the Events data. Alternative to the OdbcConnectionString property.
DataSource/Events/OdbcQuery ODBC query to use to fetch the Events data. Query syntax depends on the source system where the data is fetched.
DataSource/Events/Columns For more information, see Loading Data from Datatables.

Example: ODBC Driver Installation

Microsoft Access Database Engine 2016 Redistributable is software package that contains ODBC drivers for

  • Microsoft SQL Server
  • Microsoft Access (*.mdb and *.accdb) files
  • Microsoft Excel (*.xls, *.xlsx, and *.xlsb) files
  • CSV text files

The package needs to be installed in the same computer where the QPR ProcessAnalyzer Server is running. Installation instructions:

  1. Go to https://www.microsoft.com/en-us/download/details.aspx?id=54920 and click Download.
  2. Select whether to use the 32bit or 64bit (x64) version (usually it's the 64bit version).
  3. Double-click the executable file on your hard disk to start the setup program.
  4. Follow the instructions on the screen to complete the installation.

If 32-bit office is installed in machine then 64-bit ODBC driver need install using /passive parameter. More information

Example: Read data from CSV file

In this example, data is loaded from CSV files located in the file system using Microsoft Access Text Driver (*.txt, *.csv) driver. Loaded files are C:\ProcessMiningData\ModelCaseAttributes.csv and C:\ProcessMiningData\ModelEventData.csv.

{
  "DataSource": {
    "Cases": {    
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\\ProcessMiningData\\;Extensions=asc,csv,tab,txt",
      "OdbcQuery": "SELECT * FROM [ModelCaseAttributes.csv]",
      "Columns": { 
        "CaseId": "Name"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Access Text Driver (*.txt, *.csv)};DefaultDir=C:\\ProcessMiningData\\;Extensions=asc,csv,tab,txt",
      "OdbcQuery": "SELECT * FROM [ModelEventData.csv]",
      "Columns": { 
        "CaseId": "Case",
        "EventType": "Event Type",
        "Timestamp": "Start Time"
      }
    }
  }
}

When reading from CSV files, you may need to set the CSV file format for the ODBC driver using the Schema.ini file (more information: https://docs.microsoft.com/en-us/sql/odbc/microsoft/schema-ini-file-text-file-driver?view=sql-server-2017).

  • Example: ODBC text driver may guess column data type and return null values in case when guessed column type not match. Guessing based top rows of data.
  • Note: ODBC text driver support only one data type per column mixed data types is not supported.

Example: Read data from Excel file

In this example, data is loaded from a Excel file that is accessible in the file system. In this example, the loaded file is C:\ProcessMiningData\ModelData.xlsx and cases are in sheet MyCases and events in sheet MyEvents.

{
  "DataSource": {
    "Cases": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=C:\\ProcessMiningData\\ModelData.xlsx",
      "OdbcQuery": "SELECT * FROM [MyCases$]",
      "Columns": {
        "CaseId": "Case ID"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};DBQ=C:\\ProcessMiningData\\ModelData.xlsx",
      "OdbcQuery": "SELECT * FROM [MyEvents$]",
      "Columns": {
        "CaseId": "Case ID",
        "EventType": "Activity",
        "Timestamp": "Start Time"
      }
    }
  }
}

Example: Read data from SQL Server table

Recommended driver: https://www.microsoft.com/en-us/download/details.aspx?id=56567

In this example, data is loaded from an SQL Server table. SQL Server hostname is myServerName, instance name is myInstanceName and database name is MyDatabase. The Trusted_Connection is for Windows authentication.

{
  "DataSource": {
    "Cases": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={ODBC Driver 17 for SQL Server};Server=myServerName\\myInstanceName;DataBase=MyDatabase;Trusted_Connection=Yes;",
      "OdbcQuery": "SELECT CASENAME, COST, COUNTRY, PRODUCT FROM CASES_TABLE",
      "Columns": { 
        "CaseId": "CASENAME"
      }
    },
    "Events": {
      "DataSourceType": "odbc",
      "OdbcConnectionString": "Driver={ODBC Driver 17 for SQL Server};Server=myServerName\\myInstanceName;DataBase=MyDatabase;Trusted_Connection=Yes;",
      "OdbcQuery": "SELECT CASENAME, CREATED_DATE, CREATED_BY FROM EVENTS_TABLE",
      "Columns": { 
        "CaseId": "CASENAME",
        "Timestamp": "CREATED_DATE",
        "EventType": "CREATED_BY"
      }
    }
  }
}

If SQL Server authentication is used, the connection string is

Driver={ODBC Driver 17 for SQL Server};Server=myServerName\myInstanceName;DataBase=MyDatabase;Trusted_Connection=No;Uid=myUsername;PWD=myPassword;

More about SQL Server connection strings: https://www.connectionstrings.com/sql-server/

Troubleshooting

The following error message may be encountered: System.Data.Odbc.OdbcException (0x80131937): ERROR [IM002] [Microsoft][ODBC Driver Manager] Data source name not found and no default driver specified. Probable reason is that the ODBC driver is missing or driver name is not correct. To solve the issue:

  • Check ODBC drivers is installed by running C:\Windows\System32\odbcad32.exe.
  • Check the connection string.

The following error message may be encountered: ERROR [HY024] [Microsoft][ODBC Text Driver] '(unknown)' is not a valid path. Make sure that the path name is spelled correctly and that you are connected to the server on which the file resides.. Example error when specified directory is missing:

Error when CSV file is missing: System.Data.Odbc.OdbcException (0x80131937): ERROR [42S02] [Microsoft][ODBC Text Driver] The Microsoft Access database engine could not find the object 'CaseAttributes.csv'. Make sure the object exists and that you spell its name and the path name correctly. If 'CaseAttributes.csv' is not a local object, check your network connection or contact the server administrator.

Error codes for troubleshooting: ODBC Error Codes.

If the error message mention about Case or Event attributes named as F<column number> (e.g. F10), the data might be inconsistent, as there might be different amount of columns in different rows.