QPR ProcessAnalyzer System Architecture: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
m (Ollvihe moved page QPR ProcessAnalyzer Architecture to QPR Software Products Architecture without leaving a redirect)
 
(107 intermediate revisions by the same user not shown)
Line 1: Line 1:
QPR ProcessAnalyzer is a cloud based product but also on-premise set-up is supported.
QPR ProcessAnalyzer is natively a cloud-based software, but also an on-premise installation option is available. QPR ProcessAnalyzer can be used with desktops, laptops and tablets with web browser.


As data security is always key, the architecture is built so that the data is protected when collected from the source. The data can be fetched from any source using integration interfaces whether they are located on premise or in the cloud. The analysis algorithms are located in SQL Server from where the analysis results are fed to the Excel or Web user interfaces.
==System Architecture and Components==
The following diagram shows the high level system architecture for QPR ProcessAnalyzer with its main components.


Users access the product through their PCs, laptops or tables on the Internet for the Web UI and the Excel UI.
[[File:QPR_ProcessAnalyzer_System_Architecture.png|1150px]]


== Integration to Data Sources ==
QPR ProcessAnalyzer consists of the following components:
QPR ProcessAnalyzer is designed and built for easy integration to a wide range of data sources. The power of the product comes from having different process information accessible from one point and where it can be analyzed from any angle.
* '''QPR ProcessAnalyzer Web UI''': Web UI is a web browser based user interface for QPR ProcessAnalyzer using HTML5 and Angular as technologies. The Web UI works with all modern browsers without separate add-ons. For more information, see the [[QPR_ProcessAnalyzer_System_Requirements#System_Requirements_for_Web_Browsers|list of supported browser]].
* '''QPR ProcessAnalyzer Server''': QPR ProcessAnalyzer Server is a component running in the cloud (or server side). When using Snowflake, QPR ProcessAnalyzer Server creates SQL queries for the needed calculations and sends them to Snowflake for execution. When using in-memory processing, QPR ProcessAnalyzer Server has the eventlog data in-memory where it can be processed using QPR ProcessAnalyzer's expression language optimized for process mining calculations. QPR ProcessAnalyzer Server also manages users, sessions and other system configurations. QPR ProcessAnalyzer Server is based on Microsoft .Net 8 and it runs in Windows Server (hosted by IIS) or Linux (hosted by Kestrel).
* '''QPR ScriptLauncher''': ScriptLauncher is a tool to trigger QPR ProcessAnalyzer scripts used for ETL (extract, transform and load). The ScriptLauncher is installed on-premise, to extract data from on-premise systems and store to the QPR ProcessAnalyzer cloud. ScriptLauncher will run scripts managed in QPR ProcessAnalyzer, executing the data extraction commands to the on-premise systems.
* '''Snowflake data cloud''': Snowflake data cloud is an external cloud service offered by [https://www.snowflake.com Snowflake Inc.]. When integrated with QPR ProcessAnalyzer, Snowflake can store datatables, perform process mining analytics calculations, and run ETL transformations.
* '''SQL Server metadata database''': This is the system database used by QPR ProcessAnalyzer Server, storing e.g., projects, dashboards, scripts, users and system configurations. SQL Server is supported as the database system and connection to the database uses ADO.Net.
* '''SQL Server datatables database''': This database stores datatable contents managed by QPR ProcessAnalyzer. The datatables contain events and cases data for the process mining models and data processed by the ETL scripts. SQL Server is supported as the database system and connection to the database uses ADO.Net. When using Snowflake, this database is not needed, as the datatables are stored in the Snowflake.
* '''SQL Server scripting database''': Scripting database is used to run SQL commands in the ETL scripts. The scripting database can be configured to store data only temporarily during the script runs (for the permanent storage [[QPR_ProcessAnalyzer_Project_Workspace#Datatables|datatables]] stored to the QPR ProcessAnalyzer Server Database can be used). For custom usecases, the scripting database can also have write access to store data permanently.
* '''(SQL Server tempdb)''': SQL Server has a database called tempdb that store temporary data during processing queries. As a data intensive application, the tempdb performance has an essential role in QPR ProcessAnalyzer, e.g., the SQL scripts use the tempdb. See the [[QPR_ProcessAnalyzer_System_Requirements#Recommended_Hardware_for_QPR_ProcessAnalyzer_Database_Server|system requirements]] section for the tempdb sizing. More information about tempdb: https://docs.microsoft.com/en-us/sql/relational-databases/databases/tempdb-database?view=sql-server-ver15.


For basic analysis, Event ID, Time Stamp and Attiribute data should be available from the data source. However, depending on the customer need the analysis can be expanded by other data where all provided information can be included in the analysis be it sales person, location, customer, sale amount, time stamp for start and end of sale.
== Snowflake-Powered Calculation ==
The data sources can include:
In addition to the in-memory processing, process mining queries for dashboards can be calculated in Snowflake (https://snowflake.com), enabling remarkable scaling in performance with very large datasets and number of simultaneous users. To use the Snowflake processing, an account to Snowflake is required and QPR ProcessAnalyzer needs to be connected to the account using an ODBC connection ([[Snowflake Connection Configuration|more information]]). Any cloud platforms provided by Snowflake can be used, but for optimal performance, prefer a site that is near to the QPR ProcessAnalyzer environment.
*ERP systems e.g. SAP for Order to Cash and other processes  
 
*CRM systems e.g. Salesforce for sales process
When the Snowflake connection is available in a QPR ProcessAnalyzer environment, users can create datatables that store their data in tables located in Snowflake. When creating a process mining model using the Snowflake stored events and cases datatables, process mining calculations are performed by the Snowflake warehouses. The snowflake processing has the same capabilities (and even more) than the in-memory processing, but the expression language syntax is different and thus there is a different set of ready-made KPI's available when creating dashboards.
*Customer support systems e.g. Jira
 
*Case Management Systems
The following diagram shows how QPR ProcessAnalyzer interacts with Snowflake in the process mining query processing and also options for the ETL data flows:
*Supply Chain Management systems
 
*Configuration Management Databases
[[File:QPR ProcessAnalyzer integration to Snowflake.PNG|1100px]]
 
== Run as Snowflake Native Application ==
QPR ProcessAnalyzer is a available as the Snowflake Native Application. In Snowflake, the architecture is more simple because all components are running in the Snowflake Data Cloud. Following diagram shows the overall architecture:
 
[[File:QPR_ProcessAnalyzer_in_Snowpark_Container_Services_architecture.png|900px]]
 
In Snowflake, QPR ProcessAnalyzer Server is running in the container hosted by the Snowpark Container Services. The container also has the SQL Server Express where the configuration data (metadata database) is stored. The static files for the web frontend are also located in the container.
 
QPR ProcessAnalyzer user management is integrated into Snowflake, so authenticated Snowflake users are automatically logged in to QPR ProcessAnalyzer by matching a username. When viewing dashboards and analyses, the process mining analytics calculations are performed in the Snowflake virtual warehouse. QPR ProcessAnalyzer Server running in the container doesn't handle the eventlog data at all. Calculation results shown to the user, are temporary cached to QPR ProcessAnalyzer Server and the caches are cleared when the container service shuts down.
 
Following components and features are not available when running QPR ProcessAnalyzer as Snowflake Native Application:
* SQL Server datatable database
* SQL scripts and SQL Server scripting database
* QPR ScriptLauncher
* Data extraction from external datasources
 
== Connecting to Data Sources ==
QPR ProcessAnalyzer is designed and built for easy integration to a wide range of data sources to fetch the eventlogs data. The power of process mining comes from collecting process information from differences sources to a comprehensive process mining model, where it can be analyzed from any angle. The available data sources can include for example:
* ERP systems, e.g. SAP HANA and SAP R/3 (for Order-to-Cash, Purchase-to-Pay processes etc.)
* CRM systems, e.g. Salesforce (for sales process)
* Customer support systems, e.g. Jira
* Case management systems
* Supply chain management systems
* Configuration management databases
 
To get started with an analysis, an eventlog is needed containing individual events as rows and at least following three properties as column: Case id, timestamp and event type. The eventlog can optionally contain event attributes such as sales person, location, customer, sale amount, time stamp for start and end of the event.
 
As the security of the data is essential, the architecture is built so that the data is protected when collected from the source. The data can be fetched from any source using integration interfaces whether they are located on premise or in the cloud. For example, connectors to ODBC, OLEDB, ADO.net, and SOAP/REST Web Services are available ([[SQL_Scripting_Commands|more information]]).
 
== Authentication, Authorization and Security Model ==
QPR ProcessAnalyzer supports the following methods for authenticating users:
* '''QPR ProcessAnalyzer built-in authentication''': QPR ProcessAnalyzer has its own user management which can provide password authentication for users. For increased security, there is a mechanism in place to prevent [[User_Session_Management#Preventing password guessing attacks|brute-force password guessing attacks]] against QPR ProcessAnalyzer built-in authentication.
* '''SAML authentication (federated authentication)''': To use organization's own identity management system (IdP), QPR ProcessAnalyzer can be integrated with it using the SAML 2.0 protocol. SAML 2.0 is also the optimal solution for QPR Cloud (more information about [[SAML_2.0_Federated_Authentication|SAML authentication]]).
* '''LDAP/AD authentication''': QPR ProcessAnalyzer can use the traditional AD/LDAP authentication, although the SAML authentication as more flexible is the recommended option (more information about [[LDAP/AD_Authentication|LDAP/AD authentication]].)
* '''Snowflake authentication''': QPR ProcessAnalyzer user management is integrated into Snowflake user management. Authenticated Snowflake users are directly logged in to QPR ProcessAnalyzer when a matching username is found in the QPR ProcessAnalyzer user management. Note that QPR ProcessAnalyzer groups and Snowflake roles are not connected, so QPR ProcessAnalyzer user management is still used to manage user permissions. This method is only available when running QPR ProcessAnalyzer as [[QPR_ProcessAnalyzer_System_Architecture#Snowflake_Native_Application|Snowflake Native Application]].
 
QPR ProcessAnalyzer has its own robust and flexible authorization mechanisms that controls user's access to data. Permissions are managed for individual projects, which consist of models, datatables, dashboards and scripts. When there are lot of users, it's recommended to use ''groups'' to make user management easier: when permissions have been configured to groups, users can be easily assigned to groups. More information about [[Roles_and_Permissions|permissions]] and [[User_Session_Management|user session management]].
 
QPR ProcessAnalyzer also supports case level permissions, allowing to determine permissions for each case separately based on flexible rules and the attribute data in the eventlog. You can for example assign permissions for cases from each company code for persons working in the respective area. More information about [[Case_Level_Permissions|case level permissions]].
 
For connection between QPR ProcessAnalyzer Server and SQL Server database, the system uses one connection string and a shared connection pool with a single SQL Server user (login) for all database operations. For easy setup, the database user can use the ''db_owner'' role, but the [[QPR_ProcessAnalyzer_Security_Hardening#Database_User_Least_Privileges|hardened configuration]] complying with the minimum privilege principle is recommended.
 
Credentials for the data sources can be stored as [[Storing_Secrets_for_Scripts|secrets]], which allow to store passwords and other confidential data in a secure way, that the stored credentials cannot be seen by any user.
 
== QPR ProcessAnalyzer REST API ==
All connections to QPR ProcessAnalyzer Server are established through the [[QPR ProcessAnalyzer API]]. For example, the QPR ProcessAnalyzer Web UI and the [[QPR_ProcessAnalyzer_ScriptLauncher|QPR ProcessAnalyzer ScriptLauncher]] are using it, and also any external integrations can use it to communicate with the QPR ProcessAnalyzer system.
 
== Version compatibility ==
As a general rule, QPR ProcessAnalyzer releases are compatible with each other with the backwards compatibility principle where older files or databases are compatible with newer QPR ProcessAnalyzer releases. If a version number has increased, the newer files or databases are not compatible with older QPR ProcessAnalyzer releases. There are the following version information:
*'''Dashboard file version''': Dashboards can be imported as .qprpa files with each file having a version number. Dashboard file can be imported, if the file has been exported with QPR ProcessAnalyzer using same or earlier file version than in the importing system. However, dashboard exported using QPR ProcessAnalyzer that has later dashboard file version than in the importing environment, the import cannot be done. The dashboard version can be checked by opening the dashboard file in a text editor (contents is a cleartext json) and searching the ''version'' property.
*'''PACM file version''': PACM file can be imported, if the file has been exported with QPR ProcessAnalyzer using same or earlier PACM file version than in the importing system. However, model exported using QPR ProcessAnalyzer that has later PACM file version than in the importing environment, the import cannot be done.
*'''Database major version''' increases when the database structure changes in a way that the backwards compatibility to earlier releases is lost. Thus, database major version used by QPR ProcessAnalyzer Server must match with the major version of the QPR ProcessAnalyzer database (in the SQL Server). If the major version of the database is lower, QPR ProcessAnalyzer Server upgrades the database to the same major and minor database version as the server. If the database has a newer major version than the QPR ProcessAnalyzer Server, that database cannot be used with that server release.
*'''Database minor version''' is changed when the database structure changes in a way that the earlier QPR ProcessAnalyzer Server releases maintain compatibility with the new database schema (within the same database major version).
 
[[Category: QPR ProcessAnalyzer]]

Latest revision as of 00:20, 10 October 2024

QPR ProcessAnalyzer is natively a cloud-based software, but also an on-premise installation option is available. QPR ProcessAnalyzer can be used with desktops, laptops and tablets with web browser.

System Architecture and Components

The following diagram shows the high level system architecture for QPR ProcessAnalyzer with its main components.

QPR ProcessAnalyzer System Architecture.png

QPR ProcessAnalyzer consists of the following components:

  • QPR ProcessAnalyzer Web UI: Web UI is a web browser based user interface for QPR ProcessAnalyzer using HTML5 and Angular as technologies. The Web UI works with all modern browsers without separate add-ons. For more information, see the list of supported browser.
  • QPR ProcessAnalyzer Server: QPR ProcessAnalyzer Server is a component running in the cloud (or server side). When using Snowflake, QPR ProcessAnalyzer Server creates SQL queries for the needed calculations and sends them to Snowflake for execution. When using in-memory processing, QPR ProcessAnalyzer Server has the eventlog data in-memory where it can be processed using QPR ProcessAnalyzer's expression language optimized for process mining calculations. QPR ProcessAnalyzer Server also manages users, sessions and other system configurations. QPR ProcessAnalyzer Server is based on Microsoft .Net 8 and it runs in Windows Server (hosted by IIS) or Linux (hosted by Kestrel).
  • QPR ScriptLauncher: ScriptLauncher is a tool to trigger QPR ProcessAnalyzer scripts used for ETL (extract, transform and load). The ScriptLauncher is installed on-premise, to extract data from on-premise systems and store to the QPR ProcessAnalyzer cloud. ScriptLauncher will run scripts managed in QPR ProcessAnalyzer, executing the data extraction commands to the on-premise systems.
  • Snowflake data cloud: Snowflake data cloud is an external cloud service offered by Snowflake Inc.. When integrated with QPR ProcessAnalyzer, Snowflake can store datatables, perform process mining analytics calculations, and run ETL transformations.
  • SQL Server metadata database: This is the system database used by QPR ProcessAnalyzer Server, storing e.g., projects, dashboards, scripts, users and system configurations. SQL Server is supported as the database system and connection to the database uses ADO.Net.
  • SQL Server datatables database: This database stores datatable contents managed by QPR ProcessAnalyzer. The datatables contain events and cases data for the process mining models and data processed by the ETL scripts. SQL Server is supported as the database system and connection to the database uses ADO.Net. When using Snowflake, this database is not needed, as the datatables are stored in the Snowflake.
  • SQL Server scripting database: Scripting database is used to run SQL commands in the ETL scripts. The scripting database can be configured to store data only temporarily during the script runs (for the permanent storage datatables stored to the QPR ProcessAnalyzer Server Database can be used). For custom usecases, the scripting database can also have write access to store data permanently.
  • (SQL Server tempdb): SQL Server has a database called tempdb that store temporary data during processing queries. As a data intensive application, the tempdb performance has an essential role in QPR ProcessAnalyzer, e.g., the SQL scripts use the tempdb. See the system requirements section for the tempdb sizing. More information about tempdb: https://docs.microsoft.com/en-us/sql/relational-databases/databases/tempdb-database?view=sql-server-ver15.

Snowflake-Powered Calculation

In addition to the in-memory processing, process mining queries for dashboards can be calculated in Snowflake (https://snowflake.com), enabling remarkable scaling in performance with very large datasets and number of simultaneous users. To use the Snowflake processing, an account to Snowflake is required and QPR ProcessAnalyzer needs to be connected to the account using an ODBC connection (more information). Any cloud platforms provided by Snowflake can be used, but for optimal performance, prefer a site that is near to the QPR ProcessAnalyzer environment.

When the Snowflake connection is available in a QPR ProcessAnalyzer environment, users can create datatables that store their data in tables located in Snowflake. When creating a process mining model using the Snowflake stored events and cases datatables, process mining calculations are performed by the Snowflake warehouses. The snowflake processing has the same capabilities (and even more) than the in-memory processing, but the expression language syntax is different and thus there is a different set of ready-made KPI's available when creating dashboards.

The following diagram shows how QPR ProcessAnalyzer interacts with Snowflake in the process mining query processing and also options for the ETL data flows:

QPR ProcessAnalyzer integration to Snowflake.PNG

Run as Snowflake Native Application

QPR ProcessAnalyzer is a available as the Snowflake Native Application. In Snowflake, the architecture is more simple because all components are running in the Snowflake Data Cloud. Following diagram shows the overall architecture:

QPR ProcessAnalyzer in Snowpark Container Services architecture.png

In Snowflake, QPR ProcessAnalyzer Server is running in the container hosted by the Snowpark Container Services. The container also has the SQL Server Express where the configuration data (metadata database) is stored. The static files for the web frontend are also located in the container.

QPR ProcessAnalyzer user management is integrated into Snowflake, so authenticated Snowflake users are automatically logged in to QPR ProcessAnalyzer by matching a username. When viewing dashboards and analyses, the process mining analytics calculations are performed in the Snowflake virtual warehouse. QPR ProcessAnalyzer Server running in the container doesn't handle the eventlog data at all. Calculation results shown to the user, are temporary cached to QPR ProcessAnalyzer Server and the caches are cleared when the container service shuts down.

Following components and features are not available when running QPR ProcessAnalyzer as Snowflake Native Application:

  • SQL Server datatable database
  • SQL scripts and SQL Server scripting database
  • QPR ScriptLauncher
  • Data extraction from external datasources

Connecting to Data Sources

QPR ProcessAnalyzer is designed and built for easy integration to a wide range of data sources to fetch the eventlogs data. The power of process mining comes from collecting process information from differences sources to a comprehensive process mining model, where it can be analyzed from any angle. The available data sources can include for example:

  • ERP systems, e.g. SAP HANA and SAP R/3 (for Order-to-Cash, Purchase-to-Pay processes etc.)
  • CRM systems, e.g. Salesforce (for sales process)
  • Customer support systems, e.g. Jira
  • Case management systems
  • Supply chain management systems
  • Configuration management databases

To get started with an analysis, an eventlog is needed containing individual events as rows and at least following three properties as column: Case id, timestamp and event type. The eventlog can optionally contain event attributes such as sales person, location, customer, sale amount, time stamp for start and end of the event.

As the security of the data is essential, the architecture is built so that the data is protected when collected from the source. The data can be fetched from any source using integration interfaces whether they are located on premise or in the cloud. For example, connectors to ODBC, OLEDB, ADO.net, and SOAP/REST Web Services are available (more information).

Authentication, Authorization and Security Model

QPR ProcessAnalyzer supports the following methods for authenticating users:

  • QPR ProcessAnalyzer built-in authentication: QPR ProcessAnalyzer has its own user management which can provide password authentication for users. For increased security, there is a mechanism in place to prevent brute-force password guessing attacks against QPR ProcessAnalyzer built-in authentication.
  • SAML authentication (federated authentication): To use organization's own identity management system (IdP), QPR ProcessAnalyzer can be integrated with it using the SAML 2.0 protocol. SAML 2.0 is also the optimal solution for QPR Cloud (more information about SAML authentication).
  • LDAP/AD authentication: QPR ProcessAnalyzer can use the traditional AD/LDAP authentication, although the SAML authentication as more flexible is the recommended option (more information about LDAP/AD authentication.)
  • Snowflake authentication: QPR ProcessAnalyzer user management is integrated into Snowflake user management. Authenticated Snowflake users are directly logged in to QPR ProcessAnalyzer when a matching username is found in the QPR ProcessAnalyzer user management. Note that QPR ProcessAnalyzer groups and Snowflake roles are not connected, so QPR ProcessAnalyzer user management is still used to manage user permissions. This method is only available when running QPR ProcessAnalyzer as Snowflake Native Application.

QPR ProcessAnalyzer has its own robust and flexible authorization mechanisms that controls user's access to data. Permissions are managed for individual projects, which consist of models, datatables, dashboards and scripts. When there are lot of users, it's recommended to use groups to make user management easier: when permissions have been configured to groups, users can be easily assigned to groups. More information about permissions and user session management.

QPR ProcessAnalyzer also supports case level permissions, allowing to determine permissions for each case separately based on flexible rules and the attribute data in the eventlog. You can for example assign permissions for cases from each company code for persons working in the respective area. More information about case level permissions.

For connection between QPR ProcessAnalyzer Server and SQL Server database, the system uses one connection string and a shared connection pool with a single SQL Server user (login) for all database operations. For easy setup, the database user can use the db_owner role, but the hardened configuration complying with the minimum privilege principle is recommended.

Credentials for the data sources can be stored as secrets, which allow to store passwords and other confidential data in a secure way, that the stored credentials cannot be seen by any user.

QPR ProcessAnalyzer REST API

All connections to QPR ProcessAnalyzer Server are established through the QPR ProcessAnalyzer API. For example, the QPR ProcessAnalyzer Web UI and the QPR ProcessAnalyzer ScriptLauncher are using it, and also any external integrations can use it to communicate with the QPR ProcessAnalyzer system.

Version compatibility

As a general rule, QPR ProcessAnalyzer releases are compatible with each other with the backwards compatibility principle where older files or databases are compatible with newer QPR ProcessAnalyzer releases. If a version number has increased, the newer files or databases are not compatible with older QPR ProcessAnalyzer releases. There are the following version information:

  • Dashboard file version: Dashboards can be imported as .qprpa files with each file having a version number. Dashboard file can be imported, if the file has been exported with QPR ProcessAnalyzer using same or earlier file version than in the importing system. However, dashboard exported using QPR ProcessAnalyzer that has later dashboard file version than in the importing environment, the import cannot be done. The dashboard version can be checked by opening the dashboard file in a text editor (contents is a cleartext json) and searching the version property.
  • PACM file version: PACM file can be imported, if the file has been exported with QPR ProcessAnalyzer using same or earlier PACM file version than in the importing system. However, model exported using QPR ProcessAnalyzer that has later PACM file version than in the importing environment, the import cannot be done.
  • Database major version increases when the database structure changes in a way that the backwards compatibility to earlier releases is lost. Thus, database major version used by QPR ProcessAnalyzer Server must match with the major version of the QPR ProcessAnalyzer database (in the SQL Server). If the major version of the database is lower, QPR ProcessAnalyzer Server upgrades the database to the same major and minor database version as the server. If the database has a newer major version than the QPR ProcessAnalyzer Server, that database cannot be used with that server release.
  • Database minor version is changed when the database structure changes in a way that the earlier QPR ProcessAnalyzer Server releases maintain compatibility with the new database schema (within the same database major version).