Snowflake Connection Configuration

This page describes the native method of using Snowflake, where the process mining queries are run in Snowflake and the eventlog data stays in Snowflake. There are two configuration steps: set up a Snowflake account and configure QPR ProcessAnalyzer to use the account through an ODBC connection.

Alternatively, the traditional method can be used to import data from Snowflake into QPR ProcessAnalyzer and run the calculations in QPR ProcessAnalyzer's in-memory calculation engine. More information is available in the documentation on importing data from an ODBC datasource.

Snowflake account configuration

These instructions provide one possible way to configure the Snowflake account for QPR ProcessAnalyzer. The configuration can also be done differently based on your requirements, e.g., if the same Snowflake account has other uses. The optional steps below grant the Snowflake user account the least possible permissions (principle of least privilege). The same configuration is also sketched as SQL commands after the list.

  1. Create a new Snowflake account on the Snowflake website: https://www.snowflake.com/ (or use an existing account). The account is created in the selected cloud platform and region, so choose a location that is close to the QPR ProcessAnalyzer hosting site.
  2. Go to https://app.snowflake.com and log in to your Snowflake account.
  3. In Account > Roles, click the Role button to create a new role. Define Name QPRPA, and click Create Role.
  4. In Account > Users, click the User button to create a new user. Define User Name QPRPA and a strong password. Also check that Force user to change password on first time login is disabled. Click the Create User button.
  5. Select the created user, and click the Grant Role button. In the Role to grant list, select QPRPA, and click Grant.
  6. In Compute > Warehouses, click the Warehouse button to create a new warehouse. Define Name QPRPA, and select a suitable Size for the warehouse. In the Advanced Warehouse Options, check that Auto Resume and Auto Suspend are enabled. Set the Suspend After time based on your performance requirements (for example 5 minutes is a good initial value). Click Create Warehouse.
  7. (Optional security hardening) Select the created warehouse, and in the Privileges section, click the Privilege button. For Role, select the QPRPA role. For Privileges, select MONITOR, OPERATE and USAGE. Click the Grant Privileges button.
  8. In Data > Databases, click the Database button to create a new database. Define Name QPRPA, and click the Create button.
  9. (Optional security hardening) Select the created database, and in the Privileges section, click the Privilege button. For Role, select the QPRPA role. For Privileges, select USAGE. Click Grant Privileges.
  10. Select the created database, and click the Schema button to create a new schema. Define Name QPRPA, and click the Create button.
  11. (Optional security hardening) Select the created schema, and in the Privileges section, click the Privilege button. For Role, select the QPRPA role. For Privileges, select CREATE VIEW, CREATE TABLE, CREATE FILE FORMAT and USAGE. Click the Grant Privileges button.
  12. (Optional) Running a Snowflake warehouse consumes credits in the Snowflake account, so it's good practice to set up resource monitoring to control the credit usage: In Compute > Resource Monitors, click the Resource Monitor button to create a new resource monitor. Choose suitable monitor settings for your needs, and click the Create Resource Monitor button.
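
For reference, the same configuration can also be created with SQL commands in a Snowflake worksheet. The following is only a sketch that assumes a session with sufficient privileges (e.g. ACCOUNTADMIN); the warehouse size, auto-suspend time and password are placeholder values to be adjusted to your needs:

-- Role and user for QPR ProcessAnalyzer
CREATE ROLE QPRPA;
CREATE USER QPRPA PASSWORD = '<password>' MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE QPRPA TO USER QPRPA;
-- Warehouse with auto-resume and auto-suspend (300 seconds = 5 minutes)
CREATE WAREHOUSE QPRPA WAREHOUSE_SIZE = 'XSMALL' AUTO_RESUME = TRUE AUTO_SUSPEND = 300;
GRANT MONITOR, OPERATE, USAGE ON WAREHOUSE QPRPA TO ROLE QPRPA;
-- Database and schema with the minimum privileges needed by QPR ProcessAnalyzer
CREATE DATABASE QPRPA;
GRANT USAGE ON DATABASE QPRPA TO ROLE QPRPA;
CREATE SCHEMA QPRPA.QPRPA;
GRANT CREATE VIEW, CREATE TABLE, CREATE FILE FORMAT, USAGE ON SCHEMA QPRPA.QPRPA TO ROLE QPRPA;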

Set Snowflake ODBC connection

To add the Snowflake ODBC connection to QPR ProcessAnalyzer, the following steps are required:

  1. Install the Snowflake ODBC driver on the machine running QPR ProcessAnalyzer Server. More information about the ODBC driver installation: https://docs.snowflake.com/en/user-guide/odbc.html.
  2. Configure the Snowflake ODBC connection string in the QPR ProcessAnalyzer configuration table. When Snowflake is configured as instructed above, the following connection string can be used:
Driver={SnowflakeDSIIDriver};Application=QPR_ProcessAnalyzer;Server=<account_identifier>.snowflakecomputing.com;Database=QPRPA;Schema=QPRPA;Warehouse=QPRPA;Role=QPRPA;uid=QPRPA;pwd=<password>

Replace <password> with the password of the QPRPA user. In addition, replace <account_identifier> with the account identifier of your Snowflake account. More information about finding your Snowflake account identifier: https://docs.snowflake.com/en/user-guide/admin-account-identifier.html.

Please also include the Application parameter in the connection string, as it identifies QPR ProcessAnalyzer related usage for collecting usage statistics.

More information about Snowflake ODBC connection strings: https://docs.snowflake.com/en/user-guide/odbc-parameters.html

Snowflake database tables

When a datatable is created in QPR ProcessAnalyzer, a corresponding table storing the actual data is created in Snowflake. The table is not created immediately when the datatable is created, but only when the datatable has at least one column. The Snowflake tables created by QPR ProcessAnalyzer are named as follows: qprpa_dt_<databaseId>_<datatableId>, where <databaseId> identifies the QPR ProcessAnalyzer environment and <datatableId> identifies the datatable within the environment. The database id is set in the QPR ProcessAnalyzer configuration table. Due to this naming, it's possible to configure multiple QPR ProcessAnalyzer environments to use the same Snowflake database and schema. Alternatively, there can be a separate schema or database for each environment.
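
For example, the tables created by QPR ProcessAnalyzer can be listed in a Snowflake worksheet with the following query (a sketch assuming the QPRPA database and schema created above):

SHOW TABLES LIKE 'qprpa_dt_%' IN SCHEMA QPRPA.QPRPA;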

Datatables can be managed both in the QPR ProcessAnalyzer web UI and in Snowflake. When datatables are modified in Snowflake, a synchronization (see the Synchronize function) needs to be called in QPR ProcessAnalyzer to inform it that the underlying data in Snowflake has changed. Still, creating and deleting datatables are operations that are only possible in the QPR ProcessAnalyzer web UI.
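
As an illustration, data in the underlying Snowflake table could be modified directly with SQL. The following is only a sketch with hypothetical names, assuming database id 1, datatable id 123 and columns named CASE_ID and STATUS; after such a modification, the Synchronize function (shown later on this page) needs to be called for the datatable:

-- Hypothetical table and column names; adjust to your environment
UPDATE QPRPA.QPRPA.qprpa_dt_1_123 SET STATUS = 'Closed' WHERE CASE_ID = 'C-100';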

Using database views

It's possible to use any data available in Snowflake as a source for cases and events. This is done by using database views that are linked to QPR ProcessAnalyzer datatables. When the datatable is used, the query in the database view is executed as part of the usual process mining queries. The views need to be created in the same Snowflake database and schema as the tables. The table (that has been created automatically) needs to be removed before the view can be created. Note that if the datatable doesn't have any columns, the database table doesn't exist yet, and thus it's enough to create the database view. The expected name of the view can be queried using the Expression Designer with the following query (define your datatable id):

DatatableById(123).NameInDatasource
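
If the automatically created table already exists, it can be removed in Snowflake before creating the view, for example as follows (a sketch assuming the table has the same name as the one returned by the query above; replace <ViewNameInDatasource> accordingly):

DROP TABLE IF EXISTS <ViewNameInDatasource>;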

The view can be created in Snowflake as follows (replace <ViewNameInDatasource> with the result of the previous expression query):

CREATE OR REPLACE VIEW <ViewNameInDatasource> AS
SELECT * FROM VALUES (1, 'red'), (2, 'orange'), (3, 'yellow'), (4, 'green') AS Colors(Id, Value)
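
Optionally, the created view can be tested directly in Snowflake before using it in QPR ProcessAnalyzer, for example with a simple query (replace the placeholder as above):

SELECT * FROM <ViewNameInDatasource>;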

When the view has been created, the Synchronize function needs to be called to update the status (define your datatable id):

DatatableById(123).Synchronize()

The type of a datatable's datasource object can be queried as follows (define your datatable id):

DatatableById(123).DataSourceObjectType

When a datatable having a database view is deleted, the view in the datasource is not deleted (unlike the database table, which is deleted). For datatables having a database view, the following data modification operations are not allowed: Import, Persist, AddColumn, Merge, RemoveColumns, RenameColumns, and Truncate. If the data for a datatable comes from a view, the LastImportDate and LastImportBy settings are not available.
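
Because the view is not deleted automatically, it can be cleaned up manually in Snowflake when it's no longer needed, for example as follows (replace the placeholder with the view name):

DROP VIEW IF EXISTS <ViewNameInDatasource>;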