Snowflake Connection Configuration

This page describes the native method to use Snowflake where the process mining queries are run in the Snowflake and the eventlog data stays in the Snowflake. There are two steps to configure: setup a Snowflake account and configure QPR ProcessAnalyzer to use the account through an ODBC connection.

QPR ProcessAnalyzer needs one Snowflake schema where it can create functions and store cache tables, and write permissions are needed to that schema. QPR ProcessAnalyzer can use any tables in Snowflake as eventlogs source data, requiring at least read permission to the tables.

Snowflake account configuration

These instructions provide one scenario to configure Snowflake for QPR ProcessAnalyzer. Configuration can also be done differently, e.g., if there is other use for the same Snowflake account.

For permissions, these instructions are based on the database and schema ownership, giving all permissions to the objects. The next chapter provides instruction for the minimum permissions.

Create a new Snowflake account in the Snowflake site: https://www.snowflake.com/ (or use an existing account). The account is created to the selected cloud platform and site, so consider a location that is close to the QPR ProcessAnalyzer hosting site.
Go to https://app.snowflake.com and login in to your Snowflake account. Change the role to ACCOUNTADMIN.
In Account > Roles, click the Role button to create a new role. Define Name QPRPA, and click Create Role.
In Account > Users, click the User button to create a new user. Define User Name QPRPA and define a strong password. Check also that the Force user to change password on first time login is disabled. Click the Create User button.
Select the created user, and click the Grant Role button. In the Role to grant list, select QPRPA, and click Grant.
In Compute > Warehouses, click the Warehouse button to create a new warehouse. Define Name QPRPA, and select a suitable Size for the warehouse. In the Advanced Warehouse Options, check that Auto Resume and Auto Suspend are enabled. Set the Suspend After time based on your performance requirements (for example 3 minutes is a suitable value). Click Create Warehouse.
Select the created warehouse and in the Privileges section and click the Privilege button. For Role, select the QPRPA role. For Privileges, select MONITOR, OPERATE and USAGE. Click the Grant Privileges button.
In Data > Databases, click the Database button to create a new database. Define Name QPRPA, and click the Create button.
Set the role as the owner of the database by selecting Transfer Ownership for the database in the menu and selecting the QPRPA role.
Select the created database, and click the Schema button to create a new schema. Define Name QPRPA, and click the Create button.
Set the role as the owner of the schema by selecting Transfer Ownership for the schema in the menu and selecting the QPRPA role.
(Optional) If the connection to Snowflake is lost for any reason, QPR ProcessAnalyzer may not be able to cancel pending queries. To get pending queries automatically cancelled to save costs, it's advisable to change the ABORT_DETACHED_QUERY session setting to true: https://docs.snowflake.com/en/sql-reference/parameters.html#abort-detached-query.
(Optional) Running a Snowflake warehouse consumes credits in the Snowflake account, and thus it's a good practice to set resource monitoring to control the credit usage: In Compute > Resource Monitors, click the Resource Monitor button to create a new resources monitor. Choose suitable monitor settings for your needs, and click the Create Resource Monitor button.

Set Snowflake ODBC connection

To add the Snowflake ODBC connection to QPR ProcessAnalyzer, following steps are required:

Install Snowflake ODBC driver in the machine running QPR ProcessAnalyzer Server. More information about the ODBC driver installation: https://docs.snowflake.com/en/user-guide/odbc.html.
Configure Snowflake ODBC connection string to the QPR ProcessAnalyzer configuration table. When configuring Snowflake as instructed above, the following connection string can be used:

Driver={SnowflakeDSIIDriver};Application=QPR_ProcessAnalyzer;Server=<account_identifier>.snowflakecomputing.com;Database=QPRPA;Schema=QPRPA;Warehouse=QPRPA;Role=QPRPA;query_timeout=3600;uid=QPRPA;pwd=<password>

Replace <password> with the password of the QPRPA user. In addition, replace <account_identifier> with the account identifier of your Snowflake account. More information about finding your Snowflake account identifier: https://docs.snowflake.com/en/user-guide/admin-account-identifier.html.

Please also include the Application tag to the connection string, identifying QPR ProcessAnalyzer related usage for collecting usage statistics.

More information about Snowflake ODBC connection strings: https://docs.snowflake.com/en/user-guide/odbc-parameters.html

Note: By default the Snowflake ODBC driver generates quite much logging to files which might unnecessarily fill up the disk space. It's recommended to set the LogLevel parameter to value 0. The path where the logs are created can be seen in Windows registry (key LogPath). More information about configuring Snowflake ODBC parameters: https://docs.snowflake.com/en/developer-guide/odbc/odbc-parameters.

Test connection string

The connection string can be tested in the QPR ProcessAnalyzer UI to make sure it works before configuring it. The following expression tests a connection string, and if the connection works, it returns the Snowflake user name:

ImportOdbc(
  `...`,
  "SELECT CURRENT_USER()"
).toCsv();

If an error dialog is shown, connection to Snowflake couldn't be made and thus the connection string doesn't work.

Query timeout

Query timeout is important for system responsiveness and cost management because for example an exceptionally heavy query might block the warehouse. If a running query reaches the defined timeout, it will be stopped automatically releasing the block.

There are timeouts both in QPR ProcessAnalyzer and Snowflake. QPR ProcessAnalyzer query timeout is by default 5 minutes which will send a cancellation to Snowflake. Sometimes the cancellation may not be processd successfully and the query won't be stopped in Snowflake. Thus, the Snowflake side timeout is also important.

Snowflake has two timeouts: query timeout (by default 2 days) and queue timeout (by default disabled). Query timeout is counted during the time a query is running in the warehouse, and the queue timeout when a query is waiting for an available warehouse. It's recommended to set both these timeouts to same value as the QPR ProcessAnalyzer timeout. Snowflake timeouts can be set for the warehouse or the service user (in the ODBC connection string). If the warehouse has other usage than the QPR ProcessAnalyzer queries, it might be a better option to set the timeout to the user.

Example: set Snowflake timeouts for the warehouse to 5 minutes:

ALTER WAREHOUSE <warehousename> SET STATEMENT_TIMEOUT_IN_SECONDS = 300;
ALTER WAREHOUSE <warehousename> SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300;

More information:

Hardened security with key-pair authentication

Instead of using password for the Snowflake ODBC connection, it's more secure to use the key-pair authentication available in Snowflake. The idea is that a key-pair is generated (public key and private key), the public key is stored in Snowflake cloud, and the private key stored in QPR ProcessAnalyzer. QPR ProcessAnalyzer will securely authenticate to Snowflake by proving that it has the private key corresponding to the public key. To setup the key-pair authentication, follow these steps: https://docs.snowflake.com/en/user-guide/key-pair-auth.

When the key-pair authentication is used, the following parameters is added to the connection string (and pwd parameter won't be used):

Authenticator: Use value "SNOWFLAKE_JWT".
PRIV_KEY_FILE: The private key file location in the local disk.
PRIV_KEY_FILE_PWD: Password for the private key file (this is not the Snowflake user password, but it's for opening the encrypted private key file).

Minimal Snowflake permissions

The above instructions configure the Snowflake user as the database and schema owner, giving all permissions to the database and schema. If required, it's possible to configure the Snowflake user with less permissions. The minimal permissions for the database and schema configured to the Snowflake connection string are as follows (concerns both the global and project-level).

Database needs following permission:

USAGE

Schema needs following permissions:

USAGE
CREATE TABLE (for creating cache tables and being able to ingest data through QPR ProcessAnalyzer UI)
CREATE VIEW (temporarily views are used to determine column data types)
CREATE FUNCTION (Python functions are needed for business calendar, conformance calculation, and clustering)
CREATE STAGE (internal table stages used when ingesting data)
CREATE FILE FORMAT (for reading data from stage when ingesting data)
CREATE SNOWFLAKE.ML.TOP_INSIGHTS (for creating the TOP_INSIGHTS object to be able to run the Snowflake Top Insights used by the root causes analysis)

If data is not imported using QPR ProcessAnalyzer but using Snowflake's own tools, the CREATE STAGE and CREATE FILE FORMAT permissions are not required.

When QPR ProcessAnalyzer creates Snowflake objects to the schema (e.g., tables, functions and file format), the Snowflake user will get ownership to them (and thus full permissions).

QPR ProcessAnalyzer can use any table or view in Snowflake, and the following minimal permissions are required:

Database: USAGE
Schema : USAGE
Table/view: SELECT

Snowflake Top Insights privilege

Using the root causes analysis powered by the Snowflake Top Insights requires a TOP_INSIGHTS schema-level object in the schema where the Snowflake connection string is pointing to. QPR ProcessAnalyzer will automatically create the TOP_INSIGHTS object which requires the SNOWFLAKE.ML.TOP_INSIGHTS privilege on the schema where the instance is created.

The privilege can be granted with the following SQL command:

GRANT CREATE SNOWFLAKE.ML.TOP_INSIGHTS ON SCHEMA <database>.<schema> TO ROLE <role>;

If not desired that grant this privilege to the Snowflake service user used by QPR ProcessAnalyzer, the TOP_INSIGHTS object can be created beforehand as follows:

CREATE SNOWFLAKE.ML.TOP_INSIGHTS QPRPA_TOP_INSIGHTS();

If the Snowflake service user used by QPR ProcessAnalyzer is not the owner of the TOP_INSIGHTS object, the user must have the USAGE privilege on it for QPR ProcessAnalyzer to use it.

Datatable key for events datatable

Analytics queries executed by QPR ProcessAnalyzer in Snowflake require to know key column(s) for each table. The key columns mean that the combination of values in the key columns are unique for each row in the table. For cases datatable, the case id column is assumed to be the key. For the events datatable, there is no key information available by default, so for the events datatable the key column(s) need to be defined in the Datatable properties dialog. If the key columns are not specified, all columns are assumed to be the key (which is the slowest option from the performance viewpoint). Note that defining the key columns incorrectly, may lead to incorrect calculation results.

If there are no suitable columns for keys, a new column containing a sequence number can be added. A sequence can be created as follows:

CREATE OR REPLACE SEQUENCE seq1;

Snowflake table can be set to use the sequence as follows:

CREATE OR REPLACE TABLE foo (a NUMBER DEFAULT seq1.NEXTVAL, b NUMBER);

When new data is imported to the table, new rows automatically get a unique sequence number.

Connecting to Snowflake tables

When a datatable is created in QPR ProcessAnalyzer, corresponding table is created in Snowflake storing the actual data. The table is not yet created when a new datatable is added, but when there is at least one column in the datatable. When using default names, Snowflake tables created by QPR ProcessAnalyzer are named as follows: qprpa_dt_<databaseId>_<datatableId>, where <databaseId> identifies the QPR ProcessAnalyzer environment and <datatableId> identifies the datatable within the environment. The database id is set in the QPR ProcessAnalyzer configuration table. Datatables can also be configured to use user-defined table, schema or database. The custom table behaves similarly as the default table, for example columns can be modified and data imported, but when the datatable is deleted, the custom-named table is not dropped in Snowflake.

Datatables can managed both in the QPR ProcessAnalyzer web UI and in the Snowflake. When datatables are modified in the Snowflake, a synchronization (see see synchronize function) is needed in QPR ProcessAnalyzer to check whether the underlying data in Snowflake has changed. Note that while the tables can be modified in Snowflake, new datatables don't appear to QPR ProcessAnalyzer until they have been created in the QPR ProcessAnalyzer UI. When the underlying table in the Snowflake is deleted all rows and columns appear to be removed from the datatable, but the datatable itself is not deleted. Schemas and databases are not created or modified by QPR ProcessAnalyzer (just tables).

Connecting to Snowflake views

In addition to tables, the datatables can be connected to Snowflake views (https://docs.snowflake.com/en/user-guide/views-introduction.html). When a process mining query is made to the cases and events datatables, the queries in the linked Snowflake views are executed as part of the process mining queries. Note that there cannot be a table and a view with the same name in the same schema, so the table may need to be removed before a view can be created.

New view can be created in Snowflake as follows (where ViewName is the name of the view):

CREATE OR REPLACE VIEW "ViewName" AS
SELECT * FROM VALUES (1, 'red'), (2, 'orange'), (3, 'yellow'), (4, 'green') AS Colors(Id, Value)

The SELECT part in the above statement can be any query allowed by the Snowflake. The query can also contain data transformations which means that those transformation will be performed when the dashboards are used in QPR ProcessAnalyzer. This is following the ELT (extract, load, transform) principle, where the non-transformed data is loaded into the system and needed transformations are performed when the data is used by QPR ProcessAnalyzer. Note that there is a negative impact on the performance when performing transformations on demand.

When a datatable with a Snowflake view is deleted, the view in the datasource is not deleted. For datatables with Snowflake view, following data modification operation are not allowed: Import, Persist, AddColumn, Merge, RemoveColumns, RenameColumns, and Truncate.

Note that the query in the view is executed using permissions of the user account defined in the ODBC connection string, and thus the permissions are managed in Snowflake. Still, the normal QPR ProcessAnalyzer datatable permissions are applied for datatable with Snowflake views, so if user cannot see the datatable, user cannot access the data in the view either.

Finding query from Snowflake query history

If there are performance issues in running Snowflake queries, here are instructions how to find the corresponding query from the Snowflake query history (https://docs.snowflake.com/en/sql-reference/account-usage/query_history). First look at the QPR ProcessAnalyzer task log to find the problematic query there. Add Request id as a column to the table to see the request id of the problematic query. Go to the Snowflake query history and Filter queries based on the SQL Text by searching with the request id. The request id appears in the end of the SQL query as a comment, so it's easy to find the corresponding Snowflake query (or queries).

Following can be checked from the problematic Snowflake query:

Has any time spent in the queue. If there is queuing, it indicates that there are too many queries for the warehouse to handle at the same time. Solution is to use the multi-cluster warehouses which can scale based on the load.
Are there any other simultaneously running queries in the same warehouse (which may have slowed down the execution).
Try rerunning the query to check whether the issue can be replicated.

Automatic data update detection

When data in Snowflake tables changes (data is added, modified or removed), QPR ProcessAnalyzer automatically detects that data has updated and performs calculations using the latest data. Changing of data is checked every 5 minutes, so it takes maximum of that time for new data to appear in the dashboards. QPR ProcessAnalyzer has different levels of caches (to improve performance) explaining why the new data might not immediately appear in the dashboards. When the change of data is detected, the caches are cleared and that calculations are performed using the latest data in Snowflake.

Automatic query cancellation

If the result of a pending query running in Snowflake is not needed anymore, the query is automatically cancelled by QPR ProcessAnalyzer, to save costs and reserve computing capacity for new queries. Queries are cancelled, e.g., when user makes filtering in a dashboard or changes chart settings, because the new filter or settings will make an updated query and the old query (if still pending) is cancelled.

Alternative traditional ODBC Import method

Alternatively, the traditional method can be used to import data from Snowflake to QPR ProcessAnalyzer and use the QPR ProcessAnalyzer's in-memory calculation engine. More information about importing data from an ODBC datasource:

Snowflake Cortex

Snowflake Cortex LLMs are available for environments that use Snowflake models.
See here for Snowflake Cortex LLM availability: https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions#availability
Additionally, access to LLMs that are not available on your region can be enabled by the cross-region inference: https://docs.snowflake.com/en/user-guide/snowflake-cortex/cross-region-inference

Snowflake Connection Configuration

Contents

Snowflake account configuration

Set Snowflake ODBC connection

Test connection string

Query timeout

Hardened security with key-pair authentication

Minimal Snowflake permissions

Snowflake Top Insights privilege

Datatable key for events datatable

Connecting to Snowflake tables

Connecting to Snowflake views

Finding query from Snowflake query history

Automatic data update detection

Automatic query cancellation

Alternative traditional ODBC Import method

Snowflake Cortex

Navigation menu

Snowflake Connection Configuration

Snowflake account configuration

Set Snowflake ODBC connection

Test connection string

Query timeout

Hardened security with key-pair authentication

Minimal Snowflake permissions

Snowflake Top Insights privilege

Datatable key for events datatable

Connecting to Snowflake tables

Connecting to Snowflake views

Finding query from Snowflake query history

Automatic data update detection

Automatic query cancellation

Alternative traditional ODBC Import method

Snowflake Cortex

Navigation menu

Search