Snowflake Connection Configuration: Difference between revisions
| (40 intermediate revisions by 2 users not shown) | |||
| Line 1: | Line 1: | ||
| This page describes the ''native'' method to use Snowflake where the process mining queries are run in the Snowflake and the eventlog data stays in the Snowflake. There are two steps to configure: setup a [[#Snowflake account configuration|Snowflake account]] and configure QPR ProcessAnalyzer to use the account through an [[#Set Snowflake ODBC connection|ODBC connection]]. | This page describes the ''native'' method to use Snowflake where the process mining queries are run in the Snowflake and the eventlog data stays in the Snowflake. There are two steps to configure: setup a [[#Snowflake account configuration|Snowflake account]] and configure QPR ProcessAnalyzer to use the account through an [[#Set Snowflake ODBC connection|ODBC connection]]. | ||
| QPR ProcessAnalyzer needs one Snowflake schema where it can create functions and store cache tables, and write permissions are needed to that schema. QPR ProcessAnalyzer can use any tables in Snowflake as eventlogs source data, requiring at least read permission to the tables. | |||
| ==Snowflake account configuration == | ==Snowflake account configuration == | ||
| These instructions provide one  | These instructions provide one scenario to configure Snowflake for QPR ProcessAnalyzer. Configuration can also be done differently, e.g., if there is other use for the same Snowflake account. | ||
| For permissions, these instructions are based on the database and schema ownership, giving all permissions to the objects. The [[#Minimal Snowflake permissions|next chapter]] provides instruction for the minimum permissions. | |||
| #Create a new Snowflake account in the Snowflake site: https://www.snowflake.com/ (or use an existing account). The account is created to the selected cloud platform and site, so consider a location that is close to the QPR ProcessAnalyzer hosting site. | #Create a new Snowflake account in the Snowflake site: https://www.snowflake.com/ (or use an existing account). The account is created to the selected cloud platform and site, so consider a location that is close to the QPR ProcessAnalyzer hosting site. | ||
| # Go to https://app.snowflake.com and login in to your Snowflake account. Change the role to ACCOUNTADMIN. | #Go to https://app.snowflake.com and login in to your Snowflake account. Change the role to ACCOUNTADMIN. | ||
| #In '''Account''' > '''Roles''', click the '''Role''' button to create a new role. Define '''Name''' ''QPRPA'', and click '''Create Role'''. | #In '''Account''' > '''Roles''', click the '''Role''' button to create a new role. Define '''Name''' ''QPRPA'', and click '''Create Role'''. | ||
| # In '''Account''' > '''Users''', click the '''User''' button to create a new user. Define '''User Name''' ''QPRPA'' and define a strong password. Check also that the '''Force user to change password on first time login''' is disabled. Click the '''Create User''' button. | #In '''Account''' > '''Users''', click the '''User''' button to create a new user. Define '''User Name''' ''QPRPA'' and define a strong password. Check also that the '''Force user to change password on first time login''' is disabled. Click the '''Create User''' button. | ||
| #Select the created user, and click the '''Grant Role''' button. In the '''Role to grant''' list, select ''QPRPA'', and click '''Grant'''. | #Select the created user, and click the '''Grant Role''' button. In the '''Role to grant''' list, select ''QPRPA'', and click '''Grant'''. | ||
| #In '''Compute''' > '''Warehouses''', click the '''Warehouse''' button to create a new warehouse. Define '''Name''' ''QPRPA'', and select a suitable '''Size''' for the warehouse. In the '''Advanced Warehouse Options''', check that '''Auto Resume''' and '''Auto Suspend''' are enabled. Set the '''Suspend After''' time based on your performance requirements (for example 3 minutes is a suitable value). Click '''Create Warehouse'''. | #In '''Compute''' > '''Warehouses''', click the '''Warehouse''' button to create a new warehouse. Define '''Name''' ''QPRPA'', and select a suitable '''Size''' for the warehouse. In the '''Advanced Warehouse Options''', check that '''Auto Resume''' and '''Auto Suspend''' are enabled. Set the '''Suspend After''' time based on your performance requirements (for example 3 minutes is a suitable value). Click '''Create Warehouse'''. | ||
| #Select the created warehouse and in the '''Privileges''' section and click the '''Privilege''' button. For '''Role''', select the ''QPRPA'' role. For '''Privileges''', select '''MONITOR''', '''OPERATE''' and '''USAGE'''. Click the '''Grant Privileges''' button. | #Select the created warehouse and in the '''Privileges''' section and click the '''Privilege''' button. For '''Role''', select the ''QPRPA'' role. For '''Privileges''', select '''MONITOR''', '''OPERATE''' and '''USAGE'''. Click the '''Grant Privileges''' button. | ||
| #In '''Data''' > '''Databases''', click the '''Database''' button to create a new database. Define '''Name''' ''QPRPA'', and click the '''Create''' button. | #In '''Data''' > '''Databases''', click the '''Database''' button to create a new database. Define '''Name''' ''QPRPA'', and click the '''Create''' button. | ||
| # Set the role as the owner of the database by selecting '''Transfer Ownership''' for the database in the menu and selecting the ''QPRPA'' role. | #Set the role as the owner of the database by selecting '''Transfer Ownership''' for the database in the menu and selecting the ''QPRPA'' role. | ||
| #Select the created database, and click the '''Schema''' button to create a new schema. Define '''Name''' ''QPRPA'', and click the '''Create''' button. | #Select the created database, and click the '''Schema''' button to create a new schema. Define '''Name''' ''QPRPA'', and click the '''Create''' button. | ||
| #Set the role as the owner of the schema by selecting '''Transfer Ownership''' for the schema in the menu and selecting the ''QPRPA'' role. | #Set the role as the owner of the schema by selecting '''Transfer Ownership''' for the schema in the menu and selecting the ''QPRPA'' role. | ||
| #(Optional) If the connection to Snowflake is lost for any reason, QPR ProcessAnalyzer may not be able to cancel pending queries. To get pending queries automatically cancelled to save costs, it's advisable to change the ABORT_DETACHED_QUERY session setting to ''true'': https://docs.snowflake.com/en/sql-reference/parameters.html#abort-detached-query. | #(Optional) If the connection to Snowflake is lost for any reason, QPR ProcessAnalyzer may not be able to cancel pending queries. To get pending queries automatically cancelled to save costs, it's advisable to change the ABORT_DETACHED_QUERY session setting to ''true'': https://docs.snowflake.com/en/sql-reference/parameters.html#abort-detached-query. | ||
| #(Optional) Running a Snowflake warehouse consumes credits in the Snowflake account, and thus it's a good practice to set resource monitoring to control the credit usage: In '''Compute''' > '''Resource Monitors''', click the '''Resource Monitor''' button to create a new resources monitor. Choose suitable monitor settings for your needs, and click the '''Create Resource Monitor''' button. | #(Optional) Running a Snowflake warehouse consumes credits in the Snowflake account, and thus it's a good practice to set resource monitoring to control the credit usage: In '''Compute''' > '''Resource Monitors''', click the '''Resource Monitor''' button to create a new resources monitor. Choose suitable monitor settings for your needs, and click the '''Create Resource Monitor''' button. | ||
| == Set Snowflake ODBC connection== | == Set Snowflake ODBC connection== | ||
| Line 39: | Line 27: | ||
| #Configure Snowflake ODBC connection string to the [[PA_Configuration_database_table|QPR ProcessAnalyzer configuration table]]. When configuring Snowflake as instructed above, the following connection string can be used: | #Configure Snowflake ODBC connection string to the [[PA_Configuration_database_table|QPR ProcessAnalyzer configuration table]]. When configuring Snowflake as instructed above, the following connection string can be used: | ||
| <pre> | <pre> | ||
| Driver={SnowflakeDSIIDriver};Application=QPR_ProcessAnalyzer;Server=<account_identifier>.snowflakecomputing.com;Database=QPRPA;Schema=QPRPA;Warehouse=QPRPA;Role=QPRPA;uid=QPRPA;pwd=<password> | Driver={SnowflakeDSIIDriver};Application=QPR_ProcessAnalyzer;Server=<account_identifier>.snowflakecomputing.com;Database=QPRPA;Schema=QPRPA;Warehouse=QPRPA;Role=QPRPA;query_timeout=3600;uid=QPRPA;pwd=<password> | ||
| </pre> | </pre> | ||
| Line 49: | Line 37: | ||
| Note: By default the Snowflake ODBC driver generates quite much logging to files which might unnecessarily fill up the disk space. It's recommended to set the '''LogLevel''' parameter to value '''0'''. The path where the logs are created can be seen in Windows registry (key '''LogPath'''). More information about configuring Snowflake ODBC parameters: https://docs.snowflake.com/en/developer-guide/odbc/odbc-parameters. | Note: By default the Snowflake ODBC driver generates quite much logging to files which might unnecessarily fill up the disk space. It's recommended to set the '''LogLevel''' parameter to value '''0'''. The path where the logs are created can be seen in Windows registry (key '''LogPath'''). More information about configuring Snowflake ODBC parameters: https://docs.snowflake.com/en/developer-guide/odbc/odbc-parameters. | ||
| === Test connection string === | |||
| The connection string can be tested in the QPR ProcessAnalyzer UI to make sure it works before configuring it. The following expression tests a connection string, and if the connection works, it returns the Snowflake user name: | |||
| <pre> | |||
| ImportOdbc( | |||
|   `...`, | |||
|   "SELECT CURRENT_USER()" | |||
| ).toCsv(); | |||
| </pre> | |||
| If an error dialog is shown, connection to Snowflake couldn't be made and thus the connection string doesn't work. | |||
| == Query timeout== | |||
| Query timeout is important for system responsiveness and cost management because for example an exceptionally heavy query might block the warehouse. If a running query reaches the defined timeout, it will be stopped automatically releasing the block. | |||
| There are timeouts both in QPR ProcessAnalyzer and Snowflake. QPR ProcessAnalyzer [[PA_Configuration_database_table#QueryTimeout|query timeout]] is by default 5 minutes which will send a cancellation to Snowflake. Sometimes the cancellation may not be processd successfully and the query won't be stopped in Snowflake. Thus, the Snowflake side timeout is also important. | |||
| Snowflake has two timeouts: ''query timeout'' (by default 2 days) and ''queue timeout'' (by default disabled). Query timeout is counted during the time a query is running in the warehouse, and the queue timeout when a query is waiting for an available warehouse. It's recommended to set both these timeouts to same value as the QPR ProcessAnalyzer timeout. Snowflake timeouts can be set for the warehouse or the service user (in the ODBC connection string). If the warehouse has other usage than the QPR ProcessAnalyzer queries, it might be a better option to set the timeout to the user. | |||
| Example: set Snowflake timeouts for the warehouse to 5 minutes: | |||
| <pre> | |||
| ALTER WAREHOUSE <warehousename> SET STATEMENT_TIMEOUT_IN_SECONDS = 300; | |||
| ALTER WAREHOUSE <warehousename> SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300; | |||
| </pre> | |||
| More information: | |||
| * https://docs.snowflake.com/en/sql-reference/parameters#statement-timeout-in-seconds | |||
| * https://docs.snowflake.com/en/sql-reference/parameters#label-statement-queued-timeout-in-seconds | |||
| == Hardened security with key-pair authentication == | == Hardened security with key-pair authentication == | ||
| Instead of using password for the ODBC connection, it's more secure to use the key-pair authentication available in Snowflake. The idea is that a key-pair is generated (public key and private key), the public key is stored  | Instead of using password for the Snowflake ODBC connection, it's more secure to use the key-pair authentication available in Snowflake. The idea is that a key-pair is generated (''public key'' and ''private key''), the public key is stored in Snowflake cloud, and the private key stored in QPR ProcessAnalyzer. QPR ProcessAnalyzer will securely authenticate to Snowflake by proving that it has the private key corresponding to the public key. To setup the key-pair authentication, follow these steps: https://docs.snowflake.com/en/user-guide/key-pair-auth. | ||
| When the key-pair authentication is used, the following parameters  | When the key-pair authentication is used, the following parameters is added to the connection string (and ''pwd'' parameter won't be used): | ||
| * '''Authenticator:''' Use value "SNOWFLAKE_JWT" | * '''Authenticator:''' Use value "SNOWFLAKE_JWT". | ||
| * '''PRIV_KEY_FILE''': The private key file location in the local disk. | * '''PRIV_KEY_FILE''': The private key file location in the local disk. | ||
| * '''PRIV_KEY_FILE_PWD''': Password for the private key file. | * '''PRIV_KEY_FILE_PWD''': Password for the private key file (this is not the Snowflake user password, but it's for opening the encrypted private key file). | ||
| == Minimal Snowflake permissions == | |||
| The above instructions configure the Snowflake user as the database and schema owner, giving all permissions to the database and schema. If required, it's possible to configure the Snowflake user with less permissions. The minimal permissions for the database and schema configured to the Snowflake connection string are as follows (concerns both the [[PA_Configuration_database_table#SnowflakeConnectionString|global]] and [[QPR_ProcessAnalyzer_Project_Workspace#Project-level_Snowflake_Connection|project-level]]). | |||
| Database needs following permission: | |||
| * USAGE | |||
| Schema needs following permissions: | |||
| * USAGE | |||
| * CREATE TABLE (for creating cache tables and being able to ingest data through QPR ProcessAnalyzer UI) | |||
| * CREATE VIEW (temporarily views are used to determine column data types) | |||
| * CREATE FUNCTION (Python functions are needed for business calendar, conformance calculation, and clustering) | |||
| * CREATE STAGE (internal table stages used when ingesting data) | |||
| * CREATE FILE FORMAT (for reading data from stage when ingesting data) | |||
| If data is not imported using QPR ProcessAnalyzer but using Snowflake's own tools, the CREATE STAGE and CREATE FILE FORMAT permissions are not required. | |||
| When QPR ProcessAnalyzer creates Snowflake objects to the schema (e.g., tables, functions and file format), the Snowflake user will get ownership to them (and thus full permissions). | |||
| QPR ProcessAnalyzer can use any table or view in Snowflake, and the following minimal permissions are required: | |||
| * Database: USAGE | |||
| * Schema : USAGE | |||
| * Table/view: SELECT | |||
| ==Datatable key for events datatable== | |||
| Analytics queries executed by QPR ProcessAnalyzer in Snowflake require to know key column(s) for each table. The key columns mean that the combination of values in the key columns are unique for each row in the table. For cases datatable, the case id column is assumed to be the key. For the events datatable, there is no key information available by default, so for the events datatable the key column(s) need to be defined in the [[Datatable_Properties_Dialog|Datatable properties dialog]]. If the key columns are not specified, all columns are assumed to be the key (which is the slowest option from the performance viewpoint). Note that defining the key columns incorrectly, may lead to incorrect calculation results. | |||
| If there are no suitable columns for keys, a new column containing a sequence number can be added. A sequence can be created as follows: | |||
| <pre> | |||
| CREATE OR REPLACE SEQUENCE seq1; | |||
| </pre> | |||
| Snowflake table can be set to use the sequence as follows: | |||
| <pre> | |||
| CREATE OR REPLACE TABLE foo (a NUMBER DEFAULT seq1.NEXTVAL, b NUMBER); | |||
| </pre> | |||
| When new data is imported to the table, new rows automatically get a unique sequence number. | |||
| ==Connecting to Snowflake tables== | ==Connecting to Snowflake tables== | ||
| Line 78: | Line 131: | ||
| Note that the query in the view is executed using permissions of the user account defined in the [[#Set Snowflake ODBC connection|ODBC connection string]], and thus the permissions are managed in Snowflake. Still, the normal QPR ProcessAnalyzer datatable permissions are applied for datatable with Snowflake views, so if user cannot see the datatable, user cannot access the data in the view either. | Note that the query in the view is executed using permissions of the user account defined in the [[#Set Snowflake ODBC connection|ODBC connection string]], and thus the permissions are managed in Snowflake. Still, the normal QPR ProcessAnalyzer datatable permissions are applied for datatable with Snowflake views, so if user cannot see the datatable, user cannot access the data in the view either. | ||
| == | == Finding query from Snowflake query history == | ||
| If there are performance issues in running Snowflake queries, here are instructions how to find the corresponding query from the Snowflake query history (https://docs.snowflake.com/en/sql-reference/account-usage/query_history). First look at the QPR ProcessAnalyzer [[QPR_ProcessAnalyzer_Logs#Task_Log|task log]] to find the problematic query there. Add '''Request id''' as a column to the table to see the request id of the problematic query. Go to the Snowflake query history and '''Filter''' queries based on the '''SQL Text''' by searching with the request id. The request id appears in the end of the SQL query as a comment, so it's easy to find the corresponding Snowflake query (or queries). | |||
| Following can be checked from the problematic Snowflake query: | |||
| * Has any time spent in the queue. If there is queuing, it indicates that there are too many queries for the warehouse to handle at the same time. Solution is to use the multi-cluster warehouses which can scale based on the load. | |||
| * Are there any other simultaneously running queries in the same warehouse (which may have slowed down the execution). | |||
| * Try rerunning the query to check whether the issue can be replicated. | |||
| When new data  | ==Automatic data update detection== | ||
| When data in Snowflake tables changes (data is added, modified or removed), QPR ProcessAnalyzer automatically detects that data has updated and performs calculations using the latest data. Changing of data is checked every 5 minutes, so it takes maximum of that time for new data to appear in the dashboards. QPR ProcessAnalyzer has different levels of caches (to improve performance) explaining why the new data might not immediately appear in the dashboards. When the change of data is detected, the caches are cleared and that calculations are performed using the latest data in Snowflake. | |||
| ==Automatic query cancellation== | ==Automatic query cancellation== | ||
| Line 101: | Line 150: | ||
| *[[DataFrame_in_Expression_Language#Extract_Data_to_DataFrame|Import data from ODBC using expression scripting]] | *[[DataFrame_in_Expression_Language#Extract_Data_to_DataFrame|Import data from ODBC using expression scripting]] | ||
| *[[SQL_Scripting_Commands#--.23ImportOdbcQuery|Import data from ODBC using SQL scripting]] | *[[SQL_Scripting_Commands#--.23ImportOdbcQuery|Import data from ODBC using SQL scripting]] | ||
| == Snowflake Cortex == | |||
| Snowflake Cortex LLMs are available for environments that use [[QPR_ProcessAnalyzer_System_Architecture#Snowflake-Powered_Calculation|Snowflake models]].<br> | |||
| See here for Snowflake Cortex LLM availability: https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions#availability | |||
| <br> | |||
| Additionally, access to LLMs that are not available on your region can be enabled by the cross-region inference: https://docs.snowflake.com/en/user-guide/snowflake-cortex/cross-region-inference | |||
Latest revision as of 17:32, 3 September 2025
This page describes the native method to use Snowflake where the process mining queries are run in the Snowflake and the eventlog data stays in the Snowflake. There are two steps to configure: setup a Snowflake account and configure QPR ProcessAnalyzer to use the account through an ODBC connection.
QPR ProcessAnalyzer needs one Snowflake schema where it can create functions and store cache tables, and write permissions are needed to that schema. QPR ProcessAnalyzer can use any tables in Snowflake as eventlogs source data, requiring at least read permission to the tables.
Snowflake account configuration
These instructions provide one scenario to configure Snowflake for QPR ProcessAnalyzer. Configuration can also be done differently, e.g., if there is other use for the same Snowflake account.
For permissions, these instructions are based on the database and schema ownership, giving all permissions to the objects. The next chapter provides instruction for the minimum permissions.
- Create a new Snowflake account in the Snowflake site: https://www.snowflake.com/ (or use an existing account). The account is created to the selected cloud platform and site, so consider a location that is close to the QPR ProcessAnalyzer hosting site.
- Go to https://app.snowflake.com and login in to your Snowflake account. Change the role to ACCOUNTADMIN.
- In Account > Roles, click the Role button to create a new role. Define Name QPRPA, and click Create Role.
- In Account > Users, click the User button to create a new user. Define User Name QPRPA and define a strong password. Check also that the Force user to change password on first time login is disabled. Click the Create User button.
- Select the created user, and click the Grant Role button. In the Role to grant list, select QPRPA, and click Grant.
- In Compute > Warehouses, click the Warehouse button to create a new warehouse. Define Name QPRPA, and select a suitable Size for the warehouse. In the Advanced Warehouse Options, check that Auto Resume and Auto Suspend are enabled. Set the Suspend After time based on your performance requirements (for example 3 minutes is a suitable value). Click Create Warehouse.
- Select the created warehouse and in the Privileges section and click the Privilege button. For Role, select the QPRPA role. For Privileges, select MONITOR, OPERATE and USAGE. Click the Grant Privileges button.
- In Data > Databases, click the Database button to create a new database. Define Name QPRPA, and click the Create button.
- Set the role as the owner of the database by selecting Transfer Ownership for the database in the menu and selecting the QPRPA role.
- Select the created database, and click the Schema button to create a new schema. Define Name QPRPA, and click the Create button.
- Set the role as the owner of the schema by selecting Transfer Ownership for the schema in the menu and selecting the QPRPA role.
- (Optional) If the connection to Snowflake is lost for any reason, QPR ProcessAnalyzer may not be able to cancel pending queries. To get pending queries automatically cancelled to save costs, it's advisable to change the ABORT_DETACHED_QUERY session setting to true: https://docs.snowflake.com/en/sql-reference/parameters.html#abort-detached-query.
- (Optional) Running a Snowflake warehouse consumes credits in the Snowflake account, and thus it's a good practice to set resource monitoring to control the credit usage: In Compute > Resource Monitors, click the Resource Monitor button to create a new resources monitor. Choose suitable monitor settings for your needs, and click the Create Resource Monitor button.
Set Snowflake ODBC connection
To add the Snowflake ODBC connection to QPR ProcessAnalyzer, following steps are required:
- Install Snowflake ODBC driver in the machine running QPR ProcessAnalyzer Server. More information about the ODBC driver installation: https://docs.snowflake.com/en/user-guide/odbc.html.
- Configure Snowflake ODBC connection string to the QPR ProcessAnalyzer configuration table. When configuring Snowflake as instructed above, the following connection string can be used:
Driver={SnowflakeDSIIDriver};Application=QPR_ProcessAnalyzer;Server=<account_identifier>.snowflakecomputing.com;Database=QPRPA;Schema=QPRPA;Warehouse=QPRPA;Role=QPRPA;query_timeout=3600;uid=QPRPA;pwd=<password>
Replace <password> with the password of the QPRPA user. In addition, replace <account_identifier> with the account identifier of your Snowflake account. More information about finding your Snowflake account identifier: https://docs.snowflake.com/en/user-guide/admin-account-identifier.html.
Please also include the Application tag to the connection string, identifying QPR ProcessAnalyzer related usage for collecting usage statistics.
More information about Snowflake ODBC connection strings: https://docs.snowflake.com/en/user-guide/odbc-parameters.html
Note: By default the Snowflake ODBC driver generates quite much logging to files which might unnecessarily fill up the disk space. It's recommended to set the LogLevel parameter to value 0. The path where the logs are created can be seen in Windows registry (key LogPath). More information about configuring Snowflake ODBC parameters: https://docs.snowflake.com/en/developer-guide/odbc/odbc-parameters.
Test connection string
The connection string can be tested in the QPR ProcessAnalyzer UI to make sure it works before configuring it. The following expression tests a connection string, and if the connection works, it returns the Snowflake user name:
ImportOdbc( `...`, "SELECT CURRENT_USER()" ).toCsv();
If an error dialog is shown, connection to Snowflake couldn't be made and thus the connection string doesn't work.
Query timeout
Query timeout is important for system responsiveness and cost management because for example an exceptionally heavy query might block the warehouse. If a running query reaches the defined timeout, it will be stopped automatically releasing the block.
There are timeouts both in QPR ProcessAnalyzer and Snowflake. QPR ProcessAnalyzer query timeout is by default 5 minutes which will send a cancellation to Snowflake. Sometimes the cancellation may not be processd successfully and the query won't be stopped in Snowflake. Thus, the Snowflake side timeout is also important.
Snowflake has two timeouts: query timeout (by default 2 days) and queue timeout (by default disabled). Query timeout is counted during the time a query is running in the warehouse, and the queue timeout when a query is waiting for an available warehouse. It's recommended to set both these timeouts to same value as the QPR ProcessAnalyzer timeout. Snowflake timeouts can be set for the warehouse or the service user (in the ODBC connection string). If the warehouse has other usage than the QPR ProcessAnalyzer queries, it might be a better option to set the timeout to the user.
Example: set Snowflake timeouts for the warehouse to 5 minutes:
ALTER WAREHOUSE <warehousename> SET STATEMENT_TIMEOUT_IN_SECONDS = 300; ALTER WAREHOUSE <warehousename> SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 300;
More information:
- https://docs.snowflake.com/en/sql-reference/parameters#statement-timeout-in-seconds
- https://docs.snowflake.com/en/sql-reference/parameters#label-statement-queued-timeout-in-seconds
Hardened security with key-pair authentication
Instead of using password for the Snowflake ODBC connection, it's more secure to use the key-pair authentication available in Snowflake. The idea is that a key-pair is generated (public key and private key), the public key is stored in Snowflake cloud, and the private key stored in QPR ProcessAnalyzer. QPR ProcessAnalyzer will securely authenticate to Snowflake by proving that it has the private key corresponding to the public key. To setup the key-pair authentication, follow these steps: https://docs.snowflake.com/en/user-guide/key-pair-auth.
When the key-pair authentication is used, the following parameters is added to the connection string (and pwd parameter won't be used):
- Authenticator: Use value "SNOWFLAKE_JWT".
- PRIV_KEY_FILE: The private key file location in the local disk.
- PRIV_KEY_FILE_PWD: Password for the private key file (this is not the Snowflake user password, but it's for opening the encrypted private key file).
Minimal Snowflake permissions
The above instructions configure the Snowflake user as the database and schema owner, giving all permissions to the database and schema. If required, it's possible to configure the Snowflake user with less permissions. The minimal permissions for the database and schema configured to the Snowflake connection string are as follows (concerns both the global and project-level).
Database needs following permission:
- USAGE
Schema needs following permissions:
- USAGE
- CREATE TABLE (for creating cache tables and being able to ingest data through QPR ProcessAnalyzer UI)
- CREATE VIEW (temporarily views are used to determine column data types)
- CREATE FUNCTION (Python functions are needed for business calendar, conformance calculation, and clustering)
- CREATE STAGE (internal table stages used when ingesting data)
- CREATE FILE FORMAT (for reading data from stage when ingesting data)
If data is not imported using QPR ProcessAnalyzer but using Snowflake's own tools, the CREATE STAGE and CREATE FILE FORMAT permissions are not required.
When QPR ProcessAnalyzer creates Snowflake objects to the schema (e.g., tables, functions and file format), the Snowflake user will get ownership to them (and thus full permissions).
QPR ProcessAnalyzer can use any table or view in Snowflake, and the following minimal permissions are required:
- Database: USAGE
- Schema : USAGE
- Table/view: SELECT
Datatable key for events datatable
Analytics queries executed by QPR ProcessAnalyzer in Snowflake require to know key column(s) for each table. The key columns mean that the combination of values in the key columns are unique for each row in the table. For cases datatable, the case id column is assumed to be the key. For the events datatable, there is no key information available by default, so for the events datatable the key column(s) need to be defined in the Datatable properties dialog. If the key columns are not specified, all columns are assumed to be the key (which is the slowest option from the performance viewpoint). Note that defining the key columns incorrectly, may lead to incorrect calculation results.
If there are no suitable columns for keys, a new column containing a sequence number can be added. A sequence can be created as follows:
CREATE OR REPLACE SEQUENCE seq1;
Snowflake table can be set to use the sequence as follows:
CREATE OR REPLACE TABLE foo (a NUMBER DEFAULT seq1.NEXTVAL, b NUMBER);
When new data is imported to the table, new rows automatically get a unique sequence number.
Connecting to Snowflake tables
When a datatable is created in QPR ProcessAnalyzer, corresponding table is created in Snowflake storing the actual data. The table is not yet created when a new datatable is added, but when there is at least one column in the datatable. When using default names, Snowflake tables created by QPR ProcessAnalyzer are named as follows: qprpa_dt_<databaseId>_<datatableId>, where <databaseId> identifies the QPR ProcessAnalyzer environment and <datatableId> identifies the datatable within the environment. The database id is set in the QPR ProcessAnalyzer configuration table. Datatables can also be configured to use user-defined table, schema or database. The custom table behaves similarly as the default table, for example columns can be modified and data imported, but when the datatable is deleted, the custom-named table is not dropped in Snowflake.
Datatables can managed both in the QPR ProcessAnalyzer web UI and in the Snowflake. When datatables are modified in the Snowflake, a synchronization (see see synchronize function) is needed in QPR ProcessAnalyzer to check whether the underlying data in Snowflake has changed. Note that while the tables can be modified in Snowflake, new datatables don't appear to QPR ProcessAnalyzer until they have been created in the QPR ProcessAnalyzer UI. When the underlying table in the Snowflake is deleted all rows and columns appear to be removed from the datatable, but the datatable itself is not deleted. Schemas and databases are not created or modified by QPR ProcessAnalyzer (just tables).
Connecting to Snowflake views
In addition to tables, the datatables can be connected to Snowflake views (https://docs.snowflake.com/en/user-guide/views-introduction.html). When a process mining query is made to the cases and events datatables, the queries in the linked Snowflake views are executed as part of the process mining queries. Note that there cannot be a table and a view with the same name in the same schema, so the table may need to be removed before a view can be created.
New view can be created in Snowflake as follows (where ViewName is the name of the view):
CREATE OR REPLACE VIEW "ViewName" AS SELECT * FROM VALUES (1, 'red'), (2, 'orange'), (3, 'yellow'), (4, 'green') AS Colors(Id, Value)
The SELECT part in the above statement can be any query allowed by the Snowflake. The query can also contain data transformations which means that those transformation will be performed when the dashboards are used in QPR ProcessAnalyzer. This is following the ELT (extract, load, transform) principle, where the non-transformed data is loaded into the system and needed transformations are performed when the data is used by QPR ProcessAnalyzer. Note that there is a negative impact on the performance when performing transformations on demand.
When a datatable with a Snowflake view is deleted, the view in the datasource is not deleted. For datatables with Snowflake view, following data modification operation are not allowed: Import, Persist, AddColumn, Merge, RemoveColumns, RenameColumns, and Truncate.
Note that the query in the view is executed using permissions of the user account defined in the ODBC connection string, and thus the permissions are managed in Snowflake. Still, the normal QPR ProcessAnalyzer datatable permissions are applied for datatable with Snowflake views, so if user cannot see the datatable, user cannot access the data in the view either.
Finding query from Snowflake query history
If there are performance issues in running Snowflake queries, here are instructions how to find the corresponding query from the Snowflake query history (https://docs.snowflake.com/en/sql-reference/account-usage/query_history). First look at the QPR ProcessAnalyzer task log to find the problematic query there. Add Request id as a column to the table to see the request id of the problematic query. Go to the Snowflake query history and Filter queries based on the SQL Text by searching with the request id. The request id appears in the end of the SQL query as a comment, so it's easy to find the corresponding Snowflake query (or queries).
Following can be checked from the problematic Snowflake query:
- Has any time spent in the queue. If there is queuing, it indicates that there are too many queries for the warehouse to handle at the same time. Solution is to use the multi-cluster warehouses which can scale based on the load.
- Are there any other simultaneously running queries in the same warehouse (which may have slowed down the execution).
- Try rerunning the query to check whether the issue can be replicated.
Automatic data update detection
When data in Snowflake tables changes (data is added, modified or removed), QPR ProcessAnalyzer automatically detects that data has updated and performs calculations using the latest data. Changing of data is checked every 5 minutes, so it takes maximum of that time for new data to appear in the dashboards. QPR ProcessAnalyzer has different levels of caches (to improve performance) explaining why the new data might not immediately appear in the dashboards. When the change of data is detected, the caches are cleared and that calculations are performed using the latest data in Snowflake.
Automatic query cancellation
If the result of a pending query running in Snowflake is not needed anymore, the query is automatically cancelled by QPR ProcessAnalyzer, to save costs and reserve computing capacity for new queries. Queries are cancelled, e.g., when user makes filtering in a dashboard or changes chart settings, because the new filter or settings will make an updated query and the old query (if still pending) is cancelled.
Alternative traditional ODBC Import method
Alternatively, the traditional method can be used to import data from Snowflake to QPR ProcessAnalyzer and use the QPR ProcessAnalyzer's in-memory calculation engine. More information about importing data from an ODBC datasource:
- Load models directly from ODBC
- Import data from ODBC using expression scripting
- Import data from ODBC using SQL scripting
Snowflake Cortex
Snowflake Cortex LLMs are available for environments that use Snowflake models.
See here for Snowflake Cortex LLM availability: https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions#availability
Additionally, access to LLMs that are not available on your region can be enabled by the cross-region inference: https://docs.snowflake.com/en/user-guide/snowflake-cortex/cross-region-inference