Performance Tuning Guide: Difference between revisions
Line 3: | Line 3: | ||
== Snowflake models == | == Snowflake models == | ||
In Snowflake, calculations are performed in virtual warehouses (https://docs.snowflake.com/en/user-guide/warehouses). There are two methods to affect the performance: warehouse size and multiclustering. | In Snowflake, calculations are performed in virtual warehouses (https://docs.snowflake.com/en/user-guide/warehouses). There are two methods to affect the performance: warehouse size and multiclustering. | ||
Larger the warehouse size, the faster individual queries run. Usually, when going to larger warehouse sizes the incremental performance improvement decreases while costs increase more. Thus, try to find a balance where there is still notable performance improvements but not too high cost increase. | |||
If there are multiple simultaneous queries (e.g., there are multiple users or dashboards with lot of charts), multiclustering is the right solution (https://docs.snowflake.com/en/user-guide/warehouses-multicluster). In multiclustering, there are several paraller warehouses, allowing to run more queries at the same time. Each warehouse can process limited number of queries simultaneously and excess queries go to a queue waiting for an available warehouse. If queue starts to build up, increase the number of clusters in the multiclustering. Instructions how to monitor the load: https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring. | |||
If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service). | If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service). |
Revision as of 23:34, 18 October 2023
This guide contains information how to get the best performance out of QPR ProcessAnalyzer system while taking into account incurred infrastructure costs. Performance optimization is entirely different in Snowflake and in-memory models, so there are separate chapters for them.
Snowflake models
In Snowflake, calculations are performed in virtual warehouses (https://docs.snowflake.com/en/user-guide/warehouses). There are two methods to affect the performance: warehouse size and multiclustering.
Larger the warehouse size, the faster individual queries run. Usually, when going to larger warehouse sizes the incremental performance improvement decreases while costs increase more. Thus, try to find a balance where there is still notable performance improvements but not too high cost increase.
If there are multiple simultaneous queries (e.g., there are multiple users or dashboards with lot of charts), multiclustering is the right solution (https://docs.snowflake.com/en/user-guide/warehouses-multicluster). In multiclustering, there are several paraller warehouses, allowing to run more queries at the same time. Each warehouse can process limited number of queries simultaneously and excess queries go to a queue waiting for an available warehouse. If queue starts to build up, increase the number of clusters in the multiclustering. Instructions how to monitor the load: https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring.
If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service).
If running machine learning with large datasets, warehouses might run out of memory. If that occurs, consider Snowpark-optimized warehouse which has more memory (https://docs.snowflake.com/en/user-guide/warehouses-snowpark-optimized).
Snowflake scaling
- Larger warehouse
- Multi-cluster warehouses
In-memory models
Server resources
- Memory
- Processors
- Database