Performance Tuning Guide: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
No edit summary
 
Line 7: Line 7:


If there are multiple simultaneous queries (e.g., there are multiple users or dashboards with lot of charts), multiclustering is the right solution (https://docs.snowflake.com/en/user-guide/warehouses-multicluster). In multiclustering, there are several paraller warehouses, allowing to run more queries at the same time. Each warehouse can process limited number of queries simultaneously and excess queries go to a queue waiting for an available warehouse. If queue starts to build up, increase the number of clusters in the multiclustering. Instructions how to monitor the load: https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring.
If there are multiple simultaneous queries (e.g., there are multiple users or dashboards with lot of charts), multiclustering is the right solution (https://docs.snowflake.com/en/user-guide/warehouses-multicluster). In multiclustering, there are several paraller warehouses, allowing to run more queries at the same time. Each warehouse can process limited number of queries simultaneously and excess queries go to a queue waiting for an available warehouse. If queue starts to build up, increase the number of clusters in the multiclustering. Instructions how to monitor the load: https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring.
Warehouse suspension time affect both the performance and costs. When a warehouse is suspended after idling some time, the costs stop incurring, but also the caches in the warehouse are lost. When the warehouse  is started again, the caches need to be built which slows down the queries in the beginning. Thus shorter warehouse suspension time saves costs, but there is a performance decrease when a suspended warehouse needs to start again.


If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service).
If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service).
Line 14: Line 16:
== In-memory models ==
== In-memory models ==


== Server resources ==
Server resources
* Memory
* Memory
* Processors
* Processors
* Database
* Database

Latest revision as of 23:38, 18 October 2023

This guide contains information how to get the best performance out of QPR ProcessAnalyzer system while taking into account incurred infrastructure costs. Performance optimization is entirely different in Snowflake and in-memory models, so there are separate chapters for them.

Snowflake models

In Snowflake, calculations are performed in virtual warehouses (https://docs.snowflake.com/en/user-guide/warehouses). There are two methods to affect the performance: warehouse size and multiclustering.

Larger the warehouse size, the faster individual queries run. Usually, when going to larger warehouse sizes the incremental performance improvement decreases while costs increase more. Thus, try to find a balance where there is still notable performance improvements but not too high cost increase.

If there are multiple simultaneous queries (e.g., there are multiple users or dashboards with lot of charts), multiclustering is the right solution (https://docs.snowflake.com/en/user-guide/warehouses-multicluster). In multiclustering, there are several paraller warehouses, allowing to run more queries at the same time. Each warehouse can process limited number of queries simultaneously and excess queries go to a queue waiting for an available warehouse. If queue starts to build up, increase the number of clusters in the multiclustering. Instructions how to monitor the load: https://docs.snowflake.com/en/user-guide/warehouses-load-monitoring.

Warehouse suspension time affect both the performance and costs. When a warehouse is suspended after idling some time, the costs stop incurring, but also the caches in the warehouse are lost. When the warehouse is started again, the caches need to be built which slows down the queries in the beginning. Thus shorter warehouse suspension time saves costs, but there is a performance decrease when a suspended warehouse needs to start again.

If there are individual queries that take considerably longer than other queries, the query acceleration service might be helpful (https://docs.snowflake.com/en/user-guide/query-acceleration-service).

If running machine learning with large datasets, warehouses might run out of memory. If that occurs, consider Snowpark-optimized warehouse which has more memory (https://docs.snowflake.com/en/user-guide/warehouses-snowpark-optimized).

In-memory models

Server resources

  • Memory
  • Processors
  • Database