Web API: Expression/query: Difference between revisions
Line 120: | Line 120: | ||
||SourceData | ||SourceData | ||
|| | || | ||
Defines expressions and column mappings for the events and cases data for which the query is performed. If SourceData is not defined and ''ContextType'' is ''Model'' or ''EventLog'', case and events is read from the model (defined by the ModelId and Filter parameters). SourceData has the following structure: | Defines expressions and column mappings for the events and cases data for which the query is performed. If SourceData is not defined and ''ContextType'' is ''Model'' or ''EventLog'', case and events is read from the model (defined by the ModelId and Filter parameters). It's also possible to override individual model level settings by defining them in the SourceData. For examples, when the ''Expressions'' are not defined, the model defined datatables are used, but it's possible to change the column mappings by defining them in the SourceData. | ||
SourceData has the following structure: | |||
* '''Events''': | * '''Events''': | ||
** '''Expression''': Expression to return events data as SqlDataFrame (one row for each event). There needs to be columns for the case id, event type and timestamp. | ** '''Expression''': Expression to return events data as SqlDataFrame (one row for each event). There needs to be columns for the case id, event type and timestamp. | ||
Line 132: | Line 134: | ||
*** '''CaseId''': Mapping to the case ID column. | *** '''CaseId''': Mapping to the case ID column. | ||
Example where all settings are defined: | |||
<pre> | <pre> | ||
{ | { | ||
Line 147: | Line 150: | ||
"Columns": { | "Columns": { | ||
"CaseId": "CaseIdColumn" | "CaseId": "CaseIdColumn" | ||
} | |||
} | |||
} | |||
} | |||
</pre> | |||
Example where only one setting from the model is overridden for the query: | |||
<pre> | |||
{ | |||
"SourceData": { | |||
"Events": { | |||
"Columns": { | |||
"EventType": "EventTypeColumn" | |||
} | } | ||
} | } |
Revision as of 13:42, 7 June 2022
The Expression query runs a query written using the expression language in the server and returns the query results. The expression language allows to query both the evenlog data and metadata in the system (e.g. datatables, projects, users).
See also, Expression Query examples.
Request
The request endpoint url is qprpa/api/expression/query and there are optional ResponseType and StringifyValues url parameters (see more in the response chapter. The payload of the request is a JSON object which properties are described in the query chapter. Example query:
Url: POST /api/expression/query?ResponseType=object&StringifyValues=false Content-Type: application/json;charset=UTF-8 Body: { "ModelId": 1234, "Filter": { "Items": [ { "Type": "IncludeCases", "Items": [ { "Type": "EventType", "Values": [ "Change Price" ] } ] } ] }, "Root": "Cases", "Dimensions": [ { "Name": "Start Month", "Expression": "StartTime.Month" }, { "Name": "Region", "Expression": "Attribute(\"Region\")" } ], "Values": [ { "Name": "Average Case Duration in Days", "Expression": "Average(_.Duration).TotalDays" } ], "Ordering": [ { "Name": "Start Month", "Direction": "Ascending" }, { "Name": "Average Case Duration in Days", "Direction": "Descending" } ] }
HTTP request header Authorization with value Bearer <access token> needs to be in place to identify the session.
Response
The response contains data in a tabular format (one-to-many columns with names and zero-to-many rows). The following url parameters are available to define the results format:
- ResponseType: Determines the format of the response, either array (default) or object. See below how the structure of the JSON differs.
- StringifyValues: determines that the data items are stringified (either true or false, by default false). When no stringification is used, values use JSON standard formatting for strings, numbers, booleans and nulls. Dates use format yyyy-MM-ddTHH:mm:ss.fff, and other data types use default stringification (more information about stringified values).
Example array type of response (without stringification):
[ {"A": "DataA", "B": "DataB", "C": "DataC"}, {"A": "DataA", "B": "DataB", "C": "DataC"}, {"A": "DataA", "B": "DataB", "C": "DataC"} ]
Example object type of response (without stringification):
{ "Columns": [ {"Name": "A"}, {"Name": "B"}, {"Name": "C"} ], "Rows": [ ["DataA", "DataB", "DataC"], ["DataA", "DataB", "DataC"], ["DataA", "DataB", "DataC"] ] }
Query
When making a query to a model, the ModelId property is mandatory and additionally Filter property can be used to filter the data. The Root parameter is mandatory, and optionally Dimensions and Measures are used. When making a query that is not targeted a certain model, the ContextType needs to be generic and then ModelId and Filter properties are not used.
Properties
Property | Description | ||||||
---|---|---|---|---|---|---|---|
ProcessingMethod |
Defines the processing method to perform calculations:
| ||||||
PreferSqlDataFrame | When set to true (default), dataframe mode calculations (i.e. ProcessingMethod=dataframe) are done in the original datasource using SQL language queries. When false, dataframe mode calculations are performed in-memory (which also requires to load the data to memory). Usually calculations are done in the original datasource. | ||||||
ContextType |
Determines in which context the root expression is run. Following contexts can be used:
If not specified, eventlog is used if FilterId or ModelId parameter is specified. | ||||||
ModelId | Model for which the calculation is run. This parameter is mandatory, if the calculation is run in an eventlog context. | ||||||
SourceData |
Defines expressions and column mappings for the events and cases data for which the query is performed. If SourceData is not defined and ContextType is Model or EventLog, case and events is read from the model (defined by the ModelId and Filter parameters). It's also possible to override individual model level settings by defining them in the SourceData. For examples, when the Expressions are not defined, the model defined datatables are used, but it's possible to change the column mappings by defining them in the SourceData. SourceData has the following structure:
Example where all settings are defined: { "SourceData": { "Events": { "Expression": "...", "Columns": { "CaseId": "CaseIdColumn", "EventType": "EventTypeColumn", "Timestamp": "TimestampColumn" } }, "Cases": { "Expression": "...", "Columns": { "CaseId": "CaseIdColumn" } } } } Example where only one setting from the model is overridden for the query: { "SourceData": { "Events": { "Columns": { "EventType": "EventTypeColumn" } } } } | ||||||
Filter | Filter definition as JSON. If Filter is defined, please do not use the FilterId parameter. | ||||||
FilterId | Stored filter for which the query is run. Alternative to the Filter parameter. If neither FilterId nor Filter is specified, calculation is run from the entire model. | ||||||
Comparison | Comparison definition as similar JSON structure than Filter. | ||||||
Root | The root expression is evaluated first, and objects it returns are used in the next calculation step (which is dimensioning if it's enabled). The root expression is run in the context defined by the ContextType parameter. If ProcessingMethod is dataframe, also expression aliases can be used in the root expression. | ||||||
Ordering | Array of Ordering objects. Ordering defined how the result table is sorted.
| ||||||
AggregateOthers |
When true, all rows that are left out due to the MaximumRowCount limit, are aggregated and shown as the last row. Expressions to calculate the aggregations, can be defined for each dimension and measure using the AggregationExpression property. When the AggregationExpression is not defined, the default aggregated value is null. When the AggregateOthers is used, the maximum number of rows is still the MaximumRowCount, so with the aggregation there is one data row less shown. Default for AggregateOthers is false. | ||||||
MaximumRowCount |
Defines a limit how many rows at maximum are returned by the query. Defined as an integer. The default is null meaning that all rows will be returned. If sorting is used, the query results are first sorted before the limit is applied. | ||||||
FirstRow |
Defines the starting row number from the query results to return. Defined as an integer, and row numbering start from 0. The default is null meaning that no rows are skipped. If sorting is used, the query results are first sorted before this setting is applied. If MaximumRowCount is also defined, the FirstRow is applied before the MaximumRowCount. | ||||||
Criteria | Expression returning boolean value to filter the resulting dataset row-by-row after dimensions and values are calculated. The criteria expression is calculated for each row of the dataset, and those rows are removed where the criteria expression results a false value. The respective row in the dataset is as a context for the criteria expression, similar to the DataFrame's Where function. | ||||||
RowInitExpression | Specifies an expression that is evaluated for every generated row before evaluating the value expressions. RowInitExpression is calculated after the dimension expressions have been calculated and dimensions generated. RowInitExpression can be used to make common calculations and define variables that are needed in several value expressions. | ||||||
ColumnOrdering |
An array of column names that defines the order in which the columns are returned. If an array contains a name that does not exist in the results, its value will be returned as null values in the returned result. If an array does not contain a name that is in the original result, then that column will not be returned at all. If ColumnOrdering is not defined, all the result rows will be returned in the defined order so that all the dimensions are returned before the values. Example: "ColumnOrdering": ["Case Count", "Account Manager", "Average Duration"] | ||||||
EnableResultCaching | When set to true, expression query results are cached both in the client and server side. When false, query results are not cached. the default value is true when ContextType is eventlog or model. When ContextType is generic, no caching is used regardless of this setting. | ||||||
QueryIdentifier | Identifier of the query. Can be used to refer to the query, when cancelling pending queries. | ||||||
CancelEarlierQueriesWithIdentifier | Boolean value defining whether possible previous query with the same QueryIdentifier is cancelled when a new query with the same identifier is received by the server. The default value is false. |
Expression allows configurable objects, such as cases, events, event types or flows, to be divided into configurable dimensions and calculable values (KPIs). In a basic form, the result of this query is a table with the following columns: (1) One column for each specified dimension, and (2) at least one column for each specified value. The result of this query is a table with one row for each unique dimension value combination.
When building expression query, the individual expressions are run in following contexts:
- Root expression is run either in the EventLog or generic context, depending on the ContextType setting.
- Dimension expressions are run in the context of the individual object returned by the root expression.
- Value (measure) expressions is run in the context of the array of items that are part of the dimensioned slice. If dimensioning is disabled (i.e. dimensions parameter is null), the value expression is run in the context of the individual object returned by the root expression.
- Row initialization expression is run in the same context as the measures. RowInitExpression is run before the measures, so variables initialized in the RowInitExpression are available in the measures.
- Aggregation expression is run in the context of an array of the rest of the items, i.e. items to be aggregated as the last row.
Dimensions
Using dimensions, the root rows are sliced into different groups based on how many unique values the dimension expressions produce. There can be zero to many dimensions. The result data contains a column for each dimension.
If dimensions are an ampty array (i.e. Dimensions: []), all root objects are aggregated into a single group. If dimensions array is not defined (i.e. Dimensions: null), each object in the root expression will be as a separate row in the result. Then context for the value expression is a single object, instead of array of objects.
Each dimension may have the following properties:
Property | Description |
---|---|
Name | Name of the dimension. Dimension name works as the title for the dimension column. In addition, the dimension name is used when referring to this dimension in the ordering. |
Expression | Expression to calculate the dimension value for each root object, i.e. each of the root objects are used as the context for the dimension expression. |
ValueExpression | Specifies an expression that is used as the final returned value of the dimension. The ValueExpression is calculated in the context of an unordered array containing of all the objects in the dimension represented by the row (i.e. the context is different than in the Expression). The ValueExpression can be used to improve performance as follows: the Expression is used to provide a simplified expression that's enough for the dimensioning, and define a more complex expression in the ValueExpression to provide the final returned value. The benefit is that the ValueExpression is not calculated for rows that are left out by the MaximumRowCount setting. |
NumberPrecision | Specified the number of decimals for rounding numerical dimension values. If not defined, no rounding is done. Negative number can be used to round to nearest tens (-1), hundreds (-2), etc. The rounded values are dividing values into the distinct dimension slots. If any value in this dimension is other than numerical, error is given. Works for both the inmemory and dataframe processing methods. |
DatetimeTruncation | Dimension values that are of the datetime type, are truncated (i.e. rounded downwards) using the defined time period (granularity). Available values for the inmemory processing are year, halfyear, quarter, month, week, day, hour, minute, second and millisecond. Truncation is done before dividing values into distinct dimension slots. If any value of this dimension is other than datetime, error is given. Works for both the inmemory and dataframe processing methods.
For the week truncation, in the in-memory processing the locale settings of the QPR ProcessAnalyzer server determine the first day of the week, and in the dataframe processing, SQL Server or Snowflake settings determine the first day of the week. |
TimespanPrecision | Dimension values that are of the timespan type, are rounded using the defined granularity. Available values are year, halfyear, quarter, month, fortnight, week, day, hour, minute, second and millisecond. Rounding is done before dividing values into distinct dimension slots. If any value of this dimension is other than timespan, error is given.
This settings does not work for the dataframe processing method. |
IsHidden | Is the dimension column hidden (true or false). Hiding dimensions don't affect the calculation, but the hidden columns are not returned to the client. |
Values
When there are dimensions defined, the Values are aggregations from the dimensioned root objects, usually these are the measures or KPI's. Value expressions are calculated for each dimensioned data group, i.e. for each unique dimension value. The context of the expression to calculate the value is an array of objects. If dimensions are not defined, context for the value expression is a single root object. Each value may have the following properties:
Property | Description |
---|---|
Name | Name for the value. Used only when ValueType is Single. |
Expression | Behavior depends on the ValueType property:
|
AggregationExpression | Expression to calculate the aggregated value for the rest of the rows for this column when using the AggregateOthers setting. All the aggregated values are provided as the context (_) for the aggregation expression. If no aggregation expression is defined, null value is used. Example aggregation expressions:
Row count: "AggregationExpression": "Count(_)" Sum of values (for numerical columns): "AggregationExpression": "Sum(_)" Showing count, minimum and maximum values: "AggregationExpression": "\"(\" + Count(_) + \" others) \" + Min(_) + \" - \" + Max(_)" |
ValueType |
Type of the value, which is one of the following:
|
ValueDimensionExpression
(ValueType: Dynamic, Pivot) |
Expression used to generate columns in Dynamic and Pivot value types. The expression should return an array where each item represents a column. This expression is executed once when calculating the Expression query. For Dynamic type value ValueDimensionExpression field is mandatory, and for Pivot type it's not mandatory.
The result of this expression will be given as the value of variable ValueDimension when evaluating the Expression. For Pivot type value, this can be used to override the default set of columns created automatically based on the actual cell evaluation results. An array of all the actual dimension values created by the actual cell evaluation result is given as context object for the evaluation. |
NameExpression
(ValueType: Dynamic, Pivot) |
Expression used to generate column names in Dynamic and Pivot value types. This expression is run once for each column, and the expression has as a context the column generated in the ValueDimensionExpression, and as a result the expression should give the column name. |
DimensionOrderExpression
(ValueType: Pivot) |
Expression used to order the dimensions when pivot value type value is used. If not defined, the following default expression will be used: OrderByValue(_). |
PivotAggregationExpression
(ValueType: Pivot) |
Expression used to aggregate all the values within one cell of pivot type values. If the expression returns an array of arrays, the first item in the inner array is used as the ValueDimension and the second item is used as the root object when evaluating PivotAggregationExpression. By default, if expression returns only an array of atomic objects, the root object of the evaluation of PivotAggregationExpression is the value of the current ValueDimension. |
DefaultValueExpression
(ValueType: Pivot) |
Expression whose result is used to fill the "gaps" of the matrix created when pivot value type is used. |
EvaluateAfterAggregation | If set to true, makes the value calculation occur only after dimension aggregations have been performed, thus making it possible to calculate values based on the already aggregated value or dimension columns. Available only in the DataFrame processing method. |
IsHidden | Is the value column hidden (true or false). Hiding values don't affect the calculation, but the hidden columns are not returned to the client. |