Web API: Expression/query

From QPR ProcessAnalyzer Wiki
Revision as of 11:25, 5 October 2021 by Ollvihe (talk | contribs) (→‎Query)
Jump to navigation Jump to search

The Expression query runs a query written using the expression language in the server and returns the query results. The expression language allows to query both the evenlog data and metadata in the system (e.g. datatables, projects, users).

See also, Expression Query examples.

Request

The request endpoint url is /api/expression/query and there are optional ResponseType and StringifyValues url parameters (see more in the response chapter. The payload of the request is a JSON object which properties are described in the query chapter. Example query:

Url: POST /api/expression/query?ResponseType=object&StringifyValues=false
Content-Type: application/json;charset=UTF-8
Body:
{
  "ModelId": 1234,
  "Filter": {
    "Items": [
      {
        "Type": "IncludeCases",
        "Items": [
          {
            "Type": "EventType",
            "Values": [
              "Change Price"
            ]
          }
        ]
      }
    ]
  },
  "Root": "Cases",
  "Dimensions": [
    {
      "Name": "Start Month",
      "Expression": "StartTime.Month"
    },
    {
      "Name": "Region",
      "Expression": "Attribute(\"Region\")"
    }
  ],
  "Values": [
    {
      "Name": "Average Case Duration in Days",
      "Expression": "Average(_.Duration).TotalDays"
    }
  ],
  "Ordering": [
    {
      "Name": "Start Month",
      "Direction": "Ascending"
    },
    {
      "Name": "Average Case Duration in Days",
      "Direction": "Descending"
    }
  ]
}

HTTP request header Authorization with value Bearer <access token> needs to be in place to identify the session.

Response

The response contains data in a tabular format (one-to-many columns with names and zero-to-many rows). The following url parameters are available to define the results format:

  • ResponseType: Determines the format of the response, either array (default) or object. See below how the structure of the JSON differs.
  • StringifyValues: determines that the data items are stringified (either true or false, by default false). When no stringification is used, values use JSON standard formatting for strings, numbers, booleans and nulls. Dates use format yyyy-MM-ddTHH:mm:ss.fff, and other data types use default stringification (more information about stringified values).

Example array type of response (without stringification):

[
  {"A": "DataA", "B": "DataB", "C": "DataC"},
  {"A": "DataA", "B": "DataB", "C": "DataC"},
  {"A": "DataA", "B": "DataB", "C": "DataC"}
]

Example object type of response (without stringification):

{
  "Columns": [
    {"Name": "A"},
    {"Name": "B"},
    {"Name": "C"}
  ],
  "Rows": [
    ["DataA", "DataB", "DataC"],
    ["DataA", "DataB", "DataC"],
    ["DataA", "DataB", "DataC"]
  ]
}

Query

When making a query to a model, the ModelId property is mandatory and additionally Filter property can be used to filter the data. The Root parameter is mandatory, and optionally Dimensions and Measures are used. When making a query that is not targeted a certain model, the ContextType needs to be generic and then ModelId and Filter properties are not used.

Properties

Property Description
ProcessingMethod

Defines the processing method to perform calculations:

  • inmemory : Calculation is done using QPR ProcessAnalyzer in-memory core based on the EventLog object and related entities. (default)
  • dataframe: : Calculation is done using SqlDataFrames. The root is assumed to produce an SqlDataFrame for which defined dimensions, measures, sorting etc. are applied. All calculations are performed in the datasource where the data is stored (for example: in QPR ProcessAnalyzer SQL Server database, Snowflake, AWS Redshift or Azure Databricks).
ContextType

Determines in which context the root expression is run. Following contexts can be used:

  • eventlog: root expression is run in the EventLog context (and thus EventLog functions and properties can be used). When EventLog context is used, the model is always loaded into memory already before the root expression is run.
  • model: root expression is run in the Model context. The model is not automatically loaded into memory (although using the EventLog property will load it into memory).
  • generic: root expression is run in the generic context (i.e. context that is not related to any model)

If not specified, eventlog is used if FilterId or ModelId parameter is specified.

ModelId Model for which the calculation is run. This parameter is mandatory, if the calculation is run in an eventlog context.
SourceData

Defines expressions and column mappings to create the initial event and case data for which the calculation is performed. If SourceData is not defined, case and events is read based on the ModelId and Filter parameters (when ContextType is Model or EventLog). SourceData has the following structure:

  • Events:
    • Expression: Expression returning events data as a SqlDataFrame (one row for each event). The must be columns for the case id, event type and timestamp. Other columns are event attributes.
    • Columns: Specifies which SqlDataFrame columns represent the case id, event type name and timestamp (see example below).
  • Cases:
    • Expression: Expression returning cases data as a SqlDataFrame (one row for each cases). There must be a column for the case id. Other columns are case attributes.
    • Columns: Specifies which SqlDataFrame column represents the case id (see example below).
{
	"SourceData": {
		"Events": {
			"Expression": "...",
			"Columns": {
				"CaseId": "CaseIdColumn",
				"EventType": "EventTypeColumn",
				"Timestamp": "TimestampColumn"
			}
		},
		"Cases": {
			"Expression": "...",
			"Columns": {
				"CaseId": "CaseIdColumn"
			}
		}
	}
}
Filter Filter definition as JSON. If Filter is defined, please do not use the FilterId parameter.
FilterId Stored filter for which the query is run. Alternative to the Filter parameter. If neither FilterId nor Filter is specified, calculation is run from the entire model.
Comparison Comparison definition as similar JSON structure than Filter.
RuntimeComparison When set to true, the expression query works in the comparison mode (i.e. root causes mode) and the ComparisonEventLog property is available in the expressions. You also need to pass the Comparison parameter which the comparison between case sets are based on. The default value is false.
Root The root expression is evaluated first, and objects it returns are used in the next calculation step (which is dimensioning if it's enabled). The root expression is run in the context defined by the ContextType parameter. If ProcessingMethod is dataframe, also expression aliases can be used in the root expression.
Ordering Array of Ordering objects. Ordering defined how the result table is sorted.
Property Description
Name Dimension or Value name to sort the data.
Direction Sorting direction, either Ascending or Descending.
AggregateOthers

When true, all rows that are left out due to the MaximumRowCount limit, are aggregated and shown as the last row. Expressions to calculate the aggregations, can be defined for each dimension and measure using the AggregationExpression property. When the AggregationExpression is not defined, the default aggregated value is null. When the AggregateOthers is used, the maximum number of rows is still the MaximumRowCount, so with the aggregation there is one data row less shown. Default for AggregateOthers is false.

MaximumRowCount

Integer value defining a limit how many rows at maximum are returned by the query. The default value is null, which means that all the rows will be returned. If sorting is used, the query results are first sorted before the limit is applied.

FirstRow

Integer value defining the starting row number from the query results to return. The default value is null, which means that the first row will be the first returned row. Row numbers start from 0. If sorting is used, the query results are first sorted before this setting is applied.

Criteria Expression returning boolean value to filter the resulting dataset row-by-row after dimensions and values are calculated. The criteria expression is calculated for each row of the dataset, and those rows are removed where the criteria expression results a false value. The respective row in the dataset is as a context for the criteria expression, similar to the DataFrame's Where function.
RowInitExpression Specifies an expression that is evaluated for every generated row before evaluating the value expressions. RowInitExpression is calculated after the dimension expressions have been calculated and dimensions generated. RowInitExpression can be used to make common calculations and define variables that are needed in several value expressions.
ColumnOrdering

An array of column names that defines the order in which the columns are returned. If an array contains a name that does not exist in the results, its value will be returned as null values in the returned result. If an array does not contain a name that is in the original result, then that column will not be returned at all. If ColumnOrdering is not defined, all the result rows will be returned in the defined order so that all the dimensions are returned before the values. Example:

"ColumnOrdering": ["Case Count", "Account Manager", "Average Duration"]
EnableResultCaching When set to true, expression query results are cached both in the client and server side. When false, query results are not cached. the default value is true when ContextType is eventlog or model. When ContextType is generic, no caching is used regardless of this setting.
QueryIdentifier Identifier of the query. Can be used to refer to the query, when cancelling pending queries.
CancelEarlierQueriesWithIdentifier Boolean value defining whether possible previous query with the same QueryIdentifier is cancelled when a new query with the same identifier is received by the server. The default value is false.

Expression allows configurable objects, such as cases, events, event types or flows, to be divided into configurable dimensions and calculable values (KPIs). In a basic form, the result of this query is a table with the following columns: (1) One column for each specified dimension, and (2) at least one column for each specified value. The result of this query is a table with one row for each unique dimension value combination.

When building expression query, the individual expressions are run in following contexts:

  • Root expression is run either in the EventLog or generic context, depending on the ContextType setting.
  • Dimension expressions are run in the context of the individual object returned by the root expression.
  • Value (measure) expressions is run in the context of the array of items that are part of the dimensioned slice. If dimensioning is disabled (i.e. dimensions parameter is null), the value expression is run in the context of the individual object returned by the root expression.
  • Row initialization expression is run in the same context as the measures. RowInitExpression is run before the measures, so variables initialized in the RowInitExpression are available in the measures.
  • Aggregation expression is run in the context of an array of the rest of the items, i.e. items to be aggregated as the last row.

Dimensions

Using dimensions, the root rows are sliced into different groups based on how many unique values the dimension expressions produce. There can be zero to many dimensions. The result data contains a column for each dimension.

If dimensions are an ampty array (i.e. Dimensions: []), all root objects are aggregated into a single group. If dimensions array is not defined (i.e. Dimensions: null), each object in the root expression will be as a separate row in the result. Then context for the value expression is a single object, instead of array of objects.

Each dimension may have the following properties:

Property Description
Name Name of the dimension. Dimension name works as the title for the dimension column. In addition, the dimension name is used when referring to this dimension in the ordering.
Expression Expression to calculate the dimension value for each root object, i.e. each of the root objects are used as the context for the dimension expression.
ValueExpression Specifies an expression that is used as the final returned value of the dimension. The ValueExpression is calculated in the context of an unordered array containing of all the objects in the dimension represented by the row (i.e. the context is different than in the Expression). The ValueExpression can be used to improve performance as follows: the Expression is used to provide a simplified expression that's enough for the dimensioning, and define a more complex expression in the ValueExpression to provide the final returned value. The benefit is that the ValueExpression is not calculated for rows that are left out by the MaximumRowCount setting.
NumberPrecision Number of decimals for rounding numerical dimension values. If not defined, no rounding is done. Rounding is done before dividing values into distinct dimension slots. If any value of this dimension is other than numerical, error is given.
DatetimeTruncation Dimension values that are of the datetime type, are truncated (i.e. rounded downwards) using the defined granularity. Available values are year, halfyear, quarter, month, week, day, hour, minute, second and millisecond. Truncation is done before dividing values into distinct dimension slots. If any value of this dimension is other than datetime, error is given.
TimespanPrecision Dimension values that are of the timespan type, are rounded using the defined granularity. Available values are year, halfyear, quarter, month, fortnight, week, day, hour, minute, second and millisecond. Rounding is done before dividing values into distinct dimension slots. If any value of this dimension is other than timespan, error is given.
IsHidden Is the dimension column hidden (true or false). Hiding dimensions don't affect the calculation, but the hidden columns are not returned to the client.

Values

When there are dimensions defined, the Values are aggregations from the dimensioned root objects, usually these are the measures or KPI's. Value expressions are calculated for each dimensioned data group, i.e. for each unique dimension value. The context of the expression to calculate the value is an array of objects. If dimensions are not defined, context for the value expression is a single root object. Each value may have the following properties:

Property Description
Name Name for the value. Used only when ValueType is Single.
Expression Behavior depends on the ValueType property:
  • Single: Expression for calculating the value for the single column. The expression is calculated for each row (i.e. the unique dimension combinations), and the expression usually contains aggregations from the group of objects into a single value.
  • Dynamic: Expression for calculating the value for each dynamically generated column specified by ValueDimensionExpression property. The result of the ValueDimensionExpression is accessible via ValueDimension -variable.
  • Pivot: Expression to produce an array, where each item represents a dynamically generated column. The result shown by the Expression query is number of items belonging to each column.
AggregationExpression Expression to calculate the aggregated value for the rest of the rows for this column when using the AggregateOthers setting. All the aggregated values are provided as the context (_) for the aggregation expression. If no aggregation expression is defined, null value is used. Example aggregation expressions:

Row count:

"AggregationExpression": "Count(_)"

Sum of values (for numerical columns):

"AggregationExpression": "Sum(_)"

Showing count, minimum and maximum values:

"AggregationExpression": "\"(\" + Count(_) + \" others) \" + Min(_) + \" - \" + Max(_)"
ValueType

Type of the value, which is one of the following:

  • Single (default): Produces a single column calculated by the expression.
  • Dynamic: There is an expression for defining the actual produced columns dynamically (ValueDimensionExpression) and another expression (Expression property) for specifying the actual values for the columns.
  • Pivot: Produces as many columns as there are unique values in the expression. Columns are ordered according to the order defined by DimensionOrderExpression.
ValueDimensionExpression

(ValueType: Dynamic, Pivot)

Expression used to generate columns in Dynamic and Pivot value types. The expression should return an array where each item represents a column. This expression is executed once when calculating the Expression query. For Dynamic type value ValueDimensionExpression field is mandatory, and for Pivot type it's not mandatory.

The result of this expression will be given as the value of variable ValueDimension when evaluating the Expression. For Pivot type value, this can be used to override the default set of columns created automatically based on the actual cell evaluation results. An array of all the actual dimension values created by the actual cell evaluation result is given as context object for the evaluation.

NameExpression

(ValueType: Dynamic, Pivot)

Expression used to generate column names in Dynamic and Pivot value types. This expression is run once for each column, and the expression has as a context the column generated in the ValueDimensionExpression, and as a result the expression should give the column name.
DimensionOrderExpression

(ValueType: Pivot)

Expression used to order the dimensions when pivot value type value is used. If not defined, the following default expression will be used: OrderByValue(_).

PivotAggregationExpression

(ValueType: Pivot)

Expression used to aggregate all the values within one cell of pivot type values. If the expression returns an array of arrays, the first item in the inner array is used as the ValueDimension and the second item is used as the root object when evaluating PivotAggregationExpression. By default, if expression returns only an array of atomic objects, the root object of the evaluation of PivotAggregationExpression is the value of the current ValueDimension.

DefaultValueExpression

(ValueType: Pivot)

Expression whose result is used to fill the "gaps" of the matrix created when pivot value type is used.

EvaluateAfterAggregation If set to true, makes the value calculation occur only after dimension aggregations have been performed, thus making it possible to calculate values based on the already aggregated value or dimension columns. Available only in the DataFrame processing method.
IsHidden Is the value column hidden (true or false). Hiding values don't affect the calculation, but the hidden columns are not returned to the client.