SQL Expressions: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
Line 146: Line 146:
Returns the number of cases having the same variation for every case.
Returns the number of cases having the same variation for every case.


Example for Cases:
Cast(DateDiff("Seconds", AggregateFrom(Events.Where(Column("EventType") == "Sales Order"), "Min", Column("TimeStamp")), AggregateFrom(Events.Where(Column("EventType") == "Invoice"), "Max", Column("TimeStamp"))), "Float")
Returns the duration in seconds between the first occurrence of "Sales Order"-event type and the last occurrence of "Invoice" event type for each case.
</pre>
 
Example for Model:
<pre>
<pre>
AggregateFrom(Cases, "Count", null, #{"Items":[#{"Type":"IncludeCases","Items":[#{"Type":"CaseAttributeValue","Values":["Dallas"], "Attribute":"Region"}]}]})
AggregateFrom(Cases, "Count", null, #{"Items":[#{"Type":"IncludeCases","Items":[#{"Type":"CaseAttributeValue","Values":["Dallas"], "Attribute":"Region"}]}]})
Returns the total number of cases in the model having "Dallas" as the value of "Region" case attribute.
Returns the total number of cases in the model having "Dallas" as the value of "Region" case attribute.
Cast(DateDiff("Seconds", AggregateFrom(Events.Where(Column("EventType") == "Sales Order"), "Min", Column("TimeStamp")), AggregateFrom(Events.Where(Column("EventType") == "Invoice"), "Max", Column("TimeStamp"))), "Float")
Returns the duration in seconds between the first occurrence of "Sales Order"-event type and the last occurrence of "Invoice" event type for each case.
</pre>
</pre>



Revision as of 15:48, 20 February 2022

SQL expressions are special expressions, which are converted into SQL and evaluated in an external system that supports SQL (e.g., Snowflake, SQL Server, Databricks, Redshift). Only a subset of expression language functionalities are supported by the SQL expressions (which are explained in this page). SQL expressions are used e.g. in the Where and WithExpressionColumn functions.

Operators

Following operators are supported by by the SQL expressions:

  • Arithmetic operators: +, -, *, /, %
  • Comparison operators: ==, <, <=, >, >=, !=.
  • Logical operators: &&, or, !
  • Data types: strings ("this is a string"), integers (123), decimal numbers (123.45), booleans (true, false), null value (null)

Mathematical functions

Function Description
Ceiling

Returns given value rounded to the nearest equal or larger integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Floor

Returns given value rounded to the nearest equal or smaller integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Date functions

Function Description¨
DateDiff

Calculates how many of the specified date part boundaries there are between the specified dates. Parameters:

  1. date period: Date period in which the difference is calculated between the dates. Supported date periods: second, minute, hour, day, week, quarter, month, year.
  2. start date: Starting timestamp.
  3. end date: Ending timestamp.
Day

Returns the days of the month (1-31) of given timestamp.

Day(Column("DateColumn"))
Hour Returns the hours part (0-59) of given timestamp.
Millisecond Returns the milliseconds part (0-999) of given timestamp.
Minute Returns the minutes part (0-59) of given timestamp.
Month Returns the months part (1-12) of given timestamp.
Second Returns the seconds part (0-59) of given timestamp.
Year Returns the year of given timestamp.

Other functions

Function Description
CaseWhen

Goes through conditions and returns a value when the first condition is met, similar to an if-then-else. Once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the else-expression. Consists of any number of pairs of condition and value expressions followed by an optional else expression.

The odd parameters are the conditions and the even parameters are the return values. If no conditions are true, it returns the value in the last parameter which is the "else" parameter. If the "else" parameter is not defined (i.e. there are even number of parameters), null value is used as default.

CaseWhen(Column("a") == null, 1, Column("a") < 1.0, 2, 3)
Returns 1 if the value of column "a" is null.
Returns 2 if the value of column "a" is less than 1.0.
Returns 3 otherwise.

Returns given value rounded to the nearest equal or larger integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Coalesce

Returns the first non-null parameter. There can be any number of parameters. If all parameters are null, returns null.

Coalesce(null, 3, 2)
Returns 3.

Coalesce(Column("column1"), "N/A")
Returns column "column1" value, except replaces nulls with "N/A".
Column

Return the value of given column.

Column("column1")

Column("My Column 2")
Concat

Return the concatenated string value of given values.

Concat("part 1", "part 2")
Returns "part 1part 2"

Concat(Column("column1"), " ", Column("column2"))
Returns column1 and column2 value concatenated separated by space.
Variable

Returns value of given variable. Supports number, string and boolean values.

Examples:

let myRegion = "Dallas";
DatatableById(123).SqlDataFrame.Where(Column("Region") == Variable("myRegion")).Collect()
Filters datatable by Region is Dallas.

AggregateFrom function

AggregateFrom function aggregates value from an aggregation level that is has smaller grain size than the current level used in the analysis to the current level. Parameters:

  1. aggregation level: Aggregation level to aggregate from including possible additional data frame expressions to prepare the aggregation level.
  2. aggregation function: Aggregation function or object definition.
  3. expression (optional, default = null): Value expression used to generate the expression evaluated in the external system prior to aggregation.
  4. filter: Optional filter to apply prior to performing value aggregation. Filter is given as JSON selection configuration transformed into expression language dictionaries, arrays and scalar values.

Example for EventTypes:

AggregateFrom(Events, "Count")
Returns the number of events having each event type.

Example for Cases:

AggregateFrom(Events, #{ "Function": "List", "Ordering": ["TimeStamp"], "Separator": "#,#" }, Column("EventType"))
Returns variation/event type path string for all the cases.

GetValueFrom(Variations, AggregateFrom(Cases, "Count"))
Returns the number of cases having the same variation for every case.

Cast(DateDiff("Seconds", AggregateFrom(Events.Where(Column("EventType") == "Sales Order"), "Min", Column("TimeStamp")), AggregateFrom(Events.Where(Column("EventType") == "Invoice"), "Max", Column("TimeStamp"))), "Float")
Returns the duration in seconds between the first occurrence of "Sales Order"-event type and the last occurrence of "Invoice" event type for each case.

Example for Model:

AggregateFrom(Cases, "Count", null, #{"Items":[#{"Type":"IncludeCases","Items":[#{"Type":"CaseAttributeValue","Values":["Dallas"], "Attribute":"Region"}]}]})
Returns the total number of cases in the model having "Dallas" as the value of "Region" case attribute.

GetValueFrom function

GetValueFrom function retrieves the value from aggregation level that has bigger or the same grain size than the current level to current level. Parameters:

  1. aggregation level: Aggregation level to aggregate from including possible additional data frame expressions to prepare the aggregation level.
  2. expression: Expression to evaluate in given aggregation level to get the returned value.
  3. filter: Optional filter to apply prior to performing expression evaluation. Filter is given as dictionary following the JSON filter syntax.

Examples as measure expression for events:

GetValueFrom(Cases, Column("Account Manager\"))
Returns for each event the value of Account Manager case attribute.

GetValueFrom(Variations, Column("Variation"))
Returns for each event variation/event type path string of its case.

Examples as measure expression for events:

GetValueFrom(Variations, AggregateFrom(Cases, "Count"))
Returns the number of cases having the same variation for every case.

GetValueFrom(Cases, Column("Variation"), #{"Items":[#{"Type":"IncludeEventTypes","Items":[#{"Type":"EventType","Values":["Shipment","Invoice"]}]}]})
Returns cases with their variations where only "Shipment" and "Invoice" event types are taken into account.