SQL Expressions: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
Line 30: Line 30:
||DateDiff
||DateDiff
||
||
||
Calculates how many of the specified date part boundaries there are between the specified dates. Parameters:
Calculates how many specified date part boundaries there are between the specified dates. Parameters:
# '''date period''': Date period in which the difference is calculated between the dates. Supported date periods: second, minute, hour, day, week, quarter, month, year.
# '''date part''': Units in which datediff reports the difference between the dates. Supported date parts: second, minute, hour, day, week, quarter, month, year.
# '''start date''': Starting timestamp.
# '''start date''': Starting timestamp.
# '''end date''': Ending timestamp.
# '''end date''': Ending timestamp.
Line 38: Line 37:
||Day
||Day
||
||
Returns the days part of given timestamp.
Returns the days of the month (1-31) of given timestamp.
<pre>
<pre>
Day(Column("DateColumn"))
Day(Column("DateColumn"))
Line 45: Line 44:
|-
|-
||Hour
||Hour
||Returns the hours part of given timestamp.
||Returns the hours part (0-59) of given timestamp.
|-
|-
||Millisecond
||Millisecond
||Returns the milliseconds part of given timestamp.
||Returns the milliseconds part (0-999) of given timestamp.
|-
|-
||Minute
||Minute
||Returns the minutes part of given timestamp.
||Returns the minutes part (0-59) of given timestamp.
|-
|-
||Month
||Month
||Returns the months part of given timestamp.
||Returns the months part (1-12) of given timestamp.
|-
|-
||Second
||Second
||Returns the seconds part of given timestamp.
||Returns the seconds part (0-59) of given timestamp.
|-
|-
||Year
||Year
||Returns the years part of given timestamp.
||Returns the year of given timestamp.
|}
|}



Revision as of 15:09, 20 February 2022

SQL expressions are special expressions, which are converted into SQL and evaluated in an external system that supports SQL (e.g., Snowflake, SQL Server, Databricks, Redshift). Only a subset of expression language functionalities are supported by the SQL expressions (which are explained in this page). SQL expressions are used e.g. in the Where and WithExpressionColumn functions.

Operators

Following operators are supported by by the SQL expressions:

  • Arithmetic operators: +, -, *, /, %
  • Comparison operators: ==, <, <=, >, >=, !=.
  • Logical operators: &&, or, !
  • Data types: strings ("this is a string"), integers (123), decimal numbers (123.45), booleans (true, false), null value (null)

Mathematical functions

Function Description
Ceiling

Returns given value rounded to the nearest equal or larger integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Floor

Returns given value rounded to the nearest equal or smaller integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Date functions

Function Description¨
DateDiff

Calculates how many of the specified date part boundaries there are between the specified dates. Parameters:

  1. date period: Date period in which the difference is calculated between the dates. Supported date periods: second, minute, hour, day, week, quarter, month, year.
  2. start date: Starting timestamp.
  3. end date: Ending timestamp.
Day

Returns the days of the month (1-31) of given timestamp.

Day(Column("DateColumn"))

Hour Returns the hours part (0-59) of given timestamp.
Millisecond Returns the milliseconds part (0-999) of given timestamp.
Minute Returns the minutes part (0-59) of given timestamp.
Month Returns the months part (1-12) of given timestamp.
Second Returns the seconds part (0-59) of given timestamp.
Year Returns the year of given timestamp.

Other functions

Function Description
CaseWhen

Goes through conditions and returns a value when the first condition is met, similar to an if-then-else. Once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the else-expression. Consists of any number of pairs of condition and value expressions followed by an optional else expression.

The odd parameters are the conditions and the even parameters are the return values. If no conditions are true, it returns the value in the last parameter which is the "else" parameter. If the "else" parameter is not defined (i.e. there are even number of parameters), null value is used as default.

CaseWhen(Column("a") == null, 1, Column("a") < 1.0, 2, 3)
Returns 1 if the value of column "a" is null.
Returns 2 if the value of column "a" is less than 1.0.
Returns 3 otherwise.

Returns given value rounded to the nearest equal or larger integer. The data type should be one of the numeric data types. If the value is null, then the result is also null.

Coalesce

Returns the first non-null parameter. There can be any number of parameters. If all parameters are null, returns null.

Coalesce(null, 3, 2)
Returns 3.

Coalesce(Column("column1"), "N/A")
Returns column "column1" value, except replaces nulls with "N/A".
Column

Return the value of given column.

Column("column1")

Column("My Column 2")
Concat

Return the concatenated string value of given values.

Concat("part 1", "part 2")
Returns "part 1part 2"

Concat(Column("column1"), " ", Column("column2"))
Returns column1 and column2 value concatenated separated by space.
Variable

Returns the value of given variable. Supports number, string and boolean values.

Examples:

let x = "124256122";
CreateRedshiftConnection().Table("eventlog_o2c_events").Where(#sql{Column("CaseId")==Variable("x")}).Collect()

Returns cases having id equal to "124256122".

AggregateFrom function

AggregateFrom function aggregates value from an aggregation level that is has smaller grain size than the current level used in the analysis to the current level. Parameters:

  1. aggregation level: Aggregation level to aggregate from including possible additional data frame expressions to prepare the aggregation level.
  2. aggregation function: Aggregation function or object definition.
  3. expression (optional, default = null): Value expression used to generate the expression evaluated in the external system prior to aggregation.
  4. filter (optional): Filter to be applied to the events prior to performing value aggregation. Filter is given as JSON selection configuration transformed into expression language dictionaries, arrays and scalar values.

Examples:

"Root": "EventTypes",
"Values": [{
  "Name": "Count",
  "Expression": "AggregateFrom(Events, \"Count\")"
}]

Returns the number of events having each event type.

"Root": "Cases",
"Values": [{
  "Name": "Variation",
  "Expression": "AggregateFrom(Events, #{ \"Function\":  \"List\", \"Ordering\": [\"TimeStamp\"], \"Separator\": \"#,#\" }, Column(\"EventType\"))"
}]

Returns variation/event type path string for all the cases.

"Root":"Cases",
"Values":[{
    "Name":"NumCasesWithSameVariation",
    "Expression": "GetValueFrom(Variations, AggregateFrom(Cases, \"Count\"))"
}]

Returns the number of cases having the same variation for every case.

"Root": "Model",
"Values": [{
  "Name": "NumCasesWithDallasAsRegion",
  "Expression":"AggregateFrom(Cases, \"Count\", null, #{\"Items\":[#{\"Type\":\"IncludeCases\",\"Items\":[#{\"Type\":\"CaseAttributeValue\",\"Values\":[\"Dallas\"], \"Attribute\":\"Region\"}]}]})"
}]

Returns the total number of cases in the model having "Dallas" as the value of "Region" case attribute.

"Root":"Cases",
"Values": [{
  "Name": "Duration",
  "Expression": "Cast(DateDiff(\"Seconds\", AggregateFrom(Events.Where(Column(\"EventType\") == \"Sales Order\"), \"Min\", Column(\"TimeStamp\")), AggregateFrom(Events.Where(Column(\"EventType\") == \"Invoice\"), \"Max\", Column(\"TimeStamp\"))), \"Float\")"
}]

Returns the duration in seconds between the first occurrence of "Sales Order"-event type and the last occurrence of "Invoice"-event type for each case.

GetValueFrom function

GetValueFrom function retrieves the value from aggregation level that has bigger or the same grain size than the current level to current level. Parameters:
  1. aggregationLevel: Aggregation level to aggregate from including possible additional data frame expressions to prepare the aggregation level.
  2. expression: Value expression to evaluate in given aggregation level.
  3. filter (optional): Filter to be applied to the events prior to performing value expression evaluation. Filter is given as JSON selection configuration transformed into expression language dictionaries, arrays and scalar values.

Examples:

"Root": "Events",
"Values":[{
  "Name":"Account Manager",
  "SourceColumn": "Account Manager",
  "Expression": "GetValueFrom(Cases, Column(\ "Account Manager\"))"
}]

Returns for each event the value of Account Manager case attribute.

"Root": "Events",
"Values":[{
  "Name": "Variation",
  "Expression": "GetValueFrom(Variations, Column(\"Variation\"))"
}]

Returns for each event variation/event type path string of its case.

"Root":"Cases",
"Values":[{
    "Name":"NumCasesWithSameVariation",
    "Expression": "GetValueFrom(Variations, AggregateFrom(Cases, \"Count\"))"
}]

Returns the number of cases having the same variation for every case.

"Root":"Cases",
"Values":[{
    "Name":"FilteredVariation",
    "Expression": "GetValueFrom(Cases, Column(\"Variation\"), #{{\"Items\":[#{{\"Type\":\"IncludeEventTypes\",\"Items\":[#{{\"Type\": \"EventType\",\"Values\":[\"Shipment\",\"Invoice\"]}}]}}]}})"
}]

Returns cases with their variations where only "Shipment" and "Invoice" event types are taken into account.