System Library: Difference between revisions

From QPR ProcessAnalyzer Wiki
Jump to navigation Jump to search
No edit summary
 
(25 intermediate revisions by the same user not shown)
Line 1: Line 1:
System library is a collection of Expression Language functions and properties that provide additional tools for scripting Process Analyzer functionalities. System library is referenced in scripts via ''_system''-property, which provides additional properties dedicated for different areas of interests for scripting.  
System library is a collection of Expression Language functions and properties that provide additional tools for scripting Process Analyzer functionalities. System library is referenced in scripts via ''_system''-property, which provides additional properties dedicated for different areas of interests for scripting.  


The following hierarchy shows the properties and functions available in System Library:
==ML.GeneratePredictionModel==
Documentation [[Create_Predicted_Eventlog#Create_prediction_script_in_QPR_ProcessAnalyzer|here]].


* ML
==ML.ApplyTransformations==
** [[Create Predicted Eventlog|GeneratePredictionModel]]
Documentation [[Create_Simulated_Eventlog#Create_simulation_script_in_QPR_ProcessAnalyzer|here]].


** [[Create Simulated Eventlog|ApplyTransformations]]
==Parallel.Run==
* Parallel
** Run
* RootCauses
** FindRootCausesDataFrame
* Utils
** GetSampledEvents
** ModifyColumnTypes
** RunFunctionWithParallelLogging
 
== Parallel.Run ==
Runs given functions in parallel.
Runs given functions in parallel.


====== Parameters ======
===Parameters===
*'''functions''':
**An array of functions to run in parallel.


* '''functions''':
===Return value ===
** An array of functions to run in parallel.
An array of results returned by the called functions, in the same order as the function generating them in the ''functions''-parameter.


====== Example ======
===Example===
The following script uses _system.Parallel.Run to run three functions:
The following script uses _system.Parallel.Run to run three functions:


* SAP-extraction from VBAK-table in SAP
* SAP-extraction from VBAK-table in SAP (connection parameters defined in connectionParametersDict-dictionary).
* Transform the extracted data by adding a new column.
*Transform the extracted data by adding a new column.
* Load the data into data table identified by dataTableId.
*Load the data into data table identified by dataTableId.
<syntaxhighlight lang="typescript">
<syntaxhighlight lang="typescript">
function ExtractTransformAndLoad(extractFunc, transformFunc, loadFunc)
function ExtractTransformAndLoad(extractFunc, transformFunc, loadFunc)
Line 59: Line 52:


ExtractTransformAndLoad(
ExtractTransformAndLoad(
   () => ExtractSap(GetSapConnectionJson().Extend(
   () => ExtractSap(connectionParametersDict.Extend(
     [
     [
     "FieldNames": "VBELN,ERDAT,ERZET,ERNAM,NETWR,WAERK",  
     "FieldNames": "VBELN,ERDAT,ERZET,ERNAM,NETWR,WAERK",  
Line 69: Line 62:
   df => df.SetColumns(["Test": () => `${Column("NETWR")} ${Column("WAERK")}`]),
   df => df.SetColumns(["Test": () => `${Column("NETWR")} ${Column("WAERK")}`]),
   dataFlow => {
   dataFlow => {
     DataTableById({dataTable.Id}).Import(dataFlow, ["Append": 0]);
     DataTableById(dataTableId).Import(dataFlow, ["Append": 0]);
   }
   }
);
);


DataTableById({dataTable.Id}).SqlDataFrame.OrderByColumns(["VBELN"], [true]).Collect()
DataTableById(dataTableId).SqlDataFrame.OrderByColumns(["VBELN"], [true]).Collect()
</syntaxhighlight>
 
==RootCauses.FindForDataFrame==
Finds root causes for a particular process phenomenon by comparing properties of selected cases against those of all cases in given model.
 
Based on the similar in-memory function: [[FindRootCauses Function|EventLog.FindRootCauses]].
 
===Parameters===
 
*'''model''':
**Model object of a model whose data tables are used to calculate the root causes.
*'''parameters''':
**A parameter convertible to a StringDictionary object with the following supported key-values:
*** '''Filter''':
****Filter json (#30921#) that is applied to the event log before calculating root causes.
*** '''Selection''':
**** Selection json (#30927#) that defines selected cases to find root causes for. Is applied on top of the filtered event log (specified by Filter parameter).
****If nothing is selected, 100% of cases are counted as selected.
*** '''CaseAttributeTypes''':
****An array of strings with the names of case attributes included into the root causes.
****Only case attributes of type string, integer or boolean are included.
****If CaseAttributeTypes is null or empty string or not specified, all case attributes having type "String" are included into the root causes.
****If the model doesn't have a case attribute with the specified name, an error message is shown.
*** '''EventAttributeTypes''':
****An array of strings with the names of event attributes included into the root causes.
****Only event attributes of type string are included.
****If EventAttributeTypes is null, all event attributes having type "String" are included into the root causes.
****If EventAttributeTypes is empty array or not specified, event attributes not applied to the root causes.
****If the model doesn't have an event attribute with the specified name, an error message is shown.
****Analysis column Name should contain Event Attribute Name.
****Analysis column Value should contain Event Attribute Value and number of occurrences in case: <value> (count).
****Analysis column Type should have "EventAttributeValue" as its value.
***'''WeightingExpression''':
**** Expression providing weights for each case.
****Expression, if defined, must be any of the following types:
***** A string containing the [[SQL Expressions|SqlExpression]] to evaluate.
*****A SqlExpression object, created e.g., using ToSqlExpression-function (see also: [[QPR_ProcessAnalyzer_Expressions#In-memory_expression_blocks_in_SQL_expressions|In-memory expression blocks in SQL expressions]]).
****A row is filtered out of result if expression result is null.
***'''MaximumRowCount''':
****The maximum number of the most and least contributing root causes to return. Thus, the actual number of returned rows can be at most two times this value (if specified).
****If undefined, 200 is used.
**** If set to 0, all rows are returned.
***'''MinValueUsage''':
****The minimum total usage of a value included into the comparison. The number of cases having every returned value should be at least given percentage (a float value between 0.0 and 1.0) of all the compared cases.
*****If not defined or null, all values are included (=default).
***'''MaxNumUniqueValues''':
****Maximum number of unique values to include into the comparison for each attribute column. If the amount of unique values for any attribute exceeds this value, only given number of attributes are included that have the highest usage.
****If not defined or null, all values are included (=default).
***'''IncludeOthers''': 
****Should the rest of the attribute values not included due to MinValueUsage or MaxNumUniqueValues filtering be included as an aggregated "Others" value?
****If given any string value, that value is used as the value for all the aggregated values.
****Default = undefined => others values will not be included into the results.
***'''ValueIfNull''': Value used to indicate null-values.
****Default = "(blank)".
****Must be not null.
 
===Return value===
Returns a [[SqlDataFrame in Expression Language|SqlDataFrame]] object with the following columns:
 
*Common columns:
**'''Type''':
***Type of the root cause:
****"CaseAttributeValue" for case attributes.
****"EventAttributeValue" for event attributes.
**'''Name''': 
***When type is CaseAttributeValue, case attribute name.
***When type is EventAttributeValue, event attribute name.
**'''Value''':
***When type is CaseAttributeValue, case attribute value.
***When type is EventAttributeValue, event attribute value and number of occurrences in case.
** '''Total''': 
***The total number of cases having the found root cause.
** '''Selected''':
***The number of cases that have the found root cause and belong to the selected cases.
**'''Compared''':
*** The number of cases that have the found root cause and don't belong to the selected cases.
**Columns when WeightingExpression not have value:
***'''Contribution''':
****The number of cases which contribute to the deviation from the average percentage of selected cases among all analyzed cases.
***'''ContributionPercentage''':
****The percent of cases which contribute to the deviation from the average percentage.
*** '''DifferencePercentage''':
****The deviation in percentage between selected cases with the found root cause and the average percentage of selected cases among all analyzed cases.
***'''SelectedPercentage''':
**** The percent of selected cases that have the found root cause out of all cases with that root cause.
**Columns when WeightingExpression have value:
***'''Contribution''':
****The sum of case weights which contribute to the deviation from the average percentage of selected case weights among all analyzed cases.
***'''ContributionPercentage''':
****The percent of case weights which contribute to the deviation from the average percentage.
*** '''DifferencePercentage''':
****The deviation in percentage between selected case weights with the found root cause and the average percentage of selected case weights among all analyzed cases.
***'''SelectedPercentage''':
****The percent of selected case weights that have the found root cause out of all case weights with that root cause.
*** '''SelectedWeight''':
**** The sum of weights that have the found root cause and belong to the selected cases.
*** '''ComparedWeight''':
****The sum of weights that have the found root cause and don't belong to the selected cases.
*** '''TotalWeight''':
****The sum of weights of all cases with that root cause.
 
===Example===
Calculate root cause analysis for given model using parameters read from the [[Web API: Expression/query|query configuration]].<syntaxhighlight lang="json">
{
  "ProcessingMethod": "DataFrame",
  "Root":"let m = _; _system.RootCauses.FindForDataFrame(m, _query.Configuration.Parameters.FindRootCausesParameters.Clone().Extend(#{\"Filter\": _query.Configuration.Filter}))",
  "Parameters": {
    "FindRootCausesParameters": {
      "CaseAttributeTypes": ["Account Manager","Customer Group","Product Group","Region"],
      "Selection": {"Items":[{"Type":"IncludeCases","Items":[{"Type":"CaseAttributeValue","Attribute":"Product Group","StringifiedValues":["0Hats"]}]}]},
      "MaxNumUniqueValues": 2,
      "MaximumRowCount": 1000,
      "MinValueUsage": 0.20,
      "WeightingExpression": "Column(\"Cost\")"
    }
  },
  "Ordering": [
    {"Name": "Contribution", "Direction": "Descending"}
  ]
}
 
</syntaxhighlight>
 
==Utils.GetSampledEvents==
Returns a [[SqlDataFrame in Expression Language|SqlDataFrame]] containing sampled events of given model where given filter is first applied.
 
===Parameters ===
 
*'''sourceModel''':
**ProcessAnalyzer model object of the model whose event data is to be filtered and sampled.
*'''sampledCaseCount''':
**The maximum number of cases to return (or null if all cases should be returned).
* '''filter''':
**JSON filter to be applied on the event data of the source model prior to performing the sampling.
 
===Return value===
[[SqlDataFrame in Expression Language|SqlDataFrame]] containing sampled events of given model where given filter is first applied.
 
===Example===
Get a sample of all the events of at most 1000 cases having "Hats" as "Product Group" case attribute value from model identified by modelId.<syntaxhighlight lang="typescript">
let eventDataSampleSdf = _system.Utils.GetSampledEvents(modelId, 1000, #{
  "Items":[#{
    "Type":"IncludeCases",
"Items":[#{
  "Type":"CaseAttributeValue",
  "Attribute":"Product Group",
  "StringifiedValues":["0Hats"]
}]
  }]
});
</syntaxhighlight>
 
==Utils.ModifyColumnTypes==
In-place modifies the column types of given columns in given data table.
 
===Parameters===
*'''dataTable''':
**ProcessAnalyzer model object of the model whose event data is to be filtered and sampled.
*'''columnTypesToSet''':
**Array of column name/type definitions to set.
***Only columns that are to be changed are required to be listed.
***Columns that don't exist in the data table will be skipped.
 
===Return value ===
Returns the modified data table object.
 
===Example===
Modify column "CaseId" to be of type string.<syntaxhighlight lang="typescript">
_system.Utils.ModifyColumnTypes(
  eventsTable,
  [#{"Name": "CaseId", "DataType": "String"}]
);
 
</syntaxhighlight>
</syntaxhighlight>


== RootCauses.FindRootCausesDataFrame ==
==Utils.RunFunctionWithParallelLogging==
foo
Runs given function that generates logging information into given data table in a way that all the logging will be included into the generated script run log as well (if run inside a script).
 
Internally polls the table every 5 seconds for new rows and adds all the newly added rows to script log.
 
===Parameters===
 
*'''logTable''':
**A DataTable used for logging.
*'''callbackFunc''':
** Function that uses given data table for logging it's current status.
 
===Return value===
The result of the callback function.
 
=== Example===
Run stored procedure named "StoredProcedureTest" in Snowflake, that generates new rows to log table identified by logTableId and log the generated rows into the script run log.<syntaxhighlight lang="typescript">
_system.Utils.RunFunctionWithParallelLogging(DataTableById(logTableId), () => {
    CreateSnowflakeConnection().CallStoredProcedure("StoredProcedureTest", #{})
});


== Utils.GetSampledEvents ==
foo


== Utils.RunFunctionWithParallelLogging ==
</syntaxhighlight>
foo

Latest revision as of 10:24, 20 November 2024

System library is a collection of Expression Language functions and properties that provide additional tools for scripting Process Analyzer functionalities. System library is referenced in scripts via _system-property, which provides additional properties dedicated for different areas of interests for scripting.

ML.GeneratePredictionModel

Documentation here.

ML.ApplyTransformations

Documentation here.

Parallel.Run

Runs given functions in parallel.

Parameters

  • functions:
    • An array of functions to run in parallel.

Return value

An array of results returned by the called functions, in the same order as the function generating them in the functions-parameter.

Example

The following script uses _system.Parallel.Run to run three functions:

  • SAP-extraction from VBAK-table in SAP (connection parameters defined in connectionParametersDict-dictionary).
  • Transform the extracted data by adding a new column.
  • Load the data into data table identified by dataTableId.
function ExtractTransformAndLoad(extractFunc, transformFunc, loadFunc)
{
  let rawDataFlow = extractFunc();
  let transformedDataFlow = ToDataFlow();

  _system.Parallel.Run([
    () => Catch({
      let df;
      while (!IsNullTop(df = rawDataFlow.Collect(#{"CollectChunk": true}))) {
        transformedDataFlow.Append(transformFunc(df));
        WriteLog(`A chunk having ${df.NRows} rows has been transformed.`);
      }
      if (rawDataFlow.HasError) {
        transformedDataFlow.Fail("Error occurred during data extraction.");
      }
      else {
        transformedDataFlow.Complete();
      }
    }, {
      transformedDataFlow.Fail("Error occurred during transformation calculation.");
    }),
    () => {
      loadFunc(transformedDataFlow);
    }
  ]);
}

ExtractTransformAndLoad(
  () => ExtractSap(connectionParametersDict.Extend(
    [
    "FieldNames": "VBELN,ERDAT,ERZET,ERNAM,NETWR,WAERK", 
    "QueryTable": "VBAK",
	"Options": ["VBELN BETWEEN '0000017448'" ,"AND '0060000042'"],
    "UseGateway": true
    ])
  ),
  df => df.SetColumns(["Test": () => `${Column("NETWR")} ${Column("WAERK")}`]),
  dataFlow => {
    DataTableById(dataTableId).Import(dataFlow, ["Append": 0]);
  }
);

DataTableById(dataTableId).SqlDataFrame.OrderByColumns(["VBELN"], [true]).Collect()

RootCauses.FindForDataFrame

Finds root causes for a particular process phenomenon by comparing properties of selected cases against those of all cases in given model.

Based on the similar in-memory function: EventLog.FindRootCauses.

Parameters

  • model:
    • Model object of a model whose data tables are used to calculate the root causes.
  • parameters:
    • A parameter convertible to a StringDictionary object with the following supported key-values:
      • Filter:
        • Filter json (#30921#) that is applied to the event log before calculating root causes.
      • Selection:
        • Selection json (#30927#) that defines selected cases to find root causes for. Is applied on top of the filtered event log (specified by Filter parameter).
        • If nothing is selected, 100% of cases are counted as selected.
      • CaseAttributeTypes:
        • An array of strings with the names of case attributes included into the root causes.
        • Only case attributes of type string, integer or boolean are included.
        • If CaseAttributeTypes is null or empty string or not specified, all case attributes having type "String" are included into the root causes.
        • If the model doesn't have a case attribute with the specified name, an error message is shown.
      • EventAttributeTypes:
        • An array of strings with the names of event attributes included into the root causes.
        • Only event attributes of type string are included.
        • If EventAttributeTypes is null, all event attributes having type "String" are included into the root causes.
        • If EventAttributeTypes is empty array or not specified, event attributes not applied to the root causes.
        • If the model doesn't have an event attribute with the specified name, an error message is shown.
        • Analysis column Name should contain Event Attribute Name.
        • Analysis column Value should contain Event Attribute Value and number of occurrences in case: <value> (count).
        • Analysis column Type should have "EventAttributeValue" as its value.
      • WeightingExpression:
        • Expression providing weights for each case.
        • Expression, if defined, must be any of the following types:
        • A row is filtered out of result if expression result is null.
      • MaximumRowCount:
        • The maximum number of the most and least contributing root causes to return. Thus, the actual number of returned rows can be at most two times this value (if specified).
        • If undefined, 200 is used.
        • If set to 0, all rows are returned.
      • MinValueUsage:
        • The minimum total usage of a value included into the comparison. The number of cases having every returned value should be at least given percentage (a float value between 0.0 and 1.0) of all the compared cases.
          • If not defined or null, all values are included (=default).
      • MaxNumUniqueValues:
        • Maximum number of unique values to include into the comparison for each attribute column. If the amount of unique values for any attribute exceeds this value, only given number of attributes are included that have the highest usage.
        • If not defined or null, all values are included (=default).
      • IncludeOthers:
        • Should the rest of the attribute values not included due to MinValueUsage or MaxNumUniqueValues filtering be included as an aggregated "Others" value?
        • If given any string value, that value is used as the value for all the aggregated values.
        • Default = undefined => others values will not be included into the results.
      • ValueIfNull: Value used to indicate null-values.
        • Default = "(blank)".
        • Must be not null.

Return value

Returns a SqlDataFrame object with the following columns:

  • Common columns:
    • Type:
      • Type of the root cause:
        • "CaseAttributeValue" for case attributes.
        • "EventAttributeValue" for event attributes.
    • Name:
      • When type is CaseAttributeValue, case attribute name.
      • When type is EventAttributeValue, event attribute name.
    • Value:
      • When type is CaseAttributeValue, case attribute value.
      • When type is EventAttributeValue, event attribute value and number of occurrences in case.
    • Total:
      • The total number of cases having the found root cause.
    • Selected:
      • The number of cases that have the found root cause and belong to the selected cases.
    • Compared:
      • The number of cases that have the found root cause and don't belong to the selected cases.
    • Columns when WeightingExpression not have value:
      • Contribution:
        • The number of cases which contribute to the deviation from the average percentage of selected cases among all analyzed cases.
      • ContributionPercentage:
        • The percent of cases which contribute to the deviation from the average percentage.
      • DifferencePercentage:
        • The deviation in percentage between selected cases with the found root cause and the average percentage of selected cases among all analyzed cases.
      • SelectedPercentage:
        • The percent of selected cases that have the found root cause out of all cases with that root cause.
    • Columns when WeightingExpression have value:
      • Contribution:
        • The sum of case weights which contribute to the deviation from the average percentage of selected case weights among all analyzed cases.
      • ContributionPercentage:
        • The percent of case weights which contribute to the deviation from the average percentage.
      • DifferencePercentage:
        • The deviation in percentage between selected case weights with the found root cause and the average percentage of selected case weights among all analyzed cases.
      • SelectedPercentage:
        • The percent of selected case weights that have the found root cause out of all case weights with that root cause.
      • SelectedWeight:
        • The sum of weights that have the found root cause and belong to the selected cases.
      • ComparedWeight:
        • The sum of weights that have the found root cause and don't belong to the selected cases.
      • TotalWeight:
        • The sum of weights of all cases with that root cause.

Example

Calculate root cause analysis for given model using parameters read from the query configuration.

{
  "ProcessingMethod": "DataFrame",
  "Root":"let m = _; _system.RootCauses.FindForDataFrame(m, _query.Configuration.Parameters.FindRootCausesParameters.Clone().Extend(#{\"Filter\": _query.Configuration.Filter}))",
  "Parameters": {
    "FindRootCausesParameters": {
      "CaseAttributeTypes": ["Account Manager","Customer Group","Product Group","Region"],
      "Selection": {"Items":[{"Type":"IncludeCases","Items":[{"Type":"CaseAttributeValue","Attribute":"Product Group","StringifiedValues":["0Hats"]}]}]},
      "MaxNumUniqueValues": 2,
      "MaximumRowCount": 1000,
      "MinValueUsage": 0.20,
      "WeightingExpression": "Column(\"Cost\")"
    }
  },
  "Ordering": [
    {"Name": "Contribution", "Direction": "Descending"}
  ]
}

Utils.GetSampledEvents

Returns a SqlDataFrame containing sampled events of given model where given filter is first applied.

Parameters

  • sourceModel:
    • ProcessAnalyzer model object of the model whose event data is to be filtered and sampled.
  • sampledCaseCount:
    • The maximum number of cases to return (or null if all cases should be returned).
  • filter:
    • JSON filter to be applied on the event data of the source model prior to performing the sampling.

Return value

SqlDataFrame containing sampled events of given model where given filter is first applied.

Example

Get a sample of all the events of at most 1000 cases having "Hats" as "Product Group" case attribute value from model identified by modelId.

let eventDataSampleSdf = _system.Utils.GetSampledEvents(modelId, 1000, #{
  "Items":[#{
    "Type":"IncludeCases",
	"Items":[#{
	  "Type":"CaseAttributeValue",
	  "Attribute":"Product Group",
	  "StringifiedValues":["0Hats"]
	}]
  }]
});

Utils.ModifyColumnTypes

In-place modifies the column types of given columns in given data table.

Parameters

  • dataTable:
    • ProcessAnalyzer model object of the model whose event data is to be filtered and sampled.
  • columnTypesToSet:
    • Array of column name/type definitions to set.
      • Only columns that are to be changed are required to be listed.
      • Columns that don't exist in the data table will be skipped.

Return value

Returns the modified data table object.

Example

Modify column "CaseId" to be of type string.

_system.Utils.ModifyColumnTypes(
  eventsTable, 
  [#{"Name": "CaseId", "DataType": "String"}]
);

Utils.RunFunctionWithParallelLogging

Runs given function that generates logging information into given data table in a way that all the logging will be included into the generated script run log as well (if run inside a script).

Internally polls the table every 5 seconds for new rows and adds all the newly added rows to script log.

Parameters

  • logTable:
    • A DataTable used for logging.
  • callbackFunc:
    • Function that uses given data table for logging it's current status.

Return value

The result of the callback function.

Example

Run stored procedure named "StoredProcedureTest" in Snowflake, that generates new rows to log table identified by logTableId and log the generated rows into the script run log.

_system.Utils.RunFunctionWithParallelLogging(DataTableById(logTableId), () => {
    CreateSnowflakeConnection().CallStoredProcedure("StoredProcedureTest", #{})
});