Machine Learning Functions in Expression Language
Revision as of 12:55, 6 May 2019
This page describes the functions and properties related to the machine learning functionality in the QPR ProcessAnalyzer expression language.
MLModel
MLModel properties | Description |
---|---|
Type | Returns the exact type of the MLModel. |
Machine Learning Functions
Function | Parameters | Description |
---|---|---|
MLModel (MLModel) | Type (string) | Creates a new binary classification model of the given type. The type parameter specifies the kind of prediction/classification model to create. Currently the only supported value is randomforest, which uses Accord.NET's RandomForest algorithm. Returns the created MLModel object. |
Train (MLModel) | Input data, expected outcomes | Trains the given MLModel using the given input data and expected outcomes. Parameters: the input data and the expected outcomes, passed as in Train(trainDataOH, outcomes) in the examples below. Returns the trained MLModel object itself. |
Transform (array) | Input data | Transforms the given input data using the machine learning model, generating predictions/classifications. The input data parameter is a two-dimensional array in which the first dimension (rows) specifies the data points and the second dimension (columns) specifies the feature values. Returns an array of predictions/classifications. The result for each row of the input data is found at the same index of the returned array. |
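The input and output layout described for Transform can be illustrated outside the expression language. The following is a minimal Python sketch, not QPR ProcessAnalyzer or Accord.NET code: the ThresholdModel class and its transform method are hypothetical stand-ins for a trained model, used only to show that each row is one data point, each column is one feature, and prediction i belongs to row i.

```python
# Hypothetical stand-in for a trained binary classifier.
# This is NOT the QPR ProcessAnalyzer or Accord.NET API.
class ThresholdModel:
    """Predicts 1 when a row's feature sum reaches a threshold, otherwise 0."""

    def __init__(self, threshold):
        self.threshold = threshold

    def transform(self, rows):
        # One prediction per input row, returned in the same order as the rows.
        return [1 if sum(row) >= self.threshold else 0 for row in rows]


# First dimension: data points (rows). Second dimension: feature values (columns).
input_data = [
    [1, 0, 1],  # data point 0
    [0, 0, 0],  # data point 1
    [1, 1, 1],  # data point 2
]

predictions = ThresholdModel(threshold=2).transform(input_data)

# The prediction for a row sits at the same index as the row itself.
for index, (row, prediction) in enumerate(zip(input_data, predictions)):
    print(index, row, prediction)
```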
Examples
Example #1: Train a model using an event log and test its performance by replaying the training data itself.
<pre>
Def("GetOneHotColumnInformation", (
  Let("el", _),
  ToDictionary([
    "et": OrderByValue(el.EventTypes),
    "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values]))
  ])
));

Def("GenerateOneHot", "cases", (
  Let("columnInformation", _),
  cases.(
    Let("cas", _),
    Flatten(
      [
        columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)),
        (
          Let("atColumns", columnInformation.Get("at")),
          OrderByValue(atColumns.Keys).(
            Let("key", _),
            Let("values", atColumns.Get(key)),
            Let("caseValue", cas.Attribute(key)),
            values.(If(_ == caseValue, 1, 0))
          )
        )
      ]
    )
  )
));

Let("el", EventLogById(1));
Let("columnInformation", el.GetOneHotColumnInformation());
Let("allCases", el.Cases);
Let("allCasesOH", columnInformation.GenerateOneHot(el.Cases));
Let("trainDataOH", allCasesOH);
Let("outcomes", allCases.(Duration > TimeSpan(24)));
Let("testDataOH", allCasesOH);
Let("predictions", MLModel("randomforest")
  .Train(trainDataOH, outcomes)
  .Transform(trainDataOH));
Sum(Zip(outcomes, predictions).(_[0] == _[1] != 0)) / Count(outcomes)
</pre>
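The one-hot encoding that the GetOneHotColumnInformation and GenerateOneHot definitions appear to perform can be sketched in Python for readers unfamiliar with the technique. This is an illustration under stated assumptions, not a translation of the expression language: the dictionary layout of a case ("events", "attributes"), the function names, and the sample data are all hypothetical, and attribute values are sorted here only to make the column order deterministic.

```python
def get_column_information(cases):
    """Collect the columns: sorted event types, and per attribute its sorted values."""
    event_types = sorted({et for case in cases for et in case["events"]})
    attribute_values = {}
    for case in cases:
        for name, value in case["attributes"].items():
            attribute_values.setdefault(name, set()).add(value)
    return {
        "et": event_types,
        "at": {name: sorted(values) for name, values in attribute_values.items()},
    }


def generate_one_hot(column_information, cases):
    """Build one row of 0/1 features per case."""
    rows = []
    for case in cases:
        # One column per event type: 1 if the case has at least one such event.
        row = [1 if et in case["events"] else 0 for et in column_information["et"]]
        # One column per (attribute, value) pair, attributes in name order.
        for name in sorted(column_information["at"]):
            case_value = case["attributes"].get(name)
            row.extend(
                1 if value == case_value else 0
                for value in column_information["at"][name]
            )
        rows.append(row)
    return rows


# Hypothetical sample data: two cases with event types and one case attribute.
cases = [
    {"events": ["Order", "Ship"], "attributes": {"Region": "EU"}},
    {"events": ["Order"], "attributes": {"Region": "US"}},
]
info = get_column_information(cases)
one_hot = generate_one_hot(info, cases)
print(one_hot)
```

With this sample data the columns are Order, Ship, Region=EU and Region=US, so every case becomes a row of four 0/1 values that can be passed to a classifier.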
Example #2: Train a model using a 75% sample of an event log and test its performance using the remaining 25% of the event log.
<pre>
Def("GetOneHotColumnInformation", (
  Let("el", _),
  ToDictionary([
    "et": OrderByValue(el.EventTypes),
    "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values]))
  ])
));

Def("GenerateOneHot", "cases", (
  Let("columnInformation", _),
  cases.(
    Let("cas", _),
    Flatten(
      [
        columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)),
        (
          Let("atColumns", columnInformation.Get("at")),
          OrderByValue(atColumns.Keys).(
            Let("key", _),
            Let("values", atColumns.Get(key)),
            Let("caseValue", cas.Attribute(key)),
            values.(If(_ == caseValue, 1, 0))
          )
        )
      ]
    )
  )
));

Let("el", EventLogById(1));
Let("columnInformation", el.GetOneHotColumnInformation());
Let("allCases", Shuffle(el.Cases));
Let("lastTrainCaseIndex", 0.75 * CountTop(el.Cases));
Let("trainCases", allCases[NumberRange(0, lastTrainCaseIndex)]);
Let("testCases", allCases[NumberRange(lastTrainCaseIndex + 1, CountTop(el.Cases) - 1)]);
Let("trainDataOH", columnInformation.GenerateOneHot(trainCases));
Let("testDataOH", columnInformation.GenerateOneHot(testCases));
Let("trainOutcomes", trainCases.(Duration > TimeSpan(24)));
Let("testOutcomes", testCases.(Duration > TimeSpan(24)));
Let("predictions", MLModel("randomforest")
  .Train(trainDataOH, trainOutcomes)
  .Transform(testDataOH));
Sum(Zip(testOutcomes, predictions).(_[0] == _[1] != 0)) / Count(testOutcomes)
</pre>
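The shuffle, 75/25 split and accuracy calculation used in this example follow a common evaluation pattern, sketched below in Python. The function names split_cases and accuracy, the seed parameter and the sample values are hypothetical. The sketch assumes that the final expression of the example counts the rows where the expected outcome agrees with a non-zero prediction and divides by the number of rows; that reading is an interpretation, not something the page states.

```python
import random


def split_cases(cases, train_fraction=0.75, seed=0):
    """Shuffle the cases and split them into a training part and a test part."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))
    # Slicing keeps the two parts disjoint and together they cover all cases.
    return shuffled[:cut], shuffled[cut:]


def accuracy(expected, predicted):
    """Share of rows where the expected outcome matches the prediction."""
    matches = sum(1 for e, p in zip(expected, predicted) if bool(e) == bool(p))
    return matches / len(expected)


# Hypothetical case identifiers 0..7: six go to training, two to testing.
train, test = split_cases(range(8))
print(len(train), len(test))

# Hypothetical outcomes (booleans) and predictions (0/1): three of four agree.
print(accuracy([True, False, True, True], [1, 0, 0, 1]))
```

Evaluating on cases the model has not seen during training gives a more realistic performance estimate than replaying the training data.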
Example #3: Three sets of cases: training cases, target cases (subset of training cases) and test cases (independent set of cases). Try to predict which cases in the test set will eventually end up becoming a case in target cases.
<pre>
Def("GetOneHotColumnInformation", (
  Let("el", _),
  ToDictionary([
    "et": OrderByValue(el.EventTypes),
    "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values]))
  ])
));

Def("GenerateOneHot", "cases", (
  Let("columnInformation", _),
  cases.(
    Let("cas", _),
    Flatten(
      [
        columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)),
        (
          Let("atColumns", columnInformation.Get("at")),
          OrderByValue(atColumns.Keys).(
            Let("key", _),
            Let("values", atColumns.Get(key)),
            Let("caseValue", cas.Attribute(key)),
            values.(If(_ == caseValue, 1, 0))
          )
        )
      ]
    )
  )
));

Let("el", <event log to use>);
Let("trainCases", <cases to use for training>);
Let("targetCases", <cases representing the properties we want to try to predict (subset of trainCases)>);
Let("testCases", <cases to use for testing>);
Let("targetCasesDict", ToDictionary(targetCases:true));
Let("outcomes", trainCases.(Let("c", _), targetCasesDict.ContainsKey(c) ? 1 : 0));
Let("columnInformation", el.GetOneHotColumnInformation());
Let("mlModel", MLModel("randomforest"));
mlModel.Train(columnInformation.GenerateOneHot(trainCases), outcomes);
mlModel.Transform(columnInformation.GenerateOneHot(testCases));
</pre>
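The way this example derives training labels from set membership can be shown in isolation. The Python sketch below is an illustration only: build_outcomes and the case identifiers are hypothetical, and a set lookup stands in for the targetCasesDict dictionary used in the expression.

```python
def build_outcomes(train_cases, target_cases):
    """Label each training case 1 if it is among the target cases, otherwise 0."""
    target_set = set(target_cases)  # constant-time membership checks
    return [1 if case in target_set else 0 for case in train_cases]


# Hypothetical case identifiers; the target cases are a subset of the training cases.
train_case_ids = ["c1", "c2", "c3", "c4"]
target_case_ids = ["c2", "c4"]

outcomes = build_outcomes(train_case_ids, target_case_ids)
print(outcomes)
```

The resulting list has one label per training case, in the same order as the training rows, which is the alignment that training on rows plus outcomes relies on.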