Machine Learning Functions in Expression Language
This pages describes functions and properties that are related to the machine learning functionality in the QPR ProcessAnalyzer expression language.
MLModel
MLModel properties | Description |
---|---|
Type | Returns the exact type of the MLModel. |
Machine Learning Functions
Function | Parameters | Description | |
---|---|---|---|
MLModel (MLModel) |
Type (string) |
Create a new binary classification model of given type. Takes type as a parameter which is the type of the prediction/classification model to create. Currently the only supported value is randomforest using the Accord.NET's RandomForest algorithm. Returns the created MLModel object. |
|
Train (MLModel) |
|
Trains given MLModel using given input data and expected outcomes. Parameters:
Returns the trained MLModel object itself. | |
Transform (array) |
Input data |
Transforms given input data using the machine learning model thus generating predictions/classifications. Takes the input data as a parameter which is a two dimensional array of data where the first dimension (rows) specifies different data points and the the second dimension (columns) specifies the feature values. Returns an array of predictions/classifications. Transformations for each row in the input data can be found at the same index of the returned array. |
Examples
Example #1: Train a model using an event log and test its performance by replaying training data itself.
Def("GetOneHotColumnInformation", ( Let("el", _), ToDictionary([ "et": OrderByValue(el.EventTypes), "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values])) ]) )); Def("GenerateOneHot", "cases", ( Let("columnInformation", _), cases.( Let("cas", _), Flatten( [ columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)), ( Let("atColumns", columnInformation.Get("at")), OrderByValue(atColumns.Keys).( Let("key", _), Let("values", atColumns.Get(key)), Let("caseValue", cas.Attribute(key)), values.(If(_ == caseValue, 1, 0)) ) ) ] ) ) )); Let("el", EventLogById(1)); Let("columnInformation", el.GetOneHotColumnInformation()); Let("allCases", el.Cases); Let("allCasesOH", columnInformation.GenerateOneHot(el.Cases)); Let("trainDataOH", allCasesOH); Let("outcomes", allCases.(Duration > TimeSpan(24))); Let("testDataOH", allCasesOH); Let("predictions", MLModel("randomforest") .Train(trainDataOH, outcomes) .Transform(trainDataOH)); Sum(Zip(outcomes, predictions).(_[0] == _[1] != 0)) / Count(outcomes)
Example #2: Train a model using an a 75% sample of an event log and test its performance by using the rest 25% of the event log.
Def("GetOneHotColumnInformation", ( Let("el", _), ToDictionary([ "et": OrderByValue(el.EventTypes), "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values])) ]) )); Def("GenerateOneHot", "cases", ( Let("columnInformation", _), cases.( Let("cas", _), Flatten( [ columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)), ( Let("atColumns", columnInformation.Get("at")), OrderByValue(atColumns.Keys).( Let("key", _), Let("values", atColumns.Get(key)), Let("caseValue", cas.Attribute(key)), values.(If(_ == caseValue, 1, 0)) ) ) ] ) ) )); Let("el", EventLogById(1)); Let("columnInformation", el.GetOneHotColumnInformation()); Let("allCases", Shuffle(el.Cases)); Let("lastTrainCaseIndex", 0.75 * CountTop(el.Cases)); Let("trainCases", allCases[NumberRange(0, lastTrainCaseIndex)]); Let("testCases", allCases[NumberRange(lastTrainCaseIndex + 1, CountTop(el.Cases) - 1)]); Let("trainDataOH", columnInformation.GenerateOneHot(trainCases)); Let("testDataOH", columnInformation.GenerateOneHot(testCases)); Let("trainOutcomes", trainCases.(Duration > TimeSpan(24))); Let("testOutcomes", testCases.(Duration > TimeSpan(24))); Let("predictions", MLModel("randomforest") .Train(trainDataOH, trainOutcomes) .Transform(testDataOH)); Sum(Zip(testOutcomes, predictions).(_[0] == _[1] != 0)) / Count(testOutcomes)
Example #3: Three sets of cases: training cases, target cases (subset of training cases) and test cases (independent set of cases). Try to predict which cases in the test set will eventually end up becoming a case in target cases.
Def("GetOneHotColumnInformation", ( Let("el", _), ToDictionary([ "et": OrderByValue(el.EventTypes), "at": ToDictionary(ConcatTop(OrderByTop(el.CaseAttributes, Name).[_: Values])) ]) )); Def("GenerateOneHot", "cases", ( Let("columnInformation", _), cases.( Let("cas", _), Flatten( [ columnInformation.Get("et").(Let("et", _), If(Count(cas.EventsByType(et)) > 0, 1, 0)), ( Let("atColumns", columnInformation.Get("at")), OrderByValue(atColumns.Keys).( Let("key", _), Let("values", atColumns.Get(key)), Let("caseValue", cas.Attribute(key)), values.(If(_ == caseValue, 1, 0)) ) ) ] ) ) )); Let("el", <event log to use>); Let("trainCases", <cases to use for training>); Let("targetCases", <cases representing the properties we want to try to predict (subset of traincases)>); Let("testCases", <cases to use for testing>); Let("targetCasesDict", ToDictionary(targetCases:true)); Let("outcomes", traincases.(Let("c", _), targetCasesDict.ContainsKey(c) ? 1 : 0)); Let("columnInformation", el.GetOneHotColumnInformation()); Let("mlModel", MLModel("randomforest")); mlModel.Train(columnInformation.GenerateOneHot(trainCases), outcomes); mlModel.Transform(columnInformation.GenerateOneHot(testCases));