Datatable in Expression Language
Datatables store data in QPR ProcessAnalyzer persistently. Datatables can contain one-to-many named columns and zero-to-many rows. Datatables are stored to QPR ProcessAnalyzer database as database tables. Each datatable belongs to a project.
The datatable entity in the expression language is used to access Datatables metadata, such as name, description and audit information. The data contents can be accessed by using the DataFrame property, which causes the actual data to be loaded from the database into memory for calculations. The data contents can also be accessed by using the SqlDataFrame property, which performs calculations in the datasource where the datatable located and only transfers and shows the result of the calculation.
Datatables have many usecases:
- Models can load their data from Datatables.
- Datatables can be used to store data extracted from external systems
- Intermediate ETL calculation results can be stored to Datatables
- It's possible to visualize data from Datatables directly in the dashboards without building a QPR ProcessAnalyzer model
Datatable Properties
Property | Description |
---|---|
ColumnNames (String*) | Returns Datatable column names as a string array in the order they are in the DataTable. |
ColumnMappings (Dictionary) | Returns column mappings configured for this Datatable. Returned data is a dictionary, where keys are mapping names (CaseId, EventType, Timestamp) and value is the column name. The ColumnMappings property returns null if column mappings have not been defined for the Datatable.
Example: let caseIdColumnName = DatatableById(123).ColumnMappings("CaseId"); |
ColumnTypes (Dictionary*) |
Returns Datatable column metadata as array of dictionaries. Each column dictionary has following properties:
DatatableById(123).Columns[2].Datatype |
Configuration (Dictionary) | Returns the datatable json configuration as a dictionary. Datatable configuration has the following documented properties:
Example: DatatableById(123).Configuration.Columns[0].Datatype |
ConfigurationJson (String) | Returns the datatable configuration as json string. |
CreatedBy (User) | User who created the datatable. |
CreatedDate (DateTime) | Timestamp when the datatable was created. |
DatabaseNameInDataSource (String) | Returns the user-defined database name in the data source. If no database has been defined for the datatable (i.e., database of the connection string is used), null is returned. |
DataFrame (DataFrame) | Contents of the datatable as a DataFrame. |
DataSourceConnection | Returns the connection object used by this datatable to connect to its datasource. |
DataSourceObjectType (String) |
Returns table if the source data comes from a database table, and view if the data comes from a database view. When datasource type is view, any operations to modify the datatable data are not possible. When a datatable with the datasource type view is deleted, the underlying database view is not deleted, because the views are managed by users directly in the Snowflake. When a datatable with the datasource type table is deleted, the underlying database table is deleted, because the tables are managed by QPR ProcessAnalyzer automatically. |
DataSourceType (String) | Physical storage location of the datatable data. Options:
|
Description (String) | Datatable description. The datatable description may contain line breaks. |
ForeignKeys (String) | Returns foreign keys of the datatable. The data structure is an array of dictionaries with following keys:
Examples: DatatableById(1).ForeignKeys Returns: [ { "Columns": [ { "Source": "Case Name", "Target": "Case Id" } ], "TargetDatatable": "Cases" }, { "Columns": [ { "Source": "Employee Id", "Target": "Employee Id"}, { "Source": "Record version", "Target": "Record version"} ], "TargetDatatable": "Employees" } ] |
Id (Integer) | Datatable Id. Datatable Id is generated by QPR ProcessAnalyzer when the datatable is created. |
LastImportBy (User) | User who the last time imported or modified the data in the datatable. Modification covers adding new rows, modifying rows and deleting rows. This attribute contains correct information only if data is imported using QPR ProcessAnalyzer UI. Data can also be imported to Snowflake tables directly using Snowflake, and in that case this attribute doesn't track the user correctly. |
LastImportDate (DateTime) | Timestamp when data was last time imported or modified in the datatable. Shows the correct import time when data is imported using QPR ProcessAnalyzer UI. When the Synchronize function is used, the LastImportDate is set to the current time (regardless of when data was actually imported to the datatable). |
LastModifiedBy (User) | User who last time modified the datatable properties. This does not cover the data contained by the datatable (for that, use the LastImportBy by property). If DataSourceObjectType is view, gives null. |
LastModifiedDate (DateTime) | Timestamp when the datatable properties were last time modified (such as name, description or containing project). This does not cover the data contained by the datatable (for that, use the LastImportDate by property). |
Name (String) | Name of the datatable. |
NameInDataSource (String) | Name of the table in the datasource (e.g., Snowflake or SQL Server) containing the datatable data. If the table name is specified by a user, that table name is returned. Otherwise the automatically determined table name is returned. This property identifies the correct table when loading data directly to the table in the datasource. |
NColumns (Integer) | Number of columns in the datatable. |
NRows (Integer) | Number of data rows in the datatable. |
Project (Project) | Project the datatable belongs to. |
ProjectId (Number) | Id of the project the datatable belongs to. |
SchemaNameInDataSource (String) | Returns the user-defined schema name in the data source. If no schema has been defined for the datatable (i.e., schema of the connection string is used), null is returned. |
SqlDataFrame (SqlDataFrame) | Returns SqlDataFrame that can be used to access the contents of the DataTable. |
UsedInModels (Model*) | Returns list of all models which are using this datatable as a datasource. |
Datatable Functions
Function | Parameters | Description |
---|---|---|
AddColumn |
|
Adds a new column to the Datatable. Existing rows in the Datatable will get null values for the created column. The function returns the updated Datatable entity. Available data types are String, Integer, Float, DateTime, Boolean, Duration (Timespan) and Any (can contain any type of data). Example: Add three columns to the Datatable with id 123. DatatableById(123) .AddColumn("MyColumn1", "String") .AddColumn("MyColumn2", "Float") .AddColumn("MyColumn3", "DateTime") |
AddColumns |
|
Adds one or several columns to the Datatable in the same operation. Takes as a parameter an array objects with the following properties:
Function returns the Datatable object itself. Requires the GenericWrite permission for the project. Example: Add three columns (MyColumn1, MyColumn2, MyColumn3) with different data types to datatable having id 123: DatatableById(123) .AddColumns([ #{"Name": "MyColumn1", "DataType": "String"}, #{"Name": "MyColumn2", "DataType": "Float"}, #{"Name": "MyColumn3", "DataType": "Integer"} ]); |
DeletePermanently | (none) |
Deletes the Datatable permanently. This will lead to a lost of all data in the Datatable. Datatables cannot be moved to the recycle bin, so they can only be deleted permanently. Example: Delete permanently Datatable with id 123. DatatableById(123).DeletePermanently() |
DeleteRows | Expression to select deleted rows | Deletes defined rows from the datatable. The parameter is an expression (evaluated for each row) to select which rows are deleted. If the parameter is not specified, all rows are deleted.
Example: Delete rows from model's events datatable where Organization column value is Finance: ModelById(1) .EventsDataTable .DeleteRows(Column("Organization") == "Finance"); Example: Delete rows where Revenue column has lower value than Cost column: DatatableById(1) .DeleteRows(Column("Revenue") < Column("Cost")); Example: Delete rows where PredictedEvent column value is true: DatatableById(1) .DeleteRows(Column("PredictedEvent")); Example: Delete all rows from datatable: DatatableById(1).DeleteRows(); |
Import | Data to import (DataFrame/DataFlow) |
Imports data to the datatable from given DataFrame or DataFlow. Columns between datatable and DataFrame/DataFlow are matched with names (the column order does not matter). Columns where there are no values in the DataFrame/DataFlow are imported as null values. Following parameters affect the behavior of the import:
If using DataFrames, and the imported DataFrame and the target datatable are located in the same system, the import is performed within the system, so no data is transferred from the system to in-memory. Otherwise, the DataFrame is first loaded into memory and imported into the target datatable. For lower memory consumption, prefer DataFlows instead. The import function returns the Datatable object itself (to allow chaining). Example: DatatableById(123).Import( ToDataFrame( [ ["Case 1", "A", DateTime(2021,1,2)], ["Case 2", "B", DateTime(2021,1,5)], ["Case 2", "A", DateTime(2021,1,9)] ], ["Case", "Event type", "Time"] ), #{"Append": true} ) DatatableById(123).Import( ToDataFrame( [ ["Case 1", "A", DateTime(2021,1,2)], ["Case 2", "B", DateTime(2021,1,5)], ["Case 2", "A", DateTime(2021,1,9)] ], ["Case", "Event type", "Time"] ), #{"MatchByColumns": ["Case"]} ) |
Merge | Data to merge (DataFrame/SqlDataFrame) |
Modifies the datatable by merging its contents with contents of given DataFrame. Merging works with the same principle and parameters as merging between in-memory DataFrames Examples: let target = ToDataFrame([[0, "zero", "target"], [1, "", "target"]], ["id", "text", "frame"]) .Persist("test_data"); let source = ToDataFrame([[1, "one", "source"], [2, "two", "source"], [3, "three", "source"]], ["id", "text", "frame"]); target.Merge(source, "id").DataFrame.ToCsv() Returns string (one key, default parameters, identical dataframe columns): id;text;frame 0;zero;target 1;one;source 2;two;source 3;three;source target.Merge(source, "id", ["text"]).DataFrame.ToCsv() Returns string (one key, default parameters, identical dataframe columns, copy only text column from source): id;text;frame 0;zero;target 1;one;target 2;two; 3;three; target.Merge(source, "id", ["text"], _remove).DataFrame.ToCsv() Returns string (one key, default parameters, identical dataframe columns, copy only text column from source, remove rows found only in source): id;text;frame 0;zero;target 1;one;target target.Merge(source, "id", ["text"], _remove, false).DataFrame.ToCsv() Returns string (one key, identical dataframe columns, copy only text column from source, remove rows found only in source or only in target): id;text;frame 1;one;target target.Merge(source, "id", _remove, _remove, false).DataFrame.ToCsv() Returns string (one key, identical dataframe columns, remove all rows): id;text;frame target.Merge(source, "id", [], _remove, false).DataFrame.ToCsv() Returns string (one key, filter out all the non-matching rows): id;text;frame 1;;target |
Modify | Parameters dictionary |
Modifies datatable by changing its properties. Following properties can be changed:
Example: Change datatable name: DatatableById(1) .Modify(#{ "Name": "My datatable" }); Example: Move datatable into other project: DatatableById(1) .Modify(#{ "ProjectId": ProjectById(2).Id }); Example: Change the connected table in the datasource: DatatableById(1) .Modify(#{ "NameInDataSource": "MyTable", "SchemaNameInDataSource": "MySchema", "DatabaseNameInDataSource": "MyDatatable" }); |
RemoveColumns |
Column names (String*) |
Removes the given columns from the datatable. Data in the removed columns is also lost permanently. The function returns the updated datatable entity. Example: Remove three columns from the datatable with id 123. DatatableById(123).RemoveColumns(["MyColumn1", "MyColumn2", "MyColumn3"]) |
RenameColumns |
Column name mappings (Dictionary) |
Changes names of one or several columns in the datatable. The function returns the updated datatable entity. Examples: Rename one/two columns in the datatable with id 123. DatatableById(123).RenameColumns(#{"MyColumnNewName": "MyColumnOldName"}) DatatableById(123).RenameColumns(#{ "MyColumnNewName1": "MyColumnOldName1", "MyColumnNewName2": "MyColumnOldName2" }) |
SetForeignKeys | Foreign keys (Dictionary array) | Sets foreign keys for the datatable. Foreign keys mean that certain column(s) in the datatable forms a key that is referring to other datatable.
The function call replaces previously defined foreign keys, so all foreign keys of the datatable need to be mentioned in the call. The parameter has the same structure as in the ForeignKeys property (see above). Providing an empty array removes all foreign keys. Note that the column(s) and target datatable(s) mentioned in the foreign keys don't actually need to exist. Examples: DatatableById(1).SetForeignKeys([ #{ "Columns": [ #{ "Source": "Case Name", "Target": "Case Id" } ], "TargetDatatable": "Cases" }, #{ "Columns": [ #{ "Source": "Employee Id", "Target": "Employee Id"}, #{ "Source": "Record version", "Target": "Record version"} ], "TargetDatatable": "Employees" } ]); Remove all foreign keys: DatatableById(1).SetForeignKeys([]); |
SetPrimaryKey | Key columns (String array) |
Sets primary key for the datatable. The primary key consists of one or several columns of the Datatable. When providing an empty array, the primary key is removed. The primary key is used to optimize queries made to this datatable. Examples: DatatableById(1).SetPrimaryKey(["Id"]); DatatableById(1).SetPrimaryKey(["CaseId", "EventType"]); DatatableById(1).SetPrimaryKey([]); |
Synchronize | (none) |
Refreshes the datatable information from the datasource (e.g., Snowflake). This function needs to be called when the underlying table in the datasource is modified for QPR ProcessAnalyzer to be aware of the changes. This can be done right after the ETL run has completed (to add data to the tables). Note that when data is imported to datatables using QPR ProcessAnalyzer user interface or scripts, calling the synchronize function is not needed. Calling synchronize requires the GenericWrite permissions for the project. Calling synchronize will perform the following actions:
Example: DatatableById(1).Synchronize(); |
Truncate | (none) |
Truncates the datatable, i.e., removes all rows from it. Truncate is a very efficient operation to remove all rows comparing using the delete operation. At least the GenericWrite permission for the project is needed for this operation. Example: DatatableById(123).Truncate(); |
UpdateRows |
|
Updates column(s) in datatable for selected rows with given expression value. Function has following parameters:
It's possible to change several columns in one call by providing more column name and expression pairs as parameters. Example: For all rows where column "Organization" value is "Finance", set column "Role" value to "Accountant": DatatableById(1) .UpdateRows( Column("Organization") == "Finance", "Role", "Accountant" ); Example: For all rows where "Timestamp" is earlier than 01/01/2024, update "Cost" and "Profit" columns (note that the Profit will be calculated with the original Cost value): DatatableById(1) .UpdateRows( Column("Timestamp") < "01/01/2024", "Cost", Column("Cost") * 2, "Profit", Column("Revenue") - Column("Cost") ); Example: Set column "Cost" to null for all rows: DatatableById(1) .UpdateRows(true, "Cost", null); |
Function to get a datatable by id:
Function | Parameters | Description |
---|---|---|
DataTableById |
datatable id (Integer) |
Return datatable with given id. An exception is given if a datatable with the given id does not exist or user doesn't have permissions to it. Examples: DataTableById(27) Returns: DataTable for the data table having id 27. |