Datatable in Expression Language
Revision as of 12:26, 16 May 2023
Datatables are used to store data in QPR ProcessAnalyzer persistently. Datatables consist of one-to-many named columns and zero-to-many rows. The datatable entity in the expression language is used to access Datatable metadata, such as the name, description and audit information. The actual data contents can be accessed using the DataFrame property, which loads the data from the database into memory for calculations.
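For example (a minimal sketch, assuming a datatable with id 123 exists and the user has permissions to it), metadata can be read without loading the data, while the DataFrame property loads the contents into memory:
<pre>
let dt = DatatableById(123);
dt.Name;
dt.NRows;
dt.DataFrame.ToCsv()
</pre>
The id 123 is illustrative; ToCsv is used here only to show the loaded contents as text.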
Datatables have many use cases:
- Models can load their data from Datatables.
- Datatables can be used to store data extracted from external systems.
- Intermediate ETL calculation results can be stored to Datatables.
- Data from Datatables can be visualized directly in dashboards without building a QPR ProcessAnalyzer model.
Datatable Properties
Datatables are stored to QPR ProcessAnalyzer database as database tables. Each datatable belongs to a project.
Property | Description |
---|---|
ColumnNames (String*) | Returns Datatable column names as a string array in the order they are in the DataTable. |
ColumnMappings (Dictionary) | Returns column mappings configured for this Datatable. The returned data is a dictionary where keys are mapping names (CaseId, EventType, Timestamp) and values are column names. Returns null if column mappings have not been defined for the Datatable.
Example: let caseIdColumnName = DatatableById(123).ColumnMappings("CaseId"); |
ColumnTypes (Dictionary*) |
Returns Datatable column metadata as an array of dictionaries, each describing one column (e.g., its Datatype). Example:
DatatableById(123).Columns[2].Datatype |
Configuration (Dictionary) | Returns the datatable configuration as a dictionary. The datatable configuration has the following properties:
- TableName: Optional table name in the data source. The table name cannot start with "qprpa_dt_" to avoid conflicts with system created tables.
- SchemaName: Optional schema name for the database table. If not specified, the data source uses the one from the ODBC connection string.
- DatabaseName: Optional database name where the datatable is located. If not specified, the data source uses the one from the ODBC connection string.
Example: DatatableById(123).Configuration.Columns[0].Datatype |
ConfigurationJson (String) | Returns the datatable configuration as a JSON string. |
CreatedBy (User) | User who created the datatable. |
CreatedDate (DateTime) | Timestamp when the datatable was created. |
DatabaseNameInDataSource (String) | Returns the user-defined database name in the data source. If no database has been defined for the datatable (i.e., database of the connection string is used), null is returned. |
DataFrame (DataFrame) | Contents of the datatable as a DataFrame. |
DataSourceObjectType (String) |
Returns "table" if the source data comes from a database table, and "view" if the data comes from a database view. When the datasource type is "view", operations that modify the datatable data are not possible. When a datatable with the datasource type "view" is deleted, the underlying database view is not deleted, because views are managed by users directly in Snowflake. When a datatable with the datasource type "table" is deleted, the underlying database table is also deleted, because tables are managed by QPR ProcessAnalyzer automatically. |
DataSourceType (String) | Physical storage location of the datatable data. Options: |
Description (String) | Datatable description. The datatable description may contain line breaks. |
Id (Integer) | Datatable Id, generated by QPR ProcessAnalyzer when the datatable is created. |
LastImportBy (User) | User who last imported or modified the data in the datatable. Modification covers adding, modifying and deleting rows. This attribute contains correct information only if data is imported using the QPR ProcessAnalyzer UI. Data can also be imported to Snowflake tables directly using Snowflake, in which case this attribute doesn't track the user correctly. |
LastImportDate (DateTime) | Timestamp when data was last imported or modified in the datatable. Shows the correct import time when data is imported using the QPR ProcessAnalyzer UI. When the Synchronize function is used, LastImportDate is set to the current time (regardless of when data was actually imported to the datatable). |
LastModifiedBy (User) | User who last modified the datatable properties. This does not cover the data contained by the datatable (for that, use the LastImportBy property). Returns null if DataSourceObjectType is view. |
LastModifiedDate (DateTime) | Timestamp when the datatable properties (such as name, description or containing project) were last modified. This does not cover the data contained by the datatable (for that, use the LastImportDate property). |
Name (String) | Name of the datatable. |
NameInDataSource (String) | Name of the table in the datasource (e.g., Snowflake or SQL Server) containing the datatable data. If the table name is specified by a user, that table name is returned. Otherwise the automatically determined table name is returned. This property identifies the correct table when loading data directly to the table in the datasource. |
NColumns (Integer) | Number of columns in the datatable. |
NRows (Integer) | Number of data rows in the datatable. |
Project (Project) | Project the datatable belongs to. |
ProjectId (Number) | Id of the project the datatable belongs to. |
SchemaNameInDataSource (String) | Returns the user-defined schema name in the data source. If no schema has been defined for the datatable (i.e., schema of the connection string is used), null is returned. |
SqlDataFrame (SqlDataFrame) | Returns SqlDataFrame that can be used to access the contents of the DataTable. |
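As an illustrative sketch combining several of the properties above (assuming a datatable with id 123 that has column mappings defined; the id and variable names are hypothetical):
<pre>
let dt = DatatableById(123);
let caseIdColumn = dt.ColumnMappings("CaseId");
dt.ColumnNames;
dt.NRows
</pre>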
Datatable Functions
Function | Parameters | Description |
---|---|---|
AddColumn |
|
Adds a new column to the Datatable. Existing rows in the Datatable will get null values for the created column. The function returns the updated Datatable entity. Available data types are String, Integer, Float, DateTime, Boolean, Duration (Timespan) and Any (can contain any type of data). Example: Add three columns to the Datatable with id 123:
DatatableById(123)
  .AddColumn("MyColumn1", "String")
  .AddColumn("MyColumn2", "Float")
  .AddColumn("MyColumn3", "DateTime") |
AddColumns |
|
Adds one or several columns to the Datatable in the same operation. Takes as a parameter an array of objects with the following properties:
The function returns the Datatable object itself. Requires the GenericWrite permission for the project. Example: Add three columns (MyColumn1, MyColumn2, MyColumn3) with different data types to the datatable having id 123:
DatatableById(123)
  .AddColumns([
    #{"Name": "MyColumn1", "DataType": "String"},
    #{"Name": "MyColumn2", "DataType": "Float"},
    #{"Name": "MyColumn3", "DataType": "Integer"}
  ]); |
DeletePermanently | (none) |
Deletes the Datatable permanently. This leads to the loss of all data in the Datatable. Datatables cannot be moved to the recycle bin, so they can only be deleted permanently. Example: Permanently delete the Datatable with id 123:
DatatableById(123).DeletePermanently() |
Import | Data to import (DataFrame/DataFlow) |
Imports data to the datatable from the given DataFrame or DataFlow. Columns between the datatable and the DataFrame/DataFlow are matched by name (the column order does not matter). Columns that have no values in the DataFrame/DataFlow are imported as null values. The following parameters affect the behavior of the import:
If using DataFrames, and the imported DataFrame and the target datatable are located in the same system, the import is performed within that system, so no data is transferred into memory. Otherwise, the DataFrame is first loaded into memory and then imported into the target datatable. For lower memory consumption, prefer DataFlows. The Import function returns the Datatable object itself (to allow chaining). Examples:
DatatableById(123).Import(
  ToDataFrame([
    ["Case 1", "A", DateTime(2021,1,2)],
    ["Case 2", "B", DateTime(2021,1,5)],
    ["Case 2", "A", DateTime(2021,1,9)]
  ], ["Case", "Event type", "Time"]),
  #{"Append": true}
)
DatatableById(123).Import(
  ToDataFrame([
    ["Case 1", "A", DateTime(2021,1,2)],
    ["Case 2", "B", DateTime(2021,1,5)],
    ["Case 2", "A", DateTime(2021,1,9)]
  ], ["Case", "Event type", "Time"]),
  #{"MatchByColumns": ["Case"]}
) |
Merge | Data to merge (DataFrame/SqlDataFrame) |
Modifies the datatable by merging its contents with the contents of the given DataFrame. Merging works with the same principle and parameters as merging between in-memory DataFrames. Examples:
let target = ToDataFrame([[0, "zero", "target"], [1, "", "target"]], ["id", "text", "frame"])
  .Persist("test_data");
let source = ToDataFrame([[1, "one", "source"], [2, "two", "source"], [3, "three", "source"]], ["id", "text", "frame"]);

target.Merge(source, "id").DataFrame.ToCsv()
Returns string (one key, default parameters, identical dataframe columns):
id;text;frame
0;zero;target
1;one;source
2;two;source
3;three;source

target.Merge(source, "id", ["text"]).DataFrame.ToCsv()
Returns string (one key, default parameters, identical dataframe columns, copy only text column from source):
id;text;frame
0;zero;target
1;one;target
2;two;
3;three;

target.Merge(source, "id", ["text"], _remove).DataFrame.ToCsv()
Returns string (one key, default parameters, identical dataframe columns, copy only text column from source, remove rows found only in source):
id;text;frame
0;zero;target
1;one;target

target.Merge(source, "id", ["text"], _remove, false).DataFrame.ToCsv()
Returns string (one key, identical dataframe columns, copy only text column from source, remove rows found only in source or only in target):
id;text;frame
1;one;target

target.Merge(source, "id", _remove, _remove, false).DataFrame.ToCsv()
Returns string (one key, identical dataframe columns, remove all rows):
id;text;frame

target.Merge(source, "id", [], _remove, false).DataFrame.ToCsv()
Returns string (one key, filter out all the non-matching rows):
id;text;frame
1;;target |
RemoveColumns |
Column names (String*) |
Removes the given columns from the datatable. Data in the removed columns is lost permanently. The function returns the updated datatable entity. Example: Remove three columns from the datatable with id 123:
DatatableById(123).RemoveColumns(["MyColumn1", "MyColumn2", "MyColumn3"]) |
RenameColumns |
Column name mappings (Dictionary) |
Changes the names of one or several columns in the datatable. The function returns the updated datatable entity. Examples: Rename one or two columns in the datatable with id 123:
DatatableById(123).RenameColumns(#{"MyColumnNewName": "MyColumnOldName"})
DatatableById(123).RenameColumns(#{
  "MyColumnNewName1": "MyColumnOldName1",
  "MyColumnNewName2": "MyColumnOldName2"
}) |
SetPrimaryKey | Key columns (String array) |
Sets the primary key for the datatable. The primary key consists of one or several columns of the Datatable. When an empty array is provided, the primary key is removed. The primary key is used to optimize queries made to this datatable. Examples:
DatatableById(1).SetPrimaryKey(["Id"]);
DatatableById(1).SetPrimaryKey(["CaseId", "EventType"]);
DatatableById(1).SetPrimaryKey([]); |
Synchronize | (none) |
Refreshes the datatable information from the datasource (e.g., Snowflake) to QPR ProcessAnalyzer. This function needs to be called when the underlying table in the datasource has been modified, so that QPR ProcessAnalyzer becomes aware of the modifications. This is typically done right after an ETL run (adding data to the tables) has completed. Note that when data is imported to datatables using the QPR ProcessAnalyzer user interface or scripts, calling the Synchronize function is not needed. Calling Synchronize performs the following actions:
Calling Synchronize requires the GenericWrite permission for the project. Example:
DatatableById(123).Synchronize(); |
Truncate | (none) |
Truncates the datatable, i.e., removes all rows from it. Truncate is a very efficient operation for removing all rows compared to using the delete operation. At least the GenericWrite permission for the project is needed for this operation. Example:
DatatableById(123).Truncate(); |
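The functions above can be combined. The following hypothetical sketch empties a datatable and imports fresh rows (assuming a datatable with id 123, GenericWrite permission, and that the columns CaseId, EventType and Timestamp do not yet exist):
<pre>
let dt = DatatableById(123);
dt.Truncate();
dt.AddColumns([
    #{"Name": "CaseId", "DataType": "String"},
    #{"Name": "EventType", "DataType": "String"},
    #{"Name": "Timestamp", "DataType": "DateTime"}
  ])
  .Import(
    ToDataFrame(
      [["Case 1", "Start", DateTime(2021,1,2)]],
      ["CaseId", "EventType", "Timestamp"]
    ),
    #{"Append": true}
  );
</pre>
AddColumns and Import both return the Datatable entity, which is what allows the chained call above.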
Function to get a datatable by id:
Function | Parameters | Description |
---|---|---|
DataTableById |
datatable id (Integer) |
Returns the datatable with the given id. An exception is thrown if a datatable with the given id does not exist or the user doesn't have permissions to it. Example:
DataTableById(27)
Returns: DataTable for the data table having id 27. |