SqlDataFrame in Expression Language

From QPR ProcessAnalyzer Wiki
Revision as of 15:00, 5 May 2021 by Ollvihe (talk | contribs)
Jump to navigation Jump to search

SqlDataFrame represents tabular data similar to a result of an SQL query. Data in the SqlDataFrames are processed in the original datasource where the data is located (not in the QPR ProcessAnalyzer server in-memory). For each SqlDataFrame, there is an SQL query that processes and generates the actual data in the datasource.

Running SqlDataFrames and operations between them don't yet execute the SQL in the original datasource. This happens when the Collect function is called for an SqlDataFrame, which executes the SQL query of the SqlDataFrames and loads the data into QPR ProcessAnalyzer server in-memory dataframe (where it can be presented in a dashboard).

Each SqlDataFrames knows the datasource where the SQL query is executed.

There is a similar API for the SqlDataFrames as there is for the DataFrames.

SqlDataFrame functions Parameters Description
Aggregate (DataFrame)
  1. Aggregated columns (string array or key-value pairs)
  2. Aggregation methods (string array)
Same functionality as in the DataFrame.
Collect (DataFrame) (none)

Executes the SQL query for the SqlDataFrame in the datasource and returns results as an in-memory DataFrame. Then processing of the data can be continued as the in-memory DataFrame. In addition to the Persist function, Collect function is the only way to get the actual SQL query executed to see the results (or store them to a table).

Examples:

DataTableById(123).SqlDataFrame.Head(100).Collect().ToCsv()
Returns the top 100 rows from datatable id 123.
ExcludeValues (SqlDataFrame)
  1. Column name (string)
  2. Value (single item) or values (array) to exclude

Same functionality as in the DataFrame.

Head (SqlDataFrame) Number of top rows

Same functionality as in the DataFrame.

GroupBy (GroupedDataFrame)

Grouped columns (string array)

Same functionality as in the DataFrame.

IncludeOnlyValues (SqlDataFrame)
  1. Column name (string)
  2. Value (single item) or values (array) to include

Same functionality as in the DataFrame.

Join (SqlDataFrame)
  1. DataFrame
  2. Columns to match (String or key-value pairs)
  3. Join type (String)
Same functionality as in the DataFrame.
OrderByColumns (SqlDataFrame)
  1. Columns names to be ordered (String array)
  2. Sorting order (boolean array)

Same functionality as in the DataFrame.

Select (SqlDataFrame)

Column names (string array, or key-value pairs)

Same functionality as in the DataFrame.

TakeSample (SqlDataFrame) Number of rows (Integer)

Takes a random sample of given number of rows of the original rows.

Examples:

DataTableById(123).SqlDataFrame.TakeSample(100).Collect()
Returns 100 random rows out of the datatable.