DataFlow in Expression Language
Revision as of 17:14, 8 December 2022
DataFlow is an object representing a stream of tabular data. A DataFlow holds data with a structure similar to that of a DataFrame. The difference is that a DataFrame stores all of its contents in system memory, so a large amount of data requires a correspondingly large amount of memory. In a DataFlow, by contrast, the contents "flow" from the source to the destination and can be manipulated along the way, while only a small portion of the entire data is held in memory at any one time. DataFlows are therefore suitable for ETL tasks where data volumes are high.
A DataFlow continues to run until it completes. It completes automatically when all queried items have been returned, and it can also be completed explicitly by calling the Complete function. Once a DataFlow has been completed, no new items can be added to it. When a DataFlow is collected into an in-memory DataFrame, the Collect call waits until the DataFlow completes, so that all items are included in the DataFrame.
Property | Description |
---|---|
IsCompleted (boolean) | Returns true when the Complete function has been called for the DataFlow and there are no more unread items in it. |
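
The following is a minimal sketch of how the IsCompleted property might be read, built only from the functions documented on this page. It assumes that properties are accessed with the same dot syntax as functions, and that collecting a completed DataFlow leaves no unread items in it; neither assumption is confirmed by this page.

```
let myDataFlow = ToDataFlow(ToDataFrame([[1, "red"], [2, "green"]], ["id", "color"]));
myDataFlow.Complete();
let collected = myDataFlow.Collect();
myDataFlow.IsCompleted
```

Going by the description above, the property reports true only after Complete has been called and every item has been read, which is why the sketch reads it after the Collect call rather than directly after Complete.
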
Function | Parameters | Description |
---|---|---|
Append (DataFlow) | DataFrame to append |
Adds the given DataFrame to the DataFlow. Example: let myDataFlow = ToDataFlow(ToDataFrame([], ["id", "color"])); myDataFlow.Append(ToDataFrame([[1, "red"], [2, "green"]], ["id", "color"])); myDataFlow.Complete().Collect().ToCsv() |
Collect (DataFrame) | Parameters (Dictionary) |
Returns the data extracted from the DataFlow as an in-memory DataFrame, or null if either the timeout has been exceeded or the flow has been completed and is empty. Parameters:
Example: ToDataFlow(ToDataFrame([], ["id", "color"])).Append(ToDataFrame([[1, "red"], [2, "green"]], ["id", "color"])).Append(ToDataFrame([[3, "blue"]], ["id", "color"])).Complete().Collect(#{"CollectChunk": true}).ToCsv() |
Complete (DataFlow) | (none) |
Declares that the DataFlow is completed, i.e., no new items will be added to it. Example: myDataFlow.Complete(); |
Persist (Datatable) |
|
Writes the DataFlow into a datatable. Works in the same way as the function of the same name in the DataFrame. |
Create new DataFlow:
Function | Parameters | Description |
---|---|---|
ToDataFlow (DataFlow) |
|
Creates a new DataFlow and optionally initializes it with the given DataFrame. Example: ToDataFlow(ToDataFrame([[1, "red"], [2, "green"]], ["id", "color"])).Append(ToDataFrame([[3, "blue"]], ["id", "color"])).Complete().Collect().ToCsv() |