Hitachi Vantara Pentaho Community Wiki
Prioritize streams

Description

The Prioritize streams step lets you control the order in which input streams are read. If a step has more than one input stream, data is read concurrently from all of the inputs in an unpredictable order. If you want one input stream to take priority over the others, use the 'Prioritize streams' step.

For example, suppose a step has three input streams: A, B, and C. Each stream produces only its own name letter: A produces only 'A', B only 'B', and so on. Normally these streams produce rows concurrently, so you cannot predict the combined output; it could be 'ABCABC...ABC' or 'AABBCCAABB.....BBCC'. Each stream produces a limited amount of data, and the expected result is 'AAA..ABBB..BCCC..C'.

To get this result, use 'Prioritize streams' and set the input step priority to exactly this order: 1) A, 2) B, 3) C. Because input stream A has the highest priority, the 'Prioritize streams' step waits until stream A has emitted all of its data. That is, it reads every 'A' until step A is complete and returns false from processRow(), which effectively sends an end-of-stream signal. Only then is the data from step B processed, and data from stream C is processed after step B has finished. It does not matter if streams B and C have data waiting while stream A is not sending any data: until A signals that it is done, all other streams wait.
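The reading order described above can be illustrated with a small simulation. This is only a sketch of the behavior, not the step's actual implementation: the `END` marker, the `producer` and `prioritize` functions, and the use of Python queues and threads are all assumptions made for illustration.

```python
import queue
import threading

END = object()  # hypothetical end-of-stream marker (stands in for "step is done")


def producer(name, count, out):
    """Emit `count` rows containing the stream's name, then signal end of stream."""
    for _ in range(count):
        out.put(name)
    out.put(END)


def prioritize(streams, priority):
    """Yield rows stream by stream in priority order, draining each one fully."""
    for name in priority:
        rows = streams[name]
        while True:
            row = rows.get()  # blocks until this stream has a row or has ended
            if row is END:
                break  # only now move on to the next stream in the list
            yield row


streams = {name: queue.Queue() for name in "ABC"}
# Start the producers in reverse order to show that arrival order does not matter.
threads = [
    threading.Thread(target=producer, args=(name, 3, streams[name]))
    for name in "CBA"
]
for t in threads:
    t.start()

result = "".join(prioritize(streams, ["A", "B", "C"]))
for t in threads:
    t.join()

print(result)  # AAABBBCCC
```

Rows from B and C may already be waiting in their queues, but they are not read until A has sent its end-of-stream marker.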

Options

| Option | Description |
|---|---|
| Step name | Name of the step; this name must be unique within a single transformation. |
| Step name priority | This list defines the order in which the input streams are read. Names at the top of the list have higher priority than names below them. Note: Names must exactly match the names of the incoming steps; otherwise the step will fail. |
| Get Previous Steps | This button adds the names of all previous steps to the list. Right-click to move rows up and down. |
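Because the priority list must match the incoming step names exactly, it can help to compare the two lists before running a transformation. The following is a minimal sketch of such a check; the function `check_priority_names` and the sample step names are hypothetical, and the strict comparison (case and whitespace both significant) is an assumption based on the "exactly match" requirement above.

```python
def check_priority_names(priority, incoming):
    """Return the configured names that do not exactly match any incoming step."""
    incoming_names = set(incoming)
    return [name for name in priority if name not in incoming_names]


# Hypothetical step names, for illustration only.
incoming_steps = ["Stream A", "Stream B", "Stream C"]
configured = ["Stream A", "stream B", "Stream C "]

unmatched = check_priority_names(configured, incoming_steps)
print(unmatched)  # ['stream B', 'Stream C ']
```

Using the Get Previous Steps button avoids this class of error, since it fills the list with the names of the connected steps.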