Hitachi Vantara Pentaho Community Wiki
Join Rows (Cartesian product)

Description

The Join rows step produces combinations (the Cartesian product) of all rows in the input streams.

For example, a Join rows step named Years x Months x Days outputs all combinations of Year, Month and Day (from 1900, 1, 1 through 2100, 12, 31) and can be used to create a date dimension.
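
As a rough illustration of what such a Cartesian product looks like outside of PDI, the sketch below builds the same Year x Month x Day combinations in Python. It is not how the step is implemented; the function names and the two-year range are assumptions chosen for the example. Because a plain product also contains impossible dates such as 30 February, the sketch adds a validity check that the text above does not describe.

```python
from datetime import date
from itertools import product


def all_combinations(first_year, last_year):
    """Return every (year, month, day) tuple, including impossible dates."""
    years = range(first_year, last_year + 1)
    months = range(1, 13)
    days = range(1, 32)
    return list(product(years, months, days))


def is_real_date(year, month, day):
    """True when the combination is an actual calendar date."""
    try:
        date(year, month, day)  # raises ValueError for e.g. 30 February
    except ValueError:
        return False
    return True


# Hypothetical small range so the example stays quick to run.
combos = all_combinations(1900, 1901)
valid_dates = [row for row in combos if is_real_date(*row)]
```

In a transformation, a comparable validity check could be expressed in the step's condition or in a later filtering step.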

Note: The Merge Join step is an alternative that performs better in most cases.

Options

The following table describes the options for configuring the Join rows step:

| Option | Description |
|---|---|
| Step name | Name of the step; it must be unique within a single transformation. |
| Temp directory | Directory where the system stores temporary files when you combine more than the cached number of rows. |
| TMP-file prefix | Prefix of the temporary files that are generated. |
| Max. cache size | Number of rows to cache before the system reads data from temporary files; required when you combine large row sets that do not fit into memory. |
| Main step to read from | Step from which to read most of the data; data from the other steps is cached or spooled to disk, but data from this step is not. |
| The Condition(s) | A complex condition that you can enter to limit the number of output rows. Note: The fields in the condition must have unique names in each of the streams. See feature request PDI-16051 for more details. |
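
To make the roles of the main step, the cached streams, and the condition more concrete, here is a minimal Python sketch of the same idea. It is an illustration under stated assumptions, not the step's actual implementation: `join_rows`, the field names, and the sample data are all hypothetical, and the sketch keeps the secondary streams fully in memory instead of spooling them to temporary files.

```python
from itertools import product


def join_rows(main_rows, other_streams, condition=lambda row: True):
    """Yield merged rows of the Cartesian product that satisfy `condition`.

    main_rows     -- iterable of dicts, read once and never cached
    other_streams -- list of iterables of dicts, each cached in memory
    condition     -- predicate applied to every merged row
    """
    # The non-main streams are materialized so they can be re-read
    # for every row that arrives from the main stream.
    cached = [list(stream) for stream in other_streams]
    for main_row in main_rows:
        for combo in product(*cached):
            merged = dict(main_row)
            for row in combo:
                merged.update(row)  # relies on unique field names per stream
            if condition(merged):
                yield merged


orders = [
    {"order_id": 1, "order_cust": "a"},
    {"order_id": 2, "order_cust": "b"},
]
customers = [
    {"cust_id": "a", "name": "Ann"},
    {"cust_id": "b", "name": "Bob"},
]

unfiltered = list(join_rows(iter(orders), [customers]))
matched = list(
    join_rows(iter(orders), [customers],
              lambda r: r["order_cust"] == r["cust_id"])
)
```

The `merged.update(row)` call shows why field names need to be unique across streams: a repeated name would silently overwrite the earlier value in this sketch.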

Metadata Injection Support (7.x and later)

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.

Special Considerations for the Condition Field:

The Join Rows step is a special MDI scenario because its filter conditions have a nested structure. The condition is supplied as XML, in the same format used to store it in the transformation metadata of a .KTR file. There is no DTD (Document Type Definition) for either the .KTR XML format or the condition XML.

To obtain a condition in XML format:

  1. Create a sample Filter step with the different conditions you need. This sample step gives you all the information, such as the values for the functions you use.
  2. Select the step, copy it to the clipboard, and then paste it into a text editor. Alternatively, you can store the .KTR, and then open the .KTR in a text editor.
  3. Find the <condition> element and its nested elements, and modify them as needed for your MDI scenario.
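
The last step above can also be scripted. The following Python sketch pulls the outermost <condition> element out of a copied step or a saved .KTR file using the standard library. The sample XML is a hypothetical stand-in: the wrapper and child element names shown in it are assumptions for illustration, so compare them against what your own copied step actually contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real step XML from your own transformation may differ.
SAMPLE_STEP_XML = """
<step>
  <name>Sample filter</name>
  <compare>
    <condition>
      <negated>N</negated>
      <leftvalue>year</leftvalue>
      <function>&gt;</function>
      <rightvalue>min_year</rightvalue>
    </condition>
  </compare>
</step>
"""


def extract_condition(xml_text):
    """Return the first (outermost) <condition> element as an XML string."""
    root = ET.fromstring(xml_text.strip())
    # find() searches descendants only, so handle a <condition> root as well.
    node = root if root.tag == "condition" else root.find(".//condition")
    if node is None:
        raise ValueError("no <condition> element found")
    return ET.tostring(node, encoding="unicode").strip()


condition_xml = extract_condition(SAMPLE_STEP_XML)
```

The resulting string, including any nested conditions, is the kind of value you would then adapt and pass to the condition field through ETL Metadata Injection.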