Added by Matt Casters, last edited by Jens Bleuel on Nov 21, 2008

Description

The Join rows step allows you to produce combinations (Cartesian product) of all rows in the input streams as shown below:


 
The Years x Months x Days step outputs all combinations of Year, Month and Day (for example, 1900, 1, 1 through 2100, 12, 31) and can be used to create a date dimension.
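The combinations described above can be sketched outside the tool in a few lines of Python. This is only an illustration of the Cartesian product, not how the step is implemented. The shortened year range and the `is_valid` helper are assumptions made for the example.

```python
from datetime import date
from itertools import product

# Three small input "streams"; the year range is shortened for illustration.
years = range(1900, 1903)   # 1900, 1901, 1902
months = range(1, 13)       # 1..12
days = range(1, 32)         # 1..31

# Cartesian product: every (year, month, day) combination.
rows = list(product(years, months, days))
print(len(rows))            # 3 * 12 * 31 = 1116


def is_valid(year, month, day):
    """Return True if the combination is a real calendar date."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


# Combinations such as (1900, 2, 31) are not calendar dates, so a date
# dimension built this way needs a filter afterwards.
valid_rows = [row for row in rows if is_valid(*row)]
```

The raw product contains impossible dates such as February 31, which is why the sketch filters the rows before they would be used as a date dimension.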

Options

The following table describes the options for configuring the Join rows step:

Option | Description
Step name | Name of the step; this name has to be unique in a single transformation
Temp directory | The directory where the system stores temporary files when you combine more than the cached number of rows
TMP-file prefix | The prefix of the temporary files that will be generated
Max. cache size | The number of rows to cache before the system reads data from temporary files; required when you want to combine large row sets that do not fit into memory
Main step to read from | The step from which to read most of the data; the data from the other steps is cached or spooled to disk, but the data from this step is not
The Condition(s) | You can enter a complex condition to limit the number of output rows

Note: The fields in the condition must have unique names in each of the streams.
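To make the roles of the main step, the cache and the condition more concrete, here is a minimal Python sketch. It is a conceptual model under stated assumptions, not the step's actual code: the function name `join_rows`, the sample streams and their field names are all hypothetical, and the sketch keeps the non-main streams in memory rather than spooling them to temporary files.

```python
from itertools import product


def join_rows(main_rows, other_streams, condition=lambda row: True):
    """Yield one merged row per combination that satisfies `condition`."""
    # Assumption: the non-main streams fit in memory. The real step can
    # spool them to temporary files once Max. cache size is exceeded.
    cached = [list(stream) for stream in other_streams]

    # In this sketch the main stream is consumed row by row and never stored.
    for main_row in main_rows:
        for combination in product(*cached):
            merged = dict(main_row)
            for other_row in combination:
                # Field names must be unique across streams; otherwise a
                # later value would silently overwrite an earlier one here.
                merged.update(other_row)
            if condition(merged):
                yield merged


# Hypothetical sample data with distinct field names in each stream.
orders = [
    {"order_id": 1, "order_customer": 10},
    {"order_id": 2, "order_customer": 20},
]
customers = [
    {"customer_id": 10, "name": "Ann"},
    {"customer_id": 20, "name": "Bob"},
]

result = list(
    join_rows(
        iter(orders),
        [customers],
        condition=lambda r: r["order_customer"] == r["customer_id"],
    )
)
```

Without a condition the sketch returns all four combinations; with the condition it keeps only the two rows where the customer fields match. The `merged.update` call also shows why the note above asks for unique field names: a shared name would make the condition ambiguous.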

In most cases, the Merge Join step is a faster alternative.
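One way to see why a merge-style join can be cheaper for equality conditions is the generic sort-merge technique sketched below. This is a textbook illustration, not the Merge Join step's implementation. It assumes both inputs are already sorted on their join keys, and the function name `merge_join` and the sample fields are hypothetical.

```python
def merge_join(left, right, key_left, key_right):
    """Inner equi-join of two lists of dicts that are sorted on their keys."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key = left[i][key_left]
        right_key = right[j][key_right]
        if left_key < right_key:
            i += 1                      # no partner for this left row
        elif left_key > right_key:
            j += 1                      # no partner for this right row
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key_right] == left_key:
                j_end += 1
            # Pair every left row carrying this key with that run.
            while i < len(left) and left[i][key_left] == left_key:
                for right_row in right[j:j_end]:
                    out.append({**left[i], **right_row})
                i += 1
            j = j_end
    return out


# Hypothetical inputs, both sorted on the join key.
left_rows = [{"a": 1}, {"a": 2}, {"a": 2}, {"a": 4}]
right_rows = [
    {"b": 2, "v": "x"},
    {"b": 2, "v": "y"},
    {"b": 3, "v": "z"},
    {"b": 4, "v": "w"},
]

joined = merge_join(left_rows, right_rows, "a", "b")
```

Each input is walked once and only rows with equal keys are paired, whereas a Cartesian product followed by a condition has to generate and test every combination.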

"while other steps are is not cached or spooled to disk, this step is not": can you rephrase this?

How can I increase the performance of this step? Or can I replace it with a similar step?

Many thanks

Comment: Posted by Marcelo Rossi at Nov 12, 2008 13:47

Thanks Marcelo,

I changed the wording and added a performance hint.

Cheers,

Jens

Comment: Posted by Jens Bleuel at Nov 21, 2008 02:22