Added by Matt Casters, last edited by Jens Bleuel on Feb 18, 2014  (view change)

Labels:

Enter labels to add to this page:
Wait Image 
Looking for a label? Just start typing.

Description

The Merge Join step performs a classic merge join between data sets with data coming from two different input steps. Join options include INNER, LEFT OUTER, RIGHT OUTER, and FULL OUTER.

Note: In this step rows are are expected in to be sorted on the specified key fields. When using the Sort step, this works fine. When you sorted the data outside of PDI, you may run into issues with the internal case sensitive/insensitive flag. Further information can be found on PDI-11440.

Note:  If the key fields have the same name (e.g. ID), a second key field called $key_1 (e.g. ID_1) will be created in the result.

Options

The following table describes the options available for the Merge Join step:

Option Description
Step name Name of the step; this name has to be unique in a single transformation
First Step Specify the first input step to the merge join (left)
Second Step Specify the second input step to the merge join (right)
Join Type Select from the available types of joins:
  • FULL OUTER: all rows from both sources will be included in the result, with empty values for non-matching keys in both data streams
  • LEFT OUTER: all rows from the first source will be in the result, with empty values for non-matching keys in the second data stream
  • RIGHT OUTER: all rows from the second source will be in the result, with empty values for non-matching keys in the first data stream
  • INNER JOIN: only rows having the same key in both sources will be included in the result
Keys for 1st step Specify the key fields on which the incoming data is sorted; click Get key fields to retrieve a list of fields from the specified step.
Keys for 2nd step Specify the key fields on which the incoming data is sorted; click Get key fields to retrieve a list of fields from the specified step.