Hitachi Vantara Pentaho Community Wiki

PLEASE NOTE: This documentation applies to Pentaho 8.0 and earlier. For Pentaho 8.1 and later, see Unique Rows on the Pentaho Enterprise Edition documentation site.

Description

The Unique rows step removes duplicate rows from the input stream(s).

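Important: Make sure that the input stream is sorted on the fields being compared. The step only compares each row with the row immediately before it, so on unsorted input only consecutive duplicate rows are detected and removed.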

See also the Unique rows (HashSet) step that does not need the rows to be sorted.
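
The effect of sorting can be illustrated with a small Python sketch. This is only a model of the behavior described above, not Pentaho's implementation; the function names unique_consecutive and unique_hashset are hypothetical and rows are represented as plain tuples.

```python
from itertools import groupby


def unique_consecutive(rows):
    """Keep one row per run of identical consecutive rows.

    Models a step that only compares each row with the previous one.
    """
    return [key for key, _group in groupby(rows)]


def unique_hashset(rows):
    """Keep the first occurrence of each row, whatever the input order.

    Models a step that remembers every row it has already seen.
    """
    seen = set()
    result = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            result.append(row)
    return result


unsorted_rows = [("a",), ("b",), ("a",)]

# Unsorted input: the two ("a",) rows are not adjacent, so both survive.
print(unique_consecutive(unsorted_rows))

# Sorted input: duplicates become adjacent and are collapsed.
print(unique_consecutive(sorted(unsorted_rows)))

# A set-based approach does not depend on the order of the rows.
print(unique_hashset(unsorted_rows))
```

The set-based variant trades memory for convenience, since it has to remember every distinct row, whereas the consecutive comparison only needs the previous row.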

Options

The table below contains descriptions of all options for the Unique rows step:

Option

Description

Step name

Name of the step; this name must be unique within a single transformation.

Add counter to output?

Check this option to add a counter field to the stream.

Counter field

Define the counter field name.

Redirect duplicate row

Treats duplicate rows as errors and redirects them to the error stream of the step. Requires error handling to be configured for this step.

Error Description

Sets the error handling description to display when duplicate rows are detected. Only available when Redirect duplicate row is checked.

Fields to compare table

Specify the field names on which you want to force uniqueness, or click Get to insert all fields from the input stream(s). To ignore case when comparing a field, set its Ignore case flag to Y. For example, Kettle, KETTLE, and kettle are treated as the same value when the comparison is case-insensitive. In that case, the first occurrence (Kettle) is passed to the next step(s).
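
How these options interact can be sketched in Python. This is a rough model under stated assumptions, not the step's actual code: the function unique_rows and its parameters are hypothetical, rows are plain dictionaries already sorted on the compared fields, and the counter is assumed to hold the number of input rows collapsed into each output row (the table above does not say what the counter counts).

```python
def unique_rows(rows, compare_fields, ignore_case=False, counter_field=None):
    """Return (unique, duplicates) for rows sorted on compare_fields.

    unique     -- first row of each run of matching rows
    duplicates -- the remaining rows, as they would go to an error stream
    """

    def key_of(row):
        # Build the comparison key from the selected fields only.
        values = []
        for field in compare_fields:
            value = row[field]
            if ignore_case and isinstance(value, str):
                value = value.lower()
            values.append(value)
        return tuple(values)

    unique = []
    duplicates = []
    previous_key = None
    for row in rows:
        key = key_of(row)
        if unique and key == previous_key:
            # Same key as the previous row: treat as a duplicate.
            duplicates.append(row)
            if counter_field:
                unique[-1][counter_field] += 1  # assumed counter semantics
        else:
            kept = dict(row)  # copy so the input row is left untouched
            if counter_field:
                kept[counter_field] = 1
            unique.append(kept)
            previous_key = key
    return unique, duplicates


rows = [{"name": "Kettle"}, {"name": "KETTLE"}, {"name": "kettle"}, {"name": "Spoon"}]
unique, duplicates = unique_rows(
    rows, ["name"], ignore_case=True, counter_field="count"
)
print(unique)      # [{'name': 'Kettle', 'count': 3}, {'name': 'Spoon', 'count': 1}]
print(duplicates)  # [{'name': 'KETTLE'}, {'name': 'kettle'}]
```

With ignore_case left at False, the same input would yield four unique rows, because Kettle, KETTLE, and kettle would then compare as different values.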
