Added by Matt Casters, last edited by Matt Casters on Jul 10, 2008

Description

The Stream lookup step lets you look up data using information coming from other steps in the transformation. The data coming from the Source step is first read into memory and is then used to look up values for each row in the main stream.
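
The two phases described above (load the lookup data into memory, then match main-stream rows against it) can be sketched in a few lines of Python. This is only a conceptual illustration, not the step's actual implementation: the function `stream_lookup`, the sample rows, and the handling of duplicate keys are all assumptions made for the example.

```python
def stream_lookup(main_rows, lookup_rows, key_field, lookup_key_field, retrieve_fields):
    """Enrich each main-stream row with fields taken from in-memory lookup rows."""
    # Phase 1: read the whole lookup stream into memory, indexed by its key.
    # Assumption of this sketch: a later duplicate key overwrites an earlier one.
    index = {}
    for row in lookup_rows:
        index[row[lookup_key_field]] = row

    # Phase 2: pass main-stream rows through, adding the looked-up values.
    for row in main_rows:
        match = index.get(row[key_field])  # plain equality match on the key
        enriched = dict(row)
        for field in retrieve_fields:
            enriched[field] = match[field] if match is not None else None
        yield enriched


# Hypothetical data: A stands in for a database table, B for a text file.
stream_a = [{"customer_id": 1, "amount": 250}, {"customer_id": 3, "amount": 75}]
stream_b = [{"id": 1, "name": "Acme"}, {"id": 2, "name": "Globex"}]

result = list(stream_lookup(stream_a, stream_b, "customer_id", "id", ["name"]))
```

In this sketch the first row of A gains the name from B, while the second row has no match and receives an empty value for the retrieved field.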

In the example below, the transformation adds information coming from a text file (B) to data coming from a database table (A):

Information from B is used to perform the lookups as indicated by the Source step option shown below:

Options

The table below describes the features available for configuring the stream lookup:

Option | Description
--- | ---
Step name | Name of the step; it must be unique within a single transformation.
Lookup step | The name of the step that the lookup data comes from.
The keys to lookup... | Allows you to specify the names of the fields that are used to look up values. Values are always searched using the "equal" comparison.
Fields to retrieve | Specifies the names of the fields to retrieve, along with a default value to use when no match is found and, optionally, a new field name if you want to rename the field.
Preserve memory | Encodes rows of data to preserve memory while sorting.
Key and value are exactly one integer field | Preserves memory while executing a sort.
Use sorted list | Enable to store values using a sorted list; this provides better memory usage when working with data sets containing wide rows.
Get fields | Automatically fills in the names of all the available fields on the source side (A); you can then delete the fields you don't want to use for the lookup.
Get lookup fields | Automatically inserts the names of all the available fields on the lookup side (B); you can then delete the fields you don't want to retrieve.
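
To make the "keys to lookup" and "Fields to retrieve" options more concrete, here is a minimal Python sketch of how multi-field keys, default values, and renamed output fields could interact. The names `RetrieveField`, `build_index`, and `apply_lookup`, as well as the sample data, are hypothetical and exist only for illustration; they do not correspond to anything in the product.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class RetrieveField:
    name: str                       # field name on the lookup side
    new_name: Optional[str] = None  # optional new name in the output row
    default: Any = None             # value used when no match is found


def build_index(lookup_rows, lookup_key_fields):
    # Index the lookup rows by the tuple of their key field values.
    return {tuple(r[f] for f in lookup_key_fields): r for r in lookup_rows}


def apply_lookup(row, index, main_key_fields, retrieve):
    # Keys are matched with a plain "equal" comparison on every key field.
    match = index.get(tuple(row[f] for f in main_key_fields))
    out = dict(row)
    for spec in retrieve:
        value = match[spec.name] if match is not None else spec.default
        out[spec.new_name or spec.name] = value
    return out


# Hypothetical lookup data keyed on two fields.
lookup_rows = [{"country": "NL", "city": "Utrecht", "region": "West"}]
index = build_index(lookup_rows, ["country", "city"])
retrieve = [RetrieveField("region", new_name="sales_region", default="UNKNOWN")]

hit = apply_lookup({"country": "NL", "city": "Utrecht"}, index, ["country", "city"], retrieve)
miss = apply_lookup({"country": "BE", "city": "Ghent"}, index, ["country", "city"], retrieve)
```

Here `hit` carries the retrieved value under the new name `sales_region`, while `miss` falls back to the configured default because no lookup row has equal key values.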

The text suggests there is an example with A and B, but I don't see an example (and need one!)

Comment: Posted by Anonymous at Aug 22, 2008 14:09

In order to get this to work properly when reading from CSV files, you must uncheck the box labeled "Lazy Conversion?" in the CSV File step. Otherwise, your stream lookups will fail with messages like this:
Unexpected error :
2008/12/05 16:25:28 - Check Dept ID.0 - ERROR (version 3.1.0, build 826 from 2008/09/30 11:32:36) : java.lang.RuntimeException: Error serializing row to byte array
It may be that you only need to uncheck the Lazy Conversion box on the CSV file that will be the source for the lookup step; however, it's easy enough to uncheck it on all of the CSV files, unless you run into performance issues.

Comment: Posted by Ben Chapman at Dec 05, 2008 15:28
Comment: Posted by A L at Feb 24, 2009 07:48