Hitachi Vantara Pentaho Community Wiki

Description

The Row Normaliser step converts pivoted (denormalized) tables back into normalized rows. For example, below is a sample table of product sales data:

Month     Product A   Product B   Product C
2003/01   10          5           17
2003/02   12          7           19
...       ...         ...         ...

The Row Normaliser step converts the data into the format below so that it is easier to update your fact table:

Month     Product   sales
2003/01   A         10
2003/01   B         5
2003/01   C         17
2003/02   A         12
2003/02   B         7
2003/02   C         19
...       ...       ...
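The conversion above can be sketched in a few lines of Python. This is only an illustration of the data reshaping, not of how the step is implemented; the names `pivoted`, `product_columns` and `normalise_sales` are made up for this example.

```python
# Pivoted input: one row per month, one column per product.
pivoted = [
    {"Month": "2003/01", "Product A": 10, "Product B": 5, "Product C": 17},
    {"Month": "2003/02", "Product A": 12, "Product B": 7, "Product C": 19},
]

# Hypothetical mapping: source column -> value written to the "Product" field.
product_columns = {"Product A": "A", "Product B": "B", "Product C": "C"}


def normalise_sales(rows):
    """Emit one output row per (input row, product column) pair."""
    out = []
    for row in rows:
        for column, product in product_columns.items():
            out.append(
                {"Month": row["Month"], "Product": product, "sales": row[column]}
            )
    return out


normalised = normalise_sales(pivoted)
for r in normalised:
    print(r)
```

Each input row produces three output rows here, one per product column, which matches the normalized table above.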

Options

The following options are available for the Row Normaliser Step:

Option

Description

Step name

Name of the step; this name has to be unique in a single transformation.

Typefield

The name of the type field (Product in the example above).

Fields table

A list of the fields you want to normalize; you must set the following properties for each selected field:

  • Fieldname: Name of the field to normalize (Product A to Product C in the example).
  • Type: A string that classifies the field (A, B or C in our example).
  • New field: The name of the field that the value should be transferred to (sales in our example). Different fields can target different new fields.

Get Fields

Click to retrieve a list of all fields coming in on the stream(s).

Normalizing multiple rows in a single step

The example below uses the Row Normaliser step to normalize two fields per product (a number and a sales value) in a single step, starting with the following data format:

DATE       PR1_NR   PR1_SL   PR2_NR   PR2_SL   PR3_NR   PR3_SL
20030101   5        100      10       250      4        150
...        ...      ...      ...      ...      ...      ...

You can convert the data to a table similar to the one shown below:

DATE       Type       Product Sales   Product Number
20030101   Product1   100             5
20030101   Product2   250             10
20030101   Product3   150             4
...        ...        ...             ...

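The setup that creates this table uses Type as the type field. Inferred from the two tables above, the Fields table contains the following entries:

Fieldname   Type       New field
PR1_NR      Product1   Product Number
PR1_SL      Product1   Product Sales
PR2_NR      Product2   Product Number
PR2_SL      Product2   Product Sales
PR3_NR      Product3   Product Number
PR3_SL      Product3   Product Sales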
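The multi-field example can be sketched as a small Python function that takes the same three settings as the step: a type field name and a list of (Fieldname, Type, New field) entries. This is a rough model of the behaviour described on this page, not Pentaho's implementation. The function name `row_normalise` is hypothetical, the entries are inferred from the input and output tables, and the second input column is assumed to be PR1_SL. Unlike the real step, the sketch groups entries by type itself, so the order of the entries does not matter here.

```python
def row_normalise(rows, type_field, fields):
    """Model of the step's behaviour (an illustration, not Pentaho code).

    fields: list of (fieldname, type_value, new_field) tuples, mirroring the
    Fieldname / Type / New field columns of the Fields table.
    """
    normalised_names = {fieldname for fieldname, _, _ in fields}

    # Collect the entries for each type value, keeping first-seen type order.
    by_type = {}
    for fieldname, type_value, new_field in fields:
        by_type.setdefault(type_value, []).append((fieldname, new_field))

    out = []
    for row in rows:
        # Fields that are not listed in the Fields table pass through unchanged.
        passthrough = {k: v for k, v in row.items() if k not in normalised_names}
        for type_value, entries in by_type.items():
            new_row = dict(passthrough)
            new_row[type_field] = type_value
            for fieldname, new_field in entries:
                new_row[new_field] = row[fieldname]
            out.append(new_row)
    return out


rows = [
    {"DATE": "20030101", "PR1_NR": 5, "PR1_SL": 100,
     "PR2_NR": 10, "PR2_SL": 250, "PR3_NR": 4, "PR3_SL": 150},
]

# Assumed Fields table entries, inferred from the input and output tables.
fields = [
    ("PR1_NR", "Product1", "Product Number"),
    ("PR1_SL", "Product1", "Product Sales"),
    ("PR2_NR", "Product2", "Product Number"),
    ("PR2_SL", "Product2", "Product Sales"),
    ("PR3_NR", "Product3", "Product Number"),
    ("PR3_SL", "Product3", "Product Sales"),
]

result = row_normalise(rows, "Type", fields)
for r in result:
    print(r)
```

The DATE field is not listed in `fields`, so it is copied into every output row, while each type value contributes one row carrying both of its new fields.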

Metadata Injection Support

You can use the ETL Metadata Injection step to pass metadata to this step at runtime. The following fields of the Row Normaliser step support metadata injection:

  • Fieldname
  • Type
  • New Field

2 Comments

  1. Some things to keep in mind when using this step:

    • The input stream may contain more fields than those used in the normalization. In the example above, the DATE column is not listed in the Row Normaliser, yet it appears as the first field of each output row. Any input field that is not listed in the Row Normaliser is passed through to the next step as part of the row.
    • Do not try to force extra columns through as separate 'types' in the Row Normaliser. It is not needed and will cause the step to crash.
  2. Take care that the fields specified in the Row Normaliser are in a suitable order.

    In particular, ensure that entries with the same "Type" value are grouped together. If you enter them in random order, the Row Normaliser can put the data into the wrong "New field".
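The ordering advice in the second comment can be checked before the entries are typed into the step. The helper below is a hypothetical pre-check written for this page, not part of Pentaho. It assumes the entries are available as (Fieldname, Type, New field) tuples and only verifies that each Type value forms one contiguous run.

```python
def types_are_grouped(fields):
    """Return True if every Type value occupies one contiguous run of entries."""
    seen = set()
    previous = object()  # sentinel that never equals a real type value
    for _, type_value, _ in fields:
        if type_value != previous:
            if type_value in seen:
                # This type appeared earlier and was interrupted by another one.
                return False
            seen.add(type_value)
            previous = type_value
    return True


grouped = [
    ("PR1_NR", "Product1", "Product Number"),
    ("PR1_SL", "Product1", "Product Sales"),
    ("PR2_NR", "Product2", "Product Number"),
    ("PR2_SL", "Product2", "Product Sales"),
]

# Same entries, but ordered by new field, so the types are interleaved.
interleaved = [
    ("PR1_NR", "Product1", "Product Number"),
    ("PR2_NR", "Product2", "Product Number"),
    ("PR1_SL", "Product1", "Product Sales"),
    ("PR2_SL", "Product2", "Product Sales"),
]

print(types_are_grouped(grouped))
print(types_are_grouped(interleaved))
```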