Hitachi Vantara Pentaho Community Wiki
Access Keys:
Skip to content (Access Key - 0)

March 30, 2006
Submitted by Matt Castors, Chief of Data Integration, Pentaho

Just wanted to share a Kettle tip with you, this week on the subject of the Script Values step. Let's do ... 

Regular Expressions

Today I got a question from a user that likes the regular expressions support in JavaScript, but had troubles using it in the Script Values step.
The problem is that if you grab the String representation of a field using for example NAME.getString(), you can't use do a replace like in this script:

String str = NAME.getString();
String result = str.replace( /aaa/g, "a3" );

This fails because str is of type java.lang.String and is in fact NOT a JavaScript String.

A simple workaround to convert to a JavaScript String is by doing the following:

String str = "" + NAME.getString();
String result = str.replace( /aaa/g, "a3" );

This will work fine and it will make certain text replace scenarios a lot easier to write.

Access Java Objects

You too can have access to all the java objects that are in Kettle's classpath. There are a lot of utility libs and you can add your own if you like.
To activate this, you can simply place the following line of code on top of your javascript:

java

Then you can use all kind of code by specifying the complete name of a class, for example:

var systemDate = new java.util.Date();
dateField.setValue(systemDate);

Row object

If you want to access the complete row that is passing through and not just the individual fields, you can use the row object.
With this you can for example compare the current row with the previous for all kinds of scripting fun...

var prevRow;
var prevName;

if (prevRow != null )
{
  prevName = prevRow.getString("NAME", "");
}
else
{
  prevName = "";
}

prevRow = row.Clone();

Telephone Numbers

Someone asked recently: how can I format telephone numbers?
This is a complex question in itself, but more often then not, it can be solved by using a piece of javascript.
Here is what we could use in Belgium to format telephone numbers, but I guess it can be adapted for other countries as well:

Assuming you have a field called "tel" as input :

----tel.lpad(9, '0');

var newTel;

if (tel.getString().startsWith("02") || tel.getString().startsWith("03"))
{
  newTel=tel.getString().substr(0,2) + "/" + tel.getString().substr(2);
}
else
{
  newTel=tel.getString().substr(0,3) + "/" + tel.getString().substr(3);
}

For example when the input is 54250137, the output would be 054/250137.

I hope you found this useful. Please feel free to add your own samples below or suggest another topic for next week.

Kind regards,

Matt


  1. Jun 19, 2009

    Matthew Pugsley says:

    Request for a new previous row example: The current code (in particular row.Clon...

    Request for a new previous row example: The current code (in particular row.Clone()) only works with 3.2 if compatibilty mode is on. It does not work if compatibility mode is off. Can a new example please be provided that uses the new methods getInputRowMeta(), getOutputRowMeta(), createRowCopy(size), and putRow(row) to achieve the same function?

    Thanks,

    Matt

     I've put together an example that works:

    var prevRow;var prevName;

    if ( prevRow != null )
    { prevName = prevRow[0]; }

    else

    {}
    prevName = "";
     }

    prevRow = createRowCopy(getInputRowMeta().size());  

    This transformation included two 'generate rows' steps. Each step generated a single row with a single column called NAME (data type string). One step produced a row with the value of "FirstName" and the other step produced a row with the value of "SecondName."

     Both steps point to a dummy step which then points to a Modified Java Script Value step, which has the script above. The incoming row has data in the first column (position 0). If the prevRow variable is set (there is a previous row), then the variable prevName is set to the value of position 0 in the previous row (which is the single column with the value of either "FirstName" or "SecondName"). If it is not set, then prevName is set to the blank string "".

     In either case, the value of prevRow is set by using the new function createRowCopy based on the size of the input row.

     I included the variable prevRow in the field section in the bottom of the screen for the Modified Java Script Value step, which includes prevRow as a new column in the downstream output.

    --Matt Pugsley, 6/19/09

This documentation is maintained by the Pentaho community, and members are encouraged to create new pages in the appropriate spaces, or edit existing pages that need to be corrected or updated.

Please do not leave comments on Wiki pages asking for help. They will be deleted. Use the forums instead.

Adaptavist Theme Builder Powered by Atlassian Confluence