The art of data conversion

Introduction

Data conversion is not as trivial as it seems at first glance.  Many things come into play: local language support, conversion masks, and a lot more.

To solve this once and for all (!) we wrote new conversion algorithms that, while remaining backward compatible with older versions, offer greater ease of use.

ValueMeta

The ValueMeta class handles all data conversions in version 3.0.  Where data conversion was scattered throughout the code base in version 2, we have now centralized all the conversion code in this class.

String to <type> conversions


Typically these conversions happen in "Input" steps such as "Text File Input", "CSV Input", "Fixed Input", "XML Input", etc.

We can handle these one-way conversions by using the following method in ValueMeta and ValueMetaInterface:

/**
  * Convert the specified data to the data type specified in this object.
  * @param meta2 the metadata of the object to be converted
  * @param data2 the data of the object to be converted
  * @return the object in the data type of this value metadata object
  * @throws KettleValueException in case there is a data conversion error
  */
  public Object convertData(ValueMetaInterface meta2, Object data2) throws KettleValueException;

Date example

Let's take a date String as an example:  "999.59:59:23 31/12/2007"

Here is the code to convert this to a Date object:

ValueMetaInterface source = new ValueMeta("src", ValueMetaInterface.TYPE_STRING);
source.setConversionMask("SSS.ss:mm:HH dd/MM/yyyy");
ValueMetaInterface target = new ValueMeta("tgt", ValueMetaInterface.TYPE_DATE);

Date date = (Date) target.convertData(source, "999.59:59:23 31/12/2007");

To go the other way around, from Date back to String, we set a conversion mask on the Date metadata (target) to describe how the Date value should be represented as a String:

target.setConversionMask("yy/MM/dd HH:mm");
String string = (String) source.convertData(target, date);
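To make the two masks above concrete, here is a minimal sketch using only java.text.SimpleDateFormat, assuming the date conversion masks follow SimpleDateFormat pattern rules. The class MaskSketch and its helper methods are hypothetical names for illustration and are not part of the Kettle API; the time zone is pinned to UTC only to keep the output reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Kettle: shows what the masks mean.
final class MaskSketch {
    // Build a strict formatter for the given mask (UTC for reproducibility).
    private static SimpleDateFormat newFormat(String mask) {
        SimpleDateFormat format = new SimpleDateFormat(mask, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false); // reject out-of-range fields such as month 13
        return format;
    }

    // String to Date: the mask describes how the incoming text is laid out.
    static Date parse(String mask, String text) {
        try {
            return newFormat(mask).parse(text);
        } catch (ParseException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Cannot parse '" + text + "' with mask '" + mask + "'", e);
        }
    }

    // Date to String: a different mask describes the outgoing text.
    static String format(String mask, Date date) {
        return newFormat(mask).format(date);
    }

    public static void main(String[] args) {
        Date date = parse("SSS.ss:mm:HH dd/MM/yyyy", "999.59:59:23 31/12/2007");
        System.out.println(format("yy/MM/dd HH:mm", date)); // prints 07/12/31 23:59
    }
}
```

Note that the second mask drops the seconds and milliseconds, so this particular String cannot be converted back to the exact original Date.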

String to <type> to String conversions

On occasion you encounter these round trip conversions. For example, in the sorting logic of the TableView (default PDI table editor).  The data in TableView is always represented as a String.  However, the original data could have certain local language settings or other specific formatting options.  Once the data is in a String format, it can be tricky to convert it back to the original data type and back to String.
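A small sketch of why the String alone is not enough: the same text can map to different values depending on the locale that produced it. This uses only java.text.NumberFormat from the JDK; the class name LocaleAmbiguitySketch is hypothetical and the example is not taken from the Kettle code base.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical illustration: one String, two possible original values.
final class LocaleAmbiguitySketch {
    // Parse the text with the number conventions of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Cannot parse '" + text + "' for " + locale, e);
        }
    }

    public static void main(String[] args) {
        String shown = "1.234"; // what the table cell displays

        // In the US locale '.' is the decimal separator.
        System.out.println(parse(shown, Locale.US));      // prints 1.234

        // In the German locale '.' is the grouping separator.
        System.out.println(parse(shown, Locale.GERMANY)); // prints 1234.0
    }
}
```

Without knowing how the String was produced, sorting or comparing such cells by their original values cannot be done reliably.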

For that reason we introduced the notion of "conversion meta-data" in ValueMeta.  The "conversion meta-data" explains to the ValueMeta object to which it is attached how the conversion should happen in both directions.

Date example

ValueMetaInterface datValueMeta = new ValueMeta("i", ValueMetaInterface.TYPE_DATE);
datValueMeta.setConversionMask("yyyy - MM - dd   HH:mm:ss'('SSS')'");

// originalValue is the java.util.Date being displayed.
String string = datValueMeta.getString(originalValue); // This string is used in the preview window.

ValueMetaInterface strValueMeta = new ValueMeta("str", ValueMetaInterface.TYPE_STRING);
strValueMeta.setConversionMetadata(datValueMeta);

Date x = (Date) strValueMeta.convertDataUsingConversionMetaData(string);

This has the distinct advantage that you can convert the String back to the original Date value without direct access to the original Date metadata, since you can call getConversionMetadata() on the String's ValueMeta object.  It guarantees that data conversion will take place as it should in both directions.
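The idea of attaching conversion metadata can be sketched without Kettle as follows. DateMeta and StringMeta are hypothetical stand-ins, not the real ValueMeta classes, and the sketch assumes the mask is interpreted as a java.text.SimpleDateFormat pattern. The time zone is pinned to UTC only so the result is reproducible.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical stand-ins for the metadata objects; not the Kettle classes.
final class RoundTripSketch {
    // Describes how a Date is rendered as text and read back.
    static final class DateMeta {
        private final String mask;

        DateMeta(String mask) { this.mask = mask; }

        private SimpleDateFormat newFormat() {
            SimpleDateFormat format = new SimpleDateFormat(mask, Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            return format;
        }

        String toText(Date date) { return newFormat().format(date); }

        Date fromText(String text) {
            try {
                return newFormat().parse(text);
            } catch (ParseException e) {
                // Rethrown unchecked to keep the sketch short.
                throw new IllegalArgumentException("Cannot parse '" + text + "'", e);
            }
        }
    }

    // String-side metadata that remembers where the text came from.
    static final class StringMeta {
        private final DateMeta conversionMetadata;

        StringMeta(DateMeta conversionMetadata) { this.conversionMetadata = conversionMetadata; }

        // Convert the text back using the attached origin metadata.
        Date convertBack(String text) { return conversionMetadata.fromText(text); }
    }

    public static void main(String[] args) {
        DateMeta dateMeta = new DateMeta("yyyy - MM - dd   HH:mm:ss'('SSS')'");
        Date original = new Date(999L); // 999 ms after the epoch

        String shown = dateMeta.toText(original);
        System.out.println(shown); // prints 1970 - 01 - 01   00:00:00(999)

        // Only the String and its metadata are needed to recover the Date.
        Date restored = new StringMeta(dateMeta).convertBack(shown);
        System.out.println(restored.equals(original)); // prints true
    }
}
```

The round trip is lossless here only because the mask keeps every field down to the millisecond; a mask that drops fields would restore a truncated value.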
