Added by Matt Casters, last edited by Matt Casters on Jul 10, 2008

De-serialize from file

The De-serialize from file step, formerly known as Cube Input, reads rows of data from a binary Kettle file containing rows and metadata.
WARNING: Use this step for short-lived data only! Pentaho cannot guarantee that the file format will stay the same between different versions of Pentaho Data Integration.

Options

Option       Description
Step Name    Name of the step; this name must be unique within a single transformation.
Filename     The name of the Kettle cube file to read.
Limit Size   Limits the number of rows read from the cube file. A value of zero (0) indicates no limit (optional).

If you try to read a file that has zero rows in it, you will get an error, as shown in this forum thread:

http://forums.pentaho.org/showthread.php?p=230181#post230181

Workaround:
1) In your job, add a 'Delete File' entry before the transformation that generates the output file.
2) In the 'Serialize to File' step, check the 'Do not create file at start' flag.
3) In your job, right before the transformation that reads the file, add a 'File Exists' entry.
4) On true, run the transformation that loads the file.
5) On false, skip that transformation.
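
The control flow of this workaround can be sketched outside of Kettle. The Python below is only an illustration of the logic, not Kettle code: run_job, write_rows, and the file name rows.cube are hypothetical names, and the generate and load callables stand in for the two transformations. It assumes that the writer creates the file only when it has at least one row, which is the behavior the 'Do not create file at start' flag is meant to give.

```python
import os
import tempfile
from typing import Callable, List


def run_job(path: str,
            generate: Callable[[str], None],
            load: Callable[[str], None]) -> bool:
    """Return True if the load step ran, False if it was skipped."""
    # 1) 'Delete File': remove any stale output left by an earlier run.
    if os.path.exists(path):
        os.remove(path)

    # 2) Run the generating transformation. It is assumed to create the
    #    file only if it actually has rows to write.
    generate(path)

    # 3) 'File Exists' check, then 4) load on true or 5) skip on false.
    if os.path.isfile(path):
        load(path)
        return True
    return False


def write_rows(rows: List[str]) -> Callable[[str], None]:
    """Hypothetical writer that creates the file only when rows is non-empty."""
    def generate(path: str) -> None:
        if rows:
            with open(path, "w") as f:
                f.write("\n".join(rows))
    return generate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "rows.cube")
        loaded: List[str] = []
        # Rows are written, so the file exists and the load step runs.
        print(run_job(target, write_rows(["a", "b"]), loaded.append))
        # No rows: the stale file is deleted, none is created, load is skipped.
        print(run_job(target, write_rows([]), loaded.append))
```

Deleting the file first matters: without it, a file left over from a previous run would pass the existence check even though the current run produced no rows.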

You can also perform the file-exists check within a transformation.

Comment: Posted by Andy Hurst at Jan 20, 2010 15:49