Hitachi Vantara Pentaho Community Wiki
Access Keys:
Skip to content (Access Key - 0)

Description

Json output step allows to generate json blocks based on input step values. Output json will be available as java script array or java script object depends on step settings.

Transformation with json output usage example

Options

General Tab

General tab allows to specify type of step operation, output json structure, step output file. This file will be used to dump all generated json.

Settings

See detailed settings description below.

Option Description
Step name Step name should be unique in context of transformation
Operation Specify step operation type. Currently available 3 types of operation:
  1. Output value - only pass output json as a step output field, do not dump to output file
  2. Write to file - only write to fie, do not pass to output field
  3. Output value and write to file - dump to file and pass generated json as a step output file
Json block name
This value will be used as a name for json block. Can be empty string that will affect output json structure, see detailed description below.
Nr. rows in a block Number of json block key - value pairs.
NOTE, 1 is a special values, in case of 1 every output will be generated as one object. See description below.
Output value
This value will be used as a step output field. Will contain generated json output block depending on step settings.
Compatibility mode
This check box handles compatibility configuration with previous Kettle versions (before PDI 4.3.0 if checked, after PDI 4.3.0 otherwise)

Compatibility mode was introduced since PDI 4.3.0 version. This was designed to handle fix with changes to json structure generation. By default new fixed mode is used. To enable compatibility mode check 'Compatibility mode' check box.

For example step has a row input data with 2 fields: field "name" with string value "item" and field "value" with numeric value 25. We have 4 rows of data with "name" and "value". 

Consider Json output step has settings combination:

  • 'Json block name' = "data"
  • 'Nr rows in block' = 3
  • 'Compatibility mode' NOT checked (and this is the default option)

step will generate output:

{
  "data" : [ {
    "name" : "item",
    "value" : 25
  }, {
    "name" : "item",
    "value" : 25
  }, {
    "name" : "item",
    "value" : 25
  } ]
}{
  "data" : [ {
    "name" : "item",
    "value" : 25
  } ]
}

So We have 2 objects, first object has a 'data' (Json block name) array contains 3 json objects (because of number rows in a block is 3) and second object with array contains only one object. 3+1 is actually 4 rows processed.

If compatibility mode is enabled and step has settings combination like:

  • Json block name = "data"
  • Nr rows in block = 3
  • 'Compatibility mode' is checked

step will generate output:

{"data":[{"name":"item"},{"value":25},{"name":"item"},{"value":25},{"name":"item"},{"value":25}]}
{"data":[{"name":"item"},{"value":25}]}

Pretty formatting does not affect compatibility mode. We have 2 output json objects. First object harvest first 3 input rows and second object harvests only one row. This happens because of number of rows in a block is 3. Anyway it can be considered as incorrect result, as the real object count for array is 6 for the first output object. By default compatibility mode is disabled.

If 'Json block name' is an empty string (by default it has 'data' value) - compatibility mode will use empty string for block name. Normally - if compatibility mode was not checked, step output will be:

[ {
  "name" : "item",
  "value" : 25
}, {
  "name" : "item",
  "value" : 25
}, {
  "name" : "item",
  "value" : 25
} ][ {
  "name" : "item",
  "value" : 25
} ]

So we have 2 json arrays, first array contains 3 items, second array contains one item.

Another special case is when 'Nr. rows in a block' = 1.
If used with empty json block name output will looks like:

{
  "name" : "item",
  "value" : 25
}{
  "name" : "item",
  "value" : 25
}{
  "name" : "item",
  "value" : 25
}{
  "name" : "item",
  "value" : 25
}

We will have just 4 simple json objects that will be outputted as a 4 step output rows.

In case of json block name is defined - output structure will looks like:

{
  "data" : {
    "name" : "item",
    "value" : 25
  }
}{
  "data" : {
    "name" : "item",
    "value" : 25
  }
}{
  "data" : {
    "name" : "item",
    "value" : 25
  }
}{
  "data" : {
    "name" : "item",
    "value" : 25
  }
}

So this is will be same 4 output objects with json block name defined.

If 'Nr. rows in a block' will be less that 1 output will be as a one object:

{
  "data" : [ {
    "name" : "item",
    "value" : 25
  }, {
    "name" : "item",
    "value" : 25
  }, {
    "name" : "item",
    "value" : 25
  }, {
    "name" : "item",
    "value" : 25
  } ]
}

This will be one object (one output row) with data block containing json array with 4 objects (as we had 4 input data rows). Please note - when using 0 'Nr. rows in a block' step will build output object until input data is available. When input is done - one big output object will be passed to output row. For big input data it can impact memory usage.

Output File

Option Description
Filename full path to output file
Append
If not checked new file will be created every time step is running. If file with specified name already existed - it will be replaced by a new one. If checked - new json output will be appended at the end of existing file. Or if existing file is not exists - it will be created as in previous case.
Create Parent folder
Usually file name contains some path folder as a parent folder. If parent folder does not exists and this option is checked - parent folder will be created as a new folder. Otherwise - file not be found and step will fail.
Do not open create at start
If not checked - file (and in some cases parent folder) will be created/opened to write during transformation initialization. If checked - file and parent folder will be created only after step will get any first input data.
Extension
Output file extension. Default value is 'js'
Encoding
Output file encoding
Pass output to servlet
Enable this option to return the data via a web service instead writing into a file (see PDI data over web service)
Include date in filename?
If checked - output file name will contains File name value + current date. This may help to generate unique output files.
Include time in filename
If checked - output file name will contains file creation time. Same as for 'Include date in filename' option
Show filename(s) button
Can be useful to test full output file path
Add file to result filenames?
If checked - created output file path will be accessible form step result

Fields Tab

This tab is used to map input step fields to output json values

Option Description
Fieldname Input step field name. Use 'Get Fields' button to discover available input fields
Element name
Json element name as a key. For example "A":"B" - A is a element name, B is actual input value mapped for this Element name.

Metadata Injection Support

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.


This documentation is maintained by the Pentaho community, and members are encouraged to create new pages in the appropriate spaces, or edit existing pages that need to be corrected or updated.

Please do not leave comments on Wiki pages asking for help. They will be deleted. Use the forums instead.

Adaptavist Theme Builder (4.2.0) Powered by Atlassian Confluence 3.3.3, the Enterprise Wiki