Available since version 4.4.0 GA

Description

The Concat Fields step is used to concatenate multiple fields into one target field. The fields can be separated by a separator and the enclosure logic is completely compatible with the Text File Output step.

Notes:

  • Because this step is compatible with the Text File Output step, it can produce the same fields and layout (including header, footer and ending lines) for content that is stored somewhere other than a text file. Adding a Group by step with the "Concatenate strings separated by" option and a line delimiter (e.g. CR/LF) makes it possible to create one field that contains the same content as an output file.
  • Performance considerations: Check the "Fast data dump (no formatting)" option for maximum throughput. You may also consider checking the option "Disable the enclosure fix?". More details can be found under the Advanced Tab below.
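
The Group by note above can be pictured with a small sketch. It is not PDI code; the function name `join_rows` and the sample rows are hypothetical, and it only shows how per-row strings (header, data lines, footer) end up in one field when joined with a CR/LF delimiter.

```python
# Hypothetical illustration, not PDI code: joins already-concatenated row
# strings the way a Group by step with "Concatenate strings separated by"
# and a CR/LF line delimiter is described to do.
CRLF = "\r\n"


def join_rows(lines, delimiter=CRLF):
    """Join per-row strings into a single field value."""
    return delimiter.join(lines)


# Sample rows: a header line, two data lines and a footer line (made up).
rows = ["id;name", "1;Alice", "2;Bob", "2 rows"]
document = join_rows(rows)
```

The resulting `document` string is what would otherwise have been written to a file line by line.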

4.4.0 release note: An issue with this step (PDI-8857) was found too late to be fixed in 4.4.0: the step adds a carriage return and line feed to the fields it creates. The workaround is either to use the String operations step with the option "carriage return & line feed" after this step, or to enable the advanced option "Fast data dump (no formatting)".
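
As a rough picture of what the first workaround achieves, the sketch below strips carriage returns and line feeds from a concatenated value. It is an illustration only: the function name is hypothetical, and it assumes that every CR and LF character is unwanted, which may be broader than what the String operations step does.

```python
def strip_line_endings(value):
    """Remove carriage returns and line feeds from a field value.

    Assumption: all CR/LF characters are unwanted, not only trailing ones.
    """
    return value.replace("\r", "").replace("\n", "")


# A value as the affected step would produce it, with CR/LF appended.
concatenated = "1;Alice\r\n"
cleaned = strip_line_endings(concatenated)
```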

Options

Option Description
Step name Name of the step. Note: This name has to be unique in a single transformation.
Target Field Name The name of the target field (String type)
Length of Target Field The length of the target String field. This sets the metadata of the String type and is also used by the Fast data dump option for performance optimization.
Separator Specify the character that separates the fields in a single line of text. Typically this is ; or a tab.
Enclosure A pair of strings can enclose some fields. This allows separator characters in fields. The enclosure string is optional.
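
To see why an enclosure allows separator characters inside a field, consider the sketch below. It is not the step's implementation: the sample values are made up, and Python's standard `csv` module stands in for whatever later reads the target field.

```python
import csv

SEPARATOR = ";"
ENCLOSURE = '"'

# Sample values (made up); the first one contains the separator.
fields = ["Smith; John", "42"]

# Without an enclosure, the separator inside the first value is ambiguous.
plain = SEPARATOR.join(fields)

# Enclosing the value that contains the separator keeps it intact.
enclosed = SEPARATOR.join([ENCLOSURE + fields[0] + ENCLOSURE, fields[1]])


def parse(line):
    """Split one concatenated line back into its fields."""
    return next(csv.reader([line], delimiter=SEPARATOR, quotechar=ENCLOSURE))
```

Parsing `plain` yields three fields instead of two, while parsing `enclosed` returns the original two values.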

Fields Tab

This is identical to the fields tab option of the Text File Output step and has the same functionality.

Option Description
Name The name of the field.
Type Type of the field can be either String, Date or Number.
Format The format mask to convert with. See Number Formats for a complete description of format symbols.
Length The length option depends on the field type as follows:
  • Number - Total number of significant figures in a number
  • String - total length of string
  • Date - length of printed output of the string (e.g. 4 only gives back year)
Precision The precision option depends on the field type as follows:
  • Number - Number of floating point digits
  • String - unused
  • Date - unused
Currency Symbol used to represent currencies like $10,000.00 or €5.000,00
Decimal A decimal point can be a "." (10,000.00) or "," (5.000,00)
Group A grouping can be a "," (10,000.00) or "." (5.000,00)
Trim type The trimming method to apply on the string. Trimming only works when there is no field length given. (see feature request PDI-2486)
Null If the value of the field is null, insert this string into the textfile
Get Click to retrieve the list of fields from the input fields stream(s)
Minimal width Alter the options in the fields tab in such a way that the resulting width of lines in the text file is minimal. So instead of saving 0000001, we write 1, etc. String fields will no longer be padded to their specified length.
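
The Decimal, Group and Minimal width options above can be illustrated with a short sketch. It is a simplification under stated assumptions, not the step's formatting code: both function names are hypothetical, numbers are fixed at two decimals, and zero padding stands in for what a format mask such as 0000000 would produce.

```python
def format_number(value, decimal=".", group=","):
    """Format a number with two decimals and the given symbols."""
    text = f"{value:,.2f}"  # e.g. 10000 -> '10,000.00'
    # Swap both symbols in one pass so they do not overwrite each other.
    return text.translate({ord(","): group, ord("."): decimal})


def render(value, length, minimal_width=False):
    """Pad a value to `length`, or leave it unpadded for minimal width.

    Assumption: integers are zero padded on the left, strings are space
    padded on the right.
    """
    text = str(value)
    if minimal_width or length <= 0:
        return text
    if isinstance(value, int):
        return text.zfill(length)
    return text.ljust(length)
```

For example, `format_number(5000, decimal=",", group=".")` gives `5.000,00`, and `render(1, 7)` gives `0000001` while `render(1, 7, minimal_width=True)` gives `1`.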

Advanced Tab

Option Description
Remove selected fields Check this to remove all selected fields from the output stream.
Force the enclosure around fields? This option forces all field names to be enclosed with the character specified in the Enclosure property above.
Disable the enclosure fix? This is for backward compatibility reasons (since version 4.1) related to enclosures and separators. The logic since version 4.1 is: when a string field contains an enclosure, the field gets enclosed and the enclosure itself gets escaped; when a string field contains a separator, the field gets enclosed. Check this option if this logic is not wanted. The logic also carries an extra performance cost, since the strings are scanned for enclosures and separators. So when you are sure this logic is not needed because your strings do not contain these characters, and you want to improve performance, check this option.
Header Enable this option if you want a header row. (First line in the stream). Note: All other output stream fields are set to Null when this line is produced.
Footer Enable this option if you want a footer row. (Last line in the stream). Note: All other output stream fields are set to Null when this line is produced.
Encoding Specify the String encoding to use. Leave blank to use the default encoding on your system. To use Unicode specify UTF-8 or UTF-16. On first use, Spoon will search your system for available encodings. Note: This is needed especially when you concatenate fields with different encodings into a target field with a single encoding. This also applies to String fields that are stored as binary because of lazy conversion.
Right pad fields Add spaces to the end of the fields (or remove characters at the end) until they have the specified length.
Fast data dump (no formatting) Improves performance when concatenating large amounts of data by not applying any formatting. Consider setting the "Length of Target Field" option to approximately the maximum length of the target field. This improves performance because the internal buffer is allocated up front and does not need to be reallocated when it turns out to be too small.
Note: When the "Length of Target Field" option is "0", the internal buffer size is calculated as 50 times the number of concatenated fields; for instance, an internal buffer of 250 is used by default for 5 fields.
Split every ... rows If this number N is larger than zero, split the resulting stream into multiple parts of N rows. Note: This is only needed when a Header/Footer is used to be compatible with the result of the Text File Output step.
Add Ending line of file Allows you to specify an alternate ending row to the output stream. Note: All other output stream fields are set to Null when this line is produced.
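
Two behaviours from the table above, the enclosure fix and the default buffer size, can be sketched as follows. This is an illustration under assumptions rather than the step's code: the function names are hypothetical, escaping is assumed to double the enclosure (the text only says it "gets escaped"), and the buffer is assumed to equal the "Length of Target Field" when that value is greater than zero.

```python
def enclose_if_needed(text, separator=";", enclosure='"', disable_fix=False):
    """Apply the enclosure logic described for version 4.1 and later."""
    if disable_fix or not enclosure:
        # Fix disabled: no scanning, the value is passed through unchanged.
        return text
    has_enclosure = enclosure in text
    has_separator = separator in text
    if has_enclosure:
        # Assumption: the enclosure is escaped by doubling it.
        text = text.replace(enclosure, enclosure * 2)
    if has_enclosure or has_separator:
        return enclosure + text + enclosure
    return text


def default_buffer_size(field_count, target_length=0):
    """Return the initial buffer size used for the target field.

    Assumption: a positive target length is used as-is; a length of 0
    falls back to 50 characters per concatenated field.
    """
    return target_length if target_length > 0 else 50 * field_count
```

With five fields and a target length of 0, `default_buffer_size(5)` gives 250, matching the note above. Skipping the scan in `enclose_if_needed` is where the performance gain of disabling the fix would come from.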

Metadata Injection Support

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.

