Purpose of the step
The Edi to Xml step converts EDI message text (EDIFACT, conforming to the ISO 9735 standard) to generic XML. The XML text is easier to work with and allows for selective data extraction using XPath and the Get Data From XML step.
The step configuration requires the name of the field containing the EDI text and, optionally, an output field name for the XML text. If the output field name is left empty, the XML text replaces the EDI text in the input field.
The XML output has the following structure:
<edifact>
  <SEGMENT>
    <element>
      <value></value>
      ...
    </element>
    ...
  </SEGMENT>
  ...
</edifact>
The conversion rules are:
- the root of the document is the "edifact" tag
- each segment in the EDIFACT message is converted to a tag, using the segment name as the tag name
- each field within a segment is represented by an "element" tag
- each value within a field is represented by a "value" tag
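The rules above can be illustrated with a small Python sketch. This is not the step's actual implementation, only an approximation of the mapping under stated assumptions: the input has no UNA header, the default delimiters apply (' ends a segment, + separates fields, : separates values, ? is the release character), and the segment name is used only as the tag name and not repeated as an "element". The function names edi_to_xml, split_unescaped and unescape are hypothetical.

```python
import xml.etree.ElementTree as ET


def split_unescaped(text, sep, release="?"):
    """Split on sep, leaving release-escaped characters (e.g. ?+) untouched."""
    parts, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == release and i + 1 < len(text):
            current.append(text[i:i + 2])  # keep the escape pair for later levels
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def unescape(text, release="?"):
    """Drop release characters, keeping the character each one protects."""
    out, i = [], 0
    while i < len(text):
        if text[i] == release and i + 1 < len(text):
            out.append(text[i + 1])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def edi_to_xml(edi):
    """Build an <edifact> tree: segment tag > element > value (assumed mapping)."""
    root = ET.Element("edifact")
    for raw in split_unescaped(edi, "'"):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the last segment terminator
        fields = split_unescaped(raw, "+")
        segment = ET.SubElement(root, fields[0])
        for field in fields[1:]:
            element = ET.SubElement(segment, "element")
            for value in split_unescaped(field, ":"):
                ET.SubElement(element, "value").text = unescape(value)
    return root


sample = "UNH+1+ORDERS:D:96A:UN'BGM+220+PO?+123+9'UNT+3+1'"
tree = edi_to_xml(sample)
print(ET.tostring(tree, encoding="unicode"))
```

With a tree of this shape, a path such as ./UNH/element[2]/value[1] selects the message type, which is the kind of lookup the Get Data From XML step performs with XPath.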
When working with multi-message EDIFACT strings, the generated XML may still be difficult to work with. In some scenarios it may be more practical to first split the multi-message string into multiple rows (on message boundaries) and then convert each row to XML.
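The splitting idea can be sketched in plain Python, outside of PDI. The sketch assumes the default delimiters, treats each UNH ... UNT run as one message, and drops envelope segments such as UNB and UNZ; whether to keep the envelope is a design choice that depends on the transformation. The function name split_messages is hypothetical.

```python
import re

# One segment: any run of escaped pairs (?x) or ordinary characters,
# ended by an unescaped segment terminator (').
SEGMENT = re.compile(r"(?:\?.|[^'?])*'", re.S)


def split_messages(interchange):
    """Return one string per message, each spanning UNH through UNT."""
    messages, current = [], None
    for seg in SEGMENT.findall(interchange):
        seg = seg.strip()
        tag = seg[:3]
        if tag == "UNH":
            current = [seg]          # a new message starts here
        elif current is not None:
            current.append(seg)
            if tag == "UNT":         # message trailer closes the message
                messages.append("".join(current))
                current = None
    return messages


interchange = (
    "UNB+UNOA:1+SENDER+RECEIVER+200101:1200+1'"
    "UNH+1+ORDERS:D:96A:UN'BGM+220+A'UNT+3+1'"
    "UNH+2+ORDERS:D:96A:UN'BGM+220+B?'S'UNT+3+2'"
    "UNZ+2+1'"
)
for message in split_messages(interchange):
    print(message)
```

Each returned string could then be placed in its own row and passed through the conversion separately.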
A complete sample of using the Edi to Xml step is bundled with PDI in samples/transformations/Edifact - using the Edi2XML step.ktr
The sample transformation is explained in detail in the following article: http://type-exit.org/adventures-with-open-source-bi/kettle-plugins/edi2xml-plugin/
Currently the step only supports the default UNA settings, namely UNA:+.? ' or UNA:+,? '. If the UNA header is missing, UNA:+.? ' is assumed.
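A pre-check along these lines can flag input that the step would not handle. The following Python sketch is an assumption about how such a check might look, not part of the step: it reads the six service characters that follow "UNA" (component separator, element separator, decimal mark, release character, reserved space, segment terminator), falls back to the default header when none is present, and reports whether the header is one of the two supported variants. The function name read_una is hypothetical.

```python
DEFAULT_UNA = "UNA:+.? '"
SUPPORTED_UNA = {"UNA:+.? '", "UNA:+,? '"}


def read_una(edi):
    """Return the service characters in effect for an EDIFACT string."""
    text = edi.lstrip()
    if text.startswith("UNA"):
        if len(text) < 9:
            raise ValueError("truncated UNA header")
        header = text[:9]
    else:
        header = DEFAULT_UNA  # missing header: defaults are assumed
    return {
        "header": header,
        "component_separator": header[3],
        "element_separator": header[4],
        "decimal_mark": header[5],
        "release_character": header[6],
        "segment_terminator": header[8],
        "supported": header in SUPPORTED_UNA,
    }


info = read_una("UNA:+,? 'UNB+UNOA:1+SENDER+RECEIVER+200101:1200+1'")
print(info["decimal_mark"], info["supported"])
```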