Hitachi Vantara Pentaho Community Wiki

Description

The XML Join step adds XML tags from one stream into a leading XML structure from a second stream. The target stream must contain exactly one row, because it represents a single XML document. The source stream can contain many rows, and the tags from all of its rows are added to the target document.

After the join, the step produces a single row containing the fields of the target step plus the result field of the join.

Options

Property

Description

Target XML Step

Step that sends the target document to the join

Target XML Field

Field that contains the XML structure

Source XML Step

Step that sends the XML structure(s) to be added to the target

Source XML Field

Field that contains the XML structures that get added to the target

XPath Statement

XPath statement that locates the node in the target document where the tags will be added.
When the complex join is enabled, a single ? is used as a placeholder.

Complex Join

Flag to enable the complex join syntax, which uses the placeholder in the XPath Statement

Join Comparison Field

Field that contains the values that replace the placeholder in the XPath Statement

Result XML field

The field that will contain the result.

Encoding

Encoding to be used in the XML Header and to transform the XML.

Omit XML header

Whether to omit the XML header. The encoding of the target XML is preserved.

Omit null values from XML result

When this option is unchecked, null values are added to the XML output as empty elements, for instance: <abc/>
When this option is checked, these tags are omitted from the output entirely. This is useful for saving space in the output file (for high-volume transactions) and for creating output with special requirements.
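The effect of "Omit null values from XML result" can be pictured with a small Python sketch. It is not the step's implementation. It assumes that a "null value" means an element with no attributes, no child elements, and no text, which this page does not state explicitly. The function name `drop_empty_elements` and the sample row are hypothetical.

```python
import xml.etree.ElementTree as ET


def drop_empty_elements(element):
    """Remove descendants that have no attributes, children, or text.

    ASSUMPTION: this is only one plausible reading of "null value";
    the real step may apply different rules.
    """
    for child in list(element):
        # Work bottom-up so a parent that becomes empty is removed too.
        drop_empty_elements(child)
        is_empty = (
            len(child) == 0
            and not child.attrib
            and not (child.text or "").strip()
        )
        if is_empty:
            element.remove(child)


# Hypothetical row: <abc/> stands for a null value, <def> carries data.
root = ET.fromstring("<row><abc/><def>x</def></row>")
drop_empty_elements(root)
pruned = ET.tostring(root, encoding="unicode")
print(pruned)
```

With the option unchecked, the equivalent of `<abc/>` would stay in the output. The sketch corresponds to the checked state.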

Examples

Please see XML Join - Create a multilayer XML file.ktr in the data-integration/samples/transformations folder that is described here:

Regular join sample

The main XML structure coming from xmlOrderList

       <OrderList businessUnitId="EU10" plantId="EU11" source="XY" timestamp="19700101">
                <OrderHeaders>
                </OrderHeaders>
       </OrderList>

Two Rows from xmlOrder Headers:

    <OrderHeader customerNumber="1000" orderDate="19700101" orderNumber="4711" orderType="AN"
       originator="G">
                <OrderLines>
                </OrderLines>
                <OrderHeaderComments>
                </OrderHeaderComments>
       </OrderHeader>
       <OrderHeader customerNumber="1001" orderDate="19700101" orderNumber="4712" orderType="AN"
       originator="G">
                <OrderLines>
                </OrderLines>
                <OrderHeaderComments>
                </OrderHeaderComments>
       </OrderHeader>


Result after join:

       <OrderList businessUnitId="EU10" plantId="EU11" source="XY" timestamp="19700101">
                <OrderHeaders>
                        <OrderHeader customerNumber="1000" orderDate="19700101" orderNumber="4711"
                        orderType="AN" originator="G">
                                <OrderLines/>
                                <OrderHeaderComments/>
                        </OrderHeader>
                        <OrderHeader customerNumber="1001" orderDate="19700101" orderNumber="4712"
                        orderType="AN" originator="G">
                                <OrderLines/>
                                <OrderHeaderComments/>
                        </OrderHeader>
                </OrderHeaders>
       </OrderList>
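The regular join above can be approximated outside of PDI with a short Python sketch. It only illustrates the semantics (one target document, many source fragments appended at one node) and is not how the step is implemented. The function name `simple_join` is hypothetical. The path `./OrderHeaders` is written relative to the root element because Python's `xml.etree.ElementTree` does not accept absolute paths on elements; the XPath Statement configured in the step may look different.

```python
import xml.etree.ElementTree as ET

# Target stream: exactly one row holding the target document.
target_xml = (
    '<OrderList businessUnitId="EU10" plantId="EU11" source="XY" '
    'timestamp="19700101"><OrderHeaders></OrderHeaders></OrderList>'
)

# Source stream: one XML fragment per row.
source_rows = [
    '<OrderHeader customerNumber="1000" orderDate="19700101" '
    'orderNumber="4711" orderType="AN" originator="G">'
    '<OrderLines></OrderLines><OrderHeaderComments></OrderHeaderComments>'
    '</OrderHeader>',
    '<OrderHeader customerNumber="1001" orderDate="19700101" '
    'orderNumber="4712" orderType="AN" originator="G">'
    '<OrderLines></OrderLines><OrderHeaderComments></OrderHeaderComments>'
    '</OrderHeader>',
]


def simple_join(target, fragments, path):
    """Append every source fragment under the node selected by `path`."""
    root = ET.fromstring(target)
    parent = root.find(path)
    if parent is None:
        raise ValueError(f"no node matches {path!r}")
    for fragment in fragments:
        parent.append(ET.fromstring(fragment))
    # One result row: the joined document as a string.
    return ET.tostring(root, encoding="unicode")


result = simple_join(target_xml, source_rows, "./OrderHeaders")
print(result)
```

The single returned string plays the role of the result XML field: however many source rows arrive, only one joined document comes out.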

Complex join sample

The XPath Statement includes the placeholder "?", which gets substituted by the value of the join comparison field.


 
Input coming from the previous XML Join

        <OrderList businessUnitId="EU10" plantId="EU11" source="XY" timestamp="19700101">
                <OrderHeaders>
                         <OrderHeader customerNumber="1000" orderDate="19700101" orderNumber="4711"
                         orderType="AN" originator="G">
                                 <OrderLines/>
                                 <OrderHeaderComments/>
                         </OrderHeader>
                         <OrderHeader customerNumber="1001" orderDate="19700101" orderNumber="4712"
                         orderType="AN" originator="G">
                                 <OrderLines/>
                                 <OrderHeaderComments/>
                         </OrderHeader>
                </OrderHeaders>
        </OrderList>

Input coming from xmlOrderHeaderComments

        <OrderHeaderComment lineNumber="1" text="double lines, line1"/>
        <OrderHeaderComment lineNumber="2" text="double lines, line2"/>
        <OrderHeaderComment lineNumber="1" text="a comment with special characters: äöüÜÖÄß&lt;&gt;!"/>

Result after XML Join:

        <OrderList businessUnitId="EU10" plantId="EU11" source="XY" timestamp="19700101">
                <OrderHeaders>
                         <OrderHeader customerNumber="1000" orderDate="19700101" orderNumber="4711"
                         orderType="AN" originator="G">
                                 <OrderLines/>
                                 <OrderHeaderComments>
                                         <OrderHeaderComment lineNumber="1" text="double lines, line1"/>
                                 </OrderHeaderComments>
                         </OrderHeader>
                         <OrderHeader customerNumber="1001" orderDate="19700101" orderNumber="4712"
                         orderType="AN" originator="G">
                                 <OrderLines/>
                                 <OrderHeaderComments>
                                         <OrderHeaderComment lineNumber="2" text="double lines, line2"/>
                                         <OrderHeaderComment lineNumber="1" text="a comment with special characters: äöüÜÖÄß&lt;&gt;!"/>
                                 </OrderHeaderComments>
                         </OrderHeader>
                </OrderHeaders>
        </OrderList>
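The placeholder substitution behind the complex join can be sketched in Python as follows. This page shows neither the XPath Statement nor the comparison values used in the sample, so both are assumptions here: the sketch matches each comment to an `OrderHeader` by its `orderNumber` attribute, and the comparison values "4711" and "4712" are chosen so that the output matches the result above. The function name `complex_join` is hypothetical, and the path is relative to the root element because `xml.etree.ElementTree` requires it.

```python
import xml.etree.ElementTree as ET

TARGET_XML = """<OrderList businessUnitId="EU10" plantId="EU11" source="XY" timestamp="19700101">
  <OrderHeaders>
    <OrderHeader customerNumber="1000" orderDate="19700101" orderNumber="4711" orderType="AN" originator="G">
      <OrderLines/>
      <OrderHeaderComments/>
    </OrderHeader>
    <OrderHeader customerNumber="1001" orderDate="19700101" orderNumber="4712" orderType="AN" originator="G">
      <OrderLines/>
      <OrderHeaderComments/>
    </OrderHeader>
  </OrderHeaders>
</OrderList>"""

# One (comparison value, XML fragment) pair per source row.
# ASSUMPTION: the comparison field carries the order number.
SOURCE_ROWS = [
    ("4711", '<OrderHeaderComment lineNumber="1" text="double lines, line1"/>'),
    ("4712", '<OrderHeaderComment lineNumber="2" text="double lines, line2"/>'),
    ("4712", '<OrderHeaderComment lineNumber="1" '
             'text="a comment with special characters: äöüÜÖÄß&lt;&gt;!"/>'),
]

# ASSUMPTION: illustrative statement with a single "?" placeholder.
PATH_TEMPLATE = "./OrderHeaders/OrderHeader[@orderNumber='?']/OrderHeaderComments"


def complex_join(target, rows, path_template):
    """For each row, fill in the placeholder and append the fragment there."""
    root = ET.fromstring(target)
    for compare_value, fragment in rows:
        path = path_template.replace("?", compare_value)
        parent = root.find(path)
        if parent is None:
            raise ValueError(f"no node matches {path!r}")
        parent.append(ET.fromstring(fragment))
    return root


joined = complex_join(TARGET_XML, SOURCE_ROWS, PATH_TEMPLATE)
print(ET.tostring(joined, encoding="unicode"))
```

Because the path is resolved once per source row, rows with the same comparison value end up under the same node, in arrival order. Raising an error when no node matches is a choice made for this sketch, not a statement about how the step reacts.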

Metadata Injection Support (7.x and later)

All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.
