XML Input stage in DataStage
You can transform hierarchical XML data to flat relational tables by using the XML Input stage.
Use the XML Input stage to extract, validate, and transform XML data. You can extract data from a single column in a table or a whole document. XML Input supports a single input link and one or more output links.
Stage tab
Specify properties for the stage. For more information, see XML Input: Stage tab (DataStage).
Input tab
In the Input tab, specify the input column and the format of the XML document. An input column can contain an XML document, a URL, or a file path.
Output tab
In the Output tab, you can specify properties on the output links. You can designate one output link as a reject link that receives rejection messages and rejected rows, and select the output column in which to store them.
You can also specify whether to inherit the Transformation properties from the stage, and use the Load box to specify XPath expressions. XPath expressions are used on output links to identify data in an XML document and transform it into columns and rows. See Transformation settings for more information. If you do not supply an XPath expression, the stage can use a passthrough mechanism to copy data without modification from an input link to an output link. This requires an exact match between the input and output column names, which are case-sensitive.
To select a repetition element, click Edit under Columns and select one of the columns as a key. The stage generates one output row for each occurrence of the repetition element.
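The row-per-occurrence behavior can be illustrated outside DataStage with a minimal Python sketch. It is not the stage's implementation: the sample document, the `REPETITION` and `COLUMNS` names, and the use of paths relative to the repetition element are all assumptions made for illustration, and `xml.etree.ElementTree` supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; <order> plays the role of the repetition element.
DOC = """
<orders>
  <order id="1001">
    <customer>Ada</customer>
    <total>25.50</total>
  </order>
  <order id="1002">
    <customer>Grace</customer>
    <total>40.00</total>
  </order>
</orders>
"""

# Hypothetical output-link definition: one path per output column,
# written relative to the repetition element (an assumption of this sketch).
REPETITION = "./order"
COLUMNS = {
    "ORDER_ID": "@id",
    "CUSTOMER": "customer",
    "TOTAL": "total",
}


def extract_rows(xml_text, repetition, columns):
    """Return one dict per occurrence of the repetition element."""
    root = ET.fromstring(xml_text)
    rows = []
    for node in root.findall(repetition):
        row = {}
        for name, path in columns.items():
            if path.startswith("@"):
                # Attribute lookup on the repetition element itself.
                row[name] = node.get(path[1:])
            else:
                # Text of the first matching child element, or None if absent.
                row[name] = node.findtext(path)
        rows.append(row)
    return rows


rows = extract_rows(DOC, REPETITION, COLUMNS)
for row in rows:
    print(row)
```

Because the sample contains two `<order>` elements, the sketch produces two rows, each with the three columns named in `COLUMNS`.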
<!ELEMENT table (row*)>
<!ELEMENT row (column*)>
<!ELEMENT column (#PCDATA | NULL)*>
<!ATTLIST column name CDATA #REQUIRED >
<!ELEMENT NULL EMPTY>
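As a hedged illustration, the following Python sketch reads a small document shaped like the declarations above and turns each `row` element into a record keyed by the `name` attribute of its `column` children. The sample data and the `table_to_rows` helper are hypothetical, and treating a `NULL` child element as a null column value is an assumption of this sketch rather than documented stage behavior.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document following the table/row/column structure.
SAMPLE = """
<table>
  <row>
    <column name="ID">1</column>
    <column name="CITY">Austin</column>
  </row>
  <row>
    <column name="ID">2</column>
    <column name="CITY"><NULL/></column>
  </row>
</table>
"""


def table_to_rows(xml_text):
    """Convert each <row> into a dict of column name -> value."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall("row"):
        record = {}
        for col in row.findall("column"):
            if col.find("NULL") is not None:
                # Assumption: a NULL child marks a null value.
                record[col.get("name")] = None
            else:
                # Empty elements are read as empty strings.
                record[col.get("name")] = col.text or ""
        rows.append(record)
    return rows


for record in table_to_rows(SAMPLE):
    print(record)
```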