XML

Synopsis

HELP:   Read XML data from a file
TYPE:   OBJECT
SYNTAX: XML(NET.{},FILE['str'/STREAM/DUMMY...],BLKSIZE=num,NOCMNT,NODFLT,NOEMPD,ADDEMP,NOPINS,NODTD,CCDATA,DATLEN=num,LENERR,CCSID='str'/DEFAULT/ASCII/EBCDIC/BOMUTF/BOMUCS/SYSTEM/LOCAL,CHRMODE=STOP/IGNORE/SUBSTITUTE/IDENTITY/TRANSLIT,SKIPEQUAL,USRTABLE='str'/NPAS/SEPA/DELA/DLAX,ONEMAP,COMBINED=NFD/NFC/AUTO/ON/OFF,BOM,KEEPBOM,ENL2LF,RPLFFD[=num],DECRYPT[{}...],SUBSYSTEM(),FRCBLK,REMOVE,LANG='str',PLATFORM=WIN/UNX/ZOS/USS/VSE/BS2/MAC,OWNER='str',ENVID='str',HASH(),SIGNATURE.{},CHECK,AVSCAN(),NOARCH,PREPROCESS[()...],POSTPROCESS/PSTPRO[()...])

Description

Read XML works on blocks of binary or text data. Character conversion takes place before the XML data is processed. If no CCSID is provided, auto detection is used. If UTF-8 is detected, character conversion is skipped. If a CCSID is supplied, the character data is converted from that CCSID to UTF-8 before XML processing. Line delimiters must be one of 0x0A, 0x0D or 0x0D0A (after conversion to UTF-8). If the input data is encoded (e.g. Base64), encrypted, compressed or contains 4 byte length fields, it is automatically decoded, decrypted, and/or decompressed to build a valid XML data block. During parsing, all line delimiters are normalized to line feed characters (0x0A) as defined by the XML specification.
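
The line delimiter normalization described above can be illustrated with a minimal Python sketch. It is not FLAM code; the function name normalize_line_delimiters is hypothetical and only mirrors the rule that 0x0D0A and 0x0D both become 0x0A once the data is UTF-8.

```python
def normalize_line_delimiters(data: bytes) -> bytes:
    """Map CRLF (0x0D0A) and lone CR (0x0D) to LF (0x0A).

    Hypothetical helper for illustration only; assumes the input
    is already UTF-8, so CR and LF are single bytes.
    """
    # Replace CRLF first, otherwise the lone-CR pass would turn
    # one CRLF into two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    sample = b"<a>\r\n<b/>\r</a>\n"
    print(normalize_line_delimiters(sample))
```

Replacing the two-byte delimiter before the single-byte one is the only ordering that keeps the number of lines unchanged.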

The XML data is parsed using the Expat library and transformed into FLAM elements. This allows various formatting options when writing, such as minimizing or pretty printing the XML data, or restoring an equivalent of the original document.

There are several switches available to exclude certain types of XML elements from the parsed element list.

When reading XML documents through the byte or record interface using the element formatter (format.element()), the character data between a start tag and its end tag may be split into multiple XML data elements, because the data can be of arbitrary length. The DATLEN parameter controls the minimum length a data element must reach before it is split (default is 1024). Note that this is a minimum, so the data elements returned may actually be much larger. As a rule of thumb, the maximum length of data elements is close to the input buffer size (but may exceed it). If you want to get all data between any pair of start and end tags as a single data element, set the DATLEN parameter to a large number. Be aware, however, that this might require a considerable amount of memory, depending on the maximum data length in tag bodies. With the LENERR parameter, an error occurs if a data element exceeds the specified length. If the ignore empty data (NOEMPD) switch is set, data elements containing only whitespace are suppressed. Additionally, with the ADDEMP switch you can insert an empty data element into the element list if an end tag directly follows a start tag.
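
To make the idea of a minimum data element length more concrete, the following Python sketch uses the standard xml.parsers.expat module (a binding to the same Expat library). The buffering logic, the function name parse_with_min_chunk and its datlen and feed_size parameters are assumptions made for illustration; they imitate the documented behavior and are not FLAM's implementation.

```python
import xml.parsers.expat


def parse_with_min_chunk(xml_bytes, datlen=1024, feed_size=16):
    """Return a list of (kind, value) tuples for tags and data.

    Character data is buffered and emitted as a DATA element once
    at least `datlen` characters are collected, or when a tag
    boundary is reached. Hypothetical sketch, not FLAM code.
    """
    elements = []
    pending = []
    pending_len = 0

    def flush():
        nonlocal pending_len
        if pending:
            elements.append(("DATA", "".join(pending)))
            pending.clear()
            pending_len = 0

    def on_start(name, attrs):
        flush()
        elements.append(("START", name))

    def on_end(name):
        flush()
        elements.append(("END", name))

    def on_data(text):
        nonlocal pending_len
        pending.append(text)
        pending_len += len(text)
        # datlen is a minimum: the emitted element may be larger.
        if pending_len >= datlen:
            flush()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_data

    # Feed the document in small blocks to mimic block-wise input.
    for pos in range(0, len(xml_bytes), feed_size):
        parser.Parse(xml_bytes[pos:pos + feed_size], False)
    parser.Parse(b"", True)
    return elements


if __name__ == "__main__":
    doc = b"<a>" + b"x" * 100 + b"</a>"
    for kind, value in parse_with_min_chunk(doc, datlen=10):
        print(kind, len(value) if kind == "DATA" else value)
```

With a datlen larger than the tag body, the sketch returns the whole body as a single data element, at the cost of holding it in memory. With a small datlen, the size of each piece depends on how the parser delivers the character data, which is why the value acts only as a lower bound.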

On some EBCDIC machines, if character conversion from an EBCDIC charset is used, the new line character (0x15) is not properly converted, causing the XML parser to fail. In this case, turn on the ENL2LF switch to enable proper conversion of new lines (0x15) to line feeds (0x0A).
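
The effect of the EBCDIC new line character can be reproduced with a short Python sketch. It assumes Python's cp037 codec as a stand-in for an EBCDIC charset; in that codec, byte 0x15 decodes to U+0085 (NEL) rather than to a line feed. The function name decode_ebcdic_nel_to_lf is hypothetical and only mimics what the switch is described to do.

```python
NEL = "\u0085"  # what EBCDIC 0x15 becomes after plain conversion


def decode_ebcdic_nel_to_lf(data: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes and map new line (NEL) to line feed.

    Hypothetical helper for illustration; cp037 is an assumed
    example charset, not necessarily the one used on your system.
    """
    text = data.decode(codec)
    return text.replace(NEL, "\n")


if __name__ == "__main__":
    raw = "<a>x</a>".encode("cp037") + b"\x15"
    print(repr(raw.decode("cp037")))           # ends with NEL
    print(repr(decode_ebcdic_nel_to_lf(raw)))  # ends with line feed
```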

If reading XML, the semantics of write modes (write.*()) change as follows:

Known limitations:

Arguments