flcl_manual-flcl_commands-xcnv-input-save-file-fmt-xml

XML

Synopsis

HELP:   Format data stream in XML elements (tags, attributes, data, ...)
TYPE:   OBJECT
SYNTAX: XML(CHRSET=NONE/SYSTEM/ASCII/UCS1/UTF8/EBCDIC/UCS2BE/UTF16BE/UCS2LE/UTF16LE/UCS4BE/UTF32BE/UCS4LE/UTF32LE/LOCAL,FMTERR,NOCMNT,NODFLT,NOEMPD,ADDEMP,NOPINS,NODTD,CCDATA,NAMSPC[=num/LF2/LF3/TAB2/TAB3/PIPE2/PIPE3/COLON2/COLON3/SLASH2/SLASH3/REMOVE],USELCH,DATLEN=num,LENERR,BUFSIZ=num,INICNT=num)

Description

The "format XML" object parses a block of text data containing an XML document. The XML data is parsed with the Expat library and transformed into FLAM elements, which allows various formatting options when writing, including minimizing and pretty-printing the XML data.

The text data must be in UTF-8 or ASCII. Lines must be delimited with 0x0A, 0x0D or 0x0D0A. During parsing, all line delimiters are normalized to line feed characters (0x0A) as defined by the XML specification.
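
The normalization rule can be illustrated with a few lines of Python. This is only a sketch of the end-of-line handling described above, not the code used by the XML formatter; the function name normalize_line_delimiters is made up for this example.

```python
def normalize_line_delimiters(data: bytes) -> bytes:
    """Map 0x0D0A and a lone 0x0D to 0x0A (illustrative helper, hypothetical name)."""
    # Replace the two-byte delimiter first, otherwise it would become two line feeds.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b"<a>\r\n<b/>\r</a>\n"
normalized = normalize_line_delimiters(sample)
```

After the call, every line in normalized ends with a single 0x0A, whichever of the three delimiters the input used.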

There are several switches available to exclude certain types of XML elements from the parsed element list.
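
The effect of such switches can be sketched with the Expat binding from the Python standard library. The example below is an assumption-laden illustration: the function parse_elements, its keyword arguments nocmnt and nopins, and the tuple layout of the element list are invented here and only mimic the NOCMNT and NOPINS switches; they are not the FLAM API.

```python
import xml.parsers.expat


def parse_elements(xml_text, nocmnt=False, nopins=False):
    """Build a simple element list; the flags mimic NOCMNT and NOPINS (hypothetical)."""
    elements = []
    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous character data in one piece
    parser.StartElementHandler = lambda name, attrs: elements.append(("start", name))
    parser.EndElementHandler = lambda name: elements.append(("end", name))
    parser.CharacterDataHandler = lambda data: elements.append(("data", data))
    if not nocmnt:
        # Comments only reach the list if they are not excluded.
        parser.CommentHandler = lambda text: elements.append(("comment", text))
    if not nopins:
        # Same for processing instructions.
        parser.ProcessingInstructionHandler = (
            lambda target, data: elements.append(("pi", target))
        )
    parser.Parse(xml_text, True)
    return elements


doc = "<root><!-- note --><?app do?><item>x</item></root>"
all_elements = parse_elements(doc)
filtered = parse_elements(doc, nocmnt=True, nopins=True)
```

Here all_elements contains the comment and the processing instruction, while filtered contains only the start tags, end tags and data.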

When reading XML documents through the byte or record interface using the element formatter (format.element()), the character data between start and end tags may be split into multiple XML data elements, because the data can be of arbitrary length. The DATLEN parameter controls the minimum length of a data element before it is split (default is 1024). Note that this is a minimum, so the data elements returned may actually be much larger. As a rule of thumb, the maximum length of data elements is close to the input buffer size (but may exceed it).

If you want to get all data between any pair of start and end tags as a single data element, set the DATLEN parameter to a large number. Be aware, however, that this might require a considerable amount of memory, depending on the maximum data length in tag bodies. With the LENERR switch, an error occurs if a data element exceeds the specified length.

If the ignore empty data (NOEMPD) switch is set, data elements containing only whitespace are suppressed. With the ADDEMP switch, an empty data element is inserted into the element list if an end tag directly follows a start tag (only relevant when writing custom applications using the APIs).
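
The interplay of a minimum data length, whitespace suppression and inserted empty data elements can be sketched as follows. This is a minimal model built on the Expat binding of the Python standard library, under the assumption that data is collected until the minimum length is reached or a tag boundary occurs. The function collect, its arguments datlen, noempd, addemp and feed_size, and the element tuples are hypothetical names for this illustration; the real formatter may behave differently in detail.

```python
import xml.parsers.expat


def collect(xml_text, datlen=1024, noempd=False, addemp=False, feed_size=3):
    """Return a list of (kind, value) elements for xml_text (illustrative only)."""
    elements = []
    pending = []      # character data chunks not yet emitted
    last_kind = None  # kind of the most recently emitted element

    def emit(kind, value):
        nonlocal last_kind
        elements.append((kind, value))
        last_kind = kind

    def flush():
        if not pending:
            return
        text = "".join(pending)
        pending.clear()
        if noempd and text.strip() == "":
            return  # whitespace-only data is suppressed (NOEMPD-like)
        emit("data", text)

    def on_start(name, attrs):
        flush()  # a tag boundary always ends the current data element
        emit("start", name)

    def on_end(name):
        flush()
        if addemp and last_kind == "start":
            emit("data", "")  # end tag directly after start tag (ADDEMP-like)
        emit("end", name)

    def on_data(chunk):
        pending.append(chunk)
        # Emit only once the minimum length is reached; the element may be longer.
        if sum(len(c) for c in pending) >= datlen:
            flush()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_data
    # Feed the document in small pieces to imitate a limited input buffer.
    for pos in range(0, len(xml_text), feed_size):
        parser.Parse(xml_text[pos:pos + feed_size], False)
    parser.Parse("", True)
    return elements


body = "abcdefghij" * 3
split = collect("<t>" + body + "</t>", datlen=8)
data = [value for kind, value in split if kind == "data"]
```

In this model every data element except the last one of a tag body has at least datlen characters, and concatenating the data elements restores the original text. Choosing a very large datlen yields one data element per tag body at the cost of buffering the whole body in memory.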

A special parameter controls namespace handling. The NAMSPC parameter only makes sense when using the APIs, because the result is no longer valid XML. With this parameter, tags can be built as a pair of URI and tag name, as a triple of URI, tag name and namespace prefix, or reduced to the tag name alone (REMOVE). Namespace handling thus simplifies the machine processing of tags.
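
How such combined tag names look can be shown with the Expat binding of the Python standard library, which offers a comparable namespace mode. The sketch below is an illustration under assumptions: the function tag_names and the sample document are invented for this example, the '|' separator corresponds to the PIPE2 and PIPE3 layouts only by analogy, and the output of the XML formatter itself is not shown here.

```python
import xml.parsers.expat


def tag_names(xml_text, separator="|", triple=False):
    """Collect start tag names with namespace processing enabled (illustrative)."""
    names = []
    parser = xml.parsers.expat.ParserCreate(namespace_separator=separator)
    if triple:
        # Ask Expat to append the namespace prefix as a third part.
        parser.namespace_prefixes = True
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


doc = '<p:order xmlns:p="urn:example:shop"><p:item/></p:order>'
pairs = tag_names(doc)                 # URI '|' TAG
triples = tag_names(doc, triple=True)  # URI '|' TAG '|' NSP
# Keeping only the local tag name is comparable to REMOVE.
removed = [name.rsplit("|", 1)[-1] for name in pairs]
```

The sample URI contains colons, which shows why a ':' separator does not give a unique split between URI and tag name, whereas a character such as '|' that rarely occurs in URIs does.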

XML formatting supports a range of features, which are controlled by the parameters below.

Arguments

NUMBER: CHRSET=NONE/SYSTEM/ASCII/UCS1/UTF8/EBCDIC/UCS2BE/UTF16BE/UCS2LE/UTF16LE/UCS4BE/UTF32BE/UCS4LE/UTF32LE/LOCAL - Character set [auto]
- NONE - No character set defined
- SYSTEM - SYSTEM (environment specific (on mainframe EBCDIC else ASCII))
- ASCII - ASCII (mainly used on open systems)
- UCS1 - UCS-1 (for text formatting identical to ASCII < 64k)
- UTF8 - UTF-8 (for text formatting identical to ASCII < 2M)
- EBCDIC - EBCDIC (mainly used on IBM mainframe)
- UCS2BE - UCS-2 Big Endian (multibyte character set < 64k)
- UTF16BE - UTF-16 Big Endian (multibyte character set < 2M)
- UCS2LE - UCS-2 Little Endian (multibyte character set < 64k)
- UTF16LE - UTF-16 Little Endian (multibyte character set < 2M)
- UCS4BE - UCS-4 Big Endian (multibyte character set < 64k)
- UTF32BE - UTF-32 Big Endian (multibyte character set < 2M)
- UCS4LE - UCS-4 Little Endian (multibyte character set < 64k)
- UTF32LE - UTF-32 Little Endian (multibyte character set < 2M)
- LOCAL - LOCAL (platform specific (on mainframe EBCDIC else ASCII))
SWITCH: FMTERR - Enforce a format error if data contains only space or is empty [FALSE]
SWITCH: NOCMNT - Ignore XML comments [FALSE]
SWITCH: NODFLT - Ignore XML default elements (i.e. whitespace before/after root tag) [FALSE]
SWITCH: NOEMPD - Ignore empty/whitespace XML data elements (not in CDATA section) [FALSE]
SWITCH: ADDEMP - Add an empty XML data element if an end tag directly follows a start tag [FALSE]
SWITCH: NOPINS - Ignore XML processing instructions [FALSE]
SWITCH: NODTD - Ignore XML document type definitions [FALSE]
SWITCH: CCDATA - Collect CDATA to simple data elements [FALSE]
NUMBER: NAMSPC=num/LF2/LF3/TAB2/TAB3/PIPE2/PIPE3/COLON2/COLON3/SLASH2/SLASH3/REMOVE - Activate namespace handling [NONE]
- LF2 - URI lf TAG
- LF3 - URI lf TAG lf NSP
- TAB2 - URI tab TAG
- TAB3 - URI tab TAG tab NSP
- PIPE2 - URI '|' TAG
- PIPE3 - URI '|' TAG '|' NSP
- COLON2 - URI ':' TAG (not unique and not recommended)
- COLON3 - URI ':' TAG ':' NSP (not unique and not recommended)
- SLASH2 - URI '/' TAG (not unique and not recommended)
- SLASH3 - URI '/' TAG '/' NSP (not unique and not recommended)
- REMOVE - Remove namespace prefix
SWITCH: USELCH - Use literal cache for tags (reduce memory but increase CPU utilization) [FALSE]
NUMBER: DATLEN=num - Minimum length of returned XML data elements (for APIs only) [1024]
SWITCH: LENERR - Return an error if the minimum data element length is exceeded [FALSE]
NUMBER: BUFSIZ=num - Initial buffer size for preallocation [65536]
NUMBER: INICNT=num - Initial amount of records for preallocation [128]