HELP: Convert a block (binary or text) of data into a list of records
TYPE: OBJECT
SYNTAX: REC(METHOD=BIN/TXT/REC/WRP/LEN/L4I/L4X/B4I/B4X/S4I/S4X/HLI/HLX/HBI/HBX/HSI/HSX/DLM,CHRSET=NONE/SYSTEM/ASCII/UCS1/UTF8/EBCDIC/UCS2BE/UTF16BE/UCS2LE/UTF16LE/UCS4BE/UTF32BE/UCS4LE/UTF32LE/LOCAL,RPLFFD[=num],RPLTAB/RPLHTB[=num],RPLVTB[=num],RPLBSP,RPLCTR=SPACE/SUBSTITUTE/DELETE,SUPTWS,NELDLM,PADCHR=num,RECLEN=num,BUFSIZ=num,INICNT=num,RECDLM='bin'/CR-ASCII/LF-ASCII/NL-ASCII/CRLF-ASCII/CR-EBCDIC/LF-EBCDIC/NL-EBCDIC/CRLF-EBCDIC/CR-UTF08/LF-UTF08/NL-UTF08/CRLF-UTF08/CR-UTF16BE/LF-UTF16BE/NL-UTF16BE/CRLF-UTF16BE/CR-UTF16LE/LF-UTF16LE/NL-UTF16LE/CRLF-UTF16LE/CR-UTF32BE/LF-UTF32BE/NL-UTF32BE/CRLF-UTF32BE/CR-UTF32LE/LF-UTF32LE/NL-UTF32LE/CRLF-UTF32LE,RSTDAT=RECORD/ERROR/IGNORE,SUPPAD,PRSATR=ASA/MCC/REL/RELASA/RELMCC,SKPTXT,SKPLEN)
The record converter converts data blocks that contain delimiters or, on input, one of several record length formats into lists of records. This conversion component uses the text and record formatting component and provides all of its features.
Auto detection of 4 byte record length fields is attempted if no method is provided. If the data block contains no record length fields or the provided binary delimiter does not fit and the switch to skip text formatting is not enabled, then all capabilities of the text formatting component are used to form records.
The SKPTXT flag causes input blocks to be converted to records only if record length fields or a binary delimiter are detected. This can be useful, for example, for reading ZIP archives containing record-oriented files. PKZIP stores record-oriented files with a 4 byte length field (length of the field itself not included) in front of the record data in little endian (L4X). Another example is files stored with FILEDATA=RECORD (B4X) in USS. When reading such a file, the length fields must be interpreted to rebuild the corresponding records. This conversion can be added after any I/O or other conversion step.
ZIP files, GZIP files and the ASCII armor header support custom attributes, which are used to store file attributes such as the length field format, the binary delimiter of the records, whether ASA or machine control characters are present, etc. If the file is written using CNV.BLK(PUTATR), the record attributes (RRDS slot number, print control characters) are written in front of each record, and a custom file attribute stores a flag marking which record attributes are present so that the records can later be parsed correctly and automatically. Other file formats do not support custom attributes. Hence, it is not possible to store file attributes in, e.g., BZIP2 or binary PGP files. In these cases, the PRSATR (parse attributes) parameter must be set to the correct keyword in order to read the file correctly if it has been written with the PUTATR switch enabled. The keyword to choose depends on the original file format.
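The L4X layout described above (4 byte little endian length field, length of the field itself not included, followed by the record data) can be illustrated with a small Python sketch. This is not the converter's implementation; the function name parse_l4x and the error handling are assumptions made for illustration only.

```python
import struct


def parse_l4x(block):
    """Split a block into records using 4 byte little endian,
    length-exclusive length fields (hypothetical helper)."""
    records = []
    pos = 0
    while pos < len(block):
        if pos + 4 > len(block):
            raise ValueError("truncated length field at offset %d" % pos)
        # "<I" = little endian unsigned 32 bit integer
        (length,) = struct.unpack_from("<I", block, pos)
        pos += 4
        if pos + length > len(block):
            raise ValueError("record at offset %d exceeds the block" % pos)
        records.append(block[pos:pos + length])
        pos += length
    return records


# Three records, the second one empty
block = (struct.pack("<I", 5) + b"HELLO"
         + struct.pack("<I", 0)
         + struct.pack("<I", 3) + b"abc")
records = parse_l4x(block)
```

For B4X (USS) the same loop would apply with a big endian format (">I") instead.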
NUMBER: METHOD=BIN/TXT/REC/WRP/LEN/L4I/L4X/B4I/B4X/S4I/S4X/HLI/HLX/HBI/HBX/HSI/HSX/DLM - Method for record/text formatting [DEFAULT]
BIN - Convert data blocks in text records of defined record length
TXT - Parse data block for text delimiter (DEFAULT)
REC - Use read/known record length (only useful on mainframes)
WRP - Binary wrap of data blocks in records of defined record length
LEN - Parse data based on 4 byte length fields with auto detection of the format
L4I - Parse data based on 4 byte length fields: Little endian integer, length inclusive
L4X - Parse data based on 4 byte length fields: Little endian integer, length exclusive (ZIP)
B4I - Parse data based on 4 byte length fields: Big endian integer, length inclusive
B4X - Parse data based on 4 byte length fields: Big endian integer, length exclusive (USS)
S4I - Parse data based on 4 byte length fields: System endian integer, length inclusive
S4X - Parse data based on 4 byte length fields: System endian integer, length exclusive (VAR)
HLI - Parse data based on 4 byte length fields: Little endian short (LLxx), length inclusive
HLX - Parse data based on 4 byte length fields: Little endian short (LLxx), length exclusive
HBI - Parse data based on 4 byte length fields: Big endian short (LLxx), length inclusive (MVS)
HBX - Parse data based on 4 byte length fields: Big endian short (LLxx), length exclusive
HSI - Parse data based on 4 byte length fields: System endian short (LLxx), length inclusive
HSX - Parse data based on 4 byte length fields: System endian short (LLxx), length exclusive
DLM - Parse data based on the provided binary record delimiter
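The fixed length field methods listed above differ only in byte order, in whether the length is a 4 byte integer or a 2 byte short followed by 2 further bytes (LLxx), and in whether the length includes the 4 byte field itself. The following Python sketch shows one way to express this as a table. The names METHODS and split_by_length are hypothetical, and the layout of the short variants (length in the first 2 bytes, remaining 2 bytes ignored) is an assumption derived from the LLxx notation.

```python
import sys

FIELD_SIZE = 4  # every method above uses a 4 byte length field

# method -> (byte order, bytes used for the length value, inclusive?)
METHODS = {
    "L4I": ("little", 4, True),      "L4X": ("little", 4, False),
    "B4I": ("big", 4, True),         "B4X": ("big", 4, False),
    "S4I": (sys.byteorder, 4, True), "S4X": (sys.byteorder, 4, False),
    "HLI": ("little", 2, True),      "HLX": ("little", 2, False),
    "HBI": ("big", 2, True),         "HBX": ("big", 2, False),
    "HSI": (sys.byteorder, 2, True), "HSX": (sys.byteorder, 2, False),
}


def split_by_length(block, method):
    """Split a block into records according to one length field method."""
    order, width, inclusive = METHODS[method]
    records = []
    pos = 0
    while pos < len(block):
        field = block[pos:pos + FIELD_SIZE]
        if len(field) < FIELD_SIZE:
            raise ValueError("truncated length field at offset %d" % pos)
        length = int.from_bytes(field[:width], order)
        if inclusive:
            if length < FIELD_SIZE:
                raise ValueError("inclusive length below field size")
            length -= FIELD_SIZE  # the length counts the field itself
        start = pos + FIELD_SIZE
        end = start + length
        if end > len(block):
            raise ValueError("record at offset %d exceeds the block" % pos)
        records.append(block[start:end])
        pos = end
    return records


# Big endian short, inclusive (HBI): 0x0009 covers 4 + 5 bytes
rdw_block = (bytes.fromhex("00090000") + b"HELLO"
             + bytes.fromhex("00070000") + b"abc")
rdw_records = split_by_length(rdw_block, "HBI")
```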
NUMBER: CHRSET=NONE/SYSTEM/ASCII/UCS1/UTF8/EBCDIC/UCS2BE/UTF16BE/UCS2LE/UTF16LE/UCS4BE/UTF32BE/UCS4LE/UTF32LE/LOCAL - Character set [SYSTEM]
NONE - No character set defined
SYSTEM - SYSTEM (environment specific (on mainframe EBCDIC else ASCII))
ASCII - ASCII (mainly used on open systems)
UCS1 - UCS-1 (for text formatting identical to ASCII < 64k)
UTF8 - UTF-8 (for text formatting identical to ASCII < 2M)
EBCDIC - EBCDIC (mainly used on IBM mainframe)
UCS2BE - UCS-2 Big Endian (multibyte character set < 64k)
UTF16BE - UTF-16 Big Endian (multibyte character set < 2M)
UCS2LE - UCS-2 Little Endian (multibyte character set < 64k)
UTF16LE - UTF-16 Little Endian (multibyte character set < 2M)
UCS4BE - UCS-4 Big Endian (multibyte character set < 64k)
UTF32BE - UTF-32 Big Endian (multibyte character set < 2M)
UCS4LE - UCS-4 Little Endian (multibyte character set < 64k)
UTF32LE - UTF-32 Little Endian (multibyte character set < 2M)
LOCAL - LOCAL (platform specific (on mainframe EBCDIC else ASCII))
NUMBER: RPLFFD=num - Replace form feeds, filling rest of page with blank lines assuming num lines per page [60]
NUMBER: RPLTAB/RPLHTB=num - Replace horizontal tabulators by spaces using this tab width [0 = no replacement]
NUMBER: RPLVTB=num - Replace vertical tabulators by new lines using this tab width [0 = no replacement]
SWITCH: RPLBSP - Replace backspace (deletes the backspace and the byte before) [FALSE]
NUMBER: RPLCTR=SPACE/SUBSTITUTE/DELETE - Replace remaining control characters [NONE]
SPACE - Replace control characters with whitespace character (0x20/0x40)
SUBSTITUTE - Replace control characters with substitution character (0x1A/0x3F)
DELETE - Remove control characters
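The three RPLCTR keywords can be pictured with the following Python sketch for single byte ASCII data. It is an illustration only: the function replace_controls is hypothetical, and treating bytes below 0x20 and 0x7F as control characters is an assumption, since the exact set is not defined here. For EBCDIC, the space and substitution characters given above would be 0x40 and 0x3F instead.

```python
ASCII_SPACE = 0x20       # whitespace character for ASCII
ASCII_SUBSTITUTE = 0x1A  # substitution character for ASCII


def is_control(byte):
    # Assumption: C0 controls and DEL count as control characters
    return byte < 0x20 or byte == 0x7F


def replace_controls(record, mode):
    """Apply SPACE, SUBSTITUTE or DELETE to the control characters."""
    if mode == "DELETE":
        return bytes(b for b in record if not is_control(b))
    if mode == "SPACE":
        fill = ASCII_SPACE
    elif mode == "SUBSTITUTE":
        fill = ASCII_SUBSTITUTE
    else:
        raise ValueError("unknown mode: %r" % mode)
    return bytes(fill if is_control(b) else b for b in record)


sample = b"a\x01b\x07c"
spaced = replace_controls(sample, "SPACE")
deleted = replace_controls(sample, "DELETE")
```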
SWITCH: SUPTWS - Suppress trailing whitespaces at text parsing [FALSE]
SWITCH: NELDLM - Activate NEL (0x85) as delimiter for ASCII character sets [FALSE]
NUMBER: PADCHR=num - Padding character [0x00]
NUMBER: RECLEN=num - Length used to cut the data block in records [512]
NUMBER: BUFSIZ=num - Initial buffer size for preallocation [65536]
NUMBER: INICNT=num - Initial amount of records for preallocation [4<128<1024]
STRING: RECDLM='bin'/CR-ASCII/LF-ASCII/NL-ASCII/CRLF-ASCII/CR-EBCDIC/LF-EBCDIC/NL-EBCDIC/CRLF-EBCDIC/CR-UTF08/LF-UTF08/NL-UTF08/CRLF-UTF08/CR-UTF16BE/LF-UTF16BE/NL-UTF16BE/CRLF-UTF16BE/CR-UTF16LE/LF-UTF16LE/NL-UTF16LE/CRLF-UTF16LE/CR-UTF32BE/LF-UTF32BE/NL-UTF32BE/CRLF-UTF32BE/CR-UTF32LE/LF-UTF32LE/NL-UTF32LE/CRLF-UTF32LE - Delimiter used to parse end of record
CR-ASCII - Delimiter: carriage return in ASCII (0x0D)
LF-ASCII - Delimiter: line feed in ASCII (0x0A)
NL-ASCII - Delimiter: new line in ASCII (0x85)
CRLF-ASCII - Delimiter: carriage return line feed in ASCII (0x0D0A)
CR-EBCDIC - Delimiter: carriage return in EBCDIC (0x0D)
LF-EBCDIC - Delimiter: line feed in EBCDIC (0x25)
NL-EBCDIC - Delimiter: new line in EBCDIC (0x15)
CRLF-EBCDIC - Delimiter: carriage return line feed in EBCDIC (0x0D25)
CR-UTF08 - Delimiter: carriage return in UTF-8 (0x0D)
LF-UTF08 - Delimiter: line feed in UTF-8 (0x0A)
NL-UTF08 - Delimiter: new line in UTF-8 (0xC285)
CRLF-UTF08 - Delimiter: carriage return line feed in UTF-8 (0x0D0A)
CR-UTF16BE - Delimiter: carriage return in UTF-16BE (0x000D)
LF-UTF16BE - Delimiter: line feed in UTF-16BE (0x000A)
NL-UTF16BE - Delimiter: new line in UTF-16BE (0x0085)
CRLF-UTF16BE - Delimiter: carriage return line feed in UTF-16BE (0x000D000A)
CR-UTF16LE - Delimiter: carriage return in UTF-16LE (0x0D00)
LF-UTF16LE - Delimiter: line feed in UTF-16LE (0x0A00)
NL-UTF16LE - Delimiter: new line in UTF-16LE (0x8500)
CRLF-UTF16LE - Delimiter: carriage return line feed in UTF-16LE (0x0D000A00)
CR-UTF32BE - Delimiter: carriage return in UTF-32BE (0x0000000D)
LF-UTF32BE - Delimiter: line feed in UTF-32BE (0x0000000A)
NL-UTF32BE - Delimiter: new line in UTF-32BE (0x00000085)
CRLF-UTF32BE - Delimiter: carriage return line feed in UTF-32BE (0x0000000D0000000A)
CR-UTF32LE - Delimiter: carriage return in UTF-32LE (0x0D000000)
LF-UTF32LE - Delimiter: line feed in UTF-32LE (0x0A000000)
NL-UTF32LE - Delimiter: new line in UTF-32LE (0x85000000)
CRLF-UTF32LE - Delimiter: carriage return line feed in UTF-32LE (0x0D0000000A000000)
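The Unicode entries in the delimiter table above follow directly from encoding the code points U+000D (CR), U+000A (LF) and U+0085 (NL) in the respective encoding form. The Python sketch below reproduces the table that way. The function delimiter_bytes is a hypothetical helper, and the ASCII and EBCDIC values are simply copied from the table rather than derived from a codec.

```python
# Single byte values copied from the table above
FIXED = {
    "CR-ASCII": b"\x0d", "LF-ASCII": b"\x0a",
    "NL-ASCII": b"\x85", "CRLF-ASCII": b"\x0d\x0a",
    "CR-EBCDIC": b"\x0d", "LF-EBCDIC": b"\x25",
    "NL-EBCDIC": b"\x15", "CRLF-EBCDIC": b"\x0d\x25",
}

# Keyword suffix -> Python codec name
UNICODE_CODECS = {
    "UTF08": "utf-8",
    "UTF16BE": "utf-16-be", "UTF16LE": "utf-16-le",
    "UTF32BE": "utf-32-be", "UTF32LE": "utf-32-le",
}

# Keyword prefix -> code points of the delimiter
CONTROLS = {"CR": "\r", "LF": "\n", "NL": "\x85", "CRLF": "\r\n"}


def delimiter_bytes(keyword):
    """Return the byte sequence for a RECDLM keyword such as CRLF-UTF16LE."""
    if keyword in FIXED:
        return FIXED[keyword]
    name, charset = keyword.split("-", 1)
    return CONTROLS[name].encode(UNICODE_CODECS[charset])


crlf_utf16le = delimiter_bytes("CRLF-UTF16LE")
```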
NUMBER: RSTDAT=RECORD/ERROR/IGNORE - Defines handling of data remaining after the last complete record [auto]
RECORD - Build a record from the remaining data (default for delimiter)
ERROR - Return an error if remaining data is found (default for length fields)
IGNORE - Ignore remaining data (no record, no error)
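The effect of the three RSTDAT keywords can be sketched for delimiter based parsing as follows. This Python example is a simplified illustration, not the converter's logic: split_by_delimiter is a hypothetical function, and the plain byte search ignores character alignment, which matters for multibyte delimiters such as the UTF-16 or UTF-32 ones.

```python
def split_by_delimiter(block, delim, rstdat="RECORD"):
    """Split a block at a binary delimiter and handle the trailing rest."""
    if not delim:
        raise ValueError("delimiter must not be empty")
    if rstdat not in ("RECORD", "ERROR", "IGNORE"):
        raise ValueError("unknown RSTDAT keyword: %r" % rstdat)
    parts = block.split(delim)
    rest = parts.pop()  # bytes after the last delimiter (may be empty)
    if rest:
        if rstdat == "RECORD":
            parts.append(rest)  # the rest becomes a final record
        elif rstdat == "ERROR":
            raise ValueError("%d bytes left after last delimiter" % len(rest))
        # IGNORE: drop the rest silently
    return parts


data = b"one\ntwo\nthr"
as_record = split_by_delimiter(data, b"\n", "RECORD")
ignored = split_by_delimiter(data, b"\n", "IGNORE")
```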
SWITCH: SUPPAD - Suppress trailing padding/equal bytes at record parsing [FALSE]
NUMBER: PRSATR=ASA/MCC/REL/RELASA/RELMCC - Parse attributes (e.g. print control character) in front of the record [NONE]
ASA - ASA print control character (1 byte)
MCC - Machine print control character (1 byte)
REL - Slot number of RRDS (8 byte integer in big endian)
RELASA - Slot number plus ASA print control character (9 byte)
RELMCC - Slot number plus Machine print control character (9 byte)
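The attribute prefixes listed above can be separated from the record data as in the following Python sketch. It assumes that the slot number is an unsigned integer and that it precedes the print control character in the 9 byte variants; both are assumptions, and parse_attributes is a hypothetical helper, not part of the converter.

```python
import struct

# keyword -> (has 8 byte slot number, has 1 byte print control character)
LAYOUT = {
    "ASA": (False, True), "MCC": (False, True),
    "REL": (True, False),
    "RELASA": (True, True), "RELMCC": (True, True),
}


def parse_attributes(record, prsatr):
    """Return (slot number or None, control byte or None, record data)."""
    has_slot, has_ctl = LAYOUT[prsatr]
    needed = (8 if has_slot else 0) + (1 if has_ctl else 0)
    if len(record) < needed:
        raise ValueError("record shorter than its attribute prefix")
    pos = 0
    slot = None
    ctl = None
    if has_slot:
        # ">Q" = big endian unsigned 64 bit integer (unsigned is assumed)
        (slot,) = struct.unpack_from(">Q", record, pos)
        pos += 8
    if has_ctl:
        ctl = record[pos:pos + 1]
        pos += 1
    return slot, ctl, record[pos:]


raw = (7).to_bytes(8, "big") + b"1" + b"PAGE HEADER"
slot, ctl, data = parse_attributes(raw, "RELASA")
```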
SWITCH: SKPTXT - Skip and don't try text conversion if no record length fields found in the data [FALSE]
SWITCH: SKPLEN - Skip conversion if method LEN and no record length fields found in the data [FALSE]