CSV

Synopsis

HELP:   Read a data field based on separators and optional enclosing characters (CSV - RFC4180)
TYPE:   OBJECT
SYNTAX: CSV(COLNAM='str',CHRSET=NONE/SYSTEM/ASCII/UCS1/UTF8/EBCDIC/UCS2BE/UTF16BE/UCS2LE/UTF16LE/UCS4BE/UTF32BE/UCS4LE/UTF32LE/LOCAL,SEPCHR[num/COMMA/COLON/SEMICOLON/TABULATOR/BLANK...],FLDCHR[num/QUOTATION/APOSTROPHE/GRAVEACCENT/PARENTHESES/RBRACKETS/SBRACKETS/BRACES/CBRACKETS/ABRACKETS...],EXCCHR[num/TABULATOR...],PADCHR=num,ALIGN=num,HEADLN,CASEHL,NONFLD)

Description

This object is used to read a column in CSV format, which exists in multiple variants. All that is necessary to parse CSV input is a list of column separators and optional field enclosing characters. A limit on the maximum length and/or an alignment can optionally be set for better format detection.

All control characters (0x00-0x19) are regarded as row delimiters. If a padding byte is specified, it is also regarded as a control character. A field enclosing character cannot be a control or whitespace character. The defined column separators are excluded from the control characters. An additional list of code points can be specified, which are then also excluded from the control character list.
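The rules above can be sketched as a small set computation. This is only an illustration of the description, not the actual implementation: the function name `row_delimiters` and its parameters are hypothetical, and the order in which the padding byte and the exclusions are applied is an assumption.

```python
def row_delimiters(separators, pad_char=None, excluded=()):
    """Return the set of code points treated as row delimiters.

    Hypothetical helper: mirrors the textual description only.
    """
    # Control character range exactly as stated in the text (0x00-0x19).
    delimiters = set(range(0x00, 0x1A))
    # A padding byte, if given, is regarded as a control character too.
    if pad_char is not None:
        delimiters.add(pad_char)
    # Column separators are never row delimiters (assumed to win over padding).
    delimiters -= set(separators)
    # Additional code points excluded by the user (compare EXCCHR).
    delimiters -= set(excluded)
    return delimiters


# Tab-separated input: 0x09 separates columns, 0x0A still ends a row.
tab_separated = row_delimiters(separators=[0x09])
```

With a tabulator as separator, `0x09` no longer terminates a row, while line feed (`0x0A`) and carriage return (`0x0D`) still do.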

If a field is enclosed by field enclosing characters, any kind of control character and whitespace can be in the data. If the field enclosing character is also part of the actual data, it is expected to be escaped by doubling the character (e.g. two quotation marks for one quotation mark in the data). If a field consists of only two consecutive field enclosing characters, it is treated as an empty value. A special behavior is implemented for brackets. If one of the opening brackets "(<{[" is set as field enclosing character, then the corresponding closing bracket is expected for field termination. This makes it possible to read CSV lines such as: (John Doe),(123),(42,23)
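The enclosing rules can be illustrated with a minimal Python sketch. It is not the parser used by this object: `split_line`, `CLOSING`, and the parameter names are hypothetical, only a single separator and a single enclosing character are handled, and doubling is applied only when the opening and closing characters are identical (how a closing bracket inside bracketed data is escaped is not stated above, so the sketch does not attempt it).

```python
# Opening brackets and the closing bracket expected for field termination.
CLOSING = {"(": ")", "<": ">", "{": "}", "[": "]"}


def split_line(line, sep=",", enc='"'):
    """Split one row into fields, honoring one field enclosing character."""
    close = CLOSING.get(enc, enc)
    fields = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == enc:
            # Enclosed field: collect everything up to the closing character.
            i += 1
            buf = []
            while i < n:
                ch = line[i]
                if ch == close:
                    # A doubled enclosing character stands for one in the data.
                    if close == enc and i + 1 < n and line[i + 1] == enc:
                        buf.append(enc)
                        i += 2
                        continue
                    i += 1
                    break
                buf.append(ch)
                i += 1
            fields.append("".join(buf))
            # Ignore anything between the closing character and the separator.
            while i < n and line[i] != sep:
                i += 1
        else:
            # Plain field: runs up to the next separator or the end of the row.
            j = line.find(sep, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            break
        i += 1  # skip the separator
    return fields


bracketed = split_line("(John Doe),(123),(42,23)", enc="(")
quoted = split_line('a,"b ""c"" d",""')
```

Here `bracketed` holds the three fields `John Doe`, `123` and `42,23`, and `quoted` holds `a`, `b "c" d` and an empty value.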

If you need 0xC285 (UTF-8 NEL) as record delimiter or any other kind of binary delimiter, please use CNV.REC() for table formatting.

Sometimes, especially on mainframe systems, the last column is followed by a separator, and any remaining padding must be read as an additional column. The HEADLN switch was added for this case, so that a column name containing spaces does not have to be used. The HEADLN switch forces headline recognition to return true. It can also be used as a default for all columns to avoid comparing each column name with the corresponding string. In this case, the first row is always read as a headline, which means it is ignored. To detect a header automatically instead, COLNAM must match the string in the header row, and the CASEHL switch can be used to enable a case-sensitive comparison.
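The decision described above can be summarized in a few lines of Python. This is a hedged sketch of the logic only: `is_headline` and its parameters are hypothetical names modeled on the COLNAM, HEADLN and CASEHL keywords, and details such as trimming of padding or whitespace before the comparison are not covered.

```python
def is_headline(field, colnam, headln=False, casehl=False):
    """Decide whether a field of the first row is treated as a headline."""
    # HEADLN forces headline recognition without any comparison.
    if headln:
        return True
    # CASEHL enables a case-sensitive comparison with the column name.
    if casehl:
        return field == colnam
    # Assumed default: the comparison ignores case.
    return field.lower() == colnam.lower()
```

For example, `is_headline("NAME", "Name")` is true, `is_headline("NAME", "Name", casehl=True)` is false, and `is_headline("   ", "Name", headln=True)` is true regardless of the field content.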

Arguments