
GREP

Synopsis

HELP:   Find pattern in files
TYPE:   OBJECT
SYNTAX: > flcl GREP(NETINPUT.{},INPUT['str'/STREAM/DUMMY...],FROM='str'/DEFAULT/ASCII/EBCDIC/BOMUTF/BOMUCS/SYSTEM/LOCAL,PATTERN='str',NOCASE,EXTENDED,FORMAT='str'/LIST/CSV/XML/JSON/GREP,TO='str'/DEFAULT/ASCII/EBCDIC/SYSTEM/LOCAL,OUTPUT='str'/STREAM/DUMMY,NETOUTPUT.{},FALLOC(),DIR(),LOGGING.{},MESSAGE(),NORUN)

Description

The GREP command searches files for patterns specified as Perl-compatible regular expressions.

Writing regular expressions may seem like a daunting task if you have never done so before. The following tutorial will get you started quickly:

https://www.regular-expressions.info/quickstart.html

Countless other tutorials and documentation for regular expressions can be found on the web. We recommend consulting these resources to familiarize yourself with regular expressions.

Advanced users can find the complete syntax documentation for writing regular expression patterns by following this URL:

https://www.pcre.org/current/doc/html/pcre2pattern.html

The GREP command supports directory walks using file lists with wildcards. The DIR object can be used to control the directory walk. Below is a simple example. The default output is STREAM, but in this case the matches are written to the file result.txt.

   flcl GREP input='test/*' pattern='Hugo' output='result.txt' dir(recursive)
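
To make the effect of the wildcard and the recursive switch more tangible, the following Python sketch mimics a comparable search outside of FLCL. It is an illustration only and not how FLCL is implemented: it assumes that recursive means the wildcard is applied in every subdirectory, it uses Python's re module instead of PCRE2, and the helper name find_matches is hypothetical.

```python
import fnmatch
import os
import re

def find_matches(directory, wildcard, pattern, recursive=True):
    """Return (path, line_number, text) for every line that matches."""
    regex = re.compile(pattern)
    hits = []
    for root, dirs, files in os.walk(directory):
        dirs.sort()  # visit subdirectories in a stable order
        for name in sorted(fnmatch.filter(files, wildcard)):
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for line_no, line in enumerate(handle, start=1):
                    if regex.search(line):
                        hits.append((path, line_no, line.rstrip()))
        if not recursive:
            break  # only the top-level directory was requested
    return hits
```

Calling find_matches('test', '*', 'Hugo') would roughly correspond to the command above, except that the result is returned as a list instead of being written to result.txt.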

The GREP command is a simplified form of the conversion subprogram: it uses READ.AUTO() with default behavior, apart from the parameter that specifies the input. For each match, a row with 5 columns is created and written with WRITE.TEXT(). The table format is as follows:

   ROW(NAME='GREP'
    COLUMN(NAME='PATTERN' TYPE.STRING())
    COLUMN(NAME='FILE'    TYPE.STRING())
    COLUMN(NAME='LINE'    TYPE.STRING())
    COLUMN(NAME='COLUMN'  TYPE.STRING())
    COLUMN(NAME='DATA'    TYPE.STRING())
   )

The row name is GREP. The column PATTERN contains the pattern used. The column FILE contains the complete URL with the connection parameters, the filename and the optional member name. The column LINE contains the line number of the matching record. The column COLUMN contains the character position of the match within that line. The column DATA contains the matching record. For more information on how to write row and column definitions, see chapter Table Support.

Compressed files are decompressed, encrypted files are decrypted, and members of archives (ZIP, FLAM, ...) can be searched by using the corresponding switches in the DIR object. Matching is performed on UTF-8 data in neutral string format. Trailing whitespace is removed, and the delimiter is not part of the data element used for matching. The NOCASE and EXTENDED switches can be used to control the matching algorithm.
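
The matching rules and the five-column row can be sketched in Python as follows. This is a hedged illustration, not FLCL's implementation: Python's re module stands in for PCRE2 (re.IGNORECASE for NOCASE, re.VERBOSE for EXTENDED), the function name grep_rows is hypothetical, and the sketch assumes one row per matching record with 1-based LINE and COLUMN values, which the text above does not state explicitly.

```python
import re

def grep_rows(pattern, name, records, nocase=False, extended=False):
    """Build one PATTERN/FILE/LINE/COLUMN/DATA row per matching record."""
    flags = 0
    if nocase:
        flags |= re.IGNORECASE   # analogous to the NOCASE switch
    if extended:
        flags |= re.VERBOSE      # analogous to the EXTENDED switch
    regex = re.compile(pattern, flags)
    rows = []
    for line_no, record in enumerate(records, start=1):
        data = record.rstrip()   # trailing whitespace and delimiter removed
        match = regex.search(data)
        if match:
            rows.append({
                "PATTERN": pattern,
                "FILE": name,
                "LINE": str(line_no),                # assumed 1-based
                "COLUMN": str(match.start() + 1),    # assumed 1-based
                "DATA": data,
            })
    return rows
```

All values are kept as strings, mirroring the TYPE.STRING() columns of the row definition.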

The FORMAT parameter defines the output format. Several default formats are available via keywords, but you can also provide a row specification if a special output format is needed (for example, a different column order).

The definitions of the default formats are:

FORMAT=LIST

   FORMAT=CSV DEFAULT(TYPE=STRING NOHDLN SEPCHR=COLON FLDMTD=NONE NOCHECK) ROW(NAME='GREP'
    COLUMN(NAME='FILE')
    COLUMN(NAME='LINE')
    COLUMN(NAME='COLUMN')
    COLUMN(NAME='DATA')
   )

FORMAT=CSV

   FORMAT=CSV DEFAULT(TYPE=STRING) ROW(NAME='GREP'
    COLUMN(NAME='PATTERN')
    COLUMN(NAME='FILE')
    COLUMN(NAME='LINE')
    COLUMN(NAME='COLUMN')
    COLUMN(NAME='DATA')
   )

FORMAT=XML

   FORMAT=XML DEFAULT(TYPE=STRING COMMENT='') ROW(NAME='GREP'
    COLUMN(NAME='PATTERN' ROOT='GREP'            PATH='&PATTERN')
    COLUMN(NAME='FILE'    ROOT='GREP/FILE'       PATH='&NAME'   )
    COLUMN(NAME='LINE'    ROOT='GREP/FILE/MATCH' PATH='&LINE'   )
    COLUMN(NAME='COLUMN'  ROOT='GREP/FILE/MATCH' PATH='&COLUMN' )
    COLUMN(NAME='DATA'    ROOT='GREP/FILE/MATCH' PATH='&DATA'   )
   )

FORMAT=JSON (currently realized using the TVD support)

   FORMAT=TVD DEFAULT(TYPE=STRING INDSIZE=2) ROW(NAME='GREP'
    COL(name='PATTERN' root='\W{\N*}\C/!!"grep":\B{\N*}\C' path='"pattern":\B"*"\C')
    COL(name='FILE'    root='\W{\N*}\C/!!"grep":\B{\N*}\C' path='"file":\B"*"\C'   )
    COL(name='LINE'    root='\W{\N*}\C/!!"grep":\B{\N*}\C/!!"match":\B[\N*]\C' path='{\W*}\C/"line":\B"*"\c')
    COL(name='COLUMN'  root='\W{\N*}\C/!!"grep":\B{\N*}\C/!!"match":\B[\N*]\C' path='{\W*}\C/"column":\B"*"\c')
    COL(name='DATA'    root='\W{\N*}\C/!!"grep":\B{\N*}\C/!!"match":\B[\N*]\C' path='{\W*}\C/"data":\B"*"\c')
   )

FORMAT=GREP (like LIST but without the character offset in the line)

   FORMAT=CSV DEFAULT(TYPE=STRING NOHDLN SEPCHR=COLON FLDMTD=NONE NOCHECK) ROW(NAME='GREP'
    COLUMN(NAME='FILE')
    COLUMN(NAME='LINE')
    COLUMN(NAME='DATA')
   )
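
Because the LIST and GREP formats are plain colon-separated records without a headline or field delimiter, their output is easy to post-process. The Python sketch below splits one such record into named fields. The function name parse_list_line and the sample records in the usage note are hypothetical, and the sketch assumes that the FILE value itself contains no colon, which does not hold for URLs or Windows drive letters.

```python
def parse_list_line(line, with_column=True):
    """Split one LIST (FILE:LINE:COLUMN:DATA) or GREP (FILE:LINE:DATA) record."""
    if with_column:
        names = ["FILE", "LINE", "COLUMN", "DATA"]
    else:
        names = ["FILE", "LINE", "DATA"]
    # Split only on the leading separators so colons inside DATA survive.
    parts = line.rstrip("\n").split(":", len(names) - 1)
    if len(parts) != len(names):
        raise ValueError(f"unexpected record: {line!r}")
    return dict(zip(names, parts))
```

For example, parse_list_line('test/a.txt:12:5:time 10:30') keeps 'time 10:30' intact as DATA, and passing with_column=False handles records written with FORMAT=GREP.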

These default formats can serve as a basis for customization. The example below uses the CSV format without the pattern column and without a headline, and with the line number as the first column. It is shown here as JCL for z/OS:

    //FLCLGREP EXEC PGM=FLCL,REGION=0M,PARM='GREP=DD:PARM'
    //STEPLIB  DD DSN=&SYSUID..FLAM.LOAD,DISP=SHR
    //SYSOUT   DD SYSOUT=*
    //SYSPRINT DD SYSOUT=*
    //OUTPUT   DD SYSOUT=*
    //FORMAT   DD *
    "FORMAT=CSV DEFAULT(TYPE=STRING NOHDLN) ROW(NAME='GREP'
       COLUMN(NAME='LINE')
       COLUMN(NAME='FILE')
       COLUMN(NAME='DATA')
    )"
    /*
    //PARM     DD *
       INPUT='<SYSUID>.TEST.**'
       PATTERN='HUGO'
       NOCASE
       FORMAT=f'DD:FORMAT'
       TO=LOCAL
       OUTPUT='DD:OUTPUT'
       DIR(
          LINK
          ALIAS
          RECURSIVE
          ARCHIVE
       )
       MESSAGE(
         ERRONLY
         SOURCE
         MATCH
         SUMMARY
       )
    /*

The parameter string for the format must be enclosed in double quotation marks or, preferably, backticks (also known as backquotes or grave accents) so that it is interpreted as a single string.
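
Output written with the customized format from the JCL example above (columns LINE, FILE, DATA, no headline) could be consumed as sketched below. This is an assumption-laden illustration: it presumes the CSV defaults are a comma as separator and double quotes as field delimiter, the function name read_custom_csv is hypothetical, and the sample records in the usage note are invented.

```python
import csv
import io

def read_custom_csv(text):
    """Parse headline-less CSV records with the columns LINE, FILE, DATA."""
    reader = csv.DictReader(
        io.StringIO(text),
        fieldnames=["LINE", "FILE", "DATA"],  # order taken from the row definition
    )
    return list(reader)
```

Given the text '"3","USER.TEST.A","hello HUGO"', the function returns a single dictionary whose LINE, FILE and DATA entries hold the three field values as strings.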

To get syntax information, please use:

   flcl SYNTAX GREP

To get help for a parameter, please use:

   flcl HELP GREP.parameter[.parameter[...]]

To read the manual page for a parameter, please use:

   flcl MANPAGE GREP.parameter[.parameter[...]]

To generate the user manual for the command, please use:

   flcl GENDOCU GREP=filename

Parameters can be set as arguments on the command line (directly or via a file) or as properties taken from the corresponding property file.

Arguments

STRING: INPUT['str'/STREAM/DUMMY...] - Name/URL of files to read [''==origin.ext]
- STREAM - Read from stdin or write to stdout
- DUMMY - Read EOF or write nothing
STRING: FROM='str'/DEFAULT/ASCII/EBCDIC/BOMUTF/BOMUCS/SYSTEM/LOCAL - Conversion from this CCSID
- DEFAULT - Use default CCSID (auto-detect)
- ASCII - Use default ASCII CCSID (environment)
- EBCDIC - Use default EBCDIC CCSID (environment)
- BOMUTF - Determine the correct UTF CCSID from byte order mark (BOM) only when reading
- BOMUCS - Determine the correct UCS CCSID from byte order mark (BOM) only when reading
- SYSTEM - Use system character set (environment/logical)
- LOCAL - Use local character set (auto-detect + system/physical)
STRING: PATTERN='str' - Pattern of regular expression for matching
SWITCH: NOCASE - Case insensitive matching
SWITCH: EXTENDED - Makes parser ignore whitespace and '#' comments within the pattern (can be used for readability)
STRING: FORMAT='str'/LIST/CSV/XML/JSON/GREP - Format definition for output (table object with 5 strings (PATTERN, FILE, LINE, COLUMN, DATA)) [LIST]
- LIST - Colon separated list (FILE:LINE:COLUMN:DATA)
- CSV - CSV format ("PATTERN","FILE","LINE","COLUMN","DATA")
- XML - XML format (<GREP PATTERN="pattern">...</GREP>)
- JSON - JSON format ({"GREP": { "PATTERN": "pattern" ...}})
- GREP - Colon separated list (FILE:LINE:DATA) without offset (like grep)
STRING: TO='str'/DEFAULT/ASCII/EBCDIC/SYSTEM/LOCAL - Conversion to this CCSID
- DEFAULT - Use default CCSID (UTF-8)
- ASCII - Use default ASCII CCSID (environment)
- EBCDIC - Use default EBCDIC CCSID (environment)
- SYSTEM - Use system character set (environment/logical)
- LOCAL - Use local character set (system/physical)
STRING: OUTPUT='str'/STREAM/DUMMY - Name/URL of file to write [''==origin.ext]
- STREAM - Read from stdin or write to stdout
- DUMMY - Read EOF or write nothing
SWITCH: NORUN - Don't run the command, only show the parsed parameters