Filename Handling

All filenames are mapped by CLE/P, which means that all replacement mechanisms it supports are also valid for each filename used within a command.

See filename mapping of Command Line Processor.

All filenames specified in command strings are parsed and handled in a unified way across the different platforms. Standard replacements by shells are usually not usable within the command syntax.

There are some special attributes that can be appended to a filename when using paths or the URL-like syntax by using one of the following separators followed by a value string:

/? - member name (to access a member in an archive by name)
/: or /# - member index (to access a member in an archive by position)
/& - CCSID that the file's contents are encoded in

If the URL-like syntax is used, special characters can be URL-encoded as a percent character followed by two hex digits (%xx). If no schema is provided, but a raw path, URL-encoded characters are not decoded because the string is not a URL. The schema file:// can be used to access a local file on the system in URL-like syntax.
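The decoding rule above can be illustrated with a minimal Python sketch. It is not FLAM code. The function name decode_filename is hypothetical, and decoding the %xx sequences as UTF-8 is an assumption, since the character set used for decoding is not stated here.

```python
from urllib.parse import unquote

def decode_filename(name: str) -> str:
    """Decode %xx sequences only if the name starts with a schema (xxx://)."""
    scheme, sep, rest = name.partition("://")
    if sep and scheme.isalnum():
        # URL-like syntax: percent-encoded characters are decoded
        # (UTF-8 decoding is an assumption of this sketch)
        return scheme + sep + unquote(rest)
    # Raw path: the string is not a URL, so it is left untouched
    return name

print(decode_filename("file:///tmp/my%20file.txt"))  # file:///tmp/my file.txt
print(decode_filename("/tmp/my%20file.txt"))         # /tmp/my%20file.txt
```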

Providing a CCSID is mainly intended for parameter files that are read into memory from a remote location and may require conversion to the local character set. The filename is the only way to specify the CCSID for this type of file.

Filenames in archives (ZIP, TAR, FLAM) or certain file headers (GZIP, PGP) are sanitized when reading or writing to prevent directory traversal attacks. This type of attack aims at writing files to arbitrary locations in the filesystem by storing specially crafted paths in the archive or file header. For example, extracting a ZIP file containing paths such as /etc/passwd or ../../../../../../etc/passwd as user root on a Unix-based system could lock out all users. Path sanitization removes all path components that would lead to an extraction outside the desired output directory. This includes turning absolute paths into relative ones and removing all .. directory components. Both example paths above would be transformed to etc/passwd, which is safe for extraction.
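The sanitization described above can be sketched in Python as follows. This only illustrates the two stated rules (absolute paths become relative, all .. components are removed) and is not the actual FLAM implementation. The function name sanitize_member_path is hypothetical.

```python
def sanitize_member_path(path: str) -> str:
    """Return a relative path without '..' components."""
    kept = []
    for part in path.split("/"):
        # Empty parts come from leading or doubled slashes; dropping them
        # turns an absolute path into a relative one.
        # '..' parts are removed, not resolved.
        if part in ("", ".."):
            continue
        kept.append(part)
    return "/".join(kept)

print(sanitize_member_path("/etc/passwd"))                   # etc/passwd
print(sanitize_member_path("../../../../../../etc/passwd"))  # etc/passwd
```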

The filename handling works as follows:

On mainframe systems (POSIX(OFF)):

Usually, a filename has to be specified fully qualified, because a dynamic allocation is done in most cases. The following substitutions can be used, which are automatically replaced by their respective value during execution:

 <envar>  - to expand an environment variable

If the environment variables below are not defined, the following defaults are used:

 <SYSUID>  - The current user id (Example: file='<SYSUID>.TEST.DATA')
 <USER>    - The current user id (Example: file='DD:<USER>')
 <HOME>    - The current home directory (Example: file='<HOME>/dat.bin')
 <OWNERID> - The current user id, but normally the default owner 'limes' is defined
 <ENVID>   - The first letter of SYSCLONE or, if SYSCLONE is not defined, 'T' for test (normally not required for file names)
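The lookup order for such substitutions (environment variable first, built-in default second) can be sketched in Python. This is an illustration only. The function name expand, the defaults dictionary and the rule that unknown names are left unchanged are assumptions of this sketch, not documented behavior.

```python
import os
import re

# <NAME> placeholder as used in the examples above
_PLACEHOLDER = re.compile(r"<([A-Za-z_][A-Za-z0-9_]*)>")

def expand(name, defaults, env=None):
    """Replace <NAME> by the environment value or, if undefined, a default."""
    env = os.environ if env is None else env

    def replace(match):
        key = match.group(1)
        if key in env:            # a defined environment variable wins
            return env[key]
        if key in defaults:       # otherwise the built-in default is used
            return defaults[key]
        return match.group(0)     # assumption: unknown names stay as they are

    return _PLACEHOLDER.sub(replace, name)

# Hypothetical default values for a user 'HUGO'
defaults = {"SYSUID": "HUGO", "USER": "HUGO", "HOME": "/u/hugo"}
print(expand("<SYSUID>.TEST.DATA", defaults, env={}))  # HUGO.TEST.DATA
print(expand("<HOME>/dat.bin", defaults, env={}))      # /u/hugo/dat.bin
```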

The syntax to define a DD name is:

 DD:name  - a DD-Statement for name (Example: file='DD:INDAT')

A replacement may also be used in a DD name definition:

 Example: file='DD:<USER>'

Additionally, you can specify member names in a DD name definition if a PDS(E) is allocated. The member name can be fully qualified or contain wildcards. Member names for archives can be specified as part of the URL-like notation or separately.

 Example: file='DD:INPUT(*B*)'
          file='DD:INPUT(EGBERT)'
          file='DD:INPUT(EGBERT)/?*HOGO*'
          name='DD:INPUT(*GB*)' member='*HOGO*'

When using static allocation (DD name) for writing, the DCB parameter must be provided if the file does not exist. Otherwise, the defaults of the runtime environment (often PS RECFM=UNDEFINED LRECL=0 BLKSIZE=6144) are used. A static allocation is only useful to write the logical byte stream in a certain physical file format. In all other cases, we recommend using dynamic allocation when writing new files.

Since version 5.1.22-28565, the DD name no longer needs to be prefixed with 'DD:'. If only one qualifier is given, it is a valid DD name, and an allocation for it is found, then the qualifier is used as the DD name.

If you use FILE=STREAM, the file descriptor STDIN (DD:SYSIN) is used when reading and STDOUT (DD:SYSPRINT) is used when writing. All other messages are printed to STDERR (DD:SYSOUT) by default.

For reading, the default DD name 'FLAMIN' is preferred if no file specification is given, and for writing, the default DD name 'FLAMOUT'. If no such allocation is found when writing, the default dataset name (original name plus extension) is used. For reading or writing a FLAMFILE, the DD name 'FLAMFILE' is additionally supported for backward compatibility with the old FLAM utility.

A Unix path name must contain a forward slash (/). To use a file in the current directory, ./filename must be used. Paths relative to the home directory may start with a tilde character (~) as an abbreviation for <HOME> for UNIX path names or <SYSUID> for host dataset names.

 Example: file='~/mydata.txt'
          file='~.po.dataset(test)'
          file='DD:~'

Starting with version 5.1.5, the plus sign (+) can no longer be used as an abbreviation for the user's home directory. The tilde (~) remains unaffected and is now the only valid abbreviation. The plus sign (+) is now a wildcard character. It was previously supported as an alternative on EBCDIC systems because the tilde is located at different code points depending on the EBCDIC code page. Since version 5.1.5, we support the correct interpretation of diacritical characters on EBCDIC systems based on the CCSID defined in the environment variable LANG.

See Used environment variables and SPECIAL EBCDIC CODE PAGE SUPPORT of Command Line Processor.

To specify the casing of a user ID (mainly useful for USS on z/OS), the following definitions can be used (if not defined as environment variables):

 CUSER - The current user ID in upper case (same as <SYSUID>)
 Cuser - The current user ID in title case
 cuser - The current user ID in lower case
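As a small illustration of the three casing variants, the following Python sketch derives them from one user ID. The function name user_id_variants is hypothetical, and treating title case as "first letter upper case, rest lower case" is an assumption of this sketch.

```python
def user_id_variants(user_id: str) -> dict:
    """Return the three casing variants of a user ID."""
    return {
        "CUSER": user_id.upper(),       # upper case, like <SYSUID>
        "Cuser": user_id.capitalize(),  # title case (assumed: first letter only)
        "cuser": user_id.lower(),       # lower case
    }

print(user_id_variants("hugo"))
# {'CUSER': 'HUGO', 'Cuser': 'Hugo', 'cuser': 'hugo'}
```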

On Unix, Linux, Windows and other non-mainframe systems (including USS)

A filename may be an absolute or relative path and may contain any of the substitutions (including ~) mentioned in the previous section.

To use a less-than character (<) as part of a filename, the character must be escaped by doubling it (<<).

We recommend avoiding special or whitespace characters in filenames.

The ampersand character (&) can be used instead of the at sign (@) because the latter has different code points in EBCDIC. On EBCDIC systems, the command line parser converts the @ according to the environment variable LANG. If this value is not set or is wrong, you can use & or try §, the alternative character on EBCDIC.

On USS and Micro Focus EDZ, DD names are supported. A UNIX path name is distinguished from a data set name in the same way as it is done in the z/OS runtime environment.

Example on USS or EDZ:

 file="//DS:user.test(data)"
 file="//'user.test(data)'"
 file="//test(data)"
 file="//DD:name"
 file="DD:name"

Within Micro Focus Enterprise Server (Linux or Windows)

If the variable FLAM4MF is set to yes or a valid CCSID, a filename is assumed to be a DD name unless it contains a dot character, in which case it is used as the dataset name.

Otherwise, a DD name must be specified as DD:name and a dataset is given as //'name'.

 Examples:  file="//'user.test(data)'"
            file=DD:name
            file=readme.txt

This corresponds to the syntax for host dataset names on USS of z/OS. With the Micro Focus support for Windows and Linux, you can work on x86 Windows and Linux systems with the local MF-EDZ-Enterprise-Server in the same way as you work with the catalog access on USS for MVS files.

If you work with EBCDIC on the record interface in an EDZ environment, you must set the FLAM4MF environment variable to the CCSID used (e.g. IBM1141). If a CCSID is defined and starts with IBM, this also overrides the default CCSID in the character conversion component and marks the platform as z/OS.

In an error situation, it is possible to enable writing a trace to a file, which can be used to analyze problems. The environment variable FLAM4MF_TRACEFILE must contain a valid filename to write the trace output.

To set the missing system variables for MF-EDZ, several solutions are implemented including DD:SYSVAR and the environment variables FLAM4MF_STATIC/DYNAMIC_SYSVAR documented above.

Local and remote file access (via SSH) with URLs (all platforms)

Instead of local filenames, complete URLs defining all communication parameters and (if required) the member name or index can be used. Optionally a CCSID for this file can be added. The syntax is:

 schema://userid:authdata@hostname:port/filename/&ccsid
 schema://userid:authdata@hostname:port/filename/?membername/&ccsid
 schema://userid:authdata@IPv4:port/filename/:index/&ccsid
 schema://userid:authdata@[IPv6]:port/filename/#index/&ccsid

If the URL-like syntax is used, special characters can be URL-encoded as a percent character followed by two hex digits (%xx). If no schema is provided, but a raw filename, URL-encoded characters are not decoded as the string is not a URL. The schema file:// can be used to access a local file on the system in URL-like syntax.

Currently, remote file access via SSH is supported for all file types including FLAM4 archives. The ssh:// prefix indicates remote access via SSH. The userid is optional, the default is the current login user ID. You can also use replacements in the connection parameters. For example, <cuser> is replaced with the lower-case login user ID, which is mainly useful on mainframes, where the SYSUID is in upper case. The password is optional. If no password is specified, public key authentication is attempted. The hostname/IP is mandatory and the port is optional, defaulting to the SSH standard port 22. The hostname and the remote file path are separated by a slash. An IPv6 address must be put in square brackets (e.g. [::1]). An absolute path requires another slash (i.e. // after the hostname). Otherwise, the path is relative to the current directory after login to the SSH server. See examples below:

 file='ssh://user:password@hostname:port//path/file.xxx'
 name='ssh://:password@hostname/path/file.xxx'
 name='ssh://[::1]//path/file.*/?*/member.*'
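The rules for the connection parameters can be sketched with Python's standard urllib.parse module. This is an illustration, not FLAM's parser. The function name parse_ssh_url and the default_user argument are hypothetical. The sketch assumes that member, index and CCSID suffixes (/?, /:, /#, /&) have already been removed, because urlsplit would interpret ? and # as query and fragment separators.

```python
from urllib.parse import unquote, urlsplit

def parse_ssh_url(url: str, default_user: str) -> dict:
    """Split an ssh:// URL into connection parameters and remote path."""
    parts = urlsplit(url)
    if parts.scheme != "ssh" or not parts.hostname:
        raise ValueError("expected ssh:// URL with a hostname")
    path = parts.path
    return {
        # userid is optional, default is the current login user ID
        "user": unquote(parts.username) if parts.username else default_user,
        # no password means that public key authentication is attempted
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,       # IPv6 brackets are stripped by urlsplit
        "port": parts.port or 22,     # SSH standard port as default
        # the first slash only separates host and path
        "path": unquote(path[1:]),
        # a second slash marks an absolute path
        "absolute": path.startswith("//"),
    }

print(parse_ssh_url("ssh://user:password@hostname:2222//path/file.xxx", "me"))
print(parse_ssh_url("ssh://:password@hostname/path/file.xxx", "me"))
print(parse_ssh_url("ssh://[::1]//path/file.xxx", "me"))
```

In the second call the user falls back to default_user and the path is relative to the login directory.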

This URL notation is a simplified form of accessing files via SSH. For even more connection options, you can use the net.ssh() object which allows, e.g., the configuration of the public/private key file path or different host key check procedures. See the documentation for the net.ssh() object for further details.

If both the URL notation and the net.ssh() object are used to configure the SSH connection, communication parameters specified in the URL have higher priority. So, if a username is given within the URL as well as through the object, the username from the URL is used.

The file for an inverse command cannot be on a remote system. If a remote report, info or log file is specified, the output data written to the remote system is in the local system's character set. When reading user tables, checksum files, password files or other miscellaneous text files, an automatic character conversion to the local character set is performed.

When accessing files via SSH from a mainframe system, the remote file is handled like a file from a non-mainframe platform that does not support datasets since SSH file access is block-oriented and not record-oriented. This means that, if record I/O is used, records are written or expected in one of the open formats on the remote system. For example:

   read.record(file='DATA.SET.PSVBA133')
   write.record(file='ssh://<cuser>@server1/[name]')

This statement for the CONV command of FLCL reads a local physical sequential variable blocked dataset with a logical record length of 133. For each record read, a 4-byte length field followed by the record data is written to the remote file. This is because the default mapping of the host record format (VB) to a non-host platform is OPN_VAR.
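The layout "4-byte length field followed by the record data" can be illustrated with a short Python sketch. The byte order of the length field and whether the length counts only the data are not stated here. The sketch assumes big-endian and data-only length, so it must not be taken as the definition of OPN_VAR. The function names are hypothetical.

```python
import io
import struct

def write_records(stream, records):
    """Write each record as 4-byte length (assumed big-endian) plus data."""
    for data in records:
        stream.write(struct.pack(">I", len(data)))
        stream.write(data)

def read_records(stream):
    """Yield the records written by write_records()."""
    while True:
        header = stream.read(4)
        if not header:
            return                      # clean end of file
        if len(header) != 4:
            raise ValueError("truncated length field")
        (length,) = struct.unpack(">I", header)
        data = stream.read(length)
        if len(data) != length:
            raise ValueError("truncated record")
        yield data

buf = io.BytesIO()
write_records(buf, [b"first record", b"second"])
buf.seek(0)
print(list(read_records(buf)))  # [b'first record', b'second']
```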

Using file name lists

With version 5.1.21, the file name specification was extended to an array. This makes it possible to provide a list of files/URLs in the forms below:

   file=hugo.txt file='ssh://user@server/berta.dat' file=s*.txt
   file[hugo.txt,'ssh://user@server/berta.dat',s*.txt]
   file=hugo.txt,'ssh://user@server/berta.dat',s*.txt
   file=>list.txt file=>'DD:FLIST'

In the last case, a parameter file containing the list of URLs can be provided using dynamic or static allocation (DD: only on z/OS). In this case, many URLs with several connections can be used. FLAM now supports a connection cache so that only the required connections are open. The NET overlay provides the default settings for each different URL. Wildcards are still possible in each file specification. This feature can be used to combine different files in one archive.
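The idea of a connection cache can be sketched in Python as follows. How FLAM implements its cache is not described here. Keying connections by schema, user, host and port, the class name ConnectionCache and the connect callback are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

class ConnectionCache:
    """Open one connection per distinct endpoint and reuse it afterwards."""

    def __init__(self, connect):
        self._connect = connect   # hypothetical factory: key -> connection
        self._open = {}

    def get(self, url):
        p = urlsplit(url)
        key = (p.scheme, p.username, p.hostname, p.port)
        if key not in self._open:         # connect only on first use
            self._open[key] = self._connect(key)
        return self._open[key]

opened = []

def fake_connect(key):
    """Stand-in for a real SSH connect; only records the endpoint."""
    opened.append(key)
    return key

cache = ConnectionCache(fake_connect)
for url in ("ssh://user@server/berta.dat",
            "ssh://user@server/anna.dat",
            "ssh://user@other/carl.dat"):
    cache.get(url)
print(len(opened))  # 2
```

Three URLs lead to two connections here because the first two share the same user and server.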