HELP: Additional match parameter to select certain data
TYPE: OVERLAY
SYNTAX: MATCH[{FILTER()/DATASET()}...]
With the MATCH overlay, you can choose whether you want to match against a filter specification with a binary signature or against a clear data set using a row specification.
Using a filter with a signature to update or delete individual records is rejected, because such a match does not identify records uniquely. In these cases, you must always work with the DATASET object.
If you are an archive operator who should not have access to the clear data, the data owner should calculate a Bloom filter with a binary signature using the FILTER command and make it available for the search.
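To illustrate why a filter match is only a candidate and not a unique hit, here is a minimal sketch of a generic Bloom filter in Python. It is not FLAM's actual filter format or signature scheme: the class name, the bit size, the number of hashes, the use of salted SHA-256, and the sample keys are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter (illustrative only, not FLAM's filter format)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, value):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value):
        # True means "possibly present", False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(value))


# The data owner builds the filter from the clear search keys
# (hypothetical sample values).
request_keys = ["4711", "0815"]
bloom = BloomFilter()
for key in request_keys:
    bloom.add(key)

# Only the bit array is handed over, not the clear keys.
print(bloom.might_contain("4711"))
```

A value that was added always tests as possibly present, but an unrelated value can also test positive by chance. That is the reason a filter is suitable for preselecting data but not for updating or deleting individual records.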
If you are the data owner yourself, you can simply specify the clear data set to be selected, in the respective format, using the table support.
flcl archive.comp(read.record(...) to.new(flam(...) store.file(name='myarchive.fl5')))
flcl archive.deco(from(store.file(name='myarchive.fl5') flam(... match.dataset[='myrequest.txt'] subset)) write.text(table(format=CSV) ... file='myresult.csv'))
If you operate an archive remotely, outside your own territory, the following procedure can be used to search for specific data:
- Create the archive with separate passwords for member (search) and data access and write it to the remote side:
    flcl archive.comp(read.record(...) to.new(flam(encrypt.pwd(data(pw='datapw') member(pw='memberpw') dir(pw='dirpw'))...) store.file(name='ssh://remote/myarchive.fl5')))
- Give 'memberpw' to the remote side (better: use the PGP key of the remote side as MEMBERID and your own ID for the data).
- Create the format specification locally from the remote repository, or request the ROWLIST from the remote carrier. At minimum the directory password is required, but the member and data passwords will also work:
    flcl archive.list(store.file(name='ssh://remote/myarchive.fl5' decrypt.pwd(password='dirpw')) rowlist='myformats.txt')
- Use the local ROWLIST to create the filter (if the ROWLIST is not available, the indexing of the archive must be known):
    flcl archive.filter(rowlist='myformats.txt' dataset[='myrequest.txt'] output='myfilter.txt')
- Give 'myfilter.txt' to the remote side (in the example below, SSH is used to access the filter on the local side).
- Run the search on the remote side using 'memberpw' and create a subset repository with the search result:
    flcl archive.copy(from(store.file(name='myarchive.fl5') flam(decode.pwd(password='memberpw') match[='ssh://local/myfilter.txt'] subset)) ... to.dup(store.file(name='ssh://local/mysubset.fl5')))
- Transfer 'mysubset.fl5' to the local side (in the example above, SSH is used to write the subset archive to the local side).
- Run the search request on the local subset archive to get the clear records matching your search request:
    flcl archive.deco(from(store.file(name='mysubset.fl5') flam(decode.pwd(password='datapw') match.dataset='myrequest.txt' subset)) write.text(table(format=CSV) ... file='myresult.csv'))
- To access the data, the data password is required; it should be known only to the data owner.
Use the LIST command with the ROWLIST parameter to create a file containing all formats in the archive. Then use the FILTER command with this list of row specifications and DATASET='myrequest.txt' to create a filter (myfilter.txt) for the clear data, and send this signature to the archive operator. The archive operator generates a subset of the compressed and encrypted segments via COPY with TO.DUP and MATCH='myfilter.txt' and sends this archive (mysubset.fl5), which contains the result set of still compressed and encrypted segments, to the requester. The data owner then decodes (DECO) with MATCH.DATASET='myrequest.txt' so that the records matching the search query (myrequest.txt) are written out as originals from this subset of segments (mysubset.fl5).
The same would work for any number of members and formats in the archive.
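The two-stage search described above (coarse selection of segments by signature on the operator side, exact matching on the owner side) can be sketched as a toy model in Python. This is not how FLAM stores or matches data internally: the segment layout, the one-byte fingerprint, and all function names and sample records are hypothetical and chosen only to show the flow.

```python
import hashlib


def fingerprint(key):
    # One-byte lossy fingerprint, so unrelated keys may collide
    # (stand-in for a filter signature, assumption for illustration).
    return hashlib.sha256(key.encode("utf-8")).digest()[0]


def make_segment(records):
    # Hypothetical segment: a searchable index plus a payload that the
    # operator never inspects in this model.
    return {"index": {fingerprint(k) for k, _ in records},
            "payload": records}


def build_filter(request_keys):
    # Owner side: turn the clear request into a signature.
    return {fingerprint(k) for k in request_keys}


def select_segments(segments, flt):
    # Operator side: uses only the index and the filter, never clear keys.
    return [seg for seg in segments if seg["index"] & flt]


def exact_match(segments, request_keys):
    # Owner side: exact comparison against the clear request.
    wanted = set(request_keys)
    return [rec for seg in segments for rec in seg["payload"]
            if rec[0] in wanted]


segments = [
    make_segment([("4711", "alice"), ("1234", "bob")]),
    make_segment([("0815", "carol")]),
    make_segment([("9999", "dave")]),
]
request = ["4711"]

flt = build_filter(request)                # like FILTER creating myfilter.txt
subset = select_segments(segments, flt)    # like COPY with MATCH and SUBSET
result = exact_match(subset, request)      # like DECO with MATCH.DATASET
print(len(subset), result)
```

The subset may contain segments that matched only through a fingerprint collision. The final exact match on the owner side removes those, which mirrors why the last step in the procedure is run again with the clear data set.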