flacsv reads EBCDIC and ASCII binary files and produces CSV — without a copybook, without a schema. ROWDET auto-detects character set, field boundaries, BCD, floats, integers, and IMS multi-segment structures. From €149/month. Demo license (files ≤ 1 MiB) included.
Copybooks get lost. Developers retire. Acquisitions happen. Data archives outlive the documentation. flacsv is the tool for the moment someone drops an EBCDIC dump on your desk and says "we need a CSV by tomorrow."
Without knowing field boundaries, you can't even start. Each incorrect field offset cascades into garbage for the rest of the record. Manual reverse-engineering of a 300-byte record eats days of senior consultant time at €1 800/day.
Commercial mainframe tools (File-AID, IBM File Manager, Precisely DMX-h, Qlik Replicate) all require a copybook. They add no value when the copybook is the thing that's missing. Enterprise contracts often run to six figures, for a tool that can't handle your actual problem.
JRecord CodeGen (the only OSS tool with file-analysis inference) targets CSV and ASCII fixed-width files. It doesn't understand EBCDIC, IBM packed decimal, zoned sign nibbles, HFP floating point, or IMS envelope framing. You need a tool designed for mainframe data, not one built for ASCII files or repurposed from network-protocol research.
A ransomware recovery restored a backup whose documentation is lost. An audit demands a data profile by Friday. A due-diligence data room closes in 48 hours. You can't wait for the former mainframe team to locate the copybook. You need a tool that works now.
Five heuristic stages, each validated across multiple sample records. The same byte range must pass the test in every record; a single lucky match never counts.
Detects EBCDIC, ASCII or UTF-8 via statistical frequency of printable ranges
and padding bytes (0x40 EBCDIC, 0x20 ASCII, 0x00 NUL
filler). Pass CHRSET=EBCDIC as a hint to skip this step.
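As a rough illustration of the frequency idea, the sketch below votes between EBCDIC and ASCII by counting padding and printable bytes. It is not flacsv's actual detector: the function name `guess_charset`, the byte ranges, and the 0.6 threshold are assumptions chosen for the example, and UTF-8 is left out for brevity.

```python
def guess_charset(sample: bytes) -> str:
    """Crude vote between EBCDIC and ASCII (illustrative thresholds only)."""
    if not sample:
        return "UNKNOWN"
    n = len(sample)
    # EBCDIC: space is 0x40; letters and digits fall inside these ranges.
    ebcdic = sum(
        1 for b in sample
        if b == 0x40 or 0x81 <= b <= 0xA9 or 0xC1 <= b <= 0xE9 or 0xF0 <= b <= 0xF9
    )
    # ASCII: space is 0x20; the printable range is 0x20..0x7E.
    ascii_ = sum(1 for b in sample if 0x20 <= b <= 0x7E)
    if ebcdic / n > 0.6 and ebcdic > ascii_:
        return "EBCDIC"
    if ascii_ / n > 0.6:
        return "ASCII"
    return "UNKNOWN"


print(guess_charset("HELLO WORLD".encode("cp037")))  # EBCDIC code page 037
print(guess_charset(b"HELLO WORLD"))
```

A real detector would weigh many records and treat binary fields separately; the point here is only that padding and printable-range counts already separate the two families on text-heavy data.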
Identifies string fields by trailing padding runs plus fully-filled printable sequences. A zone is accepted only if it appears in ≥ 80 % of sample records (voting consensus). Base16 hex strings are recognized as binary, not string.
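The voting rule can be pictured with a short sketch. It assumes fixed-length sample records and an ASCII printable test; the helper names `printable_vote` and `ascii_printable` are hypothetical, and only the 80 % threshold comes from the text above.

```python
def ascii_printable(b: int) -> bool:
    """True for bytes in the ASCII printable range 0x20..0x7E."""
    return 0x20 <= b <= 0x7E


def printable_vote(records, start, end, is_printable=ascii_printable, threshold=0.80):
    """Accept the zone [start, end) only if it is printable in enough records."""
    if not records:
        return False
    hits = sum(
        1 for r in records
        if len(r) >= end and all(is_printable(b) for b in r[start:end])
    )
    return hits / len(records) >= threshold


samples = [
    b"ABCD\x00\x01",
    b"WXYZ\x00\x02",
    b"EFGH\x01\x00",
    b"IJKL\x00\x00",
    b"\x00\x01KL\x00\x00",  # one outlier record
]
print(printable_vote(samples, 0, 4))  # 4 of 5 records agree
print(printable_vote(samples, 4, 6))  # binary tail, no agreement
```

Requiring consensus rather than a single match is what keeps one record with coincidentally printable binary bytes from creating a bogus string field.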
For binary gaps between strings: tries zoned decimal, packed decimal with C/D/F sign, 2/4/8-byte integers with endian detection, HFP floats (EBCDIC default), IEEE floats with L-shape bit-frequency analysis, and DFP. Each candidate must be valid in every sample record.
If records start with an IMS DFSURGU0 or DFSURGL0 envelope, flacsv recognizes IMSL (20-byte prefix with SEGNAM), IMSS (4-byte prefix) or HDU (36-byte prefix with SEG-CODE), groups records by segment discriminator, and runs the detection recursively on each group's payload.
Produces a FLAM row specification with one row per detected record type, then
writes the converted records to a ZIP archive as CSV. Use ROWOUT=schema.tab to
save the row-spec and feed it into flbcsv with ROWIN=schema.tab for
deterministic, faster production runs.
Every feature is tuned against real mainframe unload data. You can restrict the search space via bitmask hints when you already know parts of the format — fewer candidate types mean faster, more confident detection.
Finds string field endings by locating runs of 0x40 (EBCDIC space),
0x20 (ASCII space), or 0x00 (NUL filler). A zone is only
accepted if the same offset range is printable in the required percentage of
sample records.
Correctly identifies IBM packed decimal (two digits per byte, sign nibble C / D / F
in the low nibble of the last byte) and zoned decimal (EBCDIC 0xF0..0xF9 or ASCII
0x30..0x39, with the overpunch sign in the zone nibble of the final byte). No
false positives on random binary garbage.
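The structural checks for both BCD flavours are simple enough to sketch. The functions below are an illustration, not flacsv's code: the names are hypothetical, only the C / D / F signs named above are accepted, and the zoned check covers the EBCDIC form only.

```python
def is_packed_decimal(field: bytes) -> bool:
    """All nibbles are digits except the last, which must be a C/D/F sign."""
    if not field:
        return False
    nibbles = []
    for b in field:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles
    return all(d <= 9 for d in digits) and sign in (0xC, 0xD, 0xF)


def is_zoned_decimal_ebcdic(field: bytes) -> bool:
    """Bytes 0xF0..0xF9, with the sign carried in the zone nibble of the last byte."""
    if not field:
        return False
    *body, last = field
    body_ok = all(0xF0 <= b <= 0xF9 for b in body)
    last_ok = (last >> 4) in (0xC, 0xD, 0xF) and (last & 0x0F) <= 9
    return body_ok and last_ok


def all_records_agree(records, start, end, check) -> bool:
    """A candidate field counts only if every sample record passes the check."""
    return bool(records) and all(check(r[start:end]) for r in records)


print(is_packed_decimal(bytes.fromhex("01234C")))        # +1234
print(is_zoned_decimal_ebcdic(bytes.fromhex("F1F2D3")))  # -123
```

A single random byte string passes the packed test fairly often; demanding agreement across every sample record, as `all_records_agree` does, is what drives the false-positive rate down.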
IBM hexadecimal floating point (HFP) for EBCDIC-family data, IEEE 754 for ASCII, and IEEE 754-2008 decimal floating point (DFP). Float identification uses bit-frequency analysis inspired by BinaryInferno (NDSS 2023): across sample records, the exponent bits of real floats show an L-shape pattern that distinguishes them from lookalike binary noise.
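To see why HFP and IEEE 754 must be told apart, it helps to decode the same four bytes both ways. The sketch below decodes a 32-bit HFP value (sign bit, 7-bit excess-64 exponent to base 16, 24-bit fraction); it does not implement the L-shape bit-frequency analysis, and the function name `hfp32_to_float` is hypothetical.

```python
import struct


def hfp32_to_float(raw: bytes) -> float:
    """Decode a big-endian IBM single-precision hexadecimal float."""
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (word,) = struct.unpack(">I", raw)
    sign = -1.0 if word >> 31 else 1.0
    exponent = (word >> 24) & 0x7F   # excess-64, base 16
    fraction = word & 0x00FFFFFF     # 24-bit fraction, value in [0, 1)
    return sign * (fraction / float(1 << 24)) * 16.0 ** (exponent - 64)


raw = bytes.fromhex("41100000")
print(hfp32_to_float(raw))            # read as HFP
print(struct.unpack(">f", raw)[0])    # same bytes read as IEEE 754
```

The same bit pattern yields 1.0 as HFP and 9.0 as IEEE 754, so a wrong guess produces plausible-looking but wrong numbers rather than an obvious error.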
Recognizes 2-, 4- and 8-byte integers (signed and unsigned, big- and little-endian).
Endianness is decided by the distribution of leading zero bytes across sample records.
Restrict to specific widths via INTMSK=(S32,U32) when you already know the
layout.
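The zero-byte idea rests on the observation that real-world integers are usually small, so their high-order bytes are zero. A minimal sketch, assuming equal-width non-empty fields cut from the same offset in each record (the function name and the simple majority rule are assumptions, not flacsv's actual scoring):

```python
def guess_endianness(fields) -> str:
    """Vote on byte order from where the zero bytes cluster."""
    # Zero at the front but not at the back suggests big-endian storage.
    zero_first = sum(1 for f in fields if f[0] == 0 and f[-1] != 0)
    # Zero at the back but not at the front suggests little-endian storage.
    zero_last = sum(1 for f in fields if f[-1] == 0 and f[0] != 0)
    if zero_first > zero_last:
        return "big"
    if zero_last > zero_first:
        return "little"
    return "undecided"


values = (1, 300, 70000)
print(guess_endianness([v.to_bytes(4, "big") for v in values]))
print(guess_endianness([v.to_bytes(4, "little") for v in values]))
```

Fields whose values fill the whole width give no signal and fall through to "undecided", which is where an explicit hint helps.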
Detects IMS DFSURGU0 LONG (IMSL), SHORT (IMSS) and HD Unload (HDU, DFSURGL0), groups records by SEGNAM or SEG-CODE, and infers payload structure per segment type recursively. Emits drain columns for envelope prefixes and a catch-all row for unknown segments — exactly the pattern a hand-written copybook would produce.
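Grouping by segment discriminator can be sketched as a plain dictionary build. The 20-byte IMSL prefix length comes from the description above; where the segment name sits inside that prefix is not stated here, so `name_offset=0` and the 8-byte name width in the example are placeholders, and the function name is hypothetical.

```python
from collections import defaultdict


def group_by_segment(records, name_offset, name_length, prefix_length):
    """Bucket record payloads by the discriminator bytes found in the prefix."""
    groups = defaultdict(list)
    for rec in records:
        key = rec[name_offset:name_offset + name_length]
        groups[key].append(rec[prefix_length:])  # strip the envelope prefix
    return dict(groups)


# Made-up records: 8-byte name, 12 filler bytes, then the payload.
records = [
    b"CUSTOMER" + b"\x00" * 12 + b"payload-1",
    b"ORDER   " + b"\x00" * 12 + b"payload-2",
    b"CUSTOMER" + b"\x00" * 12 + b"payload-3",
]
groups = group_by_segment(records, name_offset=0, name_length=8, prefix_length=20)
for name, payloads in groups.items():
    print(name, payloads)
```

Each bucket then holds records of one layout, so field detection can run on it as if it were a single-record-type file.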
When a printable zone contains only hex digits and has even length, flacsv
recognizes it as base16-encoded binary and emits
type.binary(decode=BASE16) instead of type.string().
Base32 and base64 are on the roadmap.
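The two conditions named above (hex digits only, even length) translate directly into a check. This sketch works on ASCII bytes and uses a hypothetical function name; an EBCDIC zone would need to be translated first.

```python
HEX_DIGITS = frozenset(b"0123456789ABCDEFabcdef")


def looks_like_base16(zone: bytes) -> bool:
    """True if the zone is non-empty, even-length, and made of hex digits only."""
    return len(zone) > 0 and len(zone) % 2 == 0 and all(b in HEX_DIGITS for b in zone)


zone = b"DEADBEEF"
if looks_like_base16(zone):
    print(bytes.fromhex(zone.decode("ascii")))  # decoded binary payload
```

A purely numeric field such as `1234` also passes this test, so in practice the decision should rest on many sample records rather than one value.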
Save the inferred row-spec as a TABLE string with
ROWOUT=inferred.tab. Reuse it in production with
flbcsv ROWIN=inferred.tab — no detection overhead per file,
deterministic, copybook-quality. Discovery tool flacsv hands off to
production tool flbcsv seamlessly.
DETECT(CHRSET BCDMSK INTMSK ENDIANMSK FLTMSK BASEMSK UNLOADMSK SAMPLE CONFIDENCE MINSTRLEN STRVOTE)
— every bitmask restricts the candidate search. Knowing your data reduces
false positives and speeds up inference dramatically.
Same ZIP machinery as flbcsv: RFC-4180 CSV, five compression algorithms
(DEFLATE, COPY, BZIP2, ZSTD, LZMA), ISO/IEC 21320-1 conformant member names
via [midN] token, stream mode for z/OS sequential datasets.
gzip, bzip2, xz, zstd, lz4, lzip — detected by magic bytes without a flag.
Read directly from ZIP or FLAM archive members via
INPUT='archive.zip/?MEMBER'.
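Magic-byte detection means looking at the first few bytes of the input instead of trusting a file extension or a flag. The signatures below are the published ones for each format (for lz4, the frame format); the table layout and the function name `sniff_compression` are just one way to sketch it.

```python
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x04\x22\x4d\x18", "lz4"),   # LZ4 frame format
    (b"LZIP", "lzip"),
]


def sniff_compression(head: bytes):
    """Return the compression format named by the leading bytes, or None."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None


print(sniff_compression(b"\x1f\x8b\x08\x00rest-of-stream"))
print(sniff_compression(b"plain data"))
```

Six bytes of lookahead are enough to cover every signature in the table.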
Same batch pipeline as flbcsv: process directories, exclude patterns, remove source files after success, rename by pattern. Built for data-migration workflows.
Standalone binary, no Spark cluster, no JVM, no Python environment. Runs on x86-64, x86-32, POWER, SPARC and natively on z/OS USS. Read mainframe data on any platform.
Seven optional detection hints (CCSID, ENDIAN, DEPTH, BCD, FLOAT, INT, BASE)
let you tell flacsv what you already know about the data origin. FLOAT=HFP
or BCD=PACKED prioritise the relevant type families, making detection
faster and more accurate when the platform is known.
The ROWOUT parameter writes the detected row-spec TABLE string to
a file. Pass it back to flbcsv ROWIN= for deterministic, production-grade
processing without re-running auto-detection on every batch run.
DETECT(CHRSET=EBCDIC) or DETECT(CHRSET=ASCII) narrows detection
to the right byte class without requiring a specific CCSID name. For FIO-level
byte conversion of pure text streams use the top-level INPCCS= parameter
instead.
flacsv now detects delimited text files automatically, including comma-separated, semicolon-separated, and tab-separated layouts. Column names are inferred from the header line and column types (integer, float, string) are detected per column. No hints needed for typical CSV files.
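One simple way to picture delimiter and column-type detection: pick the candidate that occurs the same non-zero number of times on every line, then try progressively looser casts per column. This is a sketch with hypothetical function names, not the CSVDET implementation, and it ignores quoted fields.

```python
def guess_delimiter(lines, candidates=(",", ";", "\t")):
    """Pick the candidate with a constant, non-zero count on every line."""
    best = None
    for d in candidates:
        counts = {line.count(d) for line in lines}
        if len(counts) == 1:
            n = counts.pop()
            if n > 0 and (best is None or n > best[1]):
                best = (d, n)
    return best[0] if best else None


def guess_column_type(values):
    """Try integer first, then float, and fall back to string."""
    for cast, name in ((int, "integer"), (float, "float")):
        try:
            for v in values:
                cast(v)
            return name
        except ValueError:
            continue
    return "string"


lines = ["id;amount;name", "1;12.5;Ann", "2;7;Bob"]
delim = guess_delimiter(lines)
header, *rows = [line.split(delim) for line in lines]
for name, column in zip(header, zip(*rows)):
    print(name, guess_column_type(column))
```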
INPCCS='IBM-1141' applies a byte-stream charset conversion at the
FIO layer before detection (for pure EBCDIC text streams). OUTCCS='UTF-8'
sets the output CCSID for CSV members in the ZIP archive, overriding STYLE.CCSID.
Both are optional and independent of the DETECT hints.
DETECT(TARGET=CSV) forces CSVDET only; DETECT(TARGET=FIX)
forces FIXDET only. DETECT(DISABLE=CSV) suppresses CSV detection for
binary data that happens to look like delimited text. Useful to resolve
ambiguous input formats.
DETECT(FRADIG=2) emits packed or zoned decimal fields as
type.float with two decimal places, turning packed 001234 into
12.34 in the CSV. Essential for financial and billing data where BCD encodes
amounts with implied decimal points.
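The scaling step is plain decimal arithmetic: decode the digits, apply the sign, then shift the decimal point left by the number of fraction digits. The sketch below uses Python's `decimal` module to avoid binary rounding; the function name and the sample bytes `00 01 23 4C` are assumptions made for the example.

```python
from decimal import Decimal


def packed_to_decimal(field: bytes, fradig: int = 0) -> Decimal:
    """Decode IBM packed decimal and apply an implied decimal point."""
    nibbles = []
    for b in field:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles
    if any(d > 9 for d in digits) or sign not in (0xC, 0xD, 0xF):
        raise ValueError("not a valid packed decimal")
    value = int("".join(map(str, digits)))
    if sign == 0xD:          # D marks a negative value
        value = -value
    return Decimal(value).scaleb(-fradig)


print(packed_to_decimal(bytes.fromhex("0001234C"), fradig=2))  # 12.34
print(packed_to_decimal(bytes.fromhex("01234D"), fradig=2))    # -12.34
```

Keeping the value as a decimal rather than a binary float matters for billing data, where 12.34 must stay exactly 12.34.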
No commercial mainframe tool offers copybook-free inference. OSS alternatives exist for ASCII but not for EBCDIC/packed/HFP. flacsv fills the gap.
| Tool | Price | Copybook required? | EBCDIC / Packed / HFP | IMS multi-segment | Runs without mainframe? |
|---|---|---|---|---|---|
| BMC File-AID | ~€2 300/mo | ✗ required | ✓ with copybook | ✓ with copybook | ✗ z/OS only |
| IBM File Manager | six-figure/year | ✗ required | ✓ with copybook | ✓ with copybook | ✗ z/OS only |
| Precisely DMX-h | enterprise | ✗ required | ✓ with copybook | partial | Hadoop/Spark |
| JRecord CodeGen | Free (OSS) | wizard assists | ✗ CSV/ASCII only | ✗ | ✓ Java |
| BinaryInferno (research) | academic | ✓ not required | ✗ network protocols | ✗ | ✓ |
| flacsv | €149/mo | ✓ NOT required | ✓ EBCDIC-native heuristics | ✓ IMSL / IMSS / HDU with recursive inference | ✓ standalone CLI |
Copybook-loss is a real, recurring problem — and the commercial tool ecosystem has zero answer for it.
Same pricing model as all FLAM command products: one tool, multiple platforms, pay only for the platform you deploy on. Annual subscriptions receive a 15 % discount. Demo license embedded in every distribution — no registration, no credit card.
x86-64 and x86-32
Native z/OS and z/OS USS
POWER and Solaris platforms
Demo license: every FLAM distribution includes a built-in demo license with
all features enabled for input files up to 1 MiB. No registration, no credit
card, no feature throttling — evaluate the full product before you buy.
Annual subscription: 15 % discount (roughly 1.8 months free over twelve months).
Contact us for enterprise licensing, multi-seat discounts, or on-premise deployment.
Answers to the questions we hear most often from data-forensics teams and migration architects.
Run flacsv once and save the inferred row-spec (ROWOUT=inferred.tab), hand-verify it,
then switch to flbcsv with ROWIN=inferred.tab. That's the
recommended workflow.
Save the inferred row-spec with ROWOUT, then run flbcsv with
ROWIN on the same data for production.
The DETECT object takes bitmask hints for character set,
BCD formats, integer widths, endian variants, float types, base-encoded strings,
and IMS envelopes. Every hint you provide restricts the candidate search space,
making detection faster and more accurate. For example,
DETECT(CHRSET=EBCDIC BCDMSK=PACKED INTMSK=(S32) FLTMSK=HFP64)
skips the character-set vote and only considers those specific type candidates.
type.binary() for the complex parts. For production
pipelines with complex COBOL semantics, use flbcsv with a copybook or the FLAM
full license.
FLAC_ROWDET),
sold as part of the flacsv command product or as a "Table Support Add-on" to
the FLAM full license (bundled with FLAC_TABCPL, the table
compiler). The detection heuristics are informed by academic research
(BinaryInferno NDSS 2023 L-shape float analysis) combined with EBCDIC-native
engineering (packed/zoned overpunch, HFP exponent bias, IMS envelope layouts)
that we developed specifically for mainframe data.
FLAC_ACSV = 0x000DU | FLAC_COMD_MASK, along with its required
feature codes FLAC_ROWDET = 0x00B3U | FLAC_FEAT_MASK and
FLAC_TABCPL = 0x00B2U | FLAC_FEAT_MASK. When these are present
in your license, flcl ACSV and the standalone flacsv
binary are active.
Every distribution includes a built-in demo license — all features, files up to 1 MiB, no registration required. See how flacsv reads your data before you decide.
Request Demo Binary
Or: compare with flbcsv, the copybook-driven sibling for production pipelines.