How flacsv infers structure without a copybook

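Detection runs in five heuristic stages, each validated across multiple sample records. A candidate is accepted only when the same byte range passes the test across the sample (every record for numeric types, a voting threshold for string zones), never on a single lucky match.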

Character set

Detects EBCDIC, ASCII or UTF-8 from the statistical frequency of printable byte ranges and padding bytes (0x40 EBCDIC space, 0x20 ASCII space, 0x00 NUL filler). Pass DETECT(CHRSET=EBCDIC) as a hint to skip this step.
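As an illustration of the frequency idea only (this is not flacsv's implementation), the Python sketch below scores a sample against the EBCDIC and ASCII printable ranges and falls back to a UTF-8 decode check. The function name guess_charset and the exact byte ranges are assumptions made for this example.

```python
def guess_charset(sample: bytes) -> str:
    """Guess 'EBCDIC', 'ASCII' or 'UTF-8' from byte-range frequencies (illustrative only)."""
    if not sample:
        return "ASCII"
    n = len(sample)
    # EBCDIC: space is 0x40; letters and digits live in 0x81..0xF9.
    ebcdic = sum(1 for b in sample if b == 0x40 or 0x81 <= b <= 0xF9) / n
    # ASCII: space is 0x20; the printable range is 0x20..0x7E.
    ascii_ = sum(1 for b in sample if 0x20 <= b <= 0x7E) / n
    if ebcdic > ascii_:
        return "EBCDIC"
    # Mostly-ASCII data containing valid multi-byte sequences is treated as UTF-8.
    if any(b >= 0x80 for b in sample):
        try:
            sample.decode("utf-8")
            return "UTF-8"
        except UnicodeDecodeError:
            pass
    return "ASCII"


print(guess_charset("HELLO WORLD   ".encode("cp037")))  # EBCDIC
print(guess_charset(b"HELLO WORLD   "))                 # ASCII
```

A real detector would also weigh NUL filler and binary fields between the strings; the sketch only looks at text-like bytes.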

String zones

Identifies string fields by trailing padding runs plus fully-filled printable sequences. A zone is accepted only if it appears in ≥ 80 % of sample records (voting consensus). Base16 hex strings are recognized as binary, not string.
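The voting rule can be sketched in a few lines of Python. This is a simplified illustration for ASCII data, not the product's algorithm: the function string_zones and its parameters vote and min_len are hypothetical names chosen here, loosely mirroring the idea of a consensus threshold and a minimum string length.

```python
def string_zones(records, vote=0.8, min_len=3):
    """Return (start, end) offset ranges that are printable ASCII
    in at least `vote` of the sample records."""
    width = min(len(r) for r in records)
    zones, start = [], None
    # Walk one offset past the end so an open zone is always closed.
    for off in range(width + 1):
        ok = False
        if off < width:
            hits = sum(1 for r in records if 0x20 <= r[off] <= 0x7E)
            ok = hits / len(records) >= vote
        if ok and start is None:
            start = off
        elif not ok and start is not None:
            if off - start >= min_len:
                zones.append((start, off))
            start = None
    return zones


sample = [
    b"ALICE   " + bytes([0, 0, 1, 0x2C]),
    b"BOB     " + bytes([0, 0, 0, 0x07]),
    b"CAROLINE" + bytes([0, 0, 2, 0x10]),
    b"DAVE    " + bytes([0, 0, 0, 0x1F]),
    b"EVE     " + bytes([0, 1, 0, 0x05]),
]
print(string_zones(sample))  # [(0, 8)]
```

In the sample, offset 11 happens to be printable in one record (0x2C is a comma), but one hit out of five stays below the threshold, so the binary gap is not mistaken for text.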

Numeric interpretation

For binary gaps between strings: tries zoned decimal, packed decimal with C/D/F sign, 2/4/8-byte integers with endian detection, HFP floats (EBCDIC default), IEEE floats with L-shape bit-frequency analysis, and DFP. Each candidate must be valid in every sample record.

IMS envelope

If records start with an IMS DFSURGU0 or DFSURGL0 envelope, flacsv recognizes IMSL (20-byte prefix with SEGNAM), IMSS (4-byte prefix) or HDU (36-byte prefix with SEG-CODE), groups records by segment discriminator, and runs the detection recursively on each group's payload.

Row-spec emission

Produces a FLAM row specification with one row per detected record type and writes the converted records to a ZIP archive, one CSV member per row-spec. Use ROWOUT=schema.tab to save the row-spec and feed it into flbcsv with ROWIN=schema.tab for deterministic, faster production runs.

Capabilities in detail

Every feature is tuned against real mainframe unload data. You can restrict the search space via bitmask hints when you already know parts of the format — fewer candidate types mean faster, more confident detection.

Padding-blank detection across record boundaries

Finds string field endings by locating runs of 0x40 (EBCDIC space), 0x20 (ASCII space), or 0x00 (NUL filler). A zone is only accepted if the same offset range is printable in the required percentage of sample records.

Packed + zoned decimal with overpunch

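Identifies IBM packed decimal (two digits per byte, sign nibble C / D / F in the low nibble of the last byte) and zoned decimal (EBCDIC 0xF0..0xF9 or ASCII 0x30..0x39, with the sign overpunched on the final byte; in EBCDIC this is its zone nibble). Because each candidate must be valid in every sample record, random binary garbage is rejected rather than misread as a decimal.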
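To make the two layouts concrete, here is a minimal Python sketch of the validity checks, assuming the standard IBM encodings described above. The function names are hypothetical and the code is an illustration, not flacsv's detector. Each function returns None when the bytes cannot be the given type, which is the property a detector needs in order to reject a candidate.

```python
def unpack_packed(field: bytes):
    """Decode IBM packed decimal; return an int, or None if the bytes are invalid."""
    if not field:
        return None
    nibbles = []
    for b in field:
        nibbles += [b >> 4, b & 0x0F]
    *digits, sign = nibbles  # the last nibble carries the sign
    if sign not in (0xC, 0xD, 0xF) or any(d > 9 for d in digits):
        return None
    value = int("".join(map(str, digits)))
    return -value if sign == 0xD else value


def unpack_zoned_ebcdic(field: bytes):
    """Decode EBCDIC zoned decimal with the sign in the zone nibble of the last byte."""
    if not field:
        return None
    *body, last = field
    if any(not 0xF0 <= b <= 0xF9 for b in body):
        return None
    zone, digit = last >> 4, last & 0x0F
    if zone not in (0xC, 0xD, 0xF) or digit > 9:
        return None
    value = int("".join(str(b & 0x0F) for b in body) + str(digit))
    return -value if zone == 0xD else value


print(unpack_packed(bytes.fromhex("01234D")))        # -1234
print(unpack_zoned_ebcdic(bytes.fromhex("F1F2D3")))  # -123
```

A candidate offset range would then be kept only if the decoder returns a value for that range in every sample record.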

HFP / IEEE / DFP floating point

IBM hexadecimal floating point (HFP) for EBCDIC-family data, IEEE 754 binary floating point for ASCII-family data, and IEEE 754-2008 decimal floating point (DFP). Float identification uses bit-frequency analysis inspired by BinaryInferno (NDSS 2023): across sample records the exponent bits show an L-shape pattern, which distinguishes real floats from lookalike binary noise.
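For readers unfamiliar with HFP, the following Python sketch decodes the format itself: one sign bit, a 7-bit excess-64 exponent to base 16, and a 24-bit (short) or 56-bit (long) fraction. It shows only the number layout; the bit-frequency analysis is not reproduced here, and the function name hfp_to_float is an assumption of this example.

```python
def hfp_to_float(raw: bytes) -> float:
    """Convert a big-endian IBM HFP value (4 or 8 bytes) to a Python float."""
    if len(raw) not in (4, 8):
        raise ValueError("HFP short is 4 bytes, HFP long is 8 bytes")
    sign = -1.0 if raw[0] & 0x80 else 1.0
    exponent = (raw[0] & 0x7F) - 64      # excess-64 exponent, base 16
    frac_bits = 8 * (len(raw) - 1)       # 24 or 56 fraction bits
    fraction = int.from_bytes(raw[1:], "big") / (1 << frac_bits)
    return sign * fraction * 16.0 ** exponent


print(hfp_to_float(bytes.fromhex("41100000")))  # 1.0
print(hfp_to_float(bytes.fromhex("C276A000")))  # -118.625
```

Long HFP fractions have more bits than an IEEE double can hold, so converting 8-byte values this way may round the last few bits.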

Integer detection with endian inference

Recognizes 2-, 4- and 8-byte integers (signed and unsigned, big- and little-endian). Byte order is decided from the distribution of leading zero bytes across sample records. Restrict detection to specific widths via INTMSK=(S32,U32) when you already know the layout.
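The intuition behind the zero-byte vote is that real-world counters and amounts are usually small, so the high-order bytes tend to be zero. A minimal Python sketch of that idea follows; it is deliberately simplified (it inspects only the first and last byte of the field) and the function name guess_endian is hypothetical.

```python
def guess_endian(records, offset, width):
    """Vote on byte order by checking which end of the field holds more zero bytes."""
    first_zero = sum(1 for r in records if r[offset] == 0)
    last_zero = sum(1 for r in records if r[offset + width - 1] == 0)
    if first_zero > last_zero:
        return "big"      # zeros at the front: high-order byte comes first
    if last_zero > first_zero:
        return "little"   # zeros at the back: high-order byte comes last
    return "unknown"


values = [1, 300, 70000, 12]
print(guess_endian([v.to_bytes(4, "big") for v in values], 0, 4))     # big
print(guess_endian([v.to_bytes(4, "little") for v in values], 0, 4))  # little
```

When the field is uniformly distributed over its full range the vote is a tie, which is one reason an explicit endian hint helps.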

IMS-aware multi-segment inference

Detects IMS DFSURGU0 LONG (IMSL), SHORT (IMSS) and HD Unload (HDU, DFSURGL0), groups records by SEGNAM or SEG-CODE, and infers payload structure per segment type recursively. Emits drain columns for envelope prefixes and a catch-all row for unknown segments — exactly the pattern a hand-written copybook would produce.
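The group-then-recurse step can be pictured with the short Python sketch below. The prefix length and the position of the segment name are passed in as parameters because the exact envelope layouts are not spelled out here; the values used in the usage lines (a 20-byte prefix with the name at bytes 4..12) are purely hypothetical test data, not the real IMSL layout.

```python
from collections import defaultdict


def group_by_segment(records, prefix_len, key):
    """Group record payloads by a discriminator taken from the envelope prefix.

    `key` is a slice into the record that selects the discriminator bytes;
    `prefix_len` is the number of envelope bytes to strip from each record.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec[prefix_len:])
    return dict(groups)


def fake_record(name: str, payload: bytes) -> bytes:
    # Hypothetical 20-byte envelope: 4 filler bytes, 8-byte name, 8 filler bytes.
    return b"\x00" * 4 + name.ljust(8).encode("ascii") + b"\x00" * 8 + payload


recs = [fake_record("CUSTOMER", b"AAA"), fake_record("ORDER", b"BB"),
        fake_record("CUSTOMER", b"CCC")]
groups = group_by_segment(recs, prefix_len=20, key=slice(4, 12))
print(sorted(groups))  # [b'CUSTOMER', b'ORDER   ']
```

Each group's payload list would then be fed to the same field detection as a flat file, which is what makes the inference recursive.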

Base16-encoded binary strings

When a printable zone contains only hex digits and has even length, flacsv recognizes it as base16-encoded binary and emits type.binary(decode=BASE16) instead of type.string(). Base32 and Base64 are on the roadmap.
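The rule stated above (hex digits only, even length) is easy to express in Python. The sketch below is an illustration for ASCII zones with hypothetical function names; it says nothing about how flacsv applies the rule across records.

```python
import string

HEX_BYTES = set(string.hexdigits.encode("ascii"))  # 0-9, a-f, A-F


def is_base16_zone(zone: bytes) -> bool:
    """True if the zone is non-empty, of even length, and contains only hex digits."""
    return len(zone) > 0 and len(zone) % 2 == 0 and all(b in HEX_BYTES for b in zone)


def decode_base16(zone: bytes) -> bytes:
    """Decode a validated base16 zone to raw bytes."""
    return bytes.fromhex(zone.decode("ascii"))


print(is_base16_zone(b"DEADBEEF"))  # True
print(is_base16_zone(b"HELLO123"))  # False
print(decode_base16(b"DEADBEEF"))   # b'\xde\xad\xbe\xef'
```

A single value such as "1234" or "CAFE" also passes the test, so the check is only meaningful when it holds for the same zone in every sample record.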

ROWOUT — natural upgrade path to flbcsv

Save the inferred row-spec as a TABLE string with ROWOUT=inferred.tab. Reuse it in production with flbcsv ROWIN=inferred.tab — no detection overhead per file, deterministic, copybook-quality. Discovery tool flacsv hands off to production tool flbcsv seamlessly.

Hint-driven performance

DETECT(CHRSET BCDMSK INTMSK ENDIANMSK FLTMSK BASEMSK UNLOADMSK SAMPLE CONFIDENCE MINSTRLEN STRVOTE) — every bitmask restricts the candidate search. Knowing your data reduces false positives and speeds up inference dramatically.

ZIP archive output with CSV per row-spec

Same ZIP machinery as flbcsv: RFC-4180 CSV, five compression algorithms (DEFLATE, COPY, BZIP2, ZSTD, LZMA), ISO/IEC 21320-1 conformant member names via the [midN] token, and stream mode for z/OS sequential datasets.

Transparent auto-decompression

gzip, bzip2, xz, zstd, lz4, lzip — detected by magic bytes without a flag. Read directly from ZIP or FLAM archive members via INPUT='archive.zip/?MEMBER'.
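Magic-byte detection works because each of these formats begins with a fixed signature. The Python sketch below lists the well-known signatures for the six formats named above; the table and the function name sniff_compression are written for this example and are not taken from the product.

```python
import gzip

# Leading signature bytes of each container format.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd\x37\x7a\x58\x5a\x00", "xz"),
    (b"\x28\xb5\x2f\xfd", "zstd"),
    (b"\x04\x22\x4d\x18", "lz4"),   # LZ4 frame format
    (b"LZIP", "lzip"),
]


def sniff_compression(head: bytes):
    """Return the compression format suggested by the first bytes, or None."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None


print(sniff_compression(gzip.compress(b"demo")))  # gzip
print(sniff_compression(b"plain text"))           # None
```

Only the first few bytes of the input are needed, so the check can run before any data is buffered.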

Directory walk, EXCLUDE, REMOVE, RENAME

Same batch pipeline as flbcsv: process directories, exclude patterns, remove source files after success, rename by pattern. Built for data-migration workflows.

Cross-platform: Linux · Windows · AIX · Solaris · z/OS

Standalone binary, no Spark cluster, no JVM, no Python environment. Runs on x86-64, x86-32, POWER, SPARC and natively on z/OS USS. Read mainframe data on any platform.

DETECT hints — narrow the search space

Seven optional detection hints (CCSID, ENDIAN, DEPTH, BCD, FLOAT, INT, BASE) let you tell flacsv what you already know about the data origin. FLOAT=HFP or BCD=PACKED prioritise the relevant type families, making detection faster and more accurate when the platform is known.

Upgrade path to flbcsv via ROWOUT

The ROWOUT parameter writes the detected row-spec TABLE string to a file. Pass it back to flbcsv ROWIN= for deterministic, production-grade processing without re-running auto-detection on every batch run.

DETECT.CHRSET shorthand

DETECT(CHRSET=EBCDIC) or DETECT(CHRSET=ASCII) narrows detection to the right byte class without requiring a specific CCSID name. For FIO-level byte conversion of pure text streams use the top-level INPCCS= parameter instead.

CSV auto-detection (CSVDET)

flacsv now detects delimited text files automatically, including comma-separated, semicolon-separated, and tab-separated layouts. Column names are inferred from the header line and column types (integer, float, string) are detected per column. No hints needed for typical CSV files.
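A much simplified Python sketch of delimiter and column-type detection is shown below. It assumes a header line, no quoted fields containing the delimiter, and only the three types named above; the function names are hypothetical and the approach (pick the delimiter whose count is identical on every line) is one common heuristic, not necessarily the one CSVDET uses.

```python
import csv
import io


def column_type(values):
    """Classify a column as 'integer', 'float' or 'string'."""
    for caster, name in ((int, "integer"), (float, "float")):
        try:
            for v in values:
                caster(v)
            return name
        except ValueError:
            continue
    return "string"


def sniff_delimited(text, candidates=",;\t"):
    """Return (delimiter, {column: type}) or None if no delimiter is consistent."""
    lines = [ln for ln in text.splitlines() if ln]
    best = None
    for d in candidates:
        counts = {ln.count(d) for ln in lines}
        # Accept only delimiters that occur equally often on every line.
        if len(counts) == 1 and counts != {0}:
            n = counts.pop()
            if best is None or n > best[1]:
                best = (d, n)
    if best is None:
        return None
    rows = list(csv.reader(io.StringIO(text), delimiter=best[0]))
    header, data = rows[0], rows[1:]
    types = [column_type([r[i] for r in data]) for i in range(len(header))]
    return best[0], dict(zip(header, types))


print(sniff_delimited("id;name;amount\n1;Alice;12.50\n2;Bob;7\n"))
# (';', {'id': 'integer', 'name': 'string', 'amount': 'float'})
```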

INPCCS and OUTCCS — explicit charset control

INPCCS='IBM-1141' applies a byte-stream charset conversion at the FIO layer before detection (for pure EBCDIC text streams). OUTCCS='UTF-8' sets the output CCSID for CSV members in the ZIP archive, overriding STYLE.CCSID. Both are optional and independent of the DETECT hints.

Detector control (TARGET and DISABLE)

DETECT(TARGET=CSV) forces CSVDET only; DETECT(TARGET=FIX) forces FIXDET only. DETECT(DISABLE=CSV) suppresses CSV detection for binary data that happens to look like delimited text. Useful to resolve ambiguous input formats.

Fractional BCD digits (FRADIG)

DETECT(FRADIG=2) emits packed or zoned decimal fields as type.float with two decimal places, turning packed 001234 into 12.34 in the CSV. Essential for financial and billing data where BCD encodes amounts with implied decimal points.
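The implied decimal point is just a power-of-ten scale applied after decoding. The Python sketch below shows the arithmetic for a packed field; it uses Decimal rather than a binary float so that amounts such as 12.34 stay exact, which is a choice made for this example (the product is described as emitting type.float). The function name packed_to_decimal is hypothetical.

```python
from decimal import Decimal


def packed_to_decimal(field: bytes, fradig: int) -> Decimal:
    """Decode IBM packed decimal and apply `fradig` implied fractional digits."""
    text = field.hex()                 # lowercase hex, e.g. '0001234c'
    digits, sign = text[:-1], text[-1]
    if not digits.isdigit() or sign not in "cdf":
        raise ValueError("not a valid packed decimal field")
    value = Decimal(int(digits)).scaleb(-fradig)  # shift the decimal point left
    return -value if sign == "d" else value


print(packed_to_decimal(bytes.fromhex("0001234C"), 2))  # 12.34
print(packed_to_decimal(bytes.fromhex("0001234D"), 2))  # -12.34
```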

Simple, transparent pricing

Same pricing model as all FLAM command products: one tool, multiple platforms, pay only for the platform you deploy on. Annual subscriptions receive a 15 % discount. Demo license embedded in every distribution — no registration, no credit card.

Linux / Windows

x86-64 and x86-32

€149

per month, per server

ROWDET engine (CSVDET + FIXDET + XMLDET)
EBCDIC + ASCII + UTF-8 detection
Packed + zoned BCD, HFP + IEEE + DFP floats, 2/4/8-byte integers
IMS multi-segment inference (IMSL / IMSS / HDU)
Base16 binary string recognition
ROWOUT upgrade path to flbcsv
ZIP output, RFC-4180 CSV, STYLE tuning
Transparent decompression (gzip / bzip2 / xz / zstd / lz4 / lzip)
Email support

z/OS LPAR

Native z/OS and z/OS USS

€999

per month, per LPAR

Runs natively on z/OS USS
Reads VSAM and sequential datasets directly
Full ROWDET engine with EBCDIC-native heuristics
No mainframe ISV middleware required
Priority email and phone support
Especially valuable for forensic analysis of archived datasets

AIX / SPARC

POWER and Solaris platforms

€199

per month, per server

IBM AIX on POWER (32- and 64-bit)
Solaris SPARC and Solaris x86
Same feature set as Linux edition
Ideal for midrange data-migration and legacy analysis
Email support

Demo license: every FLAM distribution includes a built-in demo license with all features enabled for input files up to 1 MiB. No registration, no credit card, no feature throttling — evaluate the full product before you buy.
Annual subscription: 15 % discount (equivalent to roughly 1.8 free months per year).
Contact us for enterprise licensing, multi-seat discounts, or on-premise deployment.

Frequently asked questions

Answers to the questions we hear most often from data-forensics teams and migration architects.

How accurate is flacsv without a copybook?

Detection quality depends on the regularity of the data. For typical mainframe records with padded string fields, packed decimals, and binary integers, flacsv achieves field-boundary correctness of 80 % or more on the first pass — enough to explore the data, answer audit questions, and build a first draft of a copybook. For production pipelines with 100 % deterministic guarantees, use flacsv for discovery (ROWOUT=inferred.tab), hand-verify the row-spec, then switch to flbcsv with ROWIN=inferred.tab. That's the recommended workflow.

What's the difference between flacsv and flbcsv?

flbcsv converts mainframe binary files to CSV using a provided COBOL copybook (or PL/I, C99, XSD, HLASM, DFDL in later waves). It's deterministic, fast, and the right tool for production pipelines. flacsv does the same conversion but infers the row-spec from the data itself, with no copybook required. It's slower per file (detection overhead) and statistical (not 100 % deterministic), so it's the right tool for discovery, forensics, and one-off analysis.

The natural workflow combines both: start with flacsv to understand the data, save the row-spec with ROWOUT, then run flbcsv with ROWIN on the same data for production.

Does flacsv support IMS multi-segment unloads?

Yes. flacsv recognizes IMS DFSURGU0 LONG (IMSL, 20-byte prefix with SEGNAM), DFSURGU0 SHORT (IMSS, 4-byte LL+ZZ prefix), and HD Unload (HDU, DFSURGL0 with 2-byte SEG-CODE). Records are grouped by segment discriminator and detection runs recursively on each group's payload. The output is a multi-row-spec ZIP archive with one CSV per segment type, plus drain columns for the envelope prefix and a catch-all row for unknown segments — exactly the structure you'd get from a hand-written copybook.

Can I tell flacsv what I already know?

Yes. The DETECT object takes bitmask hints for character set, BCD formats, integer widths, endian variants, float types, base-encoded strings, and IMS envelopes. Every hint you provide restricts the candidate search space, making detection faster and more accurate. For example: DETECT(CHRSET=EBCDIC BCDMSK=PACKED INTMSK=(S32) FLTMSK=HFP64) skips the character-set vote and only considers those specific type candidates.

Why is flacsv more expensive than flbcsv (€149 vs. €59)?

Different use case, different value. flbcsv serves regular pipelines where the copybook is already in your schema repository: it is a routine conversion tool. flacsv serves critical moments such as audit, incident response, M&A due diligence, and migration kickoff, where the data has no schema and someone needs answers fast. Having the right tool in that moment is worth substantially more than a routine conversion, and the pricing reflects that. You can always combine both: use flacsv once to infer the schema (€149 for the month), then run flbcsv (€59/month) for ongoing production. The total of €208 for that month is still less than €250, versus €2 300/month for the nearest commercial alternative, which does not even solve the copybook-free case.

Do I need a mainframe or Spark cluster?

No. flacsv runs as a standalone command-line binary on Linux x86-64, Windows, AIX, Solaris, and natively on z/OS USS. All you need is the exported data file (transferred via FTP, NFS, or any other method). No mainframe connection, no Spark cluster, no JVM, no Python environment required.

How does the demo license work?

Every FLAM distribution ships with an embedded demo license. All features are enabled, but input files are limited to 1 MiB. No registration, no credit card, no feature throttling — you see the full product. Once your files exceed 1 MiB, switching to a paid license takes two minutes. This is the standard model across all FLAM command products (flbcsv, flworm, flacsv, ...).

Does flacsv handle OCCURS DEPENDING ON or complex REDEFINES?

No — that's deliberate. Variable-length arrays (OCCURS DEPENDING ON) and nested REDEFINES with dynamic discriminator logic require the full deterministic semantics of a COBOL parser, which only flbcsv and the FLAM full license provide. Data with these constructs will be inferred as fixed-width approximations or fall back to type.binary() for the complex parts. For production pipelines with complex COBOL semantics, use flbcsv with a copybook or the FLAM full license.

Is flacsv's detection engine open source?

No. The ROWDET engine is a proprietary feature (feature code FLAC_ROWDET), sold as part of the flacsv command product or as a "Table Support Add-on" to the FLAM full license (bundled with FLAC_TABCPL, the table compiler). The detection heuristics are informed by academic research (BinaryInferno NDSS 2023 L-shape float analysis) combined with EBCDIC-native engineering (packed/zoned overpunch, HFP exponent bias, IMS envelope layouts) that we developed specifically for mainframe data.

Where do I see flacsv in the license manager?

In the FLAM license-info output you'll see the command code FLAC_ACSV = 0x000DU | FLAC_COMD_MASK, along with its required feature codes FLAC_ROWDET = 0x00B3U | FLAC_FEAT_MASK and FLAC_TABCPL = 0x00B2U | FLAC_FEAT_MASK. When these are present in your license, flcl ACSV and the standalone flacsv binary are active.

How flacsv compares

Tool	Price	Copybook required?	EBCDIC / Packed / HFP	IMS multi-segment	Runs without mainframe?
BMC File-AID	~€2 300/mo	✗ required	✓ with copybook	✓ with copybook	✗ z/OS only
IBM File Manager	six-figure/year	✗ required	✓ with copybook	✓ with copybook	✗ z/OS only
Precisely DMX-h	enterprise	✗ required	✓ with copybook	partial	Hadoop/Spark
JRecord CodeGen	Free (OSS)	wizard assists	✗ CSV/ASCII only	✗	✓ Java
BinaryInferno (research)	academic	✓ no	✗ network protocols	✗	✓
flacsv	€149/mo	✓ NOT required	✓ EBCDIC-native heuristics	✓ IMSL / IMSS / HDU with recursive inference	✓ standalone CLI

You inherited the data.
Not the copybook.

When the schema is gone, the data isn't useless — it's invisible.

Custom Python breaks at COMP-3 and packed decimals

BMC File-AID costs €2 300/month — and still wants a copybook

OSS alternatives can't read EBCDIC

Incident-response has no time for a Jira cycle
