File Conversion and Processing
Migrate File Formats. Transform, Protect, and Report, too.
Challenges
Mainframe data types and file formats may be unsuitable for relational databases, data warehousing, and reporting environments on open systems. The converse may also be true if you still process data on, or for, a mainframe. For this reason, you may need to convert variable block or COBOL index files to CSV, or convert text to ISAM.
Similarly, XML is a popular interchange format, but large XML files have not been practical to manipulate or convert. ASN.1 CDR, LDIF, and Parquet files, on the other hand, hold large amounts of information but are not formats that many applications can import or process.
You may therefore need a way to convert between file formats and data types. You may also need to manipulate, report from, and protect data in multiple file formats -- possibly at the same time. Most available solutions are very expensive product suites or custom consulting services.
Solutions
File Format Conversion Only
IRI NextForm software migrates popular flat-file and legacy index file formats, the layout of their records, and the data types within fields. NextForm supports the translation of more than 100 data and 126 file types, including:
- XML (flat)
The NextForm "Unstructured Data" edition is an upgrade that can search for data elements in any number (and combination) of doc/x, eml, pdf, ppt/x, rtf, txt, xlsx, and xml files, and structure them into flat files.
File Processing and Conversion
The SortCL program included in the IRI CoSort data manipulation package or IRI Voracity data management platform supports the simultaneous transformation (sort, join, aggregate, remap) and interchange (both conversion and creation) of the same data and file types supported by NextForm. SortCL can also generate detail and summary reports from these file formats, and protect sensitive data at the field level with a variety of data masking functions.
These capabilities are useful for mainframe and database migrations, ETL, SOA, and desktop application imports.
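As a rough illustration of what combining transformation, reporting, and field-level protection in one pass means, here is a minimal Python sketch using only the standard library. It is not SortCL and does not reflect how SortCL works internally; the field names (customer, region, card, amount), the sample rows, and the redaction rule in the hypothetical mask_value helper are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def mask_value(value, keep_last=4, fill="*"):
    """Redact all but the last few characters of a sensitive field."""
    hidden = max(len(value) - keep_last, 0)
    return fill * hidden + value[hidden:]


def process(csv_text):
    """Sort rows, total amounts per region, and mask the card field.

    Returns (detail_rows, totals_by_region).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Transformation: order the detail rows by region, then customer.
    rows.sort(key=lambda r: (r["region"], r["customer"]))

    totals = defaultdict(Decimal)  # Summary report: amount per region.
    detail = []
    for row in rows:
        totals[row["region"]] += Decimal(row["amount"])
        # Protection: replace the card number with a redacted copy.
        detail.append({**row, "card": mask_value(row["card"])})
    return detail, dict(totals)


# Hypothetical sample input, invented for this sketch.
sample_csv = (
    "customer,region,card,amount\n"
    "Zed,West,4111111111111111,20.50\n"
    "Amy,East,5500000000000004,10.00\n"
    "Bob,East,340000000000009,5.25\n"
)

detail, totals = process(sample_csv)
for row in detail:
    print(row["region"], row["customer"], row["card"], row["amount"])
for region, total in sorted(totals.items()):
    print(f"TOTAL {region}: {total}")
```

The detail rows and the per-region totals come from the same read of the input, which is the point of the sketch: sorting, aggregation, and masking do not need separate passes over the file.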
To specify a file format conversion in NextForm or SortCL, just declare the input and output formats in a script or through the IRI Workbench GUI, built on Eclipse. The source spec might contain:
/INFILE=/path/filename1
/PROCESS=CSV
and the output(s) declarations might be:
/OUTFILE=/path/filename2
/PROCESS=XML
/OUTFILE=/path/filename3
/PROCESS=LDIF
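To make the one-input, several-outputs idea concrete, the following Python sketch reads CSV once and renders it as both flat XML and LDIF. It uses only the standard library and is not NextForm or SortCL; the sample columns (uid, cn, mail), the base DN, and the element names are assumptions. It also assumes that column names are valid XML tag names and that values need no LDIF base64 encoding.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv_records(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def to_xml(records, root_tag="records", row_tag="record"):
    """Render records as flat XML: one element per row, one child per field."""
    root = ET.Element(root_tag)
    for rec in records:
        row = ET.SubElement(root, row_tag)
        for name, value in rec.items():
            ET.SubElement(row, name).text = value
    return ET.tostring(root, encoding="unicode")


def to_ldif(records, key_field, base_dn):
    """Render records as LDIF entries, building each DN from key_field."""
    entries = []
    for rec in records:
        lines = [f"dn: {key_field}={rec[key_field]},{base_dn}"]
        lines += [f"{name}: {value}" for name, value in rec.items()]
        entries.append("\n".join(lines))
    # LDIF separates entries with a blank line.
    return "\n\n".join(entries) + "\n"


# Hypothetical sample input, invented for this sketch.
sample_csv = (
    "uid,cn,mail\n"
    "jdoe,John Doe,jdoe@example.com\n"
    "asmith,Ann Smith,asmith@example.com\n"
)

records = read_csv_records(sample_csv)  # one source ...
xml_text = to_xml(records)              # ... two targets
ldif_text = to_ldif(records, key_field="uid",
                    base_dn="ou=people,dc=example,dc=com")
print(xml_text)
print(ldif_text)
```

Parsing into a neutral record structure first, then writing each target from it, mirrors the declarative split between an input format and one or more output formats.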
Data Type Conversion
You can also convert between field data types in SortCL or NextForm jobs.
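For a sense of what a field-level data type conversion involves, the Python sketch below decodes two mainframe-style fields into open-systems values. The choice of types is an assumption made for illustration (an EBCDIC character field in code page 037 and an IBM packed-decimal, or COMP-3, numeric field); the text above does not name specific types, and this is not how SortCL or NextForm are implemented.

```python
from decimal import Decimal


def ebcdic_to_text(raw, codec="cp037"):
    """Decode an EBCDIC character field and strip trailing pad spaces."""
    return raw.decode(codec).rstrip()


def unpack_comp3(raw, scale=0):
    """Decode a packed-decimal (COMP-3) field into a Decimal.

    Each byte holds two 4-bit digits, except the last byte, whose low
    nibble is the sign: 0xB or 0xD is negative, other values 0xA-0xF
    are treated as positive or unsigned.
    """
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    sign = nibbles.pop()
    if sign < 0x0A or any(d > 9 for d in nibbles):
        raise ValueError("not a valid packed-decimal field")
    value = int("".join(str(d) for d in nibbles) or "0")
    if sign in (0x0B, 0x0D):
        value = -value
    # scale is the number of implied decimal places in the field.
    return Decimal(value).scaleb(-scale)


# Hypothetical field contents, invented for this sketch.
name_field = "HELLO".encode("cp037") + b"\x40\x40"  # 0x40 is an EBCDIC space
amount_field = bytes([0x12, 0x34, 0x5C])            # digits 12345, positive

print(ebcdic_to_text(name_field))
print(unpack_comp3(amount_field, scale=2))
```

Returning a Decimal rather than a float keeps the implied decimal places exact, which matters when the converted values feed a database or report.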
For more details on:
- Profiling and searching for values in flat files, see: Products > Workbench > Discover Data
- Converting between file formats and data types, see: Products > NextForm
- Integrating, transforming, and reformatting files, see: Products > CoSort > SortCL
- Manipulating large data files in general, see: Solutions > Data Transformation
- Generating reports from your files, see: Solutions > Business Intelligence
- Protecting sensitive data in files, records, and fields, see: Solutions > Data Masking
- Prototyping applications safely with test data, see: Solutions > Test Data