Re-map / Reformat

 

Modify Field and Record Layouts during Transformation

Challenges
 

Data transformation, reformatting, and reporting are often performed in slow, separate steps: for example, sort, then join or aggregate, then land the results in a flat file that gets handed off to a data scrubbing process. The output of those processes is then opened in a data mart, BI tool, etc. All of those I/O passes add up.

 

Sometimes complex languages like Perl or Python are used to recast data. Such programs can be hard to code and maintain over time and, at volume, may run too slowly.

Solutions

 

The SortCL data transformation program in the IRI CoSort data manipulation product or IRI Voracity data management platform maps data using source field names as symbolic references on output. This allows you to reformat, replicate, report on, and even virtualize (federate) data in the same job (and I/O pass) as ETL or other data migration operations. See the data sources (and targets) supported here.

Specifically, as SortCL maps fixed- or variable-position fields from input to output, it can re-map the values (i.e., re-position, re-size, align, trim, or pad them) and type-convert them. Additional custom layout options include changing fixed-position layouts to variable (floating) ones and vice versa.
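
To make the re-mapping idea concrete, here is a minimal Python sketch of the concept. It is not SortCL syntax and does not reflect how SortCL is implemented. The layout, field names, and widths are hypothetical. It slices a fixed-position record by field name, then trims, type-converts, and re-positions the fields, and writes them out in either a delimited or a new fixed-position layout.

```python
import csv
import io

# Hypothetical fixed-position input layout: (field name, start offset, size)
FIXED_LAYOUT = [("id", 0, 5), ("name", 5, 10), ("amount", 15, 8)]

def parse_fixed(line):
    """Slice a fixed-position record into a dict keyed by field name."""
    return {name: line[start:start + size] for name, start, size in FIXED_LAYOUT}

def remap(record):
    """Re-position, trim, and type-convert fields, referring to them by name."""
    return {
        "name": record["name"].strip(),     # trim padding
        "id": int(record["id"]),            # text to integer
        "amount": float(record["amount"]),  # text to float
    }

def to_delimited(rows):
    """Fixed to variable (floating) layout: emit comma-separated records."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for row in rows:
        writer.writerow([row["name"], row["id"], f"{row['amount']:.2f}"])
    return buf.getvalue()

def to_fixed(row):
    """Re-sized fixed layout: name left-aligned in 12, id zero-padded to 6,
    amount right-aligned in 10."""
    return f"{row['name']:<12}{row['id']:06d}{row['amount']:>10.2f}"

source_line = "00042Alice     00123.50"
row = remap(parse_fixed(source_line))
print(to_delimited([row]), end="")
print(to_fixed(row))
```

Because the output functions refer to fields by name rather than by position, the same parsed record can feed any number of target layouts.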

Here are some other things you can do at the same time:

  • Parse, strip, or rewrite header records on output. Insert special formatting characters and environment variables, including markup language commands for web-ready reports.
  • Evaluate mathematical expressions (cross-calculation) between field values, or on joined and/or aggregated values, to derive and output new detail or summary report values.
  • Create as many output targets and formats as you need in the same job script and I/O pass.
  • Reformat files from one type to another. For example, go from a COBOL index file to CSV and vice versa.
  • Append a "sequencer" field to each sorted record so you can cross-reference them by index values across multiple tables or files.
  • Populate targets directly through ODBC, pipes, or procedures, or feed them flat files for loading or further integration.
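
Several of the items above (multiple targets in one pass, cross-calculation, and a sequencer field) can be illustrated together. The following Python sketch shows the general pattern only. It is not a SortCL job script, and the record fields (`region`, `qty`, `price`) and the function name `run_job` are assumptions made for the example.

```python
import csv
import io

def run_job(records, detail_out, summary_out):
    """Sort once, then fill a detail target and a summary target in one pass."""
    detail = csv.writer(detail_out, lineterminator="\n")
    totals = {}
    ordered = sorted(records, key=lambda r: r["region"])
    for seq, rec in enumerate(ordered, start=1):    # seq is the sequencer field
        extended = rec["qty"] * rec["price"]        # cross-calculation
        detail.writerow([seq, rec["region"], rec["qty"], f"{extended:.2f}"])
        totals[rec["region"]] = totals.get(rec["region"], 0.0) + extended
    summary = csv.writer(summary_out, lineterminator="\n")
    for region, total in totals.items():            # aggregated values
        summary.writerow([region, f"{total:.2f}"])

records = [
    {"region": "west", "qty": 2, "price": 1.5},
    {"region": "east", "qty": 1, "price": 4.0},
    {"region": "west", "qty": 3, "price": 2.0},
]
detail_target, summary_target = io.StringIO(), io.StringIO()
run_job(records, detail_target, summary_target)
print(detail_target.getvalue(), end="")
print(summary_target.getvalue(), end="")
```

The in-memory targets here stand in for files, pipes, or database loads. The point is that the source is read and sorted once, however many outputs are produced.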