SortCL Functionality


Combine Data Transformation, Migration, Masking & Reporting

What Can SortCL Do?


The Sort Control Language (SortCL) program in the IRI CoSort product or IRI Voracity platform accepts multiple inputs, including:

  • sequential (delimited or fixed-position), COBOL index, and semi-structured (flat JSON/XML) files
  • pipes
  • relational (and some NoSQL) database tables (collections) via ODBC
  • URLs for static and streaming sources, including S3/GCP/AzureBlob, HTTP/S, FTP/S, HDFS, MongoDB, Kafka, and MQTT
  • user procedures

in multiple formats, processes them in many ways, and produces one or more targets in multiple formats -- as well as customized reports -- all at once. See the table below and this diagram in the context of CoSort, or the data integration, migration, governance, and analytic portions of this diagram in the broader context of Voracity.


Specifically, SortCL can, in one job script and I/O pass, rapidly perform and combine data transformation, conversion, protection, reporting, and related processes:





At the byte, field, and record level, plus duplicate removal and saving


Conditional (include/omit) selection with if-then-else, else-if logic


Multiple keys, directions, sequences


Two or more pre-sorted files

Join (Match)

Two or more un/sorted sources on many conditions for ETL, file compares and change data capture (delta reporting) ops


Parallel roll-up and drill-down sum, min, max, average, and count values; accumulate (running); rank; lead and lag (sliding value windows)


Verify source data is pre-sorted prior to sort or join operations


Resize, reposition, and realign fields


Change data types (e.g., EBCDIC<>ASCII, Packed<>Numeric)


Convert between file formats (e.g., Text<>XML<>VS<>RS<>ISAM<>Vision<>LDIF<>CSV<>JSON)

Pivot / Unpivot

De-normalize and normalize dimensional layouts


De-duplicate, validate, homogenize, filter, find/replace, and re-structure


Integrate and segment data; enhance row and column detail; create new data forms and layouts through conversions, calculations and expressions, and composites (templates)

Migrate DBs

via remapping and replication of columns and tables


Math and trig functions across detail and summary rows, plus internal and external stats functions


Bit-level manipulations and Perl-compatible regular expression logic for pattern matching, etc.


Check that character and field attributes match their specifications (i.e., "iscompares", gap analysis)


For custom indexing, reporting, and database load operations, plus UUID/GUID value insertion

Set Lookup

Discrete field substitutions, pseudonymization, etc., using "set" file field dimensions

Fuzzy Lookup

For slowly changing dimension (SCD) reporting and data quality


Get discrete (lookup) values and virtualize results in reports and replicas

Mask (Protect)

Encrypt and mask data at the field level and audit data security measures; also anonymization, de-identification, filtering, and pseudonymization

Mask (Format)

Numeric and date layout masking to replace and customize new value formats


Discrete or random draws from set files for use in ETL lookup transforms, pseudonymization, and test data generation


Create randomly-generated or set-selected (safe) test data files (see RowGen)


Custom-formatted, segmented detail, and summary targets


Copy, manipulate, and move data from one or more sources to one or more targets


Complex field-level user functions (e.g., 3rd-party DQ libraries)
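To make the idea of combining several of these operations concrete, here is a minimal Python sketch of a multi-key sort feeding a running (accumulated) total, a lag value, and a per-group summary. It is illustrative only and is not SortCL syntax; the sample rows, the field order (region, month, sales), and the function name `summarize` are all hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (region, month, sales)
rows = [
    ("west", 2, 30),
    ("east", 1, 10),
    ("west", 1, 20),
    ("east", 2, 15),
]

def summarize(records):
    """Sort by region then month, and build two targets from the sorted data:
    detail rows carrying a running total and the previous row's value (lag),
    and one summary row per region with its count and sum."""
    detail, summary = [], []
    ordered = sorted(records, key=itemgetter(0, 1))  # multi-key sort
    for region, group in groupby(ordered, key=itemgetter(0)):
        running, previous, count = 0, None, 0
        for _, month, sales in group:
            running += sales   # accumulate (running total)
            count += 1
            detail.append((region, month, sales, running, previous))
            previous = sales   # lag: value from the prior row in the group
        summary.append((region, count, running))
    return detail, summary

detail, summary = summarize(rows)
```

The sketch produces a detail target and a summary target from the same sorted data, which mirrors the "multiple targets at once" idea described above, though at toy scale and without any of the product's file handling.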

Beyond data staging, manipulation, and migration, use SortCL to report on changed data (inserts, updates, deletes), slowly changing dimensions, and trend line intersection.
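As a rough illustration of what a changed-data (delta) report contains, the following Python sketch compares two keyed snapshots and classifies each key as an insert, update, or delete. It shows the general technique only, not how SortCL implements it; the snapshot contents and the function name `classify_changes` are hypothetical.

```python
def classify_changes(old, new):
    """Compare two snapshots keyed by record ID.

    Returns three dicts: records only in `new` (inserts), records in both
    whose values differ (updates, as old/new pairs), and records only in
    `old` (deletes)."""
    inserts = {k: new[k] for k in new.keys() - old.keys()}
    deletes = {k: old[k] for k in old.keys() - new.keys()}
    updates = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return inserts, updates, deletes

# Hypothetical before/after snapshots: record ID -> name
old_snapshot = {101: "Ann", 102: "Bob", 103: "Cy"}
new_snapshot = {101: "Ann", 102: "Rob", 104: "Dee"}

inserts, updates, deletes = classify_changes(old_snapshot, new_snapshot)
```

Unchanged records (ID 101 here) appear in none of the three results, which is what keeps a delta report small relative to the full data set.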

Additional SortCL features support metadata and master data management, clickstream analytics (data webhousing), real-time and near-real-time processing, customer data integration and segmentation, data wrangling (data preparation for BI and analytics), and data governance objectives.