Voracity Features & Benefits

Data Discovery, Integration, Migration, Governance, Analytics

You won't believe how much you can do!


IRI Voracity® is a full-stack data lifecycle management platform that seamlessly combines the best of CoSort®, Eclipse™, Hadoop®, and other best-in-class technologies.


Since 1978, IRI CoSort has accelerated and expanded data transformation work. It is a proven ETL optimizer, BI data preparer, DB load and query accelerator, data validation and quality tool, report generator, and data converter, and it can also mask PII and build test data.


Today, IRI Voracity harnesses the power of CoSort and Hadoop, and the familiarity and interoperability of Eclipse -- plus plug-ins like Erwin Mapping Manager and KNIME -- to deliver and enhance data discovery, integration, migration, governance, and analytics.


The sections below review the major data and enterprise information management features of Voracity. Unless designated as a third-party option with an asterisk (*), everything listed is included in the base license.

Job Design

Job Design Features



New job wizards

auto-creates ETL, migration, masking, test data, and reorg scripts

hand-held, step-by-step job creation

Dialog and form editors

point and click job specification from the toolbar or script

easier parameter definition and modification

Job/mapping diagrams

intuitive GUI and palette for ETL mapping and workflow scripts

fast and easy job design and review

Syntax-aware editors

write, modify and validate: IRI DDF, job scripts, SQL, and Java

delight code-centric data and ETL architects

Script menus and outlines

drill-down GUI views of, and dialog interactivity with, job scripts

bypass the need to learn the 4GL

DDF and SCL metadata

self-documenting, cross-IRI-job-, MIMB- and ADSMM-compatible

easy, interoperable, batchable CLI code

EMF and XML metadata

self-documenting, re-entrant EMM infrastructure (see below)

modify jobs in any UI and update all

"Gulfstream" API/SDK

documented JARs for IRI flow and metadata to/from applications

easy EAI, easy enter/exit for IRI XML

Erwin Mapping Manager *

spreadsheet-style, code-free source-target mapping definitions

leverage existing skills and repositories

Erwin CatFx and LSC *

auto-migrate ETL tool and SQL metadata to IRI Voracity

leave megavendor ETL tools sooner

Data Discovery

Data Discovery Feature



Data Classification

define and manage enterprise-wide data class libraries

search and apply rules across multiple sources at once

Structured data discovery

preview sequential file and DB layouts, define IRI DDF

automated metadata creation

DDF conversion and import

CPY, CSV, LDIF, ODBC and XML to DDF migration in CLI/GUI

automated metadata conversion

DB & file profiling - statistics

select/report on column or field values, counts, lengths, duplicates, etc.

automated, custom table analysis

DB & file pattern search

finds values in tables or flat files conforming to defined Java RegEx patterns

auto-find and protect numeric PII

DB & file string search

find explicit or dictionary-matching strings (names) in tables or flat files

auto-find and protect individuals

DB & file fuzzy search

finds near matches to defined probabilities with multiple algorithms

facilitate masking, BI, DQ, and MDM

DB integrity check

compare foreign keys with primary key in each defined column

easily maintain referential integrity

Cross-DB E-R diagramming

create and customize views of tables and relationships in any DB

visualize layouts of multiple DBs

Dark data discovery

find/report on pattern-matching values and forensic info in documents

expose dark data and metadata

Dark data structuring

extract and structure found values into flat files, create DDF and EIF

integrate/curate unstructured data
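
As a rough illustration of the pattern search row above, the sketch below scans rows for values that match a regular expression. It is not Voracity code: the page says Voracity uses Java RegEx patterns, while this sketch uses Python's `re` module, and the pattern, the `find_pattern_matches` helper, and the sample rows are all hypothetical.

```python
import re

# Hypothetical pattern for US Social Security numbers (illustrative only;
# not taken from any Voracity pattern library).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_pattern_matches(rows, pattern):
    """Return (row_index, column_name, matched_text) for every match."""
    hits = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            for match in pattern.finditer(str(value)):
                hits.append((index, column, match.group()))
    return hits


# Made-up sample rows standing in for a table or flat file.
sample_rows = [
    {"name": "Pat Doe", "note": "SSN 123-45-6789 on file"},
    {"name": "Sam Roe", "note": "no identifier recorded"},
]

matches = find_pattern_matches(sample_rows, SSN_PATTERN)
```

Each hit records where the value was found, which is the kind of location report a later masking step could act on.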

Data Integration

Data Integration Feature



Multiple source connections

manipulate and mash-up structured and unstructured sources

correlate internal and external data

Multiple target definitions

update and bulk-load: tables, files, pipes, procedures, and reports

single I/O, synchronized data

FAst extraCT (FACT) *

parallel unload Oracle, DB2, MySQL, SQL Server, Sybase, etc.

faster ETL, reorg, migration, archive

CoSort transform engine

resource-optimized, auto-tuned, single-pass sort, join, aggregate, etc.

relieves BI/DB tool, precludes more HW

CoSort (SortCL) 4GL DDL/DML

one-script / one-pass: transformation, cleansing, masking, mapping, reporting

consolidates products, simplifies metadata, and saves I/Os

Hadoop transforms *

seamless options for MapReduce 2, Spark, Spark Stream, Storm or Tez processing

unlimited scalability, commodity servers

Data mapping

format endianness, fields, records, files, and tables; add surrogate keys, etc.

supports ETL, federation, replication

DB DDL and loader compatibility

automated table creation and (pre-CoSorted) load script definition

faster target table creation and loading

Data cleansing and validation

find, filter, unify, replace, validate, regulate, standardize, synthesize

high quality data = reliable ETL and BI

Data Migration

Data Migration Feature



Data type conversion

convert between alphanumeric, date/time, and multi-byte formats

speed platform migration

File format conversion

convert to/from fixed/delimited, LDIF, MF-ISAM and Vision, VB, XML, etc.

support application migration

Endian conversion

big, little and BOM recognition or change at field and file levels

ease mainframe migration

DB table/schema conversion

profile existing DBs, build and populate new schema and tables with mapped data

facilitate DB vendor migration

Dark data structuring

search and extract pattern-matched strings from MS documents, PDF, etc.

unlock and use unstructured data

Data federation

process data in place (LDW) and/or send mashup results to the console or service apps

display ad hoc values and views without centralization or persistence

Data replication

copy file or table data selectively to new formats or ETL patterns

ease the re-use of legacy data

JCL data redefinition

recognize and convert mainframe sort parameters during sort migration

leave legacy sort software faster
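
To make the field-level endian conversion listed above concrete, here is a minimal sketch that reverses the byte order of one fixed-position binary field in a record. It only illustrates the general technique and is not how Voracity or CoSort implements it; the `swap_field_endianness` helper and the 6-byte record layout are assumptions made up for this example.

```python
def swap_field_endianness(record: bytes, offset: int, length: int) -> bytes:
    """Reverse the byte order of one fixed-position binary field."""
    field = record[offset:offset + length]
    return record[:offset] + field[::-1] + record[offset + length:]


# Made-up 6-byte record: a 4-byte big-endian integer followed by 2 text bytes.
big_endian_record = (1000).to_bytes(4, "big") + b"OK"

# Swap only the integer field; the text bytes are left untouched.
little_endian_record = swap_field_endianness(big_endian_record, 0, 4)

# Reading the swapped field as little-endian recovers the original value.
value = int.from_bytes(little_endian_record[:4], "little")
```

Swapping per field rather than per file matters because text fields in the same record must keep their byte order.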

Data Governance

Data Governance Feature



Metadata Management (EMM)

create, share, and track robust data definition and rule infrastructure in Eclipse

reuse data and govern information

Master Data Management (MDM)

develop, define, unify, modify, bucket, use, and share master data

360° view of customers and other data

Metadata asset management

master data and metadata lineage, impact analysis, and version control in Git or Erwin Edge *

secure, shared, graphical data lineage

Data masking

search and secure PII with static and dynamic masking functions in 13 categories, automatic auditing

protect and comply, secure BI/DW ops

Re-ID risk scoring

measure and report on the risk of quasi-identifiers, then anonymize them

comply with HIPAA EDM, etc.

Database subsetting

define, build, and mask subsets of production databases for testing

rapid, agile prototyping of relational data

Data quality

discover, de-duplicate, filter, fuzzy search, integrity-check, standardize

improve MDM, ETL, analytic reliability

Encryption key management

store, rotate and manage field-level encryption keys with built-in or web / HSM key vault technology

improves decryption and restoration security

Test data generation

parse, generate, and load synthetic DB, file, mart and report targets

faster, compliant EDW/app prototyping

COBIT support

support majority of COBIT objectives through myriad data lifecycle management tasks

align data management with risk control

Compliance services

train for compliance, and work with independent, HIPAA-qualified statistician and attorneys for verification and breach insurance/defense

assess HIPAA and PCI gaps, compliance


BI & Analytics

Analytic Feature



Embedded reporting

custom detail and summary BI targets with math, transforms, masking, etc.

report while transforming

Change data capture

find inserts, updates, deletes, non-changes, and value deltas from data, not logs

multi-input, custom output

Slowly changing dimensions

update and report on changes in data using fuzzy logic

use SCD data in BI, DI

Clickstream and CDR support

transform, convert, mask, federate, and report on data in web log and ASN.1 formats

bypass mediation software

Data segmentation

custom-select and silo data during integration, masking, quality, and reporting


BIRT, KNIME and Splunk integrations

feed IRI data results directly into plug-ins for immediate reporting, deep learning, analytics, and advanced display and action

faster, free visual BI in Eclipse

BI/analytic tool data preparation

wrangle raw data and hand off display-ready results to speed results from BOBJ, Cognos, MicroStrategy, Oracle (OBIEE, DV/D or OAC), Power BI, QlikView, R, Splunk, Spotfire, Tableau, etc.

remove DI from the BI layer

JupiterOne * Analytics

support real-time visualizations using SQL syntax and Spark processing in Voracity

immediate results, bypass ETL
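
The change data capture row above describes finding inserts, updates, deletes, non-changes, and value deltas from the data itself rather than from logs. One generic way to do that is to compare two snapshots by key, sketched below. This is an assumption-laden illustration, not Voracity's method: the `capture_changes` helper, the `id` key, and the sample rows are hypothetical, and both snapshots are assumed to share the same fields.

```python
def capture_changes(old_rows, new_rows, key):
    """Classify rows as inserts, updates, deletes, or unchanged by key."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}
    changes = {"inserts": [], "updates": [], "deletes": [], "unchanged": []}

    for k, new_row in new_by_key.items():
        old_row = old_by_key.get(k)
        if old_row is None:
            changes["inserts"].append(new_row)
        elif old_row != new_row:
            # Keep only the fields whose values differ: (old, new) pairs.
            delta = {
                field: (old_row.get(field), value)
                for field, value in new_row.items()
                if old_row.get(field) != value
            }
            changes["updates"].append({key: k, "delta": delta})
        else:
            changes["unchanged"].append(new_row)

    for k, old_row in old_by_key.items():
        if k not in new_by_key:
            changes["deletes"].append(old_row)
    return changes


# Made-up snapshots of the same table taken at two points in time.
old_snapshot = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}, {"id": 3, "qty": 9}]
new_snapshot = [{"id": 1, "qty": 5}, {"id": 2, "qty": 8}, {"id": 4, "qty": 1}]

result = capture_changes(old_snapshot, new_snapshot, "id")
```

Holding both snapshots in dictionaries keeps the sketch short; sorted inputs compared in a single merge pass would suit larger data better.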