Big Data, Small Change

Process Faster with or without Hadoop

  "More Hadoop projects will be swept under the rug as businesses devote major resources to their big data projects before doing their due diligence, which results in a costly, disillusioning project failure."

Gary Nakamura, Concurrent



  "Spend (on big data) wisely. Follow a CRAWL - WALK - RUN strategy."

Peter Aiken, Data Blueprint


To mine big data, you must smelt it first. Hadoop distributions and specialty software cannot access all the data you need, nor mash and prepare it thoroughly enough (cleansing, masking, reformatting). The IRI Voracity platform, on the other hand, can, and it handles both big and small data sets in a governed, map-once-deploy-anywhere BI/DW framework.


Choose between multi-threaded file system processing in the default CoSort engine, or run the same jobs in MR2, Spark, Spark Streaming, Storm, or Tez in HDFS ... using the same Eclipse job design and managed metadata infrastructure.

For more than three dozen years, IRI has been a proven performer in preparing and manipulating massive data from multiple sources across industries, geographies, and Unix/Windows platforms. Find out why you may only need:

  • one affordable product, the IRI Voracity platform, which discovers, integrates, migrates, governs, and analyzes data, all in:
  • one simple place, a free Eclipse GUI supporting a simple 4GL, and
  • one I/O pass, combining data transformation, protection, and reporting.
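The "one I/O pass" point above can be sketched generically in plain Python (a hypothetical illustration with made-up records, not IRI code or syntax): a single read of the input drives transformation (a sort), protection (masking a sensitive field), and reporting (a running aggregate) before anything is written back out.

```python
import hashlib

# Hypothetical input records: (customer, ssn, amount)
records = [
    ("beta", "123-45-6789", 40.0),
    ("alpha", "987-65-4321", 10.0),
    ("alpha", "987-65-4321", 25.0),
]

# Transformation: one sort over the data by customer key.
records.sort(key=lambda r: r[0])

masked_rows = []
totals = {}  # reporting aggregate, built during the same pass

for customer, ssn, amount in records:
    # Protection: irreversibly hash the SSN while the row is in memory.
    masked = hashlib.sha256(ssn.encode()).hexdigest()[:12]
    masked_rows.append((customer, masked, amount))
    # Reporting: accumulate a per-customer total in the same pass.
    totals[customer] = totals.get(customer, 0.0) + amount

print(totals["alpha"])  # 35.0 -- computed without a second read of the data
```

The point of combining the steps is that the data is only scanned once, instead of once per tool in a conventional ETL-then-mask-then-report chain.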

Here's what you can do with Voracity:

Big Data Packaging

Integrate (search, acquire, join, etc.), enrich (clean, remap, calc, etc.), and transform (filter, sort, aggregate, etc.) in HDFS or your file system.
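As a rough illustration of those packaging steps (hypothetical data and field names, not Voracity syntax), a join, an enrichment calculation, and a filter-plus-aggregate might look like this in Python:

```python
orders = [
    {"id": 1, "cust": "A", "qty": 3, "price": 5.0},
    {"id": 2, "cust": "B", "qty": 1, "price": 20.0},
    {"id": 3, "cust": "A", "qty": 2, "price": 5.0},
]
customers = {"A": "North", "B": "South"}  # lookup source to join against

# Integrate: join each order to its customer's region.
# Enrich: compute a derived "total" field.
enriched = [
    {**o, "region": customers[o["cust"]], "total": o["qty"] * o["price"]}
    for o in orders
]

# Transform: filter out small orders, then aggregate totals by region.
by_region = {}
for row in enriched:
    if row["total"] >= 10.0:
        by_region[row["region"]] = by_region.get(row["region"], 0.0) + row["total"]

print(by_region)  # {'North': 25.0, 'South': 20.0}
```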


Big Data Protection

Mask, encrypt, pseudonymize, de-identify, hash, or tokenize data as you transform and provision it.


Big Data Provisioning

Bulk load with pre-sorted data, create replicas and federated views, prepare (blend, munge) data for BI/analytic tools, write reports, feed BIRT or index Splunk directly, or create big test data.
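Pre-sorting before a bulk load pays off because most database loaders can append in order and build indexes faster when input already arrives sorted on the load key, instead of sorting server-side. A minimal sketch (hypothetical rows and key column):

```python
import csv
import io

rows = [
    (42, "widget"),
    (7, "gadget"),
    (19, "gizmo"),
]

# Pre-sort on the target table's clustered key (the first column here)
# so the bulk loader can skip its own sort phase.
rows.sort(key=lambda r: r[0])

buf = io.StringIO()
csv.writer(buf).writerows(rows)
load_file = buf.getvalue()  # hand this to the database's bulk loader
print(load_file.splitlines()[0])  # 7,gadget
```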

Voracity and all of its constituent IRI products use the same self-documenting 4GL program from IRI CoSort, called SortCL, for data definition, manipulation, masking, and reporting.
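For flavor, a SortCL job is a declarative text script. The fragment below is an illustrative approximation only (statement names and field layouts vary by version and are not copied from IRI documentation):

```
/INFILE=orders.csv
/PROCESS=CSV
/FIELD=(id, TYPE=NUMERIC, POSITION=1, SEPARATOR=",")
/FIELD=(customer, TYPE=ASCII, POSITION=2, SEPARATOR=",")
/SORT
  /KEY=customer
/OUTFILE=sorted_orders.csv
```

Because the same script format defines sources, targets, fields, and transforms, one job specification can document itself and move between engines.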

Design and manage your jobs in your choice of UIs from the same Eclipse IDE. Share, version-control, secure, and run the jobs from the GUI, or build them into batch scripts, applications, or distributed computing environments like Hadoop for even more speed and scalability.

Browse this section and its links for more details, or request a free trial.


Did you know?

  1. Voracity uses CoSort or Hadoop engines interchangeably, and CoSort pre-dated Hadoop in big data, with technology under development since 1978. IRI has used the term "big data" since 2004, across CoSort deployments in telco CDR data warehousing projects on both multi-core and distributed hardware, and long before that with other industry (banking and government) transaction files.

  2. CoSort, typically used for data transformation, staging, and reporting, can also do what its spin-offs do; i.e., data migration (IRI NextForm), data masking (IRI FieldShield), and test data generation (IRI RowGen).

  3. IRI Voracity uses the same metadata and Eclipse GUI (IRI Workbench) as CoSort and its spin-offs, but also lets you design and schedule jobs with state-of-the-art ETL workflow and built-in automation tools.

  4. Voracity users in the IRI Workbench GUI can view their HDFS files and contents, transfer data to and from HDFS, and auto-convert their transformation and masking job scripts (and batch flows). Execution, in the file system or in HDFS, can be driven on-demand or scheduled in the same GUI where you design and manage your jobs and metadata.