IRI @ Big Data Bootcamp

Discover, Integrate, Migrate, Govern & Analyze Data

Big Data Bootcamp - Tampa, FL


At the Big Data Bootcamp in Tampa, FL, on December 9th, IRI updated attendees on developments in its new end-to-end platform for big data management, Voracity.

Speaker Backgrounds


Since 1978, IRI, The CoSort Company, has continuously innovated in data processing software, and was already touting its handling of big data in 2003. David Friedland, the VP and COO, joined the company in 1988 and has focused on the data management product line and corporate growth. Also presenting was Don Purnhagen, IRI's CTO and in-house data scientist. He has been with the company for about 15 years and is the lead developer on Hadoop and other big data initiatives, including IoT and analytic platform integration.



Recent Related Work


We have been adding capabilities to the five core areas of the IRI big data management platform, Voracity: Data Discovery, Integration, Migration, Governance, and Analytics. Additionally, we've expanded into streaming data from web services, IoT devices, and Kafka, and added Amazon and Azure source and target compatibility. Other recent developments include further data governance initiatives in data classification and master data management, plus a new firewall that monitors, protects, and audits on-premises and cloud databases.

Top 5 IRI Big Data Use Cases

1. Customer segmentation and promotion based on analytics of IP (web) traffic and/or CDR (call) logs

2. Medical insurance claim integrity using our data transformation engine in a fraud detection data warehouse

3. Masking and pseudonymizing protected health information (PHI) in NoSQL documents and RDBMS tables

4. Generating massive test sets for Cassandra, MongoDB, Teradata, and HDFS to simulate production conditions

5. Revenue optimization and fleet efficiency in transport by churning through and using historical and operational data

Presentation Takeaways

Attendees learned the breadth, and some of the depth, of Voracity's scope and related it to their own data management life cycles. They asked questions about their big data challenges, including prominent topics like:

  • Data Discovery - how to search, profile, and classify data in files, tables, and documents    
  • Data Integration - how to rapidly extract, transform, and load data between heterogeneous silos or accelerate other ETL tools    
  • Data Migration - how to convert data from one type, file format, database table, or endian state to another    
  • Data Governance - how to cleanse, mask, and unify data as well as manage its metadata    
  • Data Analytics - how to report while transforming, or prepare data for tools such as R, Splunk, and Tableau