Data Integration Implementation = ETL Acceleration

Selected Operational Capabilities

ETL & Beyond

Does your ETL tool do everything you need it to do and in a way that seamlessly integrates with other critical data management activities?

If so, but you're not happy with its speed or price; or
if not, and you're looking for something better...

then you should check out IRI Voracity.

All the features/functions listed below are supported in the Voracity data management platform and its included IRI Data Manager and IRI Data Protector suite products.

GUI refers to the IRI Workbench Graphical User Interface for Voracity. IRI Workbench is a free Integrated Development Environment (IDE), built on Eclipse™, for integrating and transforming data with SortCL or seamless Hadoop jobs designed and managed in Voracity.

DTP refers to the Data Tools Plugin (and Data Source Explorer) in the IRI Workbench. DDF refers to Data Definition Files, IRI's simple, open metadata for source and target data layouts.

Learn how Voracity underlies and accelerates every major Data Integration paradigm:

Voracity in the DW/EDW

See more details here.

Voracity in the ODS/EDH

See more details here.

Voracity in the LDW/VDW

See more details here.

Voracity in the Data Lake

See more details here.

Voracity as a Production Analytic Platform

See more details here.

*Operation*	*Description*
Profile	Discover data through pattern, fuzzy, and dictionary searches of DBs, files, or "dark data" documents. Perform traditional DB profiling and E-R diagramming on connected tables. Auto-classify data into groups and match them to transformation, protection, and other rules.
Design	Create and modify jobs in multiple ways: a visual workflow palette, end-to-end wizards, GUI dialogs, batchable 4GL scripts, and even a metadata API, all of which are modeled and outlined in the GUI's syntax-aware editor. You can also edit your flow and task scripts in any external text editor.
Connectors	Manage your data assets (including RDBs, LDIF, CSV, XML, COBOL, and other files) from the Eclipse project explorer, data source explorer, and remote systems explorer. Support is also available for mainframe index files, unstructured data file formats, ASN.1-compatible CDRs, multiple legacy/proprietary formats, and big data and cloud/SaaS platforms; see the complete list here.
Job Wizards	Automate the generation of your ETL or standalone unload, transform, or load jobs, plus slowly changing dimension, change data capture, pivoting, subsetting, data masking, data migration / replication, and test data generation / population jobs.
ETL	High-performance, standalone or combined ETL operations in Voracity, i.e.: (1) ODBC (surgical) or IRI FACT (parallel bulk) extracts; (2) optimized and combined IRI CoSort data transforms; (3) ODBC (surgical) or DB utility (pre-sorted bulk) loads. If you have single sources >10TB, Voracity can also run many CoSort (SortCL) transformation, reformatting, and masking jobs seamlessly in Hadoop MapReduce2, Spark, Spark Stream, Storm, or Tez through the VGrid gateway to your (Cloudera, Hortonworks, MapR, or generic Apache) distribution.
ELT	High-performance "E" and pre-sorted "L". Design and manage "T" in Voracity (above) or in integrated SQL operations.
BI & Analytics	Generate detail and summary reports in the same pass, drive BIRT or Splunk from Voracity, or hand off data to another visualization tool.
Mask (Protect)	Encrypt, redact, pseudonymize, hash, randomize, tokenize, or otherwise de-identify PII seamlessly
Cleanse	Improve data quality with a variety of data scrubbing and standardization techniques
Migrate & Replicate	Acquire, filter, subset, re-map and/or copy data from old to new data stores
Team Share	Update, check in, manage, and share metadata and jobs in Git, CVS, SVN, AnalytiX DS Mapping Manager, etc.
Repositories	Save, share, and re-use DDF metadata, master data dictionaries, business glossaries, set (lookup) files, rules, flow, and job scripts
Data Views	See and work directly with your source and target data in files and tables in custom editor and cell displays
Schema	Use static, create dynamic, or convert schemas via target mapping and table creation options
Lineage	Free Eclipse plug-ins support manual data lineage and impact analysis, and AnalytiX DS Mapping Manager supports both visually. Track and compare metadata and other resources (scripts, rules, templates) in version control hubs.
Job Fragments	Save, reference, and re-use job and metadata subsets in standalone, portable .DDF files, rule libraries, and other open artifacts
Pivoting	Transpose rows to columns and columns to rows to de-normalize or normalize your data efficiently through an easy wizard
Change Data Capture	Compare files or tables to identify, report on, and feed updates for smaller, real-time ETL using an intuitive job wizard
Slowly Changing Dimensions	Report on values from "fuzzy" lookup logic, where they satisfy "other than equal" criteria, for all the common dimension types from one wizard
Windowed Aggregates	Perform aggregation within specified row ranges for fair cost accounting and other apps
Rules	Define, store, and re-use field-level business rules for data transformation, protection, and test data generation
Prototype & Test	Generate and load safe, realistic, and referentially correct test data in file or table targets -- without real data -- for an entire EDW in Voracity's built-in IRI RowGen wizard(s). Or, use Voracity's built-in DB subsetting wizard to filter and mask referentially correct DB test sets. Or, preview the output of ETL and other workflow tasks with real data, or immediately simulated test data in the same format.
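
For readers new to three of the operations in the table above (pivoting, change data capture, and windowed aggregates), here is a minimal, generic sketch in Python. It is not Voracity, SortCL, or IRI Workbench code; the function names (`pivot`, `capture_changes`, `windowed_sum`) and the record layout are hypothetical and only illustrate what each operation does to a small set of records.

```python
from collections import defaultdict


def pivot(rows, key, column, value):
    """Transpose rows to columns: emit one record per distinct key."""
    grouped = defaultdict(dict)
    for row in rows:
        # The value found in `column` becomes a new column name.
        grouped[row[key]][row[column]] = row[value]
    return [{key: k, **cols} for k, cols in grouped.items()]


def capture_changes(old, new, key):
    """Compare two snapshots and classify records as insert, update, or delete."""
    old_by_key = {row[key]: row for row in old}
    new_by_key = {row[key]: row for row in new}
    return {
        "insert": [r for k, r in new_by_key.items() if k not in old_by_key],
        "update": [r for k, r in new_by_key.items()
                   if k in old_by_key and old_by_key[k] != r],
        "delete": [r for k, r in old_by_key.items() if k not in new_by_key],
    }


def windowed_sum(values, preceding):
    """Sum each value with up to `preceding` rows before it (a sliding window)."""
    return [sum(values[max(0, i - preceding): i + 1]) for i in range(len(values))]
```

In this sketch, `capture_changes` returns only the rows that differ between the old and new snapshots, which is what keeps a downstream incremental load small; `windowed_sum(values, 1)` adds each row to the one before it.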

BBBT Podcast

How IRI and Voracity Help in DW/BI: listen now.