Why RowGen Is Better


Consider Key Advantages and Differentiators

IRI RowGen is the only tool designed for big, structured and semi-structured data environments with multiple targets requiring safe, realistic test data.

 

RowGen produces huge test data volumes for databases, BI and DW ETL/ELT tools, and custom report formats. RowGen's simple batch execution paradigm is database- and platform-agnostic, making it ideal for customizing and automating test data generation across use cases and operating systems, and for eliminating the need for other operations like data masking.

RowGen is nonetheless directly compatible with IRI FieldShield data masking jobs -- and with IRI Voracity ETL, replication, and reporting operations. This is convenient if you want to synthesize test data from scratch and work with it in the same environment. All of these products run in the same Eclipse pane of glass (IRI Workbench) and share the same metadata and scripting language.

Here, however, are the main reasons most data architects and VLDB administrators consider RowGen to be the best package for building test data for databases and custom file and report targets:

Compliance

Unlike most test data tools, RowGen does not require access to original production files or databases. RowGen generates data at random in the right ranges and formats, and can also make safe use of real data through random selections from real tables or set files. It lets you create data based strictly on common database DDL, or on file/report metadata in IRI DDF formats. In the process, it preserves the structural and referential integrity of production data and reflects its value appearance, range, volume, and distribution characteristics.
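
The idea of combining random values in set ranges and formats with random pulls from set files can be sketched in a few lines of Python. This is only an illustration of the concept, not RowGen job syntax: the field names, the value ranges, and the `load_set` and `make_row` helpers are all hypothetical, and the in-memory lists stand in for set files that hold one value per line.

```python
import random
from datetime import date, timedelta


def load_set(lines):
    """Treat each non-empty line as one candidate value (a stand-in for a set file)."""
    return [ln.strip() for ln in lines if ln.strip()]


def make_row(rng, first_names, cities):
    """Build one synthetic record; no production data is read."""
    start = date(1950, 1, 1)
    span_days = (date(2005, 12, 31) - start).days
    return {
        # Random pulls from the (hypothetical) set data.
        "name": rng.choice(first_names),
        "city": rng.choice(cities),
        # Random values constrained to a range and a display format.
        "birth_date": (start + timedelta(days=rng.randint(0, span_days))).isoformat(),
        "balance": f"{rng.uniform(0, 10000):.2f}",
        "phone": f"{rng.randint(200, 999)}-555-{rng.randint(0, 9999):04d}",
    }


rng = random.Random(42)  # fixed seed so a test run is repeatable
names = load_set(["Ana", "Bao", "Chidi", ""])
cities = load_set(["Lyon", "Osaka", "Recife"])
rows = [make_row(rng, names, cities) for _ in range(5)]

for row in rows:
    print(row)
```

Seeding the generator is a design choice for repeatable test runs; dropping the seed gives different data on every run.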

RowGen can automatically, accurately, and reliably build and populate test data for an entire VLDB platform or enterprise data warehouse from scratch, without the need for data masking, NDAs, learning curves, or consulting engagements. Built-in audit logs and available version control support compliance with data privacy laws and safe project outsourcing, respectively.

Conformance

RowGen test data conforms to production data characteristics. RowGen delivers realism to database application development and prototyping through:

  1. preservation of referential integrity (primary-foreign key links and various constraints)
  2. determination of row counts per table and null occurrences
  3. customization of value appearances, ranges, and frequency distributions
  4. transformation of test data during generation (e.g., filter, sort, join, aggregate, convert)
  5. generation of all, or only valid (joined), pairwise value combinations
  6. invocation during database virtualization (cloning) operations in Windocks, Commvault, and Actifio
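
As a rough illustration of the first three items above (referential integrity, row counts and null occurrences, and frequency distributions), here is a minimal Python sketch. It does not show how RowGen implements these features; the table names, columns, weights, and the `build_tables` helper are assumptions made up for the example.

```python
import random


def build_tables(rng, n_customers, n_orders, null_rate):
    """Generate a parent and a child table whose foreign keys always resolve."""
    # Parent table: row count is set by the caller; tier follows a weighted distribution.
    customers = [
        {
            "customer_id": i,
            "tier": rng.choices(["basic", "plus", "pro"], weights=[70, 25, 5])[0],
        }
        for i in range(1, n_customers + 1)
    ]
    parent_keys = [c["customer_id"] for c in customers]

    # Child table: every foreign key is drawn from existing parent keys,
    # and the coupon column is left null at roughly the requested rate.
    orders = []
    for order_id in range(1, n_orders + 1):
        orders.append(
            {
                "order_id": order_id,
                "customer_id": rng.choice(parent_keys),
                "coupon": None if rng.random() < null_rate else rng.choice(["SPRING", "VIP"]),
            }
        )
    return customers, orders


rng = random.Random(7)
customers, orders = build_tables(rng, n_customers=50, n_orders=500, null_rate=0.3)

orphans = [o for o in orders if o["customer_id"] not in {c["customer_id"] for c in customers}]
print(len(customers), len(orders), len(orphans))
```

Because child keys are sampled only from keys that already exist in the parent table, the orphan list is empty by construction.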

RowGen delivers realism to software application development with the above, and through:

  1. modification of test file formats, record layouts, and report templates
  2. compound data value definitions or format masks
  3. conditional display of constants and variables, like time/date stamps
  4. random pulls from real set data, inline sets, and literal value ranges as field-level rules
  5. math and custom computation functions (e.g., to create valid national ID values)
  6. integration with IRI DarkShield to synthesize test document and image files (unstructured targets)
  7. invocation within multiple CI/CD (DevOps) platforms on-premises or in the cloud
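
To show what a computed, structurally valid ID value can look like (item 5 above), the following Python sketch appends a Luhn check digit to a random payload. The Luhn checksum is used here only as a familiar example of a check-digit rule; which algorithm a given national ID scheme requires, the nine-digit length, and the function names are all assumptions, and this is not RowGen's own function library.

```python
import random


def luhn_check_digit(payload: str) -> int:
    """Return the digit that makes payload + digit pass the Luhn check."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def luhn_is_valid(number: str) -> bool:
    """Validate a full number, check digit included."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def random_id(rng, length=9):
    """Random digits plus a computed check digit (hypothetical ID layout)."""
    payload = "".join(str(rng.randint(0, 9)) for _ in range(length - 1))
    return payload + str(luhn_check_digit(payload))


rng = random.Random(2024)
ids = [random_id(rng) for _ in range(5)]
print(ids)
```

Values built this way pass format and checksum validation in the application under test without being copied from any real person's record.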

Convenience

Other test data tools are strictly GUI-driven, and the processes of job design and maintenance can be complex or require consulting. The metadata behind their creation and population of test data may also be cryptic or hidden. RowGen, by contrast, exposes and holds its data generation, manipulation, and population metadata in simple, portable text files that are easy to create -- either by hand or automatically with wizards in the IRI Workbench GUI for RowGen, built on Eclipse™.

Your editor or the GUI will show how easy RowGen jobs are to learn, export, extend, and run. RowGen GUI users in IRI Workbench also share the same metadata and job scripting syntax with the data transformation (integration and staging), data profiling, data masking, reporting, and advanced BI functions available in the IRI Voracity platform ecosystem.

And if you use an existing CI/CD pipeline for DevOps and need test data, RowGen jobs can be called directly from AWS CodePipeline, Azure DevOps Pipelines, Jenkins, GitHub, and GitLab (see examples on the iri.com blog), as well as from Bitbucket, CircleCI, Travis, and more.

Performance

Only RowGen has the high-volume, high-performance data generation, manipulation, and pre-sorted bulk-loading capabilities of CoSort that DBAs and big data architects require to test queries, applications, and operations across billions of rows.

Only RowGen can also meet the test data and compliance requirements of outsourcers and software developers who rely on safe, intelligent test data in multiple flat file and custom report formats. RowGen not only runs faster than other test data generators, it also saves time in learning, test data definition, deployment, and project management.

Compatibility

RowGen is the only test data package compatible with Eclipse, and the easiest to configure with any relational database supported by the Eclipse Data Tools Platform (DTP) plug-in. It is also a component of the larger IRI Voracity data management platform in the same pane of glass (IRI Workbench), where data masking, DB subsetting, and other techniques can be combined with value generation to satisfy an unlimited range of use cases requiring test data.

RowGen job scripts are compatible with the metadata of the CoSort SortCL program for data transformation and reporting, as well as IRI NextForm for data and database migrations, and IRI FieldShield for data masking. Thus, if you have data or job definitions for any of those tools, you can re-use them in RowGen (and vice versa when you're in production).

Only RowGen is compatible with Dan Linstedt's Data Vault 2.0 architecture, the Meta Integration Model Bridge (MIMB), and the Erwin (formerly AnalytiX DS) Mapping Manager frameworks for exchanging test data metadata with third-party data modeling and ETL architectures.