Data for Application DevOps

 

Stress Programs with Smart, Synthetic Test Data

Challenges

 

Applications and databases have their own logic and unique properties. For test data to be useful in these contexts, it must reflect production characteristics such as:

  • selection conditions (business rules)
  • column attributes and transformations
  • inter-field/key relationships (referential integrity)
  • value ranges and inter-column calculations

Production-quality test data must also have these attributes:

  • type - correct column / field values and formats
  • width - values with current (and future) ranges
  • frequency - realistic value occurrence patterns
  • depth - volumes that address scalability concerns

In a fast-paced DevOps and Continuous Integration (CI) / Continuous Deployment (CD) environments, the ability to generate and automate consistent, and realistic test sets that can range widely in format and volume can be a tall order, distracting programmers with tight delivery timelines.

Solutions

Applications developed with realistic data formats and volumes are more likely to succeed in production. The IRI RowGen test data creation package uses production metadata to synthesize custom test sets with randomly-generated data, and/or randomly selected data from production sources. IRI FieldShield and IRI DarkShield data masking software can also be used to find and mask sensitive data in production and move bespoke targets into lower environments for testing.

To generate the right values and value ranges, RowGen uses conditional selection and formatting parameters. RowGen further enhances test data realism through referential integrity, frequency distributions, and built-in transformation and formatting functions. For example, you can randomly select data and specify ranges from pools of real data and weighted numbers (respectively).


For Continuous Integration and Continuous Deployment or Delivery (CI/CD) environments, RowGen can synthesize test data at any step in a development process without depending on data being made available from another step. Moreover and uniquely, embedded data transformation, validation, and formatting functions can run simultaneously against the generated data in the same script! This can facilitate incremental application tests that assure backwards compatibility and forward compliance with your production releases. See
this use case.

Combine random generation and set-file selection, field-level conditions and manipulations, and custom layout features. Rapidly build the intelligent data you need to stress-test and vet your applications. Improve the quality and reliability of your deliverables. Schedule jobs to repeat generation and testing operations in IRI Workbench (or your own CLI-supporting automation tool) to smooth out CI, and enable CD, processes.

If you still prefer to test with data sitting in production DBs, you can also use RowGen to quickly subset and mask it. Or, you can mask an even broader range of sources with IRI FieldShield and IRI DarkShield. If you need multiple capabilities, or need to virtualize test data from a variety of static or streaming sources, check out the IRI Voracity data management platform platform and partners like Value Labs, and their Voracity-supporting Test Data Hub which provides test data on-demand.