Data Pseudonymization


De-Identifying & Shuffling Names or Nouns with Realism

Challenges

 

When masking data or producing useful test data, you need output values that look real but do not reveal personally identifiable information (PII). This is particularly true of the names of people, places, and things.

 

Encryption, de-identification, and other masking, hashing, and bit-level obfuscation functions protect data at risk, but they do not provide the level of realism certain recipients require. You need an easier way to change the individualizing characteristics of data by substituting a different, but realistic, output value.

 

You must also ensure that the real name cannot be readily discovered through reversal or guesswork.

Solutions

 

If you work with PII in tables or flat files, use IRI FieldShield (or the SortCL program in the IRI CoSort product or IRI Voracity platform) to replace that data with safe but realistic replacement values that are stored in DB tables or in external data sets called set files. If you need to do the same with ranges in Excel, use IRI CellShield; for unstructured data sources, use IRI DarkShield. They support:

Recoverable Pseudonymization

Specify a lookup set where real and fake names are either pre-associated, or automatically associated at random. Use the restore set to recover the original names.
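The lookup and restore sets described above can be pictured as a pair of mappings. The following is a minimal Python sketch of the idea only, not FieldShield or SortCL syntax: the function names (`build_lookup_set`, `build_restore_set`) and the sample names are hypothetical, and the sets are held in memory rather than in a DB table or set file.

```python
import secrets

def build_lookup_set(real_names, fake_names):
    """Randomly associate each distinct real name with a unique fake name.

    Hypothetical helper for illustration; not an IRI API.
    """
    rng = secrets.SystemRandom()  # OS-level randomness, harder to guess than a seeded PRNG
    distinct_real = sorted(set(real_names))
    # Drop duplicates and any candidate that is itself one of the real names.
    candidates = sorted(set(fake_names) - set(real_names))
    if len(candidates) < len(distinct_real):
        raise ValueError("not enough distinct fake names for the real names")
    chosen = rng.sample(candidates, len(distinct_real))
    return dict(zip(distinct_real, chosen))

def build_restore_set(lookup_set):
    """Invert the lookup set so pseudonyms can be mapped back to originals."""
    return {fake: real for real, fake in lookup_set.items()}

original_names = ["Alice Smith", "Bob Jones", "Alice Smith"]
fake_pool = ["Carol White", "Dan Brown", "Eve Black"]

lookup_set = build_lookup_set(original_names, fake_pool)
masked_names = [lookup_set[name] for name in original_names]

restore_set = build_restore_set(lookup_set)
recovered_names = [restore_set[name] for name in masked_names]
```

In this sketch the same real name always receives the same pseudonym, so joins on the masked column still line up. Anyone holding the restore set can reverse the masking, so it would need the same protection as the original data.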

Unrecoverable Pseudonymization

Randomly select substitute names for the original values from a set file containing real or fake names. Because the substitute is chosen at random, the original name has no automatic basis for restoration.
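As a rough illustration of the unrecoverable approach, the Python sketch below draws an independent random substitute for every value and keeps no mapping. It is an assumption-laden example rather than FieldShield syntax: the function name `pseudonymize_unrecoverable` and the sample names are hypothetical, and the set file is represented by an in-memory list.

```python
import secrets

def pseudonymize_unrecoverable(names, substitutes):
    """Replace each name with a randomly chosen substitute and keep no mapping.

    Hypothetical helper for illustration; not an IRI API.
    """
    if not substitutes:
        raise ValueError("the substitute set must not be empty")
    # Each draw is independent, so nothing links an output back to its input.
    return [secrets.choice(substitutes) for _ in names]

original_names = ["Alice Smith", "Bob Jones", "Alice Smith"]
substitute_set = ["Carol White", "Dan Brown", "Eve Black", "Frank Green"]

masked_names = pseudonymize_unrecoverable(original_names, substitute_set)
```

One consequence of this design is that repeated occurrences of the same real name may receive different substitutes, so consistency across rows or tables is not preserved. If the substitute set contains real names, a substitute can also coincide with an original by chance.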


Specify the pseudonymization method for your output fields in simple 4GL job scripts, or use the pseudonymization dialog in the FieldShield GUI or the DarkShield wizards, both in the same Eclipse™ IDE. CellShield also supports pseudonymous lookup replacements of values in Excel.

Pseudonymization is only one of the methods that FieldShield can use to de-identify information in a record, and it can combine pseudonyms with other field-level data security functions.

Need Test Names?

In addition to pseudonymizing and otherwise masking production data, there is a standalone solution for producing safe but realistic first and last names of either gender (or other nouns). IRI RowGen uses the same metadata as FieldShield (and SortCL) to create and format pseudonyms for use as test data values (or in formatted test data targets).

RowGen is especially useful for providing anonymous, but real-looking, test data when production data is unavailable or insufficient. RowGen builds structurally and referentially correct test data into database, file, and report targets. RowGen is also included in Voracity.
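To make the idea of generated test names concrete, here is a small Python sketch that combines random picks from a first-name list and a last-name list. It does not reproduce RowGen or its metadata: the lists, the function name `generate_test_names`, and the optional `seed` parameter are all assumptions made for illustration.

```python
import random

# Hypothetical name lists standing in for first-name and last-name sets.
FIRST_NAMES = ["Maria", "James", "Wei", "Aisha"]
LAST_NAMES = ["Garcia", "Miller", "Chen", "Khan"]

def generate_test_names(count, seed=None):
    """Build `count` full names by pairing random first and last names.

    A seed makes the output repeatable, which can help when test runs
    need to be compared; omit it for a different set on each run.
    """
    rng = random.Random(seed)
    return [
        f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"
        for _ in range(count)
    ]

test_names = generate_test_names(5, seed=42)
```

Because no production data is read at any point, names produced this way carry no link to real individuals beyond coincidence.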