Safe Harbour Anonymization

De-Identification through Obfuscation

Obfuscate PHI and More


Do you need to redact or disguise protected health information (PHI) or other personally identifiable information (PII)? And, can you do it in a way that:

  • is safe from hacking or bypass (inferring the information from other data)?
  • preserves the original column and field layouts (position, size, data type)?
  • keeps the anonymized data looking real enough for testing purposes?
  • is convenient, simple, efficient, and affordable?
  • will comply with the HIPAA Safe Harbour rule?
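As a rough illustration of the layout-preserving requirement in the list above, the following Python sketch masks a value while keeping its length, separators, and character classes. It is a generic one-way substitution built on a keyed hash, not IRI's implementation and not a standardized format-preserving encryption scheme; the function name `mask_preserving_format` and the demo secret are assumptions made for this example.

```python
import hashlib
import hmac
import string

def mask_preserving_format(value: str, secret: bytes) -> str:
    """One-way mask that keeps length, punctuation, and character classes."""
    # Keyed digest of the whole value: same input and secret give the same mask.
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    masked = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]  # digest bytes repeat for values over 32 chars
        if ch in string.digits:
            masked.append(string.digits[b % 10])
        elif ch in string.ascii_uppercase:
            masked.append(string.ascii_uppercase[b % 26])
        elif ch in string.ascii_lowercase:
            masked.append(string.ascii_lowercase[b % 26])
        else:
            masked.append(ch)  # separators and non-ASCII characters pass through
    return "".join(masked)

# Hypothetical SSN-shaped value; the result keeps the NNN-NN-NNNN layout.
print(mask_preserving_format("123-45-6789", b"demo-secret"))
```

Because the mask is deterministic for a given secret, the same source value maps to the same masked value across tables, which helps keep joins intact in test data. The trade-off is that anyone holding the secret can confirm a guessed value, so the secret needs the same care as an encryption key.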

With IRI FieldShield, you can easily classify, find, and remove or otherwise de-identify the key identifiers (and anonymize the quasi-identifiers) in database columns, and in fields within structured and JSON files. With IRI CellShield, you can do the same in Excel 2010 and later spreadsheets. And with IRI DarkShield, you can do the same for PHI values in unstructured files like HL7 and X12, PDF and Word, Excel and PowerPoint, and image files (including burned-in PHI in DICOM formats). All of these data masking products are part of the IRI Data Protector suite, or included free within the IRI Voracity total data management platform.


The data de-identification / obfuscation / masking method you choose will determine the appearance of the masked results, and the likelihood of recovering the original values. See this article ("Which Data Masking Function Should I Use?") for advice.


IRI's data masking functions in general (and below, field-level encryption in particular) apply to HITRUST CSF and, as shown below, the HIPAA Privacy Rule for covered entities and business associates. Irreversible obfuscation or outright removal of key identifiers will satisfy the Safe Harbour rule. If you need to preserve quasi-identifiers but further anonymize them through blurring or generalization to comply with the HIPAA Expert Determination Rule, see.
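To make the difference between removing key identifiers and generalizing quasi-identifiers concrete, here is a minimal Python sketch. It covers only a few of the identifier categories in the HIPAA Safe Harbour list (three-digit ZIP codes, year-only dates, and ages of 90 and above grouped together), and the column names and the `sparse_zip3` parameter are hypothetical. It is not a complete Safe Harbour implementation and does not represent how IRI products work internally.

```python
from datetime import date

# Hypothetical column names treated as key identifiers and dropped outright.
DIRECT_IDENTIFIERS = {"name", "ssn", "phone", "email"}

def deidentify(record, sparse_zip3=frozenset()):
    """Drop key identifiers and generalize a few quasi-identifiers."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # outright removal
        if field == "zip":
            prefix = str(value)[:3]
            # Prefixes for sparsely populated areas (supplied by the caller)
            # are replaced with 000; others keep only the first three digits.
            out[field] = "000" if prefix in sparse_zip3 else prefix
        elif field == "age":
            # Ages of 90 and above collapse into one category.
            out[field] = "90+" if int(value) >= 90 else int(value)
        elif isinstance(value, date):
            out[field] = value.year  # keep the year only
        else:
            out[field] = value
    return out

record = {"name": "Ann Example", "ssn": "123-45-6789", "zip": "90210",
          "age": 93, "admitted": date(2020, 5, 17), "diagnosis": "J45"}
print(deidentify(record))
```

The list of sparsely populated ZIP prefixes is left to the caller because it depends on current census data.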

45 CFR 164.312, Technical Safeguards

Implement technical policies and procedures to limit EPHI access to only "those persons or software programs that have been granted access rights." These systems must allow for unique user identification, emergency access, automatic logoff, and encryption and decryption.

With column/field-level control, you can use multiple encryption libraries and keys (pass phrases) for field-specific, need-to-know decryption entitlements.
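The idea of field-specific keys can be sketched in Python as follows. Each field gets its own key derived from its own pass phrase, so a user who holds only one pass phrase can decrypt only that field. The cipher here is a toy keystream built from HMAC-SHA256 so that the example runs on the standard library alone; it stands in for a vetted algorithm such as AES and should not be used to protect real data. All function names, field names, and pass phrases are assumptions made for this illustration.

```python
import hashlib
import hmac
import os

def derive_key(passphrase: str, field: str) -> bytes:
    # The field name doubles as the salt here for brevity; a real system
    # would typically store a random salt per key.
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode(), field.encode(), 100_000)

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy counter-mode keystream: HMAC(key, nonce || counter) per 32-byte block.
    blocks, counter = [], 0
    while 32 * len(blocks) < length:
        blocks.append(hmac.new(key, nonce + counter.to_bytes(8, "big"),
                               hashlib.sha256).digest())
        counter += 1
    return b"".join(blocks)[:length]

def encrypt_field(key: bytes, plaintext: str) -> bytes:
    nonce = os.urandom(16)  # fresh nonce per value, stored with the ciphertext
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return nonce + bytes(a ^ b for a, b in zip(data, stream))

def decrypt_field(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

# Two fields, two pass phrases: a billing clerk might hold only the first.
ssn_key = derive_key("billing-passphrase", "ssn")
dx_key = derive_key("clinical-passphrase", "diagnosis")
blob = encrypt_field(ssn_key, "123-45-6789")
print(decrypt_field(ssn_key, blob))            # recovers the original bytes
print(decrypt_field(dx_key, blob) == b"123-45-6789")  # wrong key does not
```

`decrypt_field` returns bytes rather than a string because decrypting with the wrong key generally yields data that is not valid text.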

* Transmission security, including two addressable specifications:

  1. Integrity controls -- security measures to ensure that electronically transmitted PHI is not improperly modified without detection until disposed of, and
  2. Encryption -- the designation of encryption as an addressable specification is a key departure from the proposed rule, which explicitly required encryption when using open networks. Covered entities now must determine how to protect EPHI "in a manner commensurate with the associated risk."

FieldShield makes encryption another option for field-level protection within tables and files, along with filtering, anonymization, and pseudonymization, while CellShield does the same in Excel. The CoSort SortCL program and the various Hadoop masking engines that run interchangeably in the Voracity platform do the same, even while running high-volume manipulations and reports against massive data sources.

* Hardware, software, and/or procedural methods for providing audit controls

Optional application statistics, and a query-ready XML audit log, record the job script and encryption libraries used to show what, when, how, and by whom the PHI field data was encrypted (and otherwise protected and/or transformed).
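To show what a query-ready XML audit trail can look like in principle, the Python sketch below appends one entry per masking job and then filters the log by user. The element names (`audit`, `job`, `user`, `time`, `script`, `field`, `function`), the script file name, and the helper `add_audit_entry` are all invented for this example; they do not reflect the actual layout of the IRI audit log.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

def add_audit_entry(log, user, script, field, function):
    """Append a record of who protected which field, how, and when."""
    entry = ET.SubElement(log, "job")
    ET.SubElement(entry, "user").text = user
    ET.SubElement(entry, "time").text = datetime.now(timezone.utc).isoformat()
    ET.SubElement(entry, "script").text = script
    ET.SubElement(entry, "field").text = field
    ET.SubElement(entry, "function").text = function
    return entry

log = ET.Element("audit")
add_audit_entry(log, "alice", "mask_patients.job", "ssn", "encrypt")
add_audit_entry(log, "bob", "mask_patients.job", "dob", "generalize")

# Query the log: which fields did alice's jobs touch?
alice_jobs = log.findall("./job[user='alice']")
print([job.findtext("field") for job in alice_jobs])

# Serialize for storage or for querying with other XML tools.
print(ET.tostring(log, encoding="unicode"))
```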

* Policies and procedures to protect EPHI from improper alteration or destruction to ensure data integrity. This integrity standard is coupled with one addressable implementation specification for a mechanism to corroborate that EPHI has not been altered or destroyed in an unauthorized manner.

Data that does not decrypt with the proper encryption key suggests that the encrypted field has been compromised. You can trace this through the runtime statistics and audit logs that IRI software produces automatically, which show when and how each field was modified.
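A common general-purpose way to corroborate that a stored value has not been altered is to attach a keyed message authentication code and check it on read. The Python sketch below illustrates that idea only; it is not a description of how IRI software detects tampering, and the names `seal` and `verify` and the sample key are assumptions for this example.

```python
import hashlib
import hmac

TAG_LEN = 32  # bytes in an HMAC-SHA256 tag

def seal(key: bytes, value: bytes) -> bytes:
    """Append an HMAC-SHA256 tag so later alteration can be detected."""
    return value + hmac.new(key, value, hashlib.sha256).digest()

def verify(key: bytes, sealed: bytes):
    """Return the value if the tag matches, or None if it was altered."""
    value, tag = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    expected = hmac.new(key, value, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    return value if hmac.compare_digest(tag, expected) else None

key = b"integrity-demo-key"
sealed = seal(key, b"blood_type=O+")
print(verify(key, sealed))                      # intact value is returned

tampered = b"blood_type=A+" + sealed[-TAG_LEN:]  # value changed, old tag kept
print(verify(key, tampered))                    # alteration is detected
```

In practice the value being sealed would usually be the ciphertext of the field, so that both confidentiality and integrity are covered.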

* Person or entity authentication, which requires the covered entity to implement procedures that verify that a person or entity seeking access to EPHI is the one claimed to be doing so.

IRI software users can leverage a number of role-based access controls on data sources and executables.