Migrate and Process Excel Data
Convert, Cleanse, Integrate, and Mask Spreadsheets
Challenges
Excel spreadsheets are ubiquitous sources and targets of structured data and processes, but data in those sheets seldom starts or ends there. Data sitting or streaming in other formats like external files, database tables, and URLs need to feed spreadsheets for analytics and exchange. Conversely, data held in Excel needs to be integrated with those sources, or used in applications that connect to them.
The problem is that while helpful, Excel's import and export functionality is functionally limited and requires manual intervention and separate steps to leverage in data conversion jobs. In addition, that utility only consumes or produces delimited files, and not fixed position files, database, pipes or message queues.
In addition to the data migration challenge, Excel itself lacks other key data integration and governance capabilities, and cannot apply its own data manipulation functionality to other sources.
Solutions
As of CoSort v10, IRI Voracity platform software and the standalone component products linked below deliver Excel data conversion and processing functionality in several products. Choose based on need:
- Excel Data Conversion Only
- Excel Data Conversion, Transformation, Masking, and/or Reporting
- Excel Data Protection
- Excel Test Data
Use the IRI NextForm product to move data in XLS and XLSX files into DB tables, convert sheets or ranges into other file formats (like CSV, LDIF, COBOL, JSON, XML, etc.), or from other formats into Excel.
NextForm includes an XLS/X file parser to automatically create the field layouts* used in your file conversion job scripts. NextForm also supports data type conversion at the field level, and the remapping of record layouts. NextForm job definitions also work in SortCL-compatible products like Voracity if you upgrade later.
Use the SortCL program in the IRI Voracity platform or IRI CoSort package to convert, transform, mask, report from, and create new Excel sheets and other structured data targets.
Declare one or more Excel and non-Excel files for input and output as part of any SortCL job involving data:
- filtering (select, scrub, links to DQ tools)
- transformation (sort, join, aggregate, calc, etc.)
- conversion (data-type and file-format migrations)
- reporting (CDC, detail and summary formats)
- protection (field encryption, de-ID, masking)
SortCL makes all of these capabilities, one or more at a time, available to data architects who need to work with Excel and other sources. See this series of articles for more details!
Use IRI CellShield or IRI FieldShield to encrypt, mask, or otherwise de-identify values in Excel files or IRI DarkShield when the data you need to find and mask within your sheets are less structured (like embedded images or charts with sensitive data). See this article comparing them!
Use IRI RowGen if you need to populate your spreadhseet(s) with realitic, but synthetic test data. See example #1 in this article. RowGen is included in IRI Voracity, and uses the same layout metadata as CoSort, NextForm, and FieldShield, so you can easily move between test data generation and real data transformation, masking, reporting, etc.