Substring Manipulation

 

Pattern Matching Expressions and String Functions

Challenges

 

String-level expression logic is used to search, manipulate, and report on data according to defined patterns or rules. This functionality is often associated with text editors, SQL, and shell commands, but is rarely integrated into high-volume data processing operations.

 

In other words, string parsing, pattern matching, and other low-level data manipulations must usually occur in separate tools and I/O steps, which increases coding and processing overhead.

 

Some tools for ETL, data quality, and reporting also lack the substring functionality needed to accommodate special use cases, like date value manipulation or sensitive character replacement.
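
To make those two special use cases concrete, here is a minimal sketch in Python. It is a generic illustration only, not SortCL syntax; the function names `reformat_date` and `mask_digits`, the MM/DD/YYYY input layout, and the keep-the-last-four-digits rule are all assumptions chosen for the example.

```python
import re


def reformat_date(value: str) -> str:
    """Rearrange an MM/DD/YYYY string into YYYY-MM-DD using captured substrings.

    Hypothetical helper: the input layout is an assumption for this example.
    """
    match = re.fullmatch(r"(\d{2})/(\d{2})/(\d{4})", value)
    if match is None:
        return value  # leave values in any other layout untouched
    month, day, year = match.groups()
    return f"{year}-{month}-{day}"


def mask_digits(value: str, keep_last: int = 4, mask_char: str = "*") -> str:
    """Replace every digit except the last `keep_last` digits with `mask_char`.

    Hypothetical helper: non-digit characters (dashes, spaces) are preserved.
    """
    total_digits = sum(ch.isdigit() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isdigit():
            seen += 1
            # Keep a digit only if it is among the final `keep_last` digits.
            out.append(ch if seen > total_digits - keep_last else mask_char)
        else:
            out.append(ch)
    return "".join(out)
```

For instance, `reformat_date("12/31/2023")` rearranges the date substrings into year-month-day order, and `mask_digits("123-45-6789")` hides all but the trailing four digits while leaving the dashes in place.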

Solutions

 

The SortCL program in IRI CoSort and IRI Voracity supports Perl Compatible Regular Expression (PCRE) logic for pattern matching, along with find-and-replace and other string- and substring-level manipulations. SortCL also supports field padding and alignment, character validation, and field re-mapping.
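
For readers unfamiliar with these operation types, the following Python sketch shows what regex-based find and replace, field padding and alignment, and character validation look like in general. It does not show SortCL syntax, and Python's `re` module is similar to but not identical with PCRE; the helper names `find_replace`, `pad_field`, and `is_valid` are hypothetical.

```python
import re


def find_replace(value: str, pattern: str, replacement: str) -> str:
    """Replace every regex match of `pattern` in `value` with `replacement`."""
    return re.sub(pattern, replacement, value)


def pad_field(value: str, width: int, align: str = "left", fill: str = " ") -> str:
    """Pad `value` to `width` characters, aligned left, right, or center."""
    if align == "left":
        return value.ljust(width, fill)
    if align == "right":
        return value.rjust(width, fill)
    if align == "center":
        return value.center(width, fill)
    raise ValueError(f"unknown alignment: {align}")


def is_valid(value: str, pattern: str = r"[A-Za-z0-9 ]+") -> bool:
    """Return True only if the whole field matches the allowed-character pattern."""
    return re.fullmatch(pattern, value) is not None
```

As a usage example, `find_replace("Main St. and 2nd St.", r"\bSt\.", "Street")` expands both abbreviations, `pad_field("42", 6, "right", "0")` right-aligns the value in a zero-filled six-character field, and `is_valid("ABC#123")` rejects the field because `#` falls outside the allowed character set.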

These functions are also useful in the context of data discovery, master data management, and data quality improvement.

More importantly, these intricate data transformations can occur in the same job script and I/O pass as the other functions SortCL performs.
