CoSort Deep Dive
Sorting is Only the Beginning
The Sort Engine
For four decades, IRI CoSort has defined the state of the art in big data sorting and related manipulation technology. From advanced algorithms to automatic memory management, and from multi-core exploitation to I/O optimization, there is no more proven performer for production data processing than CoSort.
CoSort was the first commercial sort package developed for open systems: CP/M in 1978, MS-DOS in 1982, Unix in 1985, and Windows in 1995. Repeatedly reported to be the fastest commercial-grade sort product for Unix, CoSort was also judged by PC Week to be the "top performing" sort on Windows.
Being the first and fastest are just two reasons why CoSort is "The Open Systems Standard" in sorting technology. Further details on the power of the engine are available separately.
To accelerate your current sort function, or to convert from a legacy sort product like SyncSort to CoSort, see the Sort PlugIns and Sort Migration pages.
Data Transformation, Cleansing, and Reporting
For many CoSort users, speed is more than just how many gigabytes they can sort in a minute. It is also about how much data staging, security, and analytic work they can accomplish in the same place, and in the same pass through their data. CoSort uses an award-winning, open 4GL program for data definition and transformation called SortCL to perform and consolidate multiple activities.
Because SortCL can do all this work at once, it is the default data manipulation and mapping program in the IRI Voracity ETL and data management platform.
For information on the combinable functions in CoSort, check out the SortCL function matrix.
How many data sources can you transform -- while producing delta, detail, load, and report targets -- all in one job and I/O pass?
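The single-pass idea can be illustrated with a minimal Python sketch. This is not SortCL syntax or any CoSort API; the function name `one_pass`, the field names, and the sample data are all hypothetical. It only shows the general pattern of reading a source once while filtering rows, writing a detail target, and accumulating a summary report in the same loop.

```python
import csv
import io

def one_pass(source, detail_out, report_out):
    """Read `source` once; write a filtered detail target and a summary report."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(detail_out, fieldnames=["region", "amount"])
    writer.writeheader()
    totals = {}
    for row in reader:
        amount = float(row["amount"])
        if amount <= 0:
            continue  # selection: drop non-positive amounts
        # Detail target: one reformatted row per surviving input row.
        writer.writerow({"region": row["region"], "amount": f"{amount:.2f}"})
        # Report target: running total per region, built in the same pass.
        totals[row["region"]] = totals.get(row["region"], 0.0) + amount
    for region in sorted(totals):
        report_out.write(f"{region}: {totals[region]:.2f}\n")
    return totals

# Hypothetical in-memory source and targets, used here instead of real files.
source = io.StringIO("region,amount\nEast,10\nWest,5\nEast,2.5\nWest,-1\n")
detail = io.StringIO()
report = io.StringIO()
totals = one_pass(source, detail, report)
```

The source is iterated exactly once, and both targets are complete when the loop ends, which is the property the question above is asking about.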
This sounds like "super-fast ETL and more" on the cheap. Is it, and can it happen in one product, place, and pass?
Yes. And that's all accomplished in CoSort's Sort Control Language (SortCL) program, and optionally managed in its free IRI Workbench GUI, built on Eclipse™. SortCL is a self-documenting 4GL and program for data definition and manipulation that's called from the command line, in batch, the GUI, or your applications.
SortCL is simpler to code, and faster at runtime, than data integration, staging, and reporting jobs written in Perl, Python, and shell scripts; SQL procedures; legacy ETL and ELT tools; and programs written in C, Java, VB, etc.
BI/DW architects can call SortCL jobs from the command tasks of their existing ETL tool to offload, and thus optimize, transformations. Those who also want to save time and money with a full ETL package that exploits SortCL's scalable performance and task consolidation can do so in the managed, visual ETL workflow environment of the IRI Voracity (subscription) platform, which is powered by CoSort and built on Eclipse.
Accelerating Applications
CoSort is not only a standalone product for data manipulation and management, but also a "point solution" for speeding these database, DW ETL, and BI/analytic platforms:
ETL Tools
ETI Solution, IBM DataStage, Informatica PowerCenter, Microsoft SSIS, Oracle Data Integrator, Pentaho Kettle, Talend Open Studio
BI Tools
BIRT, BOBJ, Cognos, Excel, MicroStrategy, QlikView, OBIEE
Analytic Tools
JupiterOne, OpenText, R, SAS, Splunk, Spotfire, Tableau
Databases
Cassandra, DB2, MongoDB, MySQL, Oracle, SQL Server, Sybase
As well as: offline database reorgs, SAS Proc Sort, Software AG Natural, and legacy COBOL programs -- wherever high-volume jobs need more sorting and data transformation speed.
You can also use CoSort sort plug-ins and API calls, or CoSort SortCL programs to prepare (blend) large data sets through selection, sorting, joining and aggregation. The results of CoSort operations routinely feed database load utilities, analytic engines, BI tools and cubes, Excel, and custom applications.
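As a rough illustration of what "blending" through selection, sorting, joining, and aggregation means, here is a minimal Python sketch. It does not use CoSort plug-ins, API calls, or SortCL; the `blend` function, the field names, and the sample records are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data; field names are illustrative only.
orders = [
    {"order_id": 3, "cust_id": "C2", "amount": 40.0},
    {"order_id": 1, "cust_id": "C1", "amount": 25.0},
    {"order_id": 2, "cust_id": "C1", "amount": 5.0},
    {"order_id": 4, "cust_id": "C3", "amount": 0.0},
]
customers = {"C1": "Acme", "C2": "Globex"}  # lookup side of the join

def blend(orders, customers):
    # Selection: keep only orders with a positive amount.
    selected = [o for o in orders if o["amount"] > 0]
    # Sorting: order by customer, then by order number.
    selected.sort(key=lambda o: (o["cust_id"], o["order_id"]))
    # Joining: inner join against the customer lookup.
    joined = [
        dict(o, name=customers[o["cust_id"]])
        for o in selected
        if o["cust_id"] in customers
    ]
    # Aggregation: total amount per customer name.
    totals = defaultdict(float)
    for row in joined:
        totals[row["name"]] += row["amount"]
    return joined, dict(totals)

detail, summary = blend(orders, customers)
```

Here `detail` is the kind of row-level result that could feed a load utility, and `summary` is the kind of aggregate that could feed a BI tool or spreadsheet.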
See:
Solutions > Database Acceleration
Solutions > Data Transformation
Solutions > Business Intelligence > BI Tool Optimization
Products > CoSort > COBOL Tools
Included: Data Masking and Test Data Creation
For information security and compliance efforts, CoSort can de-identify sensitive data, and thus mitigate the security and governance risks that personally identifiable information (PII) represents. Powerful field-level data encryption and masking functions are built into SortCL.
Transform and protect PII at the same time, and still allow access to your tables, files, and everything around them. Produce an audit trail to verify compliance with data privacy regulations.
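To make field-level protection with an audit trail more concrete, here is a minimal Python sketch. It is a generic illustration, not the SortCL or FieldShield function set; `mask_tail`, `pseudonymize`, `protect`, the demo key, and the sample record are all hypothetical, and the keyed hash stands in for whatever encryption or masking function a real job would apply.

```python
import hashlib
import hmac

SECRET = b"demo-key"  # hypothetical key; real keys belong in a key store

def mask_tail(value, keep=4, fill="*"):
    """Redact all but the last `keep` characters of a field."""
    visible = value[-keep:] if keep > 0 else ""
    return fill * max(len(value) - keep, 0) + visible

def pseudonymize(value):
    """Replace a field with a deterministic keyed hash (HMAC-SHA256, truncated)."""
    digest = hmac.new(SECRET, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]

def protect(record, rules, audit):
    """Apply one protection function per field and log what was applied."""
    out = dict(record)
    for field, fn in rules.items():
        if field in out:
            out[field] = fn(out[field])
            audit.append((field, fn.__name__))  # audit trail entry
    return out

# Hypothetical record and rule set.
audit = []
record = {"name": "Pat Doe", "ssn": "123-45-6789", "email": "pat@example.com"}
rules = {"ssn": mask_tail, "email": pseudonymize}
safe = protect(record, rules, audit)
```

Fields without a rule (here, `name`) pass through unchanged, so the rest of the record stays usable while the audit list records which protection was applied to which field.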
SortCL metadata also works in IRI's fit-for-purpose data masking product FieldShield, and test data generator RowGen.
CoSort Package Contents
Here is what is inside every CoSort package:
- multiple third-party sort plug-ins
- the SortCL program
- the IRI Workbench GUI, built on Eclipse™
- third-party metadata converters
- API libraries
- the COBOL tools
Technical specifications for the product are delineated in a product overview booklet available on request. Installation, tuning, and implementation reference materials are extensive. The searchable .pdf CoSort product manual is over 700 pages, half of which documents SortCL.
Job logging options are described on the SortCL page.