The Data Quality and Comparison components are SSIS components that help developers keep their data consistent and clean.

The SSIS Productivity Pack currently includes two components in this category. They are listed below, along with links to their Help Manuals:

  • Diff Detector
    • Compares two sources: a primary and a secondary. Rows from the two inputs are matched on a primary key (simple or compound) and compared with each other to determine whether each row is unchanged, changed, deleted from the primary data source, or added in the secondary data source.
  • Duplicate Detector
    • Compares rows within a data source to identify duplicate rows based on an approximate (fuzzy) or exact match. The component creates two outputs: Unique Rows and Duplicate Rows. The Duplicate Rows output has four additional fields: Richness Score, Richness Rank, Similarity Score, and GroupID.
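
To make the two descriptions above more concrete, here is a minimal Python sketch of the general ideas: key-based row comparison and threshold-based duplicate grouping. It is an illustration only, not how the components are implemented or configured. The function names (`diff_rows`, `split_duplicates`), the output labels (`only_in_primary`, `only_in_secondary`, `group_id`, `similarity`), and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example. How the two "only in" buckets map to the components' added and deleted outputs is left open here, and the Richness Score and Richness Rank fields are not modeled.

```python
from difflib import SequenceMatcher


def diff_rows(primary, secondary, key_fields):
    """Classify rows from two inputs by matching them on a simple or compound key.

    Hypothetical helper: rows are plain dicts, key_fields is a list of column names.
    """
    def key_of(row):
        return tuple(row[f] for f in key_fields)

    primary_by_key = {key_of(r): r for r in primary}
    secondary_by_key = {key_of(r): r for r in secondary}

    result = {
        "unchanged": [],
        "changed": [],
        "only_in_primary": [],
        "only_in_secondary": [],
    }
    for key, row in primary_by_key.items():
        other = secondary_by_key.get(key)
        if other is None:
            result["only_in_primary"].append(row)
        elif other == row:
            result["unchanged"].append(row)
        else:
            result["changed"].append(row)
    for key, row in secondary_by_key.items():
        if key not in primary_by_key:
            result["only_in_secondary"].append(row)
    return result


def split_duplicates(rows, field, threshold=1.0):
    """Split rows into unique and duplicate lists by comparing one text field.

    Hypothetical helper: a threshold of 1.0 requires an exact match, while a
    lower threshold allows approximate (fuzzy) matches. The first row seen in
    each group is treated as the unique one, which is an assumption.
    """
    unique, duplicates = [], []
    for row in rows:
        value = str(row[field])
        match = None
        for group_id, kept in enumerate(unique):
            score = SequenceMatcher(None, value, str(kept[field])).ratio()
            if score >= threshold:
                match = (group_id, score)
                break
        if match is None:
            unique.append(row)
        else:
            # Attach illustrative group and similarity fields to the duplicate.
            duplicates.append({**row, "group_id": match[0], "similarity": match[1]})
    return unique, duplicates


if __name__ == "__main__":
    primary = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
    secondary = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bobby"}]
    print(diff_rows(primary, secondary, ["id"]))

    people = [{"name": "Jon Smith"}, {"name": "John Smith"}]
    print(split_duplicates(people, "name", threshold=0.8))
```

The sketch keeps both inputs in dictionaries keyed by the match key, so each row is looked up once rather than compared against every other row. The duplicate check, by contrast, compares each row against every unique row found so far, which is fine for illustration but would be slow on large inputs.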