Using the Premium Slowly Changing Dimension Component

The Premium Slowly Changing Dimension Component is an SSIS data flow pipeline component that can be used to monitor changes in the value of dimension fields that change slowly. Slowly changing dimensions are dimensions with values that are relatively static and change infrequently and unpredictably. The component can process three types of slowly changing dimensions: Fixed Attributes which are never supposed to change, Changing Attributes which can change but don't have a history, and Historical Attributes which can change, and a history of all previous values is kept.

The component requires two inputs: a Current Dimension Input with data that should be monitored for changes, and a New Source Input with potentially changed data.

The component includes the following four pages to configure how you want to process data.

  • General
  • Change Type Settings
  • Advanced Settings
  • Error Handling

General Page

The General page of the Premium Slowly Changing Dimension Component allows you to specify the general settings of the component. 

SSIS Premium Slowly Changing Dimension

Fields to Compare Grid

The Fields To Compare grid will display all the available fields from the Current Dimension Input. Each of these fields can be paired to a field in the New Source Input. For every field that has been paired it is possible to designate these fields as a business key. A compound business key is created from all the key fields that are selected. The compound business key is used to determine which rows to compare in the Current Dimension Input and New Source Input. Selecting a Column Type for a mapped pair of fields will enable options on the Change Type Settings page.

The possible column types are:

  • Fixed Attribute (Type 0) (Retain Original) - Attributes that are not supposed to change. An error row will be directed if this attribute changes, and an unchanged row will be directed if it does not change.
  • Changing Attribute (Type 1) (Overwrite) - Attributes that are supposed to change. A changed row will be directed if this row changes, and an unchanged row will be directed if it does not change.
  • Historical Attribute (Type 2) (Track History)  - Attributes that are supposed to change and a history kept. If this attribute changes a changed row will be be directed to expire the current row, and a new row will be directed to represent the new current row.
  • Surrogate Key - A surrogate key column has a numeric value that is unique in the current dimension input. Having a surrogate key can be useful when performing inserts and updates later in the workflow. The component can automatically increment this value in new rows that it creates.
  • Current Record - The current record column type is used to identify a record as current or expired.  The values that represent current and expired are defined on the change type settings page.
  • Start Date - The start date column type is used to identify the column in the record that represents the start date for the period in which it is current.
  • End Date - The end date column type is used to identify the column in the record that represents the end date for the period in which it is current.  The format of the end date is defined on the change type settings page.

Note: Fields that are different types can be compared, but a warning may be shown if the comparison combination is unsupported.  The component supports the following comparisons of different data types:

Data Type 1 Data Type 2 Notes
Any String Type Any String Type
Any Numeric Type Any Numeric Type Internally numbers are converted to a common compatable type and compared.
Any String Type Boolean The component attempts to parse the string to a boolean.  True/False and Yes/No strings are supported. This is a case insensitive comparison.
Any Numeric Type Boolean The component treats 0 as False and 1 as True.
Filter

The current dimension input columns that are visible can be filtered by entering text in the Filter text box.

Hide Unpaired Fields

When the Hide Unpaired Fields checkbox is checked only paired input columns will be shown.

Hide Paired Fields

When the Hide Paired Fields checkbox is checked only unpaired input columns will be shown.

Refresh  Component  Button 

Clicking the Refresh Component  button will reload input columns from the current dimension input and new source input.

Map Unmapped Fields Button 

Clicking the Map Unmapped Fields button will attempt to pair all the current dimension input fields to new source input fields with matching names.

Clear All Mappings  Button 

Clicking the Clear All Mappings button will clear all pairings between current dimension input fields and new source input fields.

Change Type Setting

The Change Type Settings page has settings that are applicable to change types selected on the General page.

SSIS Premium Slowly Changing Dimension - Settings

Changing Attribute Settings

If you have set one of the paired dimensions on the General page to change type "Changing Attribute" the changing attribute settings option will be enabled, the option is labelled "Change all the matching records, including outdated records, when changes are detected in a changing attribute". If this option is checked all records from primary input with a matching business key in secondary input will be updated with the new dimension value.

If this option is not checked, the component will attempt to identify the current record and only update that record. The current record for changing attributes can only be identified if dimensions on the General page have column types Start Date/End Date or Current Record.

Historical Attribute Settings

If you have set the column type of one of the dimensions on the General page to Start Date/End Date or Current Record the historical attribute settings option will be enabled. There are two ways to identify the current and expired records, using a single column, or using start and end dates.

Historical Attribute Settings - Single Column

Values used to detect current or expired columns can be used on string, numeric or boolean columns.  The component will attempt to convert the strings that you specify to the type of the column to identify or mark the record as current or expired.

  • Value when current - The value of the record when current.
  • Value when expired - The value of the record when expired.
Historical Attribute Settings - Date Columns
  • Variable to set date values - The variable that is used to get the current date for setting start date and end date values in new and updated rows.
  • End Date Value for Current Records - The value to use when setting the end date for current records, options are Null Value, Dec 31, 9999 or a custom date.
Expire Current Dimension Records not found in New Source

When this option is enabled, records exist in Current Dimension table but not New Source will be set to expired.

Advanced Settings

The Advanced Settings page contains additional settings to control the behavior of the premium slowly changing dimension component.

SSIS Premium Slowly Changing Dimension - advanced settings

Culture Identifier

Specifies the culture identifier that should be used when comparing strings.

Ignore Leading and Trailing White Space

When Ignore Leading and Trailing White Space is checked leading and trailing white space of string fields will be ignored during comparison.

Ignore Case

When the Ignore Case is checked case will be ignored when comparing string fields.

Surrogate Key Settings

If you have selected a surrogate key column two additional options must be specified. In both cases the value can be specfied, loaded from a variable or auto detected.

  • Surrogate Key Start Value - The value to use for the surrogate key in the first new row.  If this value is set to Auto Detect the component will look for the highest surrogate key value in the current dimension input and use this as the start value.
  • Surrogate Key Increment - The value to increment the surrogate key by for each new row.  If this value is set to Auto Detect the component will look for the smallest increment between surrogate key values in the current dimension input and use this as the increment.
Enable ChangeType Column for Default Output

When the ChangeType checkbox is checked the type of change, either 'Fixed', 'Updated' or 'New' will be included in an additional output column for each output.

Error Handling Page

The Error Handling page allows you to specify how errors should be handled when they happen. 

SSIS Premium Slowly Changing Dimension - error handling

There are three options available.

  1. Fail on error
  2. Redirect rows to error output
  3. Ignore error

When the Redirect rows to error output option is selected, rows that failed to write to the Premium Slowly Changing Dimension component will be redirected to the 'Error Output' output of the component. The blue output connection represents rows that were successfully written, and the red 'Error Output' connection represents rows that were erroneous. The 'ErrorMessage' output column found in the 'Error Output' may contain the error message that was reported by server or the component itself.

Sample Scenario

The scenario used for this tutorial has one Supplier dimension table and a source file. Supplier dimension table will be updated using a source file.

Source File

The source file contains 5 records that will apply changes to the dimension table when processed.

SupplierCode SupplierName Address State
S0000001 ABCD Company 46 Lori Street AL
S0000002 XYZ Corporation 1156 Balist Road MA
S0000003 Test New Company 6540 Cole Alley GA
S0000004 Blue Account 216 Reinger Shores NE
S0000005 M Organization 153 Baff View NJ

Supplier Dimension

The Supplier Dimension is a table in target database. Initially this table contains records for four Suppliers.

SupplierSK SupplierCode SupplierName State IsCurrent EffDate ExpDate
1 S0000001 ABC Company AL 1 2017-01-13
2 S0000002 XYZ Corporation WA 1 2017-01-01
4 S0000003 Test Company NY 1 2016-10-26
5 S0000004 Blue Account NE 1 2016-05-08

Building Slowing Changing Dimension SSIS Package

We are going to build an SSIS package which reads the data from Supplier Dimension and source file and load data into Premium Slowly Changing Dimension component to perform the comparison and configure what action we want to take when change occurs.

The genaral page in Premium Slowly Changing Dimension is used to define what the various columns of the dimension table are used for. Set column types for the columns as shown below.

SCD Sample Package General

We have set the SupplierName as Historical Attribute so that it will output an expired record to the ‘Changed Rows’ output, and a new record to the ‘New Rows’ output which can be used to update the dimension table. The State attribute is set to Change Attribute thus existing row in the dimension table is updated to contain the new attribute values.

SCD Sample Change Type Setting

In the Change Type Settings page, we define the values write to Current Record, Start Date and End Date columns.

SCD Sample Advanced

The Advanced Settings page is used to specify how surrogate keys are generated and additional string comparison rules. Also we have the option to enable DetectedChangeType column which can be used as a reference of what action has been applied to the record.

After finishing the configuration in Premium Slowly Changing Dimension component, then we need to write the output data into Supplier Dimension table as the shown below.

Below is a snap shot of the output data after processing.

SupplierSK SupplierCode SupplierName State IsCurrent EffDate ExpDate DetectedChangeType
Unchanged Rows
5 S0000004 Blue Account NE 1 2016-05-08 No Change
Changed Rows
1 S0000001 ABC Company AL 0 2017-01-13 2017-03-07 Expired (Type2)
2 S0000002 XYZ Corporation MA 1 2017-01-01 Update
4 S0000003 Test Company NY 0 2016-10-26 2017-03-07 Expired (Type2)
New Rows
6 S0000001 ABCD Company AL 1 2017-03-07
New Record(Type2)
7 S0000003 Test New Company GA 1 2017-03-07
New Record(Type2)
8 S0000005 M Organization NJ 1 2017-03-07
New Record