Using the Azure Data Lake Storage Destination Component

The Azure Data Lake Storage Destination Component is an SSIS data flow pipeline component that can be used to write data to Azure Data Lake Storage. With this component, you can create, move, append to, or delete objects, depending on which actions the selected object type supports. The component has three configuration pages:

  • General
  • Columns
  • Error Handling

The General page is used to specify general settings for the Azure Data Lake Storage Destination Component. The Columns page allows you to map the columns from upstream components to Azure Data Lake Storage fields in the destination object. The Error Handling page allows you to specify how errors should be handled when they occur.

General Page

The General page allows you to specify general settings for the component.

Azure Data Lake Storage Destination Editor

Connection Manager

The Azure Data Lake Storage Destination Component requires an Azure Data Lake Storage connection. The Azure Data Lake Storage Connection Manager option will show all Azure Data Lake Storage connection managers that have been created in the current SSIS package or project.

Storage Service Object

This option allows you to specify the object you want to work with, whether it is a File or a Directory.

Action

The Action option allows you to specify how data should be written to Azure Data Lake Storage. Four actions are currently supported:

  • Create: Creates a new File or Directory.
  • Move: Moves a File or Directory.
  • Append (available only when working with Files): Adds new data to the end of a file.
  • Delete: Deletes a File or Directory.
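The relationship between the Storage Service Object and Action options can be summarized in a small Python sketch. It is only an illustrative model of the rule stated above (Append applies to Files only), not the component's actual implementation; the names StorageObject, Action, and is_supported are hypothetical.

```python
from enum import Enum


class StorageObject(Enum):
    FILE = "File"
    DIRECTORY = "Directory"


class Action(Enum):
    CREATE = "Create"
    MOVE = "Move"
    APPEND = "Append"
    DELETE = "Delete"


def is_supported(obj: StorageObject, action: Action) -> bool:
    """Hypothetical helper: Append is available for Files only;
    Create, Move, and Delete are available for both object types."""
    if action is Action.APPEND:
        return obj is StorageObject.FILE
    return True


# List every combination that this model treats as valid.
valid_combinations = [
    (obj.value, action.value)
    for obj in StorageObject
    for action in Action
    if is_supported(obj, action)
]
```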

Refresh Component Button

Clicking the Refresh Component button causes the component to retrieve the latest metadata and update each attribute to its most recent metadata.

Map Unmapped Fields Button

Clicking this button causes the component to try to map any unmapped Azure Data Lake Storage attributes by matching their names with the input columns from upstream components. This is useful when your source component has recently gained new columns: you can use this button to automatically associate the new input columns with unmapped destination attributes.
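The name-matching idea can be pictured with a minimal Python sketch. It is a model of the behavior described above, not the component's code: the function name, the field names (FilePath, Content, Overwrite), and the case-insensitive comparison are all assumptions made for illustration.

```python
def map_unmapped_fields(mappings, input_columns):
    """Return a copy of `mappings` in which every unmapped destination
    field (value None) is matched to an input column of the same name.

    mappings:      dict of destination field -> input column name or None
    input_columns: list of upstream input column names
    """
    # Assumption: names are compared case-insensitively.
    by_name = {col.lower(): col for col in input_columns}
    result = dict(mappings)
    for field, mapped in mappings.items():
        if mapped is None:
            # Existing mappings are left untouched; unmatched fields stay None.
            result[field] = by_name.get(field.lower())
    return result


# Hypothetical field and column names, for illustration only.
current = {"FilePath": "Path", "Content": None, "Overwrite": None}
updated = map_unmapped_fields(current, ["Path", "content"])
```

In this model, an existing mapping is never overwritten, and a field with no matching input column simply remains unmapped.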

Clear All Mappings Button

Clicking this button resets all of your mappings in the destination component.

Expression fx Button

Click the fx button to launch the SSIS Expression Editor, which lets you update the property dynamically at run time.

Generate Documentation Button

Clicking the Generate Documentation button generates a Word document that describes the component's metadata, including the relevant mappings.

Columns Page

The Columns page of the Azure Data Lake Storage Destination Component allows you to map the columns from upstream components to the Azure Data Lake Storage destination fields.

On the Columns page, you will see a grid that contains four columns, as shown below.

Azure Data Lake Storage Destination Editor

  • Input Column: You can select an input column from an upstream component for the corresponding Azure Data Lake Storage field.
  • Destination Field: The Azure Data Lake Storage field that you are writing data to.
  • Data Type: This column indicates the type of value for the current field.
  • Unmap: This column can be used to unmap the field from the upstream input column. If the field is not currently mapped, it can instead be used to map the field to an upstream input column with a matching name.

Error Handling Page

The Error Handling page allows you to specify how errors should be handled when they occur.

Azure Data Lake Storage Destination Editor

There are three options available.

  1. Fail on error
  2. Redirect rows to error output
  3. Ignore error

When the Redirect rows to error output option is selected, rows that failed to write to Azure Data Lake Storage will be redirected to the 'Error Output' output of the Destination Component. As indicated in the screenshot below, the blue output connection represents rows that were successfully written, and the red 'Error Output' connection represents erroneous rows. The 'ErrorMessage' output column found in the 'Error Output' may contain the error message that was reported by Azure Data Lake Storage or the component itself.

Error Output

Note: Use extra caution when selecting the Ignore error option, since the component will remain silent about any errors that have occurred.
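The difference between the three options can be sketched in Python. This is a simplified model under stated assumptions, not the component's implementation: process_rows, fake_write, and the row layout are hypothetical, and the model treats Fail on error as stopping at the first failed row.

```python
from enum import Enum


class ErrorHandling(Enum):
    FAIL = "Fail on error"
    REDIRECT = "Redirect rows to error output"
    IGNORE = "Ignore error"


def process_rows(rows, write, mode):
    """Write each row and return (written_rows, error_output_rows)."""
    written, error_output = [], []
    for row in rows:
        try:
            write(row)
            written.append(row)
        except Exception as exc:
            if mode is ErrorHandling.FAIL:
                raise  # the error propagates and processing stops
            if mode is ErrorHandling.REDIRECT:
                # The failed row is redirected with an ErrorMessage column.
                error_output.append({**row, "ErrorMessage": str(exc)})
            # IGNORE: the failed row is dropped without any trace.
    return written, error_output


def fake_write(row):
    """Hypothetical stand-in for a write to Azure Data Lake Storage."""
    if row["path"] == "":
        raise ValueError("path is empty")


rows = [{"path": "a.txt"}, {"path": ""}]
ok, errors = process_rows(rows, fake_write, ErrorHandling.REDIRECT)
```

With Redirect rows to error output, the failed row remains available for inspection together with its error message. With Ignore error, the same row disappears silently, which is why that option calls for extra caution.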