Using the Azure Blob Storage Source Component

The Azure Blob Storage Source Component is an SSIS data flow pipeline component that can be used to retrieve data from Azure Blob Storage.

The component includes the following two pages to configure how you want to read data.

  • General
  • Columns

General page

The General page of the Azure Blob Storage Source Component allows you to specify the general settings of the component.

Connection Manager

The Azure Blob Storage Source Component requires a connection in order to connect to Azure Blob Storage. The Connection Manager drop-down will show a list of all connection managers that are available to your current SSIS package.

Source Item Path

The Source Item Path specifies the location of the file or folder that you are trying to read from. Click the ellipsis button ('...') to open an Azure Blob Storage browser dialog and select an item.

Item Selection Mode

The Item Selection Mode setting specifies which sub items (if any) you wish to retrieve. The available modes are:

  • Selected Item - Retrieves only the item specified at Source Item Path
  • Selected Level - Retrieves the item specified at Source Item Path and its immediate sub items
  • Recursive - Retrieves the item specified at Source Item Path and its sub items recursively
  • Recursive (Files Only) - Selects items the same as the Recursive mode but only returns files
  • Selected Level (Files Only) - Selects items the same as the Selected Level mode but only returns files
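
The selection modes can be read as filters over a listing of items. The following Python sketch is only an illustration of that reading, not the component's implementation. The item paths, the convention that folder paths end with "/", the helper names, and the assumption that Selected Level means the item plus its immediate sub items are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    SELECTED_ITEM = "Selected Item"
    SELECTED_LEVEL = "Selected Level"  # assumed: the item plus its immediate sub items
    RECURSIVE = "Recursive"
    RECURSIVE_FILES_ONLY = "Recursive (Files Only)"
    SELECTED_LEVEL_FILES_ONLY = "Selected Level (Files Only)"


def is_folder(path):
    # Convention for this sketch only: folder paths end with "/".
    return path.endswith("/")


def depth_below(source, path):
    # Levels between the source folder and the item (0 means the source itself).
    relative = path[len(source):].rstrip("/")
    return 0 if not relative else relative.count("/") + 1


def select_items(all_paths, source, mode):
    # A folder source matches itself and everything beneath it;
    # a file source matches only itself.
    if is_folder(source):
        candidates = [p for p in all_paths if p.startswith(source)]
    else:
        candidates = [p for p in all_paths if p == source]
    if mode is Mode.SELECTED_ITEM:
        return [p for p in candidates if p == source]
    if mode in (Mode.SELECTED_LEVEL, Mode.SELECTED_LEVEL_FILES_ONLY):
        candidates = [p for p in candidates if depth_below(source, p) <= 1]
    if mode in (Mode.RECURSIVE_FILES_ONLY, Mode.SELECTED_LEVEL_FILES_ONLY):
        candidates = [p for p in candidates if not is_folder(p)]
    return candidates


# Hypothetical container contents used only for illustration.
ITEMS = [
    "logs/",
    "logs/app.log",
    "logs/2024/",
    "logs/2024/jan.log",
    "data/readme.txt",
]

for mode in Mode:
    print(mode.value, "->", select_items(ITEMS, "logs/", mode))
```

With "logs/" as the Source Item Path, the two Files Only modes drop "logs/" and "logs/2024/" from the output, while the plain modes keep them.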

Page Size

The Page Size option lets you specify how many records to retrieve per service call to Azure Blob Storage. The default is 1000.
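
To see how the Page Size value relates to the number of service calls, the following Python sketch simulates a paged listing in memory. It does not call Azure Blob Storage. The function name list_in_pages and the 2,500-item listing are assumptions made for illustration.

```python
import math


def list_in_pages(items, page_size=1000):
    # Yield successive pages; each page stands in for one service call.
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    for start in range(0, len(items), page_size):
        yield items[start:start + page_size]


# Hypothetical listing of 2,500 items.
blobs = [f"blob-{i}" for i in range(2500)]

pages = list(list_in_pages(blobs))
page_lengths = [len(page) for page in pages]
expected_calls = math.ceil(len(blobs) / 1000)

print(page_lengths)    # sizes of the simulated pages
print(expected_calls)  # number of simulated service calls
```

In this model, a smaller page size means more round trips for the same listing, and a larger one means fewer but bigger responses.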

Include Metadata

The Include Metadata option specifies whether blob metadata is retrieved along with each item.

Include Snapshots

The Include Snapshots option specifies whether blob snapshots are included in the results.

Include Copy

The Include Copy option specifies whether information about any current or previous copy operation on a blob is included in the results.

Include Uncommitted Blobs

The Include Uncommitted Blobs option specifies whether blobs with uploaded but not yet committed blocks are included in the results.
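
The four Include options have the same names as values accepted by the include parameter of the Azure Blob Storage List Blobs operation (metadata, snapshots, copy, uncommittedblobs). Whether the component passes them through in exactly this way is an assumption. The Python sketch below only shows how such checkbox states could be turned into an include list, and the function name build_include is hypothetical.

```python
def build_include(metadata=False, snapshots=False, copy=False, uncommitted_blobs=False):
    # Map each checkbox state to the matching List Blobs "include" value.
    flags = [
        ("metadata", metadata),
        ("snapshots", snapshots),
        ("copy", copy),
        ("uncommittedblobs", uncommitted_blobs),
    ]
    return [name for name, enabled in flags if enabled]


# Example: only Include Metadata and Include Snapshots are checked.
include = build_include(metadata=True, snapshots=True)
print(include)

# With the azure-storage-blob package, a list like this could be passed as
#   container_client.list_blobs(include=include)
# where container_client is an existing ContainerClient (not created here).
```

Leaving every option unchecked yields an empty list, which corresponds to a plain listing with none of the optional details.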

Refresh Component Button

Clicking the Refresh Component button causes the component to retrieve the latest metadata and update each field to its most recent metadata.

Columns page

The Columns page of the Azure Blob Storage Source component shows you all available attributes from the object that you specified on the General page. 

At the top left of the grid is a checkbox that toggles the selection of all available fields. This is a quick way to check or uncheck every field at once.

The Columns Page grid consists of:

  • Azure Blob Storage Field - The column that will be retrieved from the current item (file or folder).
  • Data Type - The data type of this field.

Note: As a general best practice, you should only select the fields that are needed for the downstream pipeline components.