Using the Azure Data Lake Storage Connection Manager

The Azure Data Lake Storage Connection Manager is an SSIS connection manager component that can be used to establish connections with Azure Data Lake Storage (Gen1 / Gen2).

To add an Azure Data Lake Storage connection to your SSIS package, right-click the Connection Manager area in your Visual Studio project and choose "New Connection..." from the context menu. You will be prompted with the "Add SSIS Connection Manager" window. Select the "Azure Data Lake Storage" item to add the new Azure Data Lake Storage connection manager.

The Azure Data Lake Storage Connection Manager contains the following two pages, which let you configure how you connect to Azure Data Lake Storage.

  • General
  • Advanced Settings

General page

The General page on the Azure Data Lake Storage Connection Manager allows you to specify general settings for the connection. 

Azure Data Lake Storage

This option allows you to select the type of Azure Data Lake Storage service you are trying to connect to. Available options are:

  • Azure Data Lake Storage Gen1
  • Azure Data Lake Storage Gen2

Storage Endpoints Domain (Available only for Azure Data Lake Storage Gen2)

This option allows you to specify the location of the Azure Data Lake Storage Gen2 account you are trying to connect to. Available options are:

  • Azure
  • Azure China
  • Azure Germany
  • US Government
  • Other (enter below)

Storage Account

This option allows you to specify the name of the Azure Data Lake Storage account you are trying to connect to.
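
To illustrate how the Storage Account name and the Storage Endpoints Domain might combine into a Gen2 service address, here is a minimal sketch. The function name `gen2_endpoint` and the suffix table are assumptions made for illustration (they are not part of the connection manager), and the suffixes should be verified against your own cloud environment.

```python
# Assumed DFS endpoint suffixes per cloud; verify these for your environment.
ENDPOINT_SUFFIXES = {
    "Azure": "dfs.core.windows.net",
    "Azure China": "dfs.core.chinacloudapi.cn",
    "Azure Germany": "dfs.core.cloudapi.de",
    "US Government": "dfs.core.usgovcloudapi.net",
}


def gen2_endpoint(storage_account: str, domain: str, custom_suffix: str = "") -> str:
    """Build an HTTPS endpoint for a Gen2 account in the chosen cloud (hypothetical helper)."""
    if domain == "Other":
        # "Other (enter below)" requires the user to supply the suffix manually.
        if not custom_suffix:
            raise ValueError("a custom suffix is required when domain is 'Other'")
        suffix = custom_suffix
    else:
        suffix = ENDPOINT_SUFFIXES[domain]
    return f"https://{storage_account}.{suffix}"


print(gen2_endpoint("mylake", "Azure"))
```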

Authentication Mode

This option allows you to select the type of authentication you want to use in order to connect to your Azure Data Lake Storage instance. Available options are:

  • OAuth Authorization Code
  • OAuth Client Credentials (service to service authentication)
  • Shared Key (Available only for Azure Data Lake Storage Gen2)

OAuth Authorization Code:

Get Token

This button completes the entire OAuth authentication process inside the toolkit. All you need to do is log in to the service endpoint and authorize our app to generate your token.

Tenant Id

The Tenant Id option allows you to specify the unique ID which identifies the tenant you are connecting to.

Client Id

The Client Id option allows you to specify the unique ID which identifies the application making the request.

Client Secret

The Client Secret option allows you to specify the client secret belonging to your app.

Redirect Url

The Redirect Url option allows you to specify the Redirect Url to complete the authentication process.

Generate Token (In App)...

The Generate Token File (In App)... button completes the entire OAuth authentication process inside the toolkit. All you need to do is log in to the service endpoint and authorize our app to generate your token.

Generate Token (In Browser)...

The Generate Token File (In Browser)... button completes the OAuth authentication using your default browser. After you click this button, simply follow the steps in the dialog to generate your token.

Path to Token File

The path to the token file on the file system.

Token File Password

The password to the token file.

OAuth Client Credentials (service to service authentication):

Tenant Id

The Tenant Id option allows you to specify the unique ID which identifies the tenant you are connecting to.

Client Id

The Client Id option allows you to specify the unique ID which identifies the application making the request.

Client Secret

The Client Secret option allows you to specify the client secret belonging to your app.
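
For context, the three values above are the inputs to a standard OAuth 2.0 client credentials token request. The sketch below only builds such a request against the Microsoft identity platform token endpoint and does not send it. The helper name `build_token_request` and the default scope are assumptions for illustration; the connection manager performs this exchange internally, and its exact request may differ.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id, client_id, client_secret,
                        scope="https://storage.azure.com/.default"):
    """Return (url, form_body) for a client credentials token request.

    The default scope is an assumption that targets Azure Storage (Gen2);
    a different resource would need a different scope.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",  # service-to-service flow
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body


# Placeholder values only; POST the body as application/x-www-form-urlencoded.
url, body = build_token_request("my-tenant-id", "my-client-id", "my-client-secret")
print(url)
```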

Shared Key (Available only for Azure Data Lake Storage Gen2)

Azure Data Lake Storage Gen2 APIs support authorization with an Azure Storage Shared Key, which can be specified using this option.

Upload Chunk Size

The Upload Chunk Size option allows you to specify the size of the chunks into which a large file is divided so that it can be uploaded sequentially.
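
The idea of chunked sequential upload can be sketched as follows. This is an illustration of the general technique, not the toolkit's actual implementation; `iter_chunks` is a hypothetical helper, and the unit of the chunk size here is bytes, which may not match the unit used by the option.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield (offset, data) pairs by reading the stream chunk_size bytes at a time."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break  # end of stream
        yield offset, data
        offset += len(data)


# Each chunk would be sent in order, starting at its offset in the target file.
for offset, data in iter_chunks(io.BytesIO(b"abcdefghij"), 4):
    print(offset, len(data))
```

A smaller chunk size means more requests per file, while a larger one means each request carries more data.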

Timeout (secs)

The Timeout (secs) option allows you to specify a timeout value in seconds for the connection. The default value is 120 seconds.

Retry on Intermittent Errors

This option is designed to help recover from intermittent outages or service disruptions so that the integration does not have to stop because of temporary issues. When it is enabled, service calls are retried upon certain types of failure. A service call may be retried up to 3 times before an exception is thrown. Retries occur after 0 seconds, 15 seconds, and 60 seconds. Warning: although we have carefully designed this feature so that retries happen only when it is deemed safe to do so, in rare cases a retried service call could result in the creation of duplicate data.
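
The retry schedule described above (up to 3 retries, after 0, 15, and 60 seconds) can be sketched roughly like this. `TransientError` and `call_with_retries` are hypothetical names, and which failures actually count as retryable is decided by the toolkit, not by this sketch.

```python
import time

RETRY_DELAYS = (0, 15, 60)  # seconds to wait before each of the 3 retries


class TransientError(Exception):
    """Stand-in for a failure that is considered safe to retry (assumption)."""


def call_with_retries(operation, delays=RETRY_DELAYS, sleep=time.sleep):
    """Run operation once, then retry after each delay; re-raise the last failure."""
    last_error = None
    for delay in (None, *delays):  # None marks the initial, non-retry attempt
        if delay is not None:
            sleep(delay)
        try:
            return operation()
        except TransientError as exc:
            last_error = exc
    raise last_error
```

Because a retried call may have partially succeeded on the server, an operation that is not idempotent could be applied twice, which is the duplicate-data risk the warning refers to.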

Test Connection

After all the connection information has been provided, you may click the Test Connection button to test if the connection settings entered are valid.

Advanced Settings

The Advanced Settings page on the Azure Data Lake Storage Connection Manager allows you to specify some advanced and optional settings for the connection.

Proxy Mode

The Proxy Mode option allows you to specify how you want to configure the proxy server setting. There are three options available:

  • No Proxy
  • Auto-detect (Using system configured proxy)
  • Manual

Proxy Server

The Proxy Server option allows you to specify the name of the proxy server for the connection.

Port

The Port option allows you to specify the port number of the proxy server for the connection.

Username (Proxy Server Authentication)

The Username option (under Proxy Server Authentication) allows you to specify the proxy user account.

Password (Proxy Server Authentication)

The Password option (under Proxy Server Authentication) allows you to specify the proxy user's password.

Note: The Proxy Password is not included in the connection manager's ConnectionString property by default. This is by design for security reasons. However, you can include it in your ConnectionString if you want to parameterize your connection manager. The format is ProxyPassword=myProxyPassword; (make sure the last character is a semicolon). It can appear anywhere in the ConnectionString.
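
As a small illustration of the format in the note above, the following sketch appends the ProxyPassword entry to an existing connection string. The helper `with_proxy_password` and the other key names in the sample string are hypothetical; only the `ProxyPassword=...;` format comes from the note. How a password containing a semicolon should be escaped is not covered here.

```python
def with_proxy_password(connection_string: str, proxy_password: str) -> str:
    """Append a ProxyPassword entry, ensuring a trailing semicolon (hypothetical helper)."""
    base = connection_string.strip()
    if base and not base.endswith(";"):
        base += ";"  # separate the existing entries from the new one
    return f"{base}ProxyPassword={proxy_password};"


# The keys below are placeholders, not the connection manager's documented keys.
print(with_proxy_password("ProxyMode=Manual;ProxyServer=proxy.example.com", "myProxyPassword"))
```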