Using the Google Cloud Storage Source Component

The Google Cloud Storage Source Component is an SSIS data flow pipeline component that retrieves data from Google Cloud Storage.

The component has the following two pages for configuring how data is read from Google Cloud Storage:

  • General
  • Columns

General Page

The General page of the Google Cloud Storage Source Component allows you to specify the general settings of the component.

SSIS Google Cloud Storage Source Component

Source Item Settings

Connection Manager

The Google Cloud Storage Source Component requires a Google Cloud Storage Connection Manager in order to connect to Google Cloud Storage. The Connection Manager drop-down lists all Google Cloud Storage Connection Managers that are available in the current SSIS package.

Source Item Path

The Source Item Path specifies the base item to select. Click the ellipsis button ('...') to open a Google Cloud Storage browser dialog and select an item.

Item Selection Mode

The Item Selection Mode specifies which Google Cloud Storage items are selected. Modes marked "(Files only)" still traverse folders to retrieve items but do not return records for the folders themselves.

  • Selected Item: Retrieves only the item specified by Source Item Path.
  • Recursive: Retrieves the selected item (specified by the Source Item Path option) and all sub-items recursively.
  • Recursive (Files only): Retrieves the same items as the Recursive mode but returns only files.
  • Selected Level (Files only): Retrieves the same items as the Selected Level mode but returns only files.
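
The difference between the modes can be illustrated with a minimal Python sketch. This is not the component's implementation: the item names, the convention that folder names end in "/", and the select_items helper are all assumptions made for illustration. Selected Level (Files only) is left out because the Selected Level mode it builds on is not described in this section.

```python
from typing import List

# Hypothetical listing; names ending in "/" stand for folders.
ITEMS = [
    "reports/",
    "reports/summary.csv",
    "reports/2023/",
    "reports/2023/q1.csv",
]


def is_folder(name: str) -> bool:
    """Assumed convention: a trailing slash marks a folder."""
    return name.endswith("/")


def select_items(items: List[str], source_path: str, mode: str) -> List[str]:
    """Return the items that the given mode would select (illustrative only)."""
    if mode == "Selected Item":
        # Only the item named by Source Item Path.
        return [name for name in items if name == source_path]
    if mode in ("Recursive", "Recursive (Files only)"):
        # The selected item plus everything beneath it.
        selected = [
            name
            for name in items
            if name == source_path
            or (is_folder(source_path) and name.startswith(source_path))
        ]
        if mode == "Recursive (Files only)":
            # Folders are still traversed, but produce no records.
            selected = [name for name in selected if not is_folder(name)]
        return selected
    raise ValueError(f"mode not modelled in this sketch: {mode}")


for mode in ("Selected Item", "Recursive", "Recursive (Files only)"):
    print(mode, "->", select_items(ITEMS, "reports/", mode))
```

In this sketch the file reports/2023/q1.csv is still returned in Recursive (Files only) mode even though its parent folder reports/2023/ produces no record of its own.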

Projection

The Projection option specifies which properties are returned as fields on the Columns page of the Source Component. Acceptable values are:

  • Include all properties.
  • Omit the owner, acl, and defaultObjectAcl properties.
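
As a rough illustration of the effect of the two values, the following Python sketch filters a metadata record the way the second option is described. The apply_projection helper and the sample record are hypothetical; only the three omitted property names come from the text above.

```python
from typing import Any, Dict

# Property names taken from the option description above.
ACL_PROPERTIES = ("owner", "acl", "defaultObjectAcl")


def apply_projection(metadata: Dict[str, Any], include_all: bool) -> Dict[str, Any]:
    """Return the properties that would remain available as fields."""
    if include_all:
        return dict(metadata)
    # Drop the ownership and access-control properties.
    return {key: value for key, value in metadata.items() if key not in ACL_PROPERTIES}


# Hypothetical metadata record, for illustration only.
sample = {
    "name": "reports/summary.csv",
    "size": "2048",
    "owner": {"entity": "user-example"},
    "acl": [{"entity": "allUsers", "role": "READER"}],
}

print(sorted(apply_projection(sample, include_all=True)))
print(sorted(apply_projection(sample, include_all=False)))
```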

User Project

The User Project specifies the project to be billed for the request. It is required for Requester Pays buckets.
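
For context, the Google Cloud Storage JSON API accepts the billing project as a userProject query parameter on its requests. The Python sketch below builds such a request URL for an object listing. It assumes, without confirmation from this documentation, that the component's setting corresponds to that parameter; the bucket name, project ID, and build_list_url helper are hypothetical.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_list_url(bucket: str, user_project: Optional[str] = None) -> str:
    """Build an object-listing URL, adding userProject only when one is given."""
    base = f"https://storage.googleapis.com/storage/v1/b/{quote(bucket, safe='')}/o"
    params = {}
    if user_project:
        # Names the project to bill; needed for Requester Pays buckets.
        params["userProject"] = user_project
    return f"{base}?{urlencode(params)}" if params else base


# Hypothetical bucket and project names.
print(build_list_url("example-bucket"))
print(build_list_url("example-bucket", "billing-project-123"))
```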

Max Results

The Max Results option specifies how many records to retrieve per service call to Google Cloud Storage. The default value is 1000.
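
Because Max Results limits each service call rather than the total, a larger listing is read over several calls. The Python sketch below shows that paging pattern against a stand-in service. The make_fake_service and read_all helpers are hypothetical, and the integer page token is a simplification of whatever token the real service returns.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Page = Tuple[List[str], Optional[int]]


def make_fake_service(records: List[str]) -> Callable[[int, int], Page]:
    """Stand-in for the remote service: returns one page and the next token."""

    def fetch_page(start: int, max_results: int) -> Page:
        chunk = records[start:start + max_results]
        next_start = start + max_results
        # A token of None signals that there are no more pages.
        return chunk, (next_start if next_start < len(records) else None)

    return fetch_page


def read_all(fetch_page: Callable[[int, int], Page], max_results: int = 1000) -> Iterator[str]:
    """Yield every record, requesting at most max_results per call."""
    token: Optional[int] = 0
    while token is not None:
        chunk, token = fetch_page(token, max_results)
        yield from chunk


service = make_fake_service([f"file-{n}.csv" for n in range(2500)])
rows = list(read_all(service, max_results=1000))
print(len(rows))
```

A smaller value means more calls with smaller responses, while a larger value means fewer calls that each return more records.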

Refresh Component Button

Clicking the Refresh Component button causes the component to retrieve the latest metadata and update each field accordingly.

Expression fx Button

Click the fx button to launch the SSIS Expression Editor, which lets you update the property dynamically at run time.

Generate Documentation Button

Click the Generate Documentation button to generate a Word document that describes the component's metadata, including relevant mappings.

Columns Page

The Columns page of the Google Cloud Storage Source Component shows you all available attributes from the object that you specified on the General page.

SSIS Google Cloud Storage Source Component - Columns Page

The checkbox at the top left of the grid toggles the selection of all available fields, which is a quick way to check or uncheck every field at once.

The Columns Page grid consists of:

  • Google Cloud Storage Field: The column that will be retrieved from the current item (file or folder).
  • Data Type: The data type of this field.

Note: As a general best practice, select only the fields that the downstream pipeline components need.