Google BigQuery Source Component

The Google BigQuery Source component is an SSIS data flow pipeline component that can be used to read/retrieve data from Google BigQuery.

The component includes the following two pages to configure how you want to read data from Google BigQuery.

  • General
  • Columns

General

The General page of Google BigQuery Source Component allows you to specify the general settings of the component.

SSIS Google BigQuery Source

Connection Manager

The Google BigQuery Source Component requires a connection in order to connect to Google BigQuery. The Connection Manager drop-down will show a list of all connection managers that are available to your current SSIS packages.

Object Type

The Report Type drop-down contains a list of available Report Types in Google BigQuery.

  • PROJECT: Lists all projects to which you have been granted any project role
  • DATASET: Lists all datasets in the specified project to which you have been granted the Dataset - READER role
  • JOB: Lists all jobs that you started in the specified project
  • TABLE: Lists all tables in the specified dataset to which you have been granted the Dataset - READER role
  • TABLEDATA: Retrieves table data from the specified table
  • QUERY: Run a BigQuery SQL query and returns results if the query completes
Project

The Project option allows you to specify which project you want to read from Google BigQuery. The drop-down will present a list of all available projects in your Google BigQuery.

Dataset

The Dataset option allows you to specify which dataset you want to read from Google BigQuery. The drop-down will present a list of all available datasets in the specified project.

Create Query Job

The Create Query Job option allows you to create a query job in Big Query when executing BigQuery SQL query text.

Note: This option is only available when working with QUERY Object Type.

Poll Throttle Rate

The Poll Throttle Rate option will limit the number of polling requests that can be sent per second. This is useful to limit the rates to get Query Job status when you enable the Create Query Job option.

Note: This option is only available when Create Query Job option is enabled.

Page Size

The Page Size option allows you to specify how many records you want to retrieve each time.

Output As JSON

The Output As JSON option specifies whether the output should be one single output column which contains the values in JSON format for each row returned by Google BigQuery.

Note: This option is only available when the Object Type is either TABLE DATA or Query.

Use Query Cache

The Use Query Cache option allows to specify whether to look for the result in the query cache. This option is unchecked by default.

Use Legacy SQL

The Use Legacy SQL option allows you to specify whether to use Google BigQuery's legacy SQL dialect for this query. This option is checked by default. If you uncheck this option, the query will use Google BigQuery's standard SQL.

Query

The Query text box lets you specify a snippet of query to retrieve data or create a query job in Google BigQuery.

Table

The Table option allows you to specify which table you want to read from Google BigQuery. The drop-down will present a list of all available tables in the specified dataset.

Job Projection

The Job Projection option allows you to restrict information returned to a set of selected fields. There are two options available:

  • Full: Includes all job data
  • Minimal: Does not include the job configuration

Note: This option is only available when the Object Type JOB is selected.

Job State

The Job State option allows you to filter jobs by their running state. There are four options available:

  • All: returns jobs in all states
  • Done: returns finished jobs
  • Pending: returns pending jobs
  • Running: returns running jobs

Note: This option is only available when the Object Type JOB is selected.

Include All Users

The Include All Users option allows you to specify whether to display jobs owned by all users in the project.

Note: This option is only available when the Object Type JOB is selected and requires you to have the Project - Is Owner role for the specified project.

Start Row

The Start Row option allows you to specify the index of the starting row to read. Default value is 0.

Note: Row index in Google BigQuery is zero-based. When you set the Start Row option to 0, the component will read the TABLEDATA from first row. This option is only available when the Object Type TABLEDATA is selected.

Refresh Component Button

Clicking the Refresh Component button causes the component to retrieve the latest metadata and update each field to its most recent metadata.

Expression fx Button

Click the fx button to launch SSIS Expression Editor to enable dynamic update of the property at run time.

Generate Documentation Button

Click the Generate Documentation button to generate a Word document which describes the component's metadata including relevant mapping, and so on.

Columns

The Columns page of the Google BigQuery Source Component shows you all available fields from the report type that you specified on the General page.

SSIS Google BigQuery Source - columns

On the top left of the grid, the checkbox can be used to toggle the selection of all available fields. This is a productive way to check or uncheck all available fields.

The Columns Page grid consists of:
  • Google BigQuery Field: Column that will be retrieved from Google BigQuery.
  • Data Type: The data type of this field.
  • Properties window for the field listed
    • Name: Specify the column name.
    • Data type: The data type can be changed according.
    • Length: Specify the Length of the fields. If the data type specified is a string, the length specified here would be the maximum size. If the data type is not a string, the length will be ignored.
    • Precision: Specify the number of digits in a number.
    • Scale: Specify the number of digits to the right of the decimal point in a number.
    • CodePage: Specify the Code Page of the field.