Use of Salesforce Source Component

Salesforce Source Component is an SSIS data flow pipeline component that can be used to read / retrieve data from your Salesforce.com instance. Salesforce source component supports using either a Salesforce object, or an object query statement as the data source. 

Salesforce Source Component includes the following two pages to configure how you want to read data from Salesforce.

  • General
  • Columns

General page

The General page of Salesforce Source Component allows you to specify the general settings of the component. 

Salesforce Source Editor

Salesforce Connection Manager

Salesforce source component requires a Salesforce connection in order to connect to your Salesforce.com instance. The Salesforce Connection Manager option will show all Salesforce connection managers that have been created in the current SSIS package.

Batch Size

Batch Size option allows you to specify how many records you want to retrieve each time. The default value is 1000. The maximum allowed value is 2000. This parameter should not be too small for performance reason. Note that the component will automatically page through all records based on the object or object query you have specified below, regardless the value of Batch Size option.

Source Type

Source Type option allows you to specify whether you want read data from Salesforce object, or use a snippet of Salesforce object query. Using Object option, you have the flexibility of visually picking which fields that you want to read from Salesforce object. When ObjectQuery is used, you can leverage the capabilities and flexibility of SOQL (Salesforce Object Query Language), such as JOIN, filtering criteria, etc. Your business requirements should influence which option to choose.

Include Deleted / Archived

Include Deleted / Archived option specifies whether those deleted or archived Salesforce records should be returned.

Use Bulk API

Use Bulk API option specifies whether you want to use Salesforce Bulk API to read data from your Salesforce organization. This option can be used when you have significantly large number of records in Salesforce system which requires an extraordinary long time if they are read synchronously. 

Note that when this option is enabled, the Include Deleted / Archived option above will have no effect, as Salesforce Bulk API does not support reading deleted or archived records.

PK Chunking Header (since v6.0)

PK Chunking Header allows to use automatic primary key (PK) chunking for a bulk query job, which can be used to handle large data set extracts.

The supported field values are:

  • TRUE: Enables PK chunking with the default chunk size (100,000), starting from the first record ID in the queried table.
  • FALSE: Disables PK chunking. If no value provided to PK Chunking Header property, the default is FALSE.
  • chunkSize: Specifies the number of records within the ID boundaries for each chunk. The maximum size is 250,000.
  • parent: Specifies the parent object when you’re enabling PK chunking for queries on sharing objects.
  • startRow: Specifies the 15-character or 18-character record ID to be used as the lower boundary for the first chunk. 

Note this option will only be available when the Use Bulk API option is enabled. 

Output Timezone

Output Timezone  option specifies how Salesforce datetime values are retrieved. There are 3 options available:

  • Timezone of Connection User (Default)
  • Adjust to UTC
  • Adjust to Local System Timezone
Source Object

Source Object option is only available when Source Type option has been chosen as Object. When you click the drop down button of this option, the toolkit will automatically retrieve the list of all available Salesforce objects, so that you can pick one. 

Object Query

When the above Source Type option has been chosen as ObjectQuery, you will be presented a text editor. You can use this to enter a snippet of object query to retrieve data from your Salesforce.com instance.

Salesforce Source Editor

There are several advantages of using ObjectQuery. First, you can use SOQL (Salesforce Object Query Language) to build complex query that retrieves data from Salesforce. You may include fields from parent object(s) in your query to read data from the parent object(s) directly. Secondly, using ObjectQuery option, you can apply WHERE clause to filter the data that is returned from Salesforce.com instance, so that you only work with the data that you are interested in. 

SSIS Integration Toolkit for Salesforce has the most sophisticated support of SOQL. You can write very complex SOQL queries, including using child-to-parent relationship, aggregation functions and datetime functions in the queries. The following is a supported SOQL query which includes the usage of child-to-parent relationship.

SELECT c.Id, 
         c.FirstName, 
         c.LastName, 
         c.Account.Name, 
         c.Account.NumberOfEmployees
  FROM Contact c

The following query with aggregation functions is also supported.

SELECT CALENDAR_YEAR(CreatedDate) CreatedYear, SUM(Amount) TotalAmount
  FROM Opportunity
  GROUP BY CALENDAR_YEAR(CreatedDate)

Due to the constraints of SSIS, we do not support reading data from child objects in object query statement. For example, the following query is not supported.

SELECT Id, 
         Name, 
         (SELECT Id, Name FROM Contacts) 
  FROM Account

The object query editor comes with a toolbar, which provides some common functionality useful when editing the query statement. One of the buttons is called Insert Variables, which can be used to insert SSIS variables in the query. With this capability, you can parameterize your SOQL query so it only returns the records that you are interested in.

Salesforce Source Editor

You can use SSIS variables in whichever way you would like to use, and you are not limited to use SSIS variables in conditions only.

Salesforce Source Editor

It should be noted, when using SOQL aggregation functions, datetime functions in the SELECT clause, you would typically provide an alias for each function, otherwise your output field name will be in exprN format, where N is the sequence number of such function.

Refresh Salesforce Metadata Button

By clicking Refresh Salesforce Metadata button, the component will retrieve the latest metadata from Salesforce and update each field. This feature works by performing the following three actions.

  • Update any existing object fields to the latest metadata
  • Add any new object fields that have recently been created in Salesforce
  • Remove any object fields that have recently been deleted from Salesforce

After clicking this button, you will receive the following screen once the refresh is done.

Refresh Metadata

The Refresh Salesforce Metadata feature has been designed to minimize any additional rework. Particularly, the component will only update all existing fields to the latest Salesforce metadata without actually deleting and re-creating them. This would otherwise cause rework if you have downstream pipeline component mapped to such fields.

Columns page

The Columns page of Salesforce Source Component shows you all available columns from the object or object query that you have defined in General page. 

When the Source Type option has been chosen as Object in General page, you will notice that the grid in Columns page has a checkbox column on the left. This allows you to specify what fields you want to read from the chosen Salesforce object.

Salesforce Source Editor

On the top left of the grid, you can see a checkbox, which can be used to toggle the selection of all available Salesforce fields. This is a productive way to select or unselect all available Salesforce fields. 

Note: As a general best practice, only select the Salesforce fields that are needed for the downstream pipeline components. This would not only speed up Salesforce queries, but also save resources for SSIS engine. 

When the Source Type option has been chosen as ObjectQuery in the General page, you will not see any checkboxes in the grid. The reason is, object query itself defines what to read from Salesforce. 

Salesforce Source Editor