Using the JSON Source Component
The JSON Source component is an SSIS source component that can be used to retrieve JSON documents from an HTTP URL or a local file, break up the structure of the documents and produce column data which can then be consumed by a downstream SSIS pipeline component.
Data Source Page
The data source page determines where your JSON documents are coming from. The connection manager property can either be set to an HTTP Connection Manager, or a local file.
- Connection: Local File
-
- Local File Path
-
The path on the file system to the JSON document that will be extracted.
- Encoding
-
The encoding of the document that will be retrieved.
- Connection: HTTP Connection Manager
-
If an HTTP connection manager is selected the UI will display request settings that are used to determine the content of HTTP requests.
- HTTP Method
-
The HTTP method that will be used to send requests. Available methods include:
- GET
- POST
- PUT
- DELETE
- PATCH
- Relative Path
-
The relative path is the second part of the URL that will be used to make the HTTP request. It is combined with the base path from the connection manager to form the full path. The relative path can contain query string parameters, but they can also be added individually in the grid below.
- Query String Parameters
-
Query string parameters that will be appended to the full path when sending HTTP requests.
- Request Headers
-
Request headers that will be combined with the headers in the connection manager and included with HTTP requests.
- Request Body
-
The body of the HTTP request.
- Request Again Rule
-
The request again rules allows paged data to be retrieved. After each request the request again rule is evaluated and if it evaluates to 'True' another request will be sent. If the rule evaluates to false or is left blank no more requests will be sent.
- Max Identical
-
The max identical property will raise an error if a certain number of identical requests are sent. The purpose of this feature is to prevent accidentally sending the same request in a never ending loop. If 0 is selected the feature will be disabled and an unlimited of identical requests can be sent.
- Preview Button
-
The preview button launch the Preview dialog which will send the first request based on the current settings of the component, and show a preview of the response. If there are more requests based on the value of the request again rule they can be sent and the response previewed by clicking the 'Next' button. The Preview dialog is a good way to identify HTTP configuration problems without running the package.
- Import Settings Button
-
Click Import Settings button to import the configurations from a file.
- Export Settings Button
-
Click Export Settings button to save the configurations of the component to a JSON file later for reuse.
- Expression fx Icon
-
Click the blue fx icon to launch SSIS Expression Editor to enable dynamic update of the property at run time.
- Generate Documentation Icon
-
Click the Generate Documentation icon to generate a Word document which describes the component's metadata including relevant mapping, and so on.
- Expression Editor
-
Many of the properties on this page can be set as either a text merged field or an expression. Properties default to text merge mode, but can be changed to expression mode in the Expression Editor. The expression editor for a property can be launched by clicking the fx button. The expression editor has some predefined tokens:
- AccessToken
-
Returns the access token for sending requests. The token file needs to be generated by OAuth 2 Token Generated and specified in the HTTP Connection Manager's Authentication page.
- GetValueFromPreviousResponseHeaders
-
This is a function that is used to the value of HTTP headers from the previous response. This function is only available in expression mode.
- RequestCount
-
The number of requests that have been sent. This will be 0 on the first request and incremented by 1 after each request.
- ResponseBody
-
This is the body of the response from the previous request. On the first request is will be empty.
- ValueExistsFromPreviousResponseHeaders
-
Returns a boolean specifying if a value of the header with a specified name exists from the response of the preceding request.
- Text Merge Mode
-
In text merge mode, property values will be whatever text was entered, but with any variables or special tokens replaced with their actual values before each HTTP request is sent.
- Expression Mode
-
In expression mode the property is set to an expression which is evaluated before each request to determine the value of the field. Expressions can contain variables and special tokens, but also functions and conditional logic. The functions in the Value Lookup Functions category may be particularly useful for this component, as they can be used to look up values and JSON documents and HTTP headers.
Design Page
This page allows you to import an example JSON document, either from the file system - Import JSON (Local File) button or from the web using the HTTP settings specified on the Data Source page - Import JSON (Web) button. The imported document is used to generate the expected hierarchy of incoming JSON documents. From there you can add and remove nodes manually to refine the structure.
Once you've added some nodes by either importing or manually adding, you may need to refine the structure by editing the node properties.
The following properties will have an effect on the extract process:
- JSON Name: This is the name of the node as it is in the expected JSON. Notice when this property is changed the JPath and Data Name properties are automatically updated.
- Data Name: This is the name of the Output (green rows) or Column (yellow rows). Notice that some rows do not have a Data Name property. This is because these are nodes that define hierarchy and have no immediate value to extract. These nodes are important however because it affects the JPath of child nodes.
-
Type: There are 3 different JSON node types to choose from:
- Array: This node contains a list of nodes. Notice the JPath of the child elements will contain a '[n]' at the end. This means this node repeats.
- Object: This node often just defines hierarchy and is simply a node that contains nodes.
- Value: This is a basic node containing a value available for extraction.
- JPath: This property is not editable but instead generated based on the above properties.
General
Configure Input and Output settings.
- Null Mode
-
Specifies what represents a NULL value in the JSON. There are 2 options:
- Null Value: If a node cannot be found or is null then the value is NULL.
- Empty String: If a node cannot be found or is empty then the value is NULL.
- Outputs
-
In the Outputs data grid, you can check and uncheck outputs to specify if you wish to have these outputs in your component.
- Output Name: The name of the output. This property is configurable.
- Key Field: This specifies the identifier column for this output. This is important for linking outputs with each other and this property is configurable. By default, the Key Field is set to '_RowIndex' which is a special field that contains the current count of this output node. This is useful because many times a JSON object will have no key field because relationships are defined hierarchically. During runtime, the value of this field will go into the _ParentKeyField field of all of this output's children.
- JPath: The JPath of the output. This property is not editable.
- Output Name: The name of the output. This property is configurable.
Columns
Configure column settings for each output.
Select the output whose columns you wish to configure in the top left combo box. Select a column in the data grid on the left to populate its properties in the property grid on the right.
- The Columns Page grid consists of:
-
- Column Name: Column that will be retrieved.
- Data Type: The data type of this field.
- Properties window for the field listed
- Column Type: Description of the selected column.
- Name: Specify the column name.
- CodePage: Specify the Code Page of the field.
- Data type: The data type can be changed according.
- Length: Specify the Length of the fields. If the data type specified is a string, the length specified here would be the maximum size. If the data type is not a string, the length will be ignored.
- Precision: Specify the number of digits in a number.
- Scale: Specify the number of digits to the right of the decimal point in a number.
There are a couple of special columns to take note of:
- _RowIndex: This column contains the current count of this output node.
- _ParentKeyField: This column contains the value of this records parent key field.