The JSON Extract Component is a transformation component used to extract column data from JSON documents. Configure your outputs by importing an example JSON, or manually adding/removing nodes. From there just configure the columns for each output and the component is ready to extract data from JSON documents from the Input.
This page allows you to import an example JSON document to generate the expected hierarchy of incoming JSON documents. From there you can add and remove nodes manually to refine the structure.
Once you've added some nodes by either importing or manually adding, you may need to refine the structure by editing the node properties. The following properties will have an effect on the extract process:
- JSON Name - This is the name of the node as it is in the expected JSON. Notice when this property is changed the JPath and Data Name properties are automatically updated.
- Data Name - This is the name of the Output (green rows) or Column (yellow rows). Notice that some rows do not have a Data Name property. This is because these are nodes that define hierarchy and have no immediate value to extract. These nodes are important however because it affects the JPath of child nodes.
Type - There are 3 different JSON node types to choose from:
- Array - This node contains a list of nodes. Notice the JPath of the child elements will contain a '[n]' at the end. This means this node repeats.
- Object - This node often just defines hierarchy and is simply a node that contains nodes.
- Value - This is a basic node containing a value available for extraction.
- JPath - This property is not editable but instead generated based on the above properties.
Configure Input and Output settings.
Specifies the input column that contains the JSON to extract.
- Null Mode
Specifies what represents a NULL value in the JSON. There are 2 options:
- Null Value - If a node cannot be found or is null then the value is NULL.
- Empty String - If a node cannot be found or is empty then the value is NULL.
In the Outputs data grid, you can check and uncheck outputs to specify if you wish to have these outputs in your component. You can also configure the following properties:
- Output Name - The name of the output
- Key Field - This specifies the identifier column for this output. This is important for linking outputs with each other. By default, the Key Field is set to '_RowIndex' which is a special field that contains the current count of this output node. This is useful because many times a JSON object will have no key field because relationships are defined hierarchically. During runtime, the value of this field will go into the _ParentKeyField field of all of this output's children.
Configure column settings for each output.
Select the output whose columns you wish to configure in the top left combo box. Select a column in the data grid on the left to populate its properties in the property grid on the right. There are a couple of special columns to take note of:
- _RowIndex - This column contains the current count of this output node.
- _ParentKeyField - This column contains the value of this records parent key field.
This page allows you to specify how errors should be handled when they happen.
- There are three options available.
- Fail on error
- Redirect rows to error output
- Ignore error
When the Redirect rows to error output option is selected, rows that failed to be sent will be redirected to the 'Error Output' output of the Transformation Component. As indicated in the screenshot below, the blue output connection represents rows that were successfully sent, and the red 'Error Output' connection represents rows that were erroneous. The 'ErrorMessage' output column found in the 'Error Output' may contain the error message that was reported by the component.