Working with Large Files via MongoDB GridFS using KingswaySoft

31 March 2026
KingswaySoft Team

MongoDB is a popular choice among NoSQL database systems, offering scalability and flexibility along with straightforward querying and indexing. Because it is document-based, data is stored not in rows and columns but in a JSON-like format called BSON, which can hold varying, non-tabular, unstructured data while still supporting conventional data types. However, when integrating data into MongoDB, you often run into a hard limit: a BSON document cannot exceed 16MB. If your integration involves storing large high-resolution images, video files, or sizable PDFs, a standard MongoDB document upload may fail because of this size restriction. This is where GridFS comes into play.

GridFS is a MongoDB specification designed for storing and retrieving files that exceed the 16MB document limit. Instead of storing a massive file in a single document, GridFS partitions the file into smaller sections called chunks; when the file is read back, these chunks are reassembled as needed. GridFS uses two collections to store the files, and these are placed in a common bucket by default, which is why the collection names are prefixed with "fs". Note that you can always choose a different bucket name (which changes the prefix), and you can also create multiple buckets in a single database. Assuming the default bucket, the collection names are as follows:
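Conceptually, the chunking round trip can be sketched in a few lines of plain Python (a simplified illustration of the idea, not KingswaySoft or MongoDB driver code):

```python
# A minimal sketch of the GridFS chunking idea: split a byte payload
# into fixed-size chunks, then reassemble them in order -- conceptually
# what GridFS does with the fs.chunks collection.

CHUNK_SIZE = 255 * 1024  # GridFS default chunk size: 255 KiB

def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (n, chunk) pairs; the last chunk may be shorter."""
    for n, offset in enumerate(range(0, len(data), chunk_size)):
        yield n, data[offset:offset + chunk_size]

def reassemble(chunks):
    """Concatenate chunks sorted by their sequence number n."""
    return b"".join(chunk for _, chunk in sorted(chunks))

payload = b"x" * (600 * 1024)  # a 600 KiB "file"
chunks = list(split_into_chunks(payload))
assert len(chunks) == 3               # 255 KiB + 255 KiB + 90 KiB
assert reassemble(chunks) == payload  # round-trips losslessly
```
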

  • fs.chunks: This collection stores the binary chunks of each file. Documents in this collection have the format shown below.

fs.chunks collection in MongoDB

  • fs.files: This collection stores each file's metadata, such as the filename, upload date, and file size. Documents in this collection have the format shown below.

fs.files collection in MongoDB
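To make the relationship between the two collections concrete, here are hypothetical example documents expressed as Python dicts (the field names follow the GridFS specification, but the _id values, filename, and metadata shown are made up for illustration):

```python
# Hypothetical example documents showing how fs.files and fs.chunks
# relate. Field names follow the GridFS specification; the _id values,
# filename, and metadata are invented for illustration.

files_doc = {                            # one document per file, in fs.files
    "_id": "65f1c0ffee0000000000a001",   # an ObjectId in a real deployment
    "length": 26214400,                  # total file size in bytes (25 MiB)
    "chunkSize": 261120,                 # 255 KiB, the default
    "uploadDate": "2026-03-31T00:00:00Z",
    "filename": "large-video.mp4",
    "metadata": {"source": "Premium File System Source"},
}

chunk_doc = {                            # one document per chunk, in fs.chunks
    "_id": "65f1c0ffee0000000000b001",
    "files_id": files_doc["_id"],        # back-reference to the fs.files _id
    "n": 0,                              # zero-based chunk sequence number
    "data": b"...binary chunk payload...",  # stored as BinData in MongoDB
}
```

Each chunk carries a `files_id` pointing back to its parent document in fs.files and a sequence number `n` so the file can be reassembled in order.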

KingswaySoft offers MongoDB components as part of our SSIS Integration Toolkit, and starting with release v26.1, we support using GridFS to easily write (and read) large files to MongoDB. In this blog post, we will walk through how this can be achieved.

Configuring the SSIS Data Flow Task

In our use case, we have a large file in a local folder that needs to be uploaded to MongoDB. Since the file exceeds the BSON size limit, we will use GridFS, which is supported in the KingswaySoft MongoDB Destination component. As a first step, within your SSIS Data Flow, use our KingswaySoft Premium File System Source component to access the file's binary content and metadata.

Premium File System Source

Set the Source Path to the file in question; if you have more than one file, you can specify a folder or use Advanced Filtering (the second page in the component) to select the files. The Columns page lists the file's metadata along with "FileContent", which is the binary field. You can pick other metadata fields, such as File Name, as required.

Premium File System Source - Columns page

Next, drag and drop the MongoDB Destination component and connect the source component to it. Create a MongoDB connection manager to connect to your instance, then configure the Destination component as shown below: set the Destination Type to GridFS and choose the bucket for the collections. We leave it as the default "fs" bucket. For more details on how to configure the connection manager, please refer to our Online Help Manual.

MongoDB Destination - General page

On the Columns page, the MongoDB field called _content represents the binary field for the file. We have mapped the filename along with it and left the other fields at their defaults; they can be mapped as required. The following can be used as a reference:

  • chunkSize: The size of each chunk in bytes. GridFS divides the file into chunks of size chunkSize, except for the last chunk, which is only as large as needed. The default size is 255 KiB.
  • metadata (Optional): The metadata field may be of any data type and can hold any additional information you want to store. If you wish to add arbitrary fields to documents in the files collection, add them to an object in the metadata field.
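As a quick sanity check of the arithmetic, the number of chunks a file occupies is simply its length divided by chunkSize, rounded up (plain Python, for illustration only):

```python
import math

# Every chunk is exactly chunk_size bytes except the last, which
# holds the remainder -- so the chunk count is a ceiling division.
DEFAULT_CHUNK_SIZE = 255 * 1024  # 261120 bytes

def chunk_count(file_length: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    return math.ceil(file_length / chunk_size)

# A 100 MiB file needs 402 chunks at the default size:
print(chunk_count(100 * 1024 * 1024))  # 402
```
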

MongoDB Destination - Columns page

Once the components are configured, it's time to run the Data Flow. We have enabled a data viewer to view the file details.

Data Flow execution

Once the Data Flow has executed successfully, let's move to our MongoDB instance to see how the collections are organized into chunks and files.

GridFS collections within MongoDB

Within your MongoDB instance, navigate to your cluster in Data Explorer and choose your database. Here, you can see the GridFS-related collections ".chunks" and ".files" prefixed with the default bucket name we used (fs). As mentioned above, you can create your own bucket if required.

GridFS FS cluster

Select the fs.chunks collection, and you can see documents like the one below containing the binary data of the file's chunks.

FS Chunks

Similarly, the fs.files collection shows the file's metadata, such as the filename you set while uploading, along with the upload date and any additional metadata you might have specified.

FS Files

Now that we have seen how the file upload works, let's take a quick look at how files are read back from MongoDB using GridFS.

Reading the files using GridFS in SSIS

In your SSIS Data Flow Task, drag and drop the KingswaySoft MongoDB Source component. Choose the connection manager and database, set the Source Type to GridFS, and select your bucket.

MongoDB Source

On the Columns page, the _content field holds the binary content of the file, as indicated by its image data type. The rest of the available metadata fields can be used as required.
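Under the hood, reading a file back amounts to collecting its chunks, ordering them by `n`, and concatenating their data. A simplified plain-Python sketch with hypothetical fixture data (not the component's actual implementation):

```python
# Sketch of the GridFS read path: gather the chunks belonging to a
# file, sort them by sequence number n, concatenate their data, and
# verify the result against the length recorded in fs.files.

def read_file(files_doc: dict, chunk_docs: list) -> bytes:
    ordered = sorted(
        (c for c in chunk_docs if c["files_id"] == files_doc["_id"]),
        key=lambda c: c["n"],
    )
    data = b"".join(c["data"] for c in ordered)
    if len(data) != files_doc["length"]:
        raise ValueError("reassembled size does not match fs.files length")
    return data

# Tiny hypothetical fixture: a 5-byte "file" stored as two chunks.
files_doc = {"_id": 1, "length": 5, "chunkSize": 3}
chunk_docs = [
    {"files_id": 1, "n": 1, "data": b"lo"},   # deliberately out of order
    {"files_id": 1, "n": 0, "data": b"hel"},
]
print(read_file(files_doc, chunk_docs))  # b'hello'
```
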

MongoDB Source columns

Conclusion

By combining the chunk-based storage of MongoDB GridFS with the easy configuration offered by the KingswaySoft MongoDB components, you can bypass the document size limit while maintaining a high-performance, scalable integration. Whether you are archiving massive documents or streaming media, this approach keeps your data organized, accessible, and ready for growth.

We hope this has helped!
