[CSVUtil] Large csv files throwing System.OutOfMemoryException

Forge Component
Published on 28 Sep by Wei Zhu

I just wanted to point out a current limitation when processing large CSV files due to the capacity limit of StringBuilder variables. We are working with files containing 2 million+ rows, so when converting to text, the StringBuilder variables eventually throw a "System.OutOfMemoryException" because the appended text exceeds the 2,147,483,647-character limit.

Any ideas on how to overcome this?

Keep up the good work.

Ossama 

Hi Ossama, can you batch-process the record list? What I mean is to process for example 100,000 rows at a time through CSVUtil, then get the Text back, process the next 100k rows, then append Text to the previous output.
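The batching idea could be sketched roughly like this (a minimal illustration in Java rather than the actual CSVUtil internals; `recordsToCsv` is a hypothetical stand-in for the component's record-list-to-text conversion):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchCsv {
    // Hypothetical stand-in for CSVUtil's record-list-to-text conversion.
    static String recordsToCsv(List<String[]> records) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : records) {
            sb.append(String.join(",", row)).append('\n');
        }
        return sb.toString();
    }

    // Convert the full record list in batches of batchSize rows,
    // collecting each batch's text separately instead of building
    // one giant string up front.
    static List<String> convertInBatches(List<String[]> records, int batchSize) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            int end = Math.min(i + batchSize, records.size());
            parts.add(recordsToCsv(records.subList(i, end)));
        }
        return parts;
    }
}
```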


Hey Jonathan :)

We've ruled out that idea because we later convert the text output to binary and send it to an Azure Blob. We have to send the complete binary to Azure at once.

Processing it in batches would require us to concatenate the outputs at the end, resulting in the same issue.

Hi Ossama


The OutOfMemory error is a limitation of .NET: the maximum size of a single object on the managed heap is 2 GB.


For Azure Blob, if you want to upload a large file, you can use Put Block List:

 https://docs.microsoft.com/en-us/rest/api/storageservices/put-block-list
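As a rough sketch of the Put Block List idea (the HTTP calls themselves are omitted here; per the REST docs, each block is uploaded with Put Block under a Base64-encoded block ID, and the IDs must all have the same length before the final Put Block List commit):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

public class BlockSplitter {
    // Compute the Base64 block IDs for splitting a large payload into
    // fixed-size blocks, as Put Block / Put Block List expect.
    // The actual uploads would go through the Azure Storage REST API or SDK.
    static List<String> blockIds(byte[] data, int blockSize) {
        List<String> ids = new ArrayList<>();
        int count = (data.length + blockSize - 1) / blockSize;
        for (int i = 0; i < count; i++) {
            // Fixed-width numeric IDs so every encoded ID has equal length.
            String raw = String.format("%08d", i);
            ids.add(Base64.getEncoder()
                    .encodeToString(raw.getBytes(StandardCharsets.UTF_8)));
        }
        return ids;
    }
}
```

This way the complete blob is assembled server-side from the committed blocks, so no single 2 GB object ever has to exist in the client's memory.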


Regards

Wei






Solution

Hi Wei

We were actually able to overcome this by converting the StringBuilder variable to a list of strings.
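For anyone hitting the same wall, the list-of-strings approach might look something like this (a minimal Java sketch, not the actual CSVUtil change; the class name and chunk cap are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedText {
    private final int maxChunkChars;
    private final List<String> chunks = new ArrayList<>();
    private StringBuilder current = new StringBuilder();

    ChunkedText(int maxChunkChars) {
        this.maxChunkChars = maxChunkChars;
    }

    // Append one CSV line; when the working buffer would exceed the cap,
    // seal it into the list and start a fresh StringBuilder, so no single
    // object ever approaches the runtime's size limit.
    void appendLine(String line) {
        if (current.length() > 0
                && current.length() + line.length() + 1 > maxChunkChars) {
            chunks.add(current.toString());
            current = new StringBuilder();
        }
        current.append(line).append('\n');
    }

    // Flush the remaining buffer and return all chunks.
    List<String> finish() {
        if (current.length() > 0) {
            chunks.add(current.toString());
            current = new StringBuilder();
        }
        return chunks;
    }
}
```

Each string in the list can then be converted to binary and streamed out piece by piece.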

Cheers

Ossama
