I want to read a large CSV file (more than 1 million records) and insert/update the data accordingly. I used the existing OutSystems CSVToRecord extension, but it is very slow because it loads the entire file into memory.
So I wanted to write an extension using the OutSystems DB API. I am processing the records with a chunk-based approach. The first few chunks take very little time, but subsequent chunks take progressively longer. In my extension code, I clear the list and add a fresh set of records when processing the next chunk. I suspect a memory leak.
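Roughly what my chunk loop looks like (a simplified sketch; the method and variable names are illustrative, and the comma split doesn't handle quoted fields):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CsvChunkReader
{
    // Reads the CSV in fixed-size chunks, reusing one list so the rows of
    // a finished chunk become eligible for garbage collection before the
    // next chunk is loaded. Returns the total number of rows processed.
    public static int ProcessInChunks(string path, int chunkSize,
                                      Action<List<string[]>> processChunk)
    {
        var chunk = new List<string[]>(chunkSize);
        int total = 0;
        using (var reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                chunk.Add(line.Split(','));   // naive split; no quoted fields
                if (chunk.Count == chunkSize)
                {
                    processChunk(chunk);
                    total += chunk.Count;
                    chunk.Clear();            // drop references before the next chunk
                }
            }
            if (chunk.Count > 0)              // flush the final partial chunk
            {
                processChunk(chunk);
                total += chunk.Count;
            }
        }
        return total;
    }
}
```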
I also noticed that the OutSystems DB API doesn't provide batch inserts/updates; I couldn't find any class that supports them. Is there a utility or action I am not aware of?
I don't know the solution, but I know a colleague of mine recently did the same. I'll ask him to reply.
Did you measure where your chunks were taking more time? It would be helpful to know whether it is the database inserts that get slower or the reading of the file. The last time I did something like this, I created a .NET extension that read the file starting from a given offset (using Seek) and returned a list of records to Service Studio.
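Something along these lines (a simplified sketch; the names are mine, and it assumes UTF-8 text with '\n' line endings):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CsvSeekReader
{
    // Reads up to maxLines lines starting at byteOffset and reports the byte
    // offset where the next call should resume, so each Service Studio
    // iteration only touches one slice of the file instead of re-reading it.
    public static List<string> ReadSlice(string path, long byteOffset,
                                         int maxLines, out long nextOffset)
    {
        var lines = new List<string>(maxLines);
        var buf = new List<byte>(256);
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.Seek(byteOffset, SeekOrigin.Begin);
            nextOffset = byteOffset;
            int b;
            while (lines.Count < maxLines && (b = fs.ReadByte()) != -1)
            {
                nextOffset++;
                if (b == '\n')
                {
                    // Decode one complete line; tolerate Windows "\r\n" endings.
                    lines.Add(Encoding.UTF8.GetString(buf.ToArray()).TrimEnd('\r'));
                    buf.Clear();
                }
                else
                {
                    buf.Add((byte)b);
                }
            }
            // Final line without a trailing newline.
            if (buf.Count > 0 && lines.Count < maxLines)
                lines.Add(Encoding.UTF8.GetString(buf.ToArray()));
        }
        return lines;
    }
}
```

The caller keeps the returned offset between requests, so memory usage stays flat no matter how large the file is.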
Then I used this extension http://www.outsystems.com/forge/component/1117/bulkinsert/ to do batch inserts. If your table has indexes, disabling them during the insert and re-enabling them afterwards may speed up the process.
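If you end up writing your own extension instead of using the Forge component, the usual .NET mechanism for this on SQL Server is SqlBulkCopy, which streams a whole chunk in one round trip instead of one INSERT per record. A minimal sketch (the column names, table name, and connection string are hypothetical):

```csharp
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class BulkLoader
{
    // Builds an in-memory DataTable for one chunk of parsed CSV rows.
    // Column names here are placeholders; match your target table.
    public static DataTable ToTable(IEnumerable<string[]> rows)
    {
        var table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Value", typeof(string));
        foreach (var r in rows)
            table.Rows.Add(r[0], r[1]);
        return table;
    }

    // Streams the chunk to SQL Server in a single bulk operation.
    public static void BulkInsert(string connectionString, string tableName,
                                  DataTable chunk)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = tableName;
            bulk.BatchSize = chunk.Rows.Count;
            bulk.WriteToServer(chunk);
        }
    }
}
```

Disabling the indexes around the load would be `ALTER INDEX ALL ON YourTable DISABLE` before and `ALTER INDEX ALL ON YourTable REBUILD` after (don't disable the clustered index, or the table becomes unreadable).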
If you also need to do updates, then you may want to consider inserting your records into a staging table and using MERGE (https://msdn.microsoft.com/en-us/library/bb510625.aspx), assuming your database is SQL Server. If you go the MERGE route, you may want to grant additional permissions to the OSRUNTIME user so it can use TRUNCATE TABLE on the staging table.
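A sketch of the staging-plus-MERGE pattern in T-SQL (the table and column names are hypothetical; adjust the key and SET list to your schema):

```sql
-- CsvStaging holds the freshly bulk-inserted chunk; TargetRecords is the
-- real table, keyed on RecordKey.
TRUNCATE TABLE CsvStaging;  -- the permission OSRUNTIME would need

-- ... bulk insert the chunk into CsvStaging here ...

MERGE TargetRecords AS t
USING CsvStaging AS s
    ON t.RecordKey = s.RecordKey
WHEN MATCHED THEN
    UPDATE SET t.Value = s.Value          -- existing rows become updates
WHEN NOT MATCHED THEN
    INSERT (RecordKey, Value)             -- new rows become inserts
    VALUES (s.RecordKey, s.Value);
```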