Quickly insert large numbers of records into a table

I have a requirement to parse some very large plain text files, with sizes between 100MB and 2GB (200k to 20M records), and insert them into my database (SQL Server 2008 R2). Usually this would be done with SQL Server Integration Services, but the files need some uncommon data transformations and I'd prefer to control the batch processes from the OutSystems application. I need to dump the files into the DB as quickly as possible, since this process will run every night.

I read a tech note about the "Custom Log" (http://www.outsystems.com/goto/technote-custom-log), which sounded like just what I needed - however, it was actually MUCH slower than just using the built-in Create action (on Agile Platform 6.0). Does anyone have any clue why?

I'm currently thinking of making an extension to use the .NET SqlBulkCopy class (http://www.codeproject.com/Tips/309564/SQL-Bulk-copy-method-to-insert-large-amount-of-dat). Has anyone tried this? Any other advice? Thanks in advance!
I'll just leave this here, for future reference.

After trying the SqlBulkCopy class, these were the results:

CustomLog - 24000 records / min
CreateENTITY - 120000 records / min
SqlBulkCopy - 400000 records / min

I imagine these values may change completely on a different infrastructure. 

I'm sure an extension could be built that looks into the platform's metadata to map any record list to the physical table fields (that's probably what Custom Log does), but I only need to do this for a few tables, so my extension (attached to this post) has the mappings hardcoded.
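
For reference, here is a minimal sketch of the kind of hardcoded mapping I mean (the table name, column names and record shape below are just placeholders - the attached extension has its own):

using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class BulkInsertHelper
{
    // Placeholder record shape; in the extension this comes from the record list.
    public class ImportRecord
    {
        public int Id;
        public string Message;
        public System.DateTime CreatedOn;
    }

    public static void BulkInsert(string connectionString, IEnumerable<ImportRecord> records)
    {
        // Build an in-memory DataTable whose columns mirror the physical table.
        DataTable table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Message", typeof(string));
        table.Columns.Add("CreatedOn", typeof(System.DateTime));

        foreach (ImportRecord r in records)
            table.Rows.Add(r.Id, r.Message, r.CreatedOn);

        using (SqlBulkCopy bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.MyImportTable"; // placeholder physical table name
            bulk.BatchSize = 10000;                          // worth tuning per environment
            bulk.BulkCopyTimeout = 0;                        // 0 = no timeout, for long imports

            // Hardcoded source -> destination column mappings.
            bulk.ColumnMappings.Add("Id", "Id");
            bulk.ColumnMappings.Add("Message", "Message");
            bulk.ColumnMappings.Add("CreatedOn", "CreatedOn");

            bulk.WriteToServer(table);
        }
    }
}

BatchSize and BulkCopyTimeout are worth adjusting; the default timeout can be too short for multi-million-row imports.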
Nice one João!

The CustomLog probably took longer because it works asynchronously, sending all the info to a MessageQueue and afterwards batch processing the queue into SQL Server (with a bulk insert). It is made that way to have the minimum impact on the database, while guaranteeing the successful delivery of all "logs".

I think your approach is the best one if you can live with a spike of SQL Server processing impacting your applications while batch importing your data...
Hi João,

I had to do something similar some time ago for a personal project (not as many records, but similar) and also ended up writing my own SqlBulkCopy-like extension, but I didn't share it because it did a lot more inside the extension method than just the inserts.

Thanks for sharing your version of it (you're so much nicer than me xD).

Just a side note on something I had problems with: the .NET SqlBulkCopy methods are case sensitive with regard to table names and attribute names. Mismatches give strange errors.
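
To illustrate (the names are placeholders, and connectionString/table are assumed to be set up as in the earlier sketch), the mapping names need to match the exact casing of the DataTable columns and the physical table columns:

// Placeholder names; the point is the exact casing on both sides of each mapping.
using (var bulk = new System.Data.SqlClient.SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "dbo.MyImportTable";       // exact physical table name
    bulk.ColumnMappings.Add("CreatedOn", "CreatedOn");     // matches the DataTable and table column casing
    // bulk.ColumnMappings.Add("createdon", "CreatedOn");  // mismatched casing -> the "strange errors" above
    bulk.WriteToServer(table);
}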

João Rosado

I get a deprecation warning on "DBTransactionManager.Current.ConnectionString;"
How can I fetch the connection string now?

Have you looked into the RuntimePublic.DB API?
Maybe it's what you are looking for.
Actually I checked and could not find any alternative to the ConnectionString/GetConnection methods.

I would suggest you keep using the ones marked as obsolete for now.
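
If the build warning itself is the annoyance, one option (plain C#, assuming the warning you see is the standard CS0618 "member is obsolete" warning) is to suppress it locally around that one call:

// Keep using the obsolete member for now, as suggested above,
// and silence the CS0618 "member is obsolete" warning just for this call.
#pragma warning disable 618
string connectionString = DBTransactionManager.Current.ConnectionString;
#pragma warning restore 618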

João Rosado