Quickly insert large amounts of records into a table

Quickly insert large amounts of records into a table

  
Greetings!

I have a requirement to parse and insert into my database (SQL Server 2008 R2) some very large plain text files, with sizes between 100MB and 2GB (200k to 20M records). Usually this would be done with SQL Server Integration Services, but the files need some uncommon data transformations and I'd prefer to control the batch processes from the Outsystems application. I need to dump the files into the DB as quickly as possible, since this is a process that will run every night.

I read a tech note about the "Custom Log" (http://www.outsystems.com/goto/technote-custom-log), which sounded just what I needed - however it was actually MUCH slower than just using the built-in Create action (on Agile Platform 6.0). Anyone has any clue why?

I'm currently thinking of making an extension to use the .NET SqlBulkCopy class (http://www.codeproject.com/Tips/309564/SQL-Bulk-copy-method-to-insert-large-amount-of-dat). Has anyone tried this? Any other advice? Thanks in advance!
I'll just leave this here, for future reference.

After trying the SqlBulkCopy class these were the results:

CustomLog - 24000 records / min
CreateENTITY - 120000 records / min
SqlBulkCopy - 400000 records / min

I imagine these values may change completely on a different infrastructure. 

I'm sure an extension could be made that looked into the platform's metadata to map any record list to physical table fields (that's probably what Custom Log does), but I only need to do this for a few tables, so my extension (attached to this post) has the mappings hardcoded.
Nice one João!

The CustomLog probably took longer because it makes it in an assynchronous way, sending all info to a MessageQueue, and afterwards batchprocessing the queue into SQL Server (with Bulk insert). It is made that way to have the minimum amount of impact in the database, while guaranteeing the successfull delivery of all "logs".

I think your approach is the best one if you can comply with having a spike of SQL Server processing impacting your applications while batch importing your data...
cheers
Hi João,

I had to do some similar stuff some time ago for a personal project (not so many records, but similar) and also ended up doing my own SqlBulkCopy-like extension, but didn't share it because it did alot more inside the extension method than just the inserts.

Thanks for sharing your version of it (you so much nicer than me xD).

Just on a side note that I had problems with: .net SqlBulkCopy methods are Case Sensitve to the  tables names and attribute names case. It gives strange errors.


Regards,
João Rosado
./bump

i get an deprecated on " DBTransactionManager.Current.ConnectionString;"
how can i fetch it now?


Have you looked into the RuntimePublic.DB API?
Maybe it's what you are looking for.
Actually I checked and could not find any alternative to the ConnectionString/GetConnection methods.

I would suggest you to keep using those marked as obsolete for now.

Regards,
João Rosado

Hey,

I just ran into this extension and can anyone help me to know if this is still ready to use? I have a CreateOrUpdate function being used inside a cycle and, with two aggregates, and I want to avoid this. So instead, I thought of building an Advanced SQL command that would be able to replicate the CreateOrUpdate function, but I want to run this SQL command outside of the cycle. How can I run a list of Update or Insert commands in the best way possible? I thought of using StringBuilder and pass the String built to the Advanced SQL command, but is this the best way?

When I downloaded the BulkCopy extension I noticed that it only includes one method (BulkLoadTerms) and I can't fully understand the Structure that it uses, once I can't see where it's supposed to be put the SQL query.


Thank you, João Santos.

Hi João.

The BulkCopy extension I posted in this topic was just an example coded for a specific business use case, so it never was ready to use.

Since then I've published a generic component with the same purpose. You can find it here: https://www.outsystems.com/forge/component/1117/bulkinsert/

I hope it helps.