J.
MVP
Is there a common best practice/pattern for bulk inserting large amounts of data?
Question
Best Practices
Hi,
What is the common best practice for inserting large amounts of records into tables, with validations?
For example, our test(!) set consists of 300,000 records.
1. They need to be validated (values not empty, proper integers, etc.)
2. Lookups to get the identifier
3. Insert the data
I remember some presentation which, of course, I cannot find anymore.
I've also seen something like BulkCopy mentioned?
Is a simple commit per 1000 records sufficient?
Nuno Reis
MVP
I also remember that presentation. (It has to be from NextStep 2012, but I couldn't find it.)
Justin James
MVP
Some of the things we do (a rough code sketch of the flow follows the list):
1. Do it in a timer!
2. Put the raw records into an Entity as a queue. Add a "status" attribute for "pending", "success", and "error".
3. Have the timer only pull raw records with a status of "pending".
4. Have the timer record the ID of the real record that the raw row got created into, so you can audit and tie the new rows to their source.
5. Commit periodically; I like every 100 records, which seems to reduce locking issues and allows scheduled maintenance like backups to run more smoothly.
6. You can use the "LogRecord" action from the "AsynchronousLogging" extension if you wish to do the equivalent of a "bulk insert" and don't need to know the status... I do NOT recommend this since it makes your status attribute pretty useless and doesn't let you tie the created record to the raw record.
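To make the flow concrete, here is a rough sketch in plain C#/ADO.NET of the queue-and-timer pattern described above. In OutSystems you would build this visually inside the timer's action with aggregates, entity actions, and the CommitTransaction system action; the C# is only to show the control flow. The entity and attribute names (StagingOrder, TargetOrder, RawValue, Status, TargetId) are made up for the example.

using System;
using System.Collections.Generic;
using System.Data.SqlClient;

class QueueProcessor
{
    public static void ProcessPending(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            var tx = conn.BeginTransaction();

            // Pull only the raw rows still marked "pending" and materialize them,
            // so the reader is closed before we start writing on the same connection.
            var pending = new List<(int Id, string RawValue)>();
            using (var reader = new SqlCommand(
                "SELECT Id, RawValue FROM StagingOrder WHERE Status = 'pending'", conn, tx).ExecuteReader())
            {
                while (reader.Read())
                    pending.Add((reader.GetInt32(0), reader.GetString(1)));
            }

            int processed = 0;
            foreach (var row in pending)
            {
                object targetId = DBNull.Value;
                string status;

                if (int.TryParse(row.RawValue, out var value))      // validation
                {
                    var insert = new SqlCommand(
                        "INSERT INTO TargetOrder (Value) OUTPUT INSERTED.Id VALUES (@v)", conn, tx);
                    insert.Parameters.AddWithValue("@v", value);
                    targetId = insert.ExecuteScalar();              // keep the created Id for auditing
                    status = "success";
                }
                else
                {
                    status = "error";
                }

                var update = new SqlCommand(
                    "UPDATE StagingOrder SET Status = @s, TargetId = @t WHERE Id = @id", conn, tx);
                update.Parameters.AddWithValue("@s", status);
                update.Parameters.AddWithValue("@t", targetId);
                update.Parameters.AddWithValue("@id", row.Id);
                update.ExecuteNonQuery();

                // Commit every 100 rows to keep transactions (and locks) short.
                if (++processed % 100 == 0)
                {
                    tx.Commit();
                    tx = conn.BeginTransaction();
                }
            }
            tx.Commit();
        }
    }
}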
J.Ja
João Pedro Abreu
Staff
Greetings,
I've worked on several projects that required inserting/updating large amounts of records daily, and we used a process similar to what Justin James described above.
Since we had time constraints on those insert/update processes, we also used an extension to perform inserts into the queue entity with SqlBulkCopy. That sped up inserts a lot (around 10 times faster than inserting one record at a time). We didn't create a generic extension for that; it had methods to insert into specific entities, which meant any changes to the entities would require changes to the extension.
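For reference, a minimal sketch of what such an extension method might look like. The StagingCustomer table and its columns are placeholder names, not the actual entities from that project, and the rows are assumed to arrive already shaped as a DataTable.

using System.Data;
using System.Data.SqlClient;

static class StagingLoader
{
    // rows is a DataTable whose columns match the staging table.
    public static void BulkInsertStaging(string connectionString, DataTable rows)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "StagingCustomer";
            bulk.BatchSize = 5000;          // rows sent per round trip
            bulk.BulkCopyTimeout = 0;       // no timeout; large loads can take a while
            bulk.ColumnMappings.Add("Name", "Name");
            bulk.ColumnMappings.Add("Email", "Email");
            bulk.WriteToServer(rows);       // one bulk insert instead of row-by-row INSERTs
        }
    }
}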
I also used SQL Server's MERGE statement to do bulk "create or updates", but since it is not standard SQL and also not "maintenance friendly", I would advise against it unless you have tight time constraints.
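For completeness, this is roughly the shape of such a MERGE, shown here wrapped in plain ADO.NET. The Customer/StagingCustomer tables and their columns are invented for the example, not taken from the post.

using System.Data.SqlClient;

static class StagingUpserter
{
    public static void UpsertFromStaging(string connectionString)
    {
        const string mergeSql = @"
            MERGE INTO Customer AS target
            USING StagingCustomer AS source
                ON target.ExternalId = source.ExternalId
            WHEN MATCHED THEN
                UPDATE SET target.Name = source.Name, target.Email = source.Email
            WHEN NOT MATCHED BY TARGET THEN
                INSERT (ExternalId, Name, Email)
                VALUES (source.ExternalId, source.Name, source.Email);";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(mergeSql, conn))
        {
            conn.Open();
            cmd.CommandTimeout = 0;   // large merges can run longer than the default 30s
            cmd.ExecuteNonQuery();
        }
    }
}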
Gerry
I assume that SqlBulkCopy is only for SQL Server. But a timer is the way to go. Screen and business processes will time out at 2 and 5 minutes, respectively.
J.
MVP
Currently my status is:
- It's in a timer.
- I need to import 18 CSVs, ranging from 100 records to 2 million records per CSV.
- I need to create FKs between them.
- I'm running out of time.
- It's basically a two-stage process for every CSV: bulk upload into a staging table, then on to the real table with validation (a rough sketch of that second stage is below).
- Crappiest part: only the initial load is this big; after that I will only get incrementals.
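For that second stage, one option is a set-based promote from staging to the real table that also resolves the FKs via joins against the lookup entities. A rough sketch with invented entity and column names follows; note it silently skips rows that fail the WHERE, so the staging-table status tracking described earlier would still be needed for error reporting.

using System.Data.SqlClient;

static class StagingPromoter
{
    // Returns the number of staging rows promoted to the real table.
    public static int PromoteStagingRows(string connectionString)
    {
        const string promoteSql = @"
            INSERT INTO OrderLine (OrderId, ProductId, Quantity)
            SELECT o.Id, p.Id, s.Quantity
            FROM StagingOrderLine s
            JOIN [Order] o ON o.ExternalRef = s.OrderRef     -- resolve FK via lookup
            JOIN Product p ON p.Code = s.ProductCode         -- resolve FK via lookup
            WHERE s.Quantity > 0;";                          // keep only rows that pass validation

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(promoteSql, conn))
        {
            conn.Open();
            cmd.CommandTimeout = 0;
            return cmd.ExecuteNonQuery();
        }
    }
}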
João Rodrigues
Staff
Hi,
In what step is it timing out?
The upload of the CSVs, or the processing afterwards?
I gave a presentation at NextStep 2011 that might help regarding timers. You can find it here:
https://www.slideshare.net/jfrodrigues/easy-and-efficient-processing-of-massive-data-sets
If you can share in more detail the operations you are doing, I can try to help you out with my experience (feel free to message me).
Cheers