Archiving old data

  

Hi all,

We are running an OutSystems application (version 9.1) for roughly a year now. Some tables contain tens of millions of records. Only a small part of those records is active and the rest is just dead weight. To keep performance of the application above a desired level archiving will be a necessity .

I have searched for archiving strategies in OutSystems platform and found following link:
https://success.outsystems.com/Documentation/Best_Practices/Performance_Best_Practices/Performance_Best_Practices_-_Data_model#Archive_old_data_in_separate_entities

In this article item (b) in the description is very applicable to our situation:
(b) when there is a table with a high growth rate where a small percentage of the data is heavily used and the rest of the data is used just as an archive that is rarely accessed.  

The solution provided in this article is very high level and I'm looking for some details:

  1. When introducing archive entities:
    1. The archive entity must have the same attributes as the source entity or is it possible to omit fields? I can imagine that when the entities don't have matching attributes this might make moving data more complex.
    2. The primary key attribute Id will have property Is AutoNumber ticked off. And the primary keys of the source entity is reused. Is this a valid strategy?
    3. Between the archive entities the same foreign key relations as between the source entities are used? I think this is only possible (without unnecessary complexity) when the primary keys of the archive entities are not auto numbered.

  2. When moving data from source entity to archive entity:
    1.  I can think of the following strategy:
      1. Select records from the source entity filtered on a date attribute.
      2. For each record in the list returned by the select of step 1:
        1.  insert the current record in the archive entity (CreateEntity action).
        2. Select records in child entities
        3. For each record:
          1. Insert the current record in the archive entity (CreateEntity action).
          2. Select records in child entities (...repeat until deepest level is reached...)
          3. Delete the current record from the source entity (DeleteEntity action)
        4. Remove the current record from the source entity (DeleteEntity action).
    2. Biggest question i have here is: Will this strategy perform? 
      1. When there are 100k+ records each day in the top level entity that need to be archived
      2. This top level entity has 3 levels of child entities that also need data to be archived
      3. There are many nested loops and many single database actions in this strategy. Are there actions available to move complete resultsets without looping over the results?


I hope someone else has been in the same situation I am in at the moment and can give me a direction or hints.
Sharing your ideas about the above is very welcome as well!


Thanks in advance.

1.1 it's really up to you. do you want to be able to search/browse and retrieve old-data? then you should not omit data, otherwise, if it's just for storage well, up 2 you.

1.2 yes.. you will have loose-coupling then, so the archive is standalone and data is keeped even if the PK is delete in the current-table

1.3 see 1.2 you end up with losely-coupling the data, so you can join them, but without the contstraints.


2. really depends how you are going to implement it.

Perhaps it's possible to select records based on create/update-time, so you don;t need to worry about relations.

so you can archive each table seperately and using timers for example to be able to use them paralell.

when all succeeded you delete them in the source.