Preparing for an application version rollback scenario in Production


We never want it to happen and it rarely needs to be done, yet we should always prepare for it: especially for complex, mission-critical systems, having a rollback plan is a must.

The purpose of this post is to clarify concepts and give some guidance on preparing and testing a rollback scenario using OutSystems Platform. However, it should not be considered a foolproof technical note, as explained in the next section.


Fallacies and pitfalls


Fallacy #1: "You cannot do a rollback with OutSystems Platform"


You can check how to use LifeTime to publish a previous version of an app here. So, with OutSystems, is doing a rollback exactly like publishing an app "forward"? For the Platform it is, indeed, just a new version being published: code is generated and compiled, the differential SQL script is determined, and binaries are copied and deployed to the front-end(s). Also, as usual, do not forget that OutSystems Platform never deletes data from the DB, even when Entity attributes (or entire Entities) are removed from a module: those are kept in the DB for the developer to decide what best to do with them. And here lies one practical difference between publishing "forward" and "backward": when you "go back", an unusually large number of elements are removed from the application data model, yet those removals are not reflected in the DB after the publish, so their impact on the deployment needs to be assessed and mitigated.


Another important aspect is that if changes in the application involve changes to DB constraints, the data already present in the DB may create an upgrade/downgrade roadblock, and a DB script will probably have to be executed before deployment can take place. Those constraints can be related to:

  i. foreign keys (roadblock when trying to put back a DB constraint while the new data in the DB might no longer respect it)

  ii. primary keys (same as previous)

  iii. unique indexes (same as previous)

  iv. mandatory attributes (for foreign keys)

  v. attribute data types (when trying to convert back an attribute while new data in the DB might no longer be convertible to it)
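
To make the cases above more concrete, here is a minimal sketch of the kind of pre-check that could be run against a copy of the data before attempting to put the old constraints back. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real platform database, and the tables and columns (customers, orders, email, reference) are hypothetical. Real queries would target the physical tables of your Entities and would need adapting to the SQL dialect of your DB (GLOB, for instance, is SQLite-specific). It covers cases i., iii., iv. and v.; a primary key check (ii.) would follow the same pattern as the unique index one.

```python
import sqlite3

def rollback_blockers(conn):
    """Count rows that would block re-applying the old constraints.

    All table and column names are hypothetical placeholders.
    """
    checks = {
        # i. foreign key: orders pointing at customers that do not exist
        "orphan_foreign_keys": """
            SELECT COUNT(*) FROM orders o
            LEFT JOIN customers c ON c.id = o.customer_id
            WHERE o.customer_id IS NOT NULL AND c.id IS NULL""",
        # iii. unique index: email values that appear more than once
        "duplicate_unique_values": """
            SELECT COUNT(*) FROM (
                SELECT email FROM customers
                GROUP BY email HAVING COUNT(*) > 1)""",
        # iv. mandatory attribute: orders without a customer
        "null_mandatory_values":
            "SELECT COUNT(*) FROM orders WHERE customer_id IS NULL",
        # v. data type: references that no longer convert back to an integer
        "non_convertible_values": """
            SELECT COUNT(*) FROM orders
            WHERE reference = '' OR reference GLOB '*[^0-9]*'""",
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}

# Small in-memory data set simulating data created by the new version.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, email TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, reference TEXT);
    INSERT INTO customers VALUES (1, 'a@example.com'), (2, 'a@example.com');
    INSERT INTO orders VALUES (10, 1, '1001'), (11, 99, 'A-17'), (12, NULL, '1002');
""")
blockers = rollback_blockers(conn)
for name, count in blockers.items():
    print(f"{name}: {count}")
```

Any non-zero count points at data that would have to be fixed by script (or accepted as lost) before the "backward" publish can succeed.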


So rollback is definitely a scenario requiring careful preparation and, depending on the risk involved, it might not be a viable option at all. If data loss is affordable, do consider the simpler procedure of restoring a DB snapshot. Still, to completely validate this alternative you will also need to know how all the OutSystems applications in the server use DB schemas and assess the impact for each application: namely data loss, but also how to address any interactions with external systems that happened after the recovery point (and that would still exist on those external systems). All factors considered, the lowest-risk approach is usually to publish a hotfix forward when the new version turns out to need one.


It is also interesting to note that the need for a rollback scenario is rare with OutSystems, while it is a more commonly discussed topic in “traditional” development. This is surely related to the tremendous added value of OutSystems features such as the TrueChange code validation engine, automated module dependencies validation and fully automated deployment, not to mention a very large baseline of virtually bug-free generated code handling most of the more technical aspects of the solution.


Fallacy #2: "Rollback can always be made easy and low-risk while not losing any data"


Regardless of using OutSystems Platform or traditional development, the risk of rolling back to the previous version of an application depends on three main aspects:

A) amount and nature of database changes (as described in the previous section)

B) operational use (and consequent data changes) since deployment of the new version

C) business and technical complexity of the application


A substantial part of preparing a rollback plan is analyzing the changed data model and application logic and how those changes will affect production data: if the differences are substantial, the more time passes between publishing the new version and rolling it back, the higher the risk and cost of the rollback operation.


Usually the best option is not to roll back but to move forward, directly addressing the cause of the problem with a hotfix of the application.


Preparation during development phase using the OutSystems Platform


1) Coding best practice: never use 'SELECT *' or 'SELECT <alias>.*' in your advanced queries (more about this here)

2) Coding best practice: never use auto-number integers for Static Entity IDs - replace by easily recognizable unique text codes/labels, having the same value throughout all environments

3) Of course, any bootstrap (for example, one executed "On Publish") requires preparing its own rollback (creating the corresponding "undo" DB script)

4) Value changes of existing configurations (for example Site Properties) should also be undone in a rollback scenario

5) When closing a cycle of data model changes, you can extract a report of the differences introduced relative to the version currently in Production by downloading all modules of both versions and using the IDE command line with a bit of script automation: ServiceStudio.exe -d <module v1> <module v2> <target report file>, run for every module of the application. This report can be used to identify potential risks from the cases described above (i. to v.) for both "forward" (v1 >> v2) and "backward" (v2 >> v1) publishing scenarios

6) Since no data is deleted by OutSystems Platform, values added to Static Entities or other lookup tables might require manual removal from the DB through a script

7) Start preparing your rollback script as soon as first scripts or steps are identified, including the identification of necessary regression test cases to perform over the affected functionality
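
The "bit of script automation" mentioned in step 5) could look something like the sketch below. It only builds the command lines, one per module, using the ServiceStudio.exe -d form quoted above. Everything else is an assumption made for illustration: that the modules of each version were downloaded into two folders as .oml files with matching names, the "_diff.txt" report naming, and the function name itself.

```python
from pathlib import Path

def build_diff_commands(v1_dir, v2_dir, report_dir, service_studio="ServiceStudio.exe"):
    """Build one 'ServiceStudio.exe -d <v1> <v2> <report>' command per module.

    Returns (commands, missing), where 'missing' lists modules present in
    v1 but absent from v2, which need to be reviewed by hand.
    """
    v1_dir, v2_dir, report_dir = Path(v1_dir), Path(v2_dir), Path(report_dir)
    commands, missing = [], []
    for v1_module in sorted(v1_dir.glob("*.oml")):
        v2_module = v2_dir / v1_module.name
        if not v2_module.exists():
            missing.append(v1_module.name)
            continue
        # Hypothetical naming convention for the report files.
        report = report_dir / f"{v1_module.stem}_diff.txt"
        commands.append(
            [service_studio, "-d", str(v1_module), str(v2_module), str(report)]
        )
    return commands, missing
```

Each returned command is a plain argument list, so it can be printed for review first and then executed on a machine with the IDE installed, for example with subprocess.run(command, check=True).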


In situations where rollback shows a potentially high risk (after assessing the three main aspects mentioned above: A, B and C), assess the DB restore / snapshot option previously mentioned (which has its own risks and a potential impact on the other applications of the server as well). In the end, the best option might be not to roll back at all and to address any issue with a specific hotfix of the application (for example by restoring related parts from the previous version through the "Merge" functionality). Steps 3) to 7) are not required when following the DB snapshot or hotfix strategies.


Preparation during release acceptance phase using the OutSystems Platform


1) When no further changes of the data model are expected then, using an environment resembling Production as much as possible (regarding DB schema, same application version being replaced and, as best as possible, similar data):

1.1) Perform a DB backup

1.2) Publish the target new version of the application, then execute tests to create data related to the data model changes (focusing on the "hot" topics listed above in i. to v.), followed by a trial rollback (executing the rollback script built during the development phase). If preparation during the development phase was successful, then the "backward" publish should be as well. Otherwise, address the issues found, restore the DB backup and redo this step

1.3) If any BPT processes were changed with the new version, check in Service Center for activities that became pending because they no longer have a consistent context: some activities might have to be suspended, others might require a case-by-case fix by DB script or other means (this might be a more difficult task in scenarios with a high volume of activities and processes)

1.4) Perform regression tests using the "rolled-back" version

2) Reassess risk of rollback versus alternative use of DB restore / snapshot mechanisms versus following a hotfix "forward" publish strategy. None of the above steps are required when following DB snapshot or hotfix strategies.


For the Production deployment script


1) In case BPT processes were changed, avoid deploying those changes (either "forward" or "backward") before ensuring that related activities will not be running

2) Schedule publish aiming for a period of low application activity (for both user and batch operations)

3) Define a minimal viable set of "keep new version vs rollback" tests to run after the deploy, so that a decision can be reached in the shortest possible time. After some time it is no longer reasonable to assume all variables can still be controlled and, as such, no form of rollback will be an option any longer: a hotfix will be the only path forward
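
The decision logic of step 3) can be sketched as follows. This is only an illustration, not a prescribed procedure: the function name, the idea of expressing each test as a Python callable returning True or False, and the fixed decision window are all assumptions made for the example. A failed check inside the window leads to "rollback"; once the window has passed without a decision, the only remaining outcome is "fix-forward" (hotfix).

```python
import time

def decide(checks, decision_window_seconds, clock=time.monotonic):
    """Run post-deploy checks in order and return (decision, check_name).

    decision is 'keep', 'rollback' or 'fix-forward'. check_name is the
    check that triggered the decision, or None when everything passed.
    """
    deadline = clock() + decision_window_seconds
    for name, check in checks:
        if clock() > deadline:
            # Window closed: rollback is no longer considered an option.
            return "fix-forward", name
        if not check():
            return "rollback", name
    return "keep", None

# Hypothetical smoke tests; real ones would exercise the deployed application.
smoke_tests = [
    ("login page responds", lambda: True),
    ("order can be created", lambda: True),
]
print(decide(smoke_tests, decision_window_seconds=1800))
```

Keeping the list short and ordered by business criticality helps the decision land well inside the window.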


Last thoughts


Finally, let me recall how I started: it rarely needs to be done, but we should always prepare a rollback plan, especially for mission-critical apps. Still, depending on the nature of the app and how complex it is, a DB snapshot might be an alternative to consider against a version rollback: it avoids creating and testing the specific rollback procedure described above (so that effort can go into bringing more value to the business!). However, use of a DB snapshot might have an impact on all applications in the server: namely possible data loss and possible interactions with external systems happening after the recovery point (which would still exist on those external systems). All factors considered (and regardless of OutSystems vs "traditional" development), the lowest-risk approach is usually to publish a hotfix forward when the new version turns out to need one.


Feel free to provide your feedback or suggestions on this topic!


Hi Nuno,

Great post.
I have a remark on this: "2) Coding best practice: never use auto-number integers for Static Entity IDs - replace by easily recognizable unique text codes/labels, having the same value throughout all environments".
Isn't what you propose not to do the default behavior when creating static entities in OutSystems? I support your remark, but it would be great if the platform created static entities following this guideline directly.