Personal Environment 503

  

So I've created an application that runs a BPT process with a loop that executes many thousands of times, doing some rather intense calculations as it does. As a result, I've run into cause #2 of this article:

You have created an application that was using too much CPU for a long time

... and after running for what seems like a random duration, my entire application returns 503 errors for 180 seconds. I've taken to executing a commit after every loop iteration, since the process activity automatically restarts once the 180 seconds elapse.

Presumably the 503 is being triggered based on the available resources of the server. Seeing as I can break and resume my loop at any point, it seems like the obvious solution is to break when I'm approaching my CPU time limit. I have been unable to determine what this is, however.

Does anyone have any insight on how I can foresee the coming CPU time limit and tell my application to chill for a minute (I'm thinking a wait action)?

Hi Jason,

As this article describes, BPT automatic processes last for 5 minutes max. If you want to process for a longer time, it is strongly suggested you use a Timer instead of trying to do it within a BPT automatic activity. The general rule of thumb: short processing, BPT; long processing, Timers. Note you can kick off a Timer from your BPT process, then Wait in the BPT process for the Timer to finish.

I'm not sure what triggers the 503 though, imho that shouldn't happen, and you might contact OutSystems Support for that.

That article is new information to me. However, in a previous attempt to combat the 503 problem, I implemented a timeout site property that would exit and restart the activity. I set it to values between 10 and 200 seconds, all of which still resulted in the 503 issue. In my most recent process, none of my activities ran for longer than 60 seconds or so (see attached screenshot).

This behavior matches my theory about my process using too much CPU time and triggering the 503 as a result of that, as according to the process logs, there is little downtime between the end of an activity and the restart of the next one (less than a second).

Hi Jason,

The only thing I can think of is that you run out of memory or disk space (or both), but I wouldn't expect that unless you do some really weird things... I would contact OutSystems Support regardless of finding a solution, because the 503s should really not be happening.

Also note that you really shouldn't restart an Activity just because you run out of time - using Timers is the advised course of action in such a case.

Solution

Hi,


I'm not exactly sure what limits are set on the personal environments, but from the way the article explains the issue, I think it's fairly safe to assume it's using the IIS CPU limits with the action set to "Kill".

Not knowing the actual limit and period makes it hard to quantify how long you need to wait between actions, but if you read a bit about the mechanism, it's obvious that just ending one activity and starting another won't help, since it calculates the average CPU use over a period of time.
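To illustrate why restarting the activity does not lower the average, here is a minimal Python sketch. The 300-second window, the per-second samples, and the function name `windowed_average` are all assumptions made for illustration; the real limit and period on personal environments are not known.

```python
def windowed_average(samples, window):
    """Average of the most recent `window` per-second CPU samples (in percent)."""
    recent = samples[-window:]
    return sum(recent) / len(recent)


# Hypothetical load: activities of ~59 s at 100% CPU, restarted after a 1 s gap.
back_to_back = ([100] * 59 + [0]) * 5

# Hypothetical load: 30 s of work at 100% CPU followed by a 30 s pause.
paced = ([100] * 30 + [0] * 30) * 5

print(windowed_average(back_to_back, 300))  # stays close to 100%
print(windowed_average(paced, 300))         # 50.0
```

Splitting the work into shorter activities changes almost nothing; only real idle time between them brings the average down.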


So let's assume the limit is 50% CPU (a round number just to make the math easier) over 5 min (the default IIS interval). That means that if your actions use 100% CPU while running, you need to wait 2.5 min after every 2.5 min of work. It gets a lot better if the % is a bit higher.
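The arithmetic above can be written as a small Python helper. This is only a sketch under the same guesses: the idle period is assumed to use roughly 0% CPU, the average is taken over one burst of work plus the pause that follows it, and `required_idle_seconds` is a name made up here.

```python
def required_idle_seconds(busy_seconds, busy_cpu_pct, limit_pct):
    """Idle time needed after a burst so the average CPU stays at or below limit_pct.

    Derived from: busy_cpu_pct * busy / (busy + idle) <= limit_pct,
    assuming the idle period uses about 0% CPU.
    """
    if not 0 < limit_pct <= 100:
        raise ValueError("limit_pct must be in (0, 100]")
    if busy_cpu_pct <= limit_pct:
        return 0.0  # already under the limit, no pause needed
    return busy_seconds * (busy_cpu_pct / limit_pct - 1.0)


# 2.5 min at 100% CPU with an assumed 50% limit: wait another 2.5 min.
print(required_idle_seconds(150, 100, 50))  # 150.0
# The same burst with an assumed 80% limit needs a much shorter pause.
print(required_idle_seconds(150, 100, 80))  # 37.5
```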


Looking at the information that you posted, it's probably safe to say that the limit interval is set to 1 min.

I think your best option is to use the Start Date property of the activities to force them to wait before starting.


I was also going to mention that cycles inside processes are not a very good idea, since each process has a limit on the number of activities... but looking at your Decision names, I think you already noticed that. Do you really need both Decisions? A Decision can have multiple "Outcomes", so merging them would probably reduce the total number of activities by 1/3.


Solution

João Rosado wrote:

Hi,


I'm not exactly sure what limits are set on the personal environments, but from the way the article explains the issue, I think it's fairly safe to assume it's using the IIS CPU limits with the action set to "Kill".

Not knowing the actual limit and period makes it hard to quantify how long you need to wait between actions, but if you read a bit about the mechanism, it's obvious that just ending one activity and starting another won't help, since it calculates the average CPU use over a period of time.

So let's assume the limit is 50% CPU (a round number just to make the math easier) over 5 min (the default IIS interval). That means that if your actions use 100% CPU while running, you need to wait 2.5 min after every 2.5 min of work. It gets a lot better if the % is a bit higher.

This seems like the most likely scenario (an explanation OutSystems Support could not provide in a ticket I submitted, I might add). Thanks. Marking as solution.

Looking at the information that you posted, it's probably safe to say that the limit interval is set to 1 min.

How'd you figure that? If you're using the timestamps I posted in that screenshot, those activity lengths are already being artificially limited by me programmatically (if it's been 90 seconds since start, end the activity).

Based on the info I can find about personal environment specs, I'm willing to bet that the interval is not a set time limit, but rather is based on what everyone else on the shared server is doing. Maybe I can use an extension to peek at CPU usage...

I was also going to mention that cycles inside processes are not a very good idea, since each process has a limit on the number of activities... but looking at your Decision names, I think you already noticed that. Do you really need both Decisions? A Decision can have multiple "Outcomes", so merging them would probably reduce the total number of activities by 1/3.

The first decision determines if the target count has been reached and the process can be ended. The second decision counts the number of activities this process has executed to see if it's reaching the OS-imposed limit. If so, it ends the process but starts a new process of the same type.
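For readers following along, the two checks described here could be folded into a single decision with three outcomes, as suggested earlier in the thread. The Python sketch below only models that logic; `Outcome`, `next_step`, `activity_limit`, and `safety_margin` are hypothetical names, and the actual per-process activity limit is left as a parameter because it is not stated in this thread.

```python
from enum import Enum


class Outcome(Enum):
    FINISHED = "end the process"
    HAND_OFF = "end this process and launch a new one of the same type"
    CONTINUE = "run the next loop iteration"


def next_step(processed, target, activities_run, activity_limit, safety_margin=5):
    """Single decision replacing the two separate checks."""
    if processed >= target:
        return Outcome.FINISHED
    # Hand off before the per-process activity limit is actually reached.
    if activities_run >= activity_limit - safety_margin:
        return Outcome.HAND_OFF
    return Outcome.CONTINUE


print(next_step(processed=10, target=100, activities_run=46, activity_limit=50))
# Outcome.HAND_OFF
```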

I was just taking some guesses; I really have no idea what is actually configured.

The thing is:

  • The default is 5 min, and I see no good reason for increasing it. Also, I once had an infinite cycle in my personal environment and it got killed in less than that.
  • Having it lower than 1 min makes it really hard to avoid processes being killed. It would mean that developers would get punished really quickly for a short work spike.

So it makes sense for it to be somewhere in between.


As for the hypothesis of it being based on the total amount of work on the machine, I doubt it. That would require something other than the IIS default mechanisms (reading the documentation, it's always "per pool").