Tip: Long Polling/Continuous-Process Timer pattern (done right!)
Discussion

Hey all!


Since this is something that comes up every now and then, I am creating this public post for describing the Long Polling/Continuous-Process Timer pattern, so that I can link to this thread instead of explaining it all over again. :-)


What's a Long Polling/Continuous-Process pattern?

Imagine that we need to create a background process (queue-service-like) that runs constantly, 24/7, no interruptions. One way of achieving near-realtime (notice the "near") data processing is by creating a Timer Action that does the processing, and then proceeds to call WakeTimer on itself right before ending the logic flow.

The problem is: WakeTimer essentially just sets the NextRun parameter of that Timer to the current date/time, which is then left to be picked up by the Scheduler Service the next time it polls pending Timers, which normally takes up to 20 seconds.

In some [very] specific cases, we might want the queue to be as "realtime" as computationally possible (instead of just "near-realtime"). Examples of such cases are:

  • When there's a requirement that the time it takes to process new data is critical and must be near-zero (e.g. external auditing/logging systems that should be notified when a record is updated in a table from an external database);
  • Or if the user could be left hanging and only let go after the Timer kicks in and does the necessary processing (imagine keeping the user waiting on a loading spinner for 20+ seconds until the Scheduler Service wakes the Timer; it goes without saying that this would NOT be a great user experience).

This is where the Long Polling/Continuous-Process pattern comes into play: it's a way of building a Timer that is constantly running for a few minutes, up until before reaching its timeout limit (at which time it gracefully shuts down and wakes up again for a new round of continuous processing).

But be advised: you should not implement this pattern just because you can; improper usage can seriously degrade server performance, as well as consume all available Timer slots very quickly, so be mindful of that.


When to use it:

  • When creating a Launch On trigger in Light BPT is not an option (e.g. it needs to be triggered from a table of an external database instead of an OutSystems Entity);
  • And when timing is absolutely critical, which would otherwise negatively impact UX or time-sensitive integrations (like external logging systems).


When NOT to use it:

  • When it's not really necessary, i.e. overkill (in other words, when it's fine for the record to wait the usual 20-second delay of the Scheduler Service spin-up before being processed);
  • Or when there are already too many concurrent Timers using this pattern (danger of occupying all available Timer slots, preventing other Timers from running, which would obviously be very bad!). Ideally, there should be at most 1 Long Polling/Continuous-Process Timer per front-end server.
  • And lastly, if the timeout needs to be long (> 20 min) because the processing takes too long to finish, do NOT use this pattern. Long Polling/Continuous-Process Timers need to gracefully stop and wake up again every few minutes as means to give the Scheduler Service some breathing room, so that it's able to run other higher priority Timers that were queued as well.


Prerequisites

Now that you know when to use and, most importantly, when NOT to use this pattern, let's see how the actual implementation goes.

You will need two basic things: a Server Action that detects whether a Timer is reaching its timeout (see Avoid long-running timers and batch jobs), and the Timer Action itself using the Long Polling/Continuous-Process pattern.


Timeout detection Action

First, create a Server Action of Function type called CheckTimeout() with the following parameters:

Scope          | Type    | Name
Input          | Text    | TimerName
Input          | Integer | MarginSeconds
Output         | Boolean | IsReachingTimeout
Local Variable | Integer | TimeoutSeconds
Local Variable | Integer | SecondsElapsed

In it, add an Aggregate called GetMetaCyclicJob using the entities Meta_Cyclic_Job and Cyclic_Job_Shared from (System), as follows:

TIP: You can use the ActionInfo Forge component to automagically grab the TimerName value instead of manually providing it as an input parameter (optional, but really handy).

Following the Aggregate, add an Assign node setting the following values:

Assignment        | Expression
TimeoutSeconds    | If(GetMetaCyclicJob.List.Current.Meta_Cyclic_Job.Effective_Timeout = 0, GetMetaCyclicJob.List.Current.Meta_Cyclic_Job.Timeout, GetMetaCyclicJob.List.Current.Meta_Cyclic_Job.Effective_Timeout) * 60
SecondsElapsed    | DiffSeconds(GetMetaCyclicJob.List.Current.Cyclic_Job_Shared.Is_Running_Since, CurrDateTime())
IsReachingTimeout | TimeoutSeconds - SecondsElapsed < MarginSeconds
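For clarity, here is the same computation sketched in Python. This is a hypothetical stand-in, not the real implementation (which lives in the Server Action); the parameter names mirror the Aggregate attributes above:

```python
from datetime import datetime

def check_timeout(timeout_min: int, effective_timeout_min: int,
                  is_running_since: datetime, margin_seconds: int,
                  now: datetime) -> bool:
    """Mirrors the CheckTimeout() assignments: True when the Timer is
    within margin_seconds of its configured timeout."""
    # An Effective_Timeout of 0 presumably means "not overridden in
    # Service Center", so fall back to the default Timeout (in minutes).
    timeout_seconds = (timeout_min if effective_timeout_min == 0
                       else effective_timeout_min) * 60
    seconds_elapsed = int((now - is_running_since).total_seconds())
    return timeout_seconds - seconds_elapsed < margin_seconds
```

For example, with a 20-minute timeout and a 60-second margin, the function starts returning True once the Timer has been running for more than 19 minutes.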

Here's the final look of the code:

And that's it! This function now dynamically tells you whether your running Timer is reaching its timeout, taking into account the provided margin (in seconds) at which the Timer should stop before timing out, and the effective timeout configuration set in Service Center.


Applying the pattern

The pattern follows some criteria:

  • It needs a "Kill switch": basically a Site Property that is constantly checked against during the process loop, giving the possibility to interrupt the process at any time by simply flipping the Site Property value to False through Service Center.
  • Transaction committing: Always call CommitTransaction after processing each data batch. This is a must, otherwise the DBMS log file might become full of uncommitted changes, locking problems might surface and plain chaos might be created.
  • Error handling: If any uncaught error is thrown within the context of the Timer, call WakeTimer and gracefully end the logic flow so that it boots up again the next time the Scheduler Service runs (~20 secs to run again).
  • Near-timeout exit: After processing the current data batch and committing, call CheckTimeout() providing the TimerName and MarginSeconds parameters (60 seconds is usually a reasonable margin to stop before timing out, but it depends on how long your data batch takes to process on average). If the timeout check is positive, proceed to call WakeTimer and end the flow.

Here's a depiction of the logic flow using this pattern:

TIP: Try processing your data in small batches. For example, instead of updating all records at once, try just grabbing the top 100 records each time and keep moving forward. The processing loop runs so frequently that it works best when it's given smaller chunks of data to process each time.

TIP²: Set the schedule in Service Center to run every 5 minutes. This ensures that the Timer will boot up again in 5 minutes in case it fails and stops for any weird reason.


And that's all there is to it. This can be a really powerful pattern; just be aware of the consequences of misusing it. With great power comes great responsibility. Let me know if this helps anyone; I'd be curious to know. Any feedback would be appreciated.


Thanks!

Thanks. Very good and complete post.

Best Regards

Great post,

Good explanation on when (not) to use.

There's one part I did not understand.

You say the problem is that it could take 20 seconds for the scheduler to react to a WakeTimer, but I'm not seeing how you avoid that, since there is still a WakeTimer.

Dorine 

Good question.

Because we need the Timer to stop every few minutes of continuous processing (in order to give the Scheduler Service some breathing room so that other timers also get a chance to run if they ever end up queued), we are not actually abolishing the 20-sec delay entirely.

But it happens WAY less frequently than in the simpler scenario of just processing data only once and then calling WakeTimer, without looping it.

I've used this on a number of occasions for different customers, and in all cases it's proven to be irrelevant that every 15 mins the Timer stops for (at most) 20 seconds before booting up again.

Ah ok,

I thought the purpose was to reduce those 20 seconds to 0, so when I started reading, I thought your trick would be to have two timers running the same logic.

  • At beginning, DO_DATA_PROCESSING is stalled until twin timer has stopped
  • 30 seconds before end, Wake timer executed for twin timer
  • but current timer keeps on running 30 more seconds

So there's about 10 - 20 seconds overlap, ensuring almost no discontinuing of processing (provided of course there's not too many other timers running on the same server)

Dorine


