Timeout Expired on OutSystems.HubEdition.RuntimePlatform.GenericExtendedActions

Timeout Expired on OutSystems.HubEdition.RuntimePlatform.GenericExtendedActions

  

Hi guys,


so, we are getting tons of Timeout error logs in our platform, and we have no idea why. We already have created a support case with OutSystems, but while the support can't figure out what might be missing in our side (or if is some platform bug), I remembered that maybe one of you guys have already passed through this and can probably have a solution for this.

Let me explain a bit of our factory:

  1. MultiTenant - with about 4.000 Tenants
  2. .NET / SQL Server
  3. Our code, .NET and SQL, is in the cloud (AWS) 
  4. We've followed all OutSystems recommendations regarding configurations


Anyway, if you can give us ideas to check on it, no matter how simple the ideas are, they are very welcome.

The error that we are having is this one:
Timeout expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.  This failure occured while attempting to connect to the Principle server.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at System.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
   at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async, Int32 timeout, Task& task, Boolean asyncWrite, SqlDataReader ds, Boolean describeParameterEncryptionRequest)
   at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, TaskCompletionSource`1 completion, Int32 timeout, Task& task, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(TaskCompletionSource`1 completion, String methodName, Boolean sendToPipe, Int32 timeout, Boolean asyncWrite)
   at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
   at OutSystems.HubEdition.Extensibility.Data.ExecutionService.BaseExecutionService.ExecuteNonQuery(IDbCommand cmd)
   at OutSystems.Internal.Db.Command.ExecuteNonQuery(String description, Boolean isApplication, Boolean skipLog, Boolean applyTransformationsToParameters)
   at OutSystems.HubEdition.RuntimePlatform.GenericExtendedActions.#nNb(HeContext heContext, Int32 inParamUserId, String inParamPassword, Boolean inParamPersistent, Boolean checkPassword)


Any ideas?


Thanks,

Nelson Freitas

Hi Nelson,


From that stack, your error is trying to update the "LastLogin" attribute of the user. That operation has a low timeout of 1 second and failing does not affect the behavior of your application (other than taking 1second longer than it should and not updating the LastLogin attribute).

The most frequent cause of those errors are when you changed something in the User entity before doing calling Login and didn't commit the transaction in the meanwhile.


Regards,
João Rosado

Hi João,


Thank you very much for your reply. At least, its something that we can try to track down. So, we are going to check on all the User Updates and see if its possible to have a commit without messing with the rest of the code.


But in the meantime, do you know if there is anything that we can do for now to increase that timeout ?


Best regards,

Nelson Freitas

Hey João,

So, I've search our code for an update to see if it matches with your description. I've found one place where we had an UpdateUser without a commit after. Yet, this code does not run at the same time as we have the login. It runs on demand in other completely different places.

The attachment that I've added here (W_Theme_UI) contains our Login action (Wodify_Login) which basically does a TenantSwitch if needed and a Login. 

I was wondering, do you think the TenantSwitch do something to the User entity ? Because when we run a TenantSwitch, the OnSessionStart starts running too...I don't know, trying to send ideas to the table.

Regarding the last question that I had in the last post, do you know if there is something we can do?

Thanks,


Nelson Freitas

Hi Nelson,


I'll look into this issue and followup on your support case.

But answering on some of your questions:

- Currently there is no way to increase the timeout value on that query

- The TenantSwitch/OnSessionStart by themselves do not do any changes to the database at all, but since your OnSessionStart does both a TenantSwitch followed by a Login it will probably cause your OnSessionStart to run twice (recursively) and login double the times that it actually needs causing extra contention on your database. Change your connector after the TenantSwitch to the "Different User" If instead of going directly to the "Login" to see if it helps.


Regards,
João Rosado

João Rosado wrote:

Hi Nelson,


I'll look into this issue and followup on your support case.

But answering on some of your questions:

- Currently there is no way to increase the timeout value on that query

- The TenantSwitch/OnSessionStart by themselves do not do any changes to the database at all, but since your OnSessionStart does both a TenantSwitch followed by a Login it will probably cause your OnSessionStart to run twice (recursively) and login double the times that it actually needs causing extra contention on your database. Change your connector after the TenantSwitch to the "Different User" If instead of going directly to the "Login" to see if it helps.


Regards,
João Rosado


Hi João,


once again, thank you for your answer. 

When you say "Change your connector after the TenantSwitch to the "Different User" If instead of going directly to the "Login""

I believe that's what I'm doing in the eSpace. After TenantSwitch I have an If "Session.UserId <> UserId that I want to be logged in", that if true it will Login:


Or am I not understanding well ?

Thank you,

Nelson Freitas

The one inside the Use_AuthenticationToken (that is the action that runs inside the OnSessionStart) doesn't look like that.


Regards,
João Rosado


João Rosado wrote:

The one inside the Use_AuthenticationToken (that is the action that runs inside the OnSessionStart) doesn't look like that.


Regards,
João Rosado


João,

Holy guacamole!

Thanks, I did not see that. Going to change immediately to be just like the previous screenshot ( after TenantSwitch only login if Session.UserId <> UserId that I want to be logged in).

I will only see the results in January 14th, because we only have that issue in Production.

Also, if you recommend doing any changes in our TenantSwitch/Login/OnSessionStart feature, please be my guest. We are trying to be as performant as possible.


Once again, thank you.

Nelson Freitas

I don't have any other suggestion at this point.


But please note that my suggestion is not a complete solution for your timeout problem, is just a improvement (by reducing in half the number of logins).

We will look into this in order to reduce the amount of errors on systems with a heavy load (as a 1 second timeout can be short when the database is under pressure + network delays).


Regards,
João Rosado

Hi João,


so, support has asked me to check if we use the column Last_Login very often, because that 1 second timeout is related with updating that column on logging in. I went through all our eSpaces that uses the User entity to look for queries that are using the Last_Login column. I've found 3 places. The first 2 are there for ages (at least 2 years), but the third one was added not that long and it praticaly matches the time we started having these issues.

So, I've removed that usage, and need to wait for a deployment to production to see if we got rid of all those timeouts.

I'll let you know as soon as I have any info on it, se we can close this and mark as solution :)


Best Regards,

Nelson Freitas