Big difference between query 'took' time and the node's execution time

Hello,

I'm seeing strange behavior with an Advanced SQL node.
My action flow is below.

The following is a real sample from the General Log in Service Center.

2019-11-19 17:29:41  LogMessage2                                                Debug
2019-11-19 17:29:09  Query FetchKoujis.GetKoujisByEigyousyos.List took 2922 ms  SLOWSQL
2019-11-19 17:29:09  LogMessage                                                 Debug

Why does it take 32 s (LONG!!) between LogMessage and LogMessage2 even though 'Query took' is displayed as only 2.9 s (2922 ms)?
How do I speed up the execution of this node?

The data returned by the SQL node has only 140 rows and 10 columns.
The columns consist of string and date types; the longest string columns are under 200 characters.
The tables used in the SQL are in an external Oracle 11g database.

Other environment info:
* Platform Server 11.0.607.0
* Windows Server 2016

Thanks.

Hi Takasi,

The LogMessage action is asynchronous - there's no guarantee of when the data is written to the Log Entity. It's not your code that's taking so long, it's just the message queue being processed.

Instead of relying on the time of write to the General tab, you could write a CurrDateTime() value into each log and use that for comparison.

Hello Afonso,

Thank you for your comment.

I know the General Log doesn't record the exact time.
I also tried recording the exact time with a function and with another SQL snippet, as you suggested.
But the action's execution time really is more than 30 s - also by my wristwatch.

Why does it take so long when the query took only 2.9 s?
How do I improve the action's performance?

Performance can be tricky to diagnose, but I see three main issues here:

 - you say you've measured more than 30s, but does it take 30s to execute, or for you as a user to see a response in the browser? You should separate execution time from completion time. I'd measure this with a local variable initialized with CurrDateTime(), logged right after your Start node; then, right before the End node, I'd add another log that subtracts this local variable from a new CurrDateTime(), so you know exactly how long the code takes to execute. This excludes network issues and isolates performance to just the code within the action;

 - you have a query that takes almost 3 seconds to return 100 records. This is worrying - is it complex? Do you perform a lot of calculations? I see that it's external, so I'd ask if 3 seconds is "normal". Can you run the query directly in the database to compare?

 - if you have 30 seconds of total execution time and you know the query takes 3, that's 27 seconds unaccounted for. Not knowing what happens in the initial If node (are you using functions that could be obfuscating additional logic?), or what happens in the String Join action, the remaining time could be being eaten up by other logic.

I think the main priority is to exclude network issues (by using logs at the start and at the end of the Action), and should you confirm that it's not related to the network, trying to measure how long it takes to execute each separate step of your logic.
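In plain Python, the measurement pattern above looks roughly like the sketch below. This is an illustration only - the real thing would be LogMessage nodes and CurrDateTime() in Service Studio, and fetch_koujis here is a hypothetical stand-in for the Advanced SQL node:

```python
from datetime import datetime

def fetch_koujis():
    # Hypothetical stand-in for the FetchKoujis Advanced SQL node.
    return ["row1", "row2"]

def get_koujis_action():
    start = datetime.now()                  # local variable = CurrDateTime(), right after Start
    print(f"LogMessage0 - {start}")         # log the start time itself
    rows = fetch_koujis()                   # the Advanced SQL node
    result = ";".join(rows)                 # the String Join step
    elapsed = (datetime.now() - start).total_seconds()
    print(f"LogMessage2 - elapsed {elapsed:.1f}s")  # log the delta right before End
    return result
```

Anything beyond the logged delta (browser rendering, network latency, queued log writes) is then completion time, not execution time.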

Hello Afonso,

Thank you for spending your time on my problem.

You should separate execution time from completion time

I changed my action as below. All LogMessage nodes use CurrDateTime().

And obtained the following log.

2019-11-20 09:39:46 LogMessage3 - 2019-11-20 09:39:46, (43s)                  Debug
2019-11-20 09:39:46 LogMessage2 - 2019-11-20 09:39:46   Debug
2019-11-20 09:39:03 Query FetchKoujis.GetKoujisByEigyousyos.List took 5250 ms SLOWSQL
2019-11-20 09:39:03 LogMessage1 - 2019-11-20 09:39:03                         Debug
2019-11-20 09:39:03 LogMessage0 - 2019-11-20 09:39:03                         Debug

I see that it's external, so I'd ask if 3 seconds is "normal". Can you run the query directly in the database to compare?

Yes, it is normal. The query is a little complex. A few days ago, executing it directly with SQL*Plus in a console took about 3 to 10 seconds. (Unfortunately, I cannot currently run it again.)

Not knowing what happens in the initial If node (are you using functions that could be obfuscating additional logic?), or what happens in the String Join action, the remaining time could be being eaten up by other logic.

The log I pasted above shows that the SQL node alone takes almost all of the time.

Is there another action I should try?

Hi Takasi,

You can use LifeTime Analytics to look in more detail at that specific action and query. This will give you more insight into where the delay is and whether it might be network related (a slow connection to the external DB server?).

Regards,
Lennart

It's strange - there's something else taking up the time between LogMessage1 and LogMessage2, but all that's there is the query, and that only accounts for a fraction of it (5 seconds out of 40).

Can you Test the query? How long does it take to return results?

Lennart's suggestion is solid as well - do you have access to LifeTime? The Analytics tab could provide insight into what's going on.

Hello Lennart,  
Hello Afonso,

Thank you for comments.

Unfortunately, the environment doesn't have LifeTime currently.
Yes, it's strange.

My colleague came up with an idea that could explain this phenomenon.
The 'Query took' time in the log is probably the time until the first data can be returned, not the time until the node finishes obtaining all the data.

For example, with the following query, the external Oracle database will return the first row in a short time, while a long time is needed to finish returning all the data. The 'Query took' time in the log may indicate only that first interval.

select    -+
    ...    |
from       | <simple query (fast)>
    ...   -+

union all

select    -+
    ...    |
from       | <complex query (slow)>
    ...   -+
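This behavior can be sketched with any cursor-based database driver. Below is a minimal Python illustration (using sqlite3 rather than Oracle or the OutSystems data layer, with an invented table and query just for the demo): execute() returns as soon as the first row is available, while fetchall() pays for the full scan.

```python
import sqlite3
import time

# Illustration only: with cursor-based drivers, execute() can return as
# soon as the first row is ready; the expensive scan happens during fetch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(2000)])

cur = conn.cursor()

t0 = time.perf_counter()
# The first matching pair (0, 0) is found immediately, so execute()
# returns fast even though the full cross join is 4,000,000 combinations.
cur.execute("SELECT a.n, b.n FROM t a, t b WHERE (a.n * b.n) % 97 = 0")
t_execute = time.perf_counter() - t0

t0 = time.perf_counter()
rows = cur.fetchall()   # the real work: stepping through all combinations
t_fetch = time.perf_counter() - t0

print(f"execute: {t_execute*1000:.1f} ms, fetchall: {t_fetch*1000:.1f} ms, rows: {len(rows)}")
```

A 'query took' counter that stops when execute() returns would report only the first, small number - exactly what the hypothesis above predicts for a UNION ALL whose fast branch yields rows first.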

I've always been led to believe that it's the full time for the query execution, but I'm not too familiar with Oracle - it could be possible.

Do you suppose you could extract and isolate this logic into a new eSpace and try reproducing it there? Worst case, if you cannot draw any further conclusions, you'll at least have the problem contained/reproduced and can contact OutSystems Support for their suggestions.

Hello Afonso,

Sorry for my delayed reply.
Thank you for commenting.
I have asked OutSystems Support.
When I obtain any information from them, I will share it with you here.

OutSystems Support has been investigating this issue, and I haven't received their reply yet.
By the way, I am now able to reproduce the issue on a Personal Environment.
My colleague's hypothesis is probably correct.
The attached OML file reproduces the problem; the screenshots are from that OML.

Thank you for the update and the sample Takasi - I ran it in my personal environment and got very similar times in the log. Testing the SQL query also took around 8 seconds, matching those times very closely - looks like your colleague might be right.

Hopefully support can confirm this.

Tracking number RPD-4719 was created for this issue.
Some of support's comments were as follows.

When a query is executed by the platform, the start time is saved, and as soon as the query finishes its execution in the database, even if the full dataset has not yet been returned, the end time is also saved and the difference is logged as a SLOWSQL entry.

As an example, with your query, the execution time with the whole dataset returned is 7 seconds; however, the platform only obtains the results when they are needed. While the results are being fetched from the database, since the query was already executed, a pointer to the dataset is returned from the database, and consequently the SLOWSQL entry is generated with that time difference.

That said and specifically addressing your latest question, we would like to inform you that we fully understand your specific use case and we find your concerns valid. Therefore, we have opened a new defect, with an improvement suggestion, to be analyzed internally by our development team, which will be done according to backlog and severity.

As of now, we do not know when and if these improvements will be applied to our product, but you can track the ID RPD-4719 in our "Release Notes".

Regarding your current situation, the only way we found to troubleshoot any slow database nodes in your logic – either Aggregates or Advanced SQL – would be by adding the additional logging to your applications’ logic, as explained in our previous communication.