I have a query to show in a graph the number of users in each UserStatus. It's working however I'm not understanding if its better to use the query, in terms of performance, like this (with parenthisis):


SELECT {UserStatus}.[Label], Count({User}.Id)

FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]

Where ({UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1) 

 and  {User}.[RegistrationDate]  >= @StartDate   and {User}.[RegistrationDate] < @EndDate

group by {UserStatus}.[Label]


or like this (without the parenthisis):


SELECT {UserStatus}.[Label], Count({User}.Id)

FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]

Where {UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1 

 and  {User}.[RegistrationDate]  >= @StartDate   and {User}.[RegistrationDate] < @EndDate

group by {UserStatus}.[Label]


I checked some examples with () but I'm not understanding what is the difference between using parenthesis or not.

Solution

Hello OsCaR_,

The issue here is not a performance issue, it is a logic issue.  AND takes precedence over OR.  Your first example is:

(A or B) and C and D

Your second example would be run like:

A or (B and C and D)

This could certainly result in a difference in performance, but the bigger issue is the difference in the logical conditions.

EDIT: As a side note, as long as you are not changing the logical conditions, having a parenthesis or not will not affect the performance.

Solution

Additionally, why don't you use an Aggregate? If you are using an Aggregate, you don't need parentheses, as the Filters are automatically ANDed together. In general, use Aggregates whenever you can!


You also might want to specify your join statement, if not specified it will always be executed as an INNER JOIN. This means the query will only return records that are both found in the left and the right table.

On the other hand a LEFT (OUTER) JOIN, means that even though the right table yields no result, you will still get a result for each record in your left table.

Ofcourse a RIGHT (OUTER) JOIN is the same as a LEFT JOIN, but works the other way around, depending on the order of your ON statement a LEFT or RIGHT JOIN could potentially be the same as an INNER JOIN.

A FULL JOIN will return records that are present in LEFT and RIGHT, so it will basically return all records in both tables.

Lastly a CROSS JOIN, also very interesting, will JOIN all records of the LEFT table with each record of the RIGHT table (so basicaly, LEFT.RECORDCOUNT * RIGHT.RECORDCOUNT), unless you specify a statement, in which case it can join as an INNER JOIN.

In outsystems aggregates you have 2 options, Only with, and a with or without. 

Only with is the same as a INNER JOIN, where with or without is the same as a LEFT JOIN.

Hi Joey,

That's incorrect! "JOIN" is equivalent to "INNER JOIN". "LEFT JOIN" is short for "LEFT OUTER JOIN".

Kilian Hekhuis wrote:

Hi Joey,

That's incorrect! "JOIN" is equivalent to "INNER JOIN". "LEFT JOIN" is short for "LEFT OUTER JOIN".


Hey Kilian, I noticed my wrongdoings at the moment of writing :(
I deeply apoligize!
I have changed it and provided some additional information.
I also didn't know about a cross join before.

No problem, we all learn every day :)

Thanks, but this query is possible to do in the aggregate? Because this query uses the "in" operator. 


SELECT {UserStatus}.[Label], Count({User}.Id)

FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]

Where ({UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1)

 and  {User}.[RegistrationDate]  >= @StartDate   and {User}.[RegistrationDate] < @EndDate

group by {UserStatus}.[Label]

Unfortunately, the IN clause is not supported by aggregates at this time. SQL Query is the correct approach unless you have a fixed number of values in that list and can convert it to OR statements.

Craig St.Jean wrote:

Hello OsCaR_,

The issue here is not a performance issue, it is a logic issue.  AND takes precedence over OR.  Your first example is:

(A or B) and C and D

Your second example would be run like:

A or (B and C and D)

This could certainly result in a difference in performance, but the bigger issue is the difference in the logical conditions.

EDIT: As a side note, as long as you are not changing the logical conditions, having a parenthesis or not will not affect the performance.

Thanks, so basically how this logic operator works in the queries are for example, in the first query, it returns the UserStatus.Label and the count and if the UserStatus.Id is in the variable UserCheckedStaus the "or ReturnAll = 1" is ignored and so the code goes to the next and only returns records where the registration date is greater than the variable StartDate and then also the code goes to the next and returns only records where the registration date is lesser than the EndDate variable?


SELECT {UserStatus}.[Label], Count({User}.Id)

FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]

Where

 ({UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1) 

 and  {User}.[RegistrationDate]  >= @StartDate   

and {User}.[RegistrationDate] < @EndDate

group by {UserStatus}.[Label]


But I don't know if I'm understanding the difference of the query below. The difference is that if the @ReturnAll = 1  is true then the other conditions are ignored? That is the Where {UserStatus}.[Id] in (@UserCheckedStatus), and  {User}.[RegistrationDate]  >= @StartDate   and {User}.[RegistrationDate] < @EndDate, are ignored? 


SELECT {UserStatus}.[Label], Count({User}.Id)

FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]

Where {UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1 

 and  {User}.[RegistrationDate]  >= @StartDate   and {User}.[RegistrationDate] < @EndDate

group by {UserStatus}.[Label]



Not sure I follow your explanation, let me try.

The first query will return:

  • If @ReturnAll is 1 - return the row as long as the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate
  • If @ReturnAll is 0 - return the row as long as the UserStatus matches @UserCheckedStatus and the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate

The second query will return:

  • If @ReturnAll is 1 - all rows where the UserStatus matches @UserCheckedStatus and also rows with the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate
  • If @ReturnAll is 0 - only rows where the UserStatus matches @UserCheckedStatus

Craig St.Jean wrote:

Not sure I follow your explanation, let me try.

The first query will return:

  • If @ReturnAll is 1 - return the row as long as the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate
  • If @ReturnAll is 0 - return the row as long as the UserStatus matches @UserCheckedStatus and the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate

The second query will return:

  • If @ReturnAll is 1 - all rows where the UserStatus matches @UserCheckedStatus and also rows with the RegistrationDate >= @StartDate AND RegistrationDate < @EndDate
  • If @ReturnAll is 0 - only rows where the UserStatus matches @UserCheckedStatus

Thanks But I'm not understanding properly the result of the second query.

For example the  1st query, (A or B) and C and D:


SELECT {UserStatus}.[Label], Count({User}.Id)
FROM {User} JOIN  {UserStatus} on {User}.[UserStatusId] = {UserStatus}.[Id]
Where
({UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1)
and  {User}.[RegistrationDate]  >= @StartDate  
and {User}.[RegistrationDate] < @EndDate
group by {UserStatus}.[Label]


The @ReturnAll is just a boolean variable that is True if the other condition "{UserStatus}.[Id] in (@UserCheckedStatus)" is false right? That is, the @ReturnAll is just a variable that is True if the UserStatus.Id is not in the @UserCheckedStatus variable?

- So if @ReturnAll is 1 the current row is always returned if {User}.[RegistrationDate]  >= @StartDate  
and {User}.[RegistrationDate] < @EndDate
- And if @ReturnAll is 0 the current row is always returned if the UserStatus.Id is in the @UserCheckedStatus variable and {User}.[RegistrationDate]  >= @StartDate and {User}.[RegistrationDate] < @EndDate

So this looks similar to your explanation for the result for the 1st query.

But for the second query A or (B and C and D)
I'm not understanding properly the result of the query that you explained because Im understanding the result of the query like this, which seems different from your explanation:


- If @ReturlAll = 1, it means that {UserStatus}.[Id] in (@UserCheckedStatus) is false, so the current row should be returned only if B and C and D is True because its A or (B and C and D), so if A is False, (B and C and D) should be True for the current row be returned
- If @ReturlAll = 0, it means that the {UserStatus}.[Id] in (@UserCheckedStatus) is true, so the current row is always returned because if A is True is enough for the current row to be returned (because TRUE or False is True, so if A is True the (B and C and D) can be False)

Hi OsCaR_

The difference in the other query is the parenthesis.
What is inside of parenthesis are evaluated BEFORE the rest of the expression.
When you don't have the parenthesis, the AND will be evaluated BEFORE the OR.
So, in the end, your query that is like this:

{UserStatus}.[Id] in (@UserCheckedStatus)
or 
@ReturnAll = 1
and 
{User}.[RegistrationDate] = @StartDate 
and 
{User}.[RegistrationDate] < @EndDate

 would be evaluated in this way: 

{UserStatus}.[Id] in (@UserCheckedStatus)
or 
(
    @ReturnAll = 1
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Because the AND will be evaluated BEFORE the OR.
You can learn about operator precedence in SQL Server here: https://docs.microsoft.com/en-us/sql/t-sql/language-elements/operator-precedence-transact-sql?view=sql-server-2017

Cheers.

Thanks,

So this query, without parenthesis, if the @ReturnAll is 0, will return only rows where {UserStatus}.[Id] in (@UserCheckedStatus). This part I understand.

But if @ReturnAll if is 1 I'm not understanding properly, can you explain better how the query works in that case?

If @ReturnAll is 1, the  "{UserStatus}.[Id] in (@UserCheckedStatus)" is false right? So the current row is only stored in the output result if the other condition is True? That is, if "@ReturnAll = 1 and {User}.[RegistrationDate] = @StartDate  and  {User}.[RegistrationDate] < @EndDate" is True. Im understanding like this but it seems that is not correct.

{UserStatus}.[Id] in (@UserCheckedStatus)
or 
(
    @ReturnAll = 1
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Why do you think "{UserStatus}.[Id] in (@UserCheckedStatus)" depends on @ReturnAll? 

As there is a single item in the IN list, it is like if you were doing UserStatus.Id = @UserCheckedStatus. If both values are the same, than this expression will be evaluated as True, otherwise it will be evaluated as False, regardless of the value in @ReturnAll or the evaluation of @ReturnAll = 1 be True or False... 

Eduardo Jauch wrote:

Why do you think "{UserStatus}.[Id] in (@UserCheckedStatus)" depends on @ReturnAll? 

As there is a single item in the IN list, it is like if you were doing UserStatus.Id = @UserCheckedStatus. If both values are the same, than this expression will be evaluated as True, otherwise it will be evaluated as False, regardless of the value in @ReturnAll or the evaluation of @ReturnAll = 1 be True or False... 


Thanks, because the @ShowAll parameter is:

SelectedStatusIds = ""


Sorry. Didn't understand what do you meant. 

Eduardo Jauch wrote:

Sorry. Didn't understand what do you meant. 


Sorry is not @ShowAll is @ReturnAll, and @ReturnAll is equal to CheckedStatusIds = "". 

Ah!

So, outside the SQL, the value of @ReturnAll is defined using the expression CheckedStatusIds = ""?
Why? I mean, why is a text if @ReturnAll seems to be an integer?


Eduardo Jauch wrote:

Ah!

So, outside the SQL, the value of @ReturnAll is defined using the expression CheckedStatusIds = ""?
Why? I mean, why is a text if @ReturnAll seems to be an integer?



Thanks, but the @ReturnAll is a boolean that is true If CheckedStatusIds is "". 

Ok, now I see... This is related to the other question with about the string with commas.
Ok, let's go back to where we stopped :)

This:

{UserStatus}.[Id] in (@UserCheckedStatus)
or 
(
    @ReturnAll = 1
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Now, if @ReturnAll is 1 (true), your query filter will be the same as:

FALSE
or 
(
    TRUE
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

That can be rewritten as:

(
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Now, the result will still depend on the StartDate and EndDate.

Eduardo Jauch wrote:

Ok, now I see... This is related to the other question with about the string with commas.
Ok, let's go back to where we stopped :)

This:

{UserStatus}.[Id] in (@UserCheckedStatus)
or 
(
    @ReturnAll = 1
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Now, if @ReturnAll is 1 (true), your query filter will be the same as:

FALSE
or 
(
    TRUE
    and 
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

That can be rewritten as:

(
    {User}.[RegistrationDate] = @StartDate 
    and 
    {User}.[RegistrationDate] < @EndDate
)

Now, the result will still depend on the StartDate and EndDate.


Ah ok, thanks! So do you know if my explanation of the query, with and without parenthesis, is correct?

 Query With (): 

Where ({UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1) and {User}.[RegistrationDate] >= @StartDate and {User}.[RegistrationDate] < @EndDate group by {User Status}.[Label] 

Explanation with ():

- If @ReturnAll = 1, we have (False or True) and ... and ..., so the current row, that dont have an ID that is in the @UserCheckedStatus variable, is stored in the output result if {User}.[RegistrationDate] >= @StartDate and {User}.[RegistrationDate] < @EndDate

- If @ReturnAll = 0, we have (True or False) and ... and ...., so the current row, that has an ID that is in the @UserCheckedStatus variable, is stored in the output result if {User}.[RegistrationDate] >= @StartDate and {User}.[RegistrationDate] < @EndDate 

Query Without (): 

Where {UserStatus}.[Id] in (@UserCheckedStatus) or @ReturnAll = 1 and {User}.[RegistrationDate] >= @StartDate and {User}.[RegistrationDate] < @EndDate group by {UserStatus}.[Label] 

Explanation without (): 

- If @ReturnAll = 1, we have False or (True and ... and ...), so the current row, that has not and ID that is in the @UserCheckedStatus variable, is stored in the output result if {User}.[RegistrationDate] >= @StartDate and {User}.[RegistrationDate] < @EndDate 

- If @ReturnAll = 0, we have True or (False and ... and ....), so the current row, that has an ID that is in the @UserCheckedStatus variable, is stored in the output result independently of the StartDate and EndDate (because (False and ... and ....) is False)

That's exactly like you're saying. :) 

Eduardo Jauch wrote:

That's exactly like you're saying. :) 


Eheh, thanks!!