Identifying processor overload under the .NET Stack using Performance Counters

If you are experiencing high processor load and suspect it is caused by a misbehaving application (for instance, one that consumes all the processor resources because of poor design), there is no easy way of telling which application is to blame.

For OutSystems applications, you can use the process described here to gain insight into which application may be misusing the processor.

First of all, we need a tool to measure the processor usage of each process. We will use Performance Monitor, the Windows Server 2003 administrative tool. The following steps set up the gathering of the relevant processor performance indicators:

    1) Open Performance Monitor: click the Start button in the taskbar, go to Administrative Tools and click Performance;
    2) In the left pane, expand ‘Performance Logs and Alerts’;
    3) Right-click ‘Counter Logs’ and select ‘New Log Settings’;
    4) Choose a name for the log and click the OK button;
    5) On the General tab, click the ‘Add Counters’ button;
    6) Click ‘Select counters from computer’ and use the drop-down to select your machine name;
    7) Under ‘Performance object’, select Process and then select the radio button ‘Select counters from list’;
    8) From the counters list, select each of the following by clicking the counter and then clicking the Add button:
                a. % Privileged Time
                b. % Processor Time
                c. % User Time
                d. Private Bytes
                e. Virtual Bytes
                f. ID Process
    9) After adding all of the necessary counters, select the radio button ‘All instances’. Although in this scenario we suspect that an OutSystems application is misbehaving, we cannot be certain of that. By selecting all the processes we can quickly gain some insight into the cause of the excessive processor load. Click the Close button;
    10) Go to the Log Files tab and, under Log file type, select ‘Text File (Comma delimited)’. This lets you easily view the logs later using a spreadsheet;
    11) Go to the Schedule tab and, under Start log, select Manually;
    12) Click the OK button.

Now that you have set up a counter log for your processes, you can start gathering data by right-clicking the created counter log and selecting Start.

At this point we can collect information about processor usage by process. If the logs show that the problem has nothing to do with the w3wp processes, you can stop here and investigate the dependencies of the process that is consuming all of the processor resources. If, on the other hand, the process responsible is one of the w3wp instances, you should proceed to find out which application is to blame.
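
If you would rather not eyeball the comma-delimited log in a spreadsheet, the check above can be sketched in a few lines of Python. This is only a rough sketch: it assumes the log header uses the usual `\\MACHINE\Process(instance)\counter` column names, and the machine name `SERVER01`, the timestamps and the sample values are made up for illustration.

```python
import csv
import io
import re
from collections import defaultdict

# Matches column headers such as \\SERVER01\Process(w3wp#1)\% Processor Time
# (assumed layout of the comma-delimited counter log header).
COLUMN = re.compile(r"^\\\\[^\\]+\\Process\((?P<instance>[^)]+)\)\\(?P<counter>.+)$")

# Aggregate instances that do not correspond to a real process.
IGNORED = {"_Total", "Idle"}

def average_cpu_by_instance(csv_text):
    """Return {process instance: mean '% Processor Time'} for a counter log."""
    rows = csv.reader(io.StringIO(csv_text))
    header = next(rows)
    # Map column index -> process instance, for '% Processor Time' columns only.
    cpu_columns = {}
    for index, name in enumerate(header):
        match = COLUMN.match(name)
        if match and match.group("counter") == "% Processor Time":
            if match.group("instance") not in IGNORED:
                cpu_columns[index] = match.group("instance")
    samples = defaultdict(list)
    for row in rows:
        for index, instance in cpu_columns.items():
            try:
                samples[instance].append(float(row[index]))
            except (ValueError, IndexError):
                continue  # blank cell: no sample for this process at this time
    return {inst: sum(vals) / len(vals) for inst, vals in samples.items()}

# Made-up sample log: two w3wp instances, SQL Server and the _Total aggregate.
SAMPLE_LOG = "\n".join([
    '"(PDH-CSV 4.0) (GMT Standard Time)(0)",'
    '"\\\\SERVER01\\Process(w3wp)\\% Processor Time",'
    '"\\\\SERVER01\\Process(w3wp#1)\\% Processor Time",'
    '"\\\\SERVER01\\Process(sqlservr)\\% Processor Time",'
    '"\\\\SERVER01\\Process(_Total)\\% Processor Time"',
    '"03/01/2010 10:00:00.000","95.5","2.0","1.5","99.0"',
    '"03/01/2010 10:00:15.000","90.5","4.0"," ","94.5"',
])

if __name__ == "__main__":
    averages = average_cpu_by_instance(SAMPLE_LOG)
    for name, value in sorted(averages.items(), key=lambda item: item[1], reverse=True):
        print(f"{name}: {value:.1f}")
```

Keep in mind that when several w3wp processes are running, they show up as separate instances (w3wp, w3wp#1 and so on), so the ID Process counter is what ties an instance back to an actual process ID.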

Regarding OutSystems applications, we only know that they run in one w3wp process (or several, depending on your IIS configuration). Although this tells us that the problem lies in one of the OutSystems applications, it does not tell us which one, so we need to do more than just use a performance monitoring tool.

To isolate the application that is potentially causing the processor to spike, we need to divide the applications into separate application pools. This lets us monitor the w3wp processes serving them separately.

Since you might have a large factory, we propose a binary-search-like strategy. Start by dividing your factory into two application pools, then view the logs to see which of the corresponding w3wp processes is consuming more processor resources. Repeat the split on the applications of the busier pool until the pool showing the high usage contains a single application, which pinpoints the application that is causing the problem.
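
To make the halving idea concrete, here is a minimal Python sketch of the strategy. It is an illustration only: `pool_is_overloaded` is a hypothetical callback standing in for the manual work of moving a group of applications into their own application pool, collecting a counter log and checking that pool's w3wp process. The application names are invented, and the sketch assumes exactly one application is responsible.

```python
def find_hot_application(applications, pool_is_overloaded):
    """Narrow a list of applications down to the one overloading the processor.

    pool_is_overloaded(apps) is a hypothetical callback: it represents moving
    `apps` into a dedicated application pool, gathering a counter log, and
    reporting whether that pool's w3wp process shows the high processor usage.
    Assumes a single misbehaving application.
    """
    candidates = list(applications)
    if not candidates:
        raise ValueError("no applications to inspect")
    rounds = 0
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # Keep whichever half contains the overloaded w3wp process.
        candidates = first if pool_is_overloaded(first) else second
        rounds += 1
    return candidates[0], rounds

if __name__ == "__main__":
    # Invented factory of eight applications; "App6" plays the culprit.
    factory = [f"App{n}" for n in range(1, 9)]
    culprit, rounds = find_hot_application(factory, lambda apps: "App6" in apps)
    print(culprit, "found after", rounds, "rounds")
```

With eight applications this takes three rounds of splitting and measuring instead of up to seven one-by-one isolations, which is the whole point of halving the factory each time.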

Note: this binary-search-like strategy of dividing the application pools helps in situations where processor usage is very high and something is clearly wrong. If you just want to know which application consumes the most resources, this is not the approach you are looking for.

Although the logs gathered with the strategy above can tell you the processor usage by process ID, we also need to map each w3wp process instance to the application pool it is serving, so that we can determine which pool contains the application to troubleshoot. To do so, run the w3wpProcessInfo.bat batch script (tested on Windows Server 2003), which gathers each w3wp process ID and the corresponding application pool name. Once we have this information we can cross-reference it with the processor usage logs.


w3wpProcessInfo.bat
---------------------------------------------------------------------------------------------------------------------------------------
@echo off
rem Delayed expansion makes !time! refresh on every loop iteration
setlocal enabledelayedexpansion
echo Gathering w3wp processes info, this will take about 30 seconds.....
for /L %%x in (1,1,30) do (
       timeout /t 1 > nul
       echo ---------------------------------------------------------------------------- >> ProcessIdDescription.txt
       echo !time! >> ProcessIdDescription.txt
       echo ---------------------------------------------------------------------------- >> ProcessIdDescription.txt
       rem iisapp.vbs lists each w3wp process ID with its application pool
       cscript //nologo c:\WINDOWS\system32\iisapp.vbs >> ProcessIdDescription.txt
)
----------------------------------------------------------------------------------------------------------------------------------------
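
For the cross-referencing step, a small Python sketch along these lines may help. It assumes iisapp.vbs prints one line per worker process in the form `W3WP.exe PID: 2232   AppPoolId: DefaultAppPool`, and that you have already worked out the processor usage per process ID from the counter log. The pool names, process IDs and usage figures below are invented for illustration.

```python
import re

# Assumed iisapp.vbs line format: "W3WP.exe PID: 2232   AppPoolId: DefaultAppPool"
LINE = re.compile(r"PID:\s*(?P<pid>\d+)\s+AppPoolId:\s*(?P<pool>.+?)\s*$")

def pools_by_pid(text):
    """Parse ProcessIdDescription.txt content into {process ID: pool name}."""
    mapping = {}
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            # Later snapshots overwrite earlier ones for the same process ID.
            mapping[int(match.group("pid"))] = match.group("pool")
    return mapping

def cpu_by_pool(cpu_by_pid, pool_by_pid):
    """Attach pool names to per-process CPU figures, busiest process first."""
    rows = [
        (pool_by_pid.get(pid, "(not a w3wp process)"), pid, cpu)
        for pid, cpu in cpu_by_pid.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Made-up excerpt of ProcessIdDescription.txt (one snapshot).
SAMPLE_OUTPUT = """\
----------------------------------------------------------------------------
10:00:01.25
----------------------------------------------------------------------------
W3WP.exe PID: 2232   AppPoolId: OutSystemsPoolA
W3WP.exe PID: 3412   AppPoolId: OutSystemsPoolB
"""

if __name__ == "__main__":
    pools = pools_by_pid(SAMPLE_OUTPUT)
    # Hypothetical averages taken from the counter log, keyed by ID Process.
    usage = {3412: 3.0, 2232: 93.0}
    for pool, pid, cpu in cpu_by_pool(usage, pools):
        print(f"{pool} (PID {pid}): {cpu:.1f}")
```

Note that a process ID only identifies a pool while that worker process is alive. If a pool recycles during the measurement, its process ID changes, so gather the script output and the counter log over the same time window.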
Hi Pedro,

Steps 8 and 9 were not clear to me.

In step 8, when we add the counters, which instances should we choose? (_Total, right?)

In step 9, which counter must be selected when selecting all instances? (ID Process?)
Then we should click "Add"; otherwise, clicking the Close button adds nothing.

Best regards
Is there any way to identify what page/function is causing these kinds of issues? Our eSpaces can be pretty big, finding out which eSpace is the problem is a help, but not as much as we need.

Thanks!

J.Ja
Justin James wrote:
Is there any way to identify what page/function is causing these kinds of issues? Our eSpaces can be pretty big, finding out which eSpace is the problem is a help, but not as much as we need.

Thanks!

J.Ja
Finding the eSpace responsible for processor overload is almost "mission impossible", since you must walk in the dark, guessing which eSpace to move from one app pool to another to isolate the others, until you have only one eSpace isolated. This approach also collapses if, instead of one eSpace causing trouble, you have two.


What you ask is a bright blue sky and in my opinion the path to follow.

João Inácio
 
Hello,

Actually, there is a pretty simple way (in some cases) to identify not only the eSpace taking up the CPU usage but also exactly which page is being executed.

Go to IIS Manager,
select the server and,
in the middle pane, select Worker Processes.

Here you'll see a list of all worker processes and some stats (including CPU usage). One of them will be taking up a lot of CPU.

Click that application pool.
Now you'll see a list of requests being serviced by that application pool, with information about how long they have been running.

Usually the ones taking up a lot of CPU are the ones that last the longest, so you can usually identify those rather quickly using this method.

Best regards,
Ricardo Silva
Here is something I should have known long ago!!

Good job Ricardo, thanks for sharing.

Regards,
João Inácio
Hi Ricardo,

Your approach seems suitable for identifying each request in a given application pool.
Nonetheless, is it also possible to obtain the corresponding HTTP request (GET or POST) for each of those requests?

Thanks for your attention,
José Martins