Hi,
I think I am quickly coming to the conclusion that what vCOPS is telling me with regard to resources is not always a good indicator of what is really going on.
Over the last week I had a few people asking for more CPU as they noticed bad performance. More often than not this is actually not the issue.
I have a few cases where the VM is peaking at 100% CPU (not necessarily for long). The first thing I always do is use vCOPS and start generating some metric graphs. In these cases, the demand or usage in vCOPS is not anywhere near 100%. Even the peak demand is usually not anywhere near.
In one monitored case the demand/usage was around 76% (30 min period). top showed that during this time the CPU had less than 1% idle.
What explains the difference here? I know that vCOPS/vCenter gets its stats at 20 sec intervals, so if the CPU runs at 50% for 10 sec and 100% for the other 10 sec I would get 75% for the interval. Is this how it works? If so, the VM may perform poorly because it is actually running at 100% for half of those 20 seconds. This VM was monitored through ESXTOP, and %RDY and %CSTP times were pretty much non-existent.
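To make the arithmetic in my question concrete, here is a rough Python sketch. It assumes the value reported for an interval is a plain mean of what happened inside it, which is exactly the part I am unsure about. The per-second samples and the function name are made up for illustration and are not anything vCOPS or vCenter exposes.

```python
def interval_average(samples):
    """Plain mean of per-second CPU utilisation samples (in percent)."""
    return sum(samples) / len(samples)


# Hypothetical 20-second interval: 10 s at 50% followed by 10 s at 100%.
samples = [50.0] * 10 + [100.0] * 10

average = interval_average(samples)          # what a single averaged data point would show
peak = max(samples)                          # what top / Task Manager would show at the worst moment
saturated_seconds = sum(1 for s in samples if s >= 100.0)

print(f"interval average: {average:.1f}%")
print(f"peak within interval: {peak:.1f}%")
print(f"seconds at 100%: {saturated_seconds} of {len(samples)}")
```

If the averaging really works like this, a chart showing 75% would be hiding ten seconds of a fully saturated CPU.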
In another case I saw pretty much the same issue: 100% CPU in Task Manager and low demand/usage in vCOPS, and when we added a vCPU the VM was very happy.
I would like to be able to tell people confidently whether there is indeed an issue. Most of the time I do not have insight into the VM or the applications themselves, but I am worried I might give people the wrong impression when I show them the charts.
In all of the above there did not appear to be any underlying issues.
Cheers