Invoke-IcingaCheckCPU : Incorrect high % values #404

Open
Aleksey-Maksimov opened this issue Jun 17, 2024 · 6 comments

@Aleksey-Maksimov

Hello.

Sometimes Invoke-IcingaCheckCPU reports strange load values greater than 100%.

[Image]

Environment configuration:

PowerShell Root                 => C:\Program Files\WindowsPowerShell\Modules\
Icinga for Windows Service Path => C:\Program Files\icinga-framework-service\
Icinga for Windows Service User => NT Authority\NetworkService
Icinga for Windows Service Pid  => 4544
Icinga for Windows JEA Pid      =>
Icinga Agent Path               => C:\Program Files\ICINGA2\
Icinga Agent User               => NT AUTHORITY\NetworkService
Defined Default User            => NT Authority\NetworkService
Icinga Managed User             => False
PowerShell Version              => 5.1.20348.2110
Operating System                => Microsoft Windows Server 2022 Standard
Operating System Version        => 10.0.20348
JEA Context                     =>
JEA Session File                =>
Api Check Forwarder             => True
Debug Mode                      => False

Icinga for Windows Certificate:

Issuer  => CN=Icinga CA
Subject => CN=winhost002.holding.com

List of configured background daemons on this system:

Start-IcingaWindowsRESTApi
-----------
No arguments defined

List of configured background service checks on this system:
=> https://icinga.com/docs/icinga-for-windows/latest/doc/110-Installation/06-Collect-Metrics-over-Time/

No background service checks configured

List of configured repositories on this system. The list order matches the apply order:

Icinga Stable
-----------
CloneSource  =>
Enabled      => True
LocalPath    =>
Order        => 0
RemotePath   => https://packages.icinga.com/IcingaForWindows/stable/ifw.repo.json
UseSCP       => False

Installed components on this system:

Component    Version   Available
---          ---       ---
agent        2.14.2    2.14.2
apichecks    1.2.0     1.2.0
cluster      1.3.0     1.3.0
framework    1.12.3    1.12.3
hyperv       1.3.0     1.3.0
inventory    1.2.0     1.2.0
kickstart    1.4.0
mssql        1.5.0     1.5.0
plugins      1.12.0    1.12.0
restapi      1.2.0     1.2.0
service      1.2.0     1.2.0

This behavior has been observed on different systems: on virtual servers with a small number of cores and on physical two-socket servers with a large number of cores.

[Image]

@Aleksey-Maksimov
Author

Almost all the time, the plugin shows correct % values that are similar to what we see in Windows Task Manager.
But sometimes something goes wrong in the plugin and it shows an unrealistically high % value.

[Image]

@LordHepipud
Collaborator

Hello,

Thank you for your issue. This is actually not a bug, but a behavior that was introduced with Windows 8 and applies to all later versions.

The performance counters we use since the latest version of Icinga for Windows rely on the same information as the Task Manager. However, while the Task Manager simply caps the CPU usage at 100%, the Icinga for Windows plugins print the actual, uncapped usage.

There is a detailed docs entry available from Microsoft: https://learn.microsoft.com/en-us/troubleshoot/windows-client/performance/cpu-usage-exceeds-100

The short version: systems that use Intel Turbo Boost or AMD PBO (Precision Boost Overdrive) report the current usage differently, depending on whether they are in the boost window and use the boost clock to complete tasks or are working at the base clock.
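
To illustrate how a boost clock can push a reading above 100%, here is a minimal sketch of a simplified model. It assumes that utility is roughly the busy time scaled by the current clock relative to the nominal clock. The function names and the sample numbers are hypothetical; this is neither the plugin's code nor the exact formula Windows uses.

```python
def approx_processor_utility(busy_percent: float, performance_percent: float) -> float:
    """Simplified model (assumption): busy time scaled by the clock ratio.

    busy_percent:        share of time the CPU was not idle (0..100)
    performance_percent: current clock relative to nominal clock, in percent
                         (above 100 while the CPU is boosting)
    """
    return busy_percent * performance_percent / 100.0


def capped_like_task_manager(utility_percent: float) -> float:
    """Cap the value at 100%, as the comment above says Task Manager does."""
    return min(utility_percent, 100.0)


# Hypothetical sample: CPU is 90% busy while running at 130% of its nominal clock.
utility = approx_processor_utility(90.0, 130.0)
print(utility)                            # 117.0
print(capped_like_task_manager(utility))  # 100.0
```

Under this model, a reading can only exceed 100% by about the boost ratio, so moderately elevated values are expected while readings many times above 100% are not.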

@LordHepipud self-assigned this on Jun 19, 2024

@Aleksey-Maksimov
Author

Aleksey-Maksimov commented Jun 19, 2024

The article you mentioned (https://learn.microsoft.com/en-us/troubleshoot/windows-client/performance/cpu-usage-exceeds-100) says that Task Manager can show more than 100%, but we have never seen more than 100% in Task Manager. In general, our virtualization hosts are not heavily loaded, and Task Manager never shows high load on them. The plugin, however, shows completely unrealistic numbers such as 1600% or 1900% (screenshot above). With such numbers, the values that we set as Critical and Warning thresholds completely lose their meaning.
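
The effect on thresholds can be sketched as follows. This is a simplified threshold comparison written for illustration only, not the plugin's actual evaluation logic; `check_state` and the threshold pairs are hypothetical, while the 1600% reading is the one reported above.

```python
def check_state(value: float, warning: float, critical: float) -> str:
    """Simplified evaluation (assumption): alert when the value exceeds a threshold."""
    if value > critical:
        return "CRITICAL"
    if value > warning:
        return "WARNING"
    return "OK"


# Hypothetical Warning/Critical pairs an admin might choose on a 0..100% scale.
threshold_pairs = [(50.0, 70.0), (80.0, 90.0), (95.0, 100.0)]

# A reading of 1600% exceeds every pair, so the choice of thresholds stops mattering.
states = [check_state(1600.0, w, c) for w, c in threshold_pairs]
print(states)  # ['CRITICAL', 'CRITICAL', 'CRITICAL']

# A realistic reading on a lightly loaded host stays within the thresholds.
print(check_state(35.0, 80.0, 90.0))  # OK
```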

@Aleksey-Maksimov
Author

I think it would be more correct for the plugin to use the \Processor Information(*)\% Processor Time counter instead of \Processor Information(*)\% Processor Utility. In this case, we would see a clear figure of 100% at peak loads.

[Image]

A similar issue was discussed in microsoft/Windows-Dev-Performance: microsoft/Windows-Dev-Performance#78

@LordHepipud
Collaborator

LordHepipud commented Aug 12, 2024

Thank you for the feedback. The % Processor Time counter was replaced a while ago because the Task Manager uses different performance metrics, and users complained that the load reported by Icinga for Windows did not match the Task Manager numbers.

We are currently investigating the reported numbers, but as of now it seems that the Windows performance counter library is reporting odd values. While usage beyond 100% can happen, a load of 400% or higher is not possible.
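
As a rough illustration of that plausibility bound, the sketch below separates readings that could be real from those that cannot. The 400% limit comes from the statement above. The function name and the sample list are hypothetical (only 1600 and 1900 are values reported in this issue), and this is not a fix that exists in the plugin.

```python
from typing import Iterable, List, Tuple

# Assumption: the "400% or higher is not possible" bound stated in the comment above.
PLAUSIBLE_LIMIT_PERCENT = 400.0


def split_plausible(
    samples: Iterable[float], limit: float = PLAUSIBLE_LIMIT_PERCENT
) -> Tuple[List[float], List[float]]:
    """Return (plausible, suspect) readings; suspect means negative or >= limit."""
    plausible: List[float] = []
    suspect: List[float] = []
    for value in samples:
        if 0.0 <= value < limit:
            plausible.append(value)
        else:
            suspect.append(value)
    return plausible, suspect


# Hypothetical series of CPU load readings in percent.
readings = [12.5, 104.0, 1600.0, 37.0, 1900.0]
plausible, suspect = split_plausible(readings)
print(plausible)  # [12.5, 104.0, 37.0]
print(suspect)    # [1600.0, 1900.0]
```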

Are you running these Windows machines as virtual machines on ESXi by any chance?

@Aleksey-Maksimov
Author

This issue is observed on both virtual servers and Hyper-V virtualization hosts with Windows Server 2022.
