What should the CPU usage be of a fully-loaded CPU that has been throttled? (microsoft.com)
190 points by pathompong on July 3, 2021 | 176 comments



A key problem with this proposal is that modern CPUs do not have a single definitive maximum frequency. You have the base frequency, which is rarely relevant outside of synthetic workloads, and then a variety of turbo frequencies which interact in complex ways. AMD's latest CPUs don't even have clear upper bounds on their frequency scaling logic. It's a big black box and the results can vary depending on the workload, temperature, silicon quality, phase of the moon, etc.


CPU usage is an incredibly complex metric that doesn't really explain what is actually going on. I noticed while running a benchmark that I can get my CPU to 50c and 100% usage and it stays steady like that. But then I tried prime95 and my cpu very quickly hit 99c also at 100%. Likely both benchmarks were running as fast as they could, but the prime95 one ran on a faster part of the CPU, which could generate more heat, and didn't get stuck waiting on memory or other slow ops.


CPU usage calculation can be (mainly) broken down into "percentage time spent not idle due to program" vs "percentage of max performance used by program". The former is usually what an end user cares about and measures the amount a program is hogging the system. The latter is usually what a programmer wants to know to see how optimized the program is.

Both are extremely complex with a lot of nuance beyond that, but the "percentage time spent not idle due to program" type this article refers to tends to be simpler than trying to figure out caches, mode switches, instruction-level parallelism, instructions per clock per instruction type, and so on, on top of everything you need to figure out in the first case anyway.


Sounds like you're actually comparing your CPU in AVX mode vs. not. This is a common source of this type of behavior; AVX is insanely power hungry.


I've found that CPU temperature is an extremely useful way to measure load on the system. All of my computers mine cryptocurrency in the background. I used cgroups to limit their CPU usage, tuning the parameters until the CPU temperature reached acceptable levels. Sometimes I also observe huge spikes in temperature when I do random things. Scrolling certain javascript-heavy pages causes my temperature readings to briefly spike all the way up to 90 °C!
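For the curious, here's a minimal sketch of the cgroup part using the cgroup v2 interface (the group name and PID variable are just placeholders; on a systemd-managed distro you'd more likely set an equivalent CPUQuota= on a unit):

    # create a group and cap it at 50% of one CPU (quota/period in microseconds)
    mkdir /sys/fs/cgroup/miner
    echo "50000 100000" > /sys/fs/cgroup/miner/cpu.max
    # move the mining process into the group
    echo "$MINER_PID" > /sys/fs/cgroup/miner/cgroup.procs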


I think you could prove that on Linux by getting the number of instructions run. I think perf has a way of getting that info for you. Not sure if Windows has something similar.
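Something like this should do it, assuming perf is installed (the benchmark binary name is just a placeholder):

    # instructions retired and cycles for one run of the workload
    perf stat -e instructions,cycles ./my_benchmark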


Hmm, different workloads will have different mixes of instructions, which will take very different amounts of time per instruction (e.g. a chain of uncached memory lookups vs a tight loop of register-register arithmetic). If you had two instances of the same workload, and one was throttled, then yes, you could compare instructions per unit time. But comparing prime95 vs the other benchmark is likely misleading.


Yeah, that would be the thing to prove - that prime95 is going through more instructions, so the CPU runs hotter.


I see. But I expect that some instructions cost more (i.e. generate more heat) than others. Someone mentions AVX instructions, for example; presumably 1 million AVX-512 fused-multiply-add instructions cost a lot more than 1 million loads (we can probably arrange for the right proportion of loads to hit a CPU cache vs. go to main memory, so that they end up taking the same time). Even without AVX, I imagine things like 64-bit integer divisions or CRC32 instructions cost more than loads or stores, though I don't know by how much.
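If you want to measure the energy side directly instead of inferring it from the instruction mix, perf can read the RAPL energy counters on CPUs that expose them (a sketch only; it typically needs root, and the event name assumes RAPL support):

    # package energy consumed system-wide over 10 seconds; run once per workload and compare
    perf stat -a -e power/energy-pkg/ sleep 10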


Oh, that makes sense. I guess that would be another aspect to measure.


> my cpu very quickly hit 99c also at 100%

That should never happen, and it's ridiculous that we just let manufacturers get away with it.


Why? 100c is typically the max nominal safe operating temp for CPUs. It would be a waste of resources to add additional cooling to computers not intended to run these types of workloads.

Prime95 is basically a synthetic workload so it makes little sense to optimize for it.


Yeah, the thing is it doesn't only happen in prime95. Nowadays it's any prolonged use where the CPU is fully used, like video editing or gaming. Give an inch and they'll take a mile, as the saying goes.

Temperature junction throttling is a last resort. No laptop should rely on it in normal operation. Of course, both HP/Dell/Lenovo/etc and Intel benefit from increased sales so they don't care.


Counterpoint - during typical (consumer) usage hardware spends most of the time idle. Hardware capable of sustaining the maximum workload indefinitely is likely to have a lower maximum in practice. Unsustainable bursts are likely to provide higher overall performance for typical workloads, so it makes sense to optimize the hardware design for those.


I guess, but workstation class laptops still overheat, so again, piss poor design.


I'd be willing to bet that a "workstation class" laptop that had sufficient cooling to run continuous benchmarks without overheating would be quite unpopular in the market, because of the weight/size burden.


How did you determine that 100c is safe?


The manufacturer did and put it in the datasheet of the CPU


In addition to what the other commenter said, the critical temperature that is listed is usually higher than the maximum operating temperature, so it is safe to run at the maximum.


It is safe only because it throttles heavily. Shutdown temperature is just 5 degrees higher btw.

At 100 the processor is at high risk of damage, which is why there's a built-in throttling mechanism.

Just imo, if it was safe it would be running at full speed (or at least max base clock) at that temperature.

As someone else said, they're made for burst operation these days, but again, that does not excuse manufacturers using subpar cooling.

I can see the majority is fine with it, but I'm not. A 15-25% failure rate in 2 years would make any other product a rotten lemon. But somehow it's acceptable for computers. Probably because people replace them every 2 years regardless, which is another insanity on its own.


I personally run my hardware for longevity (larger PSU, good temperature ranges, minimal overclocking, etc.) and upgrade every two years but I migrate my tech. downwards. Example: Gaming Rig -> Misc. Project/Guest PC -> NAS/Media Box. Working in tech. you very often make a salary that allows you to upgrade all of these rigs every year with no real consequences, but I think if nothing else we should be trying to reduce tech. waste. Semiconductor manufacturing has a large carbon footprint just like everything else.


I was testing overclocks when this happened. This was also a very old intel chip.


By this proposal's logic, it would make total sense for processes on a boosting CPU core to report more than 100% cpu usage.


“Military power” seems to fit perfectly…

> The etiology of military power is from War Emergency Power (WEP), which in the WWII era was a higher-than-normal power setting (i.e. >100% of rated power) on an aircraft engine. Such power settings were approved for short durations (typically 5 minutes or less) such as takeoff and battle maneuvers.

> The term was quickly shortened to military power.


Please let's not add yet more military fetishizing to IT. "Military power" should refer to things like military engines, not totally unrelated electronic performance boost modes.


IT overlap is mostly because computers and the internet are military technology. Mainframes, ICs and the transistor itself were bankrolled by US military money.

The internet and GPS, the core technologies of our time, are both direct descendants of the US military.

You may not like it, but the entire IT and tech field are built on US military tech


Civilian and military engineering history are intertwined. The tech transfer doesn't go one way only.

In ancient times both mercantile and military incentives propelled the design of ships. Is it fair to say the whole shipbuilding field is built on military tech?


You're just picking a point in time and calling that the beginning; every one of those things is based on previous tech with no military background.

Even if it were true that in the distant past a thing had a military application/funding, there's no reason to use their terms for new things which aren't military in nature. No commercial liner calls its full speed "military power". It's just cosplay at this point.


If you go far enough back all of our technology arises from fire and the wheel.

There's plenty of room to debate how much of our technology came from the military, but I don't think naming is a big deal.

Take the word "screen" for example. The original meaning was a partition to protect from heat. Many of these were fabric, and led to "magic lantern" shows done with shadows. The word was repurposed again for the projection era. And yet again for tube TVs and beyond.

If you told somebody from 100 years ago to look at the screen they would have no idea what you're talking about because the original meaning is lost.

That's why I don't think it matters whether we use military terms for computer stuff. It doesn't have its original meaning anymore.


> I don't think naming is a big deal

Good, we agree, no more tacti-cool cosplaying in IT.


"Overdrive"


Turbo. I miss the old turbo button, even when it didn't actually do much of anything.


The turbo button, when switched off, actually slowed the CPU to 4.77 MHz (the speed of the 8086/8088), because early programs' timing logic was based on a 4.77 MHz clock frequency.


This used to be the case -- XP era had it report as a percentage of the target frequency, so an Athlon XP would usually run at 110%. This is confusing to people who believe 100% is the maximum.


"these go to eleven!"

https://youtu.be/hW008FcKr3Q


Linux control groups and systemd CPU quotas use values larger than 100% for denoting more than one CPU.

https://www.freedesktop.org/software/systemd/man/systemd.res...

> The percentage specifies how much CPU time the unit shall get at maximum, relative to the total CPU time available on one CPU. Use values > 100% for allotting CPU time on more than one CPU.
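For example, a drop-in like this (the unit name is hypothetical) caps a service at one and a half CPUs' worth of time:

    # /etc/systemd/system/miner.service.d/cpu.conf
    [Service]
    CPUQuota=150%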


Absolutely. If this really matters to you, I think the only solution is to characterise the CPU over an afternoon or two and build a model based on the data, rather than on what the manufacturer told you.


The OS can keep a record over time of the maximum frequency each CPU core has ever hit.

This will take into account machine-to-machine variance, and even environmental factors affecting maximum speed.
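On Linux, at least, the raw data for such a record is already exposed via cpufreq sysfs (assuming the driver provides these files), so a monitor could just sample it:

    # current vs. advertised maximum frequency (kHz), per core
    for c in /sys/devices/system/cpu/cpu[0-9]*/cpufreq; do
        echo "$c cur=$(cat $c/scaling_cur_freq) max=$(cat $c/cpuinfo_max_freq)"
    done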


Oh, but it gets more fun. The same operation can take more clocks for various reasons... If I really undervolt my Zen 2 APU, power usage and benchmark scores go way down, but clock frequency stays high; the CPU is clock stretching and it gets a lot less work done.

Anyway, current processors run a separate clock per core, and maximum clocks are only available when a small number of cores are active; if all cores are busy, that should really be 100%, even if each core is only doing 80% of max for a single core.

Mostly, I want to see % of time cpu is busy, and separately, stats on how throttled the cpu is, because it's hard to combine both into a coherent number. Maybe also some idea of how much of the core is being exercised, if it can be easily measured... I'd love to know when a program is keeping the cpu busy, but not making good use of it.


That strategy guarantees that a process that runs for multiple minutes while consuming all available CPU cycles will be reported as using 100% CPU at most during the first few seconds, after which it will usually be reported as using somewhere less than 90%, and realistically could be reported as low as 65%. How is this helpful?


Estimating x86 core frequency is a lot trickier than you've implied.


Ideally, in the article’s example, the program generating heat and causing the throttle would have the throttled time counted against it (as a separate metric).

That’s hard to do without hardware support, but I wonder how hard it would be to get a decent metric with currently available performance counters.


Not only that, but modern CPUs aren't even deterministic in how they run an instruction. The result of any particular instruction is deterministic, but the location of the silicon and the amount of silicon used to complete the instruction are not deterministic anymore.

CPU Usage measured in percentage doesn't make sense on any CPU with modern performance features like speculative execution and branch prediction.


When the OS has asked the CPU to slow down to more closely match the performance currently required by the software, then it is somewhat misleading to report that an application is using 90+% of the CPU time, even if the CPU is actually spending 90+% of its time running that application.

However, when the CPU's speed has been reduced because it's too hot or the system is otherwise unable to allow the processor to sustain its full clock speed, you absolutely should see the Task Manager reporting 100% utilization. This scenario is what more often comes to mind when the term "throttled" is used.

If a hardware platform's power management capabilities make it impractical for the operating system to satisfy both of the above goals, then it should favor the latter goal, and err on the side of not lying to the user when your system is truly operating at its limits.


> However, when the CPU's speed has been reduced because it's too hot …

Maybe just measure overall system load as current CPU temperature as a percentage of maximum?


That's a decent method for some purposes, though it's not without its own flaws. For systems using integrated graphics or any laptop, the CPU's thermals are intertwined with the GPU's. You may also have a workload that sends a CPU core to its highest stable clock speed and safe voltage, but is still cooled well enough that the CPU doesn't get close to its temperature limits. Or you could have a situation where the CPU can tolerate a higher temperature, but it has to throttle so as not to burn the user because a fair chunk of that heat is being conducted up through the keyboard and down into the lap.


> but it has to throttle so as not to burn the user because a fair chunk of that heat is being conducted up through the keyboard and down into the lap.

I've never used a mobile device that seemed to have such considerations. They all seem perfectly happy to make their surface scalding hot. Do manufacturers actually care about this?


Perhaps in stable environments that would be reasonable, but it would also mean that if I put my laptop in the sun, the system load should become large and potentially (hot heatwave sun on a dark laptop) exceed 100% even if the laptop is completely idle.


Fun tangential anecdote regarding how interconnected and unintuitive CPU performance can be: I once made something run 20% faster by spawning a thread that did nothing but spin (i.e. while (true);).

I was trying to optimize some FEM code, toying with (hardcoded) solver parameters. On one console I had it spitting out the wall clock durations of time steps as the simulation was running, while on the other I was preparing the next run. I start compiling another version, and inexplicably the simulation in the other console gets faster. Like, 10%-20% less time taken per time step. "That must have been coincidence. There's no way the simulation got faster by compiling something in parallel." But curiosity got the better of me and I still investigated.

Watching the CPU speed with CPU-Z, it turned out that the simulation was indeed getting down-clocked, and that compiling something in parallel made the CPU run faster, speeding up the simulation too. WTF? And indeed, I could make the entire simulation run significantly faster by calling

    std::thread([](){ while (true); }).detach();  // detach, or the temporary thread's destructor would call std::terminate
at the start of main.

Why? Well, the simulation happens to be extremely memory-bound (sparse mat-vec multiplication in inner loop). So the CPU is mostly waiting around for data to arrive. Apparently the CPU downclocks as a result. That would be fine, if not for the fact that the uncore/memory subsystem clock speed is directly tied to the current CPU speed. That's right: The program was memory-bound, hence the CPU clocked down, hence the uncore clocked down, hence memory accesses became slower.

Knowing that feedback loop, it makes perfect sense that keeping the CPU busy with a spinning thread improves performance. But it's still one big wtf.

This problem eventually went away as we parallelized more and more of the simulation, giving the CPU less reason to clock down. But for related reasons, the simulation still runs faster if you prevent hyperthreading (either by disabling it in the BIOS or setting num threads = num hardware cores). More threads don't improve memory bandwidth, and the hyperthread pairs just step on each other's toes.
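If you want to size the thread pool to physical cores on Linux, one rough way (assuming util-linux's lscpu is available) is to count unique core/socket pairs:

    # logical CPUs, including SMT siblings
    nproc
    # physical cores: unique (core, socket) pairs from lscpu's parseable output
    lscpu -p=CORE,SOCKET | grep -v '^#' | sort -u | wc -l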


I'm confused, how is CPU speed tied to memory? AFAIK, memory is tied to the CPU base clock, which is almost always 100 MHz. The CPU then just scales its own multiplier.


Northbridge frequency (as shown by CPU-Z) is correlated to CPU speed in my experiments. It's not one to one, but NB frequency definitely varies by a factor of two depending on CPU load.

What exact mechanism controls this is not clear to me (and I'm actually not sure if it's clear to anyone outside of Intel - the one paper [0] I found at the time was based on reverse engineering experiments). Nevertheless, CPU clock speed definitely affects Northbridge speed, as proven by the latter increasing from spinning a thread that never touches memory.

[0]: https://tu-dresden.de/zih/forschung/ressourcen/dateien/proje...

See section V.A:

> The results [...] indicate that uncore frequencies – in addition to EPB and stall cycles – depend on the core frequency of the fastest active core on the system.

(That conclusion is fully in line with my own observations.)

Also see the corresponding patent linked in the paper: https://patents.google.com/patent/WO2013137862A1/en


Power saving and "throttling" (usually it's un-turboing) are different cases that shouldn't be conflated; in one case the processor could run faster but in the other it can't. Ultimately we may want different metrics depending on what they're going to be used for. If you calculate relative to base frequency you will get utilization over 100% which is going to confuse some people.

Linux has done some work in this area with frequency-invariant utilization tracking: https://lwn.net/Articles/816388/


MacOS has a weird solution to throttling. They put a fake process in the process list which looks like it is consuming x% of the resources, but really it is just blocking some usage to allow the CPU to cool.


kernel_task is "real", but yeah, part of its responsibility is issuing NOOPs for thermal control: https://github.com/apple/darwin-xnu/blob/main/osfmk/kern/thr...


I wish they had a separate, properly named process for it. My old MB Pro (late 2015) was throttling a lot and it took me a very long time to recognize it, as it was just "kernel_task" gobbling up CPU. Getting the machine cleaned out at a service center did help a lot; I wonder whether redoing the thermal paste on the CPU would have improved it further, but I ended up replacing it with a current model.


I love this, tbh. It's a great indicator and allows me to easily use the existing tracking/functions in Activity Monitor to gauge thermal headroom at particular points, etc.


It's the right solution IMO. It doesn't artificially inflate the % of running processes but total % stays high, indicating no reserve capacity.


Linux has the same solution on my machine.


I think we can have more nuance than a choice of 50% or 100%. I fall in the 'show 100%' camp for sure; showing 50% but not indicating why isn't particularly helpful without knowing the background here. I think a separate indicator for an overall throttling condition would be helpful: show 100% usage and a throttle indicator together.


Agree 100% (heh!). The utilisation really needs to show busy as a % of presently available resources, and displaying what those are is a completely different concern.

For Windows, if anything I'd love to see an additional metric, where the app resource usage is presented in % of available cores.

E.g. if I have a single-threaded app (a game, say) on a 10-core CPU with SMT disabled, it might report 10% usage. But looking at the core graphs we see one at 100% with the other 9 idling. So the app is really maxed out at its architectural limit, i.e. 100%. It would be nice to have an additional column representing this - and maybe an idea of the foreground app's available % usage in other reporting (am I CPU limited, or GPU? People have bought Intel over AMD CPUs for this metric even though multi-thread workload capacity has been far greater on AMD).

Taking a step back again, especially for Ryzen CPUs, unless manually locked you'll have different per-core frequencies - so the article's suggestion becomes an almost impossible mission: a Mandelbrot render can be split across all threads rather equally for an unthrottled 10-core run at, say, 4 GHz, but a core-limited config may be able to run it at 4.5 GHz prior to thermal throttling - so what is the real max to report for a 10-core run after that? And on a very cold day, with fans at a higher RPM, the core-limited run might hit a higher 4.8 GHz - now what? You just can't; reporting on current CPU capability is a separate concern altogether.


> but not indicating why

Agree, you really want 2 indicators, vs 1 confusing hybrid.


Consider a tweak to how *nix utilities report currently:

* 100% per core for the factory target speed (what a CPU is sold to run at, non-turbo/boost/whatever)

* turbo / boost displays figures above 100% (per core)

* special fake accounting buckets for not just cpu_idle, but also cpu_powersave, cpu_thermal, etc. Any status caused by yielding or reducing the runable time of that core.

* An additional (does this exist already?) counter of executed instructions, in addition to the run time for each task / thread (a sketch of sampling this with existing tools follows at the end of this comment). On multicore systems this should be recorded for each core the thread can run on. Large NUMA systems might disable, or node-restrict, this for obvious reasons.

This would give a clear picture of both the utilization of hardware as a whole, as well in the program context.
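On the last bullet: per-task instruction counts can already be sampled from the PMU today, e.g. by attaching perf to a running process (the PID and duration below are placeholders):

    # instructions retired by PID 12345 over a 10-second window
    perf stat -e instructions,cycles -p 12345 sleep 10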


Determining how much is in use out of what is practically available is relatively easy. But determining what is 100% is still difficult to decide on, CPUs have too many frequency options to decide what the "real" 100% point is. Even picking the base frequency as the reference would be a pain to properly monitor and predict, and to take decisions based on that.


It's more about determining what throttling means: 100% for the current conditions. In any case, the tools we are used to using have traditionally shown a percentage scale of usage - it seems like a reasonable measure even if it's hard to measure.


I fall in the 50% camp because showing 100% when it's clearly not using 100% of what the CPU is capable of is misleading, particularly in troubleshooting performance issues.


I think either 50% or 100% is misleading without more information. In either case you can be left wondering where your bottleneck might be and your first guess probably is not something like more cooling.


I think showing 100% is far worse, because then I have to rely on my subjective experience of the interface I'm working with to notice that something is wrong. 100% isn't just misleading, it's wrong on a UX level. At least when it shows 50%, I immediately have a sense of, "wait, that's wrong" and I can even begin troubleshooting and look at the current clockspeed, temperature, core usage, etc.


This is precisely why all of the large volume cloud server farms I worked with turn off throttling: they need 100% predictable CPU utilization. I worked on power control strategies at Intel for quite some time, and we would often joke in server (Xeon) parts that it was pointless because all of our work was disabled.

Early throttles were 50% duty cycles, then L1 bubble injections, then V/F frequency scaling. The author only addresses the early mechanisms, but it gets even more complex with the PCUs in the later Xeons.

It is not an easy question to answer, but I think the question can be modified. A single number doesn't solve the problem; you need to know utilization in the context of throttling (and its magnitude). Then decide what you are trying to solve: scheduling or app-level throttling? Personally, the Hz denominator should change if it is used in utilization, since that covers the majority of cases. Any other case should read both metrics.

EDIT: Removed generalization and worded as anecdote.


I have managed several extremely large fleets of computers and I can tell you they all use frequency scaling. I seriously doubt that your statement applies to "most server farms" when properly weighted.


It depends on who pays for power. When I was running a couple thousand machines on managed hosting, we disabled frequency scaling, because it wasn't our power bill and predictability and ease of measurement is nice. But at places where we paid for power, then frequency scaling was enabled (and sometimes they'd throttle things down to manage hot racks etc)


True, I should have more accurately said: all large volume cloud vendors I worked with.


So, you've never used GCE, EC2, or Azure? Because they all offer frequency scaling.


Yes, I use them all the time. Did you see the part where I was talking about my customers?


> half-speed for whatever reason

IMHO the reason actually does matter.

Utilization should be relative to the maximum frequency the CPU could be running the same instructions at. If the CPU is throttling to save power, then it could be running the same instructions at a higher frequency, so utilization should be relative to the higher one. If it's throttling to lower the temperature, then it can't be running the same instructions at a higher frequency, so it's already maxed out at 100%.


> If it's throttling to lower the temperature…

This gives the ability to go over 100% before thermals equalize, since max power would follow a Newton cooling curve.


I don't follow. Whatever frequency the CPU is currently running at is a lower bound on the maximum frequency it could be running at, so it shouldn't exceed 100%?


But the thermals of the system means it can, for short periods of time, run much faster than long periods of time. I don't think a constantly changing 100%, that changes with temperature, is very useful. I would be more interested in whatever the sustainable 100% is, for normal room temperature.


>Utilization should be relative to the maximum frequency the CPU could be running the same instructions at

How do you define "maximum frequency"? For desktops it's relatively easy to define since they usually have good cooling and power delivery that they can always hit the top frequency, but for laptops this often doesn't apply. It might be only able to sustain top frequency for less than a minute before thermal throttling kicks in.


Whatever frequency it could execute the next cycle at, assuming there was no desire to minimize power usage. It's a transient measure, not a steady-state measure.


This will cause the system to vastly under-report % values during periods of low load, because the processor thinks it can run at a higher frequency than it can actually achieve/sustain. This is most notable on laptop processors. If I enable "high performance" mode on Windows, it would cause the processor to run at its max possible frequency, around 4.1 GHz (which is near the theoretical max turbo frequency), when there's little to no load. However, when there's actual load (e.g. 1 core fully loaded), the frequency would drop to around 3.5 GHz. It would get worse as more cores get loaded, dropping all the way down to 2.3 GHz.


What problem would this cause to me as a user, though? It sounds fine to me.


At 5% load, you might think there's 95% capacity to spare, but really it's actually something closer to 65%.


3.5 GHz instead of 4.1 GHz is a 15% difference, not 30%. I never expect multicore performance to be the same as single-core so if that's what you're assuming, it's an unreal assumption. I expect lower clock speeds for multicore regardless of what Task Manager shows.

But okay, so there's some discrepancy, whether 15% or 30%. Okay, so I might underestimate it, and... then what? What would actually go wrong?

I think what you're not realizing is that underestimating capacity is a very different situation from overestimating it (especially when we're talking about underestimating like 0.4 GHz vs. 4 GHz, as opposed to overestimating 2-3 GHz as 4 GHz). When I'm looking at CPU usages it's almost always to figure out who's overutilizing the CPU, not underutilizing it. If I underestimate utilization, then at worst, what happens is I launch a program that needs all cores for max throughput (say a video encoder?), and then get disappointed at its throughput being too small. This is so infrequent (if it happens at all for the average user) that the additional mental effort required to factor in the throttling is quite negligible, and the negative consequences are quite mild. Compare that to the case where I overestimate utilization: suddenly I semi-panic and try to kill the program using the most CPU, and boom, suddenly I'm at risk of losing a bunch of data. The difference is quite asymmetric.


It sounds like you don't ever engage in capacity planning. You should read about it, real usage numbers are very important.


It sounds like you don't ever use a computer as a consumer. Go ask your non-technical friends how often they're "capacity planning" when they pull up Task Manager.

And I'm just talking about what 99% of ordinary users expect most of the time. Nothing prevents you from measuring whatever metrics you want either. Obviously "I think this is what most people expect utilization to mean" obviously does not imply "I think you should be banned from anything else you might find useful"...


You've never heard someone say "my CPU is running hot all the time, I need to upgrade"?


The manufacturer doesn't know this, nor is it easy to determine. Is the max turbo frequency on air the max? On water (if so, with what size rad and which fans)? At what ambient temperature? Is the max including Tau, and if so, does that max change when it's about to expire? What about if the CPU were on LN2 but the user has run out and can't add more to the cooling pot?

Every one of those numbers is different, and frequently the only way to determine if a frequency is possible to use for stable operations is to try it and see.


No, you're just making this hard. Just imagine the CPU decided it wouldn't deliberately throttle its frequency to save power. Obviously nothing stands in its way of doing that; that's what it would do if they didn't deliberately tell it otherwise. Now what frequency would it execute the next cycle at? Obviously it decides on some frequency, and it does that without you telling it whether it's air or water cooled. Whatever number it would decide on: use that number.


Sometimes reality is hard :) One source of CPU throttling which you appear to be ignoring is temperature. Another is platform power delivery. Neither is always knowable by the CPU in advance. The CPU obviously doesn't "decide on a number" by itself. Sometimes it asks for a certain voltage but because of Tau it gets a lower one. Sometimes it starts to boost because of demand only to immediately thermal throttle and run slower.

You cannot deterministically know what max frequency the CPU could be running at in some future demand/power/temp state.


that doesn't solve the "relative measures make the graph useless" problem though, does it?


Hm I felt it does, but I might've missed some situation. Why do you feel it doesn't? Could you describe a scenario where it'd be misleading?


Another approach may be to take inspiration from the `cpu load` metric on *nix systems and go _above_ 100%. In this example, the CPU usage would be `200%`: The system would like to be doing twice as much as it's currently doing, but something's throttling it.

Of course, this opens up other issues with how to aggregate multiple cores, what the benchmark for 'max' should be, etc.

Perhaps the more fundamental answer is that there's no single metric that can sum up the situation for all use cases, in which case displaying '100%' would be more useful for a typical consumer while exposing multiple detailed metrics would be more useful for system admins and power users.


Not following. Linux reports each core as 100%. So an 8 core machine maxes at 800%.

So seeing 200 load doesn't indicate to me that it's throttled, but that it's using the equivalent of two full cores. Or did I misunderstand?


The context was load average, not CPU usage percentage. A load of 1 means (more or less, I'm simplifying) "all the time there is one task ready to be run, so no other task is starved". In most systems/situations you'd see load < 1. But there are specific cases where it would be silly high. For example, I've seen > 400 on a VoIP conference server.

It's more of a "pressure" / "need" measurement. And yeah, applying that to per-process CPU measurement would be interesting.


They're saying it can go over 1(00%) per core.


You might be thinking of the load average, not the cpu consumption of a particular process as reported, by, say, top or ps. The article is about the latter kind of tool.


As far as I know, "CPU usage" of a process has always been "percentage of time spent in it" so I think anything else is overthinking/overcomplicating things. It makes perfect sense for CPU usage to go up if the frequency starts falling due to throttling. That's why a separate indication of current CPU frequency is necessary.


I find this intuitive for thermal throttling, but not for power throttling, though I'm not sure changing the definition makes sense now (it might not even be practical). I've frequently fired up some process manager and seen it at > 50% CPU despite the system not running anything interesting, and it always takes me a second to realize that it's throttled down to like 400 MHz and the process manager itself is consuming what's left of the CPU.


If you say “there are two perspectives on reporting this fact”, what you really mean to say is “there are two things to report here”


Exactly, why not report both? I for one would want to know both.


"CPU usage" as in "how many time-slices did this process eat?" is pretty easy to understand and generally points at the right things ("What's eating all that CPU?", "Is this application using the correct number of threads?" etc.)

Trying to express "How much of the hypothetically available computational resources of the CPU did this application consume?" in a single number would seem like a futile exercise at best. VTune used to have something like this which IIRC was based on using all cores in parallel sections and IPC or something like that. It wasn't very meaningful, and is impacted by all sorts of factors.


Related, 50% CPU usage on a hyper-threaded CPU isn't 50%. It is usually closer to 80-90% depending on the workload. Something to watch out for when monitoring.


Yep, and Windows never goes above 100%, iirc. My linux box can hit 800% if I can keep all the cores fed. More confusion.


That seems like something entirely different. Windows is displaying cpu usage relative to all the cores (total load divided by total number of cores), whereas linux is displaying usage relative to one core (total load divided by one core).


Interesting. Does this mean that if you are not going to use all HT threads it's better to turn off HT?


SMT is usually a throughput win, and usually a latency loss. I was playing with my Threadripper a while back, and for C++ builds of large projects, HT results in about a 10% improvement in compilation speed. 10% is a big deal and you should take it. The downside is that games had noticeably lower framerates even with the rest of the CPU idle (at least the games I play are bounded by single-thread performance across maybe 2-4 cores). I kind of blame Windows's scheduler there, since it should be able to say "hey, a game is running, don't schedule anything on the same physical core that has a game thread running on it", but I don't think it does. It might schedule both game threads on the same physical core and then they contend with each other and run 40% slower each. Also be careful about memory -- 64GB wasn't enough for 64 concurrent clang runs. You need a little bit more (but of course 128GB is the next installable increment). (I can also see more threads aggravating other resource constraints; notably disk IOPS, but I didn't notice a problem there myself. It's also possible that SMT increases power use and so decreases turbo speeds, and that might have an impact. I didn't measure that when I was testing.)

For me, I keep SMT turned off. The latency is more important than the throughput for my workstation, but if you do full builds of C++ regularly, you might want it on. Use the 10% time you're getting back to switch to a build system that can cache things, though.


> 64GB wasn't enough for 64 concurrent clang runs. You need a little bit more (but of course 128GB is the next installable increment).

That's not really true, you can mix and match memory, and you might not get ideal bandwidth, but for a lot of uses it's just fine.


Notably, triple-rank configurations work just fine. It's just not trivial to find both a dual-rank module and a single-rank module with the same components used on both (so all the sub-timings match and performance will be nice and proper).


FWIW, you can set core affinity for your game process.
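On Linux the equivalent is taskset (on Windows it's Task Manager's "Set affinity" or the start /affinity flag); which logical CPUs are SMT siblings varies by machine, so the core list below is purely illustrative:

    # check the topology first, then pin the game to four physical cores
    lscpu -e
    taskset -c 0-3 ./game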


Depends on the machine. The earliest implementations of HT worked by statically partitioning various caches and other resources in the processor core in half, which meant that a single-threaded process really could slow down by having HT enabled but not actively used. Newer desktop-class processors tend to have no significant downsides to leaving HT enabled, but there might still be some SMT implementations on niche products that don't handle this well.


There still are (and probably always will be) some workloads where using HT makes the whole task take longer, but unless you only run those kinds of loads, optimizing is simply a matter of, e.g., loading up to the core count instead of the thread count when you run those loads on modern CPUs.


> The earliest implementations of HT worked by statically partitioning

Do you mean SMT in general? I don't think hyperthreading specifically has ever done that, but if I'm wrong I'd love to know more. (And AMD's version falls under "newer desktop-class processors")


I meant Intel HT specifically, but I'm going off memory here, and having trouble finding details on those old parts. Agner Fog's current microarchitecture manual doesn't mention HT in its discussion of the P4, but it does include at least one mention of static partitioning of the decoded op queue in the Atom core.

It also describes several instances where Intel's desktop cores used to devote specific resources to each thread on alternating clock cycles, but newer cores have progressively removed those limitations. However, these probably don't quite fit my original assertion because if the OS has literally HALTed one of the virtual CPUs, these alternating clock cycle limitations may have been temporarily removed.


My first experience with HT was on my dual P4 xeons. Performance with HT on was noticeably terrible. It felt like a dog and pony show. It was best to keep HT disabled then. I'm not sure when that changed, but I don't remember what I did on my subsequent Core 2 duo system, but I do have HT enabled on my current (8 or 9 year old) i7 3700 and don't notice any slowdowns. Last I looked, I had to look at very specific benchmarks to find measurable differences. Qualitatively, I don't feel a slow down, either, so I keep it enabled.


No, don't turn it off. One thread on a core will go full speed, and two threads on a core will do more work than one thread. It's just that your utilization graph will be misleading. A naive graph will assume that two threads do twice as much work as one, but the real improvement is much smaller.

If you turn off hyperthreading and keep the same exact workload, then instead of "the graph says 50% but it's really more like 80-90%", you'll have "the graph says 100% and it's correct". The numbers now accurately represent your lower capacity.


Linux VMs have the concept of "steal". It represents CPU cycles that were supposed to be available, but were taken away by the hypervisor for various reasons. Steal appears in CPU usage stats alongside other types of wasted cycles such as interrupt handling and I/O wait.

Perhaps that's something Microsoft can borrow and improve upon.
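Steal already shows up in the standard tools, e.g.:

    # %steal column, per CPU, refreshed every second (sysstat's mpstat)
    mpstat -P ALL 1
    # or raw: the 8th numeric field on each "cpu" line of /proc/stat is steal time
    grep '^cpu' /proc/stat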


I think Apple's approach of attributing the capacity not available due to throttling (or various other reasons) to kernel_task might be the best. You can tell something is eating cpu, stuff isn't just stuck at 50% while the rest of the system looks idle.

Although then you have users trying to figure out how to kill kernel_task, which isn't great either...


That’s actually an awesome idea, just create a virtual "cpu_throttling" task and attribute the throttling to this task.


Users trying to kill kernel_task sounds like an uncaught design problem. Replace the kill options in context drop-downs with "Why can't I kill kernel_task?" and have it pop open a short explanation, do the same as a feedback message if anyone tries to kill in the terminal.

If the system doesn't do that already, either nobody considered that users would try that particular nonsensical thing, or nobody cared enough to let users find out immediately instead of wasting their time thinking the computer was at fault.


Yeah until I learned how to see the throttling with “pmset -g thermlog” I didn’t realize I was throttling so much.

I feel that should be made clear on the various CPU usage utilities.
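Besides the thermlog, macOS also has a one-shot view; if I remember the output right, a CPU_Speed_Limit below 100 means the machine is currently being slowed down:

    pmset -g therm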


That's actually a great approach, just rename the process to "thermal_throttle" and it all becomes clear.


I don't like to speak about "throttling" because all modern CPUs are in a closed-loop control system where the capacity of one core-second varies. There is no question about whether your CPU is throttled. It is, always. That leads to all the uncertainty about the denominator. We know how many cycles passed while a certain thread had the CPU, but we don't have very good ways to estimate the number of cycles that were available. If you take the analysis one layer deeper, does a program that waits on main memory while chasing pointers randomly use 100% of the CPU, or does it waste 99% of it, since it's not using most of the execution resources? Such a program could be said to be using 100% CPU time, but it won't respond to higher CPU clock speeds. When waiting for loads it makes no difference if time passes at 4GHz or 400MHz.

So anyway, it is complicated.


Reporting the clockspeed along with the used percentage makes the most sense.

“I’m using 100% of my cpu but I’m only at 1.2GHz, something is wrong”


It's the testing problem: trying to squish too many numbers into one report.

The right reporting is to report both the numerator and the denominator. I'm using 800 MIPS out of 1200 MIPS available.


If my processor is marketed as 1200 MIPS but can run at 2000 MIPS for up to 0.5 seconds out of every 15 second window (due to thermal constraints), but only up to 1700 MIPS when executing SIMD instructions for 0.5 seconds and 1000 MIPS thereafter, and only up to 900 MIPS when it gets too hot because my laptop is sitting on a blanket instead of a hard surface, what is the correct denominator?


That's the point: it changes.

One representation is a line graph with two lines -- one is showing the maximum MIPS (right now), and one is showing the actively-used MIPS.

Another is a pie chart, where the size of the chart changes based on current capacity, and it is filled to show how much of that capacity you're using.

And so on.

You have two dimensions. You don't want to squash that into a 1D representation.


Perhaps, the system should report two numbers: the % usage of the CPU overall resources and the % usage of the throttled resources.

There is no constraint that one has to use only one number. Perhaps, in 2021, CPUs are complex enough to deserve more than one number to give a useful representation of utilization, as it works with all sorts of factories/plants. Also, who would love to see CPU usage per core in the reported metrics?


Thank you. I'd hazard a guess that an additional step toward information demarcation would prove beneficial: bifurcate the task manager into an "everyday" mode by default that displays the current state, where that state is throttling, user-lowered, or heightened TDP via TDP-up plus ample cooling for sustained loads: indicate the current state next to a colored icon set and the standard percentages.

Then implement an advanced mode that contains a deeply similar but mildly extended UI. Imagine Office with one ribbon tab, cleaned up, vs. three filled to the brim with choices and tools. In principle, the idea is to keep the diff small but still noticeable, something like that.


What if my program has a large working set and the CPU spends 50% of its cycles waiting for memory fetches to complete? What is its CPU usage?


It only makes sense to report e.g. 50% in case of throttling if you also report e.g. 130% in case of boosting (on a single core).

Which could be useful.

Now throw HT into the mix and lose your mind.

I'm not sure there is a really better solution. Just document the one you choose, please!


I don't think this makes much sense. The maximum throughput of a CPU in terms of instructions depends on so many factors (thermal, memory, instruction mix) that trying to summarize all of them into a linearly-scaling "utilization" metric is a bit tricky. You can measure the fraction of time CPU cores are busy, the number of instructions executed, frequencies, etc. to get an idea, but only experimentation will tell you what "100%" of a system's capacity is.


Maybe there should be a separate indication of throttling so we don’t conflate with CPU usage.

Throttling is essentially determined by three factors: Energy usage, thermal saturation, and cooling. A bath tub analogy comes to mind: Energy usage represented by flow from a faucet, the heat sink is the water level in the tub, and the drain represents the cooling rate. Only energy usage could be plotted instantaneously while the others may have to be modeled and could change based on environmental factors.


Throttling should show up as a pseudo-process, "consuming" peak performance.


This is how MacOS presents thermal throttling in the Activity Monitor - there's a visible "kernel_task" process taking up CPU.

The downside is of course, that users see this process name chewing up 500%+ of CPU, google "kernel_task cpu", and find the commands to disable the thermal throttling to "fix" the issue!

https://eclecticlight.co/2019/02/25/playing-with-fire-dealin...


Here's another problem: what if a program is i/o-bottlenecked, and taking up 50% of the CPU's cycles. Because the CPU utilization is not 100%, it clocks down to 50% of its maximum clock rate, so now the program is taking up 100% of the CPU's cycles. This isn't throttling, it's just regular power management.

Clearly that's a different kind of situation; how do you distinguish the two?


I think there are two different use cases:

- measurement at thread level (top and the like) should report instructions retired per unit of time, not %CPU. We want to know how much work is being done. The explanation on why so much/so little requires more information (scheduler? cache misses? cpu throttling?)

- measurement at the CPU level (mpstat, etc) should report both %CPU (as "time active" vs "wall clock time") and active clock cycles per unit of time (dividing by clock stretching factor if used, and perhaps scaled to absolute max frequency and/or %CPU if one wants a percentage).

%CPU tells us whether we are making full use of the available time, and perhaps suggests to schedule threads differently if appropriate

Clock cycles tell us how much one CPU is affected by throttling (manual, automatic, due to C-state exits, etc), and this is an orthogonal indication on whether there is something at the system level that is making the CPU underutilized


> Another theory is that this should report 50% CPU usage, because even though that CPU-intensive program is causing the CPU to consume all of its available cycles, it is not consuming all of the cycles that are potentially available.

If they would be available, wouldn't the CPU scaling kick in and ramp up the frequency as required? I think in this case it should really show up as 50%...

Otherwise it's somewhat similar to battery charge status... apparently nobody wants to know what percentage of the initial full capacity is left, just how much is left of what is potentially available... so in a low-power state where throttling is done permanently at some frequency, I would just like to know how much capacity is actually left...


The CPU may not be able to scale up the frequency to "100%" at that time due to factors that cannot be overridden, like a high-temp situation.

The problem is deciding what 100% is when CPUs have gotten so complex around frequency. What's the absolute reference: the base frequency, the single-core turbo, the multi-core turbo, overclock, AVX?

For batteries you usually get 2 estimates, a capacity percentage of the total practically possible, and one for how much usage you can get out of that based on recent or estimated usage.


I remember having debates in 1990, when SMP UNIX was still a new thing, about whether "load average" should be scaled by the number of processors. Things have only gotten messier since then. As usual, Brendan Gregg has a good take.

https://twitter.com/brendangregg/status/1411654427304333313

Personally, what I'd want to see is the proportion of max achievable IPC, and if that means getting used to numbers well below 100% (even when I'm doing everything right) then so be it. I can adjust my expectations and targets.


Maybe ditch relative (50%) for absolute (Hz)?

That's what we do for other things that have no fixed upper bound, e.g. disk/network i/o. We even do this for memory now given dynamic virtual memory, etc.


> “dynamic frequency scaling”, a feature that allows software to instruct the CPU to run at a lower speed, commonly known as “CPU throttling”

AFAIK that's not how it works in modern CPUs. Dynamic frequency scaling is a black box implemented in hardware. Software, and even OS kernels, have very little input over that. They can't subscribe for status updates. Even just getting the current frequency seems impossible, only indirectly by comparing RDTSC output (absolute time unscaled) with performance counters.
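What you can do, on Linux at least, is observe the frequency the hardware ended up running at (the kernel derives it from counters it samples, e.g. APERF/MPERF on x86), even if you can't subscribe to changes:

    # rough per-core view of the frequency actually being run at
    grep "cpu MHz" /proc/cpuinfo
    # or, with linux-tools installed (usually needs root), busy-frequency per core:
    turbostat --quiet --interval 1 --show CPU,Busy%,Bzy_MHz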


I have talked about TDP computing a lot. We are basically limited by the cooling the product was designed with.

How about another measure, % of CPU TDP? Or some form of TDP measurement. If it is at 100% of CPU TDP, I know it is pushing as hard as it can.

(But then when your cooling ages you will be running at 100% TDP but at a lower clock speed without realising it.)

Thinking about it, this simple subject is really complicated.


I'd add an "Energy saving / throttling" process and account for it in another color on the graph (green vs. blue), like in a "world energy use over" Google search. This way you'd have a 50% green-busy CPU, and the other 50% available. Unthrottling would reduce the green to lesser values, and 100% would always remain constant in absolute value.


I’d love to see how many watts my process is using. After all, energy is the resource each thread is really consuming.


On Linux there is powertop for that:

https://01.org/powertop/


Mac and Windows have something similar, though not as specific. They all appear to be rough estimates based on the CPU time and the frequency at the time the process ran.


I don't think the article's solution is a good one. The CPU metric is, on its own, really hard to understand and complex. I'd rather have the OS understand and report when the CPU is throttled and do the reporting accordingly, so the metrics are easier to interpret.


Shouldn't there be another metric that indicates what capacity the CPU is running at? It would be an important metric to monitor to detect issues like thermal throttling, or to see how often the CPU is utilizing boost features.


What? An African or European CPU?


If you show 50% usage then there are going to be plenty of customer service calls where customers complain about poor performance of the PC and then laptops will be compared based on this number by those that are not tech savvy …


Why not show the absolute % CPU usage and the % relative to the throttled capacity? No need to choose. For that matter, if throttled, also indicate the reason: heat, performance (any other reasons there might be?).


The other consideration is that you want to encourage application developers to use those lower power states, and not make it look like their program is artificially abusing available resources.


I think this article asks the wrong question. The correct question IMO is why is the frequency throttled instead of maxed out when a program is saturating the CPU?


If it is thermal throttling, then it means the CPU is maxed out for the given conditions.


I agree with that, but it seems like the use case specified is ordinary power conservation mode, not critical, thermally-induced power reduction mode.

And also, why are systems shipped with inadequate thermal control?


On OSX, I recall there's a fake "throttled" daemon that reports the CPU usage lost due to thermal throttling.

Name is probably wrong but it definitely exists.


ya kernel_task


All that matters to me is what % CPU is used compared to what is the maximum available on demand to a user at a given point in time.


If you want that approach, just add a line that says "thermal throttling: 50%" at the top. Then it still adds to 100% usage.


I disagree. Seeing 100% CPU but a low speed in Windows Task Manager is a strong, easy signal that you need to look at thermal issues. Taking away that information makes troubleshooting harder. It is also 'truth' in that moment, and adjusting it to some other value based on what the CPU could do (but isn't doing right now) only serves to obfuscate, imo.


It makes me sad that people at Microsoft don't seem to learn from history, because we have a precedent that makes perfect sense: report the percentage (or fraction of 1) of work done relative to what would be 100% at that moment.


Reporting 50% utilization while the CPU is heavily throttling sure makes Microsoft's strategic partner Intel happy; a total coincidence, I imagine.


African or European?


It should be a bar chart with the entire bar being the potential, and a red area showing the throttling as a percentage. Throttling + actual execution total 100%.


% is just not the right unit.


Whichever one is easier to implement. Worse is better.


I agree with the author. His thesis:

> While I sympathize with this point of view, I feel that reporting the CPU usage at 50% is a more accurate representation of the situation.


It’s a matter of what question you’re asking - how much CPU is X using and how much headroom does it have vs why is my computer slow.


Maldebrot set? Just look the word up my dude.


I assume it was just a typo.


As long as we’re being pedantic, Mandelbrot is a name.


Names are words.


No, names can consist of more than one word. "New York City" for example is three words making up one name.


New York City is a proper noun and can for grammatical purposes be considered one word. It's not as clear cut as programmers would like to believe.

https://www.quora.com/Are-names-considered-words?share=1


"New York City" is a proper name but not a proper noun. You may want to look up the difference if you are convinced otherwise.


"Current linguistics makes a distinction between proper nouns and proper names but this distinction is not universally observed and sometimes it is observed but not rigorously."

Unbelievable how dense people play sometimes. It was clear what I meant when I said word, and in fact, I did refer to a single word if you want to play that game: Mandelbrot.

"Maldebrot set" or whatever it said before wouldn't even give you the correct search results. It serves a purpose to correct this, whereas this is just grammatical bickering for no apparent reason other than "he started it". I pointed out an important mistake, you're just trying to find substance in an argument that holds as much water as a sieve.



