
If you have a pod with Burstable QoS, perhaps because it has a request but not a limit, its CPU mask will contain every CPU on the box, minus one for the Kubelet and other node services, minus all the CPUs requested by pods with Guaranteed QoS. Pods with Guaranteed QoS get exactly the number of CPUs they asked for, no more and no less, so their GOMAXPROCS is consistent. Everyone else will see more or fewer CPUs as Guaranteed pods arrive at and depart from the node.
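
For concreteness, roughly the two kinds of pod specs I mean (names here are just placeholders; Guaranteed QoS requires requests equal to limits for both CPU and memory in every container):

  # Burstable: CPU request, no matching limit
  cat <<EOF | kubectl apply -f -
  apiVersion: v1
  kind: Pod
  metadata: {name: burstable-demo}
  spec:
    containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      resources:
        requests: {cpu: "1"}
  EOF

  # Guaranteed: requests == limits for both CPU and memory
  cat <<EOF | kubectl apply -f -
  apiVersion: v1
  kind: Pod
  metadata: {name: guaranteed-demo}
  spec:
    containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
      resources:
        requests: {cpu: "1", memory: "256Mi"}
        limits: {cpu: "1", memory: "256Mi"}
  EOF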



If by "CPU mask" you refer to the `sched_getaffinity` syscall, I can't reproduce this behavior.

What I tried: I created a "Burstable" Pod and ran `nproc` [0] in it. It returned N CPUs (N > 1).

Then I created a "Guaranteed QoS" Pod with both requests and limits set to 1 CPU. `nproc` returned N CPUs there as well.

I went back to the "Burstable" Pod; `nproc` still returned N.

I then created a fresh "Burstable" Pod and ran `nproc` on it: N again. Please note that the "Guaranteed QoS" Pod was still running at this point.
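
For reference, the checks looked roughly like this (pod names are just the ones I used; any image that ships `nproc` works):

  # Burstable pod: all of the node's CPUs are visible
  kubectl exec burstable-demo -- nproc

  # Guaranteed pod (requests == limits == 1 CPU): still prints N, not 1
  kubectl exec guaranteed-demo -- nproc

  # back in the Burstable pod, with the Guaranteed pod still running
  kubectl exec burstable-demo -- nproc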

> Pods with Guaranteed QoS will have exactly the number of CPUs they asked for, no more or less

Well, in my case I asked for 1 CPU and got more, i.e. N CPUs.

Also, please note that Pods might ask for fractional CPUs.

[0]: The coreutils `nproc` program uses the `sched_getaffinity` syscall under the hood, at least on my system; I've just checked with `strace` to be sure.
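
For anyone who wants to check this on their own box:

  # trace only the affinity syscall while nproc runs; a
  # sched_getaffinity(...) line in the output confirms the claim
  strace -e trace=sched_getaffinity nproc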


I don't know what `nproc` does. Consider `taskset`.


I re-did the experiment with `taskset` and got the same results, i.e. the mask is unaffected by the creation of the "Guaranteed QoS" Pod.

FWIW, `taskset` uses the same syscall as `nproc` (according to `strace`).
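
Same check for `taskset` when it reads a PID's mask:

  # reading an existing PID's affinity also goes through sched_getaffinity
  strace -e trace=sched_getaffinity taskset -p 1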


Perhaps it is an artifact of our different container runtimes. For me, in a Guaranteed QoS pod with limit=request=1, `taskset` shows just 1 visible CPU:

  # taskset -c -p 1
  pid 1's current affinity list: 1

  # nproc
  1

I honestly do not see how it can work otherwise.


After reading https://kubernetes.io/docs/tasks/administer-cluster/cpu-mana..., I think we have different policies set for the CPU Manager.

In my case it's `"cpuManagerPolicy": "none"`, and I suppose you're using the `"static"` policy.
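
For reference, one way to check which policy a node's kubelet is using (the file path is the common default and may differ on your setup; <node-name> is a placeholder, and the second command assumes the kubelet's configz debug endpoint is enabled). As I read that page, the `static` policy only gives exclusive CPUs to Guaranteed containers that request whole (integer) CPUs, which ties back to the fractional-CPU point above.

  # on the node itself (path depends on how the kubelet was configured)
  grep cpuManagerPolicy /var/lib/kubelet/config.yaml

  # or through the API server's node proxy, without SSH-ing in
  kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz" | grep -o '"cpuManagerPolicy":[^,}]*'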

Well, TIL. Thanks!


TIL also. The difference between Guaranteed and Burstable seems meaningless without this setting.



