
Incorrect os.cpus().length inside a Docker container with limited CPUs #28762

Closed
NicolasSchwarzer opened this issue Jul 19, 2019 · 10 comments
Labels
libuv Issues and PRs related to the libuv dependency or the uv binding. os Issues and PRs related to the os subsystem. wontfix Issues that will not be fixed.

Comments

@NicolasSchwarzer

  • Version: 10.16.0
  • Host machine: MacOS 10.14.5
  • Docker version: 18.09.2
  • Docker image: node:10.16.0
  • Subsystem: os

Bug

I found that inside a Docker container with a CPU limit, os.cpus().length still reports the host machine's CPU count instead of the limited amount.

How to reproduce

$ sysctl -n hw.physicalcpu
2
$ docker container run --rm -it --memory="3072m" --cpus="1" node:10.16.0 bash
root@38997a92e59f:/# node
> require('os').cpus().length
2
> .exit
root@38997a92e59f:/# exit
exit

As you can see, my host machine has 2 CPUs. I start a container based on the node:10.16.0 image and limit it to 1 CPU. But inside the container, os.cpus().length still returns 2, whereas I expected it to return 1.

I've googled a lot and found that there's no hard boundary between the inside and outside of a Docker container, so the os module just reads CPU information from the host machine. But Java has solved this problem, as you can see here.

What the bug will cause

If we run multiple Docker containers on a host machine and limit the resources of each container, we cannot constrain a Node application inside a container unless it explicitly specifies a maximum number of workers, not to mention the packages nested deep in its dependency tree. The application will fork as many workers as the host machine has CPUs, which exceeds the container's CPU limit and makes the application run very slowly.

Most Node packages use os.cpus().length to decide how many child processes to fork, so I think this bug deserves a look. A typical pattern is sketched below.
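
For illustration, a minimal sketch of that common pattern (not taken from any particular package): the primary process forks one worker per entry in os.cpus(), which over-provisions inside a CPU-limited container because os.cpus() reflects the host.

```js
// common-fork-pattern.js — illustrative sketch only.
// Forks one worker per CPU reported by os.cpus(); inside a container
// limited to 1 CPU this still forks one worker per *host* CPU.
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
  const workers = os.cpus().length; // host CPU count, not the cgroup limit
  console.log(`forking ${workers} workers`);
  for (let i = 0; i < workers; i++) cluster.fork();
} else {
  // worker: the actual work would happen here
  console.log(`worker ${process.pid} started`);
  process.exit(0);
}
```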

@bnoordhuis bnoordhuis added libuv Issues and PRs related to the libuv dependency or the uv binding. os Issues and PRs related to the os subsystem. labels Jul 19, 2019
@bnoordhuis
Member

That OpenJDK bug report is about Docker on Linux. Node.js / libuv reads /proc/cpuinfo on that platform and Docker intercepts that by bind-mounting /proc, I believe.

On macOS, libuv calls host_processor_info(). If there is a container-aware approach I'm all ears but I don't know of any.

@NicolasSchwarzer
Author

Yes, and regardless of the host machine's operating system, whether Linux or macOS, once we start a Docker container from the node:10.16.0 image, the Node application inside the container actually runs on Linux.

$ docker container run --rm -it --memory="3072m" --cpus="1" node:10.16.0 bash
root@aa04277a4ced:/# uname -a
Linux aa04277a4ced 4.9.125-linuxkit #1 SMP Fri Sep 7 08:20:28 UTC 2018 x86_64 GNU/Linux
root@aa04277a4ced:/# ls -l /proc/cpuinfo
-r--r--r-- 1 root root 0 Jul 19 16:40 /proc/cpuinfo
root@aa04277a4ced:/# exit
exit

And as you can see, there's the /proc/cpuinfo file.

I'm not sure whether this is a Docker bug or something that should be fixed in Node. But it seems Java has already fixed this: it calculates the actual number of available processors in a Docker container environment. So I hope this can be fixed in Node too; or should I raise an issue with the Docker team?

@bnoordhuis
Member

You're right, I forgot that Docker-on-macOS starts up a full Linux VM. Can you paste the contents of that /proc/cpuinfo file?

@NicolasSchwarzer
Author

Of course, please see below:

$ docker container run --rm -it --memory="3072m" --cpus="1" node:10.16.0 bash
root@30bed28705ed:/# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 142
model name	: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
stepping	: 9
cpu MHz		: 2300.000
cache size	: 4096 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch kaiser fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 4608.00
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 142
model name	: Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz
stepping	: 9
cpu MHz		: 2300.000
cache size	: 4096 KB
physical id	: 1
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 22
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht pbe syscall nx pdpe1gb lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq dtes64 ds_cpl ssse3 sdbg fma cx16 xtpr pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch kaiser fsgsbase bmi1 hle avx2 bmi2 erms rtm xsaveopt arat
bugs		: cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf
bogomips	: 4597.46
clflush size	: 64
cache_alignment	: 64
address sizes	: 39 bits physical, 48 bits virtual
power management:

root@30bed28705ed:/# exit
exit

@bnoordhuis
Member

Right, so there are indeed two online CPUs.

Docker implements --cpus=<x> by manipulating the cpu/cpuacct cgroups, i.e., by tweaking what share of the available CPUs or CPU time the process gets.

os.cpus().length is the wrong thing to use in that kind of environment because it reports the number of online CPUs. That's not what you're interested in.

What OpenJDK does is read the relevant settings from /sys/fs/cgroup. Exposing that as an API in Node.js is problematic because:

  1. Too complex and platform-specific. The CPU controller has a lot of knobs: there's not just CPU time but also CPU sets and probably NUMA topology to consider. Other platforms have similar mechanisms that are just different enough to be awkward (e.g. FreeBSD jail resource limits.)

  2. There's no need to make it a built-in because programs can read the relevant information from /sys/fs/cgroup themselves (see the sketch after this list). It could - and probably should - start out as an npm module because that lets you iterate on the API without worrying about backwards compatibility.
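
As a concrete illustration of point 2, here is a minimal sketch of that kind of userland helper. It assumes the default cgroup mount points: the cgroup v2 cpu.max file, or the cgroup v1 cpu.cfs_quota_us / cpu.cfs_period_us files that docker --cpus=<x> sets; cpusets, NUMA topology, and the other knobs mentioned above are deliberately ignored.

```js
// available-cpus.js — a minimal sketch, not a complete solution.
// Derives an effective CPU count from the CFS quota that `docker --cpus=<x>`
// configures, falling back to os.cpus().length when no limit is found.
'use strict';
const fs = require('fs');
const os = require('os');

function readInt(path) {
  try {
    return parseInt(fs.readFileSync(path, 'utf8').trim(), 10);
  } catch (e) {
    return NaN;
  }
}

function availableCpus() {
  // cgroup v2: cpu.max contains "<quota> <period>" or "max <period>".
  try {
    const [quota, period] =
      fs.readFileSync('/sys/fs/cgroup/cpu.max', 'utf8').trim().split(/\s+/);
    if (quota !== 'max') {
      // Round fractional limits (e.g. --cpus=1.5) up to a whole worker count.
      return Math.max(1, Math.ceil(Number(quota) / Number(period)));
    }
  } catch (e) { /* not cgroup v2, fall through */ }

  // cgroup v1: a quota of -1 means "no limit".
  const quota = readInt('/sys/fs/cgroup/cpu/cpu.cfs_quota_us');
  const period = readInt('/sys/fs/cgroup/cpu/cpu.cfs_period_us');
  if (quota > 0 && period > 0) {
    return Math.max(1, Math.ceil(quota / period));
  }

  return os.cpus().length; // no detectable limit
}

module.exports = availableCpus;
```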

I'm going to close this out as a wontfix because the output of os.cpus() is correct. If you want to pursue a new API, please open a feature request with a sketch of what you think it should look like.

Keep in mind though that even if your feature request is accepted, it'll still be a good idea to have it live as an npm module for a while, to shake out any issues.

@bnoordhuis bnoordhuis added the wontfix Issues that will not be fixed. label Jul 22, 2019
@NicolasSchwarzer
Author

Okay, I get it, thanks for your answer. But a new API or an npm module may not solve my problem; here's my scenario:

I start a new Docker container to run each project's build task, for the sake of isolation and security. The build script is written by each project's developers, so it is out of my control and potentially risky. If a project uses webpack to bundle its assets with terser-webpack-plugin and sets parallel to true instead of a number, the minification job starts os.cpus().length - 1 processes, and restricting the container's CPUs does not reduce that number.

I will raise a feature request later, because I think a new API is indeed needed, especially now that most applications and services are deployed on Kubernetes. The ability to retrieve the actual number of available processors is very helpful. But I still hope this can be fixed in os.cpus().length.
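
For reference, when the webpack config is under your control, terser-webpack-plugin's parallel option also accepts a number, so one workaround is to pass an explicit worker count instead of true. A minimal sketch, assuming the hypothetical availableCpus() helper from the earlier comment:

```js
// webpack.config.js (excerpt) — illustrative sketch only.
// Passes an explicit worker count to terser-webpack-plugin instead of
// relying on its default of os.cpus().length - 1.
const TerserPlugin = require('terser-webpack-plugin');
const availableCpus = require('./available-cpus'); // hypothetical helper sketched above

module.exports = {
  optimization: {
    minimizer: [
      new TerserPlugin({
        // a number caps the worker pool; `true` would use os.cpus().length - 1
        parallel: Math.max(1, availableCpus() - 1),
      }),
    ],
  },
};
```

This only helps for builds whose configuration you control, which is exactly the limitation described above for untrusted build scripts.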

@harish2704

@NicolasSchwarzer: I think this problem needs to be fixed in the Docker project.

But even if Docker doesn't provide a fix, we can solve it by using lxcfs. See https:/lxc/lxcfs#using-with-docker

@jdmarshall

It looks like if you use --cpuset-cpus then nproc is correct, but that leaves you in a bind when you have hosts with varying numbers of CPUs.

@xiaoxiaojx
Contributor

I solved it using this solution https:/xiaoxiaojx/get_cpus_length

@enghitalo

> I solved it using this solution https:/xiaoxiaojx/get_cpus_length

Still not working for me

$ sudo docker run --rm --name node_container_name --interactive --tty --cpuset-cpus="0" cores_test_node npm start

> [email protected] start
> tsc && node dist/app.js

os.cpus().length:  8
CpusLength:  8
