Aleksa Sarai | 917d8e2 | 2015-06-13 03:21:58 +1000

Process Number Controller
=========================

Abstract
--------

The process number controller allows a cgroup hierarchy to stop any new tasks
from being fork()'d or clone()'d once a configured limit is reached.

Since it is trivial to hit the task limit without hitting any kmemcg limits in
place, PIDs are a fundamental resource. As such, PID exhaustion must be
preventable in the scope of a cgroup hierarchy by allowing resource limiting of
the number of tasks in a cgroup.

Usage
-----

In order to use the `pids` controller, set the maximum number of tasks in
pids.max (this file is not available in the root cgroup). The number of
processes currently in the cgroup is given by pids.current.

Organisational operations are not blocked by cgroup policies, so it is possible
to have pids.current > pids.max. This can happen either by setting the limit
below pids.current, or by attaching enough processes to the cgroup that
pids.current > pids.max. However, it is not possible to violate a cgroup
policy through fork() or clone(): both fail with -EAGAIN (seen from userspace
as a return value of -1 with errno set to EAGAIN) if creating the new process
would cause a cgroup policy to be violated.
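
As a rough illustration of how the two files relate, the sketch below reads
pids.max and pids.current from a cgroup directory and prints how many more
tasks could be created before the cgroup's own limit is hit. The function name
pids_headroom is hypothetical and not part of the controller. The sketch only
looks at the given cgroup, so limits set on ancestor cgroups are ignored.

```shell
# Hypothetical helper (not provided by the kernel): print how many more tasks
# the cgroup directory "$1" may create before reaching its own pids.max.
# Ancestor limits are NOT consulted here.
pids_headroom() {
    dir=$1
    max=$(cat "$dir/pids.max") || return 1
    cur=$(cat "$dir/pids.current") || return 1
    if [ "$max" = "max" ]; then
        # "max" means no limit is set on this cgroup.
        echo unlimited
    elif [ "$cur" -ge "$max" ]; then
        # pids.current >= pids.max: at the limit, or over it because the
        # limit was lowered or processes were attached afterwards.
        echo 0
    else
        echo $((max - cur))
    fi
}
```

For instance, assuming the controller is mounted at /sys/fs/cgroup/pids and a
cgroup named parent exists, it could be called as
`pids_headroom /sys/fs/cgroup/pids/parent`.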

To set a cgroup to have no limit, set pids.max to "max". This is the default for
all new cgroups. Note that PID limits are hierarchical, so the most stringent
limit in the hierarchy is followed.
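
The hierarchical rule can be sketched as a walk from a cgroup up towards the
mount point, keeping the smallest numeric pids.max seen along the way. The
function name pids_effective_max and its two-argument form (cgroup directory,
hierarchy mount point) are assumptions made for this illustration. The kernel
enforces the limits itself and exposes no such file.

```shell
# Hypothetical helper (not provided by the kernel): print the most stringent
# pids.max found between cgroup directory "$1" and the hierarchy mount point
# "$2". The mount point itself is skipped, since the root cgroup has no
# pids.max. Prints "max" if no cgroup on the path sets a numeric limit.
pids_effective_max() {
    dir=${1%/}
    root=${2%/}
    best=max
    while [ "$dir" != "$root" ] && [ -f "$dir/pids.max" ]; do
        limit=$(cat "$dir/pids.max")
        if [ "$limit" != "max" ]; then
            if [ "$best" = "max" ] || [ "$limit" -lt "$best" ]; then
                best=$limit
            fi
        fi
        # Move one level up the hierarchy.
        dir=$(dirname "$dir")
    done
    echo "$best"
}
```

Assuming the controller is mounted at /sys/fs/cgroup/pids, a call might look
like `pids_effective_max /sys/fs/cgroup/pids/parent/child /sys/fs/cgroup/pids`.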

pids.current tracks all child cgroup hierarchies, so parent/pids.current
includes every task counted in parent/child/pids.current.

Example
-------

First, we mount the pids controller:
# mkdir -p /sys/fs/cgroup/pids
# mount -t cgroup -o pids none /sys/fs/cgroup/pids

Then we create a hierarchy, set limits and attach processes to it:
# mkdir -p /sys/fs/cgroup/pids/parent/child
# echo 2 > /sys/fs/cgroup/pids/parent/pids.max
# echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
# cat /sys/fs/cgroup/pids/parent/pids.current
2
#

Attempts to exceed the set limit (2 in this case) will fail:

# cat /sys/fs/cgroup/pids/parent/pids.current
2
# ( /bin/echo "Here's some processes for you." | cat )
sh: fork: Resource temporarily unavailable
#

Even if we migrate to a child cgroup (which doesn't have a set limit), we will
not be able to overcome the most stringent limit in the hierarchy (in this case,
parent's):

# echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
# cat /sys/fs/cgroup/pids/parent/pids.current
2
# cat /sys/fs/cgroup/pids/parent/child/pids.current
2
# cat /sys/fs/cgroup/pids/parent/child/pids.max
max
# ( /bin/echo "Here's some processes for you." | cat )
sh: fork: Resource temporarily unavailable
#

We can set a limit that is smaller than pids.current, which will stop any new
processes from being forked at all (note that the shell itself counts towards
pids.current):

# echo 1 > /sys/fs/cgroup/pids/parent/pids.max
# /bin/echo "We can't even spawn a single process now."
sh: fork: Resource temporarily unavailable
# echo 0 > /sys/fs/cgroup/pids/parent/pids.max
# /bin/echo "We can't even spawn a single process now."
sh: fork: Resource temporarily unavailable
#