The cgroup freezer is useful to batch job management systems which start
and stop sets of tasks in order to schedule the resources of a machine
according to the desires of a system administrator. This sort of program
is often used on HPC clusters to schedule access to the cluster as a
whole. The cgroup freezer uses cgroups to describe the set of tasks to
be started/stopped by the batch job management system. It also provides
a means to start and stop the tasks composing the job.

The cgroup freezer will also be useful for checkpointing running groups
of tasks. The freezer allows the checkpoint code to obtain a consistent
image of the tasks by attempting to force the tasks in a cgroup into a
quiescent state. Once the tasks are quiescent, another task can
walk /proc or invoke a kernel interface to gather information about the
quiesced tasks. Checkpointed tasks can be restarted later should a
recoverable error occur. This also allows the checkpointed tasks to be
migrated between nodes in a cluster by copying the gathered information
to another node and restarting the tasks there.

Sequences of SIGSTOP and SIGCONT are not always sufficient for stopping
and resuming tasks in userspace. Both of these signals are observable
from within the tasks we wish to freeze. While SIGSTOP cannot be caught,
blocked, or ignored, it can be seen by waiting or ptracing parent tasks.
SIGCONT is especially unsuitable since it can be caught by the task. Any
programs designed to watch for SIGSTOP and SIGCONT could be broken by
attempting to use SIGSTOP and SIGCONT to stop and resume tasks. We can
demonstrate this problem using nested bash shells:

    $ echo $$
    16644
    $ bash
    $ echo $$
    16690

    From a second, unrelated bash shell:
    $ kill -SIGSTOP 16690
    $ kill -SIGCONT 16690

    <at this point 16690 exits and causes 16644 to exit too>

This happens because bash can observe both signals and choose how it
responds to them.
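
The visibility of SIGCONT can also be reproduced without an interactive
session. The following is a minimal sketch, not part of the freezer itself:
the function name sigcont_demo and the temporary "ready" file are
illustrative assumptions. A child shell installs a SIGCONT handler and the
parent then sends it a stop/continue pair.

```shell
#!/bin/sh
# Sketch: a task can observe a SIGSTOP/SIGCONT cycle by trapping SIGCONT.
# sigcont_demo is a hypothetical name used only for this illustration.
sigcont_demo() {
    ready=$(mktemp) || return 1

    # Child shell: install a SIGCONT handler, report readiness, then idle.
    sh -c '
        trap "echo caught SIGCONT; exit 0" CONT
        echo ready > "$1"
        while :; do sleep 0.1; done
    ' sh "$ready" &
    pid=$!

    # Do not send signals until the handler is known to be installed.
    until [ -s "$ready" ]; do sleep 0.1; done

    kill -s STOP "$pid"
    kill -s CONT "$pid"

    # The child exits from its handler, so this wait terminates.
    wait "$pid"
    rm -f "$ready"
}

sigcont_demo
```

The child prints "caught SIGCONT" and exits, which shows that the task
being resumed is free to react to the signal in any way it chooses.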

Another example of a program which catches and responds to these
signals is gdb. In fact, any program designed to use ptrace is likely to
have a problem with this method of stopping and resuming tasks.

In contrast, the cgroup freezer uses the kernel freezer code to
prevent the freeze/unfreeze cycle from becoming visible to the tasks
being frozen. This allows the bash example above and gdb to run as
expected.

The cgroup freezer is hierarchical. Freezing a cgroup freezes all
tasks belonging to the cgroup and all its descendant cgroups. Each
cgroup has its own state (self-state) and the state inherited from the
parent (parent-state). Iff both states are THAWED, the cgroup is
THAWED.

The following cgroupfs files are created by the cgroup freezer.

* freezer.state: Read-write.

  When read, returns the effective state of the cgroup - "THAWED",
  "FREEZING" or "FROZEN". This is the combined self and parent-states.
  If either is freezing, the cgroup is freezing (FREEZING or FROZEN).

  A FREEZING cgroup transitions into the FROZEN state when all tasks
  belonging to the cgroup and its descendants become frozen. Note that
  a cgroup reverts to FREEZING from FROZEN after a new task is added
  to the cgroup or one of its descendant cgroups until the new task is
  frozen.

  When written, sets the self-state of the cgroup. Two values are
  allowed - "FROZEN" and "THAWED". If FROZEN is written, the cgroup,
  if not already freezing, enters the FREEZING state along with all its
  descendant cgroups.

  If THAWED is written, the self-state of the cgroup is changed to
  THAWED. Note that the effective state may not change to THAWED if
  the parent-state is still freezing. If a cgroup's effective state
  becomes THAWED, all its descendants which are freezing because of
  the cgroup also leave the freezing state.

* freezer.self_freezing: Read only.

  Shows the self-state. 0 if the self-state is THAWED; otherwise, 1.
  This value is 1 iff the last write to freezer.state was "FROZEN".

* freezer.parent_freezing: Read only.

  Shows the parent-state. 0 if none of the cgroup's ancestors is
  frozen; otherwise, 1.

The root cgroup is non-freezable, so the above interface files do not
exist there.
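
The relationship between the three files can be inspected from a shell.
The sketch below is illustrative only: freezer_report is a hypothetical
helper name, and it assumes that its first argument is a non-root cgroup
directory inside a mounted freezer hierarchy. It reads the files described
above and reports whether the freezing state comes from the cgroup itself,
from an ancestor, or from both.

```shell
#!/bin/sh
# Hypothetical helper: summarize why a cgroup is (or is not) freezing.
# $1 is a cgroup directory inside a mounted freezer hierarchy.
freezer_report() {
    dir=$1
    state=$(cat "$dir/freezer.state") || return 1
    self=$(cat "$dir/freezer.self_freezing") || return 1
    parent=$(cat "$dir/freezer.parent_freezing") || return 1

    echo "effective state: $state"
    if [ "$self" = 1 ] && [ "$parent" = 1 ]; then
        echo "cause: self-state and parent-state are both freezing"
    elif [ "$self" = 1 ]; then
        echo "cause: FROZEN was written to this cgroup"
    elif [ "$parent" = 1 ]; then
        echo "cause: inherited from an ancestor"
    else
        echo "cause: none, self-state and parent-state are both THAWED"
    fi
}
```

For a cgroup whose own freezer.state was last written as THAWED but whose
effective state is still freezing, the second line of the report would read
"cause: inherited from an ancestor".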

* Examples of usage:

    # mkdir /sys/fs/cgroup/freezer
    # mount -t cgroup -ofreezer freezer /sys/fs/cgroup/freezer
    # mkdir /sys/fs/cgroup/freezer/0
    # echo $some_pid > /sys/fs/cgroup/freezer/0/tasks

To get the status of the freezer subsystem:

    # cat /sys/fs/cgroup/freezer/0/freezer.state
    THAWED

To freeze all tasks in the container:

    # echo FROZEN > /sys/fs/cgroup/freezer/0/freezer.state
    # cat /sys/fs/cgroup/freezer/0/freezer.state
    FREEZING
    # cat /sys/fs/cgroup/freezer/0/freezer.state
    FROZEN

To unfreeze all tasks in the container:

    # echo THAWED > /sys/fs/cgroup/freezer/0/freezer.state
    # cat /sys/fs/cgroup/freezer/0/freezer.state
    THAWED

This is the basic mechanism which should do the right thing for user space
tasks in a simple scenario.
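
Because freezer.state may read FREEZING for a while before it reads FROZEN,
a script that needs the tasks to be fully quiescent has to poll. The
following is a minimal sketch under stated assumptions: freeze_and_wait is
a hypothetical helper name, and the default of 50 polls at 0.1 second
intervals is an arbitrary choice, not a value taken from the kernel.

```shell
#!/bin/sh
# Hypothetical helper: request a freeze and poll until it completes.
# $1: cgroup directory inside a mounted freezer hierarchy
# $2: maximum number of polls (assumed default: 50)
freeze_and_wait() {
    dir=$1
    tries=${2:-50}

    # Setting the self-state to FROZEN starts the transition.
    echo FROZEN > "$dir/freezer.state" || return 1

    while [ "$tries" -gt 0 ]; do
        # The state reads FREEZING until every task has been frozen.
        if [ "$(cat "$dir/freezer.state")" = FROZEN ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 0.1
    done

    # Still not FROZEN after the allowed number of polls.
    return 1
}
```

With the hierarchy from the examples above, it could be called as:

    # freeze_and_wait /sys/fs/cgroup/freezer/0 || echo "not frozen yet"

The caller remains responsible for writing THAWED afterwards, including in
the case where the helper gives up.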