                The Resource Counter

The resource counter, declared in include/linux/res_counter.h,
facilitates resource management by controllers by providing
common accounting infrastructure.

This infrastructure consists of the res_counter structure and the
routines that operate on it.



1. Crucial parts of the res_counter structure

 a. unsigned long long usage

        The usage value shows the amount of a resource that is consumed
        by a group at a given time. The units of measurement are
        determined by the controller that uses this counter, e.g. bytes,
        items or any other unit the controller operates on.

 b. unsigned long long max_usage

        The maximal value that usage has reached over time.

        This value is useful when gathering statistical information about
        a particular group, as it shows the group's actual resource
        requirements rather than just a usage snapshot.

 c. unsigned long long limit

        The maximal amount of the resource the group is allowed to
        consume. If the group requests more resources, so that the usage
        value would exceed the limit, the resource allocation is rejected
        (see the next section).

 d. unsigned long long failcnt

        The failcnt stands for "failures counter". This is the number of
        resource allocation attempts that failed.

 e. spinlock_t lock

        Protects changes of the above values.
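
The interplay of the fields above can be illustrated with a small
userspace model. This is only a sketch, not the kernel definition: the
model_ names are hypothetical, the lock is left out, and the "unlimited"
initial limit is an assumption made for the example.

```c
#include <stddef.h>

/* Userspace model of the fields described above; not the kernel struct. */
struct model_res_counter {
	unsigned long long usage;	/* amount currently consumed */
	unsigned long long max_usage;	/* highest value usage has reached */
	unsigned long long limit;	/* maximal allowed usage */
	unsigned long long failcnt;	/* number of rejected charges */
	/* the kernel structure also holds a spinlock_t lock; omitted here */
};

/* Hypothetical helper: no usage yet, limit assumed to start unlimited. */
static void model_init(struct model_res_counter *rc)
{
	rc->usage = 0;
	rc->max_usage = 0;
	rc->limit = ~0ULL;
	rc->failcnt = 0;
}

/*
 * Hypothetical helper: account val units. The request is rejected
 * (and failcnt bumped) if usage would exceed the limit.
 * Returns 0 on success, -1 on rejection.
 */
static int model_charge(struct model_res_counter *rc, unsigned long long val)
{
	/* written this way so that usage + val cannot overflow */
	if (rc->usage > rc->limit || val > rc->limit - rc->usage) {
		rc->failcnt++;
		return -1;
	}

	rc->usage += val;
	if (rc->usage > rc->max_usage)
		rc->max_usage = rc->usage;
	return 0;
}
```

In this model a rejected charge leaves usage and max_usage untouched
and only increments failcnt, which is what makes failcnt usable as a
"limit is too tight" signal later on.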



2. Basic accounting routines

 a. void res_counter_init(struct res_counter *rc,
                                struct res_counter *rc_parent)

        Initializes the resource counter. As usual, this should be the
        first routine called for a new counter.

        The rc_parent argument can be used to define a hierarchical
        child -> parent relationship directly in the res_counter
        structure. Pass NULL to define no relationship.

 b. int res_counter_charge(struct res_counter *rc, unsigned long val,
                        struct res_counter **limit_fail_at)

        When a resource is about to be allocated it has to be accounted
        with the appropriate resource counter (the controller should
        determine which one to use on its own). This operation is called
        "charging".

        It is not very important which operation - resource allocation
        or charging - is performed first, but
          * if the allocation is performed first, this may create a
            temporary resource over-usage by the time the resource
            counter is charged;
          * if the charging is performed first, then it should be
            uncharged on the error path (if one is taken).

        If the charging fails and a hierarchical dependency exists, the
        limit_fail_at parameter is set to the particular res_counter
        element where the charging failed.

 c. int res_counter_charge_locked
                        (struct res_counter *rc, unsigned long val, bool force)

        The same as res_counter_charge(), but it does not acquire/release
        the res_counter->lock internally (it must be called with
        res_counter->lock held). The force parameter indicates whether
        the limit can be bypassed.

 d. u64 res_counter_uncharge[_locked]
                        (struct res_counter *rc, unsigned long val)

        When a resource is released (freed) it should be de-accounted
        from the resource counter it was accounted to. This is called
        "uncharging". The return value of this function indicates the
        amount of charges still present in the counter.

        The _locked routines imply that the res_counter->lock is taken.

 e. u64 res_counter_uncharge_until
                (struct res_counter *rc, struct res_counter *top,
                 unsigned long val)

        Almost the same as res_counter_uncharge(), but propagation of the
        uncharge stops when rc == top. This is useful when killing a
        res_counter in a child cgroup.
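
The hierarchical behaviour of charging and of res_counter_uncharge_until()
can be sketched in userspace as follows. This is a simplified model under
stated assumptions, not the kernel code: the model_ names are hypothetical,
locking and the force parameter are left out, and a failed charge is
assumed to undo what was already charged at lower levels.

```c
#include <stddef.h>

/* Userspace model of a counter with an optional parent. */
struct model_counter {
	unsigned long long usage;
	unsigned long long limit;
	unsigned long long failcnt;
	struct model_counter *parent;	/* NULL for a root counter */
};

/*
 * Uncharge val from rc and from each ancestor, stopping before top.
 * Passing top == NULL walks up to the root. Returns the usage left
 * in rc itself.
 */
static unsigned long long model_uncharge_until(struct model_counter *rc,
					       struct model_counter *top,
					       unsigned long long val)
{
	unsigned long long ret = 0;
	struct model_counter *c;

	for (c = rc; c != NULL && c != top; c = c->parent) {
		/* never let usage go below zero */
		unsigned long long v = val > c->usage ? c->usage : val;

		c->usage -= v;
		if (c == rc)
			ret = c->usage;
	}
	return ret;
}

/*
 * Charge val to rc and to every ancestor. On failure the counter that
 * hit its limit is reported through limit_fail_at and the levels that
 * were already charged are rolled back. Returns 0 or a negative value.
 */
static int model_charge(struct model_counter *rc, unsigned long long val,
			struct model_counter **limit_fail_at)
{
	struct model_counter *c;

	*limit_fail_at = NULL;
	for (c = rc; c != NULL; c = c->parent) {
		if (c->usage > c->limit || val > c->limit - c->usage) {
			c->failcnt++;
			*limit_fail_at = c;
			/* undo the charges made below the failing level */
			model_uncharge_until(rc, c, val);
			return -1;
		}
		c->usage += val;
	}
	return 0;
}
```

With a child whose parent has the tighter limit, a failing charge in this
model reports the parent through limit_fail_at, which tells the caller
which level of the hierarchy is under pressure.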

 2.1 Other accounting routines

    There are more routines that may help you with common needs, like
    checking whether the limit is reached or resetting the max_usage
    value. They are all declared in include/linux/res_counter.h.



3. Analyzing the resource counter registrations

 a. If the failcnt value constantly grows, this means that the counter's
    limit is too tight. Either the group is misbehaving and consumes too
    many resources, or the configuration is not suitable for the group
    and the limit should be increased.

 b. The max_usage value can be used to quickly tune the group. One may
    set the limits to maximal values and either load the container with
    a common pattern or leave it running for a while. After this the
    max_usage value shows the amount of the resource the container
    would require during its common activity.

    Setting the limit a bit above this value gives a pretty good
    configuration that works in most cases.

 c. If the max_usage is much less than the limit, but the failcnt value
    is growing, then the group tries to allocate a big chunk of the
    resource at once.

 d. If the max_usage is much less than the limit, but the failcnt value
    is 0, then this group has been given a higher limit than it
    requires. It is better to lower the limit a bit, leaving more
    resource for other groups.
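
The rules a, c and d above can be turned into a small decision helper
that compares two readings of the same counter taken some time apart.
This is an illustrative sketch only: all names are hypothetical, and
"much less than the limit" is arbitrarily taken to mean "below half of
the limit", which a real tool would want to make configurable.

```c
/* One reading of the values a controller exports for a counter. */
struct model_sample {
	unsigned long long max_usage;
	unsigned long long limit;
	unsigned long long failcnt;
};

enum model_advice {
	MODEL_OK,		/* nothing to change */
	MODEL_RAISE_LIMIT,	/* rule a: limit is too tight */
	MODEL_BURSTY,		/* rule c: big allocations at once */
	MODEL_LOWER_LIMIT	/* rule d: limit is higher than needed */
};

/* Classify a counter from an earlier (prev) and a later (cur) reading. */
static enum model_advice model_analyze(const struct model_sample *prev,
				       const struct model_sample *cur)
{
	/* failcnt grew between the two readings */
	int failing = cur->failcnt > prev->failcnt;
	/* assumption: "much less" means below half of the limit */
	int far_below = cur->max_usage < cur->limit / 2;

	if (failing && far_below)
		return MODEL_BURSTY;
	if (failing)
		return MODEL_RAISE_LIMIT;
	if (cur->failcnt == 0 && far_below)
		return MODEL_LOWER_LIMIT;
	return MODEL_OK;
}
```

Rule b is not covered by the helper, since choosing a limit "a bit above"
max_usage depends on how much headroom the administrator wants.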



4. Communication with the control groups subsystem (cgroups)

All the resource controllers that are using cgroups and resource counters
should provide files (in the cgroup filesystem) to work with the resource
counter fields. They are recommended to adhere to the following rules:

 a. File names

    Field name          File name
    ---------------------------------------------------
    usage               usage_in_<unit_of_measurement>
    max_usage           max_usage_in_<unit_of_measurement>
    limit               limit_in_<unit_of_measurement>
    failcnt             failcnt
    lock                no file :)

 b. Reading from a file should show the corresponding field value in the
    appropriate format.

 c. Writing to a file

    Field               Expected behavior
    ----------------------------------
    usage               prohibited
    max_usage           reset to usage
    limit               set the limit
    failcnt             reset to zero
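
The write rules in the table above can be modelled in userspace as a
single dispatch function. This is a sketch under assumptions, not cgroup
file handler code: the model_ names and the -1 return value for a
prohibited write are hypothetical, and a real controller may apply
further checks when a new limit is set.

```c
/* Which counter field a (hypothetical) control file maps to. */
enum model_field {
	MODEL_USAGE,
	MODEL_MAX_USAGE,
	MODEL_LIMIT,
	MODEL_FAILCNT
};

struct model_counter_files {
	unsigned long long usage;
	unsigned long long max_usage;
	unsigned long long limit;
	unsigned long long failcnt;
};

/*
 * Apply a write of val to the file backing field.
 * Returns 0 on success, -1 if writing to that file is prohibited.
 */
static int model_write(struct model_counter_files *rc,
		       enum model_field field, unsigned long long val)
{
	switch (field) {
	case MODEL_USAGE:
		return -1;			/* prohibited */
	case MODEL_MAX_USAGE:
		rc->max_usage = rc->usage;	/* reset to usage, val ignored */
		return 0;
	case MODEL_LIMIT:
		rc->limit = val;		/* set the limit */
		return 0;
	case MODEL_FAILCNT:
		rc->failcnt = 0;		/* reset to zero, val ignored */
		return 0;
	}
	return -1;
}
```

Note that for max_usage and failcnt the written value is ignored in this
model; the write acts purely as a reset trigger.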



5. Usage example

 a. Declare a task group (take a look at the cgroups subsystem for this)
    and fold a res_counter into it

        struct my_group {
                struct res_counter res;

                <other fields>
        };

 b. Put hooks in the resource allocation/release paths

        int alloc_something(...)
        {
                struct res_counter *fail_at;

                /* charge first; fail_at reports where the limit was hit */
                if (res_counter_charge(res_counter_ptr, amount, &fail_at) < 0)
                        return -ENOMEM;

                <allocate the resource and return to the caller>
        }

        void release_something(...)
        {
                res_counter_uncharge(res_counter_ptr, amount);

                <release the resource>
        }

    In order to keep the usage value self-consistent, both the
    "res_counter_ptr" and the "amount" in release_something() should be
    the same as they were in alloc_something() when the resource being
    released was allocated.

 c. Provide a way to read res_counter values and set them (the cgroups
    subsystem can still help with this).

 d. Compile and run :)