Serge E. Hallyn | 08ce5f1 | 2008-04-29 01:00:10 -0700 | [diff] [blame] | 1 | Device Whitelist Controller |

1. Description:

Implement a cgroup to track and enforce open and mknod restrictions
on device files. A device cgroup associates a device access
whitelist with each cgroup. A whitelist entry has 4 fields: type,
major, minor, and access. 'type' is a (all), c (char), or b (block).
'all' means it applies to all types and all major and minor numbers.
Major and minor are either an integer or * for all. Access is a
composition of r (read), w (write), and m (mknod).
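
The four fields can be made concrete with a small parser. The following
is only a sketch of the entry syntax described above, not the kernel's
parser; the names Entry and parse_entry are hypothetical, and treating a
bare 'a' as 'a *:* rwm' is an assumption based on the shorthand used in
the User Interface section.

```python
import re
from typing import NamedTuple


class Entry(NamedTuple):
    dev_type: str  # 'a' (all), 'c' (char), or 'b' (block)
    major: str     # decimal string, or '*' for all
    minor: str     # decimal string, or '*' for all
    access: str    # non-empty combination of 'r', 'w', 'm'


def parse_entry(text: str) -> Entry:
    """Parse 'type major:minor access' into an Entry (illustrative helper)."""
    parts = text.split()
    if parts == ["a"]:
        # Assumption: a bare 'a' stands for the catch-all 'a *:* rwm'.
        return Entry("a", "*", "*", "rwm")
    if len(parts) != 3:
        raise ValueError(f"expected 'type major:minor access', got {text!r}")
    dev_type, numbers, access = parts
    if dev_type not in ("a", "c", "b"):
        raise ValueError(f"bad type {dev_type!r}")
    major, sep, minor = numbers.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in {numbers!r}")
    for field in (major, minor):
        if field != "*" and not re.fullmatch(r"[0-9]+", field):
            raise ValueError(f"bad device number {field!r}")
    if not access or set(access) - set("rwm"):
        raise ValueError(f"bad access {access!r}")
    return Entry(dev_type, major, minor, access)
```

For example, parse_entry("c 1:3 mr") yields an entry of type 'c' with
major '1', minor '3', and access 'mr'.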

The root device cgroup starts with rwm to 'all'. A child device
cgroup gets a copy of the parent. Administrators can then remove
devices from the whitelist or add new entries. A child cgroup can
never receive a device access which is denied by its parent.

2. User Interface

An entry is added using devices.allow, and removed using
devices.deny. For instance

    echo 'c 1:3 mr' > /sys/fs/cgroup/1/devices.allow

allows cgroup 1 to read and mknod the device usually known as
/dev/null. Doing

    echo a > /sys/fs/cgroup/1/devices.deny

will remove the default 'a *:* rwm' entry. Doing

    echo a > /sys/fs/cgroup/1/devices.allow

will add the 'a *:* rwm' entry to the whitelist.
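
The same writes can be scripted. The sketch below wraps the echo
commands above in a small helper; write_rule is a hypothetical name, and
the cgroup directory is passed in by the caller rather than assumed.

```python
from pathlib import Path


def write_rule(cgroup_dir: str, control_file: str, entry: str) -> None:
    """Write one entry to devices.allow or devices.deny (illustrative helper)."""
    if control_file not in ("devices.allow", "devices.deny"):
        raise ValueError(f"unexpected control file {control_file!r}")
    path = Path(cgroup_dir) / control_file
    # Same effect as: echo "$entry" > "$cgroup_dir/$control_file"
    with open(path, "w") as handle:
        handle.write(entry + "\n")
```

Calling write_rule("/sys/fs/cgroup/1", "devices.allow", "c 1:3 mr")
corresponds to the first command above. On a real system the write is
subject to the capability requirements in the Security section.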

3. Security

Any task can move itself between cgroups. This clearly won't
suffice, but we can decide the best way to adequately restrict
movement as people get some experience with this. We may just want
to require CAP_SYS_ADMIN, which at least is a separate bit from
CAP_MKNOD. We may want to just refuse moving to a cgroup which
isn't a descendant of the current one. Or we may want to use
CAP_MAC_ADMIN, since we really are trying to lock down root.

CAP_SYS_ADMIN is needed to modify the whitelist or move another
task to a new cgroup. (Again we'll probably want to change that).

A cgroup may not be granted more permissions than the cgroup's
parent has.

4. Hierarchy

Device cgroups maintain hierarchy by making sure a cgroup never has more
access permissions than its parent. Every time an entry is written to
a cgroup's devices.deny file, all its children will have that entry removed
from their whitelist and all the locally set whitelist entries will be
re-evaluated. If one of the locally set whitelist entries would provide
more access than the cgroup's parent, it will be removed from the whitelist.

Example:
      A
     / \
        B

    group        behavior        exceptions
    A            allow           "b 8:* rwm", "c 116:1 rw"
    B            deny            "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"

If a device is denied in group A:
    # echo "c 116:* r" > A/devices.deny
it'll propagate down and after revalidating B's entries, the whitelist entry
"c 116:2 rwm" will be removed:

    group        whitelist entries               denied devices
    A            all                             "b 8:* rwm", "c 116:* rw"
    B            "c 1:3 rwm", "b 3:* rwm"        all the rest

If the parent's exceptions change so that local exceptions are no longer
allowed, they'll be deleted.
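
The revalidation step can be pictured with a simplified, whitelist-only
model. This is a sketch under stated assumptions, not the kernel code:
entries are plain (type, major, minor, access) tuples, the names covers
and revalidate are hypothetical, and a child entry is kept only if a
single parent entry grants at least the same access to the same devices.

```python
def covers(parent, child):
    """True if one parent entry grants everything the child entry asks for."""
    p_type, p_major, p_minor, p_access = parent
    c_type, c_major, c_minor, c_access = child
    return (
        p_type in ("a", c_type)            # 'a' matches every type
        and p_major in ("*", c_major)      # '*' matches every major
        and p_minor in ("*", c_minor)      # '*' matches every minor
        and set(c_access) <= set(p_access)
    )


def revalidate(parent_whitelist, child_whitelist):
    """Drop child entries that no parent entry covers any longer."""
    return [
        child
        for child in child_whitelist
        if any(covers(parent, child) for parent in parent_whitelist)
    ]


# Illustrative values: the parent has just lost 'r' on "c 116:*".
parent = [("c", "1", "3", "rwm"), ("b", "3", "*", "rwm"), ("c", "116", "*", "wm")]
child = [("c", "1", "3", "rwm"), ("c", "116", "2", "rwm"), ("b", "3", "*", "rwm")]
remaining = revalidate(parent, child)
```

In this model, as in the example above, the child's "c 116:2 rwm" entry
is dropped while "c 1:3 rwm" and "b 3:* rwm" survive.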

Notice that new whitelist entries will not be propagated:
      A
     / \
        B

    group        whitelist entries               denied devices
    A            "c 1:3 rwm", "c 1:5 r"          all the rest
    B            "c 1:3 rwm", "c 1:5 r"          all the rest

when adding "c *:3 rwm":
    # echo "c *:3 rwm" >A/devices.allow

the result:
    group        whitelist entries               denied devices
    A            "c *:3 rwm", "c 1:5 r"          all the rest
    B            "c 1:3 rwm", "c 1:5 r"          all the rest

but now it'll be possible to add new entries to B:
    # echo "c 2:3 rwm" >B/devices.allow
    # echo "c 50:3 r" >B/devices.allow
or even
    # echo "c *:3 rwm" >B/devices.allow
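
Why those writes to B become possible can be checked with a small
model. The sketch assumes a whitelist-only parent and that a new child
entry is accepted only when a single parent entry grants at least the
same access; split_entry and may_allow are hypothetical names, and the
real kernel check may differ in detail.

```python
def split_entry(text):
    """Split 'type major:minor access' into a 4-tuple of strings."""
    dev_type, numbers, access = text.split()
    major, minor = numbers.split(":")
    return dev_type, major, minor, access


def may_allow(parent_whitelist, new_entry):
    """True if some parent entry covers the entry a child wants to add."""
    c_type, c_major, c_minor, c_access = split_entry(new_entry)
    for parent_entry in parent_whitelist:
        p_type, p_major, p_minor, p_access = split_entry(parent_entry)
        if (
            p_type in ("a", c_type)
            and p_major in ("*", c_major)
            and p_minor in ("*", c_minor)
            and set(c_access) <= set(p_access)
        ):
            return True
    return False


# A's whitelist before and after: echo "c *:3 rwm" > A/devices.allow
a_before = ["c 1:3 rwm", "c 1:5 r"]
a_after = ["c *:3 rwm", "c 1:5 r"]
```

With a_before, may_allow rejects "c 2:3 rwm"; with a_after it accepts
"c 2:3 rwm", "c 50:3 r", and "c *:3 rwm", matching the commands above.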

Allowing or denying all by writing 'a' to devices.allow or devices.deny will
not be possible once the device cgroup has children.

4.1 Hierarchy (internal implementation)

The device cgroup is implemented internally using a behavior (ALLOW, DENY) and a
list of exceptions. The internal state is controlled using the same user
interface to preserve compatibility with the previous whitelist-only
implementation. Removal or addition of exceptions that will reduce the access
to devices will be propagated down the hierarchy.
For every propagated exception, the effective rules will be re-evaluated based
on the parent's current access rules.
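
The behavior-plus-exceptions representation can be sketched as follows.
This is an illustrative model, not the kernel implementation: the names
same_device and is_permitted are hypothetical, a request is assumed to
name one concrete device, and under DENY a single exception is assumed
to have to grant the whole requested access.

```python
ALLOW, DENY = "allow", "deny"


def same_device(exception, request):
    """True if the exception's type/major/minor pattern matches the request."""
    e_type, e_major, e_minor, _ = exception
    r_type, r_major, r_minor, _ = request
    return (
        e_type in ("a", r_type)
        and e_major in ("*", r_major)
        and e_minor in ("*", r_minor)
    )


def is_permitted(behavior, exceptions, request):
    """Decide a request under a default behavior and a list of exceptions."""
    wanted = set(request[3])
    matching = [set(e[3]) for e in exceptions if same_device(e, request)]
    if behavior == ALLOW:
        # Exceptions are denials: any overlap with the request blocks it.
        return all(not (access & wanted) for access in matching)
    # Exceptions are grants: one of them must cover the whole request.
    return any(wanted <= access for access in matching)


# Groups A and B from the first example in section 4.
a_exceptions = [("b", "8", "*", "rwm"), ("c", "116", "1", "rw")]
b_exceptions = [("c", "1", "3", "rwm"), ("c", "116", "2", "rwm"), ("b", "3", "*", "rwm")]
```

Under this model, group A (ALLOW) refuses a read of "c 116:1" but still
permits mknod on it, while group B (DENY) permits only requests covered
by one of its three exceptions.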