configfs - Userspace-driven kernel object configuration.

Joel Becker <joel.becker@oracle.com>

Updated: 31 March 2005

Copyright (c) 2005 Oracle Corporation,
	Joel Becker <joel.becker@oracle.com>


[What is configfs?]

configfs is a ram-based filesystem that provides the converse of
sysfs's functionality.  Where sysfs is a filesystem-based view of
kernel objects, configfs is a filesystem-based manager of kernel
objects, or config_items.

With sysfs, an object is created in kernel (for example, when a device
is discovered) and it is registered with sysfs.  Its attributes then
appear in sysfs, allowing userspace to read the attributes via
readdir(3)/read(2).  It may allow some attributes to be modified via
write(2).  The important point is that the object is created and
destroyed in kernel, the kernel controls the lifecycle of the sysfs
representation, and sysfs is merely a window on all this.

A configfs config_item is created via an explicit userspace operation:
mkdir(2).  It is destroyed via rmdir(2).  The attributes appear at
mkdir(2) time, and can be read or modified via read(2) and write(2).
As with sysfs, readdir(3) queries the list of items and/or attributes.
symlink(2) can be used to group items together.  Unlike sysfs, the
lifetime of the representation is completely driven by userspace.  The
kernel modules backing the items must respond to this.

Both sysfs and configfs can and should exist together on the same
system.  One is not a replacement for the other.

[Using configfs]

configfs can be compiled as a module or into the kernel.  You can access
it by doing

	mount -t configfs none /config

The configfs tree will be empty unless client modules are also loaded.
These are modules that register their item types with configfs as
subsystems.  Once a client subsystem is loaded, it will appear as a
subdirectory (or more than one) under /config.  Like sysfs, the
configfs tree is always there, whether mounted on /config or not.

An item is created via mkdir(2).  The item's attributes will also
appear at this time.  readdir(3) can determine what the attributes are,
read(2) can query their default values, and write(2) can store new
values.  Don't mix more than one attribute in one attribute file.

There are two types of configfs attributes:

* Normal attributes, which, like sysfs attributes, are small ASCII text
  files with a maximum size of one page (PAGE_SIZE, 4096 on i386).
  Preferably only one value per file should be used, and the same caveats
  from sysfs apply.  Configfs expects write(2) to store the entire buffer
  at once.  When writing to normal configfs attributes, userspace
  processes should first read the entire file, modify the portions they
  wish to change, and then write the entire buffer back.

* Binary attributes, which are somewhat similar to sysfs binary
  attributes, but with a few slight changes to semantics.  The PAGE_SIZE
  limitation does not apply, but the whole binary item must fit in a
  single kernel vmalloc'ed buffer.  The write(2) calls from user space
  are buffered, and the attribute's write_bin_attribute method will be
  invoked on the final close.  It is therefore imperative for user space
  to check the return code of close(2) in order to verify that the
  operation finished successfully.  To avoid a malicious user OOMing the
  kernel, there's a per-binary attribute maximum buffer value.

When an item needs to be destroyed, remove it with rmdir(2).  An
item cannot be destroyed if any other item has a link to it (via
symlink(2)).  Links can be removed via unlink(2).
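The write rules for the two attribute types can be sketched from the
userspace side roughly as follows.  This is a minimal illustration, not
part of configfs: the helper names attr_show_text(), attr_store_text()
and attr_store_bin() are made up for this example, and the paths are
whatever attribute files a client subsystem happens to expose.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: read a text attribute into buf (size must be >= 1).
 * Returns the number of bytes read, or -1 with errno set. */
ssize_t attr_show_text(const char *path, char *buf, size_t size)
{
	int fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ssize_t n = read(fd, buf, size - 1);
	int saved = errno;
	close(fd);
	if (n < 0) {
		errno = saved;
		return -1;
	}
	buf[n] = '\0';
	return n;
}

/* Hypothetical helper: store a text attribute by handing the complete
 * new contents to a single write(2).  Returns 0 or -1 (errno set). */
int attr_store_text(const char *path, const char *value)
{
	size_t len = strlen(value);
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	ssize_t n = write(fd, value, len);
	int saved = (n < 0) ? errno : EIO;
	close(fd);
	if (n != (ssize_t)len) {
		errno = saved;
		return -1;
	}
	return 0;
}

/* Hypothetical helper: store a binary attribute.  The write(2) calls
 * are only buffered, so the result of close(2) decides success. */
int attr_store_bin(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);
		if (n <= 0) {
			int saved = (n < 0) ? errno : EIO;
			close(fd);
			errno = saved;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return close(fd) == 0 ? 0 : -1;
}
```

A tool would typically call attr_show_text() first, edit the returned
buffer, and pass the whole result to attr_store_text(), rather than
issuing several partial writes.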

[Configuring FakeNBD: an Example]

Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices.  Call it FakeNBD.  FakeNBD uses configfs
for its configuration.  Obviously, there will be a nice program that
sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it.  Here's where configfs comes in.

When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine:

	# ls /config
	fakenbd

A fakenbd connection can be created with mkdir(2).  The name is
arbitrary, but likely the tool will make some use of the name.  Perhaps
it is a uuid or a disk name:

	# mkdir /config/fakenbd/disk1
	# ls /config/fakenbd/disk1
	target device rw

The target attribute contains the IP address of the server FakeNBD will
connect to.  The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write.

	# echo 10.0.0.1 > /config/fakenbd/disk1/target
	# echo /dev/sda1 > /config/fakenbd/disk1/device
	# echo 1 > /config/fakenbd/disk1/rw

That's it.  That's all there is.  Now the device is configured, via the
shell no less.

[Coding With configfs]

Every object in configfs is a config_item.  A config_item reflects an
object in the subsystem.  It has attributes that match values on that
object.  configfs handles the filesystem representation of that object
and its attributes, allowing the subsystem to ignore all but the
basic show/store interaction.

Items are created and destroyed inside a config_group.  A group is a
collection of items that share the same attributes and operations.
Items are created by mkdir(2) and removed by rmdir(2), but configfs
handles that.  The group has a set of operations to perform these tasks.

A subsystem is the top level of a client module.  During initialization,
the client module registers the subsystem with configfs, and the
subsystem appears as a directory at the top of the configfs filesystem.
A subsystem is also a config_group, and can do everything a config_group
can.

[struct config_item]

	struct config_item {
		char                    *ci_name;
		char                    ci_namebuf[UOBJ_NAME_LEN];
		struct kref             ci_kref;
		struct list_head        ci_entry;
		struct config_item      *ci_parent;
		struct config_group     *ci_group;
		struct config_item_type *ci_type;
		struct dentry           *ci_dentry;
	};

	void config_item_init(struct config_item *);
	void config_item_init_type_name(struct config_item *,
					const char *name,
					struct config_item_type *type);
	struct config_item *config_item_get(struct config_item *);
	void config_item_put(struct config_item *);

Generally, struct config_item is embedded in a container structure, a
structure that actually represents what the subsystem is doing.  The
config_item portion of that structure is how the object interacts with
configfs.

Whether statically defined in a source file or created by a parent
config_group, a config_item must have one of the _init() functions
called on it.  This initializes the reference count and sets up the
appropriate fields.

All users of a config_item should have a reference on it via
config_item_get(), and drop the reference when they are done via
config_item_put().

By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things.  For that, it needs a type.

[struct config_item_type]

	struct configfs_item_operations {
		void (*release)(struct config_item *);
		int (*allow_link)(struct config_item *src,
				  struct config_item *target);
		void (*drop_link)(struct config_item *src,
				  struct config_item *target);
	};

	struct config_item_type {
		struct module                           *ct_owner;
		struct configfs_item_operations         *ct_item_ops;
		struct configfs_group_operations        *ct_group_ops;
		struct configfs_attribute               **ct_attrs;
		struct configfs_bin_attribute           **ct_bin_attrs;
	};

The most basic function of a config_item_type is to define what
operations can be performed on a config_item.  All items that have been
allocated dynamically will need to provide the ct_item_ops->release()
method.  This method is called when the config_item's reference count
reaches zero.
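The interplay of the container structure, the get/put reference calls
and ->release() can be modelled in plain userspace C.  The sketch below
is only a model: the structures are cut-down stand-ins for the kernel
ones (a plain int replaces ci_kref), model_item_get() and
model_item_put() are hypothetical names, and the simple_child container
is borrowed from the sample module purely for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Cut-down stand-ins; the real definitions carry more fields. */
struct config_item;

struct configfs_item_operations {
	void (*release)(struct config_item *);
};

struct config_item_type {
	struct configfs_item_operations *ct_item_ops;
};

struct config_item {
	int ci_refcount;			/* models ci_kref */
	struct config_item_type *ci_type;
};

/* Hypothetical subsystem object that embeds its config_item. */
struct simple_child {
	struct config_item item;
	int storeme;
};

/* Recover the container from the embedded item (container_of pattern). */
#define to_simple_child(p) \
	((struct simple_child *)((char *)(p) - offsetof(struct simple_child, item)))

int released_count;	/* demo only: counts release() invocations */

static void simple_child_release(struct config_item *item)
{
	free(to_simple_child(item));
	released_count++;
}

static struct configfs_item_operations simple_child_item_ops = {
	.release = simple_child_release,
};

static struct config_item_type simple_child_type = {
	.ct_item_ops = &simple_child_item_ops,
};

/* Model of config_item_get(): take one more reference. */
struct config_item *model_item_get(struct config_item *item)
{
	item->ci_refcount++;
	return item;
}

/* Model of config_item_put(): drop a reference, release at zero. */
void model_item_put(struct config_item *item)
{
	if (--item->ci_refcount == 0 && item->ci_type->ct_item_ops->release)
		item->ci_type->ct_item_ops->release(item);
}

/* Allocate a container; the count of 1 models what _init() sets up. */
struct simple_child *simple_child_new(void)
{
	struct simple_child *c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->item.ci_refcount = 1;
	c->item.ci_type = &simple_child_type;
	return c;
}
```

The point of the model is that the container is freed only inside
->release(), never directly by whoever happens to drop a reference.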

[struct configfs_attribute]

	struct configfs_attribute {
		char                    *ca_name;
		struct module           *ca_owner;
		umode_t                 ca_mode;
		ssize_t (*show)(struct config_item *, char *);
		ssize_t (*store)(struct config_item *, const char *, size_t);
	};

When a config_item wants an attribute to appear as a file in the item's
configfs directory, it must define a configfs_attribute describing it.
It then adds the attribute to the NULL-terminated array
config_item_type->ct_attrs.  When the item appears in configfs, the
attribute file will appear with the configfs_attribute->ca_name
filename.  configfs_attribute->ca_mode specifies the file permissions.

If an attribute is readable and provides a ->show method, that method will
be called whenever userspace asks for a read(2) on the attribute.  If an
attribute is writable and provides a ->store method, that method will be
called whenever userspace asks for a write(2) on the attribute.
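A pair of ->show and ->store methods for a single integer attribute
might look roughly like the following userspace model.  It is a sketch
under stated assumptions: struct config_item is an empty stand-in,
MODEL_PAGE_SIZE stands in for PAGE_SIZE, the store buffer is assumed to
be NUL-terminated, and strtol() is used where kernel code would use one
of the kernel's own string-to-integer helpers.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

#define MODEL_PAGE_SIZE 4096		/* stand-in for PAGE_SIZE */

struct config_item { int unused; };	/* simplified stand-in */

/* Hypothetical container holding the value behind the attribute. */
struct simple_child {
	struct config_item item;
	int storeme;
};

#define to_simple_child(p) \
	((struct simple_child *)((char *)(p) - offsetof(struct simple_child, item)))

/* ->show: format the value into the page, return the bytes produced. */
ssize_t simple_child_storeme_show(struct config_item *item, char *page)
{
	return snprintf(page, MODEL_PAGE_SIZE, "%d\n",
			to_simple_child(item)->storeme);
}

/* ->store: parse the whole buffer (assumed NUL-terminated here).
 * Returns count on success or a negative errno value on failure. */
ssize_t simple_child_storeme_store(struct config_item *item,
				   const char *page, size_t count)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(page, &end, 10);
	if (end == page || (*end != '\0' && *end != '\n'))
		return -EINVAL;
	if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return -ERANGE;

	to_simple_child(item)->storeme = (int)v;
	return (ssize_t)count;
}
```

Returning count from ->store tells the caller the entire buffer was
consumed, which matches the expectation that write(2) stores the whole
buffer at once.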

[struct configfs_bin_attribute]

	struct configfs_bin_attribute {
		struct configfs_attribute       cb_attr;
		void                            *cb_private;
		size_t                          cb_max_size;
	};

The binary attribute is used when one needs a binary blob to appear as
the contents of a file in the item's configfs directory.  To do so, add
the binary attribute to the NULL-terminated array
config_item_type->ct_bin_attrs.  When the item appears in configfs, the
attribute file will appear with the
configfs_bin_attribute->cb_attr.ca_name filename.
configfs_bin_attribute->cb_attr.ca_mode specifies the file permissions.
The cb_private member is provided for use by the driver, while the
cb_max_size member specifies the maximum amount of vmalloc buffer
to be used.

If a binary attribute is readable and the config_item provides a
ct_item_ops->read_bin_attribute() method, that method will be called
whenever userspace asks for a read(2) on the attribute.  The converse
will happen for write(2).  The reads/writes are buffered, so only a
single read/write will occur; the attribute need not concern itself
with it.

[struct config_group]

A config_item cannot live in a vacuum.  The only way one can be created
is via mkdir(2) on a config_group.  This will trigger creation of a
child item.

	struct config_group {
		struct config_item              cg_item;
		struct list_head                cg_children;
		struct configfs_subsystem       *cg_subsys;
		struct list_head                default_groups;
		struct list_head                group_entry;
	};

	void config_group_init(struct config_group *group);
	void config_group_init_type_name(struct config_group *group,
					 const char *name,
					 struct config_item_type *type);


The config_group structure contains a config_item.  Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups.  This is
accomplished via the group operations specified on the group's
config_item_type.

	struct configfs_group_operations {
		struct config_item *(*make_item)(struct config_group *group,
						 const char *name);
		struct config_group *(*make_group)(struct config_group *group,
						   const char *name);
		int (*commit_item)(struct config_item *item);
		void (*disconnect_notify)(struct config_group *group,
					  struct config_item *item);
		void (*drop_item)(struct config_group *group,
				  struct config_item *item);
	};

A group creates child items by providing the
ct_group_ops->make_item() method.  If provided, this method is called
from mkdir(2) in the group's directory.  The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs.  Configfs will then populate the filesystem
tree to reflect the new item.

If the subsystem wants the child to be a group itself, the subsystem
provides ct_group_ops->make_group().  Everything else behaves the same,
using the group _init() functions on the group.

Finally, when userspace calls rmdir(2) on the item or group,
ct_group_ops->drop_item() is called.  As a config_group is also a
config_item, a separate drop_group() method is not necessary.
The subsystem must config_item_put() the reference that was initialized
upon item allocation.  If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.

IMPORTANT: drop_item() is void, and as such cannot fail.  When rmdir(2)
is called, configfs WILL remove the item from the filesystem tree
(assuming that it has no children to keep it busy).  The subsystem is
responsible for responding to this.  If the subsystem has references to
the item in other threads, the memory is safe.  It may take some time
for the item to actually disappear from the subsystem's usage.  But it
is gone from configfs.

When drop_item() is called, the item's linkage has already been torn
down.  It no longer has a reference on its parent and has no place in
the item hierarchy.  If a client needs to do some cleanup before this
teardown happens, the subsystem can implement the
ct_group_ops->disconnect_notify() method.  The method is called after
configfs has removed the item from the filesystem view but before the
item is removed from its parent group.  Like drop_item(),
disconnect_notify() is void and cannot fail.  Client subsystems should
not drop any references here, as they still must do it in drop_item().

A config_group cannot be removed while it still has child items.  This
is implemented in the configfs rmdir(2) code.  ->drop_item() will not be
called, as the item has not been dropped.  rmdir(2) will fail, as the
directory is not empty.
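Seen from userspace, the rule that a group with children cannot be
removed shows up as an ordinary rmdir(2) failure.  The sketch below
uses hypothetical helper names (create_child(), remove_child(),
remove_group()) and assumes the failure is reported as ENOTEMPTY, or
EEXIST as POSIX also permits; the exact errno is an assumption, not
something stated above.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "dir/name" into out; returns 0, or -1 if it does not fit. */
static int join_path(char *out, size_t size, const char *dir,
		     const char *name)
{
	int n = snprintf(out, size, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= size) {
		errno = ENAMETOOLONG;
		return -1;
	}
	return 0;
}

/* mkdir(2) in the group's directory: the userspace trigger for
 * make_item() or make_group(). */
int create_child(const char *group_path, const char *name)
{
	char path[4096];
	if (join_path(path, sizeof(path), group_path, name) < 0)
		return -1;
	return mkdir(path, 0755);
}

/* rmdir(2) on a child: the userspace trigger for drop_item(). */
int remove_child(const char *group_path, const char *name)
{
	char path[4096];
	if (join_path(path, sizeof(path), group_path, name) < 0)
		return -1;
	return rmdir(path);
}

/* Try to remove the group itself.  Returns 0 on success, 1 if it was
 * refused because children remain, -1 on any other error. */
int remove_group(const char *group_path)
{
	if (rmdir(group_path) == 0)
		return 0;
	if (errno == ENOTEMPTY || errno == EEXIST)
		return 1;
	return -1;
}
```

A management tool would therefore remove children bottom-up, calling
remove_child() for each item before attempting remove_group().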

[struct configfs_subsystem]

A subsystem must register itself, usually at module_init time.  This
tells configfs to make the subsystem appear in the file tree.

	struct configfs_subsystem {
		struct config_group     su_group;
		struct mutex            su_mutex;
	};

	int configfs_register_subsystem(struct configfs_subsystem *subsys);
	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);

A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created.  For a subsystem,
this group is usually defined statically.  Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.
When the register call returns, the subsystem is live, and it
will be visible via configfs.  At that point, mkdir(2) can be called and
the subsystem must be ready for it.

[An Example]

The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in
samples/configfs/configfs_sample.c.  It shows a trivial object displaying
and storing an attribute, and a simple group creating and destroying
these children.

[Hierarchy Navigation and the Subsystem Mutex]

There is an extra bonus that configfs provides.  The config_groups and
config_items are arranged in a hierarchy due to the fact that they
appear in a filesystem.  A subsystem is NEVER to touch the filesystem
parts, but the subsystem might be interested in this hierarchy.  For
this reason, the hierarchy is mirrored via the config_group->cg_children
and config_item->ci_parent structure members.

A subsystem can navigate the cg_children list and the ci_parent pointer
to see the tree created by the subsystem.  This can race with configfs'
management of the hierarchy, so configfs uses the subsystem mutex to
protect modifications.  Whenever a subsystem wants to navigate the
hierarchy, it must do so under the protection of the subsystem
mutex.

A subsystem will be prevented from acquiring the mutex while a newly
allocated item has not been linked into this hierarchy.  Similarly, it
will not be able to acquire the mutex while a dropping item has not
yet been unlinked.  This means that an item's ci_parent pointer will
never be NULL while the item is in configfs, and that an item will only
be in its parent's cg_children list for the same duration.  This allows
a subsystem to trust ci_parent and cg_children while it holds the
mutex.

[Item Aggregation Via symlink(2)]

configfs provides a simple group via the group->item parent/child
relationship.  Often, however, a larger environment requires aggregation
outside of the parent/child connection.  This is implemented via
symlink(2).

A config_item may provide the ct_item_ops->allow_link() and
ct_item_ops->drop_link() methods.  If the ->allow_link() method exists,
symlink(2) may be called with the config_item as the source of the link.
These links are only allowed between configfs config_items.  Any
symlink(2) attempt outside the configfs filesystem will be denied.

When symlink(2) is called, the source config_item's ->allow_link()
method is called with itself and a target item.  If the source item
allows linking to target item, it returns 0.  A source item may wish to
reject a link if it only wants links to a certain type of object (say,
in its own subsystem).

When unlink(2) is called on the symbolic link, the source item is
notified via the ->drop_link() method.  Like the ->drop_item() method,
this is a void function and cannot return failure.  The subsystem is
responsible for responding to the change.

A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it.  Dangling symlinks are not
allowed in configfs.
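From userspace, aggregation is just symlink(2) and unlink(2) on paths
inside the mounted tree.  The following sketch assumes the link is
created inside the source item's directory and points at the target
item's directory; link_item() and unlink_item() are hypothetical helper
names, and the paths passed in are up to the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Build "dir/name" into out; returns 0, or -1 if it does not fit. */
static int join_path(char *out, size_t size, const char *dir,
		     const char *name)
{
	int n = snprintf(out, size, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= size) {
		errno = ENAMETOOLONG;
		return -1;
	}
	return 0;
}

/* Create src_item/link_name pointing at target_item.  On configfs this
 * is the call that ends up in the source item's ->allow_link(). */
int link_item(const char *src_item, const char *link_name,
	      const char *target_item)
{
	char path[4096];
	if (join_path(path, sizeof(path), src_item, link_name) < 0)
		return -1;
	return symlink(target_item, path);
}

/* Remove src_item/link_name.  On configfs this is the call that ends
 * up in the source item's ->drop_link(). */
int unlink_item(const char *src_item, const char *link_name)
{
	char path[4096];
	if (join_path(path, sizeof(path), src_item, link_name) < 0)
		return -1;
	return unlink(path);
}
```

Because an item cannot be removed while links exist, a tool should call
unlink_item() for every link it created before it attempts rmdir(2) on
either end.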

[Automatically Created Subgroups]

A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
more explicit to have a method whereby userspace sees this divergence.

Rather than have a group where some items behave differently than
others, configfs provides a method whereby one or many subgroups are
automatically created inside the parent at its creation.  Thus,
mkdir("parent") results in "parent", "parent/subgroup1", up through
"parent/subgroupN".  Items of type 1 can now be created in
"parent/subgroup1", and items of type N can be created in
"parent/subgroupN".

These automatic subgroups, or default groups, do not preclude other
children of the parent group.  If ct_group_ops->make_group() exists,
other child groups can be created on the parent group directly.

A configfs subsystem specifies default groups by adding them using the
configfs_add_default_group() function to the parent config_group
structure.  Each added group is populated in the configfs tree at the same
time as the parent group.  Similarly, they are removed at the same time
as the parent.  No extra notification is provided.  When a ->drop_item()
method call notifies the subsystem the parent group is going away, it
also means every default group child associated with that parent group.

As a consequence of this, default groups cannot be removed directly via
rmdir(2).  They also are not considered when rmdir(2) on the parent
group is checking for children.

[Dependent Subsystems]

Sometimes other drivers depend on particular configfs items.  For
example, ocfs2 mounts depend on a heartbeat region item.  If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
readonly.  Not happy.

configfs provides two additional API calls: configfs_depend_item() and
configfs_undepend_item().  A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on.  configfs will then return -EBUSY from rmdir(2) for that
item.  When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.

These APIs cannot be called underneath any configfs callbacks, as
they will conflict.  They can block and allocate.  A client driver
probably shouldn't be calling them of its own gumption.  Rather, it
should be providing an API that external subsystems call.

How does this work?  Imagine the ocfs2 mount process.  When it mounts,
it asks for a heartbeat region item.  This is done via a call into the
heartbeat code.  Inside the heartbeat code, the region item is looked
up.  Here, the heartbeat code calls configfs_depend_item().  If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
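The depend/undepend contract can be modelled in a few lines of
userspace C.  This is only a model of the behaviour described above,
not the configfs implementation: struct model_item, its fields, and the
model_*() functions are invented for illustration, and the -ENOENT
returned for an item that is already being torn down is an assumption.

```c
#include <errno.h>

/* Invented stand-in for an item as the dependency logic sees it. */
struct model_item {
	int dependent_count;	/* clients currently depending on it */
	int dropping;		/* set once removal has begun */
};

/* Model of configfs_depend_item(): fails if teardown has started. */
int model_depend_item(struct model_item *it)
{
	if (it->dropping)
		return -ENOENT;		/* assumed error code */
	it->dependent_count++;
	return 0;
}

/* Model of configfs_undepend_item(): release one dependency. */
void model_undepend_item(struct model_item *it)
{
	it->dependent_count--;
}

/* Model of rmdir(2) on the item: refused while it is depended on. */
int model_rmdir(struct model_item *it)
{
	if (it->dependent_count > 0)
		return -EBUSY;
	it->dropping = 1;
	return 0;
}
```

In the ocfs2 scenario, the heartbeat code would play the role of the
caller of model_depend_item() at mount time and of
model_undepend_item() at unmount time.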

[Committable Items]

NOTE: Committable items are currently unimplemented.

Some config_items cannot have a valid initial state.  That is, no
default values can be specified for the item's attributes such that the
item can do its work.  Userspace must configure one or more attributes,
after which the subsystem can start whatever entity this item
represents.

Consider the FakeNBD device from above.  Without a target address *and*
a target device, the subsystem has no idea what block device to import.
The simple example assumes that the subsystem merely waits until all the
appropriate attributes are configured, and then connects.  This will,
indeed, work, but now every attribute store must check if the attributes
are initialized.  Every attribute store must fire off the connection if
that condition is met.

Far better would be an explicit action notifying the subsystem that the
config_item is ready to go.  More importantly, an explicit action allows
the subsystem to provide feedback as to whether the attributes are
initialized in a way that makes sense.  configfs provides this as
committable items.

configfs still uses only normal filesystem operations.  An item is
committed via rename(2).  The item is moved from a directory where it
can be modified to a directory where it cannot.

Any group that provides the ct_group_ops->commit_item() method has
committable items.  When this group appears in configfs, mkdir(2) will
not work directly in the group.  Instead, the group will have two
subdirectories: "live" and "pending".  The "live" directory does not
support mkdir(2) or rmdir(2) either.  It only allows rename(2).  The
"pending" directory does allow mkdir(2) and rmdir(2).  An item is
created in the "pending" directory.  Its attributes can be modified at
will.  Userspace commits the item by renaming it into the "live"
directory.  At this point, the subsystem receives the ->commit_item()
callback.  If all required attributes are filled to satisfaction, the
method returns zero and the item is moved to the "live" directory.

As rmdir(2) does not work in the "live" directory, an item must be
shut down, or "uncommitted".  Again, this is done via rename(2), this
time from the "live" directory back to the "pending" one.  The subsystem
is notified by the ct_group_ops->uncommit_object() method.