configfs - Userspace-driven kernel object configuration.

Joel Becker <joel.becker@oracle.com>

Updated: 31 March 2005

Copyright (c) 2005 Oracle Corporation,
	Joel Becker <joel.becker@oracle.com>


[What is configfs?]

configfs is a ram-based filesystem that provides the converse of
sysfs's functionality. Where sysfs is a filesystem-based view of
kernel objects, configfs is a filesystem-based manager of kernel
objects, or config_items.

With sysfs, an object is created in kernel (for example, when a device
is discovered) and it is registered with sysfs. Its attributes then
appear in sysfs, allowing userspace to read the attributes via
readdir(3)/read(2). It may allow some attributes to be modified via
write(2). The important point is that the object is created and
destroyed in kernel, the kernel controls the lifecycle of the sysfs
representation, and sysfs is merely a window on all this.

A configfs config_item is created via an explicit userspace operation:
mkdir(2). It is destroyed via rmdir(2). The attributes appear at
mkdir(2) time, and can be read or modified via read(2) and write(2).
As with sysfs, readdir(3) queries the list of items and/or attributes.
symlink(2) can be used to group items together. Unlike sysfs, the
lifetime of the representation is completely driven by userspace. The
kernel modules backing the items must respond to this.

Both sysfs and configfs can and should exist together on the same
system. One is not a replacement for the other.

[Using configfs]

configfs can be compiled as a module or into the kernel. You can access
it by doing

	mount -t configfs none /config

The configfs tree will be empty unless client modules are also loaded.
These are modules that register their item types with configfs as
subsystems. Once a client subsystem is loaded, it will appear as a
subdirectory (or more than one) under /config. Like sysfs, the
configfs tree is always there, whether mounted on /config or not.

An item is created via mkdir(2). The item's attributes will also
appear at this time. readdir(3) can determine what the attributes are,
read(2) can query their default values, and write(2) can store new
values. Like sysfs, attributes should be ASCII text files, preferably
with only one value per file. The same efficiency caveats from sysfs
apply. Don't mix more than one attribute in one attribute file.

Like sysfs, configfs expects write(2) to store the entire buffer at
once. When writing to configfs attributes, userspace processes should
first read the entire file, modify the portions they wish to change, and
then write the entire buffer back. Attribute files have a maximum size
of one page (PAGE_SIZE, 4096 on i386).

When an item needs to be destroyed, remove it with rmdir(2). An
item cannot be destroyed if any other item has a link to it (via
symlink(2)). Links can be removed via unlink(2).
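
The read and write rules above can be sketched from the userspace side.
This is only an illustration: the helper names attr_read() and
attr_write() and the ATTR_MAX constant are made up for this example, and
the 4096-byte limit simply reuses the i386 page size mentioned above.
The key point is that a new value goes out in a single write(2) call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed limit for this sketch: one page, using the i386 value from the text. */
#define ATTR_MAX 4096

/* Read a whole attribute into buf and NUL-terminate it.
 * Returns the number of bytes read, or -errno on failure. */
static ssize_t attr_read(const char *path, char *buf, size_t buflen)
{
	ssize_t n;
	int fd;

	assert(buflen > 0);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;
	n = read(fd, buf, buflen - 1);
	if (n < 0)
		n = -errno;
	else
		buf[n] = '\0';
	close(fd);
	return n;
}

/* Store a new value with exactly one write(2) call.
 * Returns 0 on success, or -errno on failure. */
static int attr_write(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int fd, err = 0;

	if (len > ATTR_MAX)
		return -EFBIG;
	/* No O_CREAT: attribute files appear at mkdir(2) time. */
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	n = write(fd, value, len);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;	/* a short write would split the value */
	close(fd);
	return err;
}
```

A tool that changes only part of a value would call attr_read() first,
edit the buffer, and hand the whole buffer back to attr_write().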

[Configuring FakeNBD: an Example]

Imagine there's a Network Block Device (NBD) driver that allows you to
access remote block devices. Call it FakeNBD. FakeNBD uses configfs
for its configuration. Obviously, there will be a nice program that
sysadmins use to configure FakeNBD, but somehow that program has to tell
the driver about it. Here's where configfs comes in.

When the FakeNBD driver is loaded, it registers itself with configfs.
readdir(3) sees this just fine:

	# ls /config
	fakenbd

A fakenbd connection can be created with mkdir(2). The name is
arbitrary, but likely the tool will make some use of the name. Perhaps
it is a uuid or a disk name:

	# mkdir /config/fakenbd/disk1
	# ls /config/fakenbd/disk1
	target device rw

The target attribute contains the IP address of the server FakeNBD will
connect to. The device attribute is the device on the server.
Predictably, the rw attribute determines whether the connection is
read-only or read-write.

	# echo 10.0.0.1 > /config/fakenbd/disk1/target
	# echo /dev/sda1 > /config/fakenbd/disk1/device
	# echo 1 > /config/fakenbd/disk1/rw

That's it. That's all there is. Now the device is configured, via the
shell no less.

[Coding With configfs]

Every object in configfs is a config_item. A config_item reflects an
object in the subsystem. It has attributes that match values on that
object. configfs handles the filesystem representation of that object
and its attributes, allowing the subsystem to ignore all but the
basic show/store interaction.

Items are created and destroyed inside a config_group. A group is a
collection of items that share the same attributes and operations.
Items are created by mkdir(2) and removed by rmdir(2), but configfs
handles that. The group has a set of operations to perform these tasks.

A subsystem is the top level of a client module. During initialization,
the client module registers the subsystem with configfs, and the
subsystem then appears as a directory at the top of the configfs
filesystem. A subsystem is also a config_group, and can do everything a
config_group can.

[struct config_item]

	struct config_item {
		char                    *ci_name;
		char                    ci_namebuf[UOBJ_NAME_LEN];
		struct kref             ci_kref;
		struct list_head        ci_entry;
		struct config_item      *ci_parent;
		struct config_group     *ci_group;
		struct config_item_type *ci_type;
		struct dentry           *ci_dentry;
	};

	void config_item_init(struct config_item *);
	void config_item_init_type_name(struct config_item *,
					const char *name,
					struct config_item_type *type);
	struct config_item *config_item_get(struct config_item *);
	void config_item_put(struct config_item *);

Generally, struct config_item is embedded in a container structure, a
structure that actually represents what the subsystem is doing. The
config_item portion of that structure is how the object interacts with
configfs.

Whether statically defined in a source file or created by a parent
config_group, a config_item must have one of the _init() functions
called on it. This initializes the reference count and sets up the
appropriate fields.

All users of a config_item should have a reference on it via
config_item_get(), and drop the reference when they are done via
config_item_put().

By itself, a config_item cannot do much more than appear in configfs.
Usually a subsystem wants the item to display and/or store attributes,
among other things. For that, it needs a type.
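
The embedding and reference counting described above can be modelled in
plain userspace C. The sketch below does not use the kernel API: struct
mock_item, struct fakenbd_conn, to_conn() and the mock_item_*() helpers
are all invented stand-ins, and the release callback is stored on the
item itself only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for a config_item: a name and a reference count. */
struct mock_item {
	const char *name;
	int refcount;
	void (*release)(struct mock_item *item);
};

/* Hypothetical container: the object the subsystem really cares about. */
struct fakenbd_conn {
	int rw;
	struct mock_item item;	/* embedded, as the text describes */
};

/* Recover the container from a pointer to its embedded item. */
#define to_conn(ptr) \
	((struct fakenbd_conn *)((char *)(ptr) - offsetof(struct fakenbd_conn, item)))

static int conns_released;	/* lets a caller observe that release ran */

static void conn_release(struct mock_item *item)
{
	free(to_conn(item));
	conns_released++;
}

/* Plays the role of an _init() function: the creator holds one reference. */
static void mock_item_init(struct mock_item *item, const char *name,
			   void (*release)(struct mock_item *))
{
	item->name = name;
	item->refcount = 1;
	item->release = release;
}

static struct mock_item *mock_item_get(struct mock_item *item)
{
	assert(item->refcount > 0);
	item->refcount++;
	return item;
}

/* Drop one reference; the last put triggers release. */
static void mock_item_put(struct mock_item *item)
{
	assert(item->refcount > 0);
	if (--item->refcount == 0 && item->release)
		item->release(item);
}

/* Allocate a container and hand back its embedded item. */
static struct mock_item *conn_new(const char *name)
{
	struct fakenbd_conn *conn = calloc(1, sizeof(*conn));

	if (!conn)
		return NULL;
	mock_item_init(&conn->item, name, conn_release);
	return &conn->item;
}
```

Every extra user pairs a mock_item_get() with a mock_item_put(), so the
container is freed only after the last user is done with it.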

[struct config_item_type]

	struct configfs_item_operations {
		void (*release)(struct config_item *);
		ssize_t (*show_attribute)(struct config_item *,
					  struct configfs_attribute *,
					  char *);
		ssize_t (*store_attribute)(struct config_item *,
					   struct configfs_attribute *,
					   const char *, size_t);
		int (*allow_link)(struct config_item *src,
				  struct config_item *target);
		int (*drop_link)(struct config_item *src,
				 struct config_item *target);
	};

	struct config_item_type {
		struct module                           *ct_owner;
		struct configfs_item_operations         *ct_item_ops;
		struct configfs_group_operations        *ct_group_ops;
		struct configfs_attribute               **ct_attrs;
	};

The most basic function of a config_item_type is to define what
operations can be performed on a config_item. All items that have been
allocated dynamically will need to provide the ct_item_ops->release()
method. This method is called when the config_item's reference count
reaches zero. Items that wish to display an attribute need to provide
the ct_item_ops->show_attribute() method. Similarly, storing a new
attribute value uses the store_attribute() method.

[struct configfs_attribute]

	struct configfs_attribute {
		char                    *ca_name;
		struct module           *ca_owner;
		umode_t                 ca_mode;
	};

When a config_item wants an attribute to appear as a file in the item's
configfs directory, it must define a configfs_attribute describing it.
It then adds the attribute to the NULL-terminated array
config_item_type->ct_attrs. When the item appears in configfs, the
attribute file will appear with the configfs_attribute->ca_name
filename. configfs_attribute->ca_mode specifies the file permissions.

If an attribute is readable and the config_item provides a
ct_item_ops->show_attribute() method, that method will be called
whenever userspace asks for a read(2) on the attribute. The converse
will happen for write(2).
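
A userspace model may make the dispatch clearer: one show method and one
store method serve every attribute of an item, and each switches on the
attribute it is handed. Everything named mock_* or conn_* below is
invented for this sketch, as are the attribute names (borrowed from the
FakeNBD example) and the validation rules; none of it is the kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Stand-in for configfs_attribute: a file name and permission bits. */
struct mock_attr {
	const char *name;
	unsigned int mode;
};

static struct mock_attr attr_target = { "target", 0644 };
static struct mock_attr attr_rw = { "rw", 0644 };

/* NULL-terminated, like config_item_type->ct_attrs. */
static struct mock_attr *conn_attrs[] = { &attr_target, &attr_rw, NULL };

struct mock_conn {
	char target[64];
	int rw;
};

/* Map a file name to its attribute, as a lookup in the item directory would. */
static struct mock_attr *find_attr(struct mock_attr **attrs, const char *name)
{
	assert(attrs != NULL);
	for (; *attrs; attrs++)
		if (strcmp((*attrs)->name, name) == 0)
			return *attrs;
	return NULL;
}

/* Format the requested attribute into page; returns the byte count. */
static ssize_t conn_show(struct mock_conn *c, struct mock_attr *a,
			 char *page, size_t page_len)
{
	if (a == &attr_rw)
		return snprintf(page, page_len, "%d\n", c->rw);
	if (a == &attr_target)
		return snprintf(page, page_len, "%s\n", c->target);
	return -EINVAL;
}

/* Parse and store a new value; page is assumed to be NUL-terminated. */
static ssize_t conn_store(struct mock_conn *c, struct mock_attr *a,
			  const char *page, size_t count)
{
	if (a == &attr_rw) {
		char *end;
		long v = strtol(page, &end, 10);

		if (end == page || (v != 0 && v != 1))
			return -EINVAL;
		c->rw = (int)v;
		return (ssize_t)count;
	}
	if (a == &attr_target) {
		size_t len = count;

		if (len > 0 && page[len - 1] == '\n')
			len--;	/* drop the newline added by echo */
		if (len >= sizeof(c->target))
			return -EINVAL;
		memcpy(c->target, page, len);
		c->target[len] = '\0';
		return (ssize_t)count;
	}
	return -EINVAL;
}
```

Returning the full count from a successful store tells the caller that
the whole buffer was consumed in one go.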

[struct config_group]

A config_item cannot live in a vacuum. The only way one can be created
is via mkdir(2) on a config_group. This will trigger creation of a
child item.

	struct config_group {
		struct config_item              cg_item;
		struct list_head                cg_children;
		struct configfs_subsystem       *cg_subsys;
		struct config_group             **default_groups;
	};

	void config_group_init(struct config_group *group);
	void config_group_init_type_name(struct config_group *group,
					 const char *name,
					 struct config_item_type *type);


The config_group structure contains a config_item. Properly configuring
that item means that a group can behave as an item in its own right.
However, it can do more: it can create child items or groups. This is
accomplished via the group operations specified on the group's
config_item_type.

	struct configfs_group_operations {
		struct config_item *(*make_item)(struct config_group *group,
						 const char *name);
		struct config_group *(*make_group)(struct config_group *group,
						   const char *name);
		int (*commit_item)(struct config_item *item);
		void (*disconnect_notify)(struct config_group *group,
					  struct config_item *item);
		void (*drop_item)(struct config_group *group,
				  struct config_item *item);
	};

A group creates child items by providing the
ct_group_ops->make_item() method. If provided, this method is called
from mkdir(2) in the group's directory. The subsystem allocates a new
config_item (or more likely, its container structure), initializes it,
and returns it to configfs. Configfs will then populate the filesystem
tree to reflect the new item.

If the subsystem wants the child to be a group itself, the subsystem
provides ct_group_ops->make_group(). Everything else behaves the same,
using the group _init() functions on the group.

Finally, when userspace calls rmdir(2) on the item or group,
ct_group_ops->drop_item() is called. As a config_group is also a
config_item, a separate drop_group() method is not necessary.
The subsystem must config_item_put() the reference that was initialized
upon item allocation. If a subsystem has no work to do, it may omit
the ct_group_ops->drop_item() method, and configfs will call
config_item_put() on the item on behalf of the subsystem.

IMPORTANT: drop_item() is void, and as such cannot fail. When rmdir(2)
is called, configfs WILL remove the item from the filesystem tree
(assuming that it has no children to keep it busy). The subsystem is
responsible for responding to this. If the subsystem has references to
the item in other threads, the memory is safe. It may take some time
for the item to actually disappear from the subsystem's usage. But it
is gone from configfs.

When drop_item() is called, the item's linkage has already been torn
down. It no longer has a reference on its parent and has no place in
the item hierarchy. If a client needs to do some cleanup before this
teardown happens, the subsystem can implement the
ct_group_ops->disconnect_notify() method. The method is called after
configfs has removed the item from the filesystem view but before the
item is removed from its parent group. Like drop_item(),
disconnect_notify() is void and cannot fail. Client subsystems should
not drop any references here, as they still must do it in drop_item().

A config_group cannot be removed while it still has child items. This
is implemented in the configfs rmdir(2) code. ->drop_item() will not be
called, as the item has not been dropped. rmdir(2) will fail, as the
directory is not empty.
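
The mkdir(2)/rmdir(2) flow through a group can be modelled in userspace
as follows. This is a rough sketch rather than the configfs
implementation: the mock_* types and functions, the fixed-size child
array, and the particular errno values are assumptions chosen for the
example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 8	/* arbitrary limit for the sketch */

struct mock_item {
	char name[32];
	int refcount;
};

struct mock_group;

/* Stand-in for the make_item/drop_item pair of group operations. */
struct mock_group_ops {
	struct mock_item *(*make_item)(struct mock_group *g, const char *name);
	void (*drop_item)(struct mock_group *g, struct mock_item *item);
};

struct mock_group {
	struct mock_item *children[MAX_CHILDREN];
	int nr_children;
	const struct mock_group_ops *ops;
};

static int items_dropped;	/* lets a caller observe drop_item calls */

static void mock_item_put(struct mock_item *item)
{
	assert(item->refcount > 0);
	if (--item->refcount == 0)
		free(item);
}

/* make_item: allocate, initialize, and return the new item. */
static struct mock_item *simple_make_item(struct mock_group *g, const char *name)
{
	struct mock_item *item = calloc(1, sizeof(*item));

	(void)g;
	if (!item)
		return NULL;
	strncpy(item->name, name, sizeof(item->name) - 1);
	item->refcount = 1;	/* the reference dropped in drop_item */
	return item;
}

/* drop_item: returns void, so it cannot veto the removal. */
static void simple_drop_item(struct mock_group *g, struct mock_item *item)
{
	(void)g;
	items_dropped++;
	mock_item_put(item);
}

static const struct mock_group_ops simple_ops = {
	simple_make_item, simple_drop_item
};

/* mkdir(2) in the group's directory. */
static int mock_mkdir(struct mock_group *g, const char *name)
{
	struct mock_item *item;

	if (!g->ops || !g->ops->make_item)
		return -EPERM;
	if (g->nr_children == MAX_CHILDREN)
		return -ENOSPC;
	item = g->ops->make_item(g, name);
	if (!item)
		return -ENOMEM;
	g->children[g->nr_children++] = item;
	return 0;
}

/* rmdir(2) on a child: unlink it first, then notify the subsystem. */
static int mock_rmdir(struct mock_group *g, const char *name)
{
	int i;

	for (i = 0; i < g->nr_children; i++) {
		struct mock_item *item = g->children[i];

		if (strcmp(item->name, name) != 0)
			continue;
		g->children[i] = g->children[--g->nr_children];
		if (g->ops && g->ops->drop_item)
			g->ops->drop_item(g, item);
		else
			mock_item_put(item);	/* default when drop_item is omitted */
		return 0;
	}
	return -ENOENT;
}

/* rmdir(2) on the group itself fails while children remain. */
static int mock_rmdir_group(struct mock_group *g)
{
	return g->nr_children ? -ENOTEMPTY : 0;
}
```

Note the ordering in mock_rmdir(): the child is already out of the
group's list by the time the drop callback runs, which mirrors the
statement that the linkage is torn down before drop_item() is called.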

[struct configfs_subsystem]

A subsystem must register itself, usually at module_init time. This
tells configfs to make the subsystem appear in the file tree.

	struct configfs_subsystem {
		struct config_group     su_group;
		struct mutex            su_mutex;
	};

	int configfs_register_subsystem(struct configfs_subsystem *subsys);
	void configfs_unregister_subsystem(struct configfs_subsystem *subsys);

A subsystem consists of a toplevel config_group and a mutex.
The group is where child config_items are created. For a subsystem,
this group is usually defined statically. Before calling
configfs_register_subsystem(), the subsystem must have initialized the
group via the usual group _init() functions, and it must also have
initialized the mutex.
When the register call returns, the subsystem is live, and it
will be visible via configfs. At that point, mkdir(2) can be called and
the subsystem must be ready for it.

[An Example]

The best example of these basic concepts is the simple_children
subsystem/group and the simple_child item in configfs_example_explicit.c
and configfs_example_macros.c. It shows a trivial object displaying and
storing an attribute, and a simple group creating and destroying these
children.

The only difference between configfs_example_explicit.c and
configfs_example_macros.c is how the attributes of the childless item
are defined. The childless item has extended attributes, each with
its own show()/store() operation. This follows a convention commonly
used in sysfs. configfs_example_explicit.c creates these attributes
by explicitly defining the structures involved. Conversely,
configfs_example_macros.c uses some convenience macros from configfs.h
to define the attributes. These macros are similar to their sysfs
counterparts.

[Hierarchy Navigation and the Subsystem Mutex]

There is an extra bonus that configfs provides. The config_groups and
config_items are arranged in a hierarchy due to the fact that they
appear in a filesystem. A subsystem is NEVER to touch the filesystem
parts, but the subsystem might be interested in this hierarchy. For
this reason, the hierarchy is mirrored via the config_group->cg_children
and config_item->ci_parent structure members.

A subsystem can navigate the cg_children list and the ci_parent pointer
to see the tree created by the subsystem. This can race with configfs'
management of the hierarchy, so configfs uses the subsystem mutex to
protect modifications. Whenever a subsystem wants to navigate the
hierarchy, it must do so under the protection of the subsystem
mutex.

A subsystem will be prevented from acquiring the mutex while a newly
allocated item has not been linked into this hierarchy. Similarly, it
will not be able to acquire the mutex while a dropping item has not
yet been unlinked. This means that an item's ci_parent pointer will
never be NULL while the item is in configfs, and that an item will only
be in its parent's cg_children list for the same duration. This allows
a subsystem to trust ci_parent and cg_children while it holds the
mutex.
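
The locking rule can be illustrated with a small userspace model that
uses a pthread mutex in place of the subsystem mutex. The mock_* types
and functions are invented for this sketch, and a singly linked list
stands in for the cg_children list; the real structures differ.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct mock_group;

struct mock_item {
	const char *name;
	struct mock_group *parent;	/* mirrors ci_parent */
	struct mock_item *next;		/* stand-in for the cg_children linkage */
};

struct mock_group {
	struct mock_item *children;
};

struct mock_subsystem {
	struct mock_group group;
	pthread_mutex_t mutex;		/* plays the role of su_mutex */
};

/* What the filesystem side does: link a new item while holding the mutex,
 * so no walker ever sees a half-linked item. */
static void mock_link(struct mock_subsystem *s, struct mock_item *item)
{
	pthread_mutex_lock(&s->mutex);
	item->parent = &s->group;
	item->next = s->group.children;
	s->group.children = item;
	pthread_mutex_unlock(&s->mutex);
}

/* What a subsystem does: walk the children only under the mutex. */
static int count_children(struct mock_subsystem *s)
{
	struct mock_item *it;
	int n = 0;

	pthread_mutex_lock(&s->mutex);
	for (it = s->group.children; it; it = it->next) {
		/* Safe to trust the parent pointer while the mutex is held. */
		assert(it->parent == &s->group);
		n++;
	}
	pthread_mutex_unlock(&s->mutex);
	return n;
}
```

Because both the linker and the walker take the same mutex, the walker
sees each item either fully linked or not at all.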

[Item Aggregation Via symlink(2)]

configfs provides a simple group via the group->item parent/child
relationship. Often, however, a larger environment requires aggregation
outside of the parent/child connection. This is implemented via
symlink(2).

A config_item may provide the ct_item_ops->allow_link() and
ct_item_ops->drop_link() methods. If the ->allow_link() method exists,
symlink(2) may be called with the config_item as the source of the link.
These links are only allowed between configfs config_items. Any
symlink(2) attempt outside the configfs filesystem will be denied.

When symlink(2) is called, the source config_item's ->allow_link()
method is called with itself and a target item. If the source item
allows linking to target item, it returns 0. A source item may wish to
reject a link if it only wants links to a certain type of object (say,
in its own subsystem).

When unlink(2) is called on the symbolic link, the source item is
notified via the ->drop_link() method. Like the ->drop_item() method,
this is a void function and cannot return failure. The subsystem is
responsible for responding to the change.

A config_item cannot be removed while it links to any other item, nor
can it be removed while an item links to it. Dangling symlinks are not
allowed in configfs.
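
A userspace model of the link rules might look like the following. The
mock_* names, the subsys field, and the errno values (-EPERM for a
refused link, -EBUSY for a blocked removal) are assumptions made for
this sketch and are not taken from the configfs code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct mock_item {
	const char *subsys;	/* owning subsystem name (hypothetical field) */
	int links_out;		/* symlinks this item is the source of */
	int links_in;		/* symlinks that point at this item */
	int (*allow_link)(struct mock_item *src, struct mock_item *target);
};

/* An allow_link policy: accept targets from the same subsystem only. */
static int same_subsys_only(struct mock_item *src, struct mock_item *target)
{
	return strcmp(src->subsys, target->subsys) == 0 ? 0 : -EPERM;
}

/* symlink(2): the source item decides whether the link may exist. */
static int mock_symlink(struct mock_item *src, struct mock_item *target)
{
	int ret;

	if (!src->allow_link)
		return -EPERM;	/* no method, no links from this item */
	ret = src->allow_link(src, target);
	if (ret)
		return ret;
	src->links_out++;
	target->links_in++;
	return 0;
}

/* unlink(2) on the symlink: cannot fail, bookkeeping only. */
static void mock_unlink(struct mock_item *src, struct mock_item *target)
{
	assert(src->links_out > 0 && target->links_in > 0);
	src->links_out--;
	target->links_in--;
}

/* rmdir(2): refused while the item is either end of any link. */
static int mock_rmdir(struct mock_item *item)
{
	return (item->links_in || item->links_out) ? -EBUSY : 0;
}
```

Counting links on both ends is what keeps a symlink from ever dangling
in this model: neither end can go away first.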

[Automatically Created Subgroups]

A new config_group may want to have two types of child config_items.
While this could be codified by magic names in ->make_item(), it is much
more explicit to have a method whereby userspace sees this divergence.

Rather than have a group where some items behave differently than
others, configfs provides a method whereby one or many subgroups are
automatically created inside the parent at its creation. Thus,
mkdir("parent") results in "parent", "parent/subgroup1", up through
"parent/subgroupN". Items of type 1 can now be created in
"parent/subgroup1", and items of type N can be created in
"parent/subgroupN".

These automatic subgroups, or default groups, do not preclude other
children of the parent group. If ct_group_ops->make_group() exists,
other child groups can be created on the parent group directly.

A configfs subsystem specifies default groups by filling in the
NULL-terminated array default_groups on the config_group structure.
Each group in that array is populated in the configfs tree at the same
time as the parent group. Similarly, they are removed at the same time
as the parent. No extra notification is provided. When a ->drop_item()
method call notifies the subsystem the parent group is going away, it
also means every default group child associated with that parent group.

As a consequence of this, default_groups cannot be removed directly via
rmdir(2). They also are not considered when rmdir(2) on the parent
group is checking for children.
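
The effect of a NULL-terminated default_groups array can be shown with a
short userspace model. struct mock_group, populate() and PATH_LEN are
invented for this sketch; populate() only records the directory names
that would appear, it does not touch any filesystem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PATH_LEN 64	/* arbitrary path buffer size for the sketch */

struct mock_group {
	const char *name;
	struct mock_group **default_groups;	/* NULL-terminated, or NULL */
};

/*
 * Record every directory that appears when group g is created under
 * prefix: the group itself first, then each default group, recursively.
 * Returns the number of paths written to out (at most max).
 */
static int populate(const struct mock_group *g, const char *prefix,
		    char out[][PATH_LEN], int max)
{
	int n, i;

	assert(g->name != NULL);
	if (max <= 0)
		return 0;
	snprintf(out[0], PATH_LEN, "%s%s%s", prefix, *prefix ? "/" : "", g->name);
	n = 1;
	for (i = 0; g->default_groups && g->default_groups[i]; i++)
		n += populate(g->default_groups[i], out[0], out + n, max - n);
	return n;
}
```

A single creation of the parent therefore yields the parent plus one
directory per entry in the array, with no further mkdir(2) calls.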

[Dependent Subsystems]

Sometimes other drivers depend on particular configfs items. For
example, ocfs2 mounts depend on a heartbeat region item. If that
region item is removed with rmdir(2), the ocfs2 mount must BUG or go
read-only. Not happy.

configfs provides two additional API calls: configfs_depend_item() and
configfs_undepend_item(). A client driver can call
configfs_depend_item() on an existing item to tell configfs that it is
depended on. configfs will then return -EBUSY from rmdir(2) for that
item. When the item is no longer depended on, the client driver calls
configfs_undepend_item() on it.

These APIs cannot be called underneath any configfs callbacks, as
they will conflict. They can block and allocate. A client driver
probably shouldn't be calling them of its own gumption. Rather, it
should be providing an API that external subsystems call.

How does this work? Imagine the ocfs2 mount process. When it mounts,
it asks for a heartbeat region item. This is done via a call into the
heartbeat code. Inside the heartbeat code, the region item is looked
up. Here, the heartbeat code calls configfs_depend_item(). If it
succeeds, then heartbeat knows the region is safe to give to ocfs2.
If it fails, it was being torn down anyway, and heartbeat can gracefully
pass up an error.
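
The interplay between dependencies and rmdir(2) can be modelled in a few
lines of userspace C. The mock_* functions below are stand-ins invented
for this sketch; they take a bare item, unlike the real calls, and the
-ENOENT result for an item that is already gone is an assumption.

```c
#include <assert.h>
#include <errno.h>

struct mock_item {
	int dependents;	/* outstanding depend calls */
	int removed;	/* set once rmdir(2) has succeeded */
};

/* Pin the item; fails if it has already been torn down. */
static int mock_depend_item(struct mock_item *item)
{
	if (item->removed)
		return -ENOENT;
	item->dependents++;
	return 0;
}

/* Release one pin taken by mock_depend_item(). */
static void mock_undepend_item(struct mock_item *item)
{
	assert(item->dependents > 0);
	item->dependents--;
}

/* rmdir(2): refused with -EBUSY while anyone depends on the item. */
static int mock_rmdir(struct mock_item *item)
{
	if (item->removed)
		return -ENOENT;
	if (item->dependents)
		return -EBUSY;
	item->removed = 1;
	return 0;
}
```

In the ocfs2 scenario above, the heartbeat code would take the pin at
mount time and release it at unmount, so the region item cannot vanish
in between.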

[Committable Items]

NOTE: Committable items are currently unimplemented.

Some config_items cannot have a valid initial state. That is, no
default values can be specified for the item's attributes such that the
item can do its work. Userspace must configure one or more attributes,
after which the subsystem can start whatever entity this item
represents.

Consider the FakeNBD device from above. Without a target address *and*
a target device, the subsystem has no idea what block device to import.
The simple example assumes that the subsystem merely waits until all the
appropriate attributes are configured, and then connects. This will,
indeed, work, but now every attribute store must check if the attributes
are initialized. Every attribute store must fire off the connection if
that condition is met.

Far better would be an explicit action notifying the subsystem that the
config_item is ready to go. More importantly, an explicit action allows
the subsystem to provide feedback as to whether the attributes are
initialized in a way that makes sense. configfs provides this as
committable items.

configfs still uses only normal filesystem operations. An item is
committed via rename(2). The item is moved from a directory where it
can be modified to a directory where it cannot.

Any group that provides the ct_group_ops->commit_item() method has
committable items. When this group appears in configfs, mkdir(2) will
not work directly in the group. Instead, the group will have two
subdirectories: "live" and "pending". The "live" directory does not
support mkdir(2) or rmdir(2) either. It only allows rename(2). The
"pending" directory does allow mkdir(2) and rmdir(2). An item is
created in the "pending" directory. Its attributes can be modified at
will. Userspace commits the item by renaming it into the "live"
directory. At this point, the subsystem receives the ->commit_item()
callback. If all required attributes are filled to satisfaction, the
method returns zero and the item is moved to the "live" directory.

As rmdir(2) does not work in the "live" directory, an item must be
shut down, or "uncommitted". Again, this is done via rename(2), this
time from the "live" directory back to the "pending" one. The subsystem
is notified by the ct_group_ops->uncommit_object() method.
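
Since committable items are unimplemented, the following is purely a
userspace thought experiment on how the pending/live transition could
behave for the FakeNBD example. Every name in it (the mock_* functions,
fakenbd_commit_item(), the state enum) and every errno choice is an
assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

enum mock_state { MOCK_PENDING, MOCK_LIVE };

struct fakenbd_conn {
	char target[64];
	char device[64];
	enum mock_state state;
};

/* commit_item() stand-in: accept only if both required attributes are set. */
static int fakenbd_commit_item(struct fakenbd_conn *c)
{
	if (c->target[0] == '\0' || c->device[0] == '\0')
		return -EINVAL;
	return 0;
}

/* Attribute store: allowed only while the item is still pending. */
static int mock_store(struct fakenbd_conn *c, char *field, const char *value)
{
	assert(field == c->target || field == c->device);
	if (c->state != MOCK_PENDING)
		return -EPERM;
	if (strlen(value) >= sizeof(c->target))
		return -EINVAL;
	strcpy(field, value);
	return 0;
}

/* rename(2) from "pending" to "live": moves only if the commit succeeds. */
static int mock_commit(struct fakenbd_conn *c)
{
	int ret;

	if (c->state != MOCK_PENDING)
		return -EINVAL;
	ret = fakenbd_commit_item(c);
	if (ret)
		return ret;	/* item stays in "pending" */
	c->state = MOCK_LIVE;
	return 0;
}

/* rename(2) from "live" back to "pending". */
static int mock_uncommit(struct fakenbd_conn *c)
{
	if (c->state != MOCK_LIVE)
		return -EINVAL;
	c->state = MOCK_PENDING;
	return 0;
}
```

The validation runs once, at commit time, instead of in every attribute
store, which is the benefit the text argues for.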