| 1 | Everything you never wanted to know about kobjects, ksets, and ktypes |
| 2 | |
| 3 | Greg Kroah-Hartman <gregkh@suse.de> |
| 4 | |
| 5 | Based on an original article by Jon Corbet for lwn.net written October 1, |
| 6 | 2003 and located at http://lwn.net/Articles/51437/ |
| 7 | |
| 8 | Last updated December 19, 2007 |
| 9 | |
| 10 | |
| 11 | Part of the difficulty in understanding the driver model - and the kobject |
| 12 | abstraction upon which it is built - is that there is no obvious starting |
| 13 | place. Dealing with kobjects requires understanding a few different types, |
| 14 | all of which make reference to each other. In an attempt to make things |
| 15 | easier, we'll take a multi-pass approach, starting with vague terms and |
| 16 | adding detail as we go. To that end, here are some quick definitions of |
| 17 | some terms we will be working with. |
| 18 | |
| 19 | - A kobject is an object of type struct kobject. Kobjects have a name |
| 20 | and a reference count. A kobject also has a parent pointer (allowing |
| 21 | objects to be arranged into hierarchies), a specific type, and, |
| 22 | usually, a representation in the sysfs virtual filesystem. |
| 23 | |
| 24 | Kobjects are generally not interesting on their own; instead, they are |
| 25 | usually embedded within some other structure which contains the stuff |
| 26 | the code is really interested in. |
| 27 | |
| 28 | No structure should EVER have more than one kobject embedded within it. |
| 29 | If it does, the reference counting for the object is sure to be messed |
| 30 | up and incorrect, and your code will be buggy. So do not do this. |
| 31 | |
| 32 | - A ktype is the type of object that embeds a kobject. Every structure |
| 33 | that embeds a kobject needs a corresponding ktype. The ktype controls |
| 34 | what happens to the kobject when it is created and destroyed. |
| 35 | |
| 36 | - A kset is a group of kobjects. These kobjects can be of the same ktype |
| 37 | or belong to different ktypes. The kset is the basic container type for |
| 38 | collections of kobjects. Ksets contain their own kobjects, but you can |
| 39 | safely ignore that implementation detail as the kset core code handles |
| 40 | this kobject automatically. |
| 41 | |
| 42 | When you see a sysfs directory full of other directories, generally each |
| 43 | of those directories corresponds to a kobject in the same kset. |
| 44 | |
| 45 | We'll look at how to create and manipulate all of these types. A bottom-up |
| 46 | approach will be taken, so we'll go back to kobjects. |
| 47 | |
| 48 | |
| 49 | Embedding kobjects |
| 50 | |
| 51 | It is rare for kernel code to create a standalone kobject, with one major |
| 52 | exception explained below. Instead, kobjects are used to control access to |
| 53 | a larger, domain-specific object. To this end, kobjects will be found |
| 54 | embedded in other structures. If you are used to thinking of things in |
| 55 | object-oriented terms, kobjects can be seen as a top-level, abstract class |
| 56 | from which other classes are derived. A kobject implements a set of |
| 57 | capabilities which are not particularly useful by themselves, but which are |
| 58 | nice to have in other objects. The C language does not allow for the |
| 59 | direct expression of inheritance, so other techniques - such as structure |
| 60 | embedding - must be used. |
| 61 | |
| 62 | So, for example, the UIO code has a structure that defines the memory |
| 63 | region associated with a uio device: |
| 64 | |
| 65 | struct uio_mem { |
| 66 |     struct kobject kobj; |
| 67 |     unsigned long addr; |
| 68 |     unsigned long size; |
| 69 |     int memtype; |
| 70 |     void __iomem *internal_addr; |
| 71 | }; |
| 72 | |
| 73 | If you have a struct uio_mem structure, finding its embedded kobject is |
| 74 | just a matter of using the kobj member. Code that works with kobjects will |
| 75 | often have the opposite problem, however: given a struct kobject pointer, |
| 76 | what is the pointer to the containing structure? You must avoid tricks |
| 77 | (such as assuming that the kobject is at the beginning of the structure) |
| 78 | and, instead, use the container_of() macro, found in <linux/kernel.h>: |
| 79 | |
| 80 | container_of(pointer, type, member) |
| 81 | |
| 82 | where pointer is the pointer to the embedded kobject, type is the type of |
| 83 | the containing structure, and member is the name of the structure field to |
| 84 | which pointer points. The return value from container_of() is a pointer to |
| 85 | the given type. So, for example, a pointer "kp" to a struct kobject |
| 86 | embedded within a struct uio_mem could be converted to a pointer to the |
| 87 | containing uio_mem structure with: |
| 88 | |
| 89 | struct uio_mem *u_mem = container_of(kp, struct uio_mem, kobj); |
| 90 | |
| 91 | Programmers often define a simple macro for "back-casting" kobject pointers |
| 92 | to the containing type. |
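As a minimal sketch of such a back-casting macro, the following compiles in
userspace. It assumes a simplified container_of() built on offsetof() (the
kernel's version adds type checking), a stand-in struct kobject, and a
trimmed-down struct uio_mem in which the kobject is deliberately not the first
member. The names to_uio_mem() and uio_mem_size_of() are hypothetical.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for struct kobject so the sketch builds outside the kernel. */
struct kobject {
	const char *name;
};

/* Trimmed-down uio_mem; kobj is placed mid-structure on purpose. */
struct uio_mem {
	unsigned long addr;
	unsigned long size;
	struct kobject kobj;
	int memtype;
};

/* Hypothetical back-casting helper for this one containing type. */
#define to_uio_mem(kp) container_of(kp, struct uio_mem, kobj)

/* Return the size of the memory region that owns the given kobject. */
static unsigned long uio_mem_size_of(struct kobject *kp)
{
	return to_uio_mem(kp)->size;
}
```

Because the macro subtracts the member offset rather than casting the pointer
directly, it keeps working if fields are later added in front of kobj.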
| 93 | |
| 94 | |
| 95 | Initialization of kobjects |
| 96 | |
| 97 | Code which creates a kobject must, of course, initialize that object. Some |
| 98 | of the internal fields are set up with a (mandatory) call to kobject_init(): |
| 99 | |
| 100 | void kobject_init(struct kobject *kobj, struct kobj_type *ktype); |
| 101 | |
| 102 | The ktype is required for a kobject to be created properly, as every kobject |
| 103 | must have an associated kobj_type. After calling kobject_init(), to |
| 104 | register the kobject with sysfs, the function kobject_add() must be called: |
| 105 | |
| 106 | int kobject_add(struct kobject *kobj, struct kobject *parent, const char *fmt, ...); |
| 107 | |
| 108 | This sets up the parent of the kobject and the name for the kobject |
| 109 | properly. If the kobject is to be associated with a specific kset, |
| 110 | kobj->kset must be assigned before calling kobject_add(). If a kset is |
| 111 | associated with a kobject, then the parent for the kobject can be set to |
| 112 | NULL in the call to kobject_add() and then the kobject's parent will be the |
| 113 | kset itself. |
| 114 | |
| 115 | As the name of the kobject is set when it is added to the kernel, the name |
| 116 | of the kobject should never be manipulated directly. If you must change |
| 117 | the name of the kobject, call kobject_rename(): |
| 118 | |
| 119 | int kobject_rename(struct kobject *kobj, const char *new_name); |
| 120 | |
| 121 | There is a function called kobject_set_name() but that is legacy cruft and |
| 122 | is being removed. If your code needs to call this function, it is |
| 123 | incorrect and needs to be fixed. |
| 124 | |
| 125 | To properly access the name of the kobject, use the function |
| 126 | kobject_name(): |
| 127 | |
| 128 | const char *kobject_name(const struct kobject *kobj); |
| 129 | |
| 130 | There is a helper function to both initialize and add the kobject to the |
| 131 | kernel at the same time, called, surprisingly enough, kobject_init_and_add(): |
| 132 | |
| 133 | int kobject_init_and_add(struct kobject *kobj, struct kobj_type *ktype, |
| 134 | struct kobject *parent, const char *fmt, ...); |
| 135 | |
| 136 | The arguments are the same as the individual kobject_init() and |
| 137 | kobject_add() functions described above. |
| 138 | |
| 139 | |
| 140 | Uevents |
| 141 | |
| 142 | After a kobject has been registered with the kobject core, you need to |
| 143 | announce to the world that it has been created. This can be done with a |
| 144 | call to kobject_uevent(): |
| 145 | |
| 146 | int kobject_uevent(struct kobject *kobj, enum kobject_action action); |
| 147 | |
| 148 | Use the KOBJ_ADD action for when the kobject is first added to the kernel. |
| 149 | This should be done only after any attributes or children of the kobject |
| 150 | have been initialized properly, as userspace will instantly start to look |
| 151 | for them when this call happens. |
| 152 | |
| 153 | When the kobject is removed from the kernel (details on how to do that are |
| 154 | below), the uevent for KOBJ_REMOVE will be automatically created by the |
| 155 | kobject core, so the caller does not have to worry about doing that by |
| 156 | hand. |
| 157 | |
| 158 | |
| 159 | Reference counts |
| 160 | |
| 161 | One of the key functions of a kobject is to serve as a reference counter |
| 162 | for the object in which it is embedded. As long as references to the object |
| 163 | exist, the object (and the code which supports it) must continue to exist. |
| 164 | The low-level functions for manipulating a kobject's reference counts are: |
| 165 | |
| 166 | struct kobject *kobject_get(struct kobject *kobj); |
| 167 | void kobject_put(struct kobject *kobj); |
| 168 | |
| 169 | A successful call to kobject_get() will increment the kobject's reference |
| 170 | counter and return the pointer to the kobject. |
| 171 | |
| 172 | When a reference is released, the call to kobject_put() will decrement the |
| 173 | reference count and, possibly, free the object. Note that kobject_init() |
| 174 | sets the reference count to one, so the code which sets up the kobject will |
| 175 | need to do a kobject_put() eventually to release that reference. |
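The counting rules can be illustrated with a toy model. This is only a sketch:
struct toy_kobject and the toy_* functions are hypothetical names, the counter
is a plain int with no locking or atomics, and nothing is freed here.

```c
#include <stdbool.h>

/* Toy model of the counting rules only; not the kernel implementation. */
struct toy_kobject {
	int refcount;
};

/* Like kobject_init(): the creator starts out holding one reference. */
static void toy_kobject_init(struct toy_kobject *kobj)
{
	kobj->refcount = 1;
}

/* Like kobject_get(): take a reference and hand the pointer back. */
static struct toy_kobject *toy_kobject_get(struct toy_kobject *kobj)
{
	kobj->refcount++;
	return kobj;
}

/* Like kobject_put(): drop a reference; true means it was the last one. */
static bool toy_kobject_put(struct toy_kobject *kobj)
{
	return --kobj->refcount == 0;
}
```

After toy_kobject_init() and one toy_kobject_get(), two puts are needed before
the count reaches zero: one for the extra reference and one for the reference
held by the code that set the object up.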
| 176 | |
| 177 | Because kobjects are dynamic, they must not be declared statically or on |
| 178 | the stack, but instead, always allocated dynamically. Future versions of |
| 179 | the kernel will contain a run-time check for kobjects that are created |
| 180 | statically and will warn the developer of this improper usage. |
| 181 | |
| 182 | If all that you want to use a kobject for is to provide a reference counter |
| 183 | for your structure, please use the struct kref instead; a kobject would be |
| 184 | overkill. For more information on how to use struct kref, please see the |
| 185 | file Documentation/kref.txt in the Linux kernel source tree. |
| 186 | |
| 187 | |
| 188 | Creating "simple" kobjects |
| 189 | |
| 190 | Sometimes all that a developer wants is a way to create a simple directory |
| 191 | in the sysfs hierarchy, and not have to mess with the whole complication of |
| 192 | ksets, show and store functions, and other details. This is the one |
| 193 | exception where a single kobject should be created. To create such an |
| 194 | entry, use the function: |
| 195 | |
| 196 | struct kobject *kobject_create_and_add(char *name, struct kobject *parent); |
| 197 | |
| 198 | This function will create a kobject and place it in sysfs in the location |
| 199 | underneath the specified parent kobject. To create simple attributes |
| 200 | associated with this kobject, use: |
| 201 | |
| 202 | int sysfs_create_file(struct kobject *kobj, struct attribute *attr); |
| 203 | or |
| 204 | int sysfs_create_group(struct kobject *kobj, struct attribute_group *grp); |
| 205 | |
| 206 | Attributes passed to either function for a kobject created with |
| 207 | kobject_create_and_add() can be of type kobj_attribute, so no special |
| 208 | custom attribute type needs to be defined. |
| 209 | |
| 210 | See the example module, samples/kobject/kobject-example.c for an |
| 211 | implementation of a simple kobject and attributes. |
| 212 | |
| 213 | |
| 214 | |
| 215 | ktypes and release methods |
| 216 | |
| 217 | One important thing still missing from the discussion is what happens to a |
| 218 | kobject when its reference count reaches zero. The code which created the |
| 219 | kobject generally does not know when that will happen; if it did, there |
| 220 | would be little point in using a kobject in the first place. Even |
| 221 | predictable object lifecycles become more complicated when sysfs is brought |
| 222 | in as other portions of the kernel can get a reference on any kobject that |
| 223 | is registered in the system. |
| 224 | |
| 225 | The end result is that a structure protected by a kobject cannot be freed |
| 226 | before its reference count goes to zero. The reference count is not under |
| 227 | the direct control of the code which created the kobject. So that code must |
| 228 | be notified asynchronously whenever the last reference to one of its |
| 229 | kobjects goes away. |
| 230 | |
| 231 | Once you have registered your kobject via kobject_add(), you must never use |
| 232 | kfree() to free it directly. The only safe way is to use kobject_put(). It |
| 233 | is good practice to always use kobject_put() after kobject_init() to avoid |
| 234 | errors creeping in. |
| 235 | |
| 236 | This notification is done through a kobject's release() method. Usually |
| 237 | such a method has a form like: |
| 238 | |
| 239 | void my_object_release(struct kobject *kobj) |
| 240 | { |
| 241 |     struct my_object *mine = container_of(kobj, struct my_object, kobj); |
| 242 | |
| 243 |     /* Perform any additional cleanup on this object, then... */ |
| 244 |     kfree(mine); |
| 245 | } |
| 246 | |
| 247 | One important point cannot be overstated: every kobject must have a |
| 248 | release() method, and the kobject must persist (in a consistent state) |
| 249 | until that method is called. If these constraints are not met, the code is |
| 250 | flawed. Note that the kernel will warn you if you forget to provide a |
| 251 | release() method. Do not try to get rid of this warning by providing an |
| 252 | "empty" release function; you will be mocked mercilessly by the kobject |
| 253 | maintainer if you attempt this. |
| 254 | |
| 255 | Note, the name of the kobject is available in the release function, but it |
| 256 | must NOT be changed within this callback. Otherwise there will be a memory |
| 257 | leak in the kobject core, which makes people unhappy. |
| 258 | |
| 259 | Interestingly, the release() method is not stored in the kobject itself; |
| 260 | instead, it is associated with the ktype. So let us introduce struct |
| 261 | kobj_type: |
| 262 | |
| 263 | struct kobj_type { |
| 264 |     void (*release)(struct kobject *); |
| 265 |     struct sysfs_ops *sysfs_ops; |
| 266 |     struct attribute **default_attrs; |
| 267 | }; |
| 268 | |
| 269 | This structure is used to describe a particular type of kobject (or, more |
| 270 | correctly, of containing object). Every kobject needs to have an associated |
| 271 | kobj_type structure; a pointer to that structure must be specified when you |
| 272 | call kobject_init() or kobject_init_and_add(). |
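To show how the release() method is reached through the ktype rather than
through the kobject itself, here is a userspace sketch. All toy_* names,
my_ktype, my_object_create() and released_count are hypothetical; malloc-family
calls stand in for the kernel allocator, and the counter is not thread-safe.

```c
#include <stddef.h>
#include <stdlib.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_kobject;

/* Toy counterpart of struct kobj_type: only the release hook is modelled. */
struct toy_ktype {
	void (*release)(struct toy_kobject *kobj);
};

struct toy_kobject {
	int refcount;
	const struct toy_ktype *ktype;
};

struct my_object {
	int value;
	struct toy_kobject kobj;
};

static int released_count;	/* lets a caller observe that release ran */

/* Called only when the last reference is dropped. */
static void my_object_release(struct toy_kobject *kobj)
{
	struct my_object *mine = container_of(kobj, struct my_object, kobj);

	released_count++;
	free(mine);
}

static const struct toy_ktype my_ktype = {
	.release = my_object_release,
};

/* Allocate dynamically and start with one reference, owned by the caller. */
static struct my_object *my_object_create(int value)
{
	struct my_object *mine = calloc(1, sizeof(*mine));

	if (!mine)
		return NULL;
	mine->value = value;
	mine->kobj.refcount = 1;
	mine->kobj.ktype = &my_ktype;
	return mine;
}

/* Drop a reference; the ktype decides how the container is freed. */
static void toy_kobject_put(struct toy_kobject *kobj)
{
	if (--kobj->refcount == 0)
		kobj->ktype->release(kobj);
}
```

The put function never needs to know the containing type: only the release
method does, which is why one ktype is needed per containing structure.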
| 273 | |
| 274 | The release field in struct kobj_type is, of course, a pointer to the |
| 275 | release() method for this type of kobject. The other two fields (sysfs_ops |
| 276 | and default_attrs) control how objects of this type are represented in |
| 277 | sysfs; they are beyond the scope of this document. |
| 278 | |
| 279 | The default_attrs pointer is a list of default attributes that will be |
| 280 | automatically created for any kobject that is registered with this ktype. |
| 281 | |
| 282 | |
| 283 | ksets |
| 284 | |
| 285 | A kset is merely a collection of kobjects that want to be associated with |
| 286 | each other. There is no restriction that they be of the same ktype, but be |
| 287 | very careful if they are not. |
| 288 | |
| 289 | A kset serves these functions: |
| 290 | |
| 291 | - It serves as a bag containing a group of objects. A kset can be used by |
| 292 | the kernel to track "all block devices" or "all PCI device drivers." |
| 293 | |
| 294 | - A kset is also a subdirectory in sysfs, where the kobjects associated |
| 295 | with the kset can show up. Every kset contains a kobject which can be |
| 296 | set up to be the parent of other kobjects; the top-level directories of |
| 297 | the sysfs hierarchy are constructed in this way. |
| 298 | |
| 299 | - Ksets can support the "hotplugging" of kobjects and influence how |
| 300 | uevents are reported to user space. |
| 301 | |
| 302 | In object-oriented terms, "kset" is the top-level container class; ksets |
| 303 | contain their own kobject, but that kobject is managed by the kset code and |
| 304 | should not be manipulated by any other user. |
| 305 | |
| 306 | A kset keeps its children in a standard kernel linked list. Kobjects point |
| 307 | back to their containing kset via their kset field. In almost all cases, |
| 308 | the kobjects belonging to a kset have that kset (or, strictly, its embedded |
| 309 | kobject) as their parent. |
| 310 | |
| 311 | As a kset contains a kobject within it, it should always be dynamically |
| 312 | created and never declared statically or on the stack. To create a new |
| 313 | kset use: |
| 314 | struct kset *kset_create_and_add(const char *name, |
| 315 | struct kset_uevent_ops *u, |
| 316 | struct kobject *parent); |
| 317 | |
| 318 | When you are finished with the kset, call: |
| 319 | void kset_unregister(struct kset *kset); |
| 320 | to destroy it. |
| 321 | |
| 322 | An example of using a kset can be seen in the |
| 323 | samples/kobject/kset-example.c file in the kernel tree. |
| 324 | |
| 325 | If a kset wishes to control the uevent operations of the kobjects |
| 326 | associated with it, it can use the struct kset_uevent_ops to handle it: |
| 327 | |
| 328 | struct kset_uevent_ops { |
| 329 |     int (*filter)(struct kset *kset, struct kobject *kobj); |
| 330 |     const char *(*name)(struct kset *kset, struct kobject *kobj); |
| 331 |     int (*uevent)(struct kset *kset, struct kobject *kobj, |
| 332 |                   struct kobj_uevent_env *env); |
| 333 | }; |
| 334 | |
| 335 | |
| 336 | The filter function allows a kset to prevent a uevent from being emitted to |
| 337 | userspace for a specific kobject. If the function returns 0, the uevent |
| 338 | will not be emitted. |
| 339 | |
| 340 | The name function will be called to override the default name of the kset |
| 341 | that the uevent sends to userspace. By default, the name will be the same |
| 342 | as the kset itself, but this function, if present, can override that name. |
| 343 | |
| 344 | The uevent function will be called when the uevent is about to be sent to |
| 345 | userspace to allow more environment variables to be added to the uevent. |
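The interplay of the filter and name callbacks can be sketched in userspace as
follows. This models only the rules stated above, not the kobject core: the
toy_* types, toy_uevent_subsystem(), hide_dot_names() and fixed_name() are all
hypothetical, and the uevent callback is left out.

```c
#include <stddef.h>

struct toy_kobject {
	const char *name;
};

struct toy_kset;

/* Toy counterpart of struct kset_uevent_ops (filter and name only). */
struct toy_uevent_ops {
	int (*filter)(struct toy_kset *kset, struct toy_kobject *kobj);
	const char *(*name)(struct toy_kset *kset, struct toy_kobject *kobj);
};

struct toy_kset {
	const char *name;
	const struct toy_uevent_ops *ops;
};

/* Return the name to report, or NULL if the event is filtered out. */
static const char *toy_uevent_subsystem(struct toy_kset *kset,
					struct toy_kobject *kobj)
{
	const struct toy_uevent_ops *ops = kset->ops;

	if (ops && ops->filter && !ops->filter(kset, kobj))
		return NULL;		/* filter returned 0: suppress */
	if (ops && ops->name)
		return ops->name(kset, kobj);
	return kset->name;		/* default: the kset's own name */
}

/* Example filter: suppress events for names starting with '.'. */
static int hide_dot_names(struct toy_kset *kset, struct toy_kobject *kobj)
{
	(void)kset;
	return kobj->name[0] != '.';
}

/* Example name override: always report a fixed name. */
static const char *fixed_name(struct toy_kset *kset, struct toy_kobject *kobj)
{
	(void)kset;
	(void)kobj;
	return "toy-subsystem";
}
```

A kset with no ops reports its own name for every kobject; installing
hide_dot_names() and fixed_name() changes both which events go out and the
name they carry.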
| 346 | |
| 347 | One might ask how, exactly, a kobject is added to a kset, given that no |
| 348 | function which does so has been presented. The answer is |
| 349 | that this task is handled by kobject_add(). When a kobject is passed to |
| 350 | kobject_add(), its kset member should point to the kset to which the |
| 351 | kobject will belong. kobject_add() will handle the rest. |
| 352 | |
| 353 | If the kobject belonging to a kset has no parent kobject set, it will be |
| 354 | added to the kset's directory. Not all members of a kset necessarily |
| 355 | live in the kset directory. If an explicit parent kobject is assigned |
| 356 | before the kobject is added, the kobject is registered with the kset, but |
| 357 | added below the parent kobject. |
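The parent selection just described can be modelled in a few lines. This is a
userspace sketch under stated assumptions: toy_kobject, toy_kset and
toy_kobject_add() are hypothetical stand-ins, and everything else the real
kobject_add() does (naming, sysfs registration, list handling) is omitted.

```c
#include <stddef.h>

struct toy_kset;

struct toy_kobject {
	const char *name;
	struct toy_kobject *parent;
	struct toy_kset *kset;
};

struct toy_kset {
	struct toy_kobject kobj;	/* the kset's own embedded kobject */
};

/*
 * An explicit parent wins; otherwise a kobject that belongs to a kset is
 * placed under that kset's embedded kobject. Membership in the kset is
 * unaffected either way.
 */
static void toy_kobject_add(struct toy_kobject *kobj,
			    struct toy_kobject *parent)
{
	if (!parent && kobj->kset)
		parent = &kobj->kset->kobj;
	kobj->parent = parent;
}
```

Two kobjects can therefore share a kset while sitting in different
directories: one added with a NULL parent and one added with an explicit
parent.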
| 358 | |
| 359 | |
| 360 | Kobject removal |
| 361 | |
| 362 | After a kobject has been registered with the kobject core successfully, it |
| 363 | must be cleaned up when the code is finished with it. To do that, call |
| 364 | kobject_put(). By doing this, the kobject core will automatically clean up |
| 365 | all of the memory allocated by this kobject. If a KOBJ_ADD uevent has been |
| 366 | sent for the object, a corresponding KOBJ_REMOVE uevent will be sent, and |
| 367 | any other sysfs housekeeping will be handled for the caller properly. |
| 368 | |
| 369 | If you need to do a two-stage delete of the kobject (say you are not |
| 370 | allowed to sleep when you need to destroy the object), then call |
| 371 | kobject_del() which will unregister the kobject from sysfs. This makes the |
| 372 | kobject "invisible", but it is not cleaned up, and the reference count of |
| 373 | the object is still the same. At a later time call kobject_put() to finish |
| 374 | the cleanup of the memory associated with the kobject. |
| 375 | |
| 376 | kobject_del() can be used to drop the reference to the parent object, if |
| 377 | circular references are constructed. In some cases it is valid for a |
| 378 | parent object to reference a child. Circular references _must_ be broken |
| 379 | with an explicit call to kobject_del(), so that the release functions will be |
| 380 | called, and the objects in the former circle release each other. |
| 381 | |
| 382 | |
| 383 | Example code to copy from |
| 384 | |
| 385 | For a more complete example of using ksets and kobjects properly, see the |
| 386 | samples/kobject/kset-example.c code. |