| completions - wait for completion handling |
| ========================================== |
| |
| This document was originally written based on 3.18.0 (linux-next) |
| |
| Introduction: |
| ------------- |
| |
| If you have one or more threads of execution that must wait for some process |
| to have reached a point or a specific state, completions can provide a race |
| free solution to this problem. Semantically they are somewhat like a |
| pthread_barriers and have similar use-cases. |
| |
| Completions are a code synchronization mechanism that is preferable to any |
| misuse of locks. Any time you think of using yield() or some quirky |
| msleep(1); loop to allow something else to proceed, you probably want to |
| look into using one of the wait_for_completion*() calls instead. The |
| advantage of using completions is clear intent of the code but also more |
| efficient code as both threads can continue until the result is actually |
| needed. |
| |
| Completions are built on top of the generic event infrastructure in Linux, |
| with the event reduced to a simple flag appropriately called "done" in |
| struct completion, that tells the waiting threads of execution if they |
| can continue safely. |
| |
| As completions are scheduling related the code is found in |
| kernel/sched/completion.c - for details on completion design and |
| implementation see completions-design.txt |
| |
| |
| Usage: |
| ------ |
| |
| There are three parts to the using completions, the initialization of the |
| struct completion, the waiting part through a call to one of the variants of |
| wait_for_completion() and the signaling side through a call to complete(), |
| or complete_all(). Further there are some helper functions for checking the |
| state of completions. |
| |
| To use completions one needs to include <linux/completion.h> and |
| create a variable of type struct completion. The structure used for |
| handling of completions is: |
| |
| struct completion { |
| unsigned int done; |
| wait_queue_head_t wait; |
| }; |
| |
| providing the wait queue to place tasks on for waiting and the flag for |
| indicating the state of affairs. |
| |
| Completions should be named to convey the intent of the waiter. A good |
| example is: |
| |
| wait_for_completion(&early_console_added); |
| |
| complete(&early_console_added); |
| |
| Good naming (as always) helps code readability. |
| |
| |
| Initializing completions: |
| ------------------------- |
| |
| Initialization of dynamically allocated completions, often embedded in |
| other structures, is done with: |
| |
| void init_completion(&done); |
| |
| Initialization is accomplished by initializing the wait queue and setting |
| the default state to "not available", that is, "done" is set to 0. |
| |
| The re-initialization function, reinit_completion(), simply resets the |
| done element to "not available", thus again to 0, without touching the |
| wait queue. Calling init_completion() on the same completions object is |
| most likely a bug as it re-initializes the queue to an empty queue and |
| enqueued tasks could get "lost" - use reinit_completion() in that case. |
| |
| For static declaration and initialization, macros are available. These are: |
| |
| static DECLARE_COMPLETION(setup_done) |
| |
| used for static declarations in file scope. Within functions the static |
| initialization should always use: |
| |
| DECLARE_COMPLETION_ONSTACK(setup_done) |
| |
| suitable for automatic/local variables on the stack and will make lockdep |
| happy. Note also that one needs to making *sure* the completion passt to |
| work threads remains in-scope, and no references remain to on-stack data |
| when the initiating function returns. |
| |
| |
| Waiting for completions: |
| ------------------------ |
| |
| For a thread of execution to wait for some concurrent work to finish, it |
| calls wait_for_completion() on the initialized completion structure. |
| A typical usage scenario is: |
| |
| structure completion setup_done; |
| init_completion(&setup_done); |
| initialze_work(...,&setup_done,...) |
| |
| /* run non-dependent code */ /* do setup */ |
| |
| wait_for_completion(&seupt_done); complete(setup_done) |
| |
| This is not implying any temporal order of wait_for_completion() and the |
| call to complete() - if the call to complete() happened before the call |
| to wait_for_completion() then the waiting side simply will continue |
| immediately as all dependencies are satisfied. |
| |
| Note that wait_for_completion() is calling spin_lock_irq/spin_unlock_irq |
| so it can only be called safely when you know that interrupts are enabled. |
| Calling it from hard-irq context will result in hard to detect spurious |
| enabling of interrupts. |
| |
| wait_for_completion(): |
| |
| void wait_for_completion(struct completion *done): |
| |
| The default behavior is to wait without a timeout and mark the task as |
| uninterruptible. wait_for_completion() and its variants are only safe |
| in soft-interrupt or process context but not in hard-irq context. |
| As all variants of wait_for_completion() can (obviously) block for a long |
| time, you probably don't want to call this with held locks - see also |
| try_wait_for_completion() below. |
| |
| |
| Variants available: |
| ------------------- |
| |
| The below variants all return status and this status should be checked in |
| most(/all) cases - in cases where the status is deliberately not checked you |
| probably want to make a note explaining this (e.g. see |
| arch/arm/kernel/smp.c:__cpu_up()). |
| |
| A common problem that occurs is to have unclean assignment of return types, |
| so care should be taken with assigning return-values to variables of proper |
| type. Checking for the specific meaning of return values also has been found |
| to be quite inaccurate e.g. constructs like |
| if(!wait_for_completion_interruptible_timeout(...)) would execute the same |
| code path for successful completion and for the interrupted case - which is |
| probably not what you want. |
| |
| int wait_for_completion_interruptible(struct completion *done) |
| |
| marking the task TASK_INTERRUPTIBLE. If a signal was received while waiting. |
| It will return -ERESTARTSYS and 0 otherwise. |
| |
| unsigned long wait_for_completion_timeout(struct completion *done, |
| unsigned long timeout) |
| |
| The task is marked as TASK_UNINTERRUPTIBLE and will wait at most timeout |
| (in jiffies). If timeout occurs it return 0 else the remaining time in |
| jiffies (but at least 1). Timeouts are preferably passed by msecs_to_jiffies() |
| or usecs_to_jiffies(). If the returned timeout value is deliberately ignored |
| a comment should probably explain why (e.g. see drivers/mfd/wm8350-core.c |
| wm8350_read_auxadc()) |
| |
| long wait_for_completion_interruptible_timeout( |
| struct completion *done, unsigned long timeout) |
| |
| passing a timeout in jiffies and marking the task as TASK_INTERRUPTIBLE. If a |
| signal was received it will return -ERESTARTSYS, 0 if completion timed-out and |
| the remaining time in jiffies if completion occurred. |
| |
| Further variants include _killable which passes TASK_KILLABLE as the |
| designated tasks state and will return a -ERESTARTSYS if interrupted or |
| else 0 if completions was achieved as well as a _timeout variant. |
| |
| long wait_for_completion_killable(struct completion *done) |
| long wait_for_completion_killable_timeout(struct completion *done, |
| unsigned long timeout) |
| |
| The _io variants wait_for_completion_io behave the same as the non-_io |
| variants, except for accounting waiting time as waiting on IO, which has |
| an impact on how scheduling is calculated. |
| |
| void wait_for_completion_io(struct completion *done) |
| unsigned long wait_for_completion_io_timeout(struct completion *done |
| unsigned long timeout) |
| |
| |
| Signaling completions: |
| ---------------------- |
| |
| A thread of execution that wants to signal that the conditions for |
| continuation have been achieved calls complete() to signal exactly one |
| of the waiters that it can continue. |
| |
| void complete(struct completion *done) |
| |
| or calls complete_all to signal all current and future waiters. |
| |
| void complete_all(struct completion *done) |
| |
| The signaling will work as expected even if completions are signaled before |
| a thread starts waiting. This is achieved by the waiter "consuming" |
| (decrementing) the done element of struct completion. Waiting threads |
| wakeup order is the same in which they were enqueued (FIFO order). |
| |
| If complete() is called multiple times then this will allow for that number |
| of waiters to continue - each call to complete() will simply increment the |
| done element. Calling complete_all() multiple times is a bug though. Both |
| complete() and complete_all() can be called in hard-irq context safely. |
| |
| There only can be one thread calling complete() or complete_all() on a |
| particular struct completions at any time - serialized through the wait |
| queue spinlock. Any such concurrent calls to complete() or complete_all() |
| probably are a design bug. |
| |
| Signaling completion from hard-irq context is fine as it will appropriately |
| lock with spin_lock_irqsave/spin_unlock_irqrestore. |
| |
| |
| try_wait_for_completion()/completion_done(): |
| -------------------------------------------- |
| |
| The try_wait_for_completion will not put the thread on the wait queue but |
| rather returns false if it would need to enqueue (block) the thread, else it |
| consumes any posted completions and returns true. |
| |
| bool try_wait_for_completion(struct completion *done) |
| |
| Finally to check state of a completions without changing it in any way is |
| provided by completion_done() returning false if there are any posted |
| completion that was not yet consumed by waiters implying that there are |
| waiters and true otherwise; |
| |
| bool completion_done(struct completion *done) |
| |
| Both try_wait_for_completion() and completion_done() are safe to be called in |
| hard-irq context. |