Patrick Pannuto | 0fcb808 | 2010-08-02 15:01:05 -0700 | 1 | delays - Information on the various kernel delay / sleep mechanisms |
| 2 | ------------------------------------------------------------------- |
| 3 | |
| 4 | This document seeks to answer the common question: "What is the |
| 5 | RightWay (TM) to insert a delay?" |
| 6 | |
| 7 | This question is most often faced by driver writers who have to |
| 8 | deal with hardware delays and who may not be intimately |
| 9 | familiar with the inner workings of the Linux Kernel. |
| 10 | |
| 11 | |
| 12 | Inserting Delays |
| 13 | ---------------- |
| 14 | |
| 15 | The first, and most important, question you need to ask is "Is my |
| 16 | code in an atomic context?" This should be followed closely by "Does |
| 17 | it really need to delay in atomic context?" If so... |
| 18 | |
| 19 | ATOMIC CONTEXT: |
| 20 | You must use the *delay family of functions. These |
| 21 | functions use the jiffy-based estimate of clock speed |
| 22 | and will busy wait for enough loop cycles to achieve |
| 23 | the desired delay: |
| 24 | |
| 25 | ndelay(unsigned long nsecs) |
| 26 | udelay(unsigned long usecs) |
| 27 | mdelay(unsigned long msecs) |
| 28 | |
| 29 | udelay is the generally preferred API; ndelay-level |
| 30 | precision may not actually exist on many non-PC devices. |
| 31 | |
| 32 | mdelay is a macro wrapper around udelay, to account for |
| 33 | possible overflow when passing large arguments to udelay. |
| 34 | In general, use of mdelay is discouraged and code should |
| 35 | be refactored to allow for the use of msleep. |
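The wrapper idea can be sketched in user space as follows. This is a simplified model, not the kernel's macro: sketch_udelay is a hypothetical stand-in that only records what it was asked to do, SKETCH_MAX_UDELAY_US is an invented limit chosen for illustration, and sketch_mdelay splits a millisecond delay into 1000 us chunks so that no single call receives a large argument.

```c
#include <assert.h>

/* Hypothetical per-call limit, chosen only for illustration. */
#define SKETCH_MAX_UDELAY_US 2000UL

/* Bookkeeping for the stand-in below (not part of any kernel API). */
unsigned long sketch_udelay_calls;
unsigned long sketch_udelay_total_us;

/* Stand-in for udelay(): records the request instead of busy-waiting. */
void sketch_udelay(unsigned long usecs)
{
	/* Model the constraint that large arguments are not acceptable. */
	assert(usecs <= SKETCH_MAX_UDELAY_US);
	sketch_udelay_calls++;
	sketch_udelay_total_us += usecs;
}

/* mdelay-style wrapper: one 1000 us busy-wait per requested millisecond. */
void sketch_mdelay(unsigned long msecs)
{
	while (msecs--)
		sketch_udelay(1000);
}
```

Calling sketch_mdelay(5) here results in five calls of 1000 us each, which also shows why long mdelay calls are costly: the CPU spins for the entire duration.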
| 36 | |
| 37 | NON-ATOMIC CONTEXT: |
| 38 | You should use the *sleep[_range] family of functions. |
| 39 | There are a few more options here. While any of them may |
| 40 | work correctly, using the "right" sleep function will |
| 41 | help the scheduler, power management, and just make your |
| 42 | driver better :) |
| 43 | |
| 44 | -- Backed by busy-wait loop: |
| 45 | udelay(unsigned long usecs) |
| 46 | -- Backed by hrtimers: |
| 47 | usleep_range(unsigned long min, unsigned long max) |
| 48 | -- Backed by jiffies / legacy_timers: |
| 49 | msleep(unsigned long msecs) |
| 50 | msleep_interruptible(unsigned long msecs) |
| 51 | |
| 52 | Unlike the *delay family, the underlying mechanism |
| 53 | driving each of these calls varies, thus there are |
| 54 | quirks you should be aware of. |
| 55 | |
| 56 | |
| 57 | SLEEPING FOR "A FEW" USECS ( < ~10us? ): |
| 58 | * Use udelay |
| 59 | |
| 60 | - Why not usleep_range? |
| 61 | On slower systems (embedded, or perhaps a speed-stepped |
| 62 | PC!), the overhead of setting up the hrtimers for |
| 63 | usleep_range *may* not be worth it. Such an evaluation |
| 64 | will obviously depend on your specific situation, but |
| 65 | it is something to be aware of. |
| 66 | |
| 67 | SLEEPING FOR ~USECS OR SMALL MSECS ( 10us - 20ms ): |
| 68 | * Use usleep_range |
| 69 | |
| 70 | - Why not msleep for (1ms - 20ms)? |
| 71 | Explained originally here: |
| 72 | http://lkml.org/lkml/2007/8/3/250 |
| 73 | msleep(1~20) may not do what the caller intends, and |
| 74 | will often sleep longer (~20 ms actual sleep for any |
| 75 | value given in the 1~20ms range). In many cases this |
| 76 | is not the desired behavior. |
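The arithmetic behind this can be sketched as below. It is a model under stated assumptions, not the kernel source: it assumes a jiffies-based sleep whose timeout is the request rounded up to whole jiffies plus one extra jiffy (as msleep historically did; newer kernels may differ), and it ignores scheduling latency. The function name and the hz parameter are hypothetical.

```c
#include <assert.h>

/*
 * Upper bound, in milliseconds, on how long a jiffies-based sleep of
 * 'msecs' may last when the tick rate is 'hz' ticks per second.
 * Assumption: timeout = round_up(msecs to jiffies) + 1 jiffy.
 */
unsigned long sketch_msleep_upper_bound_ms(unsigned long msecs,
					   unsigned long hz)
{
	unsigned long jiffy_ms, timeout_jiffies;

	/* Restrict to tick rates that divide 1000 so the math stays exact. */
	assert(hz > 0 && 1000 % hz == 0);

	jiffy_ms = 1000 / hz;
	timeout_jiffies = (msecs + jiffy_ms - 1) / jiffy_ms + 1;
	return timeout_jiffies * jiffy_ms;
}
```

Under this model, with hz = 100 a request of 1 ms and a request of 10 ms both come out at a 20 ms bound, while with hz = 1000 a 1 ms request is bounded by 2 ms. The coarser the tick, the further short msleep calls drift from what the caller asked for.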
| 77 | |
| 78 | - Why is there no "usleep" / What is a good range? |
| 79 | Since usleep_range is built on top of hrtimers, the |
| 80 | wakeup will be very precise (ish), thus a simple |
| 81 | usleep function would likely introduce a large number |
| 82 | of undesired interrupts. |
| 83 | |
| 84 | With the introduction of a range, the scheduler is |
| 85 | free to coalesce your wakeup with any other wakeup |
| 86 | that may have happened for other reasons or, in the |
| 87 | worst case, fire an interrupt for your upper bound. |
| 88 | |
| 89 | The larger the range you supply, the greater the chance |
| 90 | that you will not trigger an interrupt; this should |
| 91 | be balanced with what is an acceptable upper bound on |
| 92 | delay / performance for your specific code path. Exact |
| 93 | tolerances here are very situation specific, thus it |
| 94 | is left to the caller to determine a reasonable range. |
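The trade-off described above can be illustrated with a toy model. This is not how the kernel's timer code is implemented; it only assumes that a sleeper may be woken by any wakeup that already falls inside its [min, max] window, and otherwise needs its own interrupt at the upper bound. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a ranged sleep. 'pending_us' holds times (in us from now)
 * at which the CPU will wake up anyway. Returns the time at which a
 * sleeper with window [min_us, max_us] is woken, and sets
 * *needs_own_interrupt to 1 only if no pending wakeup lies in the window.
 */
unsigned long sketch_range_wakeup(unsigned long min_us, unsigned long max_us,
				  const unsigned long *pending_us, size_t n,
				  int *needs_own_interrupt)
{
	unsigned long best = max_us;
	size_t i;

	assert(min_us <= max_us);
	*needs_own_interrupt = 1;

	for (i = 0; i < n; i++) {
		/* Keep the earliest pending wakeup inside the window. */
		if (pending_us[i] >= min_us && pending_us[i] <= best) {
			best = pending_us[i];
			*needs_own_interrupt = 0;
		}
	}
	return best;
}
```

With pending wakeups at 500, 1200 and 3000 us, a window of 1000 to 1500 us piggybacks on the 1200 us wakeup, whereas a tight window of 1000 to 1100 us finds nothing to share and has to fire its own interrupt at 1100 us.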
| 95 | |
| 96 | SLEEPING FOR LARGER MSECS ( 10ms+ ): |
| 97 | * Use msleep or possibly msleep_interruptible |
| 98 | |
| 99 | - What's the difference? |
| 100 | msleep sets the current task to TASK_UNINTERRUPTIBLE |
| 101 | whereas msleep_interruptible sets the current task to |
| 102 | TASK_INTERRUPTIBLE before scheduling the sleep. In |
| 103 | short, the difference is whether the sleep can be ended |
| 104 | early by a signal. In general, just use msleep unless |
| 105 | you know you have a need for the interruptible variant. |
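The guidance in this document can be condensed into a small decision helper. It is only a sketch: the enum and function names are hypothetical, and the cut-offs (10 us and 20 ms) are one reading of the approximate ranges given above, which deliberately overlap between 10 ms and 20 ms.

```c
#include <assert.h>

/* Hypothetical labels for the API families discussed in this document. */
enum sketch_delay_api {
	SKETCH_USE_DELAY_FAMILY,	/* ndelay / udelay / mdelay */
	SKETCH_USE_USLEEP_RANGE,
	SKETCH_USE_MSLEEP,		/* or msleep_interruptible */
};

/* Pick an API family for a delay of 'usecs' microseconds. */
enum sketch_delay_api sketch_pick_delay(unsigned long usecs, int atomic)
{
	/* A zero-length delay needs no API at all. */
	assert(usecs > 0);

	if (atomic)
		return SKETCH_USE_DELAY_FAMILY;	/* sleeping is not allowed */
	if (usecs < 10)
		return SKETCH_USE_DELAY_FAMILY;	/* hrtimer setup may not pay off */
	if (usecs <= 20000)
		return SKETCH_USE_USLEEP_RANGE;
	return SKETCH_USE_MSLEEP;
}
```

A real driver would still need to ask the second question from the top of this document first: whether the delay really has to happen in atomic context at all.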