Debugging suspend and resume
(C) 2007 Rafael J. Wysocki <rjw@sisk.pl>, GPL

1. Testing suspend to disk (STD)

To verify that the STD works, you can try to suspend in the "reboot" mode:

# echo reboot > /sys/power/disk
# echo disk > /sys/power/state

The system should suspend, reboot, resume and return to the command prompt
where you started the transition. If that happens, the STD most likely works
correctly, but you need to repeat the test at least a couple of times in a row
for confidence, because some problems only show up on a second attempt at
suspending and resuming the system. You should also test the "platform" and
"shutdown" modes of suspend:

# echo platform > /sys/power/disk
# echo disk > /sys/power/state

or

# echo shutdown > /sys/power/disk
# echo disk > /sys/power/state

in which cases you will have to press the power button to make the system
resume. If any of these tests fails, you will need to identify what goes wrong.
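Repeating a cycle several times in a row can be scripted. The following is only
a minimal sketch: the function name std_cycle and the PM_DIR variable are
hypothetical (PM_DIR exists only so the function can be dry-run against an
ordinary directory), and the loop assumes that the shell running it is restored
from the hibernation image and carries on after each resume.

```shell
#!/bin/sh
# Sketch: repeat an STD cycle several times in one hibernation mode.
# PM_DIR is a hypothetical override for dry runs; it defaults to /sys/power.
PM_DIR="${PM_DIR:-/sys/power}"

std_cycle() {
    # $1 = hibernation mode ("reboot", "platform" or "shutdown")
    # $2 = number of repetitions (defaults to 3)
    mode="$1"
    count="${2:-3}"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "cycle $i/$count in \"$mode\" mode"
        # Select the mode, then start the transition.
        echo "$mode" > "$PM_DIR/disk" || return 1
        echo disk > "$PM_DIR/state" || return 1
        i=$((i + 1))
    done
}

# Usage (as root): std_cycle reboot 3
```

In the "platform" and "shutdown" modes the power button still has to be pressed
by hand after each power-off before the loop can continue.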

a) Test mode of STD

To check whether any drivers cause problems, you can run the STD in the test
mode:

# echo test > /sys/power/disk
# echo disk > /sys/power/state

in which case the system should freeze tasks, suspend devices, disable nonboot
CPUs (if any), wait for 5 seconds, enable nonboot CPUs, resume devices, thaw
tasks and return to your command prompt. If that fails, most likely there is
a driver that fails to either suspend or resume (in the latter case the system
may hang or be unstable after the test, so please take that into consideration).
To find this driver, you can carry out a binary search according to the rules:
- if the test fails, unload half of the drivers currently loaded and repeat
  (that will probably involve rebooting the system, so always note which drivers
  were loaded before the test),
- if the test succeeds, load half of the drivers you unloaded most recently
  and repeat.
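One step of this search might be sketched as follows. The helper names
list_modules and first_half and the MODULES_FILE variable are hypothetical, and
picking the first half of the list is an arbitrary choice; modules that are
still in use by other modules will refuse to unload and have to be handled by
hand.

```shell
#!/bin/sh
# Sketch: record the loaded modules and pick half of them as unload candidates.
# MODULES_FILE is a hypothetical override for dry runs; default is /proc/modules.
MODULES_FILE="${MODULES_FILE:-/proc/modules}"

list_modules() {
    # The first column of /proc/modules is the module name.
    awk '{ print $1 }' "$MODULES_FILE"
}

first_half() {
    # Print the first half (rounded up) of the names read from stdin.
    awk '{ name[NR] = $0 }
         END { for (i = 1; i <= int((NR + 1) / 2); i++) print name[i] }'
}

# Usage (as root):
#   list_modules > /root/modules.before     # note what was loaded
#   list_modules | first_half | while read -r m; do modprobe -r "$m"; done
```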

Once you have found the failing driver (there can be more than one of them),
you have to unload it every time before the STD transition. In that case,
please make sure to report the problem with the driver.

It is also possible that a cycle still fails after you have unloaded all
modules. In that case, look in your kernel configuration for built-in drivers
that can be compiled as modules (and test again with them as modules), and
possibly also try boot time options such as "noapic" or "noacpi".

b) Testing minimal configuration

If the test mode of STD works, you can boot the system with "init=/bin/bash"
and attempt to suspend in the "reboot", "shutdown" and "platform" modes. If
that does not work, there is probably a problem with a driver statically
compiled into the kernel, and you can try to compile more drivers as modules
so that they can be tested individually. Otherwise, there is a problem with a
modular driver, and you can find it by loading half of the modules you normally
use and binary searching according to the algorithm:
- if there are n modules loaded and the attempt to suspend and resume fails,
  unload n/2 of the modules and try again (that will probably involve rebooting
  the system),
- if there are n modules loaded and the attempt to suspend and resume succeeds,
  load n/2 more modules and try again.

Again, if you find the offending module or modules, they must be unloaded every
time before the STD transition, and please report the problem with them.

c) Advanced debugging

If the STD does not work on your system even in the minimal configuration,
and compiling more drivers as modules is not practical or some modules cannot
be unloaded, you can use one of the more advanced debugging techniques to find
the problem. First, if there is a serial port in your box, you can boot the
kernel with the 'no_console_suspend' parameter and try to log kernel
messages using the serial console. This may provide you with some information
about the reasons for the suspend (resume) failure. Alternatively, it may be
possible to use a FireWire port for debugging with firescope
(ftp://ftp.firstfloor.org/pub/ak/firescope/). On i386 it is also possible to
use the PM_TRACE mechanism documented in Documentation/s2ram.txt .
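Before starting a test run it can be worth confirming that the kernel was
really booted with the intended parameter. The following is a small sketch; the
helper name has_boot_param and the CMDLINE_FILE variable are hypothetical. A
serial console normally also needs a matching "console=" parameter (for
example "console=ttyS0,115200", where the port and speed are assumptions about
the machine).

```shell
#!/bin/sh
# Sketch: check whether a parameter is present on the kernel command line.
# CMDLINE_FILE is a hypothetical override for dry runs; default is /proc/cmdline.
CMDLINE_FILE="${CMDLINE_FILE:-/proc/cmdline}"

has_boot_param() {
    # Succeeds if $1 appears as a whole word on the kernel command line.
    tr ' ' '\n' < "$CMDLINE_FILE" | grep -qx -- "$1"
}

# Usage:
#   has_boot_param no_console_suspend || echo "no_console_suspend is not set"
```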

2. Testing suspend to RAM (STR)

To verify that the STR works, it is generally more convenient to use the s2ram
tool available from http://suspend.sf.net and documented at
http://en.opensuse.org/s2ram . However, before doing that it is recommended to
carry out the procedure described in section 1.

Assume you have resolved the problems with the STD and you have found some
failing drivers. These drivers are also likely to fail during the STR or
during the resume, so it is better to unload them every time before the STR
transition. Now, you can follow the instructions at
http://en.opensuse.org/s2ram to test the system, but if it does not work
"out of the box", you may need to boot it with "init=/bin/bash" and test
s2ram in the minimal configuration. In that case, you may be able to search
for failing drivers by following a procedure analogous to the one described in
1b). If you find some failing drivers, you will have to unload them every time
before the STR transition (i.e. before you run s2ram), and please report the
problems with them.
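
Unloading the known failing drivers before every transition is easy to forget,
so it may help to keep their names in a file and unload them with a small
wrapper. This is only a sketch: the file name /etc/suspend-bad-modules, the
MODPROBE variable and the function name unload_bad_modules are all hypothetical.

```shell
#!/bin/sh
# Sketch: unload every module listed in a file before suspending.
# BAD_MODULES_FILE is a hypothetical list, one module name per line.
# MODPROBE is a hypothetical override so the function can be dry-run.
BAD_MODULES_FILE="${BAD_MODULES_FILE:-/etc/suspend-bad-modules}"
MODPROBE="${MODPROBE:-modprobe}"

unload_bad_modules() {
    while read -r mod; do
        # Skip blank lines and lines starting with '#'.
        case "$mod" in
            ''|'#'*) continue ;;
        esac
        "$MODPROBE" -r "$mod" || echo "could not unload $mod" >&2
    done < "$BAD_MODULES_FILE"
}

# Usage (as root): unload_bad_modules; s2ram
```

The same wrapper can be run before writing "disk" to /sys/power/state when
testing the STD.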