Paul E. McKenney | a241ec6 | 2005-10-30 15:03:12 -0800 | [diff] [blame] | 1 | RCU Torture Test Operation |


CONFIG_RCU_TORTURE_TEST

The CONFIG_RCU_TORTURE_TEST config option is available for all RCU
implementations. It creates an rcutorture kernel module that can
be loaded to run a torture test. The test periodically outputs
status messages via printk(), which can be examined via the dmesg
command (perhaps grepping for "torture"). The test is started
when the module is loaded, and stops when the module is unloaded.

CONFIG_RCU_TORTURE_TEST_RUNNABLE

It is also possible to specify CONFIG_RCU_TORTURE_TEST=y, which will
result in the tests being loaded into the base kernel. In this case,
the CONFIG_RCU_TORTURE_TEST_RUNNABLE config option is used to specify
whether the RCU torture tests are to be started immediately during
boot or whether the /proc/sys/kernel/rcutorture_runnable file is used
to enable them. This /proc file can be used to repeatedly pause and
restart the tests, regardless of the initial state specified by the
CONFIG_RCU_TORTURE_TEST_RUNNABLE config option.

You will normally -not- want to start the RCU torture tests during boot
(and thus the default is CONFIG_RCU_TORTURE_TEST_RUNNABLE=n), but doing
this can sometimes be useful in finding boot-time bugs.
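
As a rough sketch of pausing and restarting a built-in test through the
/proc file named above: the helper below assumes that writing "1" lets the
tests run and writing "0" pauses them. The helper name
set_rcutorture_runnable and its optional second argument are illustrative
only and are not part of rcutorture.

```shell
#!/bin/sh
# Hypothetical helper: write 0 (pause) or 1 (run) to the control file.
# The optional second argument exists only so that the helper can be
# tried against a scratch file; it defaults to the /proc file from the text.
set_rcutorture_runnable() {
    value=$1
    file=${2:-/proc/sys/kernel/rcutorture_runnable}
    case "$value" in
    0|1) ;;
    *) echo "usage: set_rcutorture_runnable 0|1 [file]" >&2; return 2 ;;
    esac
    echo "$value" > "$file"
}
```

Run as root on a kernel built with CONFIG_RCU_TORTURE_TEST=y, something
like "set_rcutorture_runnable 1; sleep 100; set_rcutorture_runnable 0"
would then give one bounded torture run without rebooting.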


MODULE PARAMETERS

This module has the following parameters:

irqreaders      Says to invoke RCU readers from irq level. This is currently
                done via timers. Defaults to "1" for variants of RCU that
                permit this. (Or, more accurately, variants of RCU that do
                -not- permit this know to ignore this variable.)

nfakewriters    This is the number of RCU fake writer threads to run. Fake
                writer threads repeatedly use the synchronous "wait for
                current readers" function of the interface selected by
                torture_type, with a delay between calls to allow for various
                different numbers of writers running in parallel.
                nfakewriters defaults to 4, which provides enough parallelism
                to trigger special cases caused by multiple writers, such as
                the synchronize_srcu() early return optimization.

nreaders        This is the number of RCU reading threads supported.
                The default is twice the number of CPUs. Why twice?
                To properly exercise RCU implementations with preemptible
                read-side critical sections.

shuffle_interval
                The number of seconds to keep the test threads affinitied
                to a particular subset of the CPUs. Defaults to 3 seconds.
                Used in conjunction with test_no_idle_hz.

stat_interval   The number of seconds between output of torture
                statistics (via printk()). Regardless of the interval,
                statistics are printed when the module is unloaded.
                Setting the interval to zero causes the statistics to
                be printed -only- when the module is unloaded, and this
                is the default.

stutter         The length of time to run the test before pausing for this
                same period of time. Defaults to "stutter=5", so as
                to run and pause for (roughly) five-second intervals.
                Specifying "stutter=0" causes the test to run continuously
                without pausing, which is the old default behavior.

test_no_idle_hz Whether or not to test the ability of RCU to operate in
                a kernel that disables the scheduling-clock interrupt to
                idle CPUs. Boolean parameter, "1" to test, "0" otherwise.
                Defaults to omitting this test.

torture_type    The type of RCU to test: "rcu" for the rcu_read_lock() API,
                "rcu_sync" for rcu_read_lock() with synchronous reclamation,
                "rcu_bh" for the rcu_read_lock_bh() API, "rcu_bh_sync" for
                rcu_read_lock_bh() with synchronous reclamation, "srcu" for
                the srcu_read_lock() API, and "sched" for the use of
                preempt_disable() together with synchronize_sched().

verbose         Enable debug printk()s. Default is disabled.
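
Module parameters such as those listed above are normally passed to
modprobe as name=value pairs. The following sketch only builds and prints
the command line so that it can be reviewed before being run as root. The
helper name rcutorture_cmd and the parameter values shown are illustrative
assumptions, not recommended settings.

```shell
#!/bin/sh
# Hypothetical helper: build (but do not run) a modprobe command line
# from name=value pairs, skipping anything that is not of that form.
rcutorture_cmd() {
    cmd="modprobe rcutorture"
    for arg in "$@"; do
        case "$arg" in
        *=*) cmd="$cmd $arg" ;;
        *) echo "ignoring malformed parameter: $arg" >&2 ;;
        esac
    done
    echo "$cmd"
}

# Example: test SRCU with 8 readers, printing statistics every 30 seconds.
rcutorture_cmd torture_type=srcu nreaders=8 stat_interval=30
```

The final line prints "modprobe rcutorture torture_type=srcu nreaders=8
stat_interval=30", which can then be executed by hand.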


OUTPUT

The statistics output is as follows:

        rcu-torture: --- Start of test: nreaders=16 stat_interval=0 verbose=0
        rcu-torture: rtc: 0000000000000000 ver: 1916 tfle: 0 rta: 1916 rtaf: 0 rtf: 1915
        rcu-torture: Reader Pipe: 1466408 9747 0 0 0 0 0 0 0 0 0
        rcu-torture: Reader Batch: 1464477 11678 0 0 0 0 0 0 0 0
        rcu-torture: Free-Block Circulation: 1915 1915 1915 1915 1915 1915 1915 1915 1915 1915 0
        rcu-torture: --- End of test

The command "dmesg | grep torture:" will extract this information on
most systems. On more esoteric configurations, it may be necessary to
use other commands to access the output of the printk()s used by
the RCU torture test. The printk()s use KERN_ALERT, so they should
be evident. ;-)

The entries are as follows:

o       "rtc": The hexadecimal address of the structure currently visible
        to readers.

o       "ver": The number of times since boot that the rcutw writer task
        has changed the structure visible to readers.

o       "tfle": If non-zero, indicates that the "torture freelist"
        containing structures to be placed into the "rtc" area is empty.
        This condition is important, since it can fool you into thinking
        that RCU is working when it is not. :-/

o       "rta": Number of structures allocated from the torture freelist.

o       "rtaf": Number of allocations from the torture freelist that have
        failed due to the list being empty.

o       "rtf": Number of frees into the torture freelist.

o       "Reader Pipe": Histogram of "ages" of structures seen by readers.
        If any entries past the first two are non-zero, RCU is broken.
        And rcutorture prints the error flag string "!!!" to make sure
        you notice. The age of a newly allocated structure is zero,
        it becomes one when removed from reader visibility, and is
        incremented once per grace period subsequently -- and is freed
        after passing through (RCU_TORTURE_PIPE_LEN-2) grace periods.

        The output displayed above was taken from a correctly working
        RCU. If you want to see what it looks like when broken, break
        it yourself. ;-)

o       "Reader Batch": Another histogram of "ages" of structures seen
        by readers, but in terms of counter flips (or batches) rather
        than in terms of grace periods. The legal number of non-zero
        entries is again two. The reason for this separate view is that
        it is sometimes easier to get the third entry to show up in the
        "Reader Batch" list than in the "Reader Pipe" list.

o       "Free-Block Circulation": Shows the number of torture structures
        that have reached a given point in the pipeline. The first element
        should closely correspond to the number of structures allocated,
        the second to the number that have been removed from reader view,
        and all but the last remaining to the corresponding number of
        passes through a grace period. The last entry should be zero,
        as it is only incremented if a torture structure's counter
        somehow gets incremented farther than it should.
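
The rules in the entries above for "Reader Pipe", "Reader Batch", and
"Free-Block Circulation" can be checked mechanically. The following is a
minimal sketch, assuming the statistics lines look like the sample output
shown earlier (possibly with extra fields, such as a timestamp, in front).
The helper name check_torture_stats and its messages are illustrative and
are not part of rcutorture.

```shell
#!/bin/sh
# Hypothetical helper: read "dmesg | grep torture:" style lines on stdin.
# Exit status is 0 if every histogram obeys the rules, 1 otherwise.
check_torture_stats() {
    awk '
    BEGIN { bad = 0 }
    /Reader (Pipe|Batch):/ {
        # Locate the label, then skip the two counts allowed to be non-zero.
        for (i = 1; i <= NF; i++)
            if ($i == "Pipe:" || $i == "Batch:")
                break
        for (i += 3; i <= NF; i++)
            if ($i != 0) {
                print "stale structure seen by a reader: " $0
                bad = 1
                break
            }
    }
    # The final Free-Block Circulation count must stay at zero.
    /Free-Block Circulation:/ && $NF != 0 {
        print "pipeline counter overrun: " $0
        bad = 1
    }
    END { exit bad }
    '
}
```

A typical invocation would be "dmesg | grep torture: | check_torture_stats",
with a non-zero exit status indicating that at least one line needs a
closer look.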

Different implementations of RCU can provide implementation-specific
additional information. For example, SRCU provides the following:

        srcu-torture: rtc: f8cf46a8 ver: 355 tfle: 0 rta: 356 rtaf: 0 rtf: 346 rtmbe: 0
        srcu-torture: Reader Pipe: 559738 939 0 0 0 0 0 0 0 0 0
        srcu-torture: Reader Batch: 560434 243 0 0 0 0 0 0 0 0
        srcu-torture: Free-Block Circulation: 355 354 353 352 351 350 349 348 347 346 0
        srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)

The first four lines are similar to those for RCU. The last line shows
the per-CPU counter state. The numbers in parentheses are the values
of the "old" and "current" counters for the corresponding CPU. The
"idx" value maps the "old" and "current" values to the underlying array,
and is useful for debugging.


USAGE

The following script may be used to torture RCU:

        #!/bin/sh

        modprobe rcutorture     # loading the module starts the test
        sleep 100               # let the test run for 100 seconds
        rmmod rcutorture        # unloading stops the test and prints statistics
        dmesg | grep torture:

The output can be manually inspected for the error flag of "!!!".
One could of course create a more elaborate script that automatically
checked for such errors. The "rmmod" command forces a "SUCCESS" or
"FAILURE" indication to be printk()ed.
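
One possible shape for such an automated check is sketched below. It
assumes that the words "SUCCESS" and "FAILURE" appear literally on lines
containing "torture:", which this document implies but does not spell out.
The helper name torture_passed is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: scan rcutorture output on stdin.  Returns 0 only
# if neither the "!!!" error flag nor a "FAILURE" indication appears and
# a "SUCCESS" indication does appear.
torture_passed() {
    log=$(grep 'torture:' || true)
    if printf '%s\n' "$log" | grep -q -e '!!!' -e 'FAILURE'; then
        return 1
    fi
    printf '%s\n' "$log" | grep -q 'SUCCESS'
}

# Intended use, as root, after the module has been unloaded:
#     dmesg | torture_passed && echo "RCU survived" || echo "RCU FAILED"
```

Requiring an explicit "SUCCESS" line, rather than merely the absence of
errors, keeps a run that produced no output at all from being counted as
a pass.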