Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 1 | Demonstrations of deadlock_detector. |
| 2 | |
| 3 | This program detects potential deadlocks on a running process. The program |
| 4 | attaches uprobes on `pthread_mutex_lock` and `pthread_mutex_unlock` to build |
| 5 | a mutex wait directed graph, and then looks for a cycle in this graph. This |
| 6 | graph has the following properties: |
| 7 | |
| 8 | - Nodes in the graph represent mutexes. |
| 9 | - Edge (A, B) exists if there exists some thread T where lock(A) was called |
| 10 | and lock(B) was called before unlock(A) was called. |
| 11 | |
| 12 | If there is a cycle in this graph, this indicates that there is a lock order |
| 13 | inversion (potential deadlock). If the program finds a lock order inversion, the |
| 14 | program will dump the cycle of mutexes, dump the stack traces where each mutex |
| 15 | was acquired, and then exit. |
| 16 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 17 | This program can only find potential deadlocks that occur while the program |
| 18 | is tracing the process. It cannot find deadlocks that may have occurred |
| 19 | before the program was attached to the process. |
| 20 | |
| 21 | Since this traces all mutex lock and unlock events and all thread creation |
| 22 | events on the traced process, the overhead of this bpf program can be very |
| 23 | high if the process has many threads and mutexes. You should only run this on |
| 24 | a process where the slowdown is acceptable. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 25 | |
| 26 | Note: This tool does not work for shared mutexes or recursive mutexes. |
| 27 | |
| 28 | For shared (read-write) mutexes, a deadlock requires a cycle in the wait |
| 29 | graph where at least one of the mutexes in the cycle is acquiring exclusive |
| 30 | (write) ownership. |
| 31 | |
| 32 | For recursive mutexes, lock() is called multiple times on the same mutex. |
| 33 | However, there is no way to determine if a mutex is a recursive mutex |
| 34 | after the mutex has been created. As a result, this tool will not find |
| 35 | potential deadlocks that involve only one mutex. |
| 36 | |
| 37 | |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 38 | # ./deadlock_detector.py 181 |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 39 | Tracing... Hit Ctrl-C to end. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 40 | ---------------- |
| 41 | Potential Deadlock Detected! |
| 42 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 43 | Cycle in lock order graph: Mutex M0 (main::static_mutex3 0x0000000000473c60) => Mutex M1 (0x00007fff6d738400) => Mutex M2 (global_mutex1 0x0000000000473be0) => Mutex M3 (global_mutex2 0x0000000000473c20) => Mutex M0 (main::static_mutex3 0x0000000000473c60) |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 44 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 45 | Mutex M1 (0x00007fff6d738400) acquired here while holding Mutex M0 (main::static_mutex3 0x0000000000473c60) in Thread 357250 (lockinversion): |
| 46 | @ 00000000004024d0 pthread_mutex_lock |
| 47 | @ 0000000000406dd0 std::mutex::lock() |
| 48 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 49 | @ 0000000000402e38 main::{lambda()#3}::operator()() const |
| 50 | @ 0000000000406ba8 void std::_Bind_simple<main::{lambda()#3} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 51 | @ 0000000000406951 std::_Bind_simple<main::{lambda()#3} ()>::operator()() |
| 52 | @ 000000000040673a std::thread::_Impl<std::_Bind_simple<main::{lambda()#3} ()> >::_M_run() |
| 53 | @ 00007fd4496564e1 execute_native_thread_routine |
| 54 | @ 00007fd449dd57f1 start_thread |
| 55 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 56 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 57 | Mutex M0 (main::static_mutex3 0x0000000000473c60) previously acquired by the same Thread 357250 (lockinversion) here: |
| 58 | @ 00000000004024d0 pthread_mutex_lock |
| 59 | @ 0000000000406dd0 std::mutex::lock() |
| 60 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 61 | @ 0000000000402e22 main::{lambda()#3}::operator()() const |
| 62 | @ 0000000000406ba8 void std::_Bind_simple<main::{lambda()#3} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 63 | @ 0000000000406951 std::_Bind_simple<main::{lambda()#3} ()>::operator()() |
| 64 | @ 000000000040673a std::thread::_Impl<std::_Bind_simple<main::{lambda()#3} ()> >::_M_run() |
| 65 | @ 00007fd4496564e1 execute_native_thread_routine |
| 66 | @ 00007fd449dd57f1 start_thread |
| 67 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 68 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 69 | Mutex M2 (global_mutex1 0x0000000000473be0) acquired here while holding Mutex M1 (0x00007fff6d738400) in Thread 357251 (lockinversion): |
| 70 | @ 00000000004024d0 pthread_mutex_lock |
| 71 | @ 0000000000406dd0 std::mutex::lock() |
| 72 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 73 | @ 0000000000402ea8 main::{lambda()#4}::operator()() const |
| 74 | @ 0000000000406b46 void std::_Bind_simple<main::{lambda()#4} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 75 | @ 000000000040692d std::_Bind_simple<main::{lambda()#4} ()>::operator()() |
| 76 | @ 000000000040671c std::thread::_Impl<std::_Bind_simple<main::{lambda()#4} ()> >::_M_run() |
| 77 | @ 00007fd4496564e1 execute_native_thread_routine |
| 78 | @ 00007fd449dd57f1 start_thread |
| 79 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 80 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 81 | Mutex M1 (0x00007fff6d738400) previously acquired by the same Thread 357251 (lockinversion) here: |
| 82 | @ 00000000004024d0 pthread_mutex_lock |
| 83 | @ 0000000000406dd0 std::mutex::lock() |
| 84 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 85 | @ 0000000000402e97 main::{lambda()#4}::operator()() const |
| 86 | @ 0000000000406b46 void std::_Bind_simple<main::{lambda()#4} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 87 | @ 000000000040692d std::_Bind_simple<main::{lambda()#4} ()>::operator()() |
| 88 | @ 000000000040671c std::thread::_Impl<std::_Bind_simple<main::{lambda()#4} ()> >::_M_run() |
| 89 | @ 00007fd4496564e1 execute_native_thread_routine |
| 90 | @ 00007fd449dd57f1 start_thread |
| 91 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 92 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 93 | Mutex M3 (global_mutex2 0x0000000000473c20) acquired here while holding Mutex M2 (global_mutex1 0x0000000000473be0) in Thread 357247 (lockinversion): |
| 94 | @ 00000000004024d0 pthread_mutex_lock |
| 95 | @ 0000000000406dd0 std::mutex::lock() |
| 96 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 97 | @ 0000000000402d5f main::{lambda()#1}::operator()() const |
| 98 | @ 0000000000406c6c void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 99 | @ 0000000000406999 std::_Bind_simple<main::{lambda()#1} ()>::operator()() |
| 100 | @ 0000000000406776 std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() |
| 101 | @ 00007fd4496564e1 execute_native_thread_routine |
| 102 | @ 00007fd449dd57f1 start_thread |
| 103 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 104 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 105 | Mutex M2 (global_mutex1 0x0000000000473be0) previously acquired by the same Thread 357247 (lockinversion) here: |
| 106 | @ 00000000004024d0 pthread_mutex_lock |
| 107 | @ 0000000000406dd0 std::mutex::lock() |
| 108 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 109 | @ 0000000000402d4e main::{lambda()#1}::operator()() const |
| 110 | @ 0000000000406c6c void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 111 | @ 0000000000406999 std::_Bind_simple<main::{lambda()#1} ()>::operator()() |
| 112 | @ 0000000000406776 std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() |
| 113 | @ 00007fd4496564e1 execute_native_thread_routine |
| 114 | @ 00007fd449dd57f1 start_thread |
| 115 | @ 00007fd44909746d __clone |
| 116 | |
| 117 | Mutex M0 (main::static_mutex3 0x0000000000473c60) acquired here while holding Mutex M3 (global_mutex2 0x0000000000473c20) in Thread 357248 (lockinversion): |
| 118 | @ 00000000004024d0 pthread_mutex_lock |
| 119 | @ 0000000000406dd0 std::mutex::lock() |
| 120 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 121 | @ 0000000000402dc9 main::{lambda()#2}::operator()() const |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 122 | @ 0000000000406c0a void std::_Bind_simple<main::{lambda()#2} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 123 | @ 0000000000406975 std::_Bind_simple<main::{lambda()#2} ()>::operator()() |
| 124 | @ 0000000000406758 std::thread::_Impl<std::_Bind_simple<main::{lambda()#2} ()> >::_M_run() |
| 125 | @ 00007fd4496564e1 execute_native_thread_routine |
| 126 | @ 00007fd449dd57f1 start_thread |
| 127 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 128 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 129 | Mutex M3 (global_mutex2 0x0000000000473c20) previously acquired by the same Thread 357248 (lockinversion) here: |
| 130 | @ 00000000004024d0 pthread_mutex_lock |
| 131 | @ 0000000000406dd0 std::mutex::lock() |
| 132 | @ 00000000004070d2 std::lock_guard<std::mutex>::lock_guard(std::mutex&) |
| 133 | @ 0000000000402db8 main::{lambda()#2}::operator()() const |
| 134 | @ 0000000000406c0a void std::_Bind_simple<main::{lambda()#2} ()>::_M_invoke<>(std::_Index_tuple<>) |
| 135 | @ 0000000000406975 std::_Bind_simple<main::{lambda()#2} ()>::operator()() |
| 136 | @ 0000000000406758 std::thread::_Impl<std::_Bind_simple<main::{lambda()#2} ()> >::_M_run() |
| 137 | @ 00007fd4496564e1 execute_native_thread_routine |
| 138 | @ 00007fd449dd57f1 start_thread |
| 139 | @ 00007fd44909746d __clone |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 140 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 141 | Thread 357248 created by Thread 350692 (lockinversion) here: |
| 142 | @ 00007fd449097431 __clone |
| 143 | @ 00007fd449dd5ef5 pthread_create |
| 144 | @ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) |
| 145 | @ 00000000004033ac std::thread::thread<main::{lambda()#2}>(main::{lambda()#2}&&) |
| 146 | @ 000000000040308f main |
| 147 | @ 00007fd448faa0f6 __libc_start_main |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 148 | @ 0000000000402ad8 [unknown] |
| 149 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 150 | Thread 357250 created by Thread 350692 (lockinversion) here: |
| 151 | @ 00007fd449097431 __clone |
| 152 | @ 00007fd449dd5ef5 pthread_create |
| 153 | @ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) |
| 154 | @ 00000000004034b2 std::thread::thread<main::{lambda()#3}>(main::{lambda()#3}&&) |
| 155 | @ 00000000004030b9 main |
| 156 | @ 00007fd448faa0f6 __libc_start_main |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 157 | @ 0000000000402ad8 [unknown] |
| 158 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 159 | Thread 357251 created by Thread 350692 (lockinversion) here: |
| 160 | @ 00007fd449097431 __clone |
| 161 | @ 00007fd449dd5ef5 pthread_create |
| 162 | @ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) |
| 163 | @ 00000000004035b8 std::thread::thread<main::{lambda()#4}>(main::{lambda()#4}&&) |
| 164 | @ 00000000004030e6 main |
| 165 | @ 00007fd448faa0f6 __libc_start_main |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 166 | @ 0000000000402ad8 [unknown] |
| 167 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 168 | Thread 357247 created by Thread 350692 (lockinversion) here: |
| 169 | @ 00007fd449097431 __clone |
| 170 | @ 00007fd449dd5ef5 pthread_create |
| 171 | @ 00007fd449658440 std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>) |
| 172 | @ 00000000004032a6 std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) |
| 173 | @ 0000000000403070 main |
| 174 | @ 00007fd448faa0f6 __libc_start_main |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 175 | @ 0000000000402ad8 [unknown] |
| 176 | |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 177 | This is output from a process that has a potential deadlock involving 4 mutexes |
| 178 | and 4 threads: |
| 179 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 180 | - Thread 357250 acquired M1 while holding M0 (edge M0 -> M1) |
| 181 | - Thread 357251 acquired M2 while holding M1 (edge M1 -> M2) |
| 182 | - Thread 357247 acquired M3 while holding M2 (edge M2 -> M3) |
| 183 | - Thread 357248 acquired M0 while holding M3 (edge M3 -> M0) |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 184 | |
| 185 | This is the C++ program that generated the output above: |
| 186 | |
| 187 | ```c++ |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 188 | #include <chrono> |
| 189 | #include <iostream> |
| 190 | #include <mutex> |
| 191 | #include <thread> |
| 192 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 193 | std::mutex global_mutex1; |
| 194 | std::mutex global_mutex2; |
| 195 | |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 196 | int main(void) { |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 197 | static std::mutex static_mutex3; |
| 198 | std::mutex local_mutex4; |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 199 | |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 200 | std::cout << "sleeping for a bit to allow trace to attach..." << std::endl; |
| 201 | std::this_thread::sleep_for(std::chrono::seconds(10)); |
| 202 | std::cout << "starting program..." << std::endl; |
| 203 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 204 | auto t1 = std::thread([] { |
| 205 | std::lock_guard<std::mutex> g1(global_mutex1); |
| 206 | std::lock_guard<std::mutex> g2(global_mutex2); |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 207 | }); |
| 208 | t1.join(); |
| 209 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 210 | auto t2 = std::thread([] { |
| 211 | std::lock_guard<std::mutex> g2(global_mutex2); |
| 212 | std::lock_guard<std::mutex> g3(static_mutex3); |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 213 | }); |
| 214 | t2.join(); |
| 215 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 216 | auto t3 = std::thread([&local_mutex4] { |
| 217 | std::lock_guard<std::mutex> g3(static_mutex3); |
| 218 | std::lock_guard<std::mutex> g4(local_mutex4); |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 219 | }); |
| 220 | t3.join(); |
| 221 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 222 | auto t4 = std::thread([&local_mutex4] { |
| 223 | std::lock_guard<std::mutex> g4(local_mutex4); |
| 224 | std::lock_guard<std::mutex> g1(global_mutex1); |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 225 | }); |
| 226 | t4.join(); |
| 227 | |
| 228 | std::cout << "sleeping to allow trace to collect data..." << std::endl; |
| 229 | std::this_thread::sleep_for(std::chrono::seconds(5)); |
| 230 | std::cout << "done!" << std::endl; |
| 231 | } |
| 232 | ``` |
| 233 | |
| 234 | Note that an actual deadlock did not occur, although this mutex lock ordering |
| 235 | creates the possibility of a deadlock, and this is a hint to the programmer to |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 236 | reconsider the lock ordering. If the mutexes are global or static and debug |
| 237 | symbols are enabled, the output will contain the mutex symbol name. The output |
| 238 | uses a similar format as ThreadSanitizer |
| 239 | (https://github.com/google/sanitizers/wiki/ThreadSanitizerDeadlockDetector). |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 240 | |
| 241 | |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 242 | # ./deadlock_detector.py 181 --binary /usr/local/bin/lockinversion |
| 243 | |
| 244 | Tracing... Hit Ctrl-C to end. |
| 245 | ^C |
| 246 | |
| 247 | If the traced process is instantiated from a statically-linked executable, |
| 248 | this argument is optional, and the program will determine the path of the |
| 249 | executable from the pid. However, on older kernels without this patch |
| 250 | ("uprobe: Find last occurrence of ':' when parsing uprobe PATH:OFFSET", |
| 251 | https://lkml.org/lkml/2017/1/13/585), binaries that contain `:` in the path |
| 252 | cannot be attached with uprobes. As a workaround, we can create a symlink |
| 253 | to the binary, and provide the symlink name instead to the `--binary` option. |
| 254 | |
| 255 | |
| 256 | # ./deadlock_detector.py 181 --binary /lib/x86_64-linux-gnu/libpthread.so.0 |
| 257 | |
| 258 | Tracing... Hit Ctrl-C to end. |
| 259 | ^C |
| 260 | |
| 261 | If the traced process is instantiated from a dynamically-linked executable, |
| 262 | this argument is required and needs to be the path to the pthread shared |
| 263 | library used by the executable. |
| 264 | |
| 265 | |
| 266 | # ./deadlock_detector.py 181 --dump-graph graph.json --verbose |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 267 | |
| 268 | Tracing... Hit Ctrl-C to end. |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 269 | Mutexes: 0, Edges: 0 |
| 270 | Mutexes: 532, Edges: 411 |
| 271 | Mutexes: 735, Edges: 675 |
| 272 | Mutexes: 1118, Edges: 1278 |
| 273 | Mutexes: 1666, Edges: 2185 |
| 274 | Mutexes: 2056, Edges: 2694 |
| 275 | Mutexes: 2245, Edges: 2906 |
| 276 | Mutexes: 2656, Edges: 3479 |
| 277 | Mutexes: 2813, Edges: 3785 |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 278 | ^C |
| 279 | |
| 280 | If the program does not find a deadlock, it will keep running until you hit |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 281 | Ctrl-C. If you pass the `--verbose` flag, the program will also dump statistics |
| 282 | about the number of mutexes and edges in the mutex wait graph. If you want to |
| 283 | serialize the graph to analyze it later, you can pass the `--dump-graph FILE` |
| 284 | flag, and the program will serialize the graph in json. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 285 | |
| 286 | |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 287 | # ./deadlock_detector.py 181 --lock-symbols custom_mutex1_lock,custom_mutex2_lock --unlock_symbols custom_mutex1_unlock,custom_mutex2_unlock --verbose |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 288 | |
| 289 | Tracing... Hit Ctrl-C to end. |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 290 | Mutexes: 0, Edges: 0 |
| 291 | Mutexes: 532, Edges: 411 |
| 292 | Mutexes: 735, Edges: 675 |
| 293 | Mutexes: 1118, Edges: 1278 |
| 294 | Mutexes: 1666, Edges: 2185 |
| 295 | Mutexes: 2056, Edges: 2694 |
| 296 | Mutexes: 2245, Edges: 2906 |
| 297 | Mutexes: 2656, Edges: 3479 |
| 298 | Mutexes: 2813, Edges: 3785 |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 299 | ^C |
| 300 | |
| 301 | If your program is using custom mutexes and not pthread mutexes, you can use |
| 302 | the `--lock-symbols` and `--unlock-symbols` flags to specify different mutex |
| 303 | symbols to trace. The flags take a comma-separated string of symbol names. |
| 304 | Note that if the symbols are inlined in the binary, then this program can result |
| 305 | in false positives. |
| 306 | |
| 307 | |
| 308 | USAGE message: |
| 309 | |
| 310 | # ./deadlock_detector.py -h |
| 311 | |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 312 | usage: deadlock_detector.py [-h] [--binary BINARY] [--dump-graph DUMP_GRAPH] |
| 313 | [--verbose] [--lock-symbols LOCK_SYMBOLS] |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 314 | [--unlock-symbols UNLOCK_SYMBOLS] |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 315 | pid |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 316 | |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 317 | Detect potential deadlocks (lock inversions) in a running binary. |
| 318 | Must be run as root. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 319 | |
| 320 | positional arguments: |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 321 | pid Pid to trace |
| 322 | |
| 323 | optional arguments: |
| 324 | -h, --help show this help message and exit |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 325 | --binary BINARY If set, trace the mutexes from the binary at this |
| 326 | path. For statically-linked binaries, this argument is |
| 327 | not required. For dynamically-linked binaries, this |
| 328 | argument is required and should be the path of the |
| 329 | pthread library the binary is using. Example: |
| 330 | /lib/x86_64-linux-gnu/libpthread.so.0 |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 331 | --dump-graph DUMP_GRAPH |
| 332 | If set, this will dump the mutex graph to the |
| 333 | specified file. |
Kenny Yu | e7dff43 | 2017-02-03 09:39:39 -0800 | [diff] [blame] | 334 | --verbose Print statistics about the mutex wait graph. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 335 | --lock-symbols LOCK_SYMBOLS |
| 336 | Comma-separated list of lock symbols to trace. Default |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 337 | is pthread_mutex_lock. These symbols cannot be inlined |
| 338 | in the binary. |
Kenny Yu | 66fb4d2 | 2017-02-02 10:05:31 -0800 | [diff] [blame] | 339 | --unlock-symbols UNLOCK_SYMBOLS |
| 340 | Comma-separated list of unlock symbols to trace. |
Kenny Yu | c44e92c | 2017-02-03 18:50:33 -0800 | [diff] [blame] | 341 | Default is pthread_mutex_unlock. These symbols cannot |
| 342 | be inlined in the binary. |
| 343 | |
| 344 | Examples: |
| 345 | deadlock_detector 181 # Analyze PID 181 |
| 346 | |
| 347 | deadlock_detector 181 --binary /lib/x86_64-linux-gnu/libpthread.so.0 |
| 348 | # Analyze PID 181 and locks from this binary. |
| 349 | # If tracing a process that is running from |
| 350 | # a dynamically-linked binary, this argument |
| 351 | # is required and should be the path to the |
| 352 | # pthread library. |
| 353 | |
| 354 | deadlock_detector 181 --verbose |
| 355 | # Analyze PID 181 and print statistics about |
| 356 | # the mutex wait graph. |
| 357 | |
| 358 | deadlock_detector 181 --lock-symbols my_mutex_lock1,my_mutex_lock2 \ |
| 359 | --unlock-symbols my_mutex_unlock1,my_mutex_unlock2 |
| 360 | # Analyze PID 181 and trace custom mutex |
| 361 | # symbols instead of pthread mutexes. |
| 362 | |
| 363 | deadlock_detector 181 --dump-graph graph.json |
| 364 | # Analyze PID 181 and dump the mutex wait |
| 365 | # graph to graph.json. |