| Demonstrations of tcpretrans, the Linux eBPF/bcc version. |
| |
| |
| This tool traces the kernel TCP retransmit function to show details of these |
| retransmits. For example: |
| |
| # ./tcpretrans |
| TIME PID IP LADDR:LPORT T> RADDR:RPORT STATE |
| 01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED |
| 01:55:05 0 4 10.153.223.157:22 R> 69.53.245.40:34619 ESTABLISHED |
| 01:55:17 0 4 10.153.223.157:22 R> 69.53.245.40:22957 ESTABLISHED |
| [...] |
| |
| This output shows three TCP retransmits, the first two were for an IPv4 |
| connection from 10.153.223.157 port 22 to 69.53.245.40 port 34619. The TCP |
| state was "ESTABLISHED" at the time of the retransmit. The on-CPU PID at the |
| time of the retransmit is printed, in this case 0 (the kernel, which will |
| be the case most of the time). |
| |
| Retransmits are usually a sign of poor network health, and this tool is |
| useful for their investigation. Unlike using tcpdump, this tool has very |
| low overhead, as it only traces the retransmit function. It also prints |
| additional kernel details: the state of the TCP session at the time of the |
| retransmit. |
| |
| |
| A -l option will include TCP tail loss probe attempts: |
| |
| # ./tcpretrans -l |
| TIME PID IP LADDR:LPORT T> RADDR:RPORT STATE |
| 01:55:45 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED |
| 01:55:46 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED |
| 01:55:46 0 4 10.153.223.157:22 R> 69.53.245.40:51601 ESTABLISHED |
| 01:55:53 0 4 10.153.223.157:22 L> 69.53.245.40:46444 ESTABLISHED |
| 01:56:06 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:06 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:08 1938 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| 01:56:08 0 4 10.153.223.157:22 R> 69.53.245.40:46444 ESTABLISHED |
| [...] |
| |
| See the "L>" in the "T>" column. These are attempts: the kernel probably |
| sent a TLP, but in some cases it might not have been ultimately sent. |
| |
| To spot heavily retransmitting flows quickly one can use the -c flag. It will |
| count occurring retransmits per flow. |
| |
| # ./tcpretrans.py -c |
| Tracing retransmits ... Hit Ctrl-C to end |
| ^C |
| LADDR:LPORT RADDR:RPORT RETRANSMITS |
| 192.168.10.50:60366 <-> 172.217.21.194:443 700 |
| 192.168.10.50:666 <-> 172.213.11.195:443 345 |
| 192.168.10.50:366 <-> 172.212.22.194:443 211 |
| [...] |
| |
| This can ease to quickly isolate congested or otherwise awry network paths |
| responsible for clamping tcp performance. |
| |
| USAGE message: |
| |
| # ./tcpretrans -h |
| usage: tcpretrans [-h] [-l] |
| |
| Trace TCP retransmits |
| |
| optional arguments: |
| -h, --help show this help message and exit |
| -l, --lossprobe include tail loss probe attempts |
| -c, --count count occurred retransmits per flow |
| |
| examples: |
| ./tcpretrans # trace TCP retransmits |
| ./tcpretrans -l # include TLP attempts |