Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 1 | =========================================================================== |
| 2 | The UDP-Lite protocol (RFC 3828) |
| 3 | =========================================================================== |
| 4 | |
| 5 | |
| 6 | UDP-Lite is a Standards-Track IETF transport protocol whose characteristic |
| 7 | is a variable-length checksum coverage. This benefits transport of multimedia |
| 8 | (video, VoIP) over wireless networks, as partly damaged packets can still be |
| 9 | fed into the codec instead of being discarded due to a failed checksum test. |
| 10 | |
| 11 | This file briefly describes the existing kernel support and the socket API. |
| 12 | For in-depth information, you can consult: |
| 13 | |
Justin P. Mattock | 0ea6e61 | 2010-07-23 20:51:24 -0700 | [diff] [blame] | 14 | o The UDP-Lite Homepage: |
| 15 | http://web.archive.org/web/*/http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/ |
Matt LaPlante | 01dd2fb | 2007-10-20 01:34:40 +0200 | [diff] [blame] | 16 | From here you can also download some example application source code. |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 17 | |
| 18 | o The UDP-Lite HOWTO on |
Justin P. Mattock | 0ea6e61 | 2010-07-23 20:51:24 -0700 | [diff] [blame] | 19 | http://web.archive.org/web/*/http://www.erg.abdn.ac.uk/users/gerrit/udp-lite/ |
| 20 | files/UDP-Lite-HOWTO.txt |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 21 | |
| 22 | o The Wireshark UDP-Lite Wiki (with capture files): |
Masanari Iida | b07d4961 | 2015-06-13 00:23:21 +0900 | [diff] [blame] | 23 | https://wiki.wireshark.org/Lightweight_User_Datagram_Protocol |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 24 | |
| 25 | o The Protocol Spec, RFC 3828, http://www.ietf.org/rfc/rfc3828.txt |
| 26 | |
| 27 | |
| 28 | I) APPLICATIONS |
| 29 | |
| 30 | Several applications have been ported successfully to UDP-Lite. Ethereal |
Justin P. Mattock | 0ea6e61 | 2010-07-23 20:51:24 -0700 | [diff] [blame] | 31 | (now called Wireshark) has UDP-Lite v4/v6 support by default. |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 32 | Porting applications to UDP-Lite is straightforward: only socket level and |
| 33 | IPPROTO need to be changed; senders additionally set the checksum coverage |
| 34 | length (default = header length = 8). Details are in the next section. |
| 35 | |
| 36 | |
| 37 | II) PROGRAMMING API |
| 38 | |
| 39 | UDP-Lite provides a connectionless, unreliable datagram service and hence |
| 40 | uses the same socket type as UDP. In fact, porting from UDP to UDP-Lite is |
| 41 | very easy: simply add `IPPROTO_UDPLITE' as the last argument of the socket(2) |
| 42 | call so that the statement looks like: |
| 43 | |
| 44 | s = socket(PF_INET, SOCK_DGRAM, IPPROTO_UDPLITE); |
| 45 | |
| 46 | or, respectively, |
| 47 | |
| 48 | s = socket(PF_INET6, SOCK_DGRAM, IPPROTO_UDPLITE); |
| 49 | |
| 50 | With just the above change you are able to run UDP-Lite services or connect |
| 51 | to UDP-Lite servers. The kernel will assume that you are not interested in |
| 52 | using partial checksum coverage and so emulate UDP mode (full coverage). |
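
The socket(2) call above can be wrapped as follows. This is only a minimal sketch: the helper name open_udplite_socket() is hypothetical, and the fallback value 136 is the one given in the mini header of section III, used here in case the system headers do not define IPPROTO_UDPLITE.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Fallback for headers that lack the constant (value from section III). */
#ifndef IPPROTO_UDPLITE
#define IPPROTO_UDPLITE 136
#endif

/*
 * Hypothetical helper: open a UDP-Lite datagram socket.
 * family is PF_INET or PF_INET6.
 * Returns the socket descriptor, or -1 with errno set by socket(2).
 */
int open_udplite_socket(int family)
{
    return socket(family, SOCK_DGRAM, IPPROTO_UDPLITE);
}
```

A caller would use open_udplite_socket(PF_INET) exactly where it previously called socket() for UDP; a return value of -1 may simply mean that the running kernel has no UDP-Lite support.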
| 53 | |
| 54 | Making use of the partial checksum coverage facilities requires setting a |
| 55 | single socket option, which takes an integer specifying the coverage length: |
| 56 | |
| 57 | * Sender checksum coverage: UDPLITE_SEND_CSCOV |
| 58 | |
| 59 | For example, |
| 60 | |
| 61 | int val = 20; |
| 62 | setsockopt(s, SOL_UDPLITE, UDPLITE_SEND_CSCOV, &val, sizeof(int)); |
| 63 | |
| 64 | sets the checksum coverage length to 20 bytes (12 bytes data + 8 bytes header). |
| 65 | Of each packet only the first 20 bytes (plus the pseudo-header) will be |
| 66 | checksummed. This is useful for RTP applications which have a 12-byte |
| 67 | base header. |
| 68 | |
| 69 | |
| 70 | * Receiver checksum coverage: UDPLITE_RECV_CSCOV |
| 71 | |
| 72 | This option is the receiver-side analogue. It is truly optional, i.e. not |
| 73 | required to enable traffic with partial checksum coverage. Its function is |
| 74 | that of a traffic filter: when enabled, it instructs the kernel to drop |
| 75 | all packets which have a coverage _less_ than this value. For example, if |
| 76 | RTP and UDP headers are to be protected, a receiver can enforce that only |
| 77 | packets with a minimum coverage of 20 are admitted: |
| 78 | |
| 79 | int min = 20; |
| 80 | setsockopt(s, SOL_UDPLITE, UDPLITE_RECV_CSCOV, &min, sizeof(int)); |
| 81 | |
| 82 | The calls to getsockopt(2) are analogous. Since UDP-Lite is an extension and |
| 83 | not a stand-alone protocol, all socket options known from UDP can be used in |
| 84 | exactly the same manner as before, e.g. UDP_CORK or UDP_ENCAP. |
| 85 | |
| 86 | A detailed discussion of UDP-Lite checksum coverage options is in section IV. |
| 87 | |
| 88 | |
| 89 | III) HEADER FILES |
| 90 | |
| 91 | The socket API requires support through header files in /usr/include: |
| 92 | |
| 93 | * /usr/include/netinet/in.h |
| 94 | to define IPPROTO_UDPLITE |
| 95 | |
| 96 | * /usr/include/netinet/udplite.h |
| 97 | for UDP-Lite header fields and protocol constants |
| 98 | |
| 99 | For testing purposes, the following can serve as a `mini' header file: |
| 100 | |
| 101 | #define IPPROTO_UDPLITE 136 |
| 102 | #define SOL_UDPLITE 136 |
| 103 | #define UDPLITE_SEND_CSCOV 10 |
| 104 | #define UDPLITE_RECV_CSCOV 11 |
| 105 | |
| 106 | Ready-made header files for various distros are in the UDP-Lite tarball. |
| 107 | |
| 108 | |
| 109 | IV) KERNEL BEHAVIOUR WITH REGARD TO THE VARIOUS SOCKET OPTIONS |
| 110 | |
| 111 | To enable debugging messages, the log level needs to be set to 8, as most |
| 112 | messages use the KERN_DEBUG level (7). |
| 113 | |
| 114 | 1) Sender Socket Options |
| 115 | |
| 116 | If the sender specifies a value of 0 as coverage length, the module |
| 117 | assumes full coverage and transmits a packet with a coverage length of 0 |
| 118 | and the corresponding checksum. If the sender specifies a coverage < 8 and |
| 119 | different from 0, the kernel assumes 8 as default value. Finally, |
| 120 | if the specified coverage length exceeds the packet length, the packet |
| 121 | length is used instead as coverage length. |
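
The three sender rules can be modelled in user space as below. This is an illustrative sketch of the behaviour described above, not the kernel code; the function name and the UDPLITE_HDR_LEN macro are hypothetical.

```c
#define UDPLITE_HDR_LEN 8   /* UDP-Lite header length in bytes */

/*
 * Hypothetical model of the sender rules.
 * requested: value passed via UDPLITE_SEND_CSCOV
 * pkt_len:   UDP-Lite packet length in bytes, header included
 * Returns the number of bytes of the packet that end up checksummed.
 */
int effective_send_coverage(int requested, int pkt_len)
{
    if (requested == 0)
        return pkt_len;                 /* 0 means full coverage */
    if (requested < UDPLITE_HDR_LEN)
        requested = UDPLITE_HDR_LEN;    /* too small: raised to 8 */
    if (requested > pkt_len)
        return pkt_len;                 /* clamped to the packet length */
    return requested;
}
```

For a 100-byte packet, requests of 0, 5, 20 and 200 yield 100, 8, 20 and 100 checksummed bytes respectively.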
| 122 | |
| 123 | 2) Receiver Socket Options |
| 124 | |
| 125 | The receiver specifies the minimum value of the coverage length it |
| 126 | is willing to accept. A value of 0 here indicates that the receiver |
| 127 | always wants the whole of the packet covered. In this case, all |
| 128 | partially covered packets are dropped and an error is logged. |
| 129 | |
| 130 | It is not possible to specify illegal values (negative, or between 1 |
| 131 | and 7); in these cases the default of 8 is assumed. |
| 132 | |
| 133 | All packets arriving with a coverage value less than the specified |
| 134 | threshold are discarded; these events are also logged. |
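
The receiver-side filter can be sketched in the same way. The code is a user-space model of the rules above with hypothetical names. It assumes that fully covered packets are always admitted, which the text above does not state explicitly.

```c
#define UDPLITE_HDR_LEN 8   /* UDP-Lite header length in bytes */

/* Hypothetical model: map a requested threshold to the one actually used. */
int effective_recv_threshold(int requested)
{
    if (requested != 0 && requested < UDPLITE_HDR_LEN)
        return UDPLITE_HDR_LEN;         /* illegal values become 8 */
    return requested;
}

/*
 * Hypothetical model of the filter decision.
 * threshold: value returned by effective_recv_threshold()
 * pkt_cov:   number of bytes covered by the packet's checksum
 * pkt_len:   UDP-Lite packet length in bytes, header included
 * Returns 1 if the packet is admitted, 0 if it is dropped.
 */
int recv_admits(int threshold, int pkt_cov, int pkt_len)
{
    if (pkt_cov >= pkt_len)
        return 1;       /* assumption: full coverage always passes */
    if (threshold == 0)
        return 0;       /* receiver insists on full coverage */
    return pkt_cov >= threshold;
}
```

With a threshold of 20, a 100-byte packet covering only 12 bytes is dropped, while one covering 20 bytes is admitted.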
| 135 | |
| 136 | 3) Disabling the Checksum Computation |
| 137 | |
| 138 | On both sender and receiver, checksumming will always be performed |
Matt LaPlante | a982ac0 | 2007-05-09 07:35:06 +0200 | [diff] [blame] | 139 | and cannot be disabled using SO_NO_CHECK. Thus |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 140 | |
| 141 | setsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, ... ); |
| 142 | |
| 143 | will always be ignored, while the value of |
| 144 | |
| 145 | getsockopt(sockfd, SOL_SOCKET, SO_NO_CHECK, &value, ...); |
| 146 | |
| 147 | is meaningless (as in TCP). Packets with a zero checksum field are |
Gerrit Renker | 47112e2 | 2008-07-21 13:35:08 -0700 | [diff] [blame] | 148 | illegal (cf. RFC 3828, sec. 3.1) and will be silently discarded. |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 149 | |
| 150 | 4) Fragmentation |
| 151 | |
| 152 | The checksum computation respects both buffer size and MTU. The size |
| 153 | of UDP-Lite packets is determined by the size of the send buffer. The |
| 154 | minimum size of the send buffer is 2048 (defined as SOCK_MIN_SNDBUF |
| 155 | in include/net/sock.h); the default value is configurable as |
| 156 | net.core.wmem_default or via setting the SO_SNDBUF socket(7) |
| 157 | option. The upper bound for the send buffer is determined |
| 158 | by net.core.wmem_max. |
| 159 | |
| 160 | Given a payload size larger than the send buffer size, UDP-Lite will |
| 161 | split the payload into several individual packets, filling up the |
| 162 | send buffer size in each case. |
| 163 | |
| 164 | The precise packet size also depends on the interface MTU. The interface MTU, |
| 165 | in turn, may trigger IP fragmentation. In this case, the generated |
| 166 | UDP-Lite packet is split into several IP packets, of which only the |
| 167 | first one contains the L4 header. |
| 168 | |
| 169 | The send buffer size has implications on the checksum coverage length. |
| 170 | Consider the following example: |
| 171 | |
| 172 | Payload: 1536 bytes Send Buffer: 1024 bytes |
| 173 | MTU: 1500 bytes Coverage Length: 856 bytes |
| 174 | |
| 175 | UDP-Lite will ship the 1536 bytes in two separate packets: |
| 176 | |
| 177 | Packet 1: 1024 payload + 8 byte header + 20 byte IP header = 1052 bytes |
| 178 | Packet 2: 512 payload + 8 byte header + 20 byte IP header = 540 bytes |
| 179 | |
| 180 | The checksum covers the UDP-Lite header and 848 bytes of the |
| 181 | payload in the first packet; the second packet is fully covered. Note |
| 182 | that for the second packet, the coverage length exceeds the packet |
| 183 | length. The kernel always re-adjusts the coverage length to the packet |
| 184 | length in such cases. |
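
The arithmetic of this example can be reproduced with a short sketch. The struct and function names are hypothetical; the code only models the splitting and the re-adjustment of the coverage length described above, and it assumes a non-zero coverage value.

```c
#define UDPLITE_HDR_LEN 8   /* UDP-Lite header length in bytes */

struct udplite_pkt {
    int l4_len;     /* payload chunk plus 8-byte UDP-Lite header */
    int coverage;   /* bytes actually checksummed in this packet */
};

/*
 * Hypothetical model: split payload_len bytes into packets carrying at
 * most sndbuf payload bytes each and compute the coverage per packet.
 * Returns the number of entries written to out (at most max_pkts).
 */
int split_payload(int payload_len, int sndbuf, int cscov,
                  struct udplite_pkt *out, int max_pkts)
{
    int n = 0;

    if (sndbuf <= 0)
        return 0;
    while (payload_len > 0 && n < max_pkts) {
        int chunk = payload_len < sndbuf ? payload_len : sndbuf;
        int l4_len = chunk + UDPLITE_HDR_LEN;

        out[n].l4_len = l4_len;
        /* coverage longer than the packet is cut back to its length */
        out[n].coverage = cscov > l4_len ? l4_len : cscov;
        payload_len -= chunk;
        n++;
    }
    return n;
}
```

Called with a payload of 1536, a send buffer of 1024 and a coverage of 856, it yields two packets: 1032 bytes with 856 covered, and 520 bytes with all 520 covered.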
| 185 | |
| 186 | To see what happens when one UDP-Lite packet is split into |
| 187 | several tiny fragments, consider the following example. |
| 188 | |
| 189 | Payload: 1024 bytes Send buffer size: 1024 bytes |
| 190 | MTU: 300 bytes Coverage length: 575 bytes |
| 191 | |
| 192 | +-+-----------+--------------+--------------+--------------+ |
| 193 | |8|    272    |     280      |     280      |     192      | |
| 194 | +-+-----------+--------------+--------------+--------------+ |
| 195 |             280            560            840           1032 |
| 196 |                                   ^ |
| 197 | *****checksum coverage************* |
| 198 | |
| 199 | The UDP-Lite module generates one 1032 byte packet (1024 + 8 byte |
| 200 | header). According to the interface MTU, this is split into 4 IP |
| 201 | packets (up to 280 byte IP payload + 20 byte IP header). The kernel module |
| 202 | sums the contents of the entire first two packets, plus 15 bytes of |
| 203 | the third packet before releasing the fragments to the IP module. |
| 204 | |
| 205 | To see the analogous case for IPv6 fragmentation, consider a link |
| 206 | MTU of 1280 bytes and a write buffer of 3356 bytes. If the checksum |
| 207 | coverage is less than 1232 bytes (MTU minus IPv6/fragment header |
| 208 | lengths), only the first fragment needs to be considered. When using |
| 209 | larger checksum coverage lengths, each eligible fragment needs to be |
| 210 | checksummed. Suppose we have a checksum coverage of 3062. The buffer |
| 211 | of 3356 bytes will be split into the following fragments: |
| 212 | |
| 213 | Fragment 1: 1280 bytes carrying 1232 bytes of UDP-Lite data |
| 214 | Fragment 2: 1280 bytes carrying 1232 bytes of UDP-Lite data |
| 215 | Fragment 3: 948 bytes carrying 900 bytes of UDP-Lite data |
| 216 | |
| 217 | The first two fragments have to be checksummed in full, of the last |
| 218 | fragment only 598 (= 3062 - 2*1232) bytes are checksummed. |
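
Both fragmentation examples follow the same arithmetic, which the sketch below reproduces. The function name is hypothetical, and the code is a user-space illustration rather than the kernel implementation. For the IPv6 case it uses a UDP-Lite length of 3364 bytes, the sum of the three fragment payloads listed above.

```c
/*
 * Hypothetical helper: how many bytes of fragment idx (0-based) fall
 * within the checksum coverage.
 * l4_len:       UDP-Lite packet length in bytes, header included
 * frag_payload: UDP-Lite bytes carried per full-sized fragment
 * cscov:        checksum coverage length in bytes
 */
int covered_in_fragment(int l4_len, int frag_payload, int cscov, int idx)
{
    int start = idx * frag_payload;
    int end = start + frag_payload;

    if (end > l4_len)
        end = l4_len;           /* last fragment may be shorter */
    if (start >= end || cscov <= start)
        return 0;               /* fragment lies outside the coverage */
    return (cscov < end ? cscov : end) - start;
}
```

For the IPv4 example (1032, 280, 575) the four fragments contribute 280, 280, 15 and 0 bytes; for the IPv6 example (3364, 1232, 3062) the three fragments contribute 1232, 1232 and 598 bytes.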
| 219 | |
| 220 | While it is important that such cases are dealt with correctly, they |
| 221 | are (annoyingly) rare: UDP-Lite is designed for optimising multimedia |
| 222 | performance over wireless (or generally noisy) links and thus smaller |
Matt LaPlante | 01dd2fb | 2007-10-20 01:34:40 +0200 | [diff] [blame] | 223 | coverage lengths are to be expected. |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 224 | |
| 225 | |
| 226 | V) UDP-LITE RUNTIME STATISTICS AND THEIR MEANING |
| 227 | |
| 228 | Exceptional and error conditions are logged to syslog at the KERN_DEBUG |
| 229 | level. Live statistics about UDP-Lite are available in /proc/net/snmp |
| 230 | and can (with newer versions of netstat) be viewed using |
| 231 | |
| 232 | netstat -svu |
| 233 | |
| 234 | This displays UDP-Lite statistics variables, whose meaning is as follows. |
| 235 | |
Wang Chen | cb75994 | 2007-12-03 22:33:28 +1100 | [diff] [blame] | 236 | InDatagrams: The total number of datagrams delivered to users. |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 237 | |
| 238 | NoPorts: Number of packets received for an unknown port. |
| 239 | These cases are counted separately (not as InErrors). |
| 240 | |
| 241 | InErrors: Number of erroneous UDP-Lite packets. Errors include: |
| 242 | * internal socket queue receive errors |
| 243 | * packet too short (less than 8 bytes or stated |
| 244 | coverage length exceeds received length) |
| 245 | * xfrm4_policy_check() returned with error |
| 246 | * application has specified larger min. coverage |
| 247 | length than that of incoming packet |
| 248 | * checksum coverage violated |
| 249 | * bad checksum |
| 250 | |
| 251 | OutDatagrams: Total number of sent datagrams. |
| 252 | |
| 253 | These statistics derive from the UDP MIB (RFC 2013). |
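
The counters can also be read programmatically. The sketch below assumes that /proc/net/snmp presents them as a pair of lines that both start with "UdpLite:", the first listing the counter names and the second the values in the same order. Both function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given the name line and the value line of one
 * protocol, store the value of counter `name` in *out.
 * Returns 0 on success, -1 if the counter is not found.
 */
int snmp_lookup(const char *names, const char *values,
                const char *name, unsigned long *out)
{
    size_t want = strlen(name);

    /* Skip the "Proto:" prefix on both lines. */
    names = strchr(names, ':');
    values = strchr(values, ':');
    if (!names || !values)
        return -1;
    names++;
    values++;

    for (;;) {
        char *end;
        size_t len;
        unsigned long v;

        names += strspn(names, " \n");
        len = strcspn(names, " \n");
        if (len == 0)
            return -1;              /* ran out of names */
        v = strtoul(values, &end, 10);
        if (end == values)
            return -1;              /* ran out of values */
        if (len == want && strncmp(names, name, len) == 0) {
            *out = v;
            return 0;
        }
        names += len;
        values = end;
    }
}

/* Hypothetical helper: look up one UdpLite counter in /proc/net/snmp. */
int udplite_counter(const char *name, unsigned long *out)
{
    char names[512], values[512];
    int rc = -1;
    FILE *f = fopen("/proc/net/snmp", "r");

    if (!f)
        return -1;
    while (fgets(names, sizeof(names), f)) {
        if (strncmp(names, "UdpLite:", 8) != 0)
            continue;
        if (fgets(values, sizeof(values), f))
            rc = snmp_lookup(names, values, name, out);
        break;
    }
    fclose(f);
    return rc;
}
```

A monitoring tool could call udplite_counter("InErrors", &n) periodically and report the difference between successive readings.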
| 254 | |
| 255 | |
| 256 | VI) IPTABLES |
| 257 | |
| 258 | There is packet match support for UDP-Lite as well as support for the LOG target. |
Matt LaPlante | 01dd2fb | 2007-10-20 01:34:40 +0200 | [diff] [blame] | 259 | If you copy and paste the following line into /etc/protocols, |
Gerrit Renker | ba4e58e | 2006-11-27 11:10:57 -0800 | [diff] [blame] | 260 | |
| 261 | udplite 136 UDP-Lite # UDP-Lite [RFC 3828] |
| 262 | |
| 263 | then |
| 264 | iptables -A INPUT -p udplite -j LOG |
| 265 | |
| 266 | will produce logging output to syslog. Dropping and rejecting packets also works. |
| 267 | |
| 268 | |
| 269 | VII) MAINTAINER ADDRESS |
| 270 | |
| 271 | The UDP-Lite patch was developed at |
| 272 | University of Aberdeen |
| 273 | Electronics Research Group |
| 274 | Department of Engineering |
| 275 | Fraser Noble Building |
| 276 | Aberdeen AB24 3UE; UK |
| 277 | The current maintainer is Gerrit Renker, <gerrit@erg.abdn.ac.uk>. Initial |
| 278 | code was developed by William Stanislaus, <william@erg.abdn.ac.uk>. |