1. Control Interfaces

The interfaces for receiving network packet timestamps are:

* SO_TIMESTAMP
  Generates a timestamp for each incoming packet in (not necessarily
  monotonic) system time. Reports the timestamp via recvmsg() in a
  control message as struct timeval (usec resolution).

* SO_TIMESTAMPNS
  Same timestamping mechanism as SO_TIMESTAMP, but reports the
  timestamp as struct timespec (nsec resolution).

* IP_MULTICAST_LOOP + SO_TIMESTAMP[NS]
  Only for multicast: approximate transmit timestamp obtained by
  reading the looped packet receive timestamp.

* SO_TIMESTAMPING
  Generates timestamps on reception, transmission or both. Supports
  multiple timestamp sources, including hardware. Supports generating
  timestamps for stream sockets.


1.1 SO_TIMESTAMP:

This socket option enables timestamping of datagrams on the reception
path. Because the destination socket, if any, is not known early in
the network stack, the feature has to be enabled for all packets. The
same is true for all early receive timestamp options.

For interface details, see `man 7 socket`.


1.2 SO_TIMESTAMPNS:

This option is identical to SO_TIMESTAMP except for the returned data type.
Its struct timespec allows for higher resolution (ns) timestamps than the
timeval of SO_TIMESTAMP (us).


1.3 SO_TIMESTAMPING:

Supports multiple types of timestamp requests. As a result, this
socket option takes a bitmap of flags, not a boolean. In

  err = setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, (void *) &val, sizeof(val));

val is an integer with any of the following bits set. Setting any other
bit returns EINVAL and does not change the current state.
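
As an illustration, the call can be wrapped as shown here. This is a
minimal sketch, not part of the kernel interface: the helper name
enable_sw_timestamping() and the particular combination of flags are
assumptions made for the example.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request software timestamps in both directions
 * and ask for them to be reported. Returns 0 on success, or -1 with
 * errno set (EINVAL if an unknown bit is passed).
 */
static int enable_sw_timestamping(int fd)
{
	int val = SOF_TIMESTAMPING_TX_SOFTWARE |	/* generate on transmit */
		  SOF_TIMESTAMPING_RX_SOFTWARE |	/* generate on receive */
		  SOF_TIMESTAMPING_SOFTWARE;		/* report software stamps */

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val));
}
```

The option value is passed by address together with its size, as for
any other integer socket option.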


1.3.1 Timestamp Generation

Some bits are requests to the stack to try to generate timestamps. Any
combination of them is valid. Changes to these bits apply to newly
created packets, not to packets already in the stack. As a result, it
is possible to selectively request timestamps for a subset of packets
(e.g., for sampling) by embedding a send() call within two setsockopt
calls, one to enable timestamp generation and one to disable it.
Timestamps may also be generated for reasons other than being
requested by a particular socket, such as when receive timestamping is
enabled system wide, as explained earlier.
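
The sampling pattern described above might look as follows. This is a
sketch under stated assumptions: send_with_tx_timestamp() is a
hypothetical helper, and it keeps the reporting bit
SOF_TIMESTAMPING_SOFTWARE set after the send so that the timestamp,
which arrives later, is still reported.

```c
#include <linux/net_tstamp.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: timestamp only the packet(s) created by this
 * one send() call. Returns the send() result, or -1 on failure.
 */
static ssize_t send_with_tx_timestamp(int fd, const void *buf, size_t len)
{
	/* generation and reporting enabled */
	int on = SOF_TIMESTAMPING_TX_SOFTWARE | SOF_TIMESTAMPING_SOFTWARE;
	/* generation disabled, reporting left enabled */
	int off = SOF_TIMESTAMPING_SOFTWARE;
	ssize_t ret;

	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &on, sizeof(on)))
		return -1;

	ret = send(fd, buf, len, 0);

	if (setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &off, sizeof(off)))
		return -1;

	return ret;
}
```

Packets sent through plain send() on the same socket are then not
timestamped on behalf of this socket.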

SOF_TIMESTAMPING_RX_HARDWARE:
  Request rx timestamps generated by the network adapter.

SOF_TIMESTAMPING_RX_SOFTWARE:
  Request rx timestamps when data enters the kernel. These timestamps
  are generated just after a device driver hands a packet to the
  kernel receive stack.

SOF_TIMESTAMPING_TX_HARDWARE:
  Request tx timestamps generated by the network adapter.

SOF_TIMESTAMPING_TX_SOFTWARE:
  Request tx timestamps when data leaves the kernel. These timestamps
  are generated in the device driver as close as possible, but always
  prior to, passing the packet to the network interface. Hence, they
  require driver support and may not be available for all devices.

SOF_TIMESTAMPING_TX_SCHED:
  Request tx timestamps prior to entering the packet scheduler. Kernel
  transmit latency is, if long, often dominated by queuing delay. The
  difference between this timestamp and one taken at
  SOF_TIMESTAMPING_TX_SOFTWARE will expose this latency independent
  of protocol processing. The latency incurred in protocol
  processing, if any, can be computed by subtracting a userspace
  timestamp taken immediately before send() from this timestamp. On
  machines with virtual devices where a transmitted packet travels
  through multiple devices and, hence, multiple packet schedulers,
  a timestamp is generated at each layer. This allows for fine
  grained measurement of queuing delay.

SOF_TIMESTAMPING_TX_ACK:
  Request tx timestamps when all data in the send buffer has been
  acknowledged. This only makes sense for reliable protocols. It is
  currently only implemented for TCP. For that protocol, it may
  over-report measurement, because the timestamp is generated when all
  data up to and including the buffer at send() was acknowledged: the
  cumulative acknowledgment. The mechanism ignores SACK and FACK.


1.3.2 Timestamp Reporting

The other three bits control which timestamps will be reported in a
generated control message. Changes to the bits take immediate
effect at the timestamp reporting locations in the stack. Timestamps
are only reported for packets that also have the relevant timestamp
generation request set.

SOF_TIMESTAMPING_SOFTWARE:
  Report any software timestamps when available.

SOF_TIMESTAMPING_SYS_HARDWARE:
  This option is deprecated and ignored.

SOF_TIMESTAMPING_RAW_HARDWARE:
  Report hardware timestamps as generated by
  SOF_TIMESTAMPING_TX_HARDWARE or SOF_TIMESTAMPING_RX_HARDWARE when
  available.


1.3.3 Timestamp Options

The interface supports one option:

SOF_TIMESTAMPING_OPT_ID:

  Generate a unique identifier along with each packet. A process can
  have multiple concurrent timestamping requests outstanding. Packets
  can be reordered in the transmit path, for instance in the packet
  scheduler. In that case timestamps will be queued onto the error
  queue out of order from the original send() calls. This option
  embeds a counter that is incremented at send() time, to order
  timestamps within a flow.

  This option is implemented only for transmit timestamps. There, the
  timestamp is always looped along with a struct sock_extended_err.
  The option modifies field ee_data to pass an id that is unique
  among all possibly concurrently outstanding timestamp requests for
  that socket. In practice, it is a monotonically increasing u32
  (that wraps).

  In datagram sockets, the counter increments on each send call. In
  stream sockets, it increments with every byte.


1.4 Bytestream Timestamps

The SO_TIMESTAMPING interface supports timestamping of bytes in a
bytestream. Each request is interpreted as a request for when the
entire contents of the buffer has passed a timestamping point. That
is, for streams option SOF_TIMESTAMPING_TX_SOFTWARE will record
when all bytes have reached the device driver, regardless of how
many packets the data has been converted into.

In general, bytestreams have no natural delimiters and therefore
correlating a timestamp with data is non-trivial. A range of bytes
may be split across segments, any segments may be merged (possibly
coalescing sections of previously segmented buffers associated with
independent send() calls). Segments can be reordered and the same
byte range can coexist in multiple segments for protocols that
implement retransmissions.

It is essential that all timestamps implement the same semantics,
regardless of these possible transformations, as otherwise they are
incomparable. Handling "rare" corner cases differently from the
simple case (a 1:1 mapping from buffer to skb) is insufficient
because performance debugging often needs to focus on such outliers.

In practice, timestamps can be correlated with segments of a
bytestream consistently, if both semantics of the timestamp and the
timing of measurement are chosen correctly. This challenge is no
different from deciding on a strategy for IP fragmentation. There, the
definition is that only the first fragment is timestamped. For
bytestreams, we chose that a timestamp is generated only when all
bytes have passed a point. SOF_TIMESTAMPING_TX_ACK as defined is easy to
implement and reason about. An implementation that has to take into
account SACK would be more complex due to possible transmission holes
and out of order arrival.

On the host, TCP can also break the simple 1:1 mapping from buffer to
skbuff as a result of Nagle, cork, autocork, segmentation and GSO. The
implementation ensures correctness in all cases by tracking the
individual last byte passed to send(), even if it is no longer the
last byte after an skbuff extend or merge operation. It stores the
relevant sequence number in skb_shinfo(skb)->tskey. Because an skbuff
has only one such field, only one timestamp can be generated.

In rare cases, a timestamp request can be missed if two requests are
collapsed onto the same skb. A process can detect this situation by
enabling SOF_TIMESTAMPING_OPT_ID and comparing the byte offset at
send time with the value returned for each timestamp. It can prevent
the situation by always flushing the TCP stack in between requests,
for instance by enabling TCP_NODELAY and disabling TCP_CORK and
autocork.

These precautions ensure that the timestamp is generated only when all
bytes have passed a timestamp point, assuming that the network stack
itself does not reorder the segments. The stack indeed tries to avoid
reordering. The one exception is under administrator control: it is
possible to construct a packet scheduler configuration that delays
segments from the same stream differently. Such a setup would be
unusual.


2. Data Interfaces

Timestamps are read using the ancillary data feature of recvmsg().
See `man 3 cmsg` for details of this interface. The socket manual
page (`man 7 socket`) describes how timestamps generated with
SO_TIMESTAMP and SO_TIMESTAMPNS can be retrieved.


2.1 SCM_TIMESTAMPING records

These timestamps are returned in a control message with cmsg_level
SOL_SOCKET, cmsg_type SCM_TIMESTAMPING, and payload of type

struct scm_timestamping {
	struct timespec ts[3];
};

The structure can return up to three timestamps. This is a legacy
feature. Only one field is non-zero at any time. Most timestamps
are passed in ts[0]. Hardware timestamps are passed in ts[2].

ts[1] used to hold hardware timestamps converted to system time.
Instead, expose the hardware clock device on the NIC directly as
a HW PTP clock source, to allow time conversion in userspace and
optionally synchronize system time with a userspace PTP stack such
as linuxptp. For the PTP clock API, see Documentation/ptp/ptp.txt.

2.1.1 Transmit timestamps with MSG_ERRQUEUE

For transmit timestamps the outgoing packet is looped back to the
socket's error queue with the send timestamp(s) attached. A process
receives the timestamps by calling recvmsg() with flag MSG_ERRQUEUE
set and with a msg_control buffer sufficiently large to receive the
relevant metadata structures. The recvmsg call returns the original
outgoing data packet with two ancillary messages attached.

A message of cmsg_level SOL_IP(V6) and cmsg_type IP(V6)_RECVERR
embeds a struct sock_extended_err. This defines the error type. For
timestamps, the ee_errno field is ENOMSG. The other ancillary message
will have cmsg_level SOL_SOCKET and cmsg_type SCM_TIMESTAMPING. This
embeds the struct scm_timestamping.
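
Reading and decoding the two ancillary messages can be sketched as
follows. The helper names parse_tx_timestamp() and read_tx_timestamp()
and the 512 byte control buffer are assumptions made for this example.
No data buffer is passed, so the looped payload is discarded.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <time.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <linux/errqueue.h>

/*
 * Hypothetical helper: walk the ancillary data of a message read with
 * MSG_ERRQUEUE. Returns 1 if both records were found and ee_errno is
 * ENOMSG, 0 otherwise.
 */
static int parse_tx_timestamp(struct msghdr *msg,
			      struct scm_timestamping *tss,
			      struct sock_extended_err *serr)
{
	struct cmsghdr *cm;
	int have_ts = 0, have_err = 0;

	for (cm = CMSG_FIRSTHDR(msg); cm; cm = CMSG_NXTHDR(msg, cm)) {
		if (cm->cmsg_level == SOL_SOCKET &&
		    cm->cmsg_type == SCM_TIMESTAMPING) {
			memcpy(tss, CMSG_DATA(cm), sizeof(*tss));
			have_ts = 1;
		} else if ((cm->cmsg_level == SOL_IP &&
			    cm->cmsg_type == IP_RECVERR) ||
			   (cm->cmsg_level == SOL_IPV6 &&
			    cm->cmsg_type == IPV6_RECVERR)) {
			memcpy(serr, CMSG_DATA(cm), sizeof(*serr));
			have_err = 1;
		}
	}

	return have_ts && have_err && serr->ee_errno == ENOMSG;
}

/* Hypothetical helper: read one entry from the error queue of fd. */
static int read_tx_timestamp(int fd, struct scm_timestamping *tss,
			     struct sock_extended_err *serr)
{
	union {
		char buf[512];
		struct cmsghdr align;	/* forces cmsghdr alignment */
	} control;
	struct msghdr msg;

	memset(&msg, 0, sizeof(msg));
	msg.msg_control = control.buf;
	msg.msg_controllen = sizeof(control.buf);

	if (recvmsg(fd, &msg, MSG_ERRQUEUE) < 0)
		return -1;

	return parse_tx_timestamp(&msg, tss, serr);
}
```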


2.1.1.2 Timestamp types

The semantics of the three struct timespec are defined by field
ee_info in the extended error structure. It contains a value of
type SCM_TSTAMP_* to define the actual timestamp passed in
scm_timestamping.

The SCM_TSTAMP_* types are 1:1 matches to the SOF_TIMESTAMPING_*
control fields discussed previously, with one exception. For legacy
reasons, SCM_TSTAMP_SND is equal to zero and can be set for both
SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE. It
is the first if ts[2] is non-zero, the second otherwise, in which
case the timestamp is stored in ts[0].
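
The rule for SCM_TSTAMP_SND can be expressed in code. This is a sketch
only: tx_tstamp_type() and the strings it returns are hypothetical and
exist purely for illustration.

```c
#include <time.h>
#include <linux/errqueue.h>

/*
 * Hypothetical helper: name the timestamp type of an error queue
 * entry from ee_info, resolving the SCM_TSTAMP_SND ambiguity by
 * checking whether ts[2] is non-zero.
 */
static const char *tx_tstamp_type(const struct sock_extended_err *serr,
				  const struct scm_timestamping *tss)
{
	switch (serr->ee_info) {
	case SCM_TSTAMP_SND:
		if (tss->ts[2].tv_sec || tss->ts[2].tv_nsec)
			return "hardware send";	/* value is in ts[2] */
		return "software send";		/* value is in ts[0] */
	case SCM_TSTAMP_SCHED:
		return "sched";
	case SCM_TSTAMP_ACK:
		return "ack";
	default:
		return "unknown";
	}
}
```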


2.1.1.3 Fragmentation

Fragmentation of outgoing datagrams is rare, but is possible, e.g., by
explicitly disabling PMTU discovery. If an outgoing packet is fragmented,
then only the first fragment is timestamped and returned to the sending
socket.


2.1.1.4 Packet Payload

The calling application is often not interested in receiving the whole
packet payload that it passed to the stack originally: the socket
error queue mechanism is just a method to piggyback the timestamp on.
In this case, the application can choose to read datagrams with a
smaller buffer, possibly even of length 0. The payload is truncated
accordingly. Until the process calls recvmsg() on the error queue,
however, the full packet is queued, taking up budget from SO_RCVBUF.


2.1.1.5 Blocking Read

Reading from the error queue is always a non-blocking operation. To
block waiting on a timestamp, use poll or select. poll() will return
POLLERR in pollfd.revents if any data is ready on the error queue.
There is no need to pass this flag in pollfd.events. This flag is
ignored on request. See also `man 2 poll`.
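
A blocking wait along these lines could be written as below. The
helper name wait_for_errqueue() is an assumption made for this sketch.
POLLERR is also raised for conditions other than a queued timestamp,
so the caller still has to read the error queue to find out what
arrived.

```c
#include <poll.h>

/*
 * Hypothetical helper: wait up to timeout_ms for POLLERR on fd.
 * Returns 1 if POLLERR was reported, 0 if not, -1 on poll() failure.
 */
static int wait_for_errqueue(int fd, int timeout_ms)
{
	struct pollfd pfd = {
		.fd = fd,
		.events = 0,	/* POLLERR is reported without being requested */
	};
	int ret;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret <= 0)
		return ret;

	return (pfd.revents & POLLERR) ? 1 : 0;
}
```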


2.1.2 Receive timestamps

On reception, there is no reason to read from the socket error queue.
The SCM_TIMESTAMPING ancillary data is sent along with the packet data
on a normal recvmsg(). Since this is not a socket error, it is not
accompanied by a message SOL_IP(V6)/IP(V6)_RECVERR. In this case,
the meaning of the three fields in struct scm_timestamping is
implicitly defined. ts[0] holds a software timestamp if set, ts[1]
is again deprecated and ts[2] holds a hardware timestamp if set.


3. Hardware Timestamping configuration: SIOCSHWTSTAMP and SIOCGHWTSTAMP

Hardware time stamping must also be initialized for each device driver
that is expected to do hardware time stamping. The parameter is defined in
/include/linux/net_tstamp.h as:

struct hwtstamp_config {
	int flags;	/* no flags defined right now, must be zero */
	int tx_type;	/* HWTSTAMP_TX_* */
	int rx_filter;	/* HWTSTAMP_FILTER_* */
};

Desired behavior is passed into the kernel and to a specific device by
calling ioctl(SIOCSHWTSTAMP) with a pointer to a struct ifreq whose
ifr_data points to a struct hwtstamp_config. The tx_type and
rx_filter are hints to the driver about what it is expected to do. If
the requested fine-grained filtering for incoming packets is not
supported, the driver may time stamp more than just the requested types
of packets.

A driver which supports hardware time stamping shall update the struct
with the actual, possibly more permissive configuration. If the
requested packets cannot be time stamped, then nothing should be
changed and ERANGE shall be returned (in contrast to EINVAL, which
indicates that SIOCSHWTSTAMP is not supported at all).

Only a process with admin rights may change the configuration. User
space is responsible for ensuring that multiple processes don't
interfere with each other and that the settings are reset.

Any process can read the actual configuration by passing this
structure to ioctl(SIOCGHWTSTAMP) in the same way. However, this has
not been implemented in all drivers.

/* possible values for hwtstamp_config->tx_type */
enum {
	/*
	 * no outgoing packet will need hardware time stamping;
	 * should a packet arrive which asks for it, no hardware
	 * time stamping will be done
	 */
	HWTSTAMP_TX_OFF,

	/*
	 * enables hardware time stamping for outgoing packets;
	 * the sender of the packet decides which are to be
	 * time stamped by setting SOF_TIMESTAMPING_TX_HARDWARE
	 * before sending the packet
	 */
	HWTSTAMP_TX_ON,
};

/* possible values for hwtstamp_config->rx_filter */
enum {
	/* time stamp no incoming packet at all */
	HWTSTAMP_FILTER_NONE,

	/* time stamp any incoming packet */
	HWTSTAMP_FILTER_ALL,

	/* return value: time stamp all packets requested plus some others */
	HWTSTAMP_FILTER_SOME,

	/* PTP v1, UDP, any kind of event packet */
	HWTSTAMP_FILTER_PTP_V1_L4_EVENT,

	/* for the complete list of values, please check
	 * the include file /include/linux/net_tstamp.h
	 */
};

3.1 Hardware Timestamping Implementation: Device Drivers

A driver which supports hardware time stamping must support the
SIOCSHWTSTAMP ioctl and update the supplied struct hwtstamp_config with
the actual values as described in the section on SIOCSHWTSTAMP. It
should also support SIOCGHWTSTAMP.

Time stamps for received packets must be stored in the skb. To get a pointer
to the shared time stamp structure of the skb call skb_hwtstamps(). Then
set the time stamps in the structure:

struct skb_shared_hwtstamps {
	/* hardware time stamp transformed into duration
	 * since arbitrary point in time
	 */
	ktime_t	hwtstamp;
};

Time stamps for outgoing packets are to be generated as follows:
- In hard_start_xmit(), check if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP)
  is non-zero. If yes, then the driver is expected to do hardware time
  stamping.
- If this is possible for the skb and requested, then declare
  that the driver is doing the time stamping by setting the flag
  SKBTX_IN_PROGRESS in skb_shinfo(skb)->tx_flags, e.g. with

      skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;

  You might want to keep a pointer to the associated skb for the next step
  and not free the skb. A driver not supporting hardware time stamping doesn't
  do that. A driver must never touch sk_buff::tstamp! It is used to store
  software generated time stamps by the network subsystem.
- Driver should call skb_tx_timestamp() as close to passing sk_buff to hardware
  as possible. skb_tx_timestamp() provides a software time stamp if requested
  and hardware timestamping is not possible (SKBTX_IN_PROGRESS not set).
- As soon as the driver has sent the packet and/or obtained a
  hardware time stamp for it, it passes the time stamp back by
  calling skb_tstamp_tx() with the original skb and the raw
  hardware time stamp. skb_tstamp_tx() clones the original skb and
  adds the timestamps, therefore the original skb has to be freed now.
  If obtaining the hardware time stamp somehow fails, then the driver
  should not fall back to software time stamping. The rationale is that
  this would occur at a later time in the processing pipeline than other
  software time stamping and therefore could lead to unexpected deltas
  between time stamps.