| 1 | PPP Generic Driver and Channel Interface |
| 2 | ---------------------------------------- |
| 3 | |
| 4 | Paul Mackerras |
| 5 | paulus@samba.org |
| 6 | 7 Feb 2002 |
| 7 | |
| 8 | The generic PPP driver in linux-2.4 provides an implementation of the |
| 9 | functionality which is of use in any PPP implementation, including: |
| 10 | |
| 11 | * the network interface unit (ppp0 etc.) |
| 12 | * the interface to the networking code |
| 13 | * PPP multilink: splitting datagrams between multiple links, and |
| 14 | ordering and combining received fragments |
| 15 | * the interface to pppd, via a /dev/ppp character device |
| 16 | * packet compression and decompression |
| 17 | * TCP/IP header compression and decompression |
| 18 | * detecting network traffic for demand dialling and for idle timeouts |
| 19 | * simple packet filtering |
| 20 | |
| 21 | For sending and receiving PPP frames, the generic PPP driver calls on |
| 22 | the services of PPP `channels'. A PPP channel encapsulates a |
| 23 | mechanism for transporting PPP frames from one machine to another. A |
| 24 | PPP channel implementation can be arbitrarily complex internally but |
| 25 | has a very simple interface with the generic PPP code: it merely has |
| 26 | to be able to send PPP frames, receive PPP frames, and optionally |
| 27 | handle ioctl requests. Currently there are PPP channel |
| 28 | implementations for asynchronous serial ports, synchronous serial |
| 29 | ports, and PPP over Ethernet. |
| 30 | |
| 31 | This architecture makes it possible to implement PPP multilink in a |
| 32 | natural and straightforward way, by allowing more than one channel to |
| 33 | be linked to each ppp network interface unit. The generic layer is |
| 34 | responsible for splitting datagrams on transmit and recombining them |
| 35 | on receive. |
| 36 | |
| 37 | |
| 38 | PPP channel API |
| 39 | --------------- |
| 40 | |
| 41 | See include/linux/ppp_channel.h for the declaration of the types and |
| 42 | functions used to communicate between the generic PPP layer and PPP |
| 43 | channels. |
| 44 | |
| 45 | Each channel has to provide two functions to the generic PPP layer, |
| 46 | via the ppp_channel.ops pointer: |
| 47 | |
| 48 | * start_xmit() is called by the generic layer when it has a frame to |
| 49 | send. The channel has the option of rejecting the frame for |
| 50 | flow-control reasons. In this case, start_xmit() should return 0 |
| 51 | and the channel should call the ppp_output_wakeup() function at a |
| 52 | later time when it can accept frames again, and the generic layer |
| 53 | will then attempt to retransmit the rejected frame(s). If the frame |
| 54 | is accepted, the start_xmit() function should return 1. |
| 55 | |
| 56 | * ioctl() provides an interface which can be used by a user-space |
| 57 | program to control aspects of the channel's behaviour. This |
| 58 | procedure will be called when a user-space program does an ioctl |
| 59 | system call on an instance of /dev/ppp which is bound to the |
| 60 | channel. (Usually it would only be pppd which would do this.) |
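
The start_xmit() contract described above can be modelled in user space
as follows. This is only a sketch, not kernel code: demo_channel,
demo_start_xmit() and demo_tx_done() are hypothetical names, and the
two-frame queue limit is an arbitrary assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; every name here is hypothetical. */
#define DEMO_TXQ_LEN 2          /* assumed queue depth, for illustration */

struct demo_channel {
    size_t queued;              /* frames accepted but not yet sent */
    int    wakeup_needed;       /* set when a frame has been rejected */
    int    wakeups;             /* counts simulated ppp_output_wakeup() calls */
};

/* Follows the start_xmit() contract: 1 = frame accepted, 0 = rejected. */
static int demo_start_xmit(struct demo_channel *ch)
{
    assert(ch != NULL);
    if (ch->queued == DEMO_TXQ_LEN) {
        ch->wakeup_needed = 1;  /* remember to wake the generic layer */
        return 0;
    }
    ch->queued++;
    return 1;
}

/* Called when the medium has finished sending one frame. */
static void demo_tx_done(struct demo_channel *ch)
{
    assert(ch != NULL && ch->queued > 0);
    ch->queued--;
    if (ch->wakeup_needed) {
        ch->wakeup_needed = 0;
        ch->wakeups++;          /* a real channel would call ppp_output_wakeup() */
    }
}
```

In this model the wakeup is issued only if a frame was rejected earlier,
so the generic layer is not woken needlessly.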
| 61 | |
| 62 | The generic PPP layer provides seven functions to channels: |
| 63 | |
| 64 | * ppp_register_channel() is called when a channel has been created, to |
| 65 | notify the PPP generic layer of its presence. For example, setting |
| 66 | a serial port to the PPPDISC line discipline causes the ppp_async |
| 67 | channel code to call this function. |
| 68 | |
| 69 | * ppp_unregister_channel() is called when a channel is to be |
| 70 | destroyed. For example, the ppp_async channel code calls this when |
| 71 | a hangup is detected on the serial port. |
| 72 | |
| 73 | * ppp_output_wakeup() is called by a channel when it has previously |
| 74 | rejected a call to its start_xmit() function, and can now accept more |
| 75 | packets. |
| 76 | |
| 77 | * ppp_input() is called by a channel when it has received a complete |
| 78 | PPP frame. |
| 79 | |
| 80 | * ppp_input_error() is called by a channel when it has detected that a |
| 81 | frame has been lost or dropped (for example, because of an FCS (frame |
| 82 | check sequence) error). |
| 83 | |
| 84 | * ppp_channel_index() returns the channel index assigned by the PPP |
| 85 | generic layer to this channel. The channel should provide some way |
| 86 | (e.g. an ioctl) to transmit this back to user-space, as user-space |
| 87 | will need it to attach an instance of /dev/ppp to this channel. |
| 88 | |
| 89 | * ppp_unit_number() returns the unit number of the ppp network |
| 90 | interface to which this channel is connected, or -1 if the channel |
| 91 | is not connected. |
| 92 | |
| 93 | Connecting a channel to the ppp generic layer is initiated from the |
| 94 | channel code, rather than from the generic layer. The channel is |
| 95 | expected to have some way for a user-level process to control it |
| 96 | independently of the ppp generic layer. For example, with the |
| 97 | ppp_async channel, this is provided by the file descriptor to the |
| 98 | serial port. |
| 99 | |
| 100 | Generally a user-level process will initialize the underlying |
| 101 | communications medium and prepare it to do PPP. For example, with an |
| 102 | async tty, this can involve setting the tty speed and modes, issuing |
| 103 | modem commands, and then going through some sort of dialog with the |
| 104 | remote system to invoke PPP service there. We refer to this process |
| 105 | as `discovery'. Then the user-level process tells the medium to |
| 106 | become a PPP channel and register itself with the generic PPP layer. |
| 107 | The channel then has to report the channel number assigned to it back |
| 108 | to the user-level process. From that point, the PPP negotiation code |
| 109 | in the PPP daemon (pppd) can take over and perform the PPP |
| 110 | negotiation, accessing the channel through the /dev/ppp interface. |
| 111 | |
| 112 | At the interface to the PPP generic layer, PPP frames are stored in |
| 113 | skbuff structures and start with the two-byte PPP protocol number. |
| 114 | The frame does *not* include the 0xff `address' byte or the 0x03 |
| 115 | `control' byte that are optionally used in async PPP. Nor is there |
| 116 | any escaping of control characters, nor are there any FCS or framing |
| 117 | characters included. That is all the responsibility of the channel |
| 118 | code, if it is needed for the particular medium. That is, the skbuffs |
| 119 | presented to the start_xmit() function contain only the 2-byte |
| 120 | protocol number and the data, and the skbuffs presented to ppp_input() |
| 121 | must be in the same format. |
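
As an illustration of the frame layout, the sketch below converts a
received async frame into the format expected by ppp_input(). It is a
user-space model with a hypothetical name (demo_to_generic_format) and
assumes that escaping has already been undone and the FCS removed. It
also assumes that a one-byte protocol number (protocol-field
compression as defined in RFC 1661) must be expanded back to two bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a kernel function.  Copies `in` to `out` in
 * the generic-layer layout: 2-byte protocol number followed by the data.
 * Returns the output length, or 0 if the frame is too short or does not
 * fit in `out`.
 */
static size_t demo_to_generic_format(const unsigned char *in, size_t in_len,
                                     unsigned char *out, size_t out_cap)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    /* Drop the optional 0xff address and 0x03 control bytes. */
    if (in_len >= 2 && in[0] == 0xff && in[1] == 0x03) {
        in += 2;
        in_len -= 2;
    }
    if (in_len == 0)
        return 0;

    /* An odd first byte means the protocol number was sent as a single
     * byte, so restore the leading zero byte. */
    if (in[0] & 1) {
        if (out_cap < 1)
            return 0;
        out[n++] = 0x00;
    } else if (in_len < 2) {
        return 0;
    }

    if (in_len > out_cap - n)
        return 0;
    memcpy(out + n, in, in_len);
    return n + in_len;
}
```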
| 122 | |
| 123 | The channel must provide an instance of a ppp_channel struct to |
| 124 | represent the channel. The channel is free to use the `private' field |
| 125 | however it wishes. The channel should initialize the `mtu' and |
| 126 | `hdrlen' fields before calling ppp_register_channel() and not change |
| 127 | them until after ppp_unregister_channel() returns. The `mtu' field |
| 128 | represents the maximum size of the data part of the PPP frames, that |
| 129 | is, it does not include the 2-byte protocol number. |
| 130 | |
| 131 | If the channel needs some headroom in the skbuffs presented to it for |
| 132 | transmission (i.e., some space free in the skbuff data area before the |
| 133 | start of the PPP frame), it should set the `hdrlen' field of the |
| 134 | ppp_channel struct to the amount of headroom required. The generic |
| 135 | PPP layer will attempt to provide that much headroom but the channel |
| 136 | should still check if there is sufficient headroom and copy the skbuff |
| 137 | if there isn't. |
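
The headroom check can be pictured with the following user-space
model. struct demo_buf is a hypothetical stand-in for an skbuff, and
demo_ensure_headroom() is an assumed helper name. A real channel would
use the kernel's own skbuff helpers (skb_headroom(), for instance)
rather than anything shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skbuff: `data` points into the allocation
 * that starts at `head`; the gap between the two is the headroom. */
struct demo_buf {
    unsigned char *head;
    unsigned char *data;
    size_t len;
};

static size_t demo_headroom(const struct demo_buf *b)
{
    return (size_t)(b->data - b->head);
}

/* Make sure at least `need` bytes of headroom exist, copying the frame
 * into a new allocation if they do not.  Returns 0 on success, or -1 if
 * the allocation fails (the buffer is then left unchanged). */
static int demo_ensure_headroom(struct demo_buf *b, size_t need)
{
    unsigned char *nhead;

    assert(b != NULL && b->head != NULL);
    if (demo_headroom(b) >= need)
        return 0;               /* enough room already, no copy needed */

    nhead = malloc(need + b->len);
    if (nhead == NULL)
        return -1;
    memcpy(nhead + need, b->data, b->len);
    free(b->head);
    b->head = nhead;
    b->data = nhead + need;
    return 0;
}
```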
| 138 | |
| 139 | On the input side, channels should ideally provide at least 2 bytes of |
| 140 | headroom in the skbuffs presented to ppp_input(). The generic PPP |
| 141 | code does not require this but will be more efficient if this is done. |
| 142 | |
| 143 | |
| 144 | Buffering and flow control |
| 145 | -------------------------- |
| 146 | |
| 147 | The generic PPP layer has been designed to minimize the amount of data |
| 148 | that it buffers in the transmit direction. It maintains a queue of |
| 149 | transmit packets for the PPP unit (network interface device) plus a |
| 150 | queue of transmit packets for each attached channel. Normally the |
| 151 | transmit queue for the unit will contain at most one packet; the |
| 152 | exceptions are when pppd sends packets by writing to /dev/ppp, and |
| 153 | when the core networking code calls the generic layer's start_xmit() |
| 154 | function with the queue stopped, i.e. when the generic layer has |
| 155 | called netif_stop_queue(), which only happens on a transmit timeout. |
| 156 | The generic layer's start_xmit() function always accepts and queues the |
| 157 | packet which it is asked to transmit. |
| 158 | |
| 159 | Transmit packets are dequeued from the PPP unit transmit queue and |
| 160 | then subjected to TCP/IP header compression and packet compression |
| 161 | (Deflate or BSD-Compress compression), as appropriate. After this |
| 162 | point the packets can no longer be reordered, as the decompression |
| 163 | algorithms rely on receiving compressed packets in the same order that |
| 164 | they were generated. |
| 165 | |
| 166 | If multilink is not in use, this packet is then passed to the attached |
| 167 | channel's start_xmit() function. If the channel refuses to take |
| 168 | the packet, the generic layer saves it for later transmission. The |
| 169 | generic layer will call the channel's start_xmit() function again |
| 170 | when the channel calls ppp_output_wakeup() or when the core |
| 171 | networking code calls the generic layer's start_xmit() function |
| 172 | again. The generic layer contains no timeout and retransmission |
| 173 | logic; it relies on the core networking code for that. |
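
The non-multilink case can be sketched as a single saved-frame slot.
This is a simplified user-space model, not the actual generic layer
code: demo_xmit_state, demo_push() and demo_send() are hypothetical
names, and the toy channel exists only to exercise the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the non-multilink transmit path. */
struct demo_xmit_state {
    const void *pending;                        /* frame the channel refused */
    int (*chan_start_xmit)(const void *frame);  /* 1 = accepted, 0 = refused */
};

/* Offer the saved frame (if any) to the channel again.  Models what
 * happens on ppp_output_wakeup() or on the next transmit request.
 * Returns 1 when nothing is left pending, 0 otherwise. */
static int demo_push(struct demo_xmit_state *st)
{
    assert(st != NULL && st->chan_start_xmit != NULL);
    if (st->pending != NULL && st->chan_start_xmit(st->pending))
        st->pending = NULL;
    return st->pending == NULL;
}

/* Hand a new frame to the channel, saving it if the channel refuses.
 * Returns 0 if an earlier frame is still blocked, in which case the
 * caller keeps the new frame on its own queue. */
static int demo_send(struct demo_xmit_state *st, const void *frame)
{
    assert(st != NULL && frame != NULL);
    if (!demo_push(st))
        return 0;
    if (!st->chan_start_xmit(frame))
        st->pending = frame;    /* saved for later transmission */
    return 1;
}

/* Toy channel: refuses every frame while demo_chan_busy is non-zero. */
static int demo_chan_busy;

static int demo_chan_xmit(const void *frame)
{
    (void)frame;
    return !demo_chan_busy;
}
```

Note that the model has no timer: a refused frame stays in the slot
until something calls demo_push() again.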
| 174 | |
| 175 | If multilink is in use, the generic layer divides the packet into one |
| 176 | or more fragments and puts a multilink header on each fragment. It |
| 177 | decides how many fragments to use based on the length of the packet |
| 178 | and the number of channels which are potentially able to accept a |
| 179 | fragment at the moment. A channel is potentially able to accept a |
| 180 | fragment if it doesn't have any fragments currently queued up for it |
| 181 | to transmit. The channel may still refuse a fragment; in this case |
| 182 | the fragment is queued up for the channel to transmit later. This |
| 183 | scheme has the effect that more fragments are given to higher- |
| 184 | bandwidth channels. It also means that under light load, the generic |
| 185 | layer will tend to fragment large packets across all the channels, |
| 186 | thus reducing latency, while under heavy load, packets will tend to be |
| 187 | transmitted as single fragments, thus reducing the overhead of |
| 188 | fragmentation. |
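
The text above does not spell out the layout of the multilink header.
The sketch below follows the fragment header formats defined in RFC
1990 (a begin bit, an end bit and a 24-bit or 12-bit sequence number).
The function and macro names are hypothetical, and the header shown is
what follows the two-byte PPP protocol number of the fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MP_BEGIN 0x80      /* B bit: first fragment of a packet */
#define DEMO_MP_END   0x40      /* E bit: last fragment of a packet */

/* Long format: flags byte followed by a 24-bit sequence number. */
static void demo_mp_header_long(unsigned char *hdr, int first, int last,
                                uint32_t seq)
{
    assert(hdr != NULL);
    hdr[0] = (unsigned char)((first ? DEMO_MP_BEGIN : 0) |
                             (last ? DEMO_MP_END : 0));
    hdr[1] = (unsigned char)(seq >> 16);
    hdr[2] = (unsigned char)(seq >> 8);
    hdr[3] = (unsigned char)seq;
}

/* Short format: flags share the first byte with the top 4 bits of a
 * 12-bit sequence number. */
static void demo_mp_header_short(unsigned char *hdr, int first, int last,
                                 uint32_t seq)
{
    assert(hdr != NULL);
    hdr[0] = (unsigned char)((first ? DEMO_MP_BEGIN : 0) |
                             (last ? DEMO_MP_END : 0) |
                             ((seq >> 8) & 0x0f));
    hdr[1] = (unsigned char)seq;
}
```

A packet sent as a single fragment carries both the begin and the end
bit in its header.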
| 189 | |
| 190 | |
| 191 | SMP safety |
| 192 | ---------- |
| 193 | |
| 194 | The PPP generic layer has been designed to be SMP-safe. Locks are |
| 195 | used around accesses to the internal data structures where necessary |
| 196 | to ensure their integrity. As part of this, the generic layer |
| 197 | requires that the channels adhere to certain requirements and in turn |
| 198 | provides certain guarantees to the channels. Essentially the channels |
| 199 | are required to provide the appropriate locking on the ppp_channel |
| 200 | structures that form the basis of the communication between the |
| 201 | channel and the generic layer. This is because the channel provides |
| 202 | the storage for the ppp_channel structure, and so the channel is |
| 203 | required to provide the guarantee that this storage exists and is |
| 204 | valid at the appropriate times. |
| 205 | |
| 206 | The generic layer requires these guarantees from the channel: |
| 207 | |
| 208 | * The ppp_channel object must exist from the time that |
| 209 | ppp_register_channel() is called until after the call to |
| 210 | ppp_unregister_channel() returns. |
| 211 | |
| 212 | * No thread may be in a call to any of ppp_input(), ppp_input_error(), |
| 213 | ppp_output_wakeup(), ppp_channel_index() or ppp_unit_number() for a |
| 214 | channel at the time that ppp_unregister_channel() is called for that |
| 215 | channel. |
| 216 | |
| 217 | * ppp_register_channel() and ppp_unregister_channel() must be called |
| 218 | from process context, not interrupt or softirq/BH context. |
| 219 | |
| 220 | * The remaining generic layer functions may be called at softirq/BH |
| 221 | level but must not be called from a hardware interrupt handler. |
| 222 | |
| 223 | * The generic layer may call the channel start_xmit() function at |
| 224 | softirq/BH level but will not call it at interrupt level. Thus the |
| 225 | start_xmit() function must not block. |
| 226 | |
| 227 | * The generic layer will only call the channel ioctl() function in |
| 228 | process context. |
| 229 | |
| 230 | The generic layer provides these guarantees to the channels: |
| 231 | |
| 232 | * The generic layer will not call the start_xmit() function for a |
| 233 | channel while any thread is already executing in that function for |
| 234 | that channel. |
| 235 | |
| 236 | * The generic layer will not call the ioctl() function for a channel |
| 237 | while any thread is already executing in that function for that |
| 238 | channel. |
| 239 | |
| 240 | * By the time a call to ppp_unregister_channel() returns, no thread |
| 241 | will be executing in a call from the generic layer to that channel's |
| 242 | start_xmit() or ioctl() function, and the generic layer will not |
| 243 | call either of those functions subsequently. |
| 244 | |
| 245 | |
| 246 | Interface to pppd |
| 247 | ----------------- |
| 248 | |
| 249 | The PPP generic layer exports a character device interface called |
| 250 | /dev/ppp. This is used by pppd to control PPP interface units and |
| 251 | channels. Although there is only one /dev/ppp, each open instance of |
| 252 | /dev/ppp acts independently and can be attached either to a PPP unit |
| 253 | or a PPP channel. This is achieved using the file->private_data field |
| 254 | to point to a separate object for each open instance of /dev/ppp. In |
| 255 | this way an effect similar to Solaris' clone open is obtained, |
| 256 | allowing us to control an arbitrary number of PPP interfaces and |
| 257 | channels without having to fill up /dev with hundreds of device names. |
| 258 | |
| 259 | When /dev/ppp is opened, a new instance is created which is initially |
| 260 | unattached. Using an ioctl call, it can then be attached to an |
| 261 | existing unit, attached to a newly-created unit, or attached to an |
| 262 | existing channel. An instance attached to a unit can be used to send |
| 263 | and receive PPP control frames, using the read() and write() system |
| 264 | calls, along with poll() if necessary. Similarly, an instance |
| 265 | attached to a channel can be used to send and receive PPP frames on |
| 266 | that channel. |
| 267 | |
| 268 | In multilink terms, the unit represents the bundle, while the channels |
| 269 | represent the individual physical links. Thus, a PPP frame sent by a |
| 270 | write to the unit (i.e., to an instance of /dev/ppp attached to the |
| 271 | unit) will be subject to bundle-level compression and to fragmentation |
| 272 | across the individual links (if multilink is in use). In contrast, a |
| 273 | PPP frame sent by a write to the channel will be sent as-is on that |
| 274 | channel, without any multilink header. |
| 275 | |
| 276 | A channel is not initially attached to any unit. In this state it can |
| 277 | be used for PPP negotiation but not for the transfer of data packets. |
| 278 | It can then be connected to a PPP unit with an ioctl call, which |
| 279 | makes it available to send and receive data packets for that unit. |
| 280 | |
| 281 | The ioctl calls which are available on an instance of /dev/ppp depend |
| 282 | on whether it is unattached, attached to a PPP interface, or attached |
| 283 | to a PPP channel. The ioctl calls which are available on an |
| 284 | unattached instance are: |
| 285 | |
| 286 | * PPPIOCNEWUNIT creates a new PPP interface and makes this /dev/ppp |
| 287 | instance the "owner" of the interface. The argument should point to |
| 288 | an int which is the desired unit number if >= 0, or -1 to assign the |
| 289 | lowest unused unit number. Being the owner of the interface means |
| 290 | that the interface will be shut down if this instance of /dev/ppp is |
| 291 | closed. |
| 292 | |
| 293 | * PPPIOCATTACH attaches this instance to an existing PPP interface. |
| 294 | The argument should point to an int containing the unit number. |
| 295 | This does not make this instance the owner of the PPP interface. |
| 296 | |
| 297 | * PPPIOCATTCHAN attaches this instance to an existing PPP channel. |
| 298 | The argument should point to an int containing the channel number. |
| 299 | |
| 300 | The ioctl calls available on an instance of /dev/ppp attached to a |
| 301 | channel are: |
| 302 | |
| 303 | * PPPIOCDETACH detaches the instance from the channel. This ioctl is |
| 304 | deprecated since the same effect can be achieved by closing the |
| 305 | instance. In order to prevent possible races this ioctl will fail |
| 306 | with an EINVAL error if more than one file descriptor refers to this |
| 307 | instance (i.e. as a result of dup(), dup2() or fork()). |
| 308 | |
| 309 | * PPPIOCCONNECT connects this channel to a PPP interface. The |
| 310 | argument should point to an int containing the interface unit |
| 311 | number. It will return an EINVAL error if the channel is already |
| 312 | connected to an interface, or ENXIO if the requested interface does |
| 313 | not exist. |
| 314 | |
| 315 | * PPPIOCDISCONN disconnects this channel from the PPP interface that |
| 316 | it is connected to. It will return an EINVAL error if the channel |
| 317 | is not connected to an interface. |
| 318 | |
| 319 | * All other ioctl commands are passed to the channel ioctl() function. |
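
Putting the ioctls above together, a user-space program might attach
to a channel, create a unit and connect the two roughly as follows.
This is a sketch under assumptions: demo_ppp_bring_up() is a
hypothetical name, the device path is passed in (normally "/dev/ppp"),
and the channel index is assumed to have been obtained already by
whatever means the channel provides.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if_ppp.h>

/*
 * Attach one /dev/ppp instance to channel `chindex`, create a new unit
 * on a second instance, then connect the channel to that unit.
 * On success returns the unit number and stores both descriptors.
 * On failure returns -1 and leaves nothing open.
 */
static int demo_ppp_bring_up(const char *path, int chindex,
                             int *chan_fd, int *unit_fd)
{
    int cfd, ufd;
    int unit = -1;              /* -1 asks for the lowest unused unit */

    assert(path != NULL && chan_fd != NULL && unit_fd != NULL);

    cfd = open(path, O_RDWR);
    if (cfd < 0)
        return -1;
    if (ioctl(cfd, PPPIOCATTCHAN, &chindex) < 0)
        goto fail_chan;

    ufd = open(path, O_RDWR);
    if (ufd < 0)
        goto fail_chan;
    if (ioctl(ufd, PPPIOCNEWUNIT, &unit) < 0)
        goto fail_unit;
    if (ioctl(cfd, PPPIOCCONNECT, &unit) < 0)
        goto fail_unit;

    *chan_fd = cfd;
    *unit_fd = ufd;
    return unit;

fail_unit:
    close(ufd);
fail_chan:
    close(cfd);
    return -1;
}
```

Because the unit instance owns the interface, closing unit_fd shuts
the interface down, so the caller has to keep it open for the lifetime
of the link.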
| 320 | |
| 321 | The ioctl calls that are available on an instance that is attached to |
| 322 | an interface unit are: |
| 323 | |
| 324 | * PPPIOCSMRU sets the MRU (maximum receive unit) for the interface. |
| 325 | The argument should point to an int containing the new MRU value. |
| 326 | |
| 327 | * PPPIOCSFLAGS sets flags which control the operation of the |
| 328 | interface. The argument should be a pointer to an int containing |
| 329 | the new flags value. The bits in the flags value that can be set |
| 330 | are: |
| 331 |     SC_COMP_TCP       enable transmit TCP header compression |
| 332 |     SC_NO_TCP_CCID    disable connection-id compression for |
| 333 |                       TCP header compression |
| 334 |     SC_REJ_COMP_TCP   disable receive TCP header decompression |
| 335 |     SC_CCP_OPEN       Compression Control Protocol (CCP) is |
| 336 |                       open, so inspect CCP packets |
| 337 |     SC_CCP_UP         CCP is up, may (de)compress packets |
| 338 |     SC_LOOP_TRAFFIC   send IP traffic to pppd |
| 339 |     SC_MULTILINK      enable PPP multilink fragmentation on |
| 340 |                       transmitted packets |
| 341 |     SC_MP_SHORTSEQ    expect short multilink sequence |
| 342 |                       numbers on received multilink fragments |
| 343 |     SC_MP_XSHORTSEQ   transmit short multilink sequence numbers |
| 344 | |
| 345 | The values of these flags are defined in <linux/if_ppp.h>. Note |
| 346 | that the values of the SC_MULTILINK, SC_MP_SHORTSEQ and |
| 347 | SC_MP_XSHORTSEQ bits are ignored if the CONFIG_PPP_MULTILINK option |
| 348 | is not selected. |
| 349 | |
| 350 | * PPPIOCGFLAGS returns the value of the status/control flags for the |
| 351 | interface unit. The argument should point to an int where the ioctl |
| 352 | will store the flags value. As well as the values listed above for |
| 353 | PPPIOCSFLAGS, the following bits may be set in the returned value: |
| 354 |     SC_COMP_RUN       CCP compressor is running |
| 355 |     SC_DECOMP_RUN     CCP decompressor is running |
| 356 |     SC_DC_ERROR       CCP decompressor detected non-fatal error |
| 357 |     SC_DC_FERROR      CCP decompressor detected fatal error |
| 358 | |
| 359 | * PPPIOCSCOMPRESS sets the parameters for packet compression or |
| 360 | decompression. The argument should point to a ppp_option_data |
| 361 | structure (defined in <linux/if_ppp.h>), which contains a |
| 362 | pointer/length pair which should describe a block of memory |
| 363 | containing a CCP option specifying a compression method and its |
| 364 | parameters. The ppp_option_data struct also contains a `transmit' |
| 365 | field. If this is 0, the ioctl will affect the receive path, |
| 366 | otherwise the transmit path. |
| 367 | |
| 368 | * PPPIOCGUNIT returns, in the int pointed to by the argument, the unit |
| 369 | number of this interface unit. |
| 370 | |
| 371 | * PPPIOCSDEBUG sets the debug flags for the interface to the value in |
| 372 | the int pointed to by the argument. Only the least significant bit |
| 373 | is used; if this is 1 the generic layer will print some debug |
| 374 | messages during its operation. This is only intended for debugging |
| 375 | the generic PPP layer code; it is generally not helpful for working |
| 376 | out why a PPP connection is failing. |
| 377 | |
| 378 | * PPPIOCGDEBUG returns the debug flags for the interface in the int |
| 379 | pointed to by the argument. |
| 380 | |
| 381 | * PPPIOCGIDLE returns the time, in seconds, since the last data |
| 382 | packets were sent and received. The argument should point to a |
| 383 | ppp_idle structure (defined in <linux/ppp_defs.h>). If the |
| 384 | CONFIG_PPP_FILTER option is enabled, the set of packets which reset |
| 385 | the transmit and receive idle timers is restricted to those which |
| 386 | pass the `active' packet filter. |
| 387 | |
| 388 | * PPPIOCSMAXCID sets the maximum connection-ID parameter (and thus the |
| 389 | number of connection slots) for the TCP header compressor and |
| 390 | decompressor. The lower 16 bits of the int pointed to by the |
| 391 | argument specify the maximum connection-ID for the compressor. If |
| 392 | the upper 16 bits of that int are non-zero, they specify the maximum |
| 393 | connection-ID for the decompressor, otherwise the decompressor's |
| 394 | maximum connection-ID is set to 15. |
| 395 | |
| 396 | * PPPIOCSNPMODE sets the network-protocol mode for a given network |
| 397 | protocol. The argument should point to an npioctl struct (defined |
| 398 | in <linux/if_ppp.h>). The `protocol' field gives the PPP protocol |
| 399 | number for the protocol to be affected, and the `mode' field |
| 400 | specifies what to do with packets for that protocol: |
| 401 | |
| 402 |     NPMODE_PASS       normal operation, transmit and receive packets |
| 403 |     NPMODE_DROP       silently drop packets for this protocol |
| 404 |     NPMODE_ERROR      drop packets and return an error on transmit |
| 405 |     NPMODE_QUEUE      queue up packets for transmit, drop received |
| 406 |                       packets |
| 407 | |
| 408 | At present NPMODE_ERROR and NPMODE_QUEUE have the same effect as |
| 409 | NPMODE_DROP. |
| 410 | |
| 411 | * PPPIOCGNPMODE returns the network-protocol mode for a given |
| 412 | protocol. The argument should point to an npioctl struct with the |
| 413 | `protocol' field set to the PPP protocol number for the protocol of |
| 414 | interest. On return the `mode' field will be set to the network- |
| 415 | protocol mode for that protocol. |
| 416 | |
| 417 | * PPPIOCSPASS and PPPIOCSACTIVE set the `pass' and `active' packet |
| 418 | filters. These ioctls are only available if the CONFIG_PPP_FILTER |
| 419 | option is selected. The argument should point to a sock_fprog |
| 420 | structure (defined in <linux/filter.h>) containing the compiled BPF |
| 421 | instructions for the filter. Packets are dropped if they fail the |
| 422 | `pass' filter; otherwise, if they fail the `active' filter they are |
| 423 | passed but they do not reset the transmit or receive idle timer. |
| 424 | |
| 425 | * PPPIOCSMRRU enables or disables multilink processing for received |
| 426 | packets and sets the multilink MRRU (maximum reconstructed receive |
| 427 | unit). The argument should point to an int containing the new MRRU |
| 428 | value. If the MRRU value is 0, processing of received multilink |
| 429 | fragments is disabled. This ioctl is only available if the |
| 430 | CONFIG_PPP_MULTILINK option is selected. |
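
As a worked example of the PPPIOCSMAXCID argument described above, the
sketch below packs and unpacks the two 16-bit halves. The function
names are hypothetical, and the range limits in the assertion are
assumptions chosen only to keep the arithmetic well defined.

```c
#include <assert.h>

/* Build the int whose address is passed to PPPIOCSMAXCID: the
 * compressor (transmit) value goes in the lower 16 bits and the
 * decompressor (receive) value in the upper 16 bits. */
static int demo_maxcid_arg(unsigned tx_maxcid, unsigned rx_maxcid)
{
    assert(tx_maxcid <= 0xffff && rx_maxcid <= 0x7fff);
    return (int)((rx_maxcid << 16) | tx_maxcid);
}

/* Compressor value: the lower 16 bits. */
static unsigned demo_tx_maxcid(int arg)
{
    return (unsigned)arg & 0xffff;
}

/* Decompressor value: the upper 16 bits, or 15 when they are zero. */
static unsigned demo_rx_maxcid(int arg)
{
    unsigned hi = ((unsigned)arg >> 16) & 0xffff;

    return hi ? hi : 15;
}
```

For instance, demo_maxcid_arg(31, 63) yields 0x003f001f, while
demo_maxcid_arg(15, 0) yields 15 and leaves the decompressor at its
default of 15.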
| 431 | |
| 432 | Last modified: 7-feb-2002 |