HOWTO for multiqueue network device support
===========================================

Intro: Kernel support for multiqueue devices
--------------------------------------------

Kernel support for multiqueue devices is always present.

Section 1: Base driver requirements for implementing multiqueue support
-----------------------------------------------------------------------

Base drivers are required to use the new alloc_etherdev_mq() or
alloc_netdev_mq() functions to allocate the subqueues for the device. The
underlying kernel API will take care of the allocation and deallocation of
the subqueue memory, as well as netdev configuration of where the queues
exist in memory.

The base driver will also need to manage the queues as it does the global
netdev->queue_lock today. Therefore base drivers should use the
netif_{start|stop|wake}_subqueue() functions to manage each queue while the
device is still operational. netdev->queue_lock is still used when the device
comes online or when it's completely shut down (unregister_netdev(), etc.).


Section 2: Qdisc support for multiqueue devices
-----------------------------------------------

Currently two qdiscs are optimized for multiqueue devices. The first is the
default pfifo_fast qdisc. This qdisc supports one qdisc per hardware queue.
A new round-robin qdisc, sch_multiq, also supports multiple hardware queues.
The qdisc is responsible for classifying the skbs and then directing the skbs
to bands and queues based on the value in skb->queue_mapping. Use this field
in the base driver to determine which queue to send the skb to.

sch_multiq has been added for hardware that wishes to avoid head-of-line
blocking. It will cycle through the bands and verify that the hardware queue
associated with the band is not stopped prior to dequeuing a packet.

On qdisc load, the number of bands is based on the number of queues on the
hardware. Once the association is made, any skb with skb->queue_mapping set
will be queued to the band associated with the hardware queue.
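
The round-robin dequeue described above can be modelled in user space. The
following is a minimal sketch in plain C, not the sch_multiq source: packets
are reduced to ints, and every model_* name and MODEL_* limit is hypothetical
and exists only for illustration. It shows one band per hardware queue,
enqueue by queue_mapping, and a dequeue that skips any band whose hardware
queue is stopped.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_BANDS  16   /* hypothetical limit for this model */
#define MODEL_BAND_DEPTH 8    /* hypothetical per-band capacity */

/* One band per hardware queue; a "packet" is just an int here. */
struct model_band {
	int pkts[MODEL_BAND_DEPTH];
	size_t head, len;
	bool hw_stopped;      /* mirrors a stopped hardware subqueue */
};

struct model_multiq {
	struct model_band bands[MODEL_MAX_BANDS];
	size_t num_bands;     /* set from the number of hardware queues */
	size_t cur;           /* band that was served most recently */
};

static void model_init(struct model_multiq *q, size_t num_hw_queues)
{
	size_t i;

	q->num_bands = num_hw_queues > MODEL_MAX_BANDS ?
		       MODEL_MAX_BANDS : num_hw_queues;
	/* Start on the last band so the first dequeue looks at band 0. */
	q->cur = q->num_bands ? q->num_bands - 1 : 0;
	for (i = 0; i < MODEL_MAX_BANDS; i++) {
		q->bands[i].head = 0;
		q->bands[i].len = 0;
		q->bands[i].hw_stopped = false;
	}
}

/* Queue a packet to the band named by queue_mapping.
 * Returns false if the mapping is out of range or the band is full. */
static bool model_enqueue(struct model_multiq *q, size_t queue_mapping, int pkt)
{
	struct model_band *b;

	if (queue_mapping >= q->num_bands)
		return false;
	b = &q->bands[queue_mapping];
	if (b->len == MODEL_BAND_DEPTH)
		return false;
	b->pkts[(b->head + b->len) % MODEL_BAND_DEPTH] = pkt;
	b->len++;
	return true;
}

/* Visit each band at most once, starting after the last one served,
 * and skip bands that are empty or whose hardware queue is stopped. */
static bool model_dequeue(struct model_multiq *q, int *pkt)
{
	size_t tries;

	for (tries = 0; tries < q->num_bands; tries++) {
		struct model_band *b;

		q->cur = (q->cur + 1) % q->num_bands;
		b = &q->bands[q->cur];
		if (b->hw_stopped || b->len == 0)
			continue;
		*pkt = b->pkts[b->head];
		b->head = (b->head + 1) % MODEL_BAND_DEPTH;
		b->len--;
		return true;
	}
	return false;
}
```

In this model, a packet waiting on a stopped queue never prevents packets on
other bands from being dequeued, which is the head-of-line blocking that
sch_multiq is meant to avoid.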


Section 3: Brief howto using MULTIQ for multiqueue devices
----------------------------------------------------------

The userspace command 'tc', part of the iproute2 package, is used to configure
qdiscs. To add the MULTIQ qdisc to your network device, assuming the device
is called eth0, run the following command:

    # tc qdisc add dev eth0 root handle 1: multiq

The qdisc will allocate a number of bands equal to the number of queues that
the device reports, and bring the qdisc online. Assuming eth0 has 4 Tx
queues, the band mapping would look like:

    band 0 => queue 0
    band 1 => queue 1
    band 2 => queue 2
    band 3 => queue 3

Traffic will begin flowing through each queue based on either the
simple_tx_hash function or netdev->select_queue(), if you have it defined.
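
The choice between a hash-based default and a driver-supplied selector can
be illustrated with a small user-space sketch. This is not the kernel's
simple_tx_hash() and does not compute a flow hash; it only assumes that some
32-bit hash is already available and shows one common way to scale it into a
queue index. All model_* names are hypothetical, and clamping an out-of-range
result to queue 0 is an assumption made for this example.

```c
#include <stdint.h>

/* Scale a 32-bit flow hash into [0, num_tx_queues) without a division. */
static uint16_t model_hash_to_queue(uint32_t flow_hash, uint16_t num_tx_queues)
{
	return (uint16_t)(((uint64_t)flow_hash * num_tx_queues) >> 32);
}

/* Hypothetical stand-in for a driver-provided select_queue() hook. */
typedef uint16_t (*model_select_queue_fn)(uint32_t flow_hash,
					  uint16_t num_tx_queues);

/* Example override: always pick the last queue. */
static uint16_t model_always_last(uint32_t flow_hash, uint16_t num_tx_queues)
{
	(void)flow_hash;
	return num_tx_queues ? num_tx_queues - 1 : 0;
}

/* Use the override when one is defined, otherwise fall back to the hash.
 * Out-of-range results are clamped to queue 0 (assumption of this model). */
static uint16_t model_pick_queue(model_select_queue_fn select_queue,
				 uint32_t flow_hash, uint16_t num_tx_queues)
{
	uint16_t q = select_queue ?
		     select_queue(flow_hash, num_tx_queues) :
		     model_hash_to_queue(flow_hash, num_tx_queues);

	return q < num_tx_queues ? q : 0;
}
```

With 4 Tx queues, passing a null selector spreads hashes across queues 0 to 3,
while passing model_always_last sends every packet to queue 3.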

The behavior of tc filters remains the same. However, a new tc action,
skbedit, has been added. Assuming you wanted to route all traffic to a
specific host, for example 192.168.0.3, through a specific queue, you could
use this action and establish a filter such as:

    tc filter add dev eth0 parent 1: protocol ip prio 1 u32 \
        match ip dst 192.168.0.3 \
        action skbedit queue_mapping 3

Author: Alexander Duyck <alexander.h.duyck@intel.com>
Original Author: Peter P. Waskiewicz Jr. <peter.p.waskiewicz.jr@intel.com>