Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters
================================================================

November 17, 2004


Contents
========

- In This Release
- Identifying Your Adapter
- Command Line Parameters
- Improving Performance
- Support


In This Release
===============

This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family
of Adapters, version 1.0.x.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel PRO/10GbE adapter. All hardware requirements listed
apply to use with Linux.

Identifying Your Adapter
========================

To verify that your Intel adapter is supported, find the board ID number on
the adapter. Look for a label that has a barcode and a number in the format
A12345-001.

Use that number with the Adapter & Driver ID Guide at:

    http://support.intel.com/support/network/adapter/pro100/21397.htm

For the latest Intel network drivers for Linux, go to:

    http://downloadfinder.intel.com/scripts-df/support_intel.asp

Command Line Parameters
=======================

If the driver is built as a module, the following optional parameters can be
set by entering them on the command line with the modprobe or insmod command,
using this syntax:

     modprobe ixgb [<option>=<VAL1>,<VAL2>,...]

     insmod ixgb [<option>=<VAL1>,<VAL2>,...]

For example, with two PRO/10GbE PCI adapters, entering:

     insmod ixgb TxDescriptors=80,128

loads the ixgb driver with 80 transmit descriptors for the first adapter and
128 transmit descriptors for the second adapter.
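
As a hedged illustration of the syntax above, the following sketch assembles
the comma-separated option string from one value per adapter. The function
name build_ixgb_cmd is hypothetical and is not part of the driver; it only
prints the command and does not load the module.

```shell
# build_ixgb_cmd (hypothetical helper): print a modprobe command line for
# one ixgb parameter, given one value per installed adapter.
# Usage: build_ixgb_cmd <option> <val1> [<val2> ...]
build_ixgb_cmd() {
    local option=$1
    shift
    local IFS=,            # join the remaining arguments with commas
    echo "modprobe ixgb ${option}=$*"
}

# Two adapters: 80 descriptors for the first, 128 for the second
build_ixgb_cmd TxDescriptors 80 128
```

The values are assigned to adapters in the order given, matching the example
above.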

The default value for each parameter is generally the recommended setting,
unless otherwise noted. If the driver is statically built into the kernel, it
is loaded with the default values for all parameters. Ethtool can be used to
change some of the parameters at runtime.

FlowControl
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default: Read from the EEPROM
         If EEPROM is not detected, default is 3
    This parameter controls the automatic generation of (Tx) and response to
    (Rx) Ethernet PAUSE frames.

RxDescriptors
Valid Range: 64-512
Default Value: 512
    This value is the number of receive descriptors allocated by the driver.
    Increasing this value allows the driver to buffer more incoming packets.
    Each descriptor is 16 bytes. A receive buffer is also allocated for
    each descriptor and can be 2048, 4096, 8192, or 16384 bytes, depending
    on the MTU setting. When the MTU is 1500 or less, the receive buffer
    size is 2048 bytes. When the MTU is greater than 1500, the receive
    buffer size is 4096, 8192, or 16384 bytes. The maximum MTU is 16114.
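
The memory cost of a receive ring follows from the figures above: each
descriptor takes 16 bytes plus one receive buffer. The sketch below works
through that arithmetic. The function name rx_ring_bytes is hypothetical, and
the buffer size is passed in by the caller because this document does not
state the exact MTU thresholds at which the driver selects each size.

```shell
# rx_ring_bytes (hypothetical helper): estimate the memory used by one
# adapter's receive ring.
#   $1 = RxDescriptors (64-512)
#   $2 = receive buffer size in bytes, as selected by the driver for the MTU
rx_ring_bytes() {
    local descriptors=$1 buffer_bytes=$2
    local descriptor_bytes=16      # each descriptor is 16 bytes
    echo $(( descriptors * (descriptor_bytes + buffer_bytes) ))
}

# Default ring (512 descriptors) with 2048-byte buffers (MTU 1500 or less)
rx_ring_bytes 512 2048
# Default ring with the largest buffer size (16384 bytes)
rx_ring_bytes 512 16384
```

This is an estimate only; it ignores any per-buffer overhead the kernel adds.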

RxIntDelay
Valid Range: 0-65535 (0=off)
Default Value: 6
    This value delays the generation of receive interrupts in units of
    0.8192 microseconds. Receive interrupt reduction can improve CPU
    efficiency if properly tuned for specific network traffic. Increasing
    this value adds extra latency to frame reception and can end up
    decreasing the throughput of TCP traffic. If the system is reporting
    dropped receives, this value may be set too high, causing the driver to
    run out of available receive descriptors.
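
Because RxIntDelay is expressed in units of 0.8192 microseconds, it can help
to convert a setting to an actual delay before tuning it. The following is a
minimal sketch of that conversion; the function name rx_int_delay_us is
hypothetical, and awk is assumed to be available for the floating-point
multiplication.

```shell
# rx_int_delay_us (hypothetical helper): convert an RxIntDelay setting to
# microseconds, using the 0.8192 microsecond unit stated above.
rx_int_delay_us() {
    awk -v ticks="$1" 'BEGIN { printf "%.4f\n", ticks * 0.8192 }'
}

rx_int_delay_us 6        # the default setting
rx_int_delay_us 65535    # the largest valid setting
```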

TxDescriptors
Valid Range: 64-4096
Default Value: 256
    This value is the number of transmit descriptors allocated by the driver.
    Increasing this value allows the driver to queue more transmits. Each
    descriptor is 16 bytes.

XsumRX
Valid Range: 0-1
Default Value: 1
    A value of '1' indicates that the driver should enable IP checksum
    offload for received packets (both UDP and TCP) to the adapter hardware.

XsumTX
Valid Range: 0-1
Default Value: 1
    A value of '1' indicates that the driver should enable IP checksum
    offload for transmitted packets (both UDP and TCP) to the adapter
    hardware.
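
The valid ranges listed above can be checked before a value is passed to
modprobe or insmod. The sketch below is one way to do that. The function name
check_ixgb_param is hypothetical; the ranges are copied from this section,
and how the driver itself treats an out-of-range value is not covered here.

```shell
# check_ixgb_param (hypothetical helper): return success if a value lies in
# the valid range listed above for the named parameter.
check_ixgb_param() {
    local name=$1 value=$2 min max
    case $name in
        FlowControl)    min=0;  max=3 ;;
        RxDescriptors)  min=64; max=512 ;;
        RxIntDelay)     min=0;  max=65535 ;;
        TxDescriptors)  min=64; max=4096 ;;
        XsumRX|XsumTX)  min=0;  max=1 ;;
        *) echo "unknown parameter: $name" >&2; return 2 ;;
    esac
    # reject anything that is not a plain non-negative integer
    case $value in
        ''|*[!0-9]*) echo "$name: '$value' is not a number" >&2; return 1 ;;
    esac
    if [ "$value" -lt "$min" ] || [ "$value" -gt "$max" ]; then
        echo "$name=$value is outside the valid range $min-$max" >&2
        return 1
    fi
}

check_ixgb_param TxDescriptors 128 && echo "TxDescriptors=128 ok"
check_ixgb_param RxDescriptors 1024 || echo "RxDescriptors=1024 rejected"
```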

Improving Performance
=====================

With the Intel PRO/10GbE adapter, the default Linux configuration will very
likely limit the total available throughput. A set of changes, applied
together, increases the ability of Linux to transmit and receive data. The
following enhancements were originally derived from settings published at
http://www.spec.org/web99 for various submitted results using Linux.

NOTE: These changes are only suggestions, and serve as a starting point for
tuning your network performance.

The changes are made in three major ways, listed in order of greatest effect:
- Use ifconfig to modify the MTU (maximum transmission unit) and the
  txqueuelen parameter.
- Use sysctl to modify /proc parameters (essentially kernel tuning).
- Use setpci to modify the MMRBC field in PCI-X configuration space to
  increase transmit burst lengths on the bus.

NOTE: setpci modifies the adapter's configuration registers to allow it to
read up to 4k bytes at a time (for transmits). However, on some systems the
behavior after modifying this register may be undefined, and errors may
occur. A power-cycle, a hard reset, or explicitly setting the e6 register
back to 22 (setpci -d 8086:1048 e6.b=22) may be required to get back to a
stable configuration.
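
To see what the two register values mentioned above mean, the sketch below
decodes the read byte count from the byte at offset e6. It rests on an
assumption that this document does not state: that bits 3:2 of the byte hold
the MMRBC field, encoded as 0=512, 1=1024, 2=2048 and 3=4096 bytes, as in the
PCI-X command register layout. The function name mmrbc_bytes is hypothetical.

```shell
# mmrbc_bytes (hypothetical helper): decode the maximum memory read byte
# count from the byte at offset e6, given as two hex digits.
# Assumption: bits 3:2 hold the MMRBC field (0=512, 1=1024, 2=2048, 3=4096).
mmrbc_bytes() {
    local reg=$(( 0x$1 ))
    echo $(( 512 << ((reg >> 2) & 3) ))
}

mmrbc_bytes 22    # the value this document restores
mmrbc_bytes 2e    # the value the tuning script writes

# On a system with the adapter installed, the current byte can be read with
# setpci by omitting the "=value" part (not run here):
#   mmrbc_bytes "$(setpci -d 8086:1048 e6.b)"
```

Recording the current value before changing it gives a known setting to
restore if the system becomes unstable.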

- COPY these lines and paste them into ixgb_perf.sh:
#!/bin/bash
echo "configuring network performance, edit this file to change the interface"
# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
setpci -d 8086:1048 e6.b=2e
# set the MTU (max transmission unit) - it requires your switch and clients to change too!
# set the txqueuelen
# your ixgb adapter should be loaded as eth1 for this to work, change if needed
ifconfig eth1 mtu 9000 txqueuelen 1000 up
# call the sysctl utility to modify /proc/sys entries
sysctl -p ./sysctl_ixgb.conf
- END ixgb_perf.sh

- COPY these lines and paste them into sysctl_ixgb.conf:
# some of the defaults may be different for your kernel
# call this file with sysctl -p <this file>
# these are just suggested values that worked well to increase throughput in
# several network benchmark tests, your mileage may vary

### IPV4 specific settings
net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use
net.ipv4.tcp_sack = 0 # turns SACK support off, default on
# on systems with a VERY fast bus -> memory interface this is the big gainer
net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760
net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/default/max TCP write buffer, default 4096 16384 131072
net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768

### CORE settings (mostly for socket and UDP effect)
net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071
net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071
net.core.rmem_default = 524287 # default receive socket buffer size, default 65535
net.core.wmem_default = 524287 # default send socket buffer size, default 65535
net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240
net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300
- END sysctl_ixgb.conf
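
The settings above carry a trailing comment on each line. Whether a given
sysctl utility accepts trailing comments is an assumption that may not hold
everywhere; this document does not say. If the file is rejected, one option
is to strip the comments first. The sketch below shows one way to do that
with sed; the function name strip_inline_comments is hypothetical.

```shell
# strip_inline_comments (hypothetical helper): remove "# ..." comments and
# blank lines from a sysctl-style file read on standard input.
strip_inline_comments() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d'
}

# Small demonstration on two settings taken from the file above
strip_inline_comments <<'EOF'
### CORE settings
net.core.rmem_max = 524287 # maximum receive socket buffer size
net.core.netdev_max_backlog = 300000 # backlog before drops
EOF
```

To clean the whole file, it could be run as
"strip_inline_comments < sysctl_ixgb.conf > sysctl_ixgb.clean.conf", where
the output file name is only an example.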

Edit the ixgb_perf.sh script if necessary to change eth1 to whatever interface
your ixgb driver is using.

NOTE: Unless these scripts are added to the boot process, these changes will
last only until the next system reboot.


Resolving Slow UDP Traffic
--------------------------

If your server does not seem to be able to receive UDP traffic as fast as it
can receive TCP traffic, it could be because Linux, by default, does not set
the network stack buffers as large as they need to be to support high UDP
transfer rates. One way to alleviate this problem is to allow more memory to
be used by the IP stack to store incoming data.

For instance, use the commands:
    sysctl -w net.core.rmem_max=262143
and
    sysctl -w net.core.rmem_default=262143
to increase the read buffer memory max and default to 262143 (256k - 1) from
defaults of max=131071 (128k - 1) and default=65535 (64k - 1). Raising these
variables increases the amount of memory used by the network stack for
receives, and they can be increased significantly more if necessary for your
application.
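
Before raising the limits, it can be useful to check whether they are already
large enough. The sketch below prints only the sysctl commands that are
needed for a given target size. The function name rmem_commands is
hypothetical, and it assumes the current values can be read as files under
/proc/sys/net/core (a different directory can be passed for testing).

```shell
# rmem_commands (hypothetical helper): print the sysctl commands needed to
# raise the receive buffer limits to a target size, skipping any limit that
# is already at least that large.
#   $1 = target size in bytes
#   $2 = directory holding rmem_max and rmem_default
#        (defaults to /proc/sys/net/core)
rmem_commands() {
    local target=$1 dir=${2:-/proc/sys/net/core} name current
    for name in rmem_max rmem_default; do
        current=$(cat "$dir/$name") || return 1
        if [ "$current" -lt "$target" ]; then
            echo "sysctl -w net.core.$name=$target"
        fi
    done
}

# Example (not run here): rmem_commands 262143
```

The function only prints commands; running them still requires root
privileges.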

Support
=======

For general information and support, go to the Intel support website at:

    http://support.intel.com

If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related to
the issue to linux.nics@intel.com.