shemminger | 143969f | 2006-01-10 18:50:18 +0000 | 1 | |
| 2 | Advantage over current IMQ: cleaner, in particular on SMP, |
| 3 | with a _lot_ less code. |
| 4 | The old dummy device functionality is preserved; the new behavior only |
| 5 | kicks in if you use actions. |
| 6 | |
| 7 | IMQ USES |
| 8 | -------- |
| 9 | As far as I know, the reasons listed below are why people use IMQ. |
| 10 | It would be nice to hear of any others that I missed. |
| 11 | |
| 12 | 1) qdiscs/policies are per device as opposed to system wide. |
| 13 | IMQ allows for sharing them across devices. |
| 14 | |
| 15 | 2) Allows for queueing incoming traffic for shaping instead of |
| 16 | dropping. I am not aware of any study that shows policing is |
| 17 | worse than shaping in achieving the end goal of rate control. |
| 18 | I would be interested if anyone is experimenting. |
| 19 | |
| 20 | 3) Very interesting use: if you are serving p2p you may want to give |
| 21 | preference to your own locally originated traffic (when responses come back) |
| 22 | vs someone using your system to do BitTorrent. So QoSing based on state |
| 23 | comes in as the solution. What people did to achieve this was stick |
| 24 | the IMQ somewhere at the prelocal hook. |
| 25 | I think this is a pretty neat feature to have in Linux in general |
| 26 | (i.e. not just for IMQ). |
| 27 | But I won't go back to putting netfilter hooks in the device to satisfy |
| 28 | this. I also don't think it's worth hacking dummy some more to be |
| 29 | aware of, say, L3 info and play ip rule tricks to achieve this. |
| 30 | --> Instead the plan is to have a conntrack related action. This action will |
| 31 | selectively either query or create conntrack state on incoming packets. |
| 32 | Packets could then be redirected to dummy based on what happens, e.g. |
| 33 | on incoming packets, if we find they are of known state we could send them to |
| 34 | a different queue than the ones which didn't have existing state. This |
| 35 | all, however, depends on whatever rules the admin enters. |
| 36 | |
| 37 | At the moment this function does not exist yet. I have decided to release |
| 38 | the patch instead of sitting on it; if there's pressure I will |
| 39 | add this feature. |
| 40 | |
| 41 | What you can do with dummy currently with actions |
| 42 | -------------------------------------------------- |
| 43 | |
| 44 | Let's say you are policing packets from alias 192.168.200.200/32 and |
| 45 | you don't want those to exceed 100kbit going out. |
| 46 | |
| 47 | tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \ |
| 48 | match ip src 192.168.200.200/32 flowid 1:2 \ |
| 49 | action police rate 100kbit burst 90k drop |
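As a rough sanity check on the policer parameters above, the burst can be converted into time at the configured rate. This is only a sketch: it assumes tc's usual unit conventions (kbit = 1000 bits/s for rates, k = 1024 bytes for sizes), and the variable names are made up for illustration.

```shell
#!/bin/sh
# Sketch: how long a full police burst lasts at the configured rate.
# Assumes tc unit conventions: "kbit" = 1000 bit/s, "k" = 1024 bytes.
rate_kbit=100      # from "rate 100kbit"
burst_k=90         # from "burst 90k"

rate_bytes_per_sec=$((rate_kbit * 1000 / 8))
burst_bytes=$((burst_k * 1024))
# Integer milliseconds needed to send one full burst at the policed rate.
burst_ms=$((burst_bytes * 1000 / rate_bytes_per_sec))

echo "rate:  $rate_bytes_per_sec bytes/s"
echo "burst: $burst_bytes bytes (~$burst_ms ms at the policed rate)"
```

Under those assumptions the burst is worth several seconds of traffic at the policed rate, so short tests may never hit the drop action.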
| 50 | |
| 51 | If you run tcpdump on eth0 you will see all packets going out |
| 52 | with src 192.168.200.200/32, whether they were dropped or not. |
| 53 | Extend the rule a little to see only the ones that made it out: |
| 54 | |
| 55 | tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \ |
| 56 | match ip src 192.168.200.200/32 flowid 1:2 \ |
| 57 | action police rate 100kbit burst 90k drop \ |
| 58 | action mirred egress mirror dev dummy0 |
| 59 | |
| 60 | Now fire up tcpdump on dummy0 to see only those packets: |
| 61 | tcpdump -n -i dummy0 -x -e -t |
| 62 | |
| 63 | Essentially a good debugging/logging interface. |
| 64 | |
| 65 | If you replace mirror with redirect, those packets will be |
| 66 | blackholed and will never make it out. This redirect behavior |
| 67 | changes with the new patch (but the mirror behavior does not). |
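To compare the two mirred modes side by side without touching the box, the filter command can be printed rather than executed. A minimal sketch: build_filter is a hypothetical helper, not part of tc, and it uses the 100kbit rate from the first rule.

```shell
#!/bin/sh
# Sketch: print the filter command for either mirred mode so it can be
# reviewed before running it as root. build_filter is a hypothetical helper.
build_filter() {
    mode=$1    # "mirror" or "redirect"
    case "$mode" in
        mirror|redirect) ;;
        *) echo "usage: build_filter mirror|redirect" >&2; return 1 ;;
    esac
    # Only the mirred mode differs between the two commands.
    echo "tc filter add dev eth0 parent 1: protocol ip prio 10 u32" \
         "match ip src 192.168.200.200/32 flowid 1:2" \
         "action police rate 100kbit burst 90k drop" \
         "action mirred egress $mode dev dummy0"
}

build_filter mirror
build_filter redirect
```

The printed line can be pasted into a root shell once it looks right.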
| 68 | |
| 69 | What you can do with the patch to provide the functionality |
| 70 | that most people use IMQ for is shown below: |
| 71 | |
| 72 | -------- |
| 73 | export TC="/sbin/tc" |
| 74 | |
| 75 | $TC qdisc add dev dummy0 root handle 1: prio |
| 76 | $TC qdisc add dev dummy0 parent 1:1 handle 10: sfq |
| 77 | $TC qdisc add dev dummy0 parent 1:2 handle 20: tbf rate 20kbit buffer 1600 limit 3000 |
| 78 | $TC qdisc add dev dummy0 parent 1:3 handle 30: sfq |
| 79 | $TC filter add dev dummy0 protocol ip pref 1 parent 1: handle 1 fw classid 1:1 |
| 80 | $TC filter add dev dummy0 protocol ip pref 2 parent 1: handle 2 fw classid 1:2 |
| 81 | |
| 82 | ifconfig dummy0 up |
| 83 | |
| 84 | $TC qdisc add dev eth0 ingress |
| 85 | |
| 86 | # redirect all IP packets arriving in eth0 to dummy0 |
| 87 | # use mark 1 --> puts them onto class 1:1 |
| 88 | $TC filter add dev eth0 parent ffff: protocol ip prio 10 u32 \ |
| 89 | match u32 0 0 flowid 1:1 \ |
| 90 | action ipt -j MARK --set-mark 1 \ |
| 91 | action mirred egress redirect dev dummy0 |
| 92 | |
| 93 | -------- |
| 94 | |
| 95 | |
| 96 | Run a little test: |
| 97 | |
| 98 | From another machine, ping so that you have packets going into the box: |
| 99 | ----- |
| 100 | [root@jzny action-tests]# ping 10.22 |
| 101 | PING 10.22 (10.0.0.22): 56 data bytes |
| 102 | 64 bytes from 10.0.0.22: icmp_seq=0 ttl=64 time=2.8 ms |
| 103 | 64 bytes from 10.0.0.22: icmp_seq=1 ttl=64 time=0.6 ms |
| 104 | 64 bytes from 10.0.0.22: icmp_seq=2 ttl=64 time=0.6 ms |
| 105 | |
| 106 | --- 10.22 ping statistics --- |
| 107 | 3 packets transmitted, 3 packets received, 0% packet loss |
| 108 | round-trip min/avg/max = 0.6/1.3/2.8 ms |
| 109 | [root@jzny action-tests]# |
| 110 | ----- |
| 111 | Now look at some stats: |
| 112 | |
| 113 | --- |
| 114 | [root@jmandrake]:~# $TC -s filter show parent ffff: dev eth0 |
| 115 | filter protocol ip pref 10 u32 |
| 116 | filter protocol ip pref 10 u32 fh 800: ht divisor 1 |
| 117 | filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 |
| 118 | match 00000000/00000000 at 0 |
| 119 | action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING |
| 120 | target MARK set 0x1 |
| 121 | index 1 ref 1 bind 1 installed 4195sec used 27sec |
| 122 | Sent 252 bytes 3 pkts (dropped 0, overlimits 0) |
| 123 | |
| 124 | action order 2: mirred (Egress Redirect to device dummy0) stolen |
| 125 | index 1 ref 1 bind 1 installed 165 sec used 27 sec |
| 126 | Sent 252 bytes 3 pkts (dropped 0, overlimits 0) |
| 127 | |
| 128 | [root@jmandrake]:~# $TC -s qdisc |
| 129 | qdisc sfq 30: dev dummy0 limit 128p quantum 1514b |
| 130 | Sent 0 bytes 0 pkts (dropped 0, overlimits 0) |
| 131 | qdisc tbf 20: dev dummy0 rate 20Kbit burst 1575b lat 2147.5s |
| 132 | Sent 210 bytes 3 pkts (dropped 0, overlimits 0) |
| 133 | qdisc sfq 10: dev dummy0 limit 128p quantum 1514b |
| 134 | Sent 294 bytes 3 pkts (dropped 0, overlimits 0) |
| 135 | qdisc prio 1: dev dummy0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 |
| 136 | Sent 504 bytes 6 pkts (dropped 0, overlimits 0) |
| 137 | qdisc ingress ffff: dev eth0 ---------------- |
| 138 | Sent 308 bytes 5 pkts (dropped 0, overlimits 0) |
| 139 | |
| 140 | [root@jmandrake]:~# ifconfig dummy0 |
| 141 | dummy0 Link encap:Ethernet HWaddr 00:00:00:00:00:00 |
| 142 | inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link |
| 143 | UP BROADCAST RUNNING NOARP MTU:1500 Metric:1 |
| 144 | RX packets:6 errors:0 dropped:3 overruns:0 frame:0 |
| 145 | TX packets:3 errors:0 dropped:0 overruns:0 carrier:0 |
| 146 | collisions:0 txqueuelen:32 |
| 147 | RX bytes:504 (504.0 b) TX bytes:252 (252.0 b) |
| 148 | ----- |
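The byte counters above can be cross-checked by hand. This is a sketch under assumptions: the usual header sizes (20-byte IPv4 header without options, 8-byte ICMP header, 14-byte Ethernet header), and a reading in which the qdisc counters on dummy0 include the Ethernet header while the ingress action counters do not. That reading is inferred from the numbers, not stated in the output.

```shell
#!/bin/sh
# Sketch: reproduce the byte counters from the 3-packet ping test.
pings=3
icmp_payload=56    # "56 data bytes" in the ping output
icmp_hdr=8         # assumed ICMP echo header size
ip_hdr=20          # assumed IPv4 header without options
eth_hdr=14         # assumed Ethernet header size
tbf_bytes=210      # taken as-is from the "qdisc tbf 20:" line

ip_len=$((ip_hdr + icmp_hdr + icmp_payload))
action_bytes=$((pings * ip_len))
sfq_bytes=$((pings * (ip_len + eth_hdr)))
prio_bytes=$((sfq_bytes + tbf_bytes))

echo "per-packet IP length:   $ip_len"
echo "action counters:        $action_bytes bytes"
echo "sfq 10: counter:        $sfq_bytes bytes"
echo "prio 1: counter (sum):  $prio_bytes bytes"
```

This gives 252 bytes for the two actions and 294 for sfq 10:, matching the output; adding the 210 bytes seen on tbf 20: gives the 504 reported by prio 1:.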
| 149 | |
| 150 | Dummy continues to behave like it always did. |
| 151 | If you send it any packet not originating from the actions, it will drop it. |
| 152 | [In this case the three dropped packets were IPv6 ndisc]. |
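
Once you are done experimenting, the setup can be undone again. A minimal sketch, assuming the qdiscs were created exactly as in the script above; RUN and teardown are illustrative names, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch: undo the dummy0/ingress setup from the example.
# By default this only prints the commands; run with RUN= (empty) as root
# to actually execute them. RUN and teardown are illustrative names.
TC=${TC:-/sbin/tc}
RUN=${RUN-echo}

teardown() {
    # Deleting the ingress qdisc also removes the filter attached to it.
    $RUN "$TC" qdisc del dev eth0 ingress
    # Deleting the root prio qdisc removes its child qdiscs and fw filters.
    $RUN "$TC" qdisc del dev dummy0 root
    $RUN ifconfig dummy0 down
}

teardown
```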
| 153 | |
| 154 | cheers, |
| 155 | jamal |