Advantage over current IMQ: cleaner, in particular on SMP,
with a _lot_ less code.
Old dummy device functionality is preserved, while the new one only
kicks in if you use actions.

IMQ USES
--------
As far as I know, the reasons listed below are why people use IMQ.
It would be nice to know of anything else that I missed.

1) qdiscs/policies that are per device as opposed to system wide.
IMQ allows for sharing.

2) Allows for queueing incoming traffic for shaping instead of
dropping. I am not aware of any study that shows policing is
worse than shaping in achieving the end goal of rate control.
I would be interested if anyone is experimenting.

3) Very interesting use: if you are serving p2p you may want to give
preference to your own locally originated traffic (when responses come back)
vs someone using your system to do BitTorrent. So QoSing based on state
comes in as the solution. What people did to achieve this was stick
the IMQ somewhere at the prelocal hook.
I think this is a pretty neat feature to have in Linux in general
(i.e. not just for IMQ).
But I won't go back to putting netfilter hooks in the device to satisfy
this. I also don't think it's worth hacking dummy some more to be
aware of, say, L3 info and play ip rule tricks to achieve this.
--> Instead the plan is to have a conntrack related action. This action will
selectively either query or create conntrack state on incoming packets.
Packets could then be redirected to dummy based on what happens, e.g.
on incoming packets: if we find they are of known state we could send them to
a different queue than the ones which didn't have existing state. This
all, however, is dependent on whatever rules the admin enters.

At the moment this function does not exist yet. I have decided, instead
of sitting on the patch, to release it, and then if there's pressure I will
add this feature.

What you can do with dummy currently with actions
--------------------------------------------------

Let's say you are policing packets from alias 192.168.200.200/32 and
you don't want those to exceed 100kbps going out.

tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
match ip src 192.168.200.200/32 flowid 1:2 \
action police rate 100kbit burst 90k drop
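
To get a feel for the numbers in the policer above, here is a rough
back-of-the-envelope sketch. It assumes tc's usual unit conventions
(a rate in "kbit" is 1000 bits per second, a size in "k" is 1024 bytes);
the variable names are made up for illustration.

```shell
#!/bin/sh
# Rough sizing of "police rate 100kbit burst 90k".
# Assumption: rate "kbit" = 1000 bit/s, size "k" = 1024 bytes.
RATE_KBIT=100
BURST_K=90

rate_bytes_per_sec=$((RATE_KBIT * 1000 / 8))
burst_bytes=$((BURST_K * 1024))
# how long the rate takes to refill a completely empty bucket, in ms
burst_ms=$((burst_bytes * 1000 / rate_bytes_per_sec))

echo "rate:  $rate_bytes_per_sec bytes/sec"
echo "burst: $burst_bytes bytes"
echo "burst is worth about $burst_ms ms of traffic at the policed rate"
```

So under those assumptions the bucket holds several seconds worth of
traffic at the policed rate, and a short test flow may go through
without a single drop. Keep that in mind when you look for drops.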

If you run tcpdump on eth0 you will see all packets going out
with src 192.168.200.200/32, whether they were dropped or not.
Extend the rule a little to see only the ones that made it out:

tc filter add dev eth0 parent 1: protocol ip prio 10 u32 \
match ip src 192.168.200.200/32 flowid 1:2 \
action police rate 10kbit burst 90k drop \
action mirred egress mirror dev dummy0

Now fire tcpdump on dummy0 to see only those packets:
tcpdump -n -i dummy0 -x -e -t

Essentially a good debugging/logging interface.

If you replace mirror with redirect, those packets will be
blackholed and will never make it out. This redirect behavior
changes with the new patch (but not the mirror).

What you can do with the patch to provide the functionality
that most people use IMQ for is shown below:

--------
export TC="/sbin/tc"

# three-band prio on dummy0: 1:1 -> sfq, 1:2 -> tbf, 1:3 -> sfq
$TC qdisc add dev dummy0 root handle 1: prio
$TC qdisc add dev dummy0 parent 1:1 handle 10: sfq
$TC qdisc add dev dummy0 parent 1:2 handle 20: tbf rate 20kbit buffer 1600 limit 3000
$TC qdisc add dev dummy0 parent 1:3 handle 30: sfq
# fw filters: mark 1 goes to class 1:1, mark 2 goes to class 1:2
$TC filter add dev dummy0 protocol ip pref 1 parent 1: handle 1 fw classid 1:1
$TC filter add dev dummy0 protocol ip pref 2 parent 1: handle 2 fw classid 1:2

ifconfig dummy0 up

$TC qdisc add dev eth0 ingress

# redirect all IP packets arriving on eth0 to dummy0
# use mark 1 --> puts them onto class 1:1
$TC filter add dev eth0 parent ffff: protocol ip prio 10 u32 \
match u32 0 0 flowid 1:1 \
action ipt -j MARK --set-mark 1 \
action mirred egress redirect dev dummy0

--------
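
If you want to undo the setup above after playing with it, something
along these lines should do. This is only a sketch: RUN is a hypothetical
wrapper variable (not part of tc) that defaults to echo, so the commands
are just printed; run it as root with RUN="" to actually execute them.

```shell
#!/bin/sh
# Teardown sketch for the dummy0/eth0 setup above.
# RUN is a made-up helper: "echo" previews the commands, RUN="" runs them.
TC="${TC:-/sbin/tc}"
RUN="${RUN-echo}"

teardown() {
    # deleting the ingress qdisc also removes the u32 filter and its actions
    $RUN $TC qdisc del dev eth0 ingress
    # deleting the root prio removes the child qdiscs and the fw filters
    $RUN $TC qdisc del dev dummy0 root
    $RUN ifconfig dummy0 down
}

teardown
```

The ingress qdisc goes first so that nothing is redirected to dummy0
while its qdisc tree is being removed.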

Run a little test:

From another machine, ping so that you have packets going into the box:
-----
[root@jzny action-tests]# ping 10.22
PING 10.22 (10.0.0.22): 56 data bytes
64 bytes from 10.0.0.22: icmp_seq=0 ttl=64 time=2.8 ms
64 bytes from 10.0.0.22: icmp_seq=1 ttl=64 time=0.6 ms
64 bytes from 10.0.0.22: icmp_seq=2 ttl=64 time=0.6 ms

--- 10.22 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.6/1.3/2.8 ms
[root@jzny action-tests]#
-----
Now look at some stats:

---
[root@jmandrake]:~# $TC -s filter show parent ffff: dev eth0
filter protocol ip pref 10 u32
filter protocol ip pref 10 u32 fh 800: ht divisor 1
filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1
 match 00000000/00000000 at 0
 action order 1: tablename: mangle hook: NF_IP_PRE_ROUTING
 target MARK set 0x1
 index 1 ref 1 bind 1 installed 4195sec used 27sec
 Sent 252 bytes 3 pkts (dropped 0, overlimits 0)

 action order 2: mirred (Egress Redirect to device dummy0) stolen
 index 1 ref 1 bind 1 installed 165 sec used 27 sec
 Sent 252 bytes 3 pkts (dropped 0, overlimits 0)

[root@jmandrake]:~# $TC -s qdisc
qdisc sfq 30: dev dummy0 limit 128p quantum 1514b
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
qdisc tbf 20: dev dummy0 rate 20Kbit burst 1575b lat 2147.5s
 Sent 210 bytes 3 pkts (dropped 0, overlimits 0)
qdisc sfq 10: dev dummy0 limit 128p quantum 1514b
 Sent 294 bytes 3 pkts (dropped 0, overlimits 0)
qdisc prio 1: dev dummy0 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1
 Sent 504 bytes 6 pkts (dropped 0, overlimits 0)
qdisc ingress ffff: dev eth0 ----------------
 Sent 308 bytes 5 pkts (dropped 0, overlimits 0)

[root@jmandrake]:~# ifconfig dummy0
dummy0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
 inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
 UP BROADCAST RUNNING NOARP MTU:1500 Metric:1
 RX packets:6 errors:0 dropped:3 overruns:0 frame:0
 TX packets:3 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:32
 RX bytes:504 (504.0 b) TX bytes:252 (252.0 b)
-----
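
The byte counters in the listing above can be cross-checked by hand
against the three pings. A small sketch, assuming plain IPv4 headers
without options (20 bytes), an 8 byte ICMP header and an untagged
14 byte Ethernet header; reading the action counters as "without
Ethernet header" and the sfq 10: counter as "with Ethernet header" is
only an inference from the numbers matching.

```shell
#!/bin/sh
# Cross-check of the counters for 3 pings with 56 data bytes each.
# Header sizes are assumptions: IPv4 without options, untagged Ethernet.
PKTS=3
ICMP_DATA=56      # "56 data bytes" from the ping output
ICMP_HDR=8
IP_HDR=20
ETH_HDR=14

ip_len=$((ICMP_DATA + ICMP_HDR + IP_HDR))
frame_len=$((ip_len + ETH_HDR))
action_bytes=$((PKTS * ip_len))       # compare with "Sent 252 bytes 3 pkts"
sfq_bytes=$((PKTS * frame_len))       # compare with sfq 10: "Sent 294 bytes"

# prio 1: should be the sum of its children; 210 is the tbf 20: counter
TBF_BYTES=210
prio_bytes=$((sfq_bytes + TBF_BYTES)) # compare with prio 1: "Sent 504 bytes"

echo "action counters: $action_bytes bytes"
echo "sfq 10:          $sfq_bytes bytes"
echo "prio 1:          $prio_bytes bytes"
```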

Dummy continues to behave like it always did.
If you send it any packet not originating from the actions, it will drop it.
[In this case the three dropped packets were IPv6 ndisc.]

cheers,
jamal
155jamal