<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3c.org/TR/1999/REC-html401-19991224/loose.dtd">
<HTML><HEAD><TITLE>How bridge/ebtables/iptables interaction works</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-15">
<STYLE type=text/css>H1 {
	FONT: bold 25pt Times, serif; TEXT-ALIGN: center; TEXT-DECORATION: underline
}
P {
	FONT: 20pt Times, serif
}
LI {
	MARGIN-BOTTOM: 2em; FONT: 22pt 'Times New Roman', serif
}
PRE {
	FONT: 18pt Courier, monospace
}
.statement {
	TEXT-DECORATION: underline
}
.section {
	FONT: bold 22pt Times
}
.case {
	FONT-STYLE: italic
}
</STYLE>

<META content="MSHTML 6.00.2505.0" name=GENERATOR></HEAD>
<BODY>
<H1>How bridge/ebtables/iptables interaction works</H1>

<P class=section>1. How frames traverse the <EM>ebtables</EM> chains:</P>
<P>This section only considers <EM>ebtables</EM>, _not_ <EM>iptables</EM>.</P>
<PRE>
     Route
       ^
       |
I  +--------+ Bridge  +----------+                     +-------+      +-----------+   O
N->|BROUTING|-------->|PREROUTING|----->[BRIDGING]---->|FORWARD| ---->|POSTROUTING|-->U
   +--------+         +----------+      [DECISION]     +-------+      +-----------+   T
                                             |                              ^
                                             v                              |
                                          +-----+                     +----------+
                                          |INPUT|                     |OUTPUT (2)|
                                          +-----+                     +----------+
                                             |                              ^
                                             |                              |
                                             |                        +----------+
                                             |                        |OUTPUT (1)|
                                             |                        +----------+
                                             |                              ^
                                             +------->Local Process---------+
</PRE>
<P>
The first thing to keep in mind is that we are talking about the Ethernet layer here,
that is, OSI layer 2. A frame destined for the local computer according to the bridge
(which works on the Ethernet layer) isn't necessarily destined for the local computer
according to the IP layer. That's how routing works: the MAC destination is the router, while the IP
destination is the actual box you want to communicate with.</P>
<P>
<EM>Ebtables</EM> currently has three tables: filter, nat and broute. The filter table has a
FORWARD, INPUT and OUTPUT chain. The nat table has a PREROUTING, OUTPUT and POSTROUTING chain.
The broute table has the BROUTING chain. In the figure the filter OUTPUT chain has (2)
appended and the nat OUTPUT chain has (1) appended. These two OUTPUT chains are not
the same (and have a different intended use).</P>
<P>
When a NIC enslaved to a bridge receives a frame, the frame will first go through the BROUTING
chain. In this special chain one can choose whether to route or bridge frames. The default
is bridging, and we will assume the decision in this chain is 'bridge'. Next, the frame
passes through the PREROUTING chain. This chain is intended for altering the
destination MAC address of frames (DNAT). If the frame passes this chain, the bridging code will decide where the
frame should be sent. The bridge does this by looking at the destination MAC address; it
doesn't care about the OSI layer 3 addresses (e.g. the IP address). Note that frames coming in
on non-forwarding ports of a bridge will not be seen by <EM>ebtables</EM>, not even by the BROUTING
chain.</P>
<P>
If the bridge decides the frame is for the bridging computer, the frame will go through the
INPUT chain. In this chain you can filter frames destined for the bridge box. After passing
the INPUT chain, the frame will be given to the code on layer 3 (i.e. it will be passed up),
e.g. to the IP code. So, a routed IP packet will go through the <EM>ebtables</EM> INPUT chain, not
through the <EM>ebtables</EM> FORWARD chain. This is logical.</P>
<P>
Otherwise the frame may have to be sent onto another side of the bridge. If so, the
frame will go through the FORWARD chain and the POSTROUTING chain. In the FORWARD chain one
can filter frames that will be bridged; the POSTROUTING chain is intended for
changing the MAC source address (SNAT).</P>
<P>
Frames that originate from the bridge box itself will go, after the bridging decision, through the
nat OUTPUT chain, the filter OUTPUT chain and the POSTROUTING chain. The
nat OUTPUT chain allows you to alter the destination MAC address and the filter OUTPUT chain
allows you to filter frames originating from the bridge box. Note that the nat OUTPUT chain is
traversed after the bridging decision, so actually too late. We should change this. The POSTROUTING
chain is the same one as described above. Note that it is also possible for routed frames to go
through these chains; this happens when the destination device is a logical bridge device.</P>
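<P>
The traversal order described in this section can be summarised in a small model. The following Python sketch is only
an illustration: the function name <EM>ebtables_path</EM> and its arguments are made up for this example, and it assumes
that no rule drops the frame and that the BROUTING decision is 'bridge'.</P>

```python
# Simplified model of the ebtables chain order described in section 1.
# Assumptions (not part of ebtables itself): no rule drops or alters the
# frame, and the BROUTING chain decides "bridge".

def ebtables_path(origin, decision="forward"):
    """Return the ebtables chains a frame traverses, in order.

    origin:   "received" for a frame that arrived on a bridge port,
              "local" for a frame sent by the bridge box itself.
    decision: bridging decision for received frames: "forward" (send it
              out on another port) or "local" (it is for the bridge box).
    """
    if origin == "local":
        # nat OUTPUT is OUTPUT (1) in the figure, filter OUTPUT is OUTPUT (2).
        return ["nat OUTPUT", "filter OUTPUT", "nat POSTROUTING"]
    if origin != "received":
        raise ValueError("origin must be 'received' or 'local'")

    path = ["broute BROUTING", "nat PREROUTING"]
    if decision == "local":
        path.append("filter INPUT")
    elif decision == "forward":
        path.extend(["filter FORWARD", "nat POSTROUTING"])
    else:
        raise ValueError("decision must be 'forward' or 'local'")
    return path


if __name__ == "__main__":
    for args in (("received", "forward"), ("received", "local"), ("local",)):
        print(args, "->", " -> ".join(ebtables_path(*args)))
```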
<P class=section>
2. A machine used as a bridge and a router (not a brouter):</P>
<P>
It's possible to see a single IP packet pass the PREROUTING, INPUT, nat OUTPUT, filter OUTPUT
and POSTROUTING <EM>ebtables</EM> chains.</P>
<P>
This can happen when the bridge is also used as a router. The Ethernet frame(s) containing that
IP packet will have the bridge's MAC address as destination, while the destination IP address is not
that of the bridge. Including the <EM>iptables</EM> chains, this is how the IP packet runs through the
bridge/router (eb=ebtables, ip=iptables):</P>
<PRE>ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet</PRE>
<P>
This assumes that the routing decision sends the packet to a bridge interface. If the routing
decision sends the packet to a physical network card, this is what happens:</P>
<PRE>ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ipPOSTROUTING->send packet</PRE>
<P>
What is obviously "asymmetric" here is that the <EM>iptables</EM> PREROUTING chain is traversed before
the <EM>ebtables</EM> INPUT chain; however, this cannot be helped. See the next section.</P>
<P class=section>
3. DNATing bridged packets:</P>
<P>
Take an IP packet received by the bridge; it enters the bridge code. Let's assume we want to do
some IP DNAT on it. Changing the destination address of the packet (IP address and MAC address)
has to happen before the bridge code decides what to do with the packet. The bridge code can decide
to bridge it (if the destination MAC address is on another side of the bridge), flood it over all
the forwarding bridge ports (the position of the box with the destination MAC is unknown to the bridge),
give it to the higher protocol code (here, the IP code) if the destination MAC address is that of the
bridge, or ignore it (the destination MAC address is located on the same side of the bridge).</P>
<P>
So, this IP DNAT has to happen very early in the bridge code, namely before the bridge code
actually does anything. This is the same place where the <EM>ebtables</EM> PREROUTING chain is
traversed (for the same reason).</P>
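<P>
The four possible outcomes of the bridging decision listed in this section can be sketched as a lookup in the bridge's
table of learned MAC addresses. This is a rough illustration, not the kernel's code: the function name, the <EM>fdb</EM>
dictionary and the MAC addresses are invented for the example, and broadcast/multicast handling is left out.</P>

```python
# Sketch of the bridging decision from section 3.
# Assumptions: `fdb` maps lowercase learned MAC addresses to port names,
# `bridge_macs` holds the bridge's own MAC addresses. All values below are
# hypothetical; broadcast and multicast frames are not modelled.

def bridging_decision(dst_mac, in_port, fdb, bridge_macs):
    """Return (action, out_port) for a frame received on in_port."""
    dst = dst_mac.lower()
    if dst in {mac.lower() for mac in bridge_macs}:
        return ("local", None)      # hand the frame to the higher protocol code
    out_port = fdb.get(dst)
    if out_port is None:
        return ("flood", None)      # position unknown: all forwarding ports
    if out_port == in_port:
        return ("ignore", None)     # destination is on the side it came from
    return ("bridge", out_port)     # destination is on another side


if __name__ == "__main__":
    fdb = {"00:00:00:00:00:04": "eth1", "00:00:00:00:00:02": "eth2"}
    bridge_macs = {"00:00:00:00:00:01"}
    for dst in ("00:00:00:00:00:01", "00:00:00:00:00:04",
                "00:00:00:00:00:02", "00:00:00:00:00:99"):
        print(dst, bridging_decision(dst, "eth2", fdb, bridge_macs))
```

<P>
Because the decision depends only on the destination MAC address as it is at this point, any DNAT has to be finished
before this lookup is made.</P>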
<P class=section>
4. Chain traversal for bridged IP packets:</P>
<P>
A bridged packet never enters any network code above layer 2. So a bridged IP packet will never
enter the IP code. Therefore all <EM>iptables</EM> chains will be traversed while the IP packet is in the
bridge code. The chain traversal will look like this:</P>
<PRE>
ebPREROUTING->ipPREROUTING->ebFORWARD->ipFORWARD->ebPOSTROUTING->ipPOSTROUTING</PRE>
<P>
Once again, note that there is a certain form of asymmetry here that cannot be helped.</P>
<P class=section>
5. Using a bridge port in <EM>iptables</EM> rules:</P>
<P>
The wish to be able to use physical devices belonging to a bridge (bridge ports) in <EM>iptables</EM> rules
is valid. It's necessary to prevent spoofing attacks. Say br0 has ports eth0 and eth1. If <EM>iptables</EM>
rules can only use br0, there's no way of knowing when a box on the eth0 side changes its source IP
address to that of a box on the eth1 side, except by looking at the MAC source address (and then
still...). With the current bridge/iptables patch (0.0.6 or later) you can use eth0 and eth1 in your
<EM>iptables</EM> rules and therefore catch these attempts.</P>
<P class=case>
1. <EM>iptables</EM> wants to use bridge ports:</P>
<P>
To make this possible the <EM>iptables</EM> chains have to be traversed after the bridge code has decided where
the frame needs to be sent (eth0, eth1, both or none). This has some impact on the scheme presented
in section 2 (so, we are looking at routed traffic here). It actually looks like this:</P>
<PRE>
ebPREROUTING->ipPREROUTING->ebINPUT->ipFORWARD->ebOUTPUT(1)->ebOUTPUT(2)->ipPOSTROUTING->ebPOSTROUTING->send packet</PRE>
<P>
Note that this is the work of the br-nf patch. If one does not compile the br-nf code into the kernel,
the chains will be traversed as shown below. However, then one can only use br0, not eth0/eth1, to
filter.</P>
<PRE>ebPREROUTING->ebINPUT->ipPREROUTING->ipFORWARD->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->send packet</PRE>
<P>
Notice that ipPREROUTING is now in its natural position in the chain list and too late to be able to change
the bridging decision. More precisely: ipPREROUTING is now traversed while the packet is in the IP code.</P>
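<P>
To make the difference between the two orderings explicit, the sketch below stores both chain lists from this section
and checks on which side of the bridging decision a given chain runs. The helper names are made up for the illustration;
the only assumption is that ebINPUT is the first chain traversed after the bridging decision, as described in section 1.</P>

```python
# Chain order for routed traffic leaving through a bridge device, copied
# from section 5.1. The helper names are hypothetical.

WITH_BR_NF = ["ebPREROUTING", "ipPREROUTING", "ebINPUT", "ipFORWARD",
              "ebOUTPUT(1)", "ebOUTPUT(2)", "ipPOSTROUTING", "ebPOSTROUTING"]

WITHOUT_BR_NF = ["ebPREROUTING", "ebINPUT", "ipPREROUTING", "ipFORWARD",
                 "ipPOSTROUTING", "ebOUTPUT(1)", "ebOUTPUT(2)", "ebPOSTROUTING"]


def routed_path(br_nf):
    """Return the chain order with (True) or without (False) br-nf."""
    return list(WITH_BR_NF if br_nf else WITHOUT_BR_NF)


def before_bridging_decision(path, chain):
    """True if `chain` runs before ebINPUT, i.e. before the bridge code
    has decided that the frame is for the local box."""
    return path.index(chain) < path.index("ebINPUT")


if __name__ == "__main__":
    for br_nf in (True, False):
        early = before_bridging_decision(routed_path(br_nf), "ipPREROUTING")
        print("br-nf compiled in:", br_nf,
              "| ipPREROUTING before bridging decision:", early)
```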
<P class=case>
2. IP DNAT for locally generated packets (so in the <EM>iptables</EM> nat OUTPUT chain):</P>
<P>
The 'normal' way locally generated packets would go through the chains looks like this:</P>
<PRE>
ipOUTPUT(1)->ipOUTPUT(2)->ipPOSTROUTING->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING</PRE>
<P>
From section 5.1 we know that this actually looks like this:</P>
<PRE>
ipOUTPUT(1)->ipOUTPUT(2)->ebOUTPUT(1)->ebOUTPUT(2)->ebPOSTROUTING->ipPOSTROUTING</PRE>
<P>
Here we denote by ipOUTPUT(1) (resp. ipOUTPUT(2)) the <EM>iptables</EM> nat (resp. filter) OUTPUT chain. Note that
the ipOUTPUT(1) chain is traversed while the packet is in the IP code, while the ipOUTPUT(2) chain is traversed when
the packet has entered the bridge code. This makes it possible to do DNAT to another device in ipOUTPUT(1) and lets
one use the bridge ports in the ipOUTPUT(2) chain.</P>
<P class=section>
6. Two possible ways for frames/packets to pass through the <EM>iptables</EM> PREROUTING, FORWARD and POSTROUTING
chains:</P>
<P>
With the br-nf patch there are two ways a frame/packet can pass through the three given <EM>iptables</EM>
chains. The first way is when the frame is bridged, so the <EM>iptables</EM> chains are called by the bridge code.
The second way is when the packet is routed. So special care has to be taken to distinguish between those
two, especially in the <EM>iptables</EM> FORWARD chain. Here's an example of strange things to look out for:</P>
<P>
Consider the following situation (my personal setup):</P>
<PRE>
          +-----------------+
          |   cable modem   |
          +-------+---------+
                  |
                  |
              eth0|IP via DHCP from ISP
          +-------+---------+
          |bridge/router/fw |
          +--+-----------+--+
         eth1| 172.16.1.1|eth2
             |   (br0)   |
             |           |
   172.16.1.4|           |172.16.1.2
  +----------+---+    +--+------------+
  |test computer/|    |    desktop    |
  |backup server |    +---------------+
  +--------------+</PRE>
<P>
With this setup I can test the bridge+ebtables+iptables code while having access to the internet from all
three computers. The default gateway for 172.16.1.2 and 172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge
interface br0 with ports eth1 and eth2.</P>
<P class=case>More details:</P>
<P>
The idea is that traffic between 172.16.1.4 and 172.16.1.2 is bridged, while the rest is routed, using
masquerading. Here's the "script" I use at bootup for the bridge/router:</P>
<PRE>
iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT
iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -j MASQUERADE
# load the ebtables modules
insmod ebtables
insmod ebtable_filter
insmod ebtable_nat
insmod ebt_nat
insmod ebt_log
insmod ebt_arp
insmod ebt_ip
insmod br_db
# create br0 with ports eth1 and eth2, spanning tree off
brctl addbr br0
brctl stp br0 off
brctl addif br0 eth1
brctl addif br0 eth2
# the ports get no IP address of their own; only br0 does
ifconfig eth1 0 0.0.0.0
ifconfig eth2 0 0.0.0.0
ifconfig br0 172.16.1.1 netmask 255.255.255.0 up
# enable IP forwarding (routing)
echo '1' > /proc/sys/net/ipv4/ip_forward</PRE>
<P>
The catch is in the first line. Because the <EM>iptables</EM> code gets executed for both bridged packets and routed
packets, we need to make a distinction between the two. We don't really want the bridged packets to be
masqueraded. If we omit the first line then everything will work too, but things will happen differently.
Let's say 172.16.1.2 pings 172.16.1.4. The bridge receives the ping request and will transmit it through its eth1
port after first masquerading the IP address. So the packet's source IP address will now be 172.16.1.1 and
172.16.1.4 will respond to the bridge. Masquerading will change the IP destination of this response from
172.16.1.1 to 172.16.1.2. Everything works fine. But it's better not to have this behaviour. Thus, we use the
first line of the script to avoid this. Note that if I wanted to filter the connections to and from the
internet, I would certainly need the first line so I don't filter the local connections as well.</P>
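<P>
The effect of the first line can be illustrated with a small first-match model of the two nat POSTROUTING rules. This
is only a sketch of the matching logic, not of <EM>iptables</EM> itself: the function name and rule representation are
invented here, and 192.0.2.10 is an arbitrary documentation address standing in for a host on the internet.</P>

```python
import ipaddress

# First-match model of the two nat POSTROUTING rules from the script.
# The rule tuples and function name are hypothetical; only the addresses
# and targets of the two rules come from the script.

LAN = ipaddress.ip_network("172.16.1.0/24")

# (source network, destination network or None for "any", target)
RULES = [
    (LAN, LAN, "ACCEPT"),        # first line: leave local traffic alone
    (LAN, None, "MASQUERADE"),   # second line: masquerade the rest
]


def nat_postrouting(src, dst, rules=RULES):
    """Return the target of the first rule matching the packet."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for src_net, dst_net, target in rules:
        if src_ip in src_net and (dst_net is None or dst_ip in dst_net):
            return target
    return "ACCEPT"  # no rule matched: the packet leaves unchanged


if __name__ == "__main__":
    # Bridged ping between the two local boxes, with and without line one.
    print(nat_postrouting("172.16.1.2", "172.16.1.4"))
    print(nat_postrouting("172.16.1.2", "172.16.1.4", rules=RULES[1:]))
    # Routed packet towards the internet.
    print(nat_postrouting("172.16.1.2", "192.0.2.10"))
```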
<P class=section>
7. IP DNAT in the <EM>iptables</EM> PREROUTING chain on frames/packets entering on a bridge port:</P>
<P>Through some groovy play (see /net/bridge/br_netfilter.c) it is assured that DNAT'ed packets that, after DNAT'ing,
have the same output device as the input device they came in on (the logical bridge device, which we like to call br0)
will be bridged, not routed. So they will go through the <EM>ebtables</EM> FORWARD chain. All other DNAT'ed packets will be
routed, so they won't go through the <EM>ebtables</EM> FORWARD chain, will go through the <EM>ebtables</EM> INPUT chain and might go
through the <EM>ebtables</EM> OUTPUT chain.</P>
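<P>
The rule described in this section boils down to comparing the output device found for the new destination with the
logical bridge device the packet came in on. The sketch below models that comparison with a toy routing table. It is
not how br_netfilter.c is written; the function names, the routing table and the address 192.0.2.10 are assumptions
made for the illustration.</P>

```python
import ipaddress

# Toy model of the bridged-or-routed choice after IP DNAT (section 7).
# The routing table and all names here are hypothetical.

ROUTES = [
    ("172.16.1.0/24", "br0"),   # the bridged LAN
    ("0.0.0.0/0", "eth0"),      # default route towards the ISP
]


def route_lookup(dst, routes=ROUTES):
    """Longest-prefix match: return the output device for dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for cidr, dev in routes:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    if best is None:
        raise LookupError("no route to " + dst)
    return best[1]


def after_dnat(in_bridge, new_dst, routes=ROUTES):
    """Return 'bridged' if the DNAT'ed packet leaves through the bridge
    device it came in on, otherwise 'routed'."""
    return "bridged" if route_lookup(new_dst, routes) == in_bridge else "routed"


if __name__ == "__main__":
    print(after_dnat("br0", "172.16.1.4"))   # stays on br0
    print(after_dnat("br0", "192.0.2.10"))   # leaves through eth0
```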
<P class=section>
8. Using the mac module extension for <EM>iptables</EM>:</P>
<P>The side effect explained here occurs when the br-nf code is compiled into the kernel, the IP packet is routed and the out device
for that packet is a logical bridge. The side effect is encountered when filtering on the MAC source in the
<EM>iptables</EM> FORWARD chains. As should be clear from earlier sections, the traversal of the <EM>iptables</EM> FORWARD chains
is postponed until the packet is in the bridge code. This is done so one can filter on the bridge port out device. This has a
side effect on the MAC source address, because the IP code will have changed the MAC source address to the MAC address of the bridge.
It is therefore impossible, in the <EM>iptables</EM> FORWARD chains, to filter on the MAC source address of the computer sending
the packet in question to the bridge/router. If you really need to filter on this MAC source address, you should do it in the nat
PREROUTING chain. Agreed, very ugly, but making it possible to filter on the real MAC source address in the FORWARD chains would
involve a very dirty hack and is probably not worth it.</P>
<P>
Released under the GPL.</P>
<P>
Bart De Schuymer.</P>
<P>
Last updated June 2nd, 2002.</P>
</BODY></HTML>