<!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01//EN'
'http://www.w3.org/TR/html4/strict.dtd'>
<HTML>
<HEAD>
<TITLE>
ebtables/iptables interaction on a Linux-based bridge
</TITLE>
<META HTTP-EQUIV="Content-Type" CONTENT=
"text/html; charset=iso-8859-1">
<LINK REL="STYLESHEET" TYPE="text/css" HREF="br_fw_ia.css">
</HEAD>
<BODY>
<DIV CLASS="bar">
<DIV CLASS="shadow">
<H1 ALIGN="center">
ebtables/iptables interaction on a Linux-based bridge
</H1>
</DIV>
</DIV>
<H3>
<A NAME="top">Table of Contents</A>
</H3>
<OL>
<LI>
<A HREF="#section1">Introduction</A>
</LI>
<LI>
<A HREF="#section2">How frames traverse the
<EM>ebtables</EM> chains</A>
</LI>
<LI>
<A HREF="#section3">A machine used as a bridge and a
router (not a brouter)</A>
</LI>
<LI>
<A HREF="#section4">DNAT'ing bridged packets</A>
</LI>
<LI>
<A HREF="#section5">Chain traversal for bridged IP packets</A>
</LI>
<LI>
<A HREF="#section6">Using a bridge port in <EM>iptables</EM> rules</A>
</LI>
<LI>
<A HREF="#section7">Two possible ways for frames/packets
to pass through the <EM>iptables</EM> PREROUTING, FORWARD and
POSTROUTING chains</A>
</LI>
<LI>
<A HREF="#section8">IP DNAT in the <EM>iptables</EM> PREROUTING
chain on frames/packets entering on a bridge port</A>
</LI>
<LI>
<A HREF="#section9">Using the MAC module extension for
<EM>iptables</EM></A>
</LI>
<LI>
<A HREF="#section10">Using the <EM>iptables</EM> physdev match module for kernel 2.5</A>
</LI>
</OL>
<A NAME="section1"></A>
<P CLASS="section">
1. Introduction
</P>
<P>
This document describes how <EM>iptables</EM> and
<EM>ebtables</EM> filtering tables interact on a Linux-based bridge.<BR>
On kernels that do not yet include the code, getting a bridging firewall means applying
one or two patches to the kernel source. <EM>Ebtables</EM> has been part of the standard 2.5 kernel
releases since 2.5.39, and <EM>br-nf</EM> since 2.5.44.
Because demand was high, patches for the 2.4 kernel are still available at the <EM>ebtables</EM> homepage.
The <EM>br-nf</EM> code makes bridged IP frames/packets go through the <EM>iptables</EM> chains.
<EM>Ebtables</EM> filters on the Ethernet layer, while <EM>iptables</EM>
only filters IP packets.<BR>
The explanations below will use the TCP/IP Network Model.
It should be noted that the <EM>br-nf</EM> code sometimes violates the
TCP/IP Network
Model. As will be seen later, it is possible, for example, to do IP DNAT inside the Link Layer.<BR>
We want to note that we are perfectly well aware that the word frame is used for the Link Layer,
while the word packet is used for the Network Layer. However, when we are talking about IP packets
inside the Link Layer, we will refer to these as frames/packets or packets/frames.
</P>
<A NAME="section2"></A>
<P CLASS="section">
2. How frames traverse the <EM>ebtables</EM> chains
</P>
<DIV CLASS="note">
This section only considers <EM>ebtables</EM>, not
<EM>iptables</EM>.
</DIV>
<P>
The first thing to keep in mind is that we are talking about
the Ethernet layer here: layer 2 (Data Link
layer) of the OSI model, or layer 1 (Link layer, Network Access layer) of the TCP/IP Network
Model.
</P>
<P>
A frame destined for the local computer according to the
bridge (which works on the Ethernet layer) isn't
necessarily destined for the local computer according to
the IP layer. That's how routing works: the MAC destination is
the router, while the IP destination is the actual box you want to
communicate with.
</P>
<P>
<IMG SRC="bridge2a.png">
</P>
<P>
<I><B>Figure 2a.</B> General frame traversal scheme</I><BR>
</P>
<P>
There are six hooks defined in the Linux bridging code, of which the
BROUTING hook was added for <EM>ebtables</EM>.
</P>
<BR>
<BR>
<IMG SRC="bridge2b.png">
<P>
<I><B>Figure 2b.</B> Ethernet bridging hooks</I><BR>
</P>
<P>
The hooks are specific places in the network
code on which software can attach itself to process the
packets/frames passing that place. For example, the kernel module responsible for the <EM>ebtables</EM> FORWARD chain is attached onto the bridge FORWARD hook.
This is done when the module is loaded into the kernel or at bootup.
</P>
<P>
Note that the <EM>ebtables</EM> BROUTING and PREROUTING chains are traversed before the bridging decision, so these chains will even see frames that
the bridge will ignore. You should take that into account when using these chains. Also note that the chains won't see frames entering on a non-forwarding bridge port.<br>
The bridge's decision for a frame (as seen in Figure 2b) can be one of these:
</P>
<ul>
<li>bridge it, if the destination MAC address is on
another side of the bridge;</li>
<li>flood it over all the forwarding bridge ports, if the
position of the box with the destination MAC is unknown
to the bridge;</li>
<li>pass it to the higher protocol code (the IP code),
if the destination MAC address is that of the bridge or of
one of its ports;</li>
<li>ignore it, if the destination MAC address is located
on the same side of the bridge.</li>
</ul>
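<P>
The bridge bases this decision on its forwarding database of learned MAC addresses.
As a small illustration, assuming a bridge named br0 (the name used later in this document)
and the <EM>brctl</EM> tool from bridge-utils, the database can be inspected like this:
</P>
<PRE>
# List the MAC addresses known to the bridge, the port each one was
# seen on and whether the address belongs to the bridge box itself
brctl showmacs br0
</PRE>
<P>
An address marked as local leads to the frame being passed up to the higher protocol code,
an address learned on another port leads to bridging, and an address that is not listed
leads to flooding.
</P>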
<IMG SRC="bridge2c.png">
<P>
<I><B>Figure 2c.</B> Bridging tables (ebtables) traversal
process</I><BR>
</P>
<P>
<EM>Ebtables</EM> has three tables:
<B>filter</B>, <B>nat</B> and <B>broute</B>, as shown in Figure 2c.
</P>
<UL>
<LI>
The <FONT COLOR="#00ffff"><B>broute</B></FONT> table has
the BROUTING chain.
</LI>
<LI>
The <FONT COLOR="#00ffff"><B>filter</B></FONT> table has
the FORWARD, INPUT and OUTPUT chains.
</LI>
<LI>
The <FONT COLOR="#00ffff"><B>nat</B></FONT> table has the
PREROUTING, OUTPUT and POSTROUTING chains.
</LI>
</UL>
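<P>
As a quick check, the chains of each table can be listed with the <EM>ebtables</EM>
userspace tool. This is only a sketch; it assumes the tool is installed and run as root:
</P>
<PRE>
# filter is the default table, so "-t filter" may be omitted
ebtables -t filter -L
ebtables -t nat -L
ebtables -t broute -L
</PRE>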
<BR>
<DIV CLASS="note">
The filter OUTPUT and nat OUTPUT chains are separate chains and are
used for different purposes.
</DIV>
<P>
Figures 2b and 2c show where the
<EM>ebtables</EM> chains are attached to the bridge hooks.
</P>
<P>
When a NIC enslaved to a bridge receives a frame, the frame
will first go through the BROUTING chain. In this special
chain you can choose whether to route or bridge frames,
enabling you to make a brouter. The definitions of a brouter
found on the Internet differ a bit.
The following definition describes the brouting ability provided by the
BROUTING chain quite well:
</P>
<DIV CLASS="note">
A brouter is a device which
bridges some frames/packets (i.e. forwards based on Link layer
information) and routes other frames/packets (i.e. forwards based
on Network layer information). The bridge/route decision is
based on configuration information.
</DIV>
<P>
A brouter can be used, for example,
to act as a normal router for IP traffic between two
networks, while bridging specific traffic (NetBEUI, ARP,
whatever) between those networks. In such a setup the IP routing
table does not use the bridge logical device and the box has
IP addresses assigned to the physical network devices that
also happen to be bridge ports (bridge enslaved NICs).<BR>
The default decision in the BROUTING chain is bridging.
</P>
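<P>
A minimal sketch of such a brouter setup follows. The port names eth0 and eth1 and the
variables holding their MAC addresses are assumptions made for illustration. In the broute
table the DROP target means "route this frame" and ACCEPT means "bridge it":
</P>
<PRE>
# Hypothetical MAC addresses of the two bridge ports
MAC_OF_ETH0=00:11:22:33:44:55
MAC_OF_ETH1=00:11:22:33:44:66
# Route IPv4 frames instead of bridging them
ebtables -t broute -A BROUTING -p IPv4 -i eth0 -j DROP
ebtables -t broute -A BROUTING -p IPv4 -i eth1 -j DROP
# Hand ARP frames addressed to a port's own MAC address to the IP stack
ebtables -t broute -A BROUTING -p ARP -i eth0 -d $MAC_OF_ETH0 -j DROP
ebtables -t broute -A BROUTING -p ARP -i eth1 -d $MAC_OF_ETH1 -j DROP
# Everything else follows the default policy and is bridged
</PRE>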
<P>
Next the frame passes through the PREROUTING chain.
In this chain you can alter the destination MAC address
of frames (DNAT).
If the frame passes this chain, the bridging code will decide where the
frame should be sent. The bridge does this by looking at
the destination MAC address; it doesn't care about the
Network Layer addresses (e.g. IP address).
</P>
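<P>
For illustration, a MAC DNAT rule could look like the following sketch. Both MAC addresses
and the port name eth0 are made up:
</P>
<PRE>
# Frames entering on eth0 for 00:11:22:33:44:55 get the new destination
# MAC address 54:44:33:22:11:00 before the bridging decision is made
ebtables -t nat -A PREROUTING -i eth0 -d 00:11:22:33:44:55 \
         -j dnat --to-destination 54:44:33:22:11:00
</PRE>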
<P>
If the bridge decides the frame is destined for the local
computer, the frame will go through the INPUT chain.
In this chain you can filter frames destined for the bridge box.
After traversal of the INPUT chain, the frame will be passed up
to the Network Layer code (e.g. to the IP code).
So, a routed IP packet will go through
the <EM>ebtables</EM> INPUT chain, not through the
<EM>ebtables</EM> FORWARD chain. This is logical.
</P>
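<P>
A filtering rule for the INPUT chain might look like this sketch, in which the port name
and the MAC address are hypothetical:
</P>
<PRE>
# Drop frames destined for the bridge box itself that enter on eth0
# and carry this source MAC address
ebtables -A INPUT -i eth0 -s 00:11:22:33:44:55 -j DROP
</PRE>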
<BR>
<BR>
<IMG SRC="bridge2d.png">
<P>
<I><B>Figure 2d.</B> Incoming frames' chain traversal</I><BR>
</P>
<P>
Otherwise the frame should possibly be sent onto another side
of the bridge. If it should, the frame will go through the
FORWARD chain and the POSTROUTING chain. The bridged frames can be
filtered in the FORWARD chain. In the POSTROUTING chain you can alter the MAC
source address (SNAT).
</P>
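<P>
The two possibilities can be sketched as follows. The port names and MAC addresses are
assumptions chosen for the example:
</P>
<PRE>
# Do not bridge frames from this sender from eth0 to eth1
ebtables -A FORWARD -i eth0 -o eth1 -s 00:11:22:33:44:55 -j DROP
# Rewrite the source MAC address of matching frames leaving through eth1
ebtables -t nat -A POSTROUTING -o eth1 -s 00:aa:bb:cc:dd:ee \
         -j snat --to-source 00:11:22:33:44:66
</PRE>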
<BR>
<BR>
<IMG SRC="bridge2e.png">
<P>
<I><B>Figure 2e.</B> Forwarded frames' chain traversal</I><BR>
</P>
<P>
Locally originated frames will, after the bridging decision, traverse
the nat OUTPUT, the filter OUTPUT and the nat POSTROUTING chains.
The nat OUTPUT chain lets you alter the destination
MAC address and the filter OUTPUT chain lets you
filter frames originating from the bridge box. Note that
the nat OUTPUT chain is traversed after the bridging
decision, which is actually too late for DNAT. We should change this. The nat
POSTROUTING chain is the same one as described above.
</P>
<BR>
<BR>
<IMG SRC="bridge2f.png">
<P>
<I><B>Figure 2f.</B> Outgoing frames' chain traversal</I><BR>
</P>
<DIV CLASS="note">
It's also possible for routed frames to go
through these three chains when the destination
device is a logical bridge device.
</DIV>
<BR>
<BR>
<A NAME="section3"></A>
<P CLASS="section">
3. A machine used as a bridge and a router (not a brouter)
</P>
<P>
Here is the IP code hooks scheme:
</P>
<IMG SRC="bridge3a.png">
<P>
<I><B>Figure 3a.</B> IP code hooks</I><BR>
</P>
<P>
Here is the iptables packet traversal scheme.
</P>
<BR>
<BR>
<IMG SRC="bridge3b.png">
<P>
<I><B>Figure 3b.</B> Routing tables (iptables) traversal
process</I><BR>
</P>
<P>
Note that the iptables nat OUTPUT chain is situated after the
routing decision. As commented in the previous section,
this is too late for DNAT. This is solved by rerouting the
IP packet if it has been DNAT'ed, before continuing. For clarity:
this is standard behaviour of the Linux kernel, not something
caused by our code.
</P>
<P>
Figures 3a and 3b show where the
<EM>iptables</EM> chains are attached to the IP hooks. When the
bridge code and netfilter are enabled in the kernel, the <EM>iptables</EM> chains are
also attached to the hooks of the bridging code. However,
this does not mean that they are no longer attached onto their
standard IP code hooks. For IP packets that get into
contact with the bridging code, the <EM>br-nf</EM> code will
decide in which place in the network code the <EM>iptables</EM>
chains will be traversed. Obviously, it is guaranteed that no chain is
traversed twice by the same packet. All packets that do not come into
contact with the bridge code traverse the <EM>iptables</EM> chains
in the standard way as seen in Figure 3b.<BR>
The following sections try, among other things,
to explain what the <EM>br-nf</EM> code does and why it does it.
</P>
<P>
It's possible to see a single IP packet/frame traverse the
nat PREROUTING, filter INPUT, nat OUTPUT, filter OUTPUT and
nat POSTROUTING <EM>ebtables</EM> chains.<BR>
This can happen when the bridge is also used as a router.
The Ethernet frame(s) containing that IP packet will have
the bridge's MAC address as destination MAC address, while the destination
IP address is not an address of the bridge. Including the
<EM>iptables</EM> chains, this is how the IP packet runs
through the bridge/router (actually there is more going on,
see <A HREF="#section6">section 6</A>):
</P>
<P>
<IMG SRC="bridge3c.png">
</P>
<P>
<I><B>Figure 3c.</B> Bridge/router routes packet to a
bridge interface (simplistic view)</I><BR>
</P>
<P>
This assumes that the routing decision sends the packet to
a bridge interface. If the routing decision sends the
packet to a physical network card, this is what happens:
</P>
<P>
<IMG SRC="bridge3d.png">
</P>
<P>
<I><B>Figure 3d.</B> Bridge/router routes packet to a
physical interface (simplistic view)</I><BR>
</P>
<P>
Figures 3c and 3d assume the IP packet arrived on a bridge port.
What is obviously "asymmetric" here is that the
<EM>iptables</EM> PREROUTING chain is traversed before the
<EM>ebtables</EM> INPUT chain. However, this cannot be
helped without sacrificing functionality. See the
next section.
</P>
<A NAME="section4"></A>
<P CLASS="section">
4. DNAT'ing bridged packets
</P>
<P>
Take an IP packet received by the bridge. Let's assume we
want to do some IP DNAT on it.
Changing the destination address of the packet (IP address
and MAC address) has to happen before the bridge code
decides what to do with the frame/packet.
</P>
<P>
So, this IP DNAT has to happen very early in the bridge
code, namely before the bridge code actually does anything.
This is the same place where the <EM>ebtables</EM> nat
PREROUTING chain is traversed (for the same reason).
This should explain the asymmetry encountered in Figures 3c
and 3d.<BR>
One should also be aware that frames for which the
bridging decision would be the fourth one in the list of
<A HREF="#section2">section 2</A> (i.e.
ignore the frame) will still be seen in the PREROUTING chains of
<EM>ebtables</EM> and <EM>iptables</EM>.
</P>
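<P>
As a sketch, an IP DNAT rule for packets arriving through the bridge could look like this.
The bridge name br0 and both IP addresses are made up for the example:
</P>
<PRE>
# Redirect packets addressed to 192.168.1.10 to 192.168.1.20. The rule is
# traversed before the bridge code decides what to do with the frame.
iptables -t nat -A PREROUTING -i br0 -d 192.168.1.10 \
         -j DNAT --to-destination 192.168.1.20
</PRE>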
<A NAME="section5"></A>
<P CLASS="section">
5. Chain traversal for bridged IP packets
</P>
<P>
A bridged packet never enters any network code above layer
1 (Link Layer). So, a bridged IP packet/frame will never enter the
IP code.
Therefore all <EM>iptables</EM> chains will be traversed
while the IP packet is in the bridge code. The chain
traversal will look like this:
</P>
<P>
<IMG SRC="bridge5.png">
</P>
<P>
<I><B>Figure 5.</B> Chain traversal for bridged IP
packets</I><BR>
</P>
<A NAME="section6"></A>
<P CLASS="section">
6. Using a bridge port in <EM>iptables</EM> rules
</P>
<P>
The wish to be able to use physical devices belonging to a
bridge (bridge ports) in <EM>iptables</EM> rules is valid.
Knowing the input bridge ports is necessary to prevent
spoofing attacks. Say br0 has ports eth0 and eth1. If
<EM>iptables</EM> rules can only use br0 there's no way of
knowing when a box on the eth0 side changes its source IP
address to that of a box on the eth1 side, except by
looking at the MAC source address (and then still...). With
the <EM>br-nf</EM> code you can use eth0 and eth1 in your
<EM>iptables</EM> rules and therefore catch these attempts.
</P>
<P CLASS="case">
6.1. <EM>iptables</EM> wants to use the bridge destination
ports:
</P>
<P>
To make this possible the <EM>iptables</EM> chains have to
be traversed after the bridge code has decided where the frame
needs to be sent (eth0, eth1 or both). This has some
impact on the scheme presented in <A HREF=
"#section3">section 3</A> (so, we are looking at routed
traffic here, entering the box on a bridge port). It actually
looks like this (in the case of Figure 3c):
</P>
<P>
<IMG SRC="bridge6a.png">
</P>
<P>
<I><B>Figure 6a.</B> Chain traversal for routing, when the bridge
and netfilter code are compiled in the kernel.</I><BR>
</P>
<DIV CLASS="note">
All chains are now traversed while in the bridge code.<BR>
This is the work of the <EM>br-nf</EM> code. Obviously this does not
mean that the routed IP packets never enter the IP code. They
just don't pass any <EM>iptables</EM> chains while in the IP code.
</DIV>
<P CLASS="case">
6.2. IP DNAT for locally generated packets (so in the
<EM>iptables</EM> nat OUTPUT chain):
</P>
<P>
The normal way locally generated packets would go through
the chains looks like this:
</P>
<P>
<IMG SRC="bridge6c.png">
</P>
<P>
<I><B>Figure 6c.</B> The normal way for locally generated
packets</I><BR>
</P>
<P>
From <A HREF="#section6">section 6.1</A> we know that this
actually looks like this (due to the <EM>br-nf</EM> code):
</P>
<P>
<IMG SRC="bridge6d.png">
</P>
<P>
<I><B>Figure 6d.</B> The actual way for locally generated
packets</I><BR>
</P>
<P>
Note that the <EM>iptables</EM> nat OUTPUT chain is traversed while the
packet is in the IP code and the <EM>iptables</EM> filter OUTPUT chain
is traversed when the packet has passed the bridging decision.
This makes it possible to do DNAT to another device in the
nat OUTPUT chain and lets us use the bridge ports in the
filter OUTPUT chain.
</P>
<A NAME="section7"></A>
<P CLASS="section">
7. Two possible ways for frames/packets to pass through the
<EM>iptables</EM> PREROUTING, FORWARD and POSTROUTING
chains
</P>
<P>
Because of the <EM>br-nf</EM> code, there are 2 ways a frame/packet can
pass through the 3 given <EM>iptables</EM> chains. The
first way is when the frame is bridged, so the
<EM>iptables</EM> chains are called by the bridge code. The
second way is when the packet is routed. So special care
has to be taken to distinguish between those two,
especially in the <EM>iptables</EM> FORWARD chain. Here's
an example of strange things to look out for:
</P>
<P>
Consider the following situation
</P>
<P>
<IMG SRC="bridge7a.png">
</P>
<P>
<I><B>Figure 7a.</B> Very basic setup.</I><BR>
</P>
<P>
The default gateway for 172.16.1.2 and
172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge
interface br0 with ports eth1 and eth2.
</P>
<P CLASS="case">
More details:
</P>
<P>
The idea is that traffic between 172.16.1.4 and 172.16.1.2 is
bridged, while the rest is routed, using masquerading.
</P>
<P>
<IMG SRC="bridge7b.png">
</P>
<P>
<I><B>Figure 7b.</B> Traffic flow for the example setup.</I><BR>
</P>
<P>
Here's a possible scheme to use at bootup for the bridge/router:
</P>
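<PRE>
iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT
iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -j MASQUERADE
# Create the bridge br0 (spanning tree off) with ports eth1 and eth2
brctl addbr br0
brctl stp br0 off
brctl addif br0 eth1
brctl addif br0 eth2
# The ports get no IP address of their own; br0 holds the address
ifconfig eth1 0.0.0.0
ifconfig eth2 0.0.0.0
ifconfig br0 172.16.1.1 netmask 255.255.255.0 up
# Enable IP forwarding so the box also acts as a router
echo '1' &gt; /proc/sys/net/ipv4/ip_forward
</PRE>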
<P>
The catch is in the first line. Because the
<EM>iptables</EM> code gets executed for both bridged
packets and routed packets, we need to make a distinction
between the two. We don't really want the bridged frames/packets
to be masqueraded. If we omit the first line,
everything will still work, but things will happen
differently. Let's say 172.16.1.2 pings 172.16.1.4. The
bridge receives the ping request and will transmit it
through its eth1 port after first masquerading the source IP
address. So the packet's source IP address will now be
172.16.1.1 and 172.16.1.4 will respond to the bridge.
Masquerading will then change the IP destination of this
response from 172.16.1.1 to 172.16.1.2, so the reply still
reaches the original sender. Everything works
fine, but it's better not to have this behaviour. Thus, we
use the first line to avoid this. Note that
if we wanted to filter the connections to and from the
Internet, we would certainly need the first line so we don't
filter the local connections as well.
</P>
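<P>
The same distinction can be made in the <EM>iptables</EM> FORWARD chain. The sketch below
reuses the addresses of the example. The choice of an ACCEPT rule is only an assumption
about the desired policy:
</P>
<PRE>
# Traffic between two hosts of 172.16.1.0/24 is bridged: let it pass
# before any rules that are meant for routed traffic only
iptables -A FORWARD -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT
# Rules for routed (Internet) traffic would follow here
</PRE>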
<A NAME="section8"></A>
<P CLASS="section">
8. IP DNAT in the <EM>iptables</EM> PREROUTING chain on
frames/packets entering on a bridge port
</P>
<P>
The <EM>br-nf</EM> code (see
/net/bridge/br_netfilter.c) makes sure that DNAT'ed packets that, after
DNAT'ing, have the same output device as the input device
they came in on (the logical bridge device, which we like to
call br0) will go through the <EM>ebtables</EM> FORWARD
chain, not through the <EM>ebtables</EM> INPUT/OUTPUT chains. All
other DNAT'ed packets will be purely routed: they won't go
through the <EM>ebtables</EM> FORWARD chain, will go through
the <EM>ebtables</EM> INPUT chain and might go through the
<EM>ebtables</EM> OUTPUT chain.<BR>
</P>
<A NAME="section9"></A>
<P CLASS="section">
9. Using the MAC module extension for <EM>iptables</EM>
</P>
<P>
The side effect explained here occurs when the netfilter code
is enabled in the kernel, the IP packet is routed and the
out device for that packet is a logical bridge device. The
side effect is encountered when filtering on the MAC source
in the <EM>iptables</EM> FORWARD chains. As should be clear
from earlier sections, the traversal of the
<EM>iptables</EM> FORWARD chains is postponed until the
packet is in the bridge code. This is done so we can
filter on the bridge port out device. This has a side
effect on the MAC source address, because the IP code will
have changed the MAC source address to the MAC address of
the bridge device. It is therefore impossible, in the
<EM>iptables</EM> FORWARD chains, to filter on the MAC
source address of the computer sending the packet in
question to the bridge/router. If you really need to filter
on this MAC source address, you should do it in the nat
PREROUTING chain. Agreed, very ugly, but making it possible
to filter on the real MAC source address in the FORWARD
chains would involve a very dirty hack and is probably not
worth it. This of course makes the anti-spoofing remark of
<A HREF="#section6">section 6</A> somewhat ironic.
</P>
<A NAME="section10"></A>
<P CLASS="section">
10. Using the <EM>iptables</EM> physdev match module for kernel 2.5
</P>
<P>
The 2.5 standard kernel contains an <EM>iptables</EM> match module
called <EM>physdev</EM> which has to be used to match the bridge's
physical in and out ports. Its usage is simple:</P>
<PRE>iptables -m physdev --physdev-in &lt;bridge-port&gt;</PRE><P>
and</P>
<PRE>iptables -m physdev --physdev-out &lt;bridge-port&gt;</PRE>
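<P>
As a sketch of the anti-spoofing use mentioned in <A HREF="#section6">section 6</A>, assume
(hypothetically) that the host 172.16.1.4 sits behind port eth1 of a bridge with ports eth1
and eth2. Packets that claim this source address but enter through eth2 can then be dropped:
</P>
<PRE>
# 172.16.1.4 lives on the eth1 side, so it must never show up as the
# source address of frames/packets entering through eth2
iptables -A FORWARD -m physdev --physdev-in eth2 -s 172.16.1.4 -j DROP
</PRE>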
<HR>
<PRE>
Released under the GNU Free Documentation License.
Copyright (c) 2002 Bart De Schuymer &lt;bdschuym@pandora.be&gt;,
Nick Fedchik &lt;nick@fedchik.org.ua&gt;.
</PRE>
<BR>
<BR>
<BR>
<SMALL>Permission is granted to copy, distribute and/or
modify this document under the terms of the GNU Free
Documentation License, Version 1.1 or any later version
published by the Free Software Foundation, with no Invariant Sections,
with no Front-Cover Texts, and with no Back-Cover Texts. For a copy of the
license, see <A HREF=
"http://www.gnu.org/licenses/fdl.txt">"GNU Free Documentation License"</A>.</SMALL> <BR>
<BR>
<P>
Last updated December 23, 2002.
</P>
</BODY>
</HTML>