docs/br_fw_ia/br_fw_ia.html - external/ebtables - Gitiles

 <!DOCTYPE html PUBLIC '-//W3C//DTD HTML 4.01//EN'
 'http://www.w3.org/TR/html4/strict.dtd'>
 <HTML>
   <HEAD>
     <TITLE>
       ebtables/iptables interaction on a Linux-based bridge
     </TITLE>
     <META HTTP-EQUIV="Content-Type" CONTENT=
     "text/html; charset=iso-8859-1">
     <LINK REL="STYLESHEET" TYPE="text/css" HREF="br_fw_ia.css">
   </HEAD>
   <BODY>
     <DIV CLASS="bar">
       <DIV CLASS="shadow">
         <H1 ALIGN="center">
           ebtables/iptables interaction on a Linux-based bridge
         </H1>
       </DIV>
     </DIV>
     <H3>
       <A NAME="top">Table of Contents</A>
     </H3>
     <OL>
       <LI>
         <A HREF="#section1">Introduction</A>
       </LI>
       <LI>
         <A HREF="#section2">How frames traverse the
         <EM>ebtables</EM> chains</A>
       </LI>
       <LI>
         <A HREF="#section3">A machine used as a bridge and a
         router (not a brouter)</A>
       </LI>
       <LI>
         <A HREF="#section4">DNAT'ing bridged packets</A>
       </LI>
       <LI>
         <A HREF="#section5">Chain traversal for bridged IP packets</A>
       </LI>
       <LI>
         <A HREF="#section6">Using a bridge port in <EM>iptables</EM> rules</A>
       </LI>
       <LI>
         <A HREF="#section7">Two possible ways for frames/packets
         to pass through the <EM>iptables</EM> PREROUTING, FORWARD and
         POSTROUTING chains</A>
       </LI>
       <LI>
         <A HREF="#section8">IP DNAT in the <EM>iptables</EM> PREROUTING
         chain on frames/packets entering on a bridge port</A>
       </LI>
       <LI>
         <A HREF="#section9">Using the MAC module extension for
         <EM>iptables</EM></A>
       </LI>
       <LI>
         <A HREF="#section10">Using the <EM>iptables</EM> physdev match module for kernel 2.5</A>
       </LI>
     </OL>
     <A NAME="section1"></A>
     <P CLASS="section">
       1. Introduction
     </P>
     <P>
       This document describes how <EM>iptables</EM> and
       <EM>ebtables</EM> filtering tables interact on a Linux-based bridge.<BR>
       Getting a bridging firewall consists of patching the kernel source
       code with one or two patches. Since kernel 2.5.39 (resp. 2.5.44), <EM>ebtables</EM> (resp. <EM>br-nf</EM>)
       is in the standard 2.5 kernel releases.
       Because the demand was high, patches for the 2.4 kernel are still available at the <EM>ebtables</EM> homepage.
       The <EM>br-nf</EM> code makes bridged IP frames/packets go through the <EM>iptables</EM> chains.
       <EM>Ebtables</EM> filters on the Ethernet layer, while <EM>iptables</EM>
       only filters IP packets.<BR>
       The explanations below will use the TCP/IP Network Model.
       It should be noted that the <EM>br-nf</EM> code sometimes violates the
       TCP/IP Network
       Model. As will be seen later, it is possible, f.e., to do IP DNAT inside the Link Layer.<BR>
       We want to note that we are perfectly well aware that the word frame is used for the Link Layer,
       while the word packet is used for the Network Layer. However, when we are talking about IP packets
       inside the Link Layer, we will refer to these as frames/packets or packets/frames.
     </P>
     <A NAME="section2"></A>
     <P CLASS="section">
       2. How frames traverse the <EM>ebtables</EM> chains
     </P>
     <DIV CLASS="note">
       This section only considers <EM>ebtables</EM>, not
       <EM>iptables</EM>.
     </DIV>
     <P>
       First thing to keep in mind is that we are talking about
       the Ethernet layer here, so the OSI layer 2 (Data link
       layer), or layer 1 (Link layer, Network Access layer) by the TCP/IP Network
       Model.
     </P>
     <P>
       A packet destined for the local computer according to the
       bridge (which works on the Ethernet layer) isn't
       necessarily destined for the local computer according to
       the IP layer. That's how routing works (MAC destination is
       the router, IP destination is the actual box you want to
       communicate with).
     </P>
     <P>
       <IMG SRC="bridge2a.png">
     </P>
     <P>
       <I><B>Figure 2a.</B> General frame traversal scheme</I><BR>
     </P>
     <P>
     </P>
     <P>
       There are six hooks defined in the Linux bridging code, of which the
       BROUTING hook was added for <EM>ebtables</EM>.
     </P>
     <BR>
      <BR>
      <IMG SRC="bridge2b.png">
     <P>
       <I><B>Figure 2b.</B> Ethernet bridging hooks</I><BR>
     </P>
     </P>
     <P>
     <P>
       The hooks are specific places in the network
       code on which software can attach itself to process the
       packets/frames passing that place. For example, the kernel module responsible for the <EM>ebtables</EM> FORWARD chain is attached onto the bridge FORWARD hook.
       This is done when the module is loaded into the kernel or at bootup.
     </P>
     <P>
      Note that the <EM>ebtables</EM> BROUTING and PREROUTING chains are traversed before the bridging decision, therefore these chains will even see frames that will be
      ignored by the bridge. You should take that into account when using this chain. Also note that the chains won't see frames entering on a non-forwarding bridge port.<br>
      The bridge's decision for a frame (as seen on Figure 2b) can be one of these:
       <ul>
       <li>bridge it, if the destination MAC address is on
       another side of the bridge;</li>
       <li>flood it over all the forwarding bridge ports, if the
       position of the box with the destination MAC is unknown
       to the bridge;</li>
       <li>pass it to the higher protocol code (the IP code),
       if the destination MAC address is that of the bridge or of
       one of its ports;</li>
       <li>ignore it, if the destination MAC address is located
       on the same side of the bridge.</li>
       </ul>
     </P>
      <IMG SRC="bridge2c.png">
     <P>
       <I><B>Figure 2c.</B> Bridging tables (ebtables) traversal
       process</I><BR>
     </P>
     <P>
       <EM>Ebtables</EM> has three tables:
       <B>filter</B>, <B>nat</B> and <B>broute</B>, as shown in Figure 2c.
     </P>
     <UL>
       <LI>
         The <FONT COLOR="#00ffff"><B>broute</B></FONT> table has
         the BROUTING chain.
       </LI>
       <LI>
         The <FONT COLOR="#00ffff"><B>filter</B></FONT> table has
         the FORWARD, INPUT and OUTPUT chains.
       </LI>
       <LI>
         The <FONT COLOR="#00ffff"><B>nat</B></FONT> table has the
         PREROUTING, OUTPUT and POSTROUTING chains.
       </LI>
     </UL>
     <BR>
     <DIV CLASS="note">
       The filter OUTPUT and nat OUTPUT chains are separated and have
       a different usage.
     </DIV>
     <P>
       Figures 2b and 2c give a clear view where the
       <EM>ebtables</EM> chains are attached onto the bridge hooks.
     </P>
     <P>
       When an NIC enslaved to a bridge receives a frame, the frame
       will first go through the BROUTING chain. In this special
       chain you can choose whether to route or bridge frames,
       enabling you to make a brouter. The definitions found on
       the Internet for what a brouter actually is differ a bit.
       The next definition describes the brouting ability using the
       BROUTING chain quite well:
     </P>
     <DIV CLASS="note">
       A brouter is a device which
       bridges some frames/packets (i.e. forwards based on Link layer
       information) and routes other frames/packets (i.e. forwards based
       on Network layer information). The bridge/route decision is
       based on configuration information.
     </DIV>
     <P>
       A brouter can be used, for example,
       to act as a normal router for IP traffic between 2
       networks, while bridging specific traffic (NetBEUI, ARP,
       whatever) between those networks. The IP routing
       table does not use the bridge logical device and the box has
       IP addresses assigned to the physical network devices that
       also happen to be bridge ports (bridge enslaved NICs).<BR>
       The default decision in the BROUTING chain is bridging.
     </P>
     <P>
       Next the frame passes through the PREROUTING chain.
       In this chain you can alter the destination MAC address
       of frames (DNAT).
       If the frame passes this chain, the bridging code will decide where the
       frame should be sent. The bridge does this by looking at
       the destination MAC address, it doesn't care about the
       Network Layer addresses (e.g. IP address).
     </P>
     <P>
       If the bridge decides the frame is destined for the local
       computer, the frame will go through the INPUT chain.
       In this chain you can filter frames destined for the bridge box.
       After traversal of the INPUT chain, the frame will be passed up
       to the Network Layer code (e.g. to the IP code).
       So, a routed IP packet will go through
       the <EM>ebtables</EM> INPUT chain, not through the
       <EM>ebtables</EM> FORWARD chain. This is logical.
     </P>
      <BR>
      <BR>
      <IMG SRC="bridge2d.png">
     <P>
       <I><B>Figure 2d.</B> Incoming frames' chain traversal</I><BR>
     </P>
     <P>
       Otherwise the frame should possibly be sent onto another side
       of the bridge. If it should, the frame will go through the
       FORWARD chain and the POSTROUTING chain. The bridged frames can be
       filtered in the FORWARD chain. In the POSTROUTING chain you can alter the MAC
       source address (SNAT).
     </P>
      <BR>
      <BR>
      <IMG SRC="bridge2e.png">
     <P>
       <I><B>Figure 2e.</B> Forwarded frames' chain traversal</I><BR>
     </P>
     <P>
       Locally originated frames will, after the bridging decision, traverse
       the nat OUTPUT, the filter OUTPUT and the nat POSTROUTING chains.
       The nat OUTPUT chain allows to alter the destination
       MAC address and the filter OUTPUT chain allows to
       filter frames originating from the bridge box. Note that
       the nat OUTPUT chain is traversed after the bridging
       decision, so this is actually too late. We should change this. The nat
       POSTROUTING chain is the same one as described above.
     </P>
      <BR>
      <BR>
      <IMG SRC="bridge2f.png">
     <P>
       <I><B>Figure 2f.</B> Outgoing frames' chain traversal</I><BR>
     </P>
     <DIV CLASS="note">
       It's also possible for routed frames to go
       through these three chains when the destination
       device is a logical bridge device.
     </DIV>
     <BR>
      <BR>
      <A NAME="section3"></A>
     <P CLASS="section">
       3. A machine used as a bridge and a router (not a brouter)
     </P>
     <P>
       Here is the IP code hooks scheme:
     </P>
     <IMG SRC="bridge3a.png">
     <P>
       <I><B>Figure 3a.</B> IP code hooks</I><BR>
     </P>
     <P>
       Here is the iptables packet traversal scheme.
     </P>
     <BR>
      <BR>
      <IMG SRC="bridge3b.png">
     <P>
       <I><B>Figure 3b.</B> Routing tables (iptables) traversal
       process</I><BR>
     </P>
     <P>
       Note that the iptables nat OUTPUT chain is situated after the
       routing decision. As commented in the previous section,
       this is too late for DNAT. This is solved by rerouting the
       IP packet if it has been DNAT'ed, before continuing. For clarity:
       this is standard behaviour of the Linux kernel, not something
       caused by our code.
     </P>
     <P>
       Figures 3a and 3b give a clear view where the
       <EM>iptables</EM> chains are attached onto the IP hooks. When the
       bridge code and netfilter is enabled in the kernel, the iptables chains are
       also attached onto the hooks of the bridging code. However,
       this does not mean that they are no longer attached onto their
       standard IP code hooks. For IP packets that get into
       contact with the bridging code, the <EM>br-nf</EM> code will
       decide in which place in the network code the <EM>iptables</EM>
       chains will be traversed. Obviously, it is guaranteed that no chain is
       traversed twice by the same packet. All packets that do not come into
       contact with the bridge code traverse the <EM>iptables</EM> chains
       in the standard way as seen in Figure 3b.<BR>
       The following sections try, among other things,
       to explain what the <EM>br-nf</EM> code does and why it does it.
     </P>
     <P>
       It's possible to see a single IP packet/frame traverse the
       nat PREROUTING, filter INPUT, nat OUTPUT, filter OUTPUT and
       nat POSTROUTING <EM>ebtables</EM> chains.<BR>
       This can happen when the bridge is also used as a router.
       The Ethernet frame(s) containing that IP packet will have
       the bridge's destination MAC address, while the destination
       IP address is not of the bridge. Including the
       <EM>iptables</EM> chains, this is how the IP packet runs
       through the bridge/router (actually there is more going on,
       see <A HREF="#section6">section 6</A>):
     </P>
     <P>
       <IMG SRC="bridge3c.png">
     </P>
     <P>
       <I><B>Figure 3c.</B> Bridge/router routes packet to a
       bridge interface (simplistic view)</I><BR>
     </P>
     <P>
       This assumes that the routing decision sends the packet to
       a bridge interface. If the routing decision sends the
       packet to a physical network card, this is what happens:
     </P>
     <P>
       <IMG SRC="bridge3d.png">
     </P>
     <P>
       <I><B>Figure 3d.</B> Bridge/router routes packet to a
       physical interface (simplistic view)</I><BR>
     </P>
     <P>
       Figures 3c and 3d assume the IP packet arrived on a bridge port.
       What is obviously "asymmetric" here is that the
       <EM>iptables</EM> PREROUTING chain is traversed before the
       <EM>ebtables</EM> INPUT chain, however this cannot be
       helped without sacrificing functionality. See the
       next section.
     </P>
     <A NAME="section4"></A>
     <P CLASS="section">
       4. DNAT'ing bridged packets
     </P>
     <P>
       Take an IP packet received by the bridge. Let's assume we
       want to do some IP DNAT on it.
       Changing the destination address of the packet (IP address
       and MAC address) has to happen before the bridge code
       decides what to do with the frame/packet.
     </P>
     <P>
       So, this IP DNAT has to happen very early in the bridge
       code. Namely before the bridge code actually does anything.
       This is at the same place as where the <EM>ebtables</EM> nat
       PREROUTING chain will be traversed (for the same reason).
       This should explain the asymmetry encountered in Figures 3c
       and 3d.<BR>
       One should also be aware of the fact that frames for which the
       bridging decision would be the fourth from the above list (i.e.
       ignore the frame) will be seen in the PREROUTING chains of
       <EM>ebtables</EM> and <EM>iptables</EM>.
     </P>
     <A NAME="section5"></A>
     <P CLASS="section">
       5. Chain traversal for bridged IP packets
     </P>
     <P>
       A bridged packet never enters any network code above layer
       1 (Link Layer). So, a bridged IP packet/frame will never enter the
       IP code.
       Therefore all <EM>iptables</EM> chains will be traversed
       while the IP packet is in the bridge code. The chain
       traversal will look like this:
     </P>
     <P>
       <IMG SRC="bridge5.png">
     </P>
     <P>
       <I><B>Figure 5.</B> Chain traversal for bridged IP
       packets</I><BR>
     </P>
     <A NAME="section6"></A>
     <P CLASS="section">
       6. Using a bridge port in <EM>iptables</EM> rules
     </P>
     <P>
       The wish to be able to use physical devices belonging to a
       bridge (bridge ports) in <EM>iptables</EM> rules is valid.
       Knowing the input bridge ports is necessary to prevent
       spoofing attacks. Say br0 has ports eth0 and eth1. If
       <EM>iptables</EM> rules can only use br0 there's no way of
       knowing when a box on the eth0 side changes its source IP
       address to that of a box on the eth1 side, except by
       looking at the MAC source address (and then still...). With
       the <EM>br-nf</EM> code you can use eth0 and eth1 in your
       <EM>iptables</EM> rules and therefore catch these attempts.
     </P>
     <P CLASS="case">
       6.1. <EM>iptables</EM> wants to use the bridge destination
       ports:
     </P>
     <P>
       To make this possible the <EM>iptables</EM> chains have to
       be traversed after the bridge code decided where the frame
       needs to be sent (eth0, eth1 or both). This has some
       impact on the scheme presented in <A HREF=
       "#section3">section 3</A> (so, we are looking at routed
       traffic here, entering the box on a bridge port). It actually
       looks like this (in the case of Figure 3c):
     </P>
     <P>
       <IMG SRC="bridge6a.png">
     </P>
     <P>
       <I><B>Figure 6a.</B> Chain traversal for routing, when the bridge
       and netfilter code are compiled in the kernel.</I><BR>
     </P>
     <DIV CLASS="note">
       All chains are now traversed while in the bridge code.<BR>
       This is the work of the <EM>br-nf</EM> code. Obviously this does not
       mean that the routed IP packets never enter the IP code. They
       just don't pass any <EM>iptables</EM> chains while in the IP code.
     </DIV>
     <P CLASS="case">
       6.2. IP DNAT for locally generated packets (so in the
       <EM>iptables</EM> nat OUTPUT chain):
     </P>
     <P>
       The normal way locally generated packets would go through
       the chains looks like this:
     </P>
     <P>
       <IMG SRC="bridge6c.png">
     </P>
     <P>
       <I><B>Figure 6c.</B> The normal way for locally generated
       packets</I><BR>
     </P>
     <P>
       From <A HREF="#section6">section 6.1</A> we know that this
       actually looks like this (due to the <EM>br-nf</EM> code):
     </P>
     <P>
       <IMG SRC="bridge6d.png">
     </P>
     <P>
       <I><B>Figure 6d.</B> The actual way for locally generated
       packets</I><BR>
     </P>
     <P>
       Note that the <EM>iptables</EM> nat OUTPUT chain is traversed while the
       packet is in the IP code and the <EM>iptables</EM> filter OUTPUT chain
       is traversed when the packet has passed the bridging decision.
       This makes it possible to do DNAT to another device in the
       nat OUTPUT chain and lets us use the bridge ports in the
       filter OUTPUT chain.
     </P>
     <A NAME="section7"></A>
     <P CLASS="section">
       7. Two possible ways for frames/packets to pass through the
       <EM>iptables</EM> PREROUTING, FORWARD and POSTROUTING
       chains
     </P>
     <P>
       Because of the <EM>br-nf</EM> code, there are 2 ways a frame/packet can
       pass through the 3 given <EM>iptables</EM> chains. The
       first way is when the frame is bridged, so the
       <EM>iptables</EM> chains are called by the bridge code. The
       second way is when the packet is routed. So special care
       has to be taken to distinguish between those two,
       especially in the <EM>iptables</EM> FORWARD chain. Here's
       an example of strange things to look out for:
     </P>
     <P>
       Consider the following situation
     </P>
     <P>
       <IMG SRC="bridge7a.png">
     </P>
     <P>
       <I><B>Figure 7a.</B> Very basic setup.</I><BR>
     </P>
     <P>
       The default gateway for 172.16.1.2 and
       172.16.1.4 is 172.16.1.1. 172.16.1.1 is the bridge
       interface br0 with ports eth1 and eth2.
     </P>
     <P CLASS="case">
       More details:
     </P>
     <P>
       The idea is that traffic between 172.16.1.4 and 172.16.1.2 is
       bridged, while the rest is routed, using masquerading.
     </P>
     <P>
       <IMG SRC="bridge7b.png">
     </P>
     <P>
       <I><B>Figure 7b.</B> Traffic flow for the example setup.</I><BR>
     </P>
     <P>
       Here's a possible scheme to use at bootup for the bridge/router:
     </P>
 <PRE>
 iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -d 172.16.1.0/24 -j ACCEPT
 iptables -t nat -A POSTROUTING -s 172.16.1.0/24 -j MASQUERADE

 brctl addbr br0
 brctl stp br0 off
 brctl addif br0 eth1
 brctl addif br0 eth2

 ifconfig eth1 0 0.0.0.0
 ifconfig eth2 0 0.0.0.0
 ifconfig br0 172.16.1.1 netmask 255.255.255.0 up

 echo '1' &gt; /proc/sys/net/ipv4/ip_forward
 </PRE>
     <P>
       The catch is in the first line. Because the
       <EM>iptables</EM> code gets executed for both bridged
       packets and routed packets, we need to make a distinction
       between the two. We don't really want the bridged frames/packets
       to be masqueraded. If we omit the first line then
       everything will work too, but things will happen
       differently. Let's say 172.16.1.2 pings 172.16.1.4. The
       bridge receives the ping request and will transmit it
       through its eth1 port after first masquerading the IP
       address. So the packet's source IP address will now be
       172.16.1.1 and 172.16.1.4 will respond to the bridge.
       Masquerading will change the IP destination of this
       response from 172.16.1.1 to 172.16.1.4. Everything works
       fine. But it's better not to have this behaviour. Thus, we
       use the first line to avoid this. Note that
       if we would want to filter the connections to and from the
       Internet, we would certainly need the first line so we don't
       filter the local connections as well.
     </P>
     <A NAME="section8"></A>
     <P CLASS="section">
       8. IP DNAT in the <EM>iptables</EM> PREROUTING chain on
       frames/packets entering on a bridge port
     </P>
     <P>
       Through some groovy play it is assured that (see
       /net/bridge/br_netfilter.c) DNAT'ed packets that after
       DNAT'ing have the same output device as the input device
       they came on (the logical bridge device which we like to
       call br0) will go through the <EM>ebtables</EM> FORWARD
       chain, not through the <EM>ebtables</EM> INPUT/OUTPUT chains. All
       other DNAT'ed packets will be purely routed, so won't go
       through the <EM>ebtables</EM> FORWARD chain, will go through
       the <EM>ebtables</EM> INPUT chain and might go through the
       <EM>ebtables</EM> OUTPUT chain.<BR>
     </P>
     <A NAME="section9"></A>
     <P CLASS="section">
       9. Using the MAC module extension for <EM>iptables</EM>
     </P>
     <P>
       The side effect explained here occurs when the netfilter code
       is enabled in the kernel, the IP packet is routed and the
       out device for that packet is a logical bridge device. The
       side effect is encountered when filtering on the MAC source
       in the <EM>iptables</EM> FORWARD chains. As should be clear
       from earlier sections, the traversal of the
       <EM>iptables</EM> FORWARD chains is postponed until the
       packet is in the bridge code. This is done so we can
       filter on the bridge port out device. This has a side
       effect on the MAC source address, because the IP code will
       have changed the MAC source address to the MAC address of
       the bridge device. It is therefore impossible, in the
       <EM>iptables</EM> FORWARD chains, to filter on the MAC
       source address of the computer sending the packet in
       question to the bridge/router. If you really need to filter
       on this MAC source address, you should do it in the nat
       PREROUTING chain. Agreed, very ugly, but making it possible
       to filter on the real MAC source address in the FORWARD
       chains would involve a very dirty hack and is probably not
       worth it. This of course makes the anti-spoofing remark of
       <A HREF="#section6">section 6</A> funny.
     </P>
     <A NAME="section10"></A>
     <P CLASS="section">
       10. Using the <EM>iptables</EM> physdev match module for kernel 2.5
     </P>
     <P>
       The 2.5 standard kernel contains an <EM>iptables</EM> match module
       called <EM>physdev</EM> which has to be used to match the bridge's
       physical in and out ports. Its usage is simple:</P>
       <PRE>iptables -m physdev --physdev-in &lt;bridge-port&gt;</PRE><P>
       and</P>
       <PRE>iptables -m physdev --physdev-out &lt;bridge-port&gt;</PRE>
     </P>
     <HR>
 <PRE>
 Released under the GNU Free Documentation License.
 Copyright (c) 2002 Bart De Schuymer &lt;bdschuym@pandora.be&gt;,
                    Nick Fedchik &lt;nick@fedchik.org.ua&gt;.
 </PRE>
     <BR>
      <BR>
      <BR>
      <SMALL>Permission is granted to copy, distribute and/or
     modify this document under the terms of the GNU Free
     Documentation License, Version 1.1 or any later version
     published by the Free Software Foundation, with no Invariant Sections,
     with no Front-Cover Texts, and with no Back-Cover Texts. For a copy of the
     license, see <A HREF=
     "http://www.gnu.org/licenses/fdl.txt">"GNU Free Documentation License"</A>.</SMALL> <BR>
      <BR>
     <P>
       Last updated December 23, 2002.
     </P>
   </BODY>
 </HTML>