Virtual Routing and Forwarding (VRF)
====================================
The VRF device combined with ip rules provides the ability to create virtual
routing and forwarding domains (aka VRFs, VRF-lite to be specific) in the
Linux network stack. One use case is the multi-tenancy problem where each
tenant has their own unique routing tables and at the very least needs
a different default gateway.

Processes can be "VRF aware" by binding a socket to the VRF device. Packets
through the socket then use the routing table associated with the VRF
device. An important feature of the VRF device implementation is that it
impacts only Layer 3 and above, so L2 tools (e.g., LLDP) are not affected
(i.e., they do not need to be run in each VRF). The design also allows
the use of higher priority ip rules (Policy Based Routing, PBR) to take
precedence over the VRF device rules, directing specific traffic as desired.

In addition, VRF devices allow VRFs to be nested within namespaces. For
example, network namespaces provide separation of network interfaces at L1
(Layer 1 separation), VLANs on the interfaces within a namespace provide
L2 separation and then VRF devices provide L3 separation.

Design
------
A VRF device is created with an associated route table. Network interfaces
are then enslaved to a VRF device:

    +-----------------------------+
    |           vrf-blue          |  ===> route table 10
    +-----------------------------+
       |        |            |
    +------+ +------+     +-------------+
    | eth1 | | eth2 | ... |    bond1    |
    +------+ +------+     +-------------+
                             |       |
                         +------+ +------+
                         | eth8 | | eth9 |
                         +------+ +------+

Packets received on an enslaved device are switched to the VRF device
using an rx_handler, which gives the impression that packets flow through
the VRF device. Similarly, on egress, routing rules are used to send packets
to the VRF device driver before getting sent out the actual interface. This
allows tcpdump on a VRF device to capture all packets into and out of the
VRF as a whole.[1] Similarly, netfilter [2] and tc rules can be applied
using the VRF device to specify rules that apply to the VRF domain as a whole.

[1] Packets in the forwarded state do not flow through the device, so those
    packets are not seen by tcpdump. Will revisit this limitation in a
    future release.

[2] Iptables on ingress is limited to NF_INET_PRE_ROUTING only, with skb->dev
    set to the real ingress device, and egress is limited to
    NF_INET_POST_ROUTING. Will revisit this limitation in a future release.


Setup
-----
1. VRF device is created with an association to a FIB table.
   e.g., ip link add vrf-blue type vrf table 10
         ip link set dev vrf-blue up

2. Rules are added that send lookups to the associated FIB table when the
   iif or oif is the VRF device.
   e.g., ip rule add oif vrf-blue table 10
         ip rule add iif vrf-blue table 10

   Set the default route for the table (and hence default route for the VRF).
   e.g., ip route add table 10 prohibit default

3. Enslave L3 interfaces to a VRF device.
   e.g., ip link set dev eth1 master vrf-blue

   Local and connected routes for enslaved devices are automatically moved to
   the table associated with the VRF device. Any additional routes depending
   on the enslaved device will need to be reinserted following the
   enslavement.

4. Additional VRF routes are added to the associated table.
   e.g., ip route add table 10 ...


Applications
------------
Applications that are to work within a VRF need to bind their socket to the
VRF device:

    setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev)+1);

or specify the output device using cmsg and IP_PKTINFO.
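
As a rough sketch of the two approaches above (not taken from the kernel
sources), the helpers below wrap the SO_BINDTODEVICE call and build an
IP_PKTINFO control message. The names vrf_bind_socket and vrf_set_pktinfo
are hypothetical, and the ifindex passed to the second helper is assumed to
come from something like if_nametoindex("vrf-blue").

```c
#define _GNU_SOURCE	/* for struct in_pktinfo and IFNAMSIZ with glibc */
#include <assert.h>
#include <errno.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: bind socket sd to the VRF device named dev
 * (e.g. "vrf-blue"). Returns 0 on success, -1 with errno set on failure.
 */
static int vrf_bind_socket(int sd, const char *dev)
{
	size_t len;

	assert(dev != NULL);
	len = strlen(dev);
	if (len == 0 || len >= IFNAMSIZ) {
		errno = EINVAL;
		return -1;
	}
	/* Same call as in the text above; the length includes the NUL. */
	return setsockopt(sd, SOL_SOCKET, SO_BINDTODEVICE, dev, len + 1);
}

/*
 * Hypothetical helper: attach one IP_PKTINFO control message to msg that
 * names the output device by interface index. cbuf must be suitably
 * aligned and hold at least CMSG_SPACE(sizeof(struct in_pktinfo)) bytes.
 * Returns 0 on success, -1 with errno set to EINVAL if cbuf is too small.
 */
static int vrf_set_pktinfo(struct msghdr *msg, void *cbuf, size_t cbuflen,
			   unsigned int ifindex)
{
	struct in_pktinfo pi;
	struct cmsghdr *cm;

	assert(msg != NULL && cbuf != NULL);
	if (cbuflen < CMSG_SPACE(sizeof(pi))) {
		errno = EINVAL;
		return -1;
	}

	memset(cbuf, 0, cbuflen);
	msg->msg_control = cbuf;
	msg->msg_controllen = CMSG_SPACE(sizeof(pi));

	cm = CMSG_FIRSTHDR(msg);
	cm->cmsg_level = IPPROTO_IP;
	cm->cmsg_type = IP_PKTINFO;
	cm->cmsg_len = CMSG_LEN(sizeof(pi));

	/* Only the interface index is set; addresses are left as zero. */
	memset(&pi, 0, sizeof(pi));
	pi.ipi_ifindex = (int)ifindex;
	memcpy(CMSG_DATA(cm), &pi, sizeof(pi));
	return 0;
}
```

A caller would typically fill in the rest of the msghdr (destination address
and iovec) and pass it to sendmsg(). Binding with SO_BINDTODEVICE may require
CAP_NET_RAW, depending on the kernel version.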


Limitations
-----------
VRF device currently only works for IPv4. Support for IPv6 is under development.

Index of original ingress interface is not available via cmsg. Will address
soon.