Linux routing and policy-routing with iproute2

    This article focuses on routing network packets in Linux, specifically on policy-routing (policy-based routing). Unlike classic destination routing, which selects a route from the destination address alone, policy-routing selects a route using a set of fairly flexible rules. It is useful when a host has several network interfaces and certain packets must leave through a specific one, and those packets cannot be identified by destination address, or not by destination address alone. For example, policy-routing can be used to balance traffic between several external channels (uplinks), keep a server reachable through several uplinks, send packets from different internal addresses through different external interfaces, or even send packets for different TCP ports through different interfaces.
    Network interfaces, routing, and traffic shaping in Linux are managed with the iproute2 utility package.

    These utilities only configure settings; the actual work is done by the Linux kernel. For the kernel to support policy-routing, it must be built with the options IP: advanced router (CONFIG_IP_ADVANCED_ROUTER) and IP: policy routing (CONFIG_IP_MULTIPLE_TABLES) enabled. Both are located under Networking support -> Networking options -> TCP/IP networking.
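    One way to check these options on a running system is to search the kernel configuration file. The sketch below is only an illustration: the function name check_policy_routing is made up here, and the location of the configuration file is an assumption that varies between distributions.

```shell
# Report whether the two policy-routing options are set to "y" in a kernel config file.
# The function name is hypothetical; the config path is passed in by the caller.
check_policy_routing() {
    config_file="$1"
    for opt in CONFIG_IP_ADVANCED_ROUTER CONFIG_IP_MULTIPLE_TABLES; do
        if grep -q "^${opt}=y" "$config_file" 2>/dev/null; then
            echo "$opt: enabled"
        else
            echo "$opt: not found"
        fi
    done
}
```

    On many distributions the configuration of the running kernel is stored as /boot/config-$(uname -r), so a typical call would be check_policy_routing "/boot/config-$(uname -r)". If your system keeps it elsewhere, pass that path instead.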

    ip route


    Routing is configured with the ip route command. Run without parameters, it shows the current routes (not all of them, more on that later):
    # ip route
    192.168.12.0/24 dev eth0 proto kernel scope link src 192.168.12.101
    default via 192.168.12.1 dev eth0

    This is what the routing table looks like for a host with IP address 192.168.12.101, subnet mask 255.255.255.0, and default gateway 192.168.12.1 on the eth0 interface.
    We see that traffic to the 192.168.12.0/24 subnet goes through the eth0 interface. proto kernel means that the route was added automatically by the kernel when the IP address was assigned to the interface. scope link means that the destinations are reachable directly on this link (eth0), without a gateway. src 192.168.12.101 sets the sender IP address for packets that match this route.
    Traffic to any host outside the 192.168.12.0/24 subnet goes to the gateway 192.168.12.1 through the eth0 interface (default via 192.168.12.1 dev eth0). By the way, when a packet is sent to the gateway, its destination IP address does not change; only the Ethernet frame carries the gateway's MAC address as the recipient's MAC address (even experienced specialists often get confused at this point). The gateway, in turn, rewrites the sender's IP address if NAT is used, or simply forwards the packet. Here a private address (192.168.12.101) is used, so the gateway most likely does NAT.
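    To make the structure of this output concrete, here is a small sketch that pulls the gateway and the interface out of the default entry. The function name default_route is invented for this example, and the parsing assumes the simple "default via GATEWAY dev DEVICE" form shown above.

```shell
# Print "GATEWAY DEVICE" for the first default route in `ip route`-style text on stdin.
default_route() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)    # word after "via" is the gateway
            if ($i == "dev") dev = $(i + 1)   # word after "dev" is the interface
        }
        print gw, dev
        exit
    }'
}
```

    It would typically be used as ip route | default_route. To ask the kernel directly which route it would pick for a given destination, ip route get ADDRESS can be used as well.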
    Now let's go deeper into routing. In fact, there are several routing tables, and you can also create your own. The local, main, and default tables are predefined. The kernel writes entries for local IP addresses to the local table (so that traffic to these addresses stays local and does not try to go to the external network), as well as entries for broadcasts. The main table is used when a command does not specify a table (that is, the output above showed the main table). The default table is initially empty. Let's take a quick look at the contents of the local table:
    # ip route show table local
    broadcast 127.255.255.255 dev lo proto kernel scope link src 127.0.0.1
    broadcast 192.168.12.255 dev eth0 proto kernel scope link src 192.168.12.101
    broadcast 192.168.12.0 dev eth0 proto kernel scope link src 192.168.12.101
    local 192.168.12.101 dev eth0 proto kernel scope host src 192.168.12.101
    broadcast 127.0.0.0 dev lo proto kernel scope link src 127.0.0.1
    local 127.0.0.1 dev lo proto kernel scope host src 127.0.0.1
    local 127.0.0.0/8 dev lo proto kernel scope host src 127.0.0.1

    broadcast and local are entry types (the default entry was examined above). A broadcast entry means that matching packets are sent as broadcast packets, in accordance with the interface settings. A local entry means that packets are delivered locally. scope host indicates that the entry is valid only within this host.
    To view the contents of a specific table, use the command ip route show table TABLE_NAME. To view the contents of all tables, specify all, unspec, or 0 as TABLE_NAME. All tables actually have numeric identifiers; their symbolic names are defined in the file /etc/iproute2/rt_tables and are used only for convenience.
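    Giving your own table a symbolic name comes down to adding an "ID NAME" line to that file. The sketch below shows one cautious way to do it. The function name add_table_name, the identifier 101, and the name uplink1 are all hypothetical.

```shell
# Append "ID NAME" to an rt_tables-style file unless that ID or name is already taken.
# Usage: add_table_name ID NAME FILE
add_table_name() {
    id="$1"; name="$2"; file="$3"
    # Skip comment lines; succeed (exit 0) only if a conflicting entry exists.
    if awk -v id="$id" -v name="$name" \
        '$1 !~ /^#/ && ($1 == id || $2 == name) { found = 1 } END { exit !found }' "$file"; then
        echo "table $id/$name already defined in $file" >&2
        return 1
    fi
    printf '%s\t%s\n' "$id" "$name" >> "$file"
}
```

    Run as root, add_table_name 101 uplink1 /etc/iproute2/rt_tables would let later commands refer to the table as uplink1 instead of 101, for example ip route show table uplink1.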

    ip rule


    How does the kernel choose which table to send a packet to? Logically enough, there are rules for this. In our case:
    # ip rule
    0: from all lookup local
    32766: from all lookup main
    32767: from all lookup default

    The number at the beginning of each line is the rule identifier (its priority), from all is a condition that matches packets from any address, and lookup indicates which table to send the packet to. If a packet matches several rules, it passes through them in ascending order of identifier. Of course, once the packet matches a route in one of the tables, it does not go on to subsequent routes or subsequent rules.
    Possible conditions:
    • from - already discussed above; this checks the sender of the packet.
    • to - the receiver of the packet.
    • iif - the name of the interface on which the packet arrived.
    • oif - the name of the interface through which the packet leaves. This condition applies only to packets originating from local sockets bound to a specific interface.
    • tos - the value of the TOS field of the IP packet.
    • fwmark - the value of the packet's FWMARK. This condition makes the rules tremendously flexible: iptables rules can filter packets by a huge number of criteria and set particular FWMARK values, which are then taken into account when routing.

    Conditions can be combined, for example from 192.168.1.0/24 to 10.0.0.0/8. You can also use the prefix not, which means that a packet must not meet the condition in order to match the rule.
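    As an illustration of these conditions, the sketch below builds a few rules. Table 120, the priorities, and the interface eth1 are hypothetical values, and the run helper is an invention of this example: by default it only prints each command, so the block can be tried safely without root.

```shell
# Print each command instead of executing it, unless DRY_RUN=0 is set (root needed then).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

example_rules() {
    # Combined condition: packets from 192.168.1.0/24 going to 10.0.0.0/8.
    run ip rule add from 192.168.1.0/24 to 10.0.0.0/8 lookup 120 priority 1000
    # Negated condition: packets that do NOT come from 192.168.1.0/24.
    run ip rule add not from 192.168.1.0/24 lookup 120 priority 1001
    # Condition on the interface the packet arrived on.
    run ip rule add iif eth1 lookup 120 priority 1002
}

example_rules
```

    Setting explicit priorities keeps the evaluation order predictable. Without them, ip rule assigns each new rule a priority just below the lowest existing one.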
    So, we have figured out what routing tables and routing rules are. Creating your own tables and rules is exactly what policy-routing, also known as PBR (policy-based routing), means. By the way, SBR (source-based routing), or source routing, in Linux is a special case of policy-routing: it is the use of the from condition in a routing rule.

    Simple example


    Now consider a simple example. We have a gateway that receives packets from IP 192.168.1.20. Packets from this IP must be sent on to the gateway 10.1.0.1. To implement this, we do the following:
    Create a table with a single route:
    # ip route add default via 10.1.0.1 table 120

    We create a rule that sends the necessary packets to the desired table:
    # ip rule add from 192.168.1.20 table 120

    As you can see, everything is simple.

    Server availability through several uplinks


    Now a more realistic example. There are two uplinks to two providers, and the server must be reachable through both channels.

    One of the providers is used as the default route, no matter which one. In this case, the web server is reachable only through this provider's network. Requests arriving through the other provider's network will reach the server, but the response packets will leave through the default gateway, carrying a source address that belongs to the other provider, and nothing will come of it.
    This is solved very simply:
    We define the tables:
    # ip route add default via 11.22.33.1 table 101
    # ip route add default via 55.66.77.1 table 102

    Define the rules:
    # ip rule add from 11.22.33.44 table 101
    # ip rule add from 55.66.77.88 table 102

    I think the meaning of these lines no longer needs explaining. In the same way, the server can be made reachable through more than two uplinks.
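    For more uplinks the same pair of commands simply repeats, so it can be generated from a list. The sketch below only prints the commands. The function name uplink_commands and the third uplink (203.0.113.x, table 103) are hypothetical additions for illustration.

```shell
# Print the route/rule pair for each uplink given on stdin as "SOURCE_IP GATEWAY TABLE_ID".
uplink_commands() {
    while read -r src gw table; do
        [ -n "$src" ] || continue   # skip empty lines
        echo "ip route add default via $gw table $table"
        echo "ip rule add from $src table $table"
    done
}

uplink_commands <<'EOF'
11.22.33.44 11.22.33.1 101
55.66.77.88 55.66.77.1 102
203.0.113.10 203.0.113.1 103
EOF
```

    After reviewing the printed commands, they can be run as root, for example by piping the output to sh.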

    Balancing traffic between uplinks


    This is done with one elegant command:
    # ip route replace default scope global \
      nexthop via 11.22.33.1 dev eth0 weight 1 \
      nexthop via 55.66.77.1 dev eth1 weight 1

    This entry replaces the existing default route in the main table. The route is then chosen according to the weight of each gateway (weight). For example, with weights 7 and 3, about 70% of connections go through the first gateway and 30% through the second. There is one thing to keep in mind: the kernel caches routes, and the route to a host through a particular gateway stays in the cache for some time after it was last used. The route to a frequently used host may never expire, because it is constantly refreshed in the cache, so that host stays on the same gateway. If this is a problem, you can clear the cache manually from time to time with the command ip route flush cache. (The IPv4 routing cache was removed in Linux 3.6, so this caveat applies to older kernels.)
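    The relationship between weights and traffic shares is simple proportion: each gateway receives its weight divided by the sum of all weights. A small sketch of that arithmetic follows. The function name weight_shares is invented here, and the result is the expected long-run share, not a guarantee for any particular set of connections.

```shell
# Print the expected share of connections for each next hop, given integer weights.
weight_shares() {
    total=0
    for w in "$@"; do
        total=$((total + w))
    done
    for w in "$@"; do
        # Integer arithmetic: the percentage is rounded down.
        echo "weight $w -> $((100 * w / total))%"
    done
}

weight_shares 7 3
```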

    Using packet marking with iptables


    Suppose we need packets going to port 80 to leave only through 11.22.33.1. To do this, do the following:
    # iptables -t mangle -A OUTPUT -p tcp -m tcp --dport 80 -j MARK --set-mark 0x2
    # ip route add default via 11.22.33.1 dev eth0 table 102
    # ip rule add fwmark 0x2/0x2 lookup 102

    The first command marks all packets going to port 80. The second command creates a routing table. The third command sends all packets carrying the indicated mark to the desired table.
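    The fwmark condition is written as VALUE/MASK. To the best of my understanding, only the bits selected by the mask are compared, so a rule with 0x2/0x2 matches any mark that has bit 0x2 set, whatever the other bits are. The sketch below models that comparison in shell arithmetic. The function name fwmark_matches is hypothetical, and this is a model of the behaviour, not kernel code.

```shell
# Succeed when packet mark $1 matches a rule written as "fwmark VALUE/MASK" ($2 and $3).
fwmark_matches() {
    mark=$1; value=$2; mask=$3
    # Compare only the bits selected by the mask.
    [ $((mark & mask)) -eq $((value & mask)) ]
}

for mark in 0x0 0x2 0x3 0x4 0x6; do
    if fwmark_matches "$mark" 0x2 0x2; then
        echo "$mark matches 0x2/0x2"
    else
        echo "$mark does not match 0x2/0x2"
    fi
done
```

    Using a mask leaves the remaining bits of the mark free for other purposes, such as a second, independent classification.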
    Again, everything is simple. Let's also look at the iptables CONNMARK module. It lets you track and mark all packets that belong to a particular connection. For example, you can mark packets by some characteristic in the INPUT chain and then automatically mark the packets of those connections in the OUTPUT chain as well. It is used like this:
    # iptables -t mangle -A INPUT -i eth0 -j CONNMARK --set-mark 0x2
    # iptables -t mangle -A INPUT -i eth1 -j CONNMARK --set-mark 0x4
    # iptables -t mangle -A OUTPUT -j CONNMARK --restore-mark

    Packets arriving on eth0 are marked with 2, and packets arriving on eth1 with 4 (lines 1 and 2). The rule on the third line checks which connection an outgoing packet belongs to and restores the mark that was set on that connection's incoming packets.
    I hope this material helps you appreciate the full flexibility of routing in Linux. Thanks for your attention :)
