Creating a fault-tolerant gateway based on Mikrotik RouterOS
The task was to make a network router fault tolerant. The router had to serve several local networks, three Internet uplinks from different providers, a DMZ, and a dozen VPN connections for remote users.
By fault tolerance I mean that failed equipment is replaced instantly. I chose Mikrotik RouterOS because I had good experience running it, and because the Winbox utility makes configuration and administration convenient.
This setup has been running in production for several years and has proven itself. The configuration was reworked several times along the way, but the variant described below turned out to be the best fit for my conditions. Switching off either router no longer affects service: the two are fully interchangeable.
For hardware, I used ordinary PCs with Core 2 Duo processors, 1 GB of RAM, and a Transcend flash module in place of a hard drive. Both routers sit in small MiniTower cases on a shelf in the server cabinet. RouterOS 4.16 was installed originally; the routers now run 5.22.
I will not describe the whole router configuration, only the fault-tolerance part. To keep things readable, the example is limited to one provider and 3 internal local area networks.
VRRP was chosen as the fault-tolerance protocol. Each router has a priority that makes it either Master or Slave, and the routers check each other's availability at a fixed interval. If the Master fails, the Slave takes over.
Since the router has only 3 PCI network interfaces (the integrated one was not used) and there are many subnets, VLANs were used as well. VRRP runs on the physical interface, and the VLANs sit on top of the VRRP interface. All settings were made on one router; the second is configured automatically.
Interface Settings
Physical ethernet interfaces
/interface ethernet
set 0 arp=enabled auto-negotiation=yes cable-settings=default \
    disable-running-check=yes disabled=no full-duplex=yes l2mtu=1600 \
    mtu=1500 name=lan speed=1Gbps
set 1 arp=enabled auto-negotiation=yes cable-settings=default \
    disable-running-check=yes disabled=no full-duplex=yes l2mtu=1600 \
    mtu=1500 name=wan speed=1Gbps
Virtual VRRP Interface
/interface vrrp
add arp=enabled authentication=none disabled=no interface=lan \
    interval=2s mtu=1500 name=vrrp-lan \
    preemption-mode=yes priority=101 v3-protocol=ipv4 version=3 vrid=2
Three of the VRRP settings deserve attention:
1) interface=lan: the interface on which VRRP runs.
2) priority=101: the router's priority, which decides whether it is Master or Slave. The router with the higher value is the Master.
3) preemption-mode=yes: with this mode turned off, a Slave that has become Master keeps that role even after the original Master returns to operation.
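To see which role a router currently holds, the VRRP interface can be inspected from the terminal. This is only a minimal sketch, assuming the interface name vrrp-lan used in this article; the exact columns and flag letters in the output may differ between RouterOS versions.

```routeros
# List VRRP interfaces; the flags column shows the current state of each one
/interface vrrp print
# Print the "running" property of vrrp-lan, the same property
# that the status-check script in this article relies on
:put [/interface get vrrp-lan running]
```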
LAN VLAN
/interface vlan
add arp=enabled disabled=no interface=vrrp-lan mtu=1500 name=vlan101 \
    use-service-tag=no vlan-id=101
add arp=enabled disabled=no interface=vrrp-lan mtu=1500 name=vlan102 \
    use-service-tag=no vlan-id=102
add arp=enabled disabled=no interface=vrrp-lan mtu=1500 name=vlan103 \
    use-service-tag=no vlan-id=103
VRRP is used only on the local interface. Its job is to monitor the health of the router as a whole and its connectivity to the local network. When a problem occurs, everything else is switched by a script. I chose this approach because IPsec did not work well with VRRP + VLAN on the WAN interface.
VRRP on Mikrotik lets you run routers in either load-balancing or fault-tolerance mode. Load-balancing mode also provides fault tolerance, but it would require VRRP on every interface, and plain fault tolerance is enough for our task.
Configure Addressing
/ip address
add address=192.168.101.1/24 disabled=no interface=vlan101 network=192.168.101.0
add address=192.168.102.1/24 disabled=no interface=vlan102 network=192.168.102.0
add address=192.168.103.1/24 disabled=no interface=vlan103 network=192.168.103.0
add address=10.1.1.2/29 disabled=no interface=lan network=10.1.1.0
add address=10.1.1.1/32 disabled=no interface=vrrp-lan network=10.1.1.1
add address=77.77.77.70/30 disabled=no interface=wan network=77.77.77.68
The vrrp-lan virtual interfaces on both routers have the same address, 10.1.1.1/32.
The physical lan interfaces have different addresses (10.1.1.2 and 10.1.1.3) and are in the same subnet as each other and as the VRRP address.
Now add a default gateway to the routing table, and the initial configuration is complete.
/ip route
add disabled=no distance=1 dst-address=0.0.0.0/0 gateway=77.77.77.69 scope=30 target-scope=10
Next, configure the scripts that back up the configuration, check the Master/Slave status, and transfer the configuration.
Checking the status of the router
# Slave: the VRRP interface is not running, so make sure WAN is disabled
:if ([/interface get vrrp-lan running] = false) do={
    :if ([/interface get wan disabled] = false) do={
        /interface disable wan
    }
}
# Master: the VRRP interface is running, so make sure WAN is enabled
:if ([/interface get vrrp-lan running] = true) do={
    :if ([/interface get wan disabled] = true) do={
        /interface enable wan
    }
}
If the VRRP interface reports running = false, the router is in Slave mode and the script disables WAN; if the router is Master, the script enables WAN. The script also checks the current state of the WAN interface first, so that it does not toggle it needlessly. Schedule this script to run every 3-10 seconds.
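The scheduler entry for this check could look roughly as follows. This is a sketch under assumptions: the status-check script is taken to be saved under /system script with the hypothetical name vrrp-wan-check, and the 5-second interval is just one value from the 3-10 second range mentioned above.

```routeros
# "vrrp-wan-check" is a hypothetical script name; on-event refers to the
# script stored under /system script with the same name
/system scheduler
add name=vrrp-wan-check interval=5s on-event=vrrp-wan-check
```

A shorter interval means WAN follows the VRRP state faster, at the cost of running the script more often.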
Backup
:local mserver
:local mkomu
:local msubject
:local bjmeno
# Recipient address and subject of the backup e-mail
:set mkomu "root@server.ru"
:set msubject ("Backup " . [/system identity get name])
# File name: <identity>-<year><month><day>.backup (assumes the mmm/dd/yyyy date format)
:set bjmeno ([/system identity get name] . "-" . [:pick [/system clock get date] 7 11] . [:pick [/system clock get date] 0 3] . [:pick [/system clock get date] 4 6] . ".backup")
# Save a dated copy for e-mail and a fixed-name copy for the backup router
/system backup save name=$bjmeno
/system backup save name=lastconfig
:delay 5
:put ($mserver . "\n")
/tool e-mail send subject=$msubject file=$bjmeno to=$mkomu body=("Backup from " . [/system clock get date] . " Mikrotik " . [/system identity get name] . ".")
:put ("Backup " . $bjmeno . "\n")
Besides sending the backup by e-mail (the sending parameters must be configured beforehand), the script creates a local file named lastconfig.backup, which we will need later.
Schedule this script to run once a day; mine runs at night. If you change the configuration many times a day, run it as often as you need.
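A nightly schedule for the backup script might be set up like this. It is a sketch only: the script name backup-config and the 03:00 start time are assumptions chosen for illustration, not values from the original setup.

```routeros
# "backup-config" is a hypothetical name for the backup script above;
# start-time and interval together give one run per day at 03:00
/system scheduler
add name=backup-config start-time=03:00:00 interval=1d on-event=backup-config
```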
Next come two scripts: one transfers the settings, the other applies them on the backup router. Since both must run only on the backup router, the routers have to be told apart somehow. I distinguish them by the MAC address of the integrated network interface. The backup router has the address FF:FF:40:40:40:41.
Copying and applying the latest current configuration
# Run only on the backup router, identified by the MAC of its integrated interface
:local interA [/interface ethernet find mac-address="FF:FF:40:40:40:41"]
:if ($interA != "") do={
    # Fetch the latest backup from the Master over FTP, then restore from it
    /tool fetch address=10.1.1.2 src-path=lastconfig.backup mode=ftp user=ftp password="VeryHightPassword!!!11"
    :delay 10
    /system backup load name=lastconfig.backup
}
That is, we fetch lastconfig.backup over FTP and restore from it. An FTP user must be configured, preferably with access restricted by IP. Note that we connect to FTP using the IP address of the local physical interface, which is reachable only between the routers.
Schedule this script to run a few minutes after the backup script.
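The restricted FTP user mentioned above could be created on the Master roughly as follows. Treat this as a sketch under assumptions: the group name ftp-backup is hypothetical, the policy set may need adjusting for your RouterOS version, and the allowed addresses are taken from the addressing used in this article.

```routeros
# Hypothetical group that only allows FTP logins and read access
/user group
add name=ftp-backup policy=ftp,read
# The user may log in only from the backup router's LAN address
/user
add name=ftp group=ftp-backup password="VeryHightPassword!!!11" address=10.1.1.3/32
# Additionally restrict the FTP service itself to the inter-router subnet
/ip service
set ftp address=10.1.1.0/29
```

Restricting both the user and the service is redundant on purpose: if one restriction is lost after a configuration change, the other still applies.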
The last script applies the settings on the backup router. It also uses the MAC address to identify the router.
:local intA [/interface ethernet find mac-address="FF:FF:40:40:40:41"]
:if ($intA != "") do={
    /system identity set name=router-slave
    /ip address remove [/ip address find address="10.1.1.2/29"]
    /ip address add address=10.1.1.3/29 interface=lan
    /interface vrrp set priority=100 preemption-mode=yes numbers=vrrp-lan
}
Here we change the router name and the IP address of the LAN interface, and lower the VRRP priority so that the router becomes the Slave. This script must be set to run at startup. The changes take effect on the backup router after it has copied and applied the latest configuration.
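One way to run the script at startup is a scheduler entry with start-time=startup. This is a minimal sketch, assuming the script above is saved under the hypothetical name apply-slave-settings.

```routeros
# "apply-slave-settings" is a hypothetical script name;
# start-time=startup runs the entry once after each boot
/system scheduler
add name=apply-slave-settings start-time=startup on-event=apply-slave-settings
```

Because the script checks the MAC address itself, the same scheduler entry can exist on both routers and only has an effect on the backup one.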
That is all. Remember that everything was done on the router that will be the Master. Now save the configuration, transfer it to the Slave, and apply it. Any of the methods described at http://wiki.mikrotik.com/wiki/Manual:Configuration_Management#System_Backup will do.
After the configuration is applied and the router reboots, the backup router comes up with the settings we need.
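A quick way to confirm the result on the backup router is to print the three things the last script changes. This is a sketch only; based on that script, the output should show the name router-slave, the address 10.1.1.3/29 on lan, and priority 100 on vrrp-lan.

```routeros
# Run on the backup router after it has rebooted
/system identity print
/ip address print where interface=lan
/interface vrrp print
```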
This was written from memory, but various sources were used during setup, primarily wiki.mikrotik.com/wiki/Main_Page