Apr 21 2012

Most people use ifstated to do failover between two firewalls; this isn’t about that. I have a network connected to two different ISPs, so two external interfaces and two WAN connections. What I’m doing is some simple load balancing of the outgoing traffic between those interfaces, using ifstated to “disable” one of the interfaces in case that ISP goes down and enable it again when it’s back up. There is also some NAT and redirecting of incoming traffic, like mail and HTTP, towards the proper server on the inside.

I’ve set up ifstated to periodically ping the gateway and a far-away-but-always-online control host on each of the interfaces. If either the gateway or the control host doesn’t respond, it deletes the route for that interface, so packets no longer travel through it unless specifically instructed to. When both the gateway and the control host start responding to pings again, the route is put back.

Load balancing is handled by the OS, through equal-cost multipath routing, explained in the OpenBSD FAQ. Short version, add two default routes:

delete /etc/mygate and enable the net.inet.ip.multipath sysctl. Check the FAQ for details.
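A minimal sketch of that short version, assuming hypothetical gateway addresses from the documentation ranges (substitute the real gateways of each ISP). The run helper is an addition for safety: it only prints the commands unless RUN=1 is set, since the real commands need root on OpenBSD.

```sh
#!/bin/sh
# Sketch only: gateway addresses are hypothetical (documentation ranges).
ext1_gw="192.0.2.1"
ext2_gw="198.51.100.1"

# Print each command; execute it only when RUN=1 (needs root on OpenBSD).
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

setup_multipath() {
    # Let the kernel spread flows over equal-cost routes.
    run sysctl net.inet.ip.multipath=1
    # One default route per ISP; -mpath lets both coexist.
    run route add -mpath default "$ext1_gw"
    run route add -mpath default "$ext2_gw"
}

setup_multipath
```

To make this survive a reboot, the sysctl would go in /etc/sysctl.conf and the routes in the hostname.if files; the FAQ covers the details.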

First, the config files. I’ve put all of them in a directory that I created, /etc/netconf; /etc/pf.conf and /etc/ifstated.conf are symlinks to the ones in this directory. That has no other purpose than to keep everything in one place. I have one file, ifdef.conf, that contains macros common to both pf.conf and ifstated.conf. It’s included in pf.conf, but including it in ifstated.conf doesn’t work, so I’m using shell scripts that run the necessary commands, and that file is sourced in every shell script. This way, if I need to change something, a NIC for example, all I need to edit is that one file.

Here they are:

ifdef.conf:

Note the control_host line. That variable is only used in the shell scripts executed by ifstated, which is why a shell command works there. It also assumes a working DNS server is available; otherwise the lookup fails and the interface is set to inactive. If the variable ever gets used in pf.conf (for example in a rule that makes sure the path to control_host is always available), or if relying on a DNS server that might fail doesn’t sound like such a good idea, change it to the static IP of something that will likely always be up, like a root DNS server or www.isc.org. Even if the control host goes down, only one interface will be deactivated. The other one stays up, since it isn’t checked again until the deactivated interface is marked as active.
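As an illustration of what such a shared file could look like, here is a sketch that uses only the macro names appearing in pf.conf and the scripts. Apart from re0 (which matches the “out_re0” label mentioned later), every value is a hypothetical placeholder, and control_host is shown as a static address, i.e. the variant suggested above rather than a DNS lookup.

```sh
# /etc/netconf/ifdef.conf (sketch; every value except re0 is hypothetical)
# Plain name="value" lines with no spaces around "=", so the same file
# can be included by pf.conf and sourced by /bin/sh.

# first ISP
ext1_if="re0"
ext1_nat="192.0.2.10"      # address used for NAT and as ping source
ext1_gw="192.0.2.1"
# second ISP
ext2_if="re1"
ext2_nat="198.51.100.10"
ext2_gw="198.51.100.1"
# inside
dmz_if="em0"
# static control host instead of a DNS lookup
control_host="203.0.113.53"
```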

pf.conf:

Although one of the interfaces is named $dmz_if, there is no DMZ enforcing rule. One would need to block all traffic from DMZ to internal network and then allow only what is really, really, necessary.

As an example, the mail server is also a DNS server.

Traceroute uses UDP packets destined for that port range (except on Windows). I opened it up just because I don’t like it when a trace ends in oblivion even though ping works, and I’m not aware of any good reason to block it. The ICMP type unreach can be further restricted to a smaller set of codes, see ‘man icmp’.

Egress is the interface or group of interfaces that have a default route assigned to them. Basically, the external interface, in our case both $ext1_if and $ext2_if.

There are two lines that do the NAT-ing, one for each interface. I used to have a table called <natTo> and this line instead:

My idea was that when an interface goes down I would delete that route and remove that IP from <natTo> so that packets don’t get translated to it. That was a mistake. First, NAT is done post-routing, and since I used the OS rather than pf to do the load balancing, it would change the IP of packets destined for $ext1_if into $ext2_nat, causing connections to send a query on one interface and get the answer on the other one, so I needed to add rules to route those properly. Second, it didn’t work with ifstated: packets would end up getting a “no destination to host” error when an interface went down. With two different rules, since nothing is ever routed to the inactive interface, packets never get that IP and all is good.

The <brutes> table is used for people that try to bruteforce ssh. It can obviously be used for other connections too, but with different parameters, since a limit of ten connections in five seconds might not be such a good idea for some protocols. To expire the entries in this table I use this line in root’s crontab:

Every ten minutes it will delete entries older than one hour. Adjust accordingly.
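A crontab entry matching that description could look like the sketch below. It assumes the table is named brutes, as in pf.conf, and relies on pfctl’s “-T expire” table command, which removes addresses whose statistics were last cleared more than the given number of seconds ago.

```
# min  hour  day  month  weekday  command
*/10   *     *    *      *        /sbin/pfctl -t brutes -T expire 3600
```

Here 3600 is the one-hour age limit and */10 the ten-minute interval.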

Note these lines:

These two rules aren’t necessary to let traffic out, since we allow all of it unrestricted anyway, but they mark each connection with a label so that ifstated can clear the states for them when the interface is disabled. The label for $ext1_if would be “out_re0”.

One important statement is reply-to. At first I used something like this to manage redirects to inside hosts:

The result was that connections would start on one interface, $ext1_if and end up going out on the other, with the IP of $ext1_if. When this happened performance on such connections was so bad that OpenVPN was pretty much unusable.
The pf.conf man page has this to say about reply-to:

reply-to
The reply-to option is similar to route-to, but routes packets that
pass in the opposite direction (replies) to the specified
interface. Opposite direction is only defined in the context of a
state entry, and reply-to is useful only in rules that create
state. It can be used on systems with multiple external
connections to route all outgoing packets of a connection through
the interface the incoming connection arrived through (symmetric
routing enforcement).

which is exactly what I needed. So the above line became:

Notice that in this case the “match” rule changes packets before they get to the “pass” rule, so “pass” contains the internal IP of the server, $www_in, which is 172.16.1.2. That can also be written as

ifstated.conf:

ext1_chk.sh and ext2_chk.sh are the scripts that test if the connection is up. They are run every 60 seconds. ext1_nok.sh or ext2_nok.sh is run when the ISP on the respective interface fails in some way; ifstated then keeps checking every minute whether that ISP is back, ignoring the other interface in the meantime. ext1_ok.sh or ext2_ok.sh is run when the ISP works again, and ifstated goes back to main_loop, running both test scripts. The firewall will never be left with no default route, since only one interface is brought down at a time. If both ISPs go down it will delete one of the routes, while new connections to the outside will try to go through the other one and simply fail as expected, so it shouldn’t break anything in strange ways.

Use ‘ifstated -dvv’ to see it in action.
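The state machine described above could be sketched roughly as follows. The state name main_loop and the script names come from the description; the script paths under /etc/netconf, the macro names and the ext1_down/ext2_down state names are assumptions made for this illustration.

```
# /etc/ifstated.conf (sketch; paths and *_down state names are assumptions)
init-state main_loop

# external tests: exit status 0 means the link is fine
ext1_up = '"/etc/netconf/ext1_chk.sh" every 60'
ext2_up = '"/etc/netconf/ext2_chk.sh" every 60'

state main_loop {
	if ! $ext1_up
		set-state ext1_down
	if ! $ext2_up
		set-state ext2_down
}

state ext1_down {
	init {
		run "/etc/netconf/ext1_nok.sh"
	}
	# only ext1 is checked while in this state
	if $ext1_up {
		run "/etc/netconf/ext1_ok.sh"
		set-state main_loop
	}
}

state ext2_down {
	init {
		run "/etc/netconf/ext2_nok.sh"
	}
	# only ext2 is checked while in this state
	if $ext2_up {
		run "/etc/netconf/ext2_ok.sh"
		set-state main_loop
	}
}
```

Because each *_down state only tests its own interface, the second interface can never be taken down while the first one is still marked inactive.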

ext*_chk.sh:

It pings the gateway and the control host three times each. The -I parameter makes sure the proper interface is used. It sends 3 pings and waits 3 seconds for each reply from control_host, which means the test takes 9 seconds to end when the host can’t be reached and ping times out. If the check is still running by the time ifstated wants to start it again, it will be killed and the test will never fail, so do not use an interval shorter than 9 seconds in ifstated.conf. Allow more than that if the script also needs to resolve the IP. Basically, make sure checks never overlap.

On FreeBSD, the ‘-w’ option as such doesn’t exist, instead it’s ‘-W’, adjust if needed.
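A sketch of such a check script, built around the ping flags described above (OpenBSD syntax). The check_link function and the overridable PING variable are illustrative additions, not part of the original scripts; in a real script the three arguments would come from sourcing ifdef.conf.

```sh
#!/bin/sh
# Sketch of ext1_chk.sh; in the real script the three values would come
# from ". /etc/netconf/ifdef.conf" ($ext1_nat, $ext1_gw, $control_host).

# PING can be overridden for testing; defaults to the system ping.
PING="${PING:-ping}"

# check_link SRC GW HOST: succeed only if both GW and HOST answer pings
# sent from source address SRC.
check_link() {
    # 3 pings, 3 s apart, wait at most 3 s for each reply (OpenBSD flags)
    "$PING" -c 3 -i 3 -w 3 -I "$1" "$2" >/dev/null 2>&1 &&
    "$PING" -c 3 -i 3 -w 3 -I "$1" "$3" >/dev/null 2>&1
}
```

The last line of the real script would then call check_link with $ext1_nat, $ext1_gw and $control_host, so that its exit status is what ifstated sees.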

ext*_nok.sh:

Runs when the interface loses connectivity. Flushes all routes on the interface and all pf states related to it. This is where the label in pf.conf is used. Then it sends a mail to root warning that the interface has been disabled. That part could be more informative.
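A rough sketch of what such a script could do. Note one deliberate difference: it deletes only the default route through this ISP’s gateway instead of flushing every route on the interface, which is a narrower substitute for the flush described above. The interface and gateway values are hypothetical, and the run helper only prints the commands unless RUN=1 is set.

```sh
#!/bin/sh
# Sketch of ext1_nok.sh. Values are hypothetical; the real script would
# get them from ". /etc/netconf/ifdef.conf".
ext1_if="re0"
ext1_gw="192.0.2.1"

# Print each command; execute it only when RUN=1 (needs root on OpenBSD).
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

disable_ext1() {
    # Drop this ISP's default route (narrower than flushing every route).
    run route delete default "$ext1_gw"
    # Kill states created by the rule labelled "out_re0" in pf.conf.
    run pfctl -k label -k "out_$ext1_if"
    # Tell the admin, but only when really running.
    if [ "${RUN:-0}" = "1" ]; then
        echo "$ext1_if lost connectivity, default route removed" |
            mail -s "$ext1_if disabled" root
    fi
}

disable_ext1
```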

ext*_ok.sh:

Restarts the interface and sends email telling the admin that all is good with the world again. Since ext*_nok.sh is flushing all routes on that interface, /etc/netstart will set up any additional routes in /etc/hostname.$ext1_if.
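The counterpart could be sketched like this, again with a hypothetical interface name and a run helper that only prints the command unless RUN=1 is set. It assumes /etc/netstart is given the interface name as its argument, which reconfigures just that interface from its hostname.if file.

```sh
#!/bin/sh
# Sketch of ext1_ok.sh. The interface name is hypothetical; the real
# script would get it from ". /etc/netconf/ifdef.conf".
ext1_if="re0"

# Print each command; execute it only when RUN=1 (needs root on OpenBSD).
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

enable_ext1() {
    # Re-run the interface configuration, restoring its routes.
    run sh /etc/netstart "$ext1_if"
    # Tell the admin, but only when really running.
    if [ "${RUN:-0}" = "1" ]; then
        echo "$ext1_if is reachable again, routes restored" |
            mail -s "$ext1_if enabled" root
    fi
}

enable_ext1
```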

To enable ifstated on boot:
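One way to do this, assuming an OpenBSD release where daemons are switched on through /etc/rc.conf.local, is to set the daemon’s flags there (an empty string means “start with no extra flags”):

```
# /etc/rc.conf.local
ifstated_flags=""
```

On newer releases, ‘rcctl enable ifstated’ achieves the same thing.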

  15 Responses to “Redundant load balancing for outgoing traffic on OpenBSD with pf and ifstated”

  1. I do use a similar approach, but I don’t disable the interface of the failing ISP for 2 reasons:
    I use ping to check connectivity from and to the isp router so, if the interface is down, it won’t work and connectivity checks will always fail.
    Also, sometimes, by issuing a reboot or reset command to the router/modem device, it will solve most isp side connectivity issues, but you need the interface be up to accomplish that.

    To solve the routing problem, I enforce the proper routing of packets using pf’s route-to to route packets that come in one interface, to leave on the other. Also, when one isp connection fails, I manipulate the routing table, to add another route, with higher priority, effectively “canceling” the multipath function. Once the isp connection returns, I simply delete that route.

    I liked your approach. We can share knowledge if you’re interested.

  2. I don’t bring down the interface either, I just remove the routes. That way packets don’t normally use it but i can still ‘ping -I’ through that interface to check if connection has been reestablished. I also clear all states related to it so ongoing connections should get cut off / restarted right away instead of lingering on.

    The problem with packets coming in on one interface and going out through the other one is that the other end of the connection is expecting a response from the IP associated with the first interface, so if packets going out get NATed to a different IP, it won’t work. Didn’t think about adding a new route, thanks for the idea.

    • My ifstated daemon runs lots of commands on each state change. There are a lot of things to run, not only route manipulation and modem/router resetting. I’ve been thinking of creating a daemon to run these actions using threads, using ifstated only to inform this daemon of the state changes. If you’re interested in this, please let me know.

      Also, on the routing problem issue, I use only pf to manipulate the packets using route-to and reply-to. One thing I noticed though is that traffic that should be routed with equal cost, gets out a little bit more often on one of the interfaces. I don’t know if you noticed the same thing.

      I don’t like the idea of clearing the states, because even though my modem/router should get a new external IP address every time the connection hangs or dies, it often gets the same IP. If I didn’t clear the states, the connection might resume nicely, depending on how long it took to re-establish the internet connection. This way I let things “die” naturally. Just creating a new route with higher priority to “cancel” the multipath routing works very nicely to let the states die. When the ISP connection comes back, I simply delete this higher-priority route and things get back to the way they were.

      • Regarding ifstated, the setup above is exactly what I use in production and it’s enough for me. Static IPs, I don’t have control over whatever the next uplink device is, so I don’t need it to do anything other than redirect traffic. I’m not skilled in any kind of programming that can do threads, so can’t help there. I’d probably just use shell scripts anyway :). That’s why I kill the states too, since the ISPs are pretty stable if something goes wrong it’s likely gonna be serious and it won’t get fixed right away.

        The real pf.conf, however, is a bit more complicated: some connections need to go out through the main interface, so I always see higher traffic there. But, reading RFC 2992, as far as I understand it, the load balancing isn’t done per packet but per connection / flow as much as possible, meaning that once a connection is established on one interface it tries to keep it going on that one. Since not all connections are created equal (downloading a large file will produce more traffic than reading a webpage), the amount of traffic on each interface is likely to be different.

        Depending on how DHCP is implemented, the modem probably tries to get the same IP it had before it hung. If that IP is still free, the server likely accepts the request, which is why it’ll often keep the address.

        One thing I noticed lately is that YouTube videos sometimes return “an error has occurred”, even though after refreshing the page several times it might work. I think it might be related to multipath, but I’ve yet to investigate.

  3. Just a test, for hidden tags or < and > in source code:


    test
    test
    match out log on $ext1_if from nat-to $ext1_nat

    because there is some missing parts of scripts.

  4. ok, let’s try with pf.conf:

    # pf.conf
    #
    # See pf.conf(5) for syntax and examples.
    # Remember to set net.inet.ip.forwarding=1 and/or net.inet6.ip6.forwarding=1
    # in /etc/sysctl.conf if packets are to be forwarded between interfaces.

    ### MACROS

    # INTERFACES
    include "/etc/netconf/ifdef.conf"

    # ports open to the outside on this host (the firewall)
    fw_tcp_ports = "{ 22 }"
    # the range is for traceroute
    fw_udp_ports = "{ 33433:33626 }"
    # what kind of ICMP is allowed in. everything is allowed out.
    # echoreq - RFC 792
    # unreach - RFC 1122
    # trace - might as well
    icmp_types = "{ echoreq, unreach, trace }"

    # SIGNIFICANT HOSTS
    # webserver
    www_in=172.16.1.2
    www_out=10.100.100.200
    www_tcp_ports="{ http, https }"
    #www_udp_ports=" { } "
    # mail
    email_in=172.16.1.3
    email_out=10.100.100.201
    email_tcp_ports=" { domain, smtp, smtps, pop3, pop3s, imap, imaps, https } "
    email_udp_ports=" { domain } "

    ### TABLES
    table <rfc1918> const { 192.168.0.0/16, 172.16.0.0/12, 10.0.0.0/8 }
    table <NoRouteIPs> const { 127.0.0.0/8, 192.168.0.0/16, 172.16.0.0/12, \
    10.0.0.0/8, 169.254.0.0/16, 192.0.2.0/24, \
    0.0.0.0/8, 240.0.0.0/4 }
    # hosts that are allowed unrestricted access to the internet
    table <nated> { \
    192.168.1.10 \ # inside_server1
    192.168.1.20 \ # inside_server2
    192.168.1.30 \ # IP_phone
    192.168.1.35 \ # some_guy
    192.168.10.0/24 \ # these guys all get NAT
    172.16.1.3 \ # mail
    172.16.1.2 \ # www
    }

    # IPs in this table have attempted bruteforce and will be blocked
    # from accessing anything
    table <brutes> persist

    ### OPTIONS
    set skip on lo
    # should it drop by default, or be polite about it?
    set block-policy return

    ### NAT
    #
    match out log on $ext1_if from <nated> nat-to $ext1_nat
    match out log on $ext2_if from <nated> nat-to $ext2_nat
    #
    # redirects
    # www
    match in log on $ext1_if proto tcp to $www_out port $www_tcp_ports rdr-to $www_in
    # email
    match in log on $ext1_if proto tcp to $email_out port $email_tcp_ports rdr-to $email_in
    match in log on $ext1_if proto udp to $email_out port $email_udp_ports rdr-to $email_in

    ### FILTER RULES
    #
    # block all IPV6, we don't have that.
    # rules will need to be written for it, if ever
    block log quick inet6
    # these should never be routed to the internet, whatever the following rules
    block drop in log quick on egress from <NoRouteIPs> to any
    block drop out log quick on egress from any to <NoRouteIPs>
    # bruteforcers. entries expire in 1 hour, see root's crontab
    # alternative: expiretables
    block drop log quick from <brutes>

    # block incoming by default
    block in log on egress from any to any

    # we allow everything out on egress, but
    # the label on these rules will be used by ifstated
    # to clear states on a certain interface when it goes down
    pass out log on $ext1_if label "out_$if"
    pass out log on $ext2_if label "out_$if"

    # allow traffic to this host
    pass in log on egress proto tcp from any to (egress) port $fw_tcp_ports
    pass in log on egress proto udp from any to (egress) port $fw_udp_ports
    # allow certain ICMP
    pass in log inet proto icmp all icmp-type $icmp_types

    # redirects. the second rule is needed if traffic on inside interfaces is blocked by default
    # web
    pass in on $ext1_if proto tcp to $www_in port $www_tcp_ports reply-to ($ext1_if $ext1_gw)
    ##pass in on $dmz_if proto tcp to $www_in port $www_tcp_ports
    # email
    pass in on $ext1_if proto tcp to $email_in port $email_tcp_ports reply-to ($ext1_if $ext1_gw)
    ##pass in on $dmz_if proto tcp to $email_in port $email_tcp_ports
    pass in on $ext1_if proto udp to $email_in port $email_udp_ports reply-to ($ext1_if $ext1_gw)
    ##pass in on $dmz_if proto udp to $email_in port $email_udp_ports

    # SSH bruteforce protection
    # maximum 15 concurrent connections
    # maximum 10 connections in 5 seconds from the same IP
    pass on egress inet proto tcp from any to (egress) port 22 \
    flags S/SA keep state \
    (max-src-conn 15, max-src-conn-rate 10/5, \
    overload <brutes> flush global)

    # By default, do not permit remote connections to X11
    block in log on ! lo0 proto tcp to port 6000:6010

  5. Now the table names are shown in rules and table definitions.
    Thanks for the detailed info about redundant load balancing.

    • Wow, you took the time to fix the whole conf? Many thanks, I’ll leave it like this for now and I’ll be looking for a way to make WordPress not interpret characters inside <pre> tags, I’m probably not the first one to have this problem. Thanks for pointing it out.

  6. You are welcome, best wishes from south of you and larevedere

  7. Hi again,
    If there is no internet access on first interface (for example control host is unreachable), the scripts that will be executed are:
    1. ext1_chk.sh with failed ping tests;
    2. ext1_nok.sh, that will flush rules for interface 1;
    3. ext1_chk.sh to check connectivity of interface 1;
    Am I correct till now?
    Here is the question: how will the control host (google.com) be reached via int1 without routes for interface 1, since they are flushed in step 2?
    Thanks!

    • That’s why -I is used in the ping command:
      ping -c 3 -i 3 -w 3 -I $ext1_nat $control_host
      -I makes sure that the ping goes out the specified interface, route or not

      • Yes, the I option works, but in the same subnet of ext1_if. What about if control host is out of ext_if subnet, i.e. icmp packet leaving ext1_if should go to the default gw of ext1_if network? But default gw route was flushed in ext1_nok.sh or I missed something in pf rules…

        • It doesn’t work only in the same subnet, that was tested, it also worked with an outside control host like google.com for me.

          However, right now I can’t explain with certainty why. I don’t have access to that machine anymore and I don’t even have an OpenBSD, to be certain of what I’m talking about I would have to set up a test environment that I don’t have.

          I tried it on Linux, it refuses to send the ping without a route. On Windows I’m not even going to try.

          I’m thinking of two possibilities. One, assuming the first device down the wire does some kind of routing, if the packet reaches that device and is accepted as valid, it doesn’t matter anymore, because it has a destination (google.com) and that device will know what to do with it. So just sending the packet on the wire into oblivion could work. The problem is, I don’t know if the link layer (Ethernet) would accept a packet without a destination MAC address, or what happens if the first device on the wire is a dumb L2 switch that has no idea about anything other than MAC addresses and local IPs.
          Second, maybe the gateway for that particular interface is still stored somewhere even if the route is flushed and the packet doesn’t leave the interface without a destination MAC. In that case, I think what happens is obvious. Unfortunately, I don’t have an OpenBSD to test.

          I’d like to know the answer though. If you can test, you can save a packet dump with tcpdump and then read it with wireshark. The difference should be pretty obvious if you compare before and after flush.

          • Yes, I was trying to use your scripts and ran into this problem: ping directly returns No route to host. I will do some more tests and try workarounds tomorrow or maybe the day after tomorrow and will write here.

          • But… there should be a route to host. The other interface should be up, that one should have a valid route. Does it say that only when -I is used, or even without it?
            I was thinking that if the packet doesn’t have a gateway or it’s the wrong one, you could force one in pf by using route-to, or somesuch. But for that, there has to be a packet first.
