
Nginx, reverse proxies and DNS resolution

Nginx is a pretty awesome high-performance web server and reverse proxy. It’s often used in conjunction with other HTTP servers such as Java/Tomcat and Ruby/Unicorn, as it allows static content to be served directly from disk by Nginx and connections from slow clients to be queued and buffered by Nginx, rather than tying up the expensive/scarce application server worker processes.

 

A typical Nginx reverse proxy configuration to a single backend using proxy_pass to a local HTTP server application on port 8080 would look something like this:

server {
    ...
    # proxy_pass is only valid inside a location block
    location / {
        proxy_pass http://localhost:8080;
    }
    ...
}

Another popular approach is to define an upstream group (which can be used for multiple servers, or a single one if desired), for example:

upstream upstream-localhost {
    server localhost:8080;
}

server {
    ...
    # proxy_pass is only valid inside a location block
    location / {
        proxy_pass http://upstream-localhost;
    }
    ...
}

Generally this configuration works fine for most of our use cases – we typically have a 1-to-1 mapping between a backend application server and Nginx, so the configuration is very simple and reliable – any issues are usually with the backend application, rather than Nginx itself.

 

However, there are occasions when it’s desirable to have Nginx talk to a backend on another server.

I recently implemented an OAuth2 gateway using Nginx-Lua, with the Nginx gateway doing the OAuth2 authentication in a small Lua module before passing the request through to the backend application. This configuration ran on a pair of bastion servers, which reverse proxy the request through to an Amazon ELB which load balances a number of application servers.

This works perfectly 95% of the time, but Amazon ELBs (even internal ones) have a tendency to change their IP addresses. Normally this doesn’t matter, since you never reference ELBs via their IP address and use their DNS name instead. However, the default behaviour of the Nginx upstream and proxy modules is to resolve DNS at startup and never re-resolve it while Nginx is running.

This leads to a situation where the Amazon ELB IP address changes, Amazon update the DNS record, but Nginx never re-resolves the DNS record and stays pointing at the old IP address. Subsequently requests to the backend start failing once Amazon drops services from the old IP address.

This lack of re-resolution of backends is a known limitation/issue with Nginx. Thankfully there is a workaround, as per this mailing list post: setting proxy_pass to a variable forces Nginx to re-resolve the DNS name, as Nginx treats variables differently to static configuration.

server {
    ...
    resolver 127.0.0.1;
    set $backend_upstream "http://dynamic.example.com:80";
    # proxy_pass is only valid inside a location block
    location / {
        proxy_pass $backend_upstream;
    }
    ...
}

 

When using parametrised backends like this, a resolver (DNS server address) must also be configured in Nginx, as it is unable to use the local OS resolver. The resolver must point directly to a name server IP address.

If your name servers aren’t predictable, you could install something like dnsmasq to provide a local resolver on 127.0.0.1 which then forwards to the dynamically assigned name server, or take the approach of pulling the name server details from the host using something like Puppet Facts and then writing it into the configuration file when it’s generated on the host.

Nginx >= 1.1.9 will re-resolve DNS records based on their TTL, but it’s possible to override this with any value desired using the valid parameter of the resolver directive (for example, resolver 127.0.0.1 valid=30s;). To verify correct behaviour, tcpdump will quickly show whether re-resolution is working.

# tcpdump -i eth0 port 53
15:26:00.338503 IP nginx.example.com.53933 > 8.8.8.8.domain: 15459+ A? dynamic.example.com. (54)
15:26:00.342765 IP 8.8.8.8.domain > nginx.example.com.53933: 15459 1/0/0 A 10.1.1.1 (70)
...
15:26:52.958614 IP nginx.example.com.48673 > 8.8.8.8.domain: 63771+ A? dynamic.example.com. (54)
15:26:52.959142 IP 8.8.8.8.domain > nginx.example.com.48673: 63771 1/0/0 A 10.1.1.2 (70)
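If you would rather not eyeball a long capture, a few lines of Python can summarise it. This is only a rough sketch: the regular expressions assume the exact tcpdump line layout shown in the sample above, and the helper name summarise is made up for illustration.

```python
import re

# Sample lines copied from the capture above.
CAPTURE = """\
15:26:00.338503 IP nginx.example.com.53933 > 8.8.8.8.domain: 15459+ A? dynamic.example.com. (54)
15:26:00.342765 IP 8.8.8.8.domain > nginx.example.com.53933: 15459 1/0/0 A 10.1.1.1 (70)
15:26:52.958614 IP nginx.example.com.48673 > 8.8.8.8.domain: 63771+ A? dynamic.example.com. (54)
15:26:52.959142 IP 8.8.8.8.domain > nginx.example.com.48673: 63771 1/0/0 A 10.1.1.2 (70)
"""

# Assumed line formats: "<id>+ A? <name>" for queries, "<id> n/n/n A <ip>" for answers.
QUERY = re.compile(r"^(\d{2}:\d{2}:\d{2})\.\d+ IP \S+ > \S+: \d+\+ A\? (\S+) ")
ANSWER = re.compile(r"^(\d{2}:\d{2}:\d{2})\.\d+ IP \S+ > \S+: \d+ \d+/\d+/\d+ A (\S+) ")


def summarise(capture):
    """Return (queries, answers) as lists of (time, name-or-address) tuples."""
    queries, answers = [], []
    for line in capture.splitlines():
        if m := QUERY.match(line):
            queries.append((m.group(1), m.group(2)))
        elif m := ANSWER.match(line):
            answers.append((m.group(1), m.group(2)))
    return queries, answers


queries, answers = summarise(CAPTURE)
for when, address in answers:
    print(when, address)
```

More than one query for the same name over time means Nginx is re-resolving, and a change in the answered address shows the new IP being picked up.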

It’s a bit of an annoyance in an otherwise fantastic application, but as long as you are aware of the limitation, it is not too difficult to resolve the issue by a bit of configuration adjustment.

ELBs & Corporate Proxies

Following on from yesterday’s ELB post, it’s worth noting that there’s another common scenario where you can trigger issues when accessing ELBs – many corporates enforce the use of an HTTP proxy for all outgoing traffic, sometimes transparently, other times less so.

Having a multi-AZ ELB and accessing it from your own data center isn’t too much of an issue: if each host does its own DNS lookup, your hosts should end up with a roughly 50/50 split across AZs, as each one resolves its own DNS record.

But when a proxy is added to the mix, it breaks this, since proxies tend to do their own DNS lookups and cache the results for use by other clients. Testing with Squid showed that its DNS caching would favour a particular AZ and send all traffic to that one AZ, before flipping to the other AZ when the DNS cache expired and was refreshed – in my case, every 5 minutes when the TTL expired.

[Image: I'm so sick of these motherfucking ELBs in this motherfucking cloud!]

Go home Amazon, you’re drunk

If you’re using Squid, there’s little you can do to work around this – whilst you can adjust the Squid DNS caching times and approaches, short of disabling DNS caching and taking the performance hit of a DNS lookup for every new outbound request, you will always end up with load jumping between the two AZs and causing havoc.
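The flip-flopping can be reproduced with a toy model. The sketch below is not how Squid is implemented; it simply assumes a proxy that caches a single DNS answer for the full TTL and receives the next address in the rotation on each refresh. The class name, the AZ labels and the one-request-per-second load are all made up for illustration.

```python
from collections import Counter
from itertools import cycle


class CachingProxy:
    """Toy model of a proxy that caches one DNS answer until its TTL expires."""

    def __init__(self, addresses, ttl):
        self._rotation = cycle(addresses)  # stand-in for DNS round robin
        self._ttl = ttl
        self._cached = None
        self._expires_at = 0

    def resolve(self, now):
        # Only look the name up again once the cached answer has expired.
        if self._cached is None or now >= self._expires_at:
            self._cached = next(self._rotation)
            self._expires_at = now + self._ttl
        return self._cached


def traffic_per_window(proxy, duration, window):
    """Send one request per second and count hits per AZ in each window."""
    windows = [Counter() for _ in range(duration // window)]
    for second in range(duration):
        windows[second // window][proxy.resolve(second)] += 1
    return windows


# 20 minutes of traffic through a proxy with a 5 minute TTL
proxy = CachingProxy(["az-a", "az-b"], ttl=300)
windows = traffic_per_window(proxy, duration=1200, window=300)
for index, counts in enumerate(windows):
    print(index, dict(counts))
```

Every 5 minute window lands entirely on one AZ, and the busy AZ alternates from window to window, which matches the pattern seen through Squid.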

There are a few workaround options:

  • Have multiple Squid proxies for your outbound traffic and load balance between them on your network. If you load balance outgoing traffic across 4+ different outbound Squid servers, your load should end up going to different AZs a *bit* more evenly – but it’s still not guaranteed.
  • Create an internal ELB and access via your VPC link, allowing you to bypass your company’s outbound network proxies (as traffic routes via VPN or Direct Connect) – but then you’re paying for 2x ELBs – one external for end users and one internal for your own systems.
  • Replace the ELB with something actually useful (eg a Varnish or HAProxy instance in Amazon).
  • Get rid of the outbound proxies please! I could write a business case for it based on the amount of money I’ve seen proxies waste at so many different companies (hint: engineers’ time debugging issues is much more expensive than a couple of GB extra data usage).
  • Gin.

Russian roulette with ELBs and CDNs

In my day job, I look after a number of websites, all of which generally make heavy use of CDNs (Content Distribution Networks) to offload traffic to edge nodes near to an end user’s device. In our case we use Akamai, one of the largest and most experienced providers in the world.

A large number of our clusters and applications now run on Amazon’s public cloud service here in Sydney, making use of EC2 instances and ELBs. Due to the important nature of our systems, we have almost all applications in active-active multi-AZ (Availability Zone) configurations. The intention of this design is that the ELB (Elastic Load Balancer) serves all incoming traffic by dividing it across each availability zone in equal proportions. If either Amazon AZ fails, the other will continue to serve requests like nothing is wrong.

It’s a nicer solution than the traditional data center approach of having an active-passive multi-site design, as with both AZs being constantly active serving requests, we know that production and “DR” are always in a functional working state, ready to handle traffic; plus your investment into DR isn’t going to waste like traditional servers sitting idle.

Unfortunately Amazon ELBs offer only the barest of no-frills features, which makes them a bit stupid at times. In particular, Amazon’s multi-AZ ELBs actually consist of two separate ELBs, one in each AZ. Incoming traffic selects an ELB by means of DNS round robin and is then directed to a server in that particular AZ.

Thus, each availability zone has its own ELB, which adds its own IP address to the DNS round robin, and looks something like this:

www.example.com is an alias for www-example-com-elb.jws.elb.amazonaws.com.
www-example-com-elb.jws.elb.amazonaws.com. has address 172.16.32.1
www-example-com-elb.jws.elb.amazonaws.com. has address 192.168.0.1

The problem is that DNS round robin has no guarantee of balancing the load evenly across the two data centers. If a particular company’s proxy server caches one address, it may direct traffic for the whole company to AZ-A and deliver no traffic to AZ-B.

In reality, due to the large number of users getting assigned different IP addresses with round robin, users tend to be spread somewhat evenly across the different AZs, making the problem a somewhat moot point when you have sizeable visitor numbers.
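A quick simulation illustrates the difference between many independent visitors and a population sitting behind one cached answer. It is a sketch under a simplifying assumption: DNS round robin is modelled as each resolver picking one of the two example addresses at random, and the function names are made up for illustration.

```python
import random
from collections import Counter

ELB_ADDRESSES = ["172.16.32.1", "192.168.0.1"]  # the two example addresses above


def independent_clients(n_clients, rng):
    """Each client resolves for itself and gets one address at random."""
    return Counter(rng.choice(ELB_ADDRESSES) for _ in range(n_clients))


def behind_one_cache(n_clients, rng):
    """All clients share whichever single address the proxy cached."""
    cached = rng.choice(ELB_ADDRESSES)
    return Counter({cached: n_clients})


rng = random.Random(42)  # fixed seed so the run is repeatable
spread = independent_clients(10_000, rng)
pinned = behind_one_cache(10_000, rng)
print("independent:", dict(spread))
print("one cache:  ", dict(pinned))
```

With thousands of independent resolvers the split comes out close to even, whereas everything behind the shared cache lands on a single address.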

But if you add Akamai to the mix, you can end up with interesting results – it turns out that Akamai Edge nodes in AU use a central source of DNS information, which can lead to them favouring a particular ELB IP address. And since *all* your traffic goes via the CDN, this in turn results in all your traffic going directly to a single AZ and ignoring the other one entirely.

In a real-world scenario with a 4-webserver cluster, we saw traffic jump between the AZs whenever Akamai’s edge servers updated DNS to a different IP address, as per the below graph:

[Graph: Time to really test that your application is active-active!]

Akamai decides to switch which ELB it’s using from A to B :-/

This swapping brings about some really nasty issues. In theory your active-active setup should be large enough to handle all your usual traffic load on just one AZ, but if that’s not the case, bad things will happen to your site performance and/or reliability.

The other nasty issue arises when doing auto-scaling with Amazon: this swapping messes with your CloudWatch metrics for autoscale policies/triggers – one AZ is completely idle, one AZ is maxed out, and the average stats show a half-busy cluster with no need to autoscale upwards to handle the load.
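The arithmetic behind that is easy to see with some made-up numbers. The CPU figures and the 70% threshold below are purely hypothetical, and this is plain averaging rather than anything from the CloudWatch API.

```python
from statistics import mean

# Hypothetical CPU utilisation (%) per instance for a 4-webserver cluster
cpu_by_az = {
    "az-a": [96.0, 94.0],  # the AZ currently receiving all the CDN traffic
    "az-b": [3.0, 5.0],    # the idle AZ
}
SCALE_UP_THRESHOLD = 70.0  # assumed policy: scale up above 70% average CPU

# Cluster-wide average, as an autoscale trigger on average CPU would see it
cluster_average = mean(cpu for instances in cpu_by_az.values() for cpu in instances)
# Per-AZ averages, which reveal the imbalance the cluster average hides
per_az_average = {az: mean(instances) for az, instances in cpu_by_az.items()}

print(cluster_average, cluster_average > SCALE_UP_THRESHOLD)
print(per_az_average)
```

The cluster-wide average sits just under 50%, well below the assumed threshold, even though the instances in the busy AZ are close to saturated.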

And even if you’re clever and set your autoscaling to also trigger based on ELB latency/errors/throughput, you may still end up with issues, since the new host created during the autoscale may end up in the idle AZ, instead of the active AZ where you need it.

Using a smarter system for load balancing can negate the issue – for example using a pair of Varnish servers or HAProxy servers configured to do cross-AZ load balancing would work around the issue by spreading all the traffic coming into one AZ across all the servers in both AZs, but this does have increased costs (running EC2 instances, inter-AZ traffic). It may also have performance issues depending on the amount of traffic pouring into your instance.

Additionally, if you have a global audience, rather than a mostly single-country audience like us, you may not see the issue, since the different Akamai regions around the world will balance load somewhat equally across the two AZs.

To properly fix this behaviour with Akamai, you need to open a professional services request and have the SureRoute configuration adjusted so that Akamai forces the edge nodes to look up the origin IPs at the edge:

<!-- SR fix to handle multiple origin IP's -->
<forward:cache-parent.sureroute2.force-origin-ip-from-edge>on
</forward:cache-parent.sureroute2.force-origin-ip-from-edge>
<forward:cache-parent.sureroute2.round-robin.status>on
</forward:cache-parent.sureroute2.round-robin.status>

<!-- no host in sureroute stat-key -->
<forward:cache-parent.sureroute2.stat-key.host>off
</forward:cache-parent.sureroute2.stat-key.host>

With this fixed configuration, Akamai correctly spreads load evenly across our two AZs, and our load graphs have settled comfortably back into normality. I’m not entirely sure why this configuration isn’t default SureRoute behaviour, but like many things with Akamai, there are often mysterious adjustments that only professional services know about or can make.

Finally it’s worth noting that this issue isn’t unique to Amazon – you could get the same issue if you run active-active conventional data centers and use Akamai for offload. It may also be an issue with other CDNs by default, so double-check the behaviour of your particular vendor – it would be interesting to see if CloudFront (Amazon’s CDN) exhibits similar issues or not.

Credit to my colleague Andrew. for spotting this issue originally and having to deal with two different vendors’ support cases at once to get to the bottom of the root cause.