As pointed out by an astute reader, my current architecture is... limited.
In the two previous incarnations of EEE Cooks, I have scraped by with either completely static HTML or a single mongrel instance on a shared hosting service. Now that I have an entire Linode all to myself, I have more options available.
Specifically, I can run HAProxy to balance multiple thin instances of my Sinatra application. I do not know that I will ever need that many, but YAGNI rarely applies to infrastructure—especially when infrastructure is cheap. And cheap it is: getting four thin instances requires a single change to the YAML configuration file:
```
# sudo vi /etc/thin/eee.yml
```

I could configure the nginx server that I am using to round-robin through each of the thin instances. The trouble with vanilla round-robin proxying is that, should one instance get bogged down with a long-running process, subsequent requests will eventually round-robin back to the "stuck" instance—and hang.
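For illustration, a minimal /etc/thin/eee.yml might look something like the sketch below. The paths and most values here are assumptions on my part; the relevant bit is the servers line, which tells thin how many instances to spawn, counting up from the given port:

```yaml
# /etc/thin/eee.yml — illustrative values, not the actual file
chdir: /var/www/eee        # assumed application directory
environment: production
address: 127.0.0.1
port: 8000                 # first instance; the rest count up from here
servers: 4                 # spawns instances on 8000, 8001, 8002, 8003
daemonize: true
```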
There is the fair proxy patch for nginx, but I seem to recall some instability with it. HAProxy has always been rock solid, so I prefer it for load balancing. It does suffer from dense documentation, though more and more is popping up.
```
sh-3.2$ sudo apt-get install haproxy
```

Annoyingly, HAProxy on Debian will not start via its init.d script without the following change to /etc/default/haproxy:

```
# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1

# Add extra flags here.
#EXTRAOPTS="-de -m 16"
```

I could accept the default value of ENABLED=0 if there were some warning when the init.d script is run, but it just silently fails. Ah well, such is life with a tool of such pure awesomeness whose only documentation is an 80-column text file.
With it enabled, I can get HAProxy to proxy my thin servers with the following config file (based on the config from the Rails Machine wiki):
```
global
  # maximum number of simultaneous active connections from an upstream web server
  maxconn 4096

  # Logging to syslog facility local0
  # log 127.0.0.1 local0

  # Distribute the health checks with a bit of randomness
  spread-checks 5

  # Uncomment the statement below to turn on verbose logging
  #debug

# Settings in the defaults section apply to all services (unless you change it,
# this configuration defines one service, called eee).
defaults
  # apply log settings from the global section above to services
  log global

  # Proxy incoming traffic as HTTP requests
  mode http

  # Distribute incoming requests between Mongrels by round robin algorithm.
  # Note that because of 'maxconn 1' settings in the listen section, Mongrels
  # that are busy processing some other request will actually be skipped.
  # So, the actual load-balancing behavior is smarter than simple round robin.
  balance roundrobin

  # Maximum number of simultaneous active connections from an upstream web server
  # per service
  maxconn 4096

  # Log details about HTTP requests
  option httplog

  # Abort request if client closes its output channel while waiting for the
  # request. HAProxy documentation has a long explanation for this option.
  option abortonclose

  # Check if a "Connection: close" header is already set in each direction,
  # and will add one if missing.
  option httpclose

  # If sending a request to one Mongrel fails, try to send it to another, 3 times
  # before aborting the request
  retries 3

  # Do not enforce session affinity (i.e., an HTTP session can be served by
  # any Mongrel, not just the one that started the session)
  option redispatch

  # Timeout a request if the client did not read any data for 120 seconds
  timeout client 120000

  # Timeout a request if Mongrel does not accept a connection for 120 seconds
  timeout connect 120000

  # Timeout a request if Mongrel does not accept the data on the connection,
  # or does not send a response back in 120 seconds
  timeout server 120000

  # Remove the server from the farm gracefully if the health check URI returns
  # a 404. This is useful for rolling restarts.
  http-check disable-on-404

  # Enable the statistics page
  stats enable
  stats uri /haproxy?stats
  stats realm Haproxy\ Statistics
  stats auth haproxy:stats

  # Create a monitorable URI which returns a 200 if haproxy is up
  monitor-uri /haproxy?monitor

  # Specify the HTTP method and URI to check to ensure the server is alive.
  # see http://github.com/jnewland/pulse
  #option httpchk GET /pulse

  # Amount of time after which a health check is considered to have timed out
  timeout check 2000

# Thin service section.
listen eee :80
  server eee-1 localhost:8000 maxconn 1 check inter 20000 fastinter 1000 fall 1
  server eee-2 localhost:8001 maxconn 1 check inter 20000 fastinter 1000 fall 1
  server eee-3 localhost:8002 maxconn 1 check inter 20000 fastinter 1000 fall 1
  server eee-4 localhost:8003 maxconn 1 check inter 20000 fastinter 1000 fall 1
```

Nice! For the astute, I have left the status page available for the time being (this is a beta site). Tomorrow, I will set up some monitoring. For real this time.
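The "smarter than simple round robin" behavior described in the config comments comes from the interplay of balance roundrobin and maxconn 1: HAProxy walks the server list in order, but skips any backend already handling a request. A toy Ruby sketch of that selection logic (the class and method names are mine, not HAProxy's):

```ruby
# Toy model of round-robin balancing with "maxconn 1" per backend:
# walk the server list in order, skipping any server that is busy.
class Balancer
  def initialize(servers)
    @servers = servers
    @busy = Hash.new(false)
    @next = 0
  end

  # Pick the next free server in round-robin order; nil if all are busy.
  def pick
    @servers.length.times do
      server = @servers[@next % @servers.length]
      @next += 1
      unless @busy[server]
        @busy[server] = true
        return server
      end
    end
    nil # every backend is at its connection limit
  end

  # A finished request frees its server up for the next pick.
  def release(server)
    @busy[server] = false
  end
end
```

When every backend is busy, this sketch returns nil; the real HAProxy instead queues the request until a server frees up, which is exactly why a single stuck thin no longer hangs subsequent requests.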