Thursday, August 27, 2009

Grotesquely Thin

I was able to create a deploy user with limited privileges yesterday, but things quickly deteriorated after that. I was foiled in my attempts at enabling the deploy user to restart my thin cluster. As I quit in frustration, I was reasonably sure that the problem lay in the god configuration that I was using. Frustration is never a good guide, last night being yet another example of this.

The true source of the problem lay in the way that I was accessing individual servers in the thin cluster. The thin help text on the subject reads:
sh-3.2$ thin --help
Usage: thin [options] start|stop|restart|config|install
Cluster options:
    -s, --servers NUM    Number of servers to start
    -o, --only NUM       Send command to only one server of the cluster
    -C, --config FILE    Load options from config file
        --all [DIR]      Send command to each config files in DIR
The specific option that I got wrong was --only, which I assumed would be indexed from 1. Assumed, not verified, and I got burned (although honestly, who wants to start server number zero?). The relevant block of the god configuration reads:
DAEMON = ' '
CONFIG_PATH = '/etc/thin'
APP_ROOT = '/var/www/eee-code'

%w{8000 8001 8002 8003}.each do |port|
  # Yields 1 through 4: the mistaken one-based index
  server_number = port.to_i - 7999
  God.watch do |w|
    w.name = "eee-thin-#{port}"
    w.start = "#{DAEMON} start --all #{CONFIG_PATH} --only #{server_number}"
    w.stop = "#{DAEMON} stop --all #{CONFIG_PATH} --only #{server_number}"
    w.restart = "#{DAEMON} restart --all #{CONFIG_PATH} --only #{server_number}"
    w.interval = 30.seconds # default
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds
    w.pid_file = File.join(APP_ROOT, "log/thin.#{port}.pid")
  end
end

The server_number is calculated to be 1 (8000 - 7999) through 4, so starting eee-thin-8000 will start server #1, which actually runs on port 8001. As one might expect, all manner of bad things happen when the monitoring software and the PIDs recorded on the filesystem disagree.
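
To make the mismatch concrete, here is a minimal sketch of the arithmetic. The names BASE_PORT, SERVERS and port_for are hypothetical, made up for illustration. The sketch assumes that thin gives server index n the port "base port + n", which is what I observe here rather than something taken from thin's documentation.

```ruby
# Sketch only: compares the one-based index from the god configuration
# with a zero-based index, assuming thin maps index n to BASE_PORT + n.
BASE_PORT = 8000 # first port in the god configuration's port list
SERVERS   = 4    # number of ports in that list

# Hypothetical helper: the port that a given --only index would reach.
def port_for(index)
  BASE_PORT + index
end

ports = (0...SERVERS).map { |index| port_for(index) }

ports.each do |port|
  one_based  = port - 7999      # what the configuration above computes
  zero_based = port - BASE_PORT # what the watch name implies
  puts "eee-thin-#{port}: --only #{one_based} reaches port #{port_for(one_based)}, " \
       "--only #{zero_based} reaches port #{port_for(zero_based)}"
end
```

Under that assumption every watch is shifted by one: eee-thin-8000 is sent to the server on port 8001, and eee-thin-8003 is sent to index 4, one past the end of a four-server cluster.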

The real question is how the server on port 8000 gets started at all (because there is a server listening on port 8000). To answer that, I start the thin server manually:
sh-3.2$ sudo /var/lib/gems/1.8/bin/thin start --all /etc/thin --only 1
Starting server on ...
[start] /etc/thin/eee.yml ...
Starting server on ...
Starting server on ...
Starting server on ...
Starting server on ...
Wow. There is all sorts of wrong there. My /etc/thin/eee.yml (the only file in /etc/thin) has no mention of port 3001:
user: www-data
group: www-data
pid: log/
log: log/thin.log
timeout: 60
port: 8000
servers: 4
chdir: /var/www/eee-code
environment: production
daemonize: true
Really, I have no explanation for that other than I am most likely abusing the --all option, which is supposed to "Send command to each config files in DIR". Why this treats the only config file in the DIR differently than explicitly pointing to that config file, I cannot say. But when I use the config directly, it starts up correctly:
sh-3.2$ sudo /var/lib/gems/1.8/bin/thin start --config /etc/thin/eee.yml --only 1
Starting server on ...
It works, but I am definitely starting the second server, not the first one, which would have started on port 8000. That is, the --only option is indexed from zero, not one.

Ultimately, I take my lessons learned and use this updated god configuration:
%w{8000 8001 8002 8003}.each do |port|
  God.watch do |w|
    # --only is indexed from zero: port 8000 is server 0
    server_number = port.to_i - 8000
    pid_file = File.join(APP_ROOT, "log/thin.#{port}.pid")

    w.name = "eee-thin-#{port}"
    w.group = "thin"
    w.interval = 30.seconds # default
    w.start = "#{DAEMON} start --config #{CONFIG_PATH}/eee.yml --only #{server_number}"
    w.stop = "#{DAEMON} stop --config #{CONFIG_PATH}/eee.yml --only #{server_number}"
    w.restart = "#{DAEMON} restart --config #{CONFIG_PATH}/eee.yml --only #{server_number}"
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds
    w.pid_file = pid_file
  end
end
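
The hard-coded port list and the "- 8000" offset both duplicate what is already in eee.yml. As a hedged sketch of one way to avoid that drift, the index, port, watch name and PID file could be derived from the thin config itself. The helper thin_servers and the constant THIN_CONFIG are hypothetical names. The YAML is an inline copy of three keys from my config rather than a read of the real file, and the PID file naming simply follows the pattern used in the god configuration.

```ruby
require 'yaml'

# Inline copy of the relevant keys from /etc/thin/eee.yml (not read from disk).
THIN_CONFIG = YAML.load(<<~YML)
  port: 8000
  servers: 4
  chdir: /var/www/eee-code
YML

# Hypothetical helper: one entry per server in the cluster, assuming
# zero-based --only indexes and ports counted up from the base port.
def thin_servers(config)
  (0...config['servers']).map do |index|
    port = config['port'] + index
    {
      :index    => index,
      :port     => port,
      :name     => "eee-thin-#{port}",
      :pid_file => File.join(config['chdir'], "log/thin.#{port}.pid")
    }
  end
end

thin_servers(THIN_CONFIG).each do |server|
  puts "#{server[:name]}: --only #{server[:index]}, pid file #{server[:pid_file]}"
end
```

Each entry carries everything a watch block needs, so the loop in the god configuration could iterate over thin_servers instead of a literal list of ports.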

After starting /etc/init.d/god, thin is now running:
sh-3.2$ sudo god status
couchdb: up
eee-thin-8000: up
eee-thin-8001: up
eee-thin-8002: up
eee-thin-8003: up
More importantly, I can now issue the stop/start commands with my Vlad / god rake tasks:
cstrom@jaynestown:~/repos/eee-code$ rake vlad:stop_app
(in /home/cstrom/repos/eee-code)
...'s password:
Could not chdir to home directory /home/deploy: No such file or directory
Sending 'stop' command

The following watches were affected:
Which means...

I can finally deploy with Vlad. Except for one thing: my CouchDB "migrations" (loading updated views) are not hooked into Vlad yet. I will finish that off tomorrow and then move on to other features in need of deploying.
