I have things on the beta site in something of a final state. I have been meaning to keep a closer eye on things, so:
sh-3.2$ sudo gem install god
God is a Ruby-based monitoring tool. I do more than enough monit stuff at the day job, so there is no way that I am going to do it on the side as well. Besides, god is kinda fun.
I start with this configuration for my thin server:
DAEMON = '/var/lib/gems/1.8/bin/thin'
CONFIG_PATH = '/etc/thin'
APP_ROOT = '/var/www/eee-code'
%w{8000 8001 8002 8003}.each do |port|
  server_number = port.to_i - 7999
  God.watch do |w|
    w.name = "eee-thin-#{port}"
    w.start = "#{DAEMON} start --all #{CONFIG_PATH} --only #{server_number}"
    w.stop = "#{DAEMON} stop --all #{CONFIG_PATH} --only #{server_number}"
    w.restart = "#{DAEMON} restart --all #{CONFIG_PATH} --only #{server_number}"
    w.interval = 30.seconds # default
    w.start_grace = 10.seconds
    w.restart_grace = 10.seconds
    w.pid_file = File.join(APP_ROOT, "log/thin.#{port}.pid")
    w.behavior(:clean_pid_file)
  end
end
The nice thing about god is the readability of the configuration file: most of it is self-explanatory. So far, I have a name (for reporting purposes), start/stop/restart commands, interval between checks, and grace periods after start/restart before monitoring kicks back in.
The only out-of-the-ordinary thing in there is the server_number, which is thin-specific. The servers are indexed from 1. The first thin server in my cluster is located at port 8000 (then 8001, 8002, and 8003). Thus the server number can be calculated by subtracting 7999 from the port number. Try that with monit.
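To make the port-to-server-number mapping concrete, here is a minimal standalone sketch that builds the same command strings without loading god. The `BASE_PORT` constant and the `server_number` / `thin_command` helpers are names made up for illustration; the paths are the ones from the configuration above.

```ruby
DAEMON      = '/var/lib/gems/1.8/bin/thin'
CONFIG_PATH = '/etc/thin'
BASE_PORT   = 8000 # assumption: the first port in the thin cluster

# thin indexes its servers from 1, so the first port maps to server 1.
# This is the same arithmetic as subtracting 7999 from the port.
def server_number(port, base_port = BASE_PORT)
  port.to_i - base_port + 1
end

# Build the thin command line for one server in the cluster.
def thin_command(action, port)
  "#{DAEMON} #{action} --all #{CONFIG_PATH} --only #{server_number(port)}"
end

%w{8000 8001 8002 8003}.each do |port|
  puts "eee-thin-#{port}: #{thin_command('start', port)}"
end
```

Deriving the number from a named base port rather than a literal 7999 would make it a bit more obvious what needs to change if the cluster ever moves to a different port range.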
The next few bits are from the sample configuration file provided along with the god gem. I add a check, polled every 5 seconds, to start the server if it is not running:
w.start_if do |start|
  start.condition(:process_running) do |c|
    c.interval = 5.seconds
    c.running = false
  end
end
I add a check to restart the server if the memory or cpu usage gets too high:
w.restart_if do |restart|
  restart.condition(:memory_usage) do |c|
    c.above = 30.megabytes
    c.times = [3, 5] # 3 out of 5 intervals
  end
  restart.condition(:cpu_usage) do |c|
    c.above = 50.percent
    c.times = 5
  end
end
And finally, a check to handle a flailing system (to force god to stop paying attention):
w.lifecycle do |on|
  on.condition(:flapping) do |c|
    c.to_state = [:start, :restart]
    c.times = 5
    c.within = 5.minute
    c.transition = :unmonitored
    c.retry_in = 10.minutes
    c.retry_times = 5
    c.retry_within = 2.hours
  end
end
Last, but certainly not least, I would like to be notified when the server enters one of these states. Since I am running the lightweight sendmail replacement, ssmtp, I need to employ god's sendmail configuration:
God::Contacts::Email.message_settings = {
  :from => 'god@___cooks.com'
}
God::Contacts::Email.delivery_method = :sendmail
God::Contacts::Email.sendmail_settings = {
  :location => '/usr/sbin/sendmail',
  :arguments => '-i -t'
}
God.contact(:email) do |c|
  c.name = 'chris'
  c.email = 'chris@___cooks.com'
end
And, to get notified when a start takes place, I add a notify directive to the start condition:
w.start_if do |start|
  start.condition(:process_running) do |c|
    c.interval = 5.seconds
    c.running = false
    c.notify = 'chris'
  end
end
To check my monitoring out, I intentionally kill the thin server running on port 8001 and watch the god logs:
...
I [2009-08-18 02:16:46] INFO: eee-thin-8001 sent email to chris@___cooks.com (Email)
I [2009-08-18 02:16:46] INFO: eee-thin-8001 move 'up' to 'start'
I [2009-08-18 02:16:46] INFO: eee-thin-8001 before_start: no pid file to delete (CleanPidFile)
I [2009-08-18 02:16:46] INFO: eee-thin-8001 start: /var/lib/gems/1.8/bin/thin start --all /etc/thin --only 2
I [2009-08-18 02:16:46] INFO: eee-thin-8002 [ok] process is running (ProcessRunning)
...
I [2009-08-18 02:16:57] INFO: eee-thin-8001 moved 'up' to 'up'
I [2009-08-18 02:16:57] INFO: eee-thin-8001 [ok] process is running (ProcessRunning)
I [2009-08-18 02:16:57] INFO: eee-thin-8001 [ok] memory within bounds [21512kb] (MemoryUsage)
I [2009-08-18 02:16:57] INFO: eee-thin-8001 [ok] cpu within bounds [4.52029520294135%] (CpuUsage)
Nice!
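The MemoryUsage line in that log is one sample feeding the `c.times = [3, 5]` rule from the restart check. As a rough illustration of what "3 out of 5 intervals" means, here is a small standalone sketch. It is not god's actual implementation: the `ThresholdWindow` class is hypothetical, the sample values other than 21512 are made up, and it assumes the limit is compared in kilobytes, as the log reports.

```ruby
# Hypothetical helper, not part of god: remembers the last `of` samples
# and reports whether at least `needed` of them were above `limit`.
class ThresholdWindow
  def initialize(limit, needed, of)
    @limit   = limit
    @needed  = needed
    @of      = of
    @samples = []
  end

  # Record one sample, discard the oldest once the window is full,
  # and return true if the condition is currently tripped.
  def record(value)
    @samples << value
    @samples.shift if @samples.size > @of
    tripped?
  end

  def tripped?
    @samples.count { |v| v > @limit } >= @needed
  end
end

# 30 MB expressed in kB, tripping on 3 of the last 5 samples.
window = ThresholdWindow.new(30 * 1024, 3, 5)
[21_512, 31_000, 32_000, 29_000, 33_000].each do |kb|
  puts "#{kb}kb -> restart? #{window.record(kb)}"
end
```

The point of the window is that a single spike does not trigger a restart; only sustained usage over several checks does.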
I will get god working under init.d tomorrow and configure it to monitor HAProxy and CouchDB as well.