我们最近将 Rails 应用迁移到 MRI 2.2.0(可能相关也可能不相关)后,在 Unicorn 零停机重启方面遇到了问题。
当我将 USR2 信号发送到主进程(此示例中为 PID 19216)时,我在 Unicorn 日志中收到以下内容:
I, [2015-03-11T23:47:33.992274 #6848] INFO -- : executing ["/srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn", "/srv/ourapp/current/config.ru", "-Dc", "/srv/ourapp/shared/config/unicorn.rb", {10=>#<Kgio::UNIXServer:/srv/ourapp/shared/sockets/unicorn.sock>}] (in /srv/ourapp/releases/a0e8b5df474ad5129200654f92a76af00a750f47)
I, [2015-03-11T23:47:36.504235 #6848] INFO -- : inherited addr=/srv/ourapp/shared/sockets/unicorn.sock fd=10
/srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:206:in `pid=': Already running on PID:19216 (or pid=/srv/ourapp/shared/pids/unicorn.pid is stale) (ArgumentError)
from /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:134:in `start'
from /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/bin/unicorn:126:in `<top (required)>'
from /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `load'
from /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `<main>'
E, [2015-03-11T23:47:36.519549 #19216] ERROR -- : reaped #<Process::Status: pid 6848 exit 1> exec()-ed
E, [2015-03-11T23:47:36.520296 #19216] ERROR -- : master loop error: Already running on PID:19216 (or pid=/srv/ourapp/shared/pids/unicorn.pid is stale) (ArgumentError)
E, [2015-03-11T23:47:36.520496 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:206:in `pid='
E, [2015-03-11T23:47:36.520650 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:404:in `reap_all_workers'
E, [2015-03-11T23:47:36.520790 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/lib/unicorn/http_server.rb:279:in `join'
E, [2015-03-11T23:47:36.520928 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/gems/unicorn-4.8.1/bin/unicorn:126:in `<top (required)>'
E, [2015-03-11T23:47:36.521115 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `load'
E, [2015-03-11T23:47:36.521254 #19216] ERROR -- : /srv/ourapp/shared/bundle/ruby/2.2.0/bin/unicorn:23:in `<main>’
当我检查时,确实,现在有 unicorn.pid 和 unicorn.pid.oldbin,都包含 19216。
什么原因可能导致这种情况的发生?
这是我们的unicorn.rb FWIW:
# Set your full path to application.
app_path = "/srv/ourapp/current"
# Set unicorn options
worker_processes 3
preload_app true
timeout 30
listen "/srv/ourapp/shared/sockets/unicorn.sock", :backlog => 64
# Spawn unicorn master worker for user deploy (group: deploy)
user 'deploy', 'deploy'
# Fill path to your app
working_directory app_path
# Should be 'production' by default, otherwise use other env
rails_env = ENV['RAILS_ENV'] || 'production'
# Log everything to one file
stderr_path "/srv/ourapp/shared/log/unicorn.log"
stdout_path "/srv/ourapp/shared/log/unicorn.log"
# Set master PID location
pid "/srv/ourapp/shared/pids/unicorn.pid"
before_exec do |server|
ENV["BUNDLE_GEMFILE"] = "#{app_path}/Gemfile"
end
before_fork do |server, worker|
ActiveRecord::Base.connection.disconnect!
sleep 10
old_pid = "#{server.config[:pid]}.oldbin"
if File.exists?(old_pid) && server.pid != old_pid
begin
Process.kill("QUIT", File.read(old_pid).to_i)
rescue Errno::ENOENT, Errno::ESRCH
# someone else did our job for us
end
end
end
after_fork do |server, worker|
ActiveRecord::Base.establish_connection
Sidekiq.configure_client do |config|
config.redis = { namespace: 'sidekiq' }
end
end