Delayed jobs not executing #657

Open
abeltje1 opened this issue Oct 30, 2018 · 8 comments
abeltje1 commented Oct 30, 2018

Hi all!

First of all: thanks for this great project!

I've got resque (1.27.4) and resque-scheduler (4.3.1) up and running with Rails (5.2.1). General background jobs are working fine, and it seemed delayed jobs were too. They are added to the delayed queue (I can verify this through the resque-web UI) and are correctly removed from the UI when their time comes. The only problem is that they never actually get executed, and the logs show nothing at all. I should also note that I've tried this with different jobs (including a test job that only writes a single log line), with ActiveJob as well as standalone Resque. When I schedule a job in the past, it does get executed (same if I hit 'execute now' in the UI).
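
For anyone trying to reproduce this, a minimal sketch of a log-only test job scheduled through resque-scheduler might look like the following. `LogOnlyJob` and its message are hypothetical names, not taken from the reporter's app. `Resque.enqueue_in` is the resque-scheduler call that delays a plain Resque job by a number of seconds. The enqueue is guarded so the snippet also loads where Resque is not available.

```ruby
require 'time'

# Hypothetical test job. A plain Resque job is a class with a queue
# name in @queue and a class-level perform method.
class LogOnlyJob
  @queue = :default

  # Writes one log line and returns it, so the effect is easy to spot.
  def self.perform(message)
    line = "[LogOnlyJob] #{Time.now.utc.iso8601} #{message}"
    puts line
    line
  end
end

# Only talks to Redis when resque-scheduler has been loaded.
if defined?(Resque) && Resque.respond_to?(:enqueue_in)
  # Ask the scheduler to push the job onto its queue 60 seconds from now.
  Resque.enqueue_in(60, LogOnlyJob, 'hello from the delayed queue')
end
```

If the line never shows up in the worker log after the delay has passed, the job was either never moved onto a queue that a worker listens to, or it was picked up and failed before `perform` ran.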

I'm not entirely sure what information I should provide, so for completeness I'll provide my complete resque rake file.

require 'resque/tasks'
require 'resque/scheduler/tasks'


# Start a worker with proper env vars and output redirection
def run_worker(queue, count = 1)
  puts "Starting #{count} worker(s) with QUEUE: #{queue}"
  ops = {:pgroup => true, :err => [(Rails.root + "log/workers_error.log").to_s, "a"], 
                          :out => [(Rails.root + "log/workers.log").to_s, "a"]}
  env_vars = {"QUEUE" => queue.to_s}
  count.times {
    ## Using Kernel.spawn and Process.detach because regular system() call would
    ## cause the processes to quit when capistrano finishes
    pid = spawn(env_vars, "rake resque:work", ops)
    Process.detach(pid)
  }
end

# Start a scheduler, requires resque_scheduler >= 2.0.0.f
def run_scheduler
  puts "Starting resque scheduler"
  env_vars = {
    "BACKGROUND" => "1",
    "PIDFILE" => (Rails.root + "tmp/pids/resque_scheduler.pid").to_s,
    "VERBOSE" => "1"
  }
  ops = {:pgroup => true, :err => [(Rails.root + "log/scheduler_error.log").to_s, "a"],
                          :out => [(Rails.root + "log/scheduler.log").to_s, "a"]}
  pid = spawn(env_vars, "rake environment resque:scheduler", ops)
  Process.detach(pid)
end

namespace :resque do
  task :setup => :environment

  desc "Restart running workers"
  task :restart_workers => :environment do
    Rake::Task['resque:stop_workers'].invoke
    Rake::Task['resque:start_workers'].invoke
  end
  
  desc "Quit running workers"
  task :stop_workers => :environment do
    pids = Array.new
    Resque.workers.each do |worker|
      pids.concat(worker.worker_pids)
    end
    if pids.empty?
      puts "No workers to kill"
    else
      syscmd = "kill -s QUIT #{pids.join(' ')}"
      puts "Running syscmd: #{syscmd}"
      system(syscmd)
    end
  end
  
  desc "Start workers"
  task :start_workers => :environment do
    run_worker("*", 2)
    run_worker("high", 1)
  end

  desc "Restart scheduler"
  task :restart_scheduler => :environment do
    Rake::Task['resque:stop_scheduler'].invoke
    Rake::Task['resque:start_scheduler'].invoke
  end

  desc "Quit scheduler"
  task :stop_scheduler => :environment do
    pidfile = Rails.root + "tmp/pids/resque_scheduler.pid"
    if !File.exist?(pidfile)
      puts "Scheduler not running"
    else
      pid = File.read(pidfile).to_i
      syscmd = "kill -s QUIT #{pid}"
      puts "Running syscmd: #{syscmd}"
      system(syscmd)
      FileUtils.rm_f(pidfile)
    end
  end

  desc "Start scheduler"
  task :start_scheduler => :environment do
    run_scheduler
  end

  desc "Reload schedule"
  task :reload_schedule => :environment do
    pidfile = Rails.root + "tmp/pids/resque_scheduler.pid"

    if !File.exist?(pidfile)
      puts "Scheduler not running"
    else
      pid = File.read(pidfile).to_i
      syscmd = "kill -s USR2 #{pid}"
      puts "Running syscmd: #{syscmd}"
      system(syscmd)
    end
  end
end

My config file actually has nothing in it, since I'm not using the recurring feature yet.

I understand this question might seem a little vague, but I'd greatly appreciate it if someone could point me in the right direction. I've been looking through various docs, issues & blogs and can't seem to find anyone struggling with the same issues, which leads me to believe I'm overlooking something rather simple. If I'm missing some info, do tell. Thanks in advance!

Cheers!

ruckus commented Dec 11, 2018

I'm suffering from a similar issue. I have a fixed config/resque_schedule.yml configuration and sometimes - according to the resque-web UI - some delayed jobs do not get run. But most others do. No rhyme or reason. Generally restarting the resque-scheduler task a couple of times solves it.

@rafaelfranca
Member

Can you please provide a sample application that reproduces the error?

ruckus commented Dec 12, 2018

Hi @rafaelfranca, thanks for the response. Unfortunately this is part of a much larger application (I have 70 delayed/scheduled jobs), so it's impossible to extract it into a distinct sample app. I totally understand that this doesn't give you much info to help debug or diagnose, and that this is the worst kind of issue report.

Anyway, here is some more information. I'm running resque-scheduler on 3 instances and thus leveraging the fail-over aspects. One theory is that none of the 3 instances thinks it is the master, so none of them schedules jobs. I looked at the code to see how the instances elect the master, and it appears I can just look at the Redis key resque_scheduler_master_lock. When I do this, its type is none and its value is empty ([]). Here is an image:

https://cl.ly/03d1d1f153be

Would this be indicative of an issue? From my understanding, I should expect to see a value under that key. Does this indicate that I don't currently have a master?
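
A rough way to check this from a console is sketched below. It assumes the lock is stored as a plain string key with an expiry, so a missing key would simply report type `none`. The helper name `describe_lock` is hypothetical, and the exact key name (including any namespace prefix) is an assumption that should be confirmed against your own Redis instance. The `redis` argument is any object that responds to `type`, `get` and `ttl` the way a redis-rb client does.

```ruby
# Summarise the state of a lock key.
# `redis` is duck-typed: it only needs type, get and ttl.
def describe_lock(redis, key)
  kind = redis.type(key)
  # Redis reports "none" for keys that do not exist (or have expired).
  return "#{key}: missing (no process currently holds the lock)" if kind == 'none'
  return "#{key}: unexpected type #{kind}" unless kind == 'string'

  "#{key}: held by #{redis.get(key)}, expires in #{redis.ttl(key)}s"
end

# Only runs against a real connection when Resque is loaded.
# The key name here is an assumption; adjust it to what your instance uses.
if defined?(Resque)
  puts describe_lock(Resque.redis, 'resque_scheduler_master_lock')
end
```

Because such a lock expires and is re-acquired periodically, a single `none` reading may only mean the check landed between two acquisitions. Polling it a few times gives a better picture.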

@chriskasza

I created a test project to demonstrate this issue. I'm finding that both delayed and scheduled jobs are not executing.

https://github.com/chriskasza/resque-test

This is the first time I've used resque-scheduler, so there is the possibility that I have misconfigured something...

ruckus commented Dec 18, 2018

I was running pretty old versions of resque (1.25.1) & resque-scheduler (3.0.0) and after upgrading to the latest of each I'm not seeing any more non-executed scheduled jobs.

@chriskasza

I'm using the latest version of resque-scheduler (v4.3.1) as well. But I'm also using the latest Rails (v5.2.2) and Ruby (v2.5.3). Maybe it has something to do with that?

Also, I noticed that #647 sounds similar to this issue.

@salgadobreno

This should be related to https://stackoverflow.com/a/39629389/586033 and #613 (possible workarounds there), besides the issue you mentioned, @chriskasza.

Basically, resque-scheduler's enqueueing doesn't seem to be compatible with ActiveJob. Worst of all, jobs get enqueued and are cleared from the queue by workers, but nothing is actually executed.
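
To illustrate where the mismatch might come from, here is a simplified sketch of the two payload shapes a worker can receive. It assumes that ActiveJob's Resque adapter enqueues a wrapper class (`ActiveJob::QueueAdapters::ResqueAdapter::JobWrapper`) with the serialized job as its single argument, while a plain Resque enqueue names the job class directly. The helper names are hypothetical, and the serialized hash is trimmed to a few fields; a real one carries more (job id, locale, and so on).

```ruby
require 'json'

# Payload shape for a plain Resque job: the worker calls
# klass.perform(*args) on the named class.
def plain_payload(klass_name, *args)
  { 'class' => klass_name, 'args' => args }
end

# Simplified payload shape assumed for a job enqueued through ActiveJob:
# the class is the adapter's wrapper and the real job class travels
# inside the first argument.
def active_job_payload(job_class_name, *arguments, queue_name: 'default')
  plain_payload(
    'ActiveJob::QueueAdapters::ResqueAdapter::JobWrapper',
    { 'job_class' => job_class_name,
      'queue_name' => queue_name,
      'arguments' => arguments }
  )
end

puts JSON.pretty_generate(plain_payload('LogOnlyJob', 'hello'))
puts JSON.pretty_generate(active_job_payload('LogOnlyJob', 'hello'))
```

If a scheduled entry names an ActiveJob class directly, the worker would receive the first shape for a class that has no class-level `perform`, which would be consistent with jobs leaving the queue without running.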

@seb-sykio

Hi,
I saw the workaround, but it does not work with Resque.enqueue_at.
Did anyone find another solution?
Thanks


6 participants