Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ActiveJob with retry_on— retries indefinitely despite "attempts" option #553

Closed
alexford opened this issue Feb 13, 2019 · 6 comments
Closed

Comments

@alexford
Copy link

alexford commented Feb 13, 2019

Given the following ActiveJob class in Rails 5.2.2:

class TestRetries < ApplicationJob
  class RetryError < StandardError; end
  retry_on RetryError, attempts: 2

  def perform
    fail RetryError
  end
end

I would expect the job to run in Shoryuken only twice, but it runs indefinitely, repeating at each visibility_timeout period. I am seeing this in the logs from ActiveJob, but the message remains in SQS:

Stopped retrying TestRetries due to a TestRetries::RetryError, which reoccurred on 2 attempts. The original exception was nil.

discard_on works as expected—the job is only run once.

Is there some bit of configuration I'm missing here, or is this not expected to work?

ActiveJob's docs for retry_on: https://api.rubyonrails.org/v5.2.2/classes/ActiveJob/Exceptions/ClassMethods.html#method-i-retry_on

@alexford alexford changed the title ActiveJob: retry_on retries indefinitely despite "attempts" option ActiveJob with retry_on: retries indefinitely despite "attempts" option Feb 13, 2019
@alexford alexford changed the title ActiveJob with retry_on: retries indefinitely despite "attempts" option ActiveJob with retry_on— retries indefinitely despite "attempts" option Feb 13, 2019
@phstc
Copy link
Collaborator

phstc commented Feb 13, 2019

Hi @alexford

retry_on and discard_on are not supported in the current Shoryuken adapter (unless it is something Active Job manages without using adapters). Are they supported by the other adapters already? These options are new to me, I don't know how they work with the adapters yet.

For SQS, I believe that retry_on would be better implemented with dead letter queues.

@alexford
Copy link
Author

alexford commented Feb 13, 2019

Thanks for the quick reply, @phstc. Looking now I see that some of the built-in adapters support this, and some do not: https://api.rubyonrails.org/classes/ActiveJob/QueueAdapters.html

I notice that with discard_on, the exception is captured, so I assume Shoryuken auto-deletes as intended. That would explain why that one does work.

With retry_on, ActiveJob does re-raise the exception every time, even if attempts has passed, so Shoryuken does not delete the job from SQS. I am not sure how adapters are expected to handle this case, or how ActiveJob is keeping track of retries in the first place. I will look into it a little bit when I get some time. In the meantime I can make adjustments in my worker to account for this.

@alexford
Copy link
Author

alexford commented Feb 13, 2019

There is a good workaround, though: retry_on allows a block to be passed for handling the exception once the retry attempts have been exhausted. Rescuing the exception in this block allows Shoryuken's auto_delete to take care of the job.

retry_on RetryableError, attempts: 2 do |_job, exception|
  Raven.capture_exception(exception) # for example, if using Sentry, or log however you want
end

@phstc
Copy link
Collaborator

phstc commented Feb 21, 2019

@alexford that's a great solution - thanks for sharing it.

The auto delete enabled in the AJ adapter, does not delete the message if the worker raises an error.

retry_on raises the error up if the messages reach the number of attempts. You need to catch the error in order to get Shoryuken deleting the message.

A couple of options to make retry_on to work with Shoryuken/SQS:

a) (what you suggested)

retry_on ErrorYourPerformMayRaise, attempts: N do |_job, _exception|
  # Log error, etc.
  # Do not raise the error up, otherwise, Shoryuken will not auto-delete the message
  # Must implement this block, otherwise, 
  # Active Job will re-raise errors and Shoryuken won't delete the messages
end

b)

retry_on ErrorYourPerformMayRaise, attempts: N

rescue_from ActiveJob::DeserializationError do |_exception|
  # Log error, etc.
  # Do not raise the error up, otherwise, Shoryuken will not auto-delete the message
  # Must implement this block, otherwise, 
  # Active Job will re-raise errors and Shoryuken won't delete the messages
end

c)

retry_on ErrorYourPerformMayRaise, attempts: N

discard_on ActiveJob::DeserializationError 

d)

Configure a Dead Letter queue with maxReceiveCount set to 1. With that, when a message reaches attempts it will go the DL queue.

Updated wiki: https:/phstc/shoryuken/wiki/Rails-Integration-Active-Job#how-to-use-retry_on.

@phstc phstc closed this as completed Feb 21, 2019
@jmammen
Copy link

jmammen commented Sep 20, 2019

Hi,

a) is not working for me. I configured my Dead Letter queue accordingly and set the maxReceiveCount to 1. My job is executed N times and then the retry_on block is executed but the job never makes it way into the Dead Letter queue. When I remove the block and basically do it like in c) its working. Is that a bug? Or do I do some mistakes? Using ActiveJob with the Shoryuken adapter.

app/jobs/my_job.rb

class MyJob < ApplicationJob
  queue_as 'my_queue'

  retry_on MyException1, MyException2, attempts: 3

  def perform(message)
    myService.perform_service(message)
  end
end

config/shoryuken.yml

concurrency: 15
queues:
  - [my_queue, 1]

config/application.rb

module MyModule
  config.active_job.queue_adapter = :shoryuken
end

@phstc
Copy link
Collaborator

phstc commented Sep 24, 2019

Hi @jmammen
For a) are you also passing a block?

You should use this ✅

retry_on MyException1, MyException2, attempts: 3 do |_job, _exception|
  # Log error, etc.
  # Do not raise the error up, otherwise, Shoryuken will not auto-delete the message
  # Must implement this block, otherwise, 
  # Active Job will re-raise errors and Shoryuken won't delete the messages
end

Instead of just 🚫

retry_on MyException1, MyException2, attempts: 3

Otherwise, Active Job will re-raise errors and Shoryuken won't delete the messages. As pointed out by @alexford.

With retry_on, ActiveJob does re-raise the exception every time, even if attempts has passed, so Shoryuken does not delete the job from SQS

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants