
reader: bump default max_tries from 5 to 65535 (max possible) #134

Closed · wants to merge 1 commit

Conversation

the-gigi

The default should be unlimited (or in this case the max possible, which is practically unlimited).
In many cases the application needs better control over what to do when messages are requeued
too many times. Often, there will be different strategies for different messages. For example,
some messages should be discarded immediately if they fail and others should be requeued
many times. While the default is just a default that can be changed, it is a surprise that can catch
people who didn't read the docs carefully (happened to me) and cause quiet message loss.

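For readers unfamiliar with the knob under discussion: `max_tries` bounds how many delivery attempts a message gets before the reader stops requeueing it and calls `giving_up()`. A rough pure-Python sketch of that decision (illustrative only, not pynsq's actual implementation; the function name is made up):

```python
def next_action(attempts: int, max_tries: int) -> str:
    """Decide what a reader should do with a message that failed processing.

    `attempts` is how many times the message has been delivered so far;
    `max_tries` is the configured ceiling (pynsq's default at the time
    of this PR was 5).
    """
    if attempts >= max_tries:
        # At this point the reader calls giving_up() and finishes the
        # message, silently dropping it unless giving_up() archives it.
        return "give_up"
    return "requeue"

# With the default of 5, the fifth failed attempt drops the message:
assert next_action(attempts=5, max_tries=5) == "give_up"
assert next_action(attempts=3, max_tries=5) == "requeue"
```

This is why bumping the default to 65535 is "practically unlimited": a message would have to fail delivery 65535 times before being dropped.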
@mreiferson
Member

@the-gigi thanks for the PR.

I don't think the answer is to default to "unlimited". We've tried really hard with NSQ and client libraries to bound everything and choose sane and reasonable defaults for a wide variety of use cases.

What else can we do to make this less surprising?

/cc @jehiah

@mreiferson mreiferson changed the title bump default max_tries on the Reader from 5 to 65535 (max possible) reader: bump default max_tries from 5 to 65535 (max possible) Nov 21, 2015
@the-gigi
Author

Hi Matt,
Thanks for the quick response and the explanation. I guess if you don't want to default to "unlimited", the only other non-surprising behavior is to have no default at all and require users to explicitly specify max_tries. That would be very reasonable too. Throwing away messages quietly is a very strong action on the part of a library; this really should be an application-level decision.

In our case it led to serious data loss. We have a processing queue where, if a message fails to process, we divert it to a "retry" queue. We periodically try to process all messages in the "retry" queue, and if processing fails again the messages are requeued back onto it. This way, transient problems don't affect the main work queue, but we don't lose messages and can process or examine them later. If the depth of the "retry" queue grows too much, we take a look and either fix the issues that cause messages to fail or discard them. The unexpected limit of 5 tries caused us to lose many messages from the "retry" queue: we had a known problem we had to fix, but it wasn't urgent, and suddenly the "retry" queue drained completely.

@jehiah
Member

jehiah commented Nov 22, 2015

@the-gigi I think @mreiferson and I understand where you are coming from, and I appreciate your time in sharing your thoughts here.

On a tangential note, I'm curious to hear whether you use the built-in backoff? It's typical (and intentional) that backoff slows processing to a trickle when you have a transient failure, and in turn that also means you very slowly work through your retry count.
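For context on the built-in backoff being referenced: pynsq slows consumption after consecutive failures using an exponential backoff timer. A simplified, illustrative sketch of capped exponential backoff (not pynsq's exact algorithm; the base and cap values here are assumptions):

```python
def backoff_interval(consecutive_failures: int,
                     base: float = 1.0,
                     max_interval: float = 120.0) -> float:
    """Capped exponential backoff: 1s, 2s, 4s, ... up to max_interval.

    While backed off, a reader processes roughly one message per
    interval, which is why retry counts are consumed very slowly
    during a transient failure.
    """
    if consecutive_failures <= 0:
        return 0.0
    return min(base * 2 ** (consecutive_failures - 1), max_interval)
```

Under this scheme, even a small max_tries can take a long wall-clock time to exhaust once the reader is backed off.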

I think there are probably two tangible changes we can make here:

  1. Increase the default max retry. While I think there should continue to be a default limit, I'd be happy with something < 25.

  2. Update the default implementation of giving_up() to archive the message body. This is what I do in my use of pynsq, and I can't think of a good reason we shouldn't have that as the default implementation. It could write to CWD by default, and we could add a new argument to control that location.
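A minimal sketch of the archiving idea in point 2, appending the failed message's body to a file in the CWD so it can be inspected or replayed instead of being silently dropped. The helper name and file layout are assumptions for illustration, not pynsq API:

```python
import os

def archive_message(body: bytes, message_id: str,
                    archive_dir: str = ".") -> str:
    """Append a failed message to an archive file and return its path.

    Intended to be called from a Reader's giving_up() override, e.g.:

        def giving_up(self, message):
            archive_message(message.body, message.id)
    """
    path = os.path.join(archive_dir, "nsq_giving_up.archive")
    with open(path, "ab") as f:
        # One record per line: message id, tab, raw body.
        f.write(message_id.encode() + b"\t" + body + b"\n")
    return path
```

An argument controlling `archive_dir` (as suggested above) would let operators steer the archive away from the root filesystem.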

@ploxiln
Member

ploxiln commented Nov 22, 2015

convenient reference for others: https://pynsq.readthedocs.org/en/latest/reader.html#nsq.Reader.giving_up

I agree, it would be nice to be able to more conveniently enable (or disable) archiving of failed messages.

Maybe it does make sense to turn it on by default: if you don't want the archiving you can turn it off, and if you're not aware of the possibility you'll notice the archived message files piling up, quickly figure out what's going on, and specify the desired behavior. But I expect the project will get at least one complaint about filling up a root filesystem (or using all its inodes) 😁

@the-gigi
Author

Hi @jehiah,
Yes, I use the built-in backoff. As far as max_tries and giving_up() go, there is no real problem: I can set max_tries myself, and I can override giving_up() too. The default behavior matters when people like me don't read the docs carefully, and your suggestion to provide a default that will not quietly lose messages addresses that, I believe. It still seems strange, though. The messages are already stored in a queue, and the application has code that subscribes to and processes them. If the application chose to explicitly requeue them, why remove them by default and store them in the CWD (how? in a file?), where you'll need to write different code to access them?

Anyway, I really appreciate NSQ. This is just my opinion; now that I know about the default behavior I'm all set. This is all about the next person who might be surprised. Thanks for all the hard work and fantastic software.

@mreiferson
Member

see #135

@mreiferson mreiferson closed this Nov 29, 2015