
RedisBackend should probably not raise errors #189

Open
youtux opened this issue Sep 10, 2020 · 22 comments

Comments

@youtux
Contributor

youtux commented Sep 10, 2020

I find it a bit inconsistent that the RedisBackend can raise errors while trying to get a value from the cache. In comparison, the MemcachedBackend doesn't raise any errors.
Dogpile.cache is about caching; I think it should make a best effort to get the value from the cache, but it should not raise exceptions when an error occurs while getting the value.

This example shows the issue.

from dogpile.cache import make_region
import sys

print(sys.version)

region = make_region().configure(
    'dogpile.cache.redis',
    arguments = {'url': "redis://127.0.0.1"},
)

@region.cache_on_arguments()
def get_something(n):
    return {'foo': 'bar'}


get_something(1)
get_something(1)

Just run it with two different Python versions that define a different pickle.HIGHEST_PROTOCOL, for example Python 2.7 and Python 3.6:

$ python3.6 script.py
3.6.11 (default, Jul 28 2020, 14:47:23) 
[GCC 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.32.62)]
$ python2.7 script.py
2.7.17 (default, Feb 13 2020, 16:04:56) 
[GCC 4.2.1 Compatible Apple LLVM 11.0.0 (clang-1100.0.33.17)]
Traceback (most recent call last):
  File "/Users/youtux/Library/Application Support/JetBrains/PyCharm2020.2/scratches/script.py", line 16, in <module>
    get_something(1)
  File "</tmp/env/lib/python2.7/site-packages/decorator.pyc:decorator-gen-1>", line 2, in get_something
  File "/tmp/env/lib/python2.7/site-packages/dogpile/cache/region.py", line 1360, in get_or_create_for_user_func
    key, user_func, timeout, should_cache_fn, (arg, kw)
  File "/tmp/env/lib/python2.7/site-packages/dogpile/cache/region.py", line 962, in get_or_create
    async_creator,
  File "/tmp/env/lib/python2.7/site-packages/dogpile/lock.py", line 187, in __enter__
    return self._enter()
  File "/tmp/env/lib/python2.7/site-packages/dogpile/lock.py", line 87, in _enter
    value = value_fn()
  File "/tmp/env/lib/python2.7/site-packages/dogpile/cache/region.py", line 902, in get_value
    value = self.backend.get(key)
  File "/tmp/env/lib/python2.7/site-packages/dogpile/cache/backends/redis.py", line 154, in get
    return pickle.loads(value)
ValueError: unsupported pickle protocol: 4

Dogpile cache version: 0.9.0 (but the same happens with the latest version, 1.0.2)

I can make a PR if you think the behaviour should be more best-effort.

@zzzeek
Member

zzzeek commented Sep 10, 2020

What does the memcached backend do? It returns None for an object in the cache that's in an invalid pickle format?

I can see the wisdom in both behaviors, and this should be a configurable option only, or perhaps there can be a built-in proxy backend that squashes all errors. I am not too crazy about "all errors suppressed by default": if memcached can't even get to the server, it returns None, is that right? Seems familiar. I'd hate that.

I think that the two backends having different default behaviors is an unfortunate side effect of the fact that dogpile interacts with services of very different philosophies, so I favor an option to make things behave differently, but not by default; that would be more disruptive at this point.

@youtux
Contributor Author

youtux commented Sep 11, 2020

> What does the memcached backend do? It returns None for an object in the cache that's in an invalid pickle format?

Yes, the memcached backend will swallow any exception; at most it can be configured to write a log to sys.stderr.

> I can see the wisdom in both behaviors, and this should be a configurable option only, or perhaps there can be a built-in proxy backend that squashes all errors. I am not too crazy about "all errors suppressed by default": if memcached can't even get to the server, it returns None, is that right? Seems familiar. I'd hate that.

I don't like that the memcache Client does that either. I think that any client (as in memcache.Client, redis.StrictRedis, etc.) should have strict behaviour (that is, raise all errors), and it should be up to the layer using that client to decide what to do with them.
In this case, I think that dogpile.cache.redis should be the one responsible for catching the errors, surfacing them appropriately (e.g. warnings.warn, logger.error, ...), and proceeding as if the value was not found.
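
Roughly something like this (a minimal sketch, not existing dogpile.cache behaviour; the TolerantRedisBackend name is hypothetical):

import logging

from dogpile.cache.api import NO_VALUE
from dogpile.cache.backends.redis import RedisBackend

log = logging.getLogger(__name__)

class TolerantRedisBackend(RedisBackend):
    def get(self, key):
        try:
            return super(TolerantRedisBackend, self).get(key)
        except Exception:
            # e.g. a redis ConnectionError, or the "unsupported pickle
            # protocol" ValueError from the traceback above
            log.exception("cache get failed for key %r", key)
            return NO_VALUE  # behave as if the key was not cached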

> I think that the two backends having different default behaviors is an unfortunate side effect of the fact that dogpile interacts with services of very different philosophies, so I favor an option to make things behave differently, but not by default; that would be more disruptive at this point.

I am not so sure it would be disruptive at all, since the use case for dogpile.cache is to apply caching to functions. Caching is an optimization, and IMO decorated functions should not fail if the caching backend is unavailable or has errors.
I apply caching to my web applications as an optimization. If the memcached/redis process dies, my whole application should not break; at most it should perform worse.

If we emit warnings when errors occur, users can still configure a filter that turns them into errors and keep the old behaviour. But I can't think of a reason why one would expect errors to bubble up when the cache backend is not available.
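
For example, if the backend emitted a dedicated warning class (CacheErrorWarning here is hypothetical), escalating it back to an exception is a one-liner with the standard warnings machinery:

import warnings

class CacheErrorWarning(UserWarning):
    pass

# Users who prefer the old raise-on-error behaviour can turn the
# warning back into an exception:
warnings.simplefilter("error", category=CacheErrorWarning)

warnings.warn("redis unreachable", CacheErrorWarning)  # now raises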

@zzzeek
Member

zzzeek commented Sep 11, 2020

Caching is an optimization, sure, but dogpile.cache has been around for years, and different developers certainly have different expectations. I'm +1 on an option to change the behavior, with no change in default behavior. If I were to change the default behavior, it would be that everything raises by default, which is certainly what most developers would expect. memcached is not making that possible, so we do the best we can and consider memcached to be the outlier.

@youtux
Contributor Author

youtux commented Sep 11, 2020

I understand that people may use the backends directly, so yes, in that case it would break compatibility.
I guess we can introduce a flag to disable "strict" mode for the backends that support it, so that we stay fully backwards compatible.

@zzzeek
Member

zzzeek commented Sep 11, 2020

"strict" mode means....raises? or warns ? or silently passes? is this flag at the backend level or at the Region level? Region level is fine but we'd have to build a generic exception catching mechanism for all of that, right?

@youtux
Contributor Author

youtux commented Sep 11, 2020

I would say that with strict=True (or passthrough_errors=True) the backend should raise (if the underlying library somehow informs us of an error).
Otherwise we should warnings.warn or logger.error.

I would have the flag at the backend level; I'm not sure why it would be at the Region level (you can always configure a region and forward the "strict" flag to the backend, right?).

@youtux
Contributor Author

youtux commented Sep 11, 2020

Probably "strict" is not the best wording, as the default should be off (in my opinion), while "strict" gives more the idea that it should be turned on.

@zzzeek
Member

zzzeek commented Sep 11, 2020

But that means "strict=True" silently fails on memcached then, right?

It seems like we really can't provide consistent behavior in both directions, which is why it seems like we can add a flag, or maybe just a proxy backend, that says "exceptions become None" or something like that. I really don't know how to do that correctly, though. This really seems like a custom proxy backend thing to me.
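
For what it's worth, the existing ProxyBackend hook already supports roughly this. A minimal sketch (ErrorSquashingProxy is a hypothetical name; only get/set are shown, the remaining methods would need the same treatment):

import logging

from dogpile.cache import make_region
from dogpile.cache.api import NO_VALUE
from dogpile.cache.proxy import ProxyBackend

log = logging.getLogger(__name__)

class ErrorSquashingProxy(ProxyBackend):
    def get(self, key):
        try:
            return self.proxied.get(key)
        except Exception:
            log.exception("cache get failed; treating %r as a miss", key)
            return NO_VALUE

    def set(self, key, value):
        try:
            self.proxied.set(key, value)
        except Exception:
            log.exception("cache set failed for %r; value not cached", key)

region = make_region().configure(
    "dogpile.cache.redis",
    arguments={"url": "redis://127.0.0.1"},
    wrap=[ErrorSquashingProxy],
)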

@youtux
Contributor Author

youtux commented Sep 11, 2020

> But that means "strict=True" silently fails on memcached then, right?

I would say that instantiating a backend with strict=True should not be allowed if the backend doesn't support it (memcached).
If the backend supports it, then we allow it to be configured.

Maybe we can just allow this boolean to be configured for RedisBackend, DBMBackend, MemoryBackend (although the only way the latter can fail is if the user is tampering with the internals).
For now the default can be strict=True (I still don't like "strict", but this is just to give the idea), which respects the current behaviour.
If being off is eventually deemed better, it can be defaulted to False (possibly in a major version).

@zzzeek
Member

zzzeek commented Sep 11, 2020

> > But that means "strict=True" silently fails on memcached then, right?
>
> I would say that instantiating a backend with strict=True should not be allowed if the backend doesn't support it (memcached).
> If the backend supports it, then we allow it to be configured.

Then what is the default value of "strict=True"? Different on the two backends? That's not a good idea. I think it should likely be something simpler such as "suppress_exceptions", available only on backends that are not the memcached backend. But this really doesn't give you the "every backend works the same" experience you're looking for, and honestly I don't think that's a realistic expectation.

> Maybe we can just allow this boolean to be configured for RedisBackend, DBMBackend, MemoryBackend (although the only way the latter can fail is if the user is tampering with the internals).
> For now the default can be strict=True (I still don't like "strict", but this is just to give the idea), which respects the current behaviour.
> If being off is eventually deemed better, it can be defaulted to False (possibly in a major version).

@youtux
Contributor Author

youtux commented Sep 11, 2020

"suppress_exceptions" sounds way better.

> Then what is the default value of "strict=True"? Different on the two backends? That's not a good idea.

If we want to keep backwards compatibility, "suppress_exceptions" has to default to False. If we want consistency, it should default to True.
It can initially be False, and become True in a later major version.

> But this really doesn't give you the "every backend works the same" experience you're looking for, and honestly I don't think that's a realistic expectation.

Why not? I would opt in to suppress_exceptions for the backends that support it.
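
Concretely, the opt-in could look something like this (hypothetical; no suppress_exceptions argument exists today):

from dogpile.cache import make_region

region = make_region().configure(
    "dogpile.cache.redis",
    arguments={
        "url": "redis://127.0.0.1",
        # proposed flag: log backend errors and treat them as misses
        "suppress_exceptions": True,
    },
)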

@zzzeek
Member

zzzeek commented Sep 11, 2020

Yes, you'd need to opt in on this.

The option would need some way to configure logging and all that; people will want different behaviors here.

Also, I have no resources to work on this myself, so you'd have to work on a PR with tests.

@youtux
Contributor Author

youtux commented Sep 11, 2020

I would be happy to work on it; I just wanted to know first whether it would be accepted.

@zzzeek
Member

zzzeek commented Sep 11, 2020

My main concern is the burden of implementing this on every backend. New backends should not have to implement this flag too, ideally. It seems there should be either something in the Region itself, or a generic proxy backend, that implements this.

@youtux
Contributor Author

youtux commented Sep 11, 2020

True, but what if we keep this option until the next major release, and after that exceptions are just suppressed (still logged, of course)?

@youtux
Contributor Author

youtux commented Sep 11, 2020

The question is more about what the best behaviour for the backends is: should they raise or not when the cache backend is unavailable or has errors?
The path to get there is not so important.

@zzzeek
Member

zzzeek commented Sep 11, 2020

The backends should absolutely raise by default. I think a pluggable solution, in the style of many other dogpile.cache elements, where an "exception handler" callable can be plugged into the region, would be best. That way no changes need to occur to any backends, and when users don't like the particular exceptions we are logging (or not logging, or how they are being logged), they can change it, since the mechanism by which it works is transparent and pluggable.
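
Something along these lines, as a sketch only (the exception_handler argument and the handler signature are hypothetical; nothing like this exists yet):

import logging

from dogpile.cache.api import NO_VALUE

log = logging.getLogger(__name__)

def log_and_miss(exc, key, operation):
    # Returning NO_VALUE would tell the region to treat the failure as
    # a cache miss; re-raising would preserve today's behavior.
    log.exception("cache %s failed for %r", operation, key)
    return NO_VALUE

# region = make_region().configure(
#     "dogpile.cache.redis",
#     arguments={"url": "redis://127.0.0.1"},
#     exception_handler=log_and_miss,  # hypothetical argument
# )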

@zzzeek
Member

zzzeek commented Sep 11, 2020

other examples of this in dogpile:

- RegionInvalidationStrategy
- key_mangler
- function_key_generator
- async_creation_runner

etc.

I can also see people wanting an exception handler that does a SQLAlchemy-like thing where it tries to relate the exception classes from different backends to each other.
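
A sketch of that translation idea (all of these wrapper classes are hypothetical; none exist in dogpile.cache):

import redis.exceptions

class CacheBackendError(Exception):
    """Common base for translated backend errors."""

class CacheConnectionError(CacheBackendError):
    pass

class CacheSerializationError(CacheBackendError):
    pass

def translate_exception(exc, key, operation):
    # Map backend-specific errors onto the common hierarchy, the way
    # SQLAlchemy wraps DBAPI errors.
    if isinstance(exc, redis.exceptions.ConnectionError):
        raise CacheConnectionError(str(exc))
    if isinstance(exc, ValueError):  # e.g. "unsupported pickle protocol"
        raise CacheSerializationError(str(exc))
    raise exc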

@youtux
Contributor Author

youtux commented Sep 13, 2020

Right, I understand. Should I start working on a PR with this new argument for the Region?

@zzzeek
Member

zzzeek commented Sep 13, 2020

sure! pluggable impl makes me non-worried. :)

@jvanasco
Member

jvanasco commented Nov 3, 2020

FYI, the Redis backend also raises a lot of annoying exceptions on lock timeouts and race conditions with expensive processes.

I need to find time to make a new test case for this one day, but I think the situation was: a first process holds a lock that times out during generation, a second process acquires a new lock, then the first process tries to release the lock it doesn't own (Exception!), and then the second process tries to release a lock that no longer exists (Exception!).
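
Something like this sketch should show the first half of that race with redis-py's Lock directly (the exact exception class varies by redis-py version; LockError is the base):

import time

import redis
from redis.exceptions import LockError

r = redis.Redis.from_url("redis://127.0.0.1")

lock = r.lock("dogpile_lock", timeout=1)  # lock auto-expires after 1 second
lock.acquire()
time.sleep(2)  # "generation" outlives the lock; another process can now take it

try:
    lock.release()  # raises: we no longer own the lock
except LockError as exc:
    print("release failed:", exc)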

The problem was due to somewhat incompatible implementation details of dogpile and Redis. The underlying Redis library loves to raise exceptions, and the points where they bubble up within dogpile require one of the two projects to be altered; it can't be handled properly by anyone simply using dogpile.cache.

I wish I could remember which mailing lists I was talking about this on. There were a handful of other people who kept battling the same issues as me.

@jvanasco
Member

jvanasco commented Sep 8, 2023

This was the ticket I once opened with redis. I should have noted it here earlier.
redis/redis-py#623

I never got around to making the test case, but I do recall that the overall issue stems from how/where redis is leveraged in the dogpile code.
