Concurrent usage of dbm causes cache to go stale #104

Open
sqlalchemy-bot opened this issue Jul 28, 2016 · 7 comments
Labels
bug Something isn't working

Comments

@sqlalchemy-bot

Migrated issue, originally created by David Gardner ()

Ran into this issue in production where users were reporting items in the cache being an hour old for a cache region configured with a one minute expiration time. Problem first observed with version 0.5.7, and reproduced with version 0.6.1.

I was able to reproduce the issue in a simple test of a function that returns datetime.now() (dogpile-test.py). If I run about 10 concurrent instances of the script, after a minute or so they start reporting stale data.

However I noticed as a work-around if I use the MutexLock class from the lock_factory documentation, I don't run into the problem:
http://dogpilecache.readthedocs.io/en/latest/api.html?highlight=dogpile.cache.dbm#dogpile.cache.backends.file.DBMBackend.params.lock_factory


Attachments: dogpile-test.py
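The attached dogpile-test.py did not survive the issue migration. A minimal reconstruction of what such a test might look like, assuming the dbm backend with a one-minute expiration (the filename, the `is_stale` helper, and the staleness check are my own, not from the original script):

```python
import datetime


def is_stale(cached_value, now, max_age_seconds=60):
    """Return True when a cached datetime is older than the region's expiry."""
    return (now - cached_value).total_seconds() > max_age_seconds


def main():
    # dogpile.cache is imported inside main() so the helper above
    # stays usable even without the library installed.
    from dogpile.cache import make_region

    region = make_region().configure(
        "dogpile.cache.dbm",
        expiration_time=60,  # one-minute expiry, as in the report
        arguments={"filename": "/tmp/dogpile-test.dbm"},
    )

    @region.cache_on_arguments()
    def cached_now():
        return datetime.datetime.now()

    i = 0
    while True:
        i += 1
        now = datetime.datetime.now()
        dt = cached_now()
        if is_stale(dt, now):
            print("stale: cached value is %s old" % (now - dt))


# main() is deliberately not invoked here; to attempt a reproduction,
# run it from roughly ten processes at the same time.
```

The stale values the reporter describes would show up as values far older than the 60-second expiration, despite every process sharing the same dbm file.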

@sqlalchemy-bot

David Gardner () wrote:

Looked into this a bit more. Since my issue is with concurrent processes, not concurrent threads, I think all the MutexLock was really doing for me was bypassing the default FileLock implementation.

In my case anydbm is picking dbhash, which has its own locking. I was also able to avoid the problem by simply setting rw_lockfile=False.
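For reference, both workarounds mentioned so far can be expressed as region configuration. A sketch, assuming the dbm backend's documented `lock_factory` and `rw_lockfile` arguments (the file paths are placeholders, and the MutexLock class follows the example in the dogpile.cache docs linked above):

```python
import threading


class MutexLock:
    """In-process mutex lock, per the dogpile.cache lock_factory docs.

    It satisfies the backend's acquire/release lock interface but only
    serializes threads within one process, which effectively bypasses
    the default file-based lock.
    """

    def __init__(self, filename):
        self.mutex = threading.Lock()

    def acquire(self, wait=True):
        return self.mutex.acquire(wait)

    def release(self):
        self.mutex.release()


def make_workaround_regions():
    # Imported inside the function so MutexLock is usable standalone.
    from dogpile.cache import make_region

    # Workaround 1: replace the default FileLock with MutexLock.
    mutex_region = make_region().configure(
        "dogpile.cache.dbm",
        expiration_time=60,
        arguments={
            "filename": "/tmp/cache.dbm",
            "lock_factory": MutexLock,
        },
    )

    # Workaround 2: disable the read/write lockfile entirely and rely
    # on the dbm implementation's own locking (e.g. dbhash).
    nolock_region = make_region().configure(
        "dogpile.cache.dbm",
        expiration_time=60,
        arguments={
            "filename": "/tmp/cache2.dbm",
            "rw_lockfile": False,
        },
    )
    return mutex_region, nolock_region
```

Note that neither workaround is a general fix: MutexLock provides no cross-process exclusion, and rw_lockfile=False leaves correctness up to the underlying dbm module.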

@sqlalchemy-bot

Michael Bayer (zzzeek) wrote:

well, lockfiles can be weird. I can't reproduce any problem. I added this:

i += 1
if i % 10000 == 0:
    print("%s iterations.  current age is %s" % (i, now - dt))

ran it in three windows, I see:



0.6.2
10000 iterations.  current age is 0:00:00.026704
20000 iterations.  current age is 0:00:04.107627
30000 iterations.  current age is 0:00:04.853445
40000 iterations.  current age is 0:00:04.978281
50000 iterations.  current age is 0:00:01.062407
60000 iterations.  current age is 0:00:04.126943
...

are you in one of the many danger areas for lockfiles? e.g. Windows, NFS shares, weird file systems, containers, etc. ?

@sqlalchemy-bot

David Gardner () wrote:

It's a typical Linux setup on an ext4 file system. I added your iteration counter and ran my test, launching 10 instances of the script at nearly the same time with:


gnome-terminal -e ./dogpile-test.py --tab --tab --tab --tab --tab --tab --tab --tab --tab --tab

Around iteration 50000 or so the script starts complaining about the age of the cache.

@sqlalchemy-bot

Michael Bayer (zzzeek) wrote:

shrugs, lockfiles. is there evidence that a lockfile is being held open permanently?

@sqlalchemy-bot

David Gardner () wrote:

How would I check that?

@sqlalchemy-bot

Michael Bayer (zzzeek) wrote:

hmmm, probably need to put more debugging into the lock code itself: set a timer when the file lock is acquired, somehow have it print out how long it's been held, then perhaps shut down other processes and see if that one keeps holding it open. another way would be to use linux commands; there's a util, lslocks, that you can try which would show who's locking it.

lockfiles are just like this, they have weird problems.
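The timer idea above can be sketched as a wrapper around any acquire/release-style lock; this is a standard-library-only illustration, not dogpile.cache code (the warn_after threshold is arbitrary, and wiring it into the backend via lock_factory is an assumption):

```python
import time


class TimedLock:
    """Wrap an acquire/release-style lock and report long holds."""

    def __init__(self, inner, warn_after=5.0):
        self.inner = inner
        self.warn_after = warn_after  # seconds; arbitrary threshold
        self.acquired_at = None

    def acquire(self, wait=True):
        result = self.inner.acquire(wait)
        if result:
            # Record when the lock was taken, using a monotonic clock
            # so the measurement survives wall-clock adjustments.
            self.acquired_at = time.monotonic()
        return result

    def release(self):
        held = time.monotonic() - self.acquired_at
        if held > self.warn_after:
            print("lock held for %.1f seconds" % held)
        self.acquired_at = None
        self.inner.release()
```

One plausible way to hook it in would be a lock_factory along the lines of `lambda filename: TimedLock(FileLock(filename))`, wrapping the backend's default file lock; `lslocks` from util-linux is the external-observation alternative.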

@sqlalchemy-bot sqlalchemy-bot added the bug Something isn't working label Nov 24, 2018

zzzeek commented Nov 25, 2018

would need to revisit this and try running the test case again to try reproducing.
