New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Race condition in FileSystemCache on Windows #854
Comments
|
I agree that my workaround (as I called it) is not a good solution. I was reluctant to even mention it. I'm just not familiar enough with the Windows filesystem to say if a proper solution is feasible. It can be difficult to see race conditions just looking at code (especially if some of that code is internal to Windows). I think the script I just left HERE demonstrates that werkzeug.posixemulation.rename is not atomic. It repeatedly calls rename in one thread, and reads from the rename target file in the other thread. It intermittently outputs "Permission denied" exceptions from the thread doing the reading. I've run it on 3 different boxes using python 2.7.10 on 64 bit Windows Server 2012, 2.7.1 on 32 bit Windows Server 2003, and 2.7.2 on 64 bit Windows Server 2008, all with the same intermittent exceptions. |
Sorry, I somehow missed that you're running on Windows. I can reproduce this problem on Windows 7.
ISTM that Windows is just acquiring a lock on the target name instead of actually making an atomic operation. That would explain the "permission denied" error: Acquisition of the lock failed. However, this is just pure speculation. Unfortunately We could include your workaround in |
Yeah, it results in a cache miss, not a crash. And ultimately, this caused a problem in the Flask-Session extension. Maybe I should bring it up over there instead. Its 'filesystem' session type is backed by FileSystemCache. (If two threads from two browser connections pull up the session at the same time, and one has a cache miss, that empty session gets re-saved, logging the user out.) Not to say that sessions should have guaranteed persistence. But getting logged out randomly isn't good. Anyway, I can close this if you don't think it's worth addressing here at the cache level. |
close it via #1249 |
Python 2.7.2
Werkzeug 0.11.2
In FileSystemCache, the set() method creates a temporary file of the new cache entry and then replaces the old file with it. Under Windows, if get() is called from another thread at the same time as set's file replacement (implemented by werkzeug.posixemulation.rename), a "permission denied" IOError can occur. It looks like werkzeug.posixemulation.rename does its best to try to implement an atomic-like rename/replace for Windows, but apparently there's still a short moment where this can happen.
Maybe it's preferable (if technically feasible) to make werkzeug.posixemulation.rename 100% atomic. But for what it's worth, I made the following change for myself as a workaround, in werkzeug/contrib/cache.py, line 737 (in the latest version). This is the original version:
In my workaround I wrapped it in a loop that catches IOError exceptions and just pauses a moment (needs sleep() imported from time):
Whether or not this is a good approach, if you need tests or anything else, I can help with that later.
I'm not sure what a good number of max retries would be. I arbitrarily used 20 (with a 1ms pause). When the error occurs, it usually only needs 1 retry, and the most I've seen (in my limited testing) was 2.
Side note: I noticed that in set(), it sets the new file's mode after renaming. So at first I thought just setting the file mode of the temp file before the rename/replace would fix all this. But that didn't work. So it does seem to be a problem with rename() itself.
The text was updated successfully, but these errors were encountered: