
"Dictionary Size changed during iteration" around application deployment time #446

Closed
baydoun0 opened this issue May 18, 2024 · 7 comments

baydoun0 commented May 18, 2024

Expected Behaviour

Flask Limiter should behave gracefully during application initialization.

Current Behaviour

Since we upgraded Flask-Limiter from 1.x to the latest version (3.5.1) a couple of weeks ago, we have consistently observed the following error on Sentry, roughly 50 times so far:

[Screenshot: Sentry report of the "dictionary changed size during iteration" error]

It happens around the time our application initializes after deployment.

[Screenshot: Sentry error occurrences clustered around deployment time]

Steps to Reproduce

I don't have concrete steps to reproduce for everyone; our application is in production under normal load and services quite a few requests. For us, the steps to reproduce are simply:

  1. Construct Blueprints with routes registered, some of which have rate limiting (this doesn't actually matter, since we observe failures even on endpoints that DO NOT have the limiter decorator).
  2. Register all those blueprints to a Flask App instance.
  3. Initialize Flask limiter as usual.

Concurrently, the application needs to be bombarded with requests.
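For context, here is a minimal sketch of that setup, assuming the standard Flask-Limiter 3.x deferred-initialization pattern; the route names, limit string, and Redis URI are illustrative, not taken from the report:

```python
from flask import Blueprint, Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

# Deferred initialization: the Limiter is created first, attached to the app later.
limiter = Limiter(get_remote_address, storage_uri="redis://localhost:6379")

bp = Blueprint("api", __name__)

@bp.route("/limited")
@limiter.limit("10/minute")  # step 1: some routes are rate-limited...
def limited():
    return "ok"

@bp.route("/unlimited")  # ...but failures were observed on undecorated routes too
def unlimited():
    return "ok"

app = Flask(__name__)
app.register_blueprint(bp)  # step 2: register all blueprints
limiter.init_app(app)       # step 3: initialize Flask-Limiter last
```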

Your Environment

  • Flask-limiter version: 3.5.1
  • Flask version: 2.0.0
  • Operating system: Amazon Linux 2
  • Python version: 3.8

We're also using Redis as a backing store.

Note: as an attempted mitigation, we construct all blueprints, attach all routes to them, and configure ALL before_request and after_request hooks before initializing Flask-Limiter, but this hasn't helped. Flask-Limiter seems to kick in immediately (which is fine, but it should handle the race condition better).

Suggested Fix

The self._blueprint_exemptions field inside the LimitManager class is only modified in one place: the add_blueprint_exemption method. All that needs to be done is to add a lock and acquire it both when writing to and when iterating over self._blueprint_exemptions.
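A minimal sketch of that suggestion; the class and field names come from the issue, but the method bodies (including the blueprint_exemptions_snapshot helper) are simplified placeholders rather than Flask-Limiter's actual implementation:

```python
import threading
from collections import defaultdict

class LimitManager:
    def __init__(self):
        self._blueprint_exemptions = defaultdict(set)
        self._exemptions_lock = threading.Lock()  # hypothetical addition

    def add_blueprint_exemption(self, blueprint_name, scope):
        with self._exemptions_lock:  # serialize the only write path
            self._blueprint_exemptions[blueprint_name].add(scope)

    def blueprint_exemptions_snapshot(self):
        # Copy under the lock so callers can iterate without racing writers.
        with self._exemptions_lock:
            return dict(self._blueprint_exemptions)
```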

baydoun0 added the bug label May 18, 2024
alisaifee (Owner) commented May 19, 2024

If the blueprints have all been constructed (and decorated with any @exempt decorators) before you call limiter.init_app(app), it's a bit unclear to me how the extension's before_request hook could be registered, and thus executed, while the add_blueprint_exemption method is still being called.

baydoun0 (Author) commented May 19, 2024

We are not using the exempt decorator anywhere in our application; I double-checked. We use only the limit and request_filter decorators.

I can't actually figure out WHERE the calls to add_blueprint_exemption are happening; it doesn't seem to be called anywhere other than from an exempt decorator call, which we are not making.

YET, we see these errors in production.

Edit

I see now that _blueprint_exemptions is a defaultdict, so a simple access may perform a write. This makes it much harder for me to reason about the program. I made sure we only call init_app after all Blueprints are constructed and all calls to Flask-Limiter's decorators are complete. I don't think I'll have the time and resources to debug this further.
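A self-contained demonstration of that invisible write (not Flask-Limiter code): merely reading a missing key from a defaultdict inserts it, which is enough to break an iteration happening at the same time.

```python
from collections import defaultdict

d = defaultdict(set)
d["a"].add(1)

for key in d:   # iterate over the dict's keys...
    _ = d["b"]  # ...a "read" of a missing key silently inserts it,
                # so the next loop step raises
                # RuntimeError: dictionary changed size during iteration
```

In production the read and the iteration happen on different threads, but the mechanism is the same.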

By the way, it's worth noting that we observe these errors on endpoints both with and without Flask-Limiter decorators. I suspect the problem is that the _blueprint_exemptions map is being probed at request time for blueprints it has never seen before, and those implicit insertions cause the race condition. Any concurrent calls to _blueprint_exemption_scope and blueprint_limits could trigger it. Nothing in the init_app method of the Limiter class suggests that all Blueprints the app knows of will be accounted for up front.

I'm now more convinced that we need a lock to protect access to self._blueprint_exemptions in the LimitManager.

Edit 2

An alternative solution would be to drop the defaultdict here in favor of a dict.get(key, default_value) pattern, which does not perform an invisible write.
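A sketch of that alternative, again with names mirroring the issue rather than the library's real code: reads use dict.get and never mutate, and the only insertion happens on the explicit write path.

```python
class LimitManager:
    def __init__(self):
        self._blueprint_exemptions = {}  # plain dict, no default factory

    def add_blueprint_exemption(self, blueprint_name, scope):
        # Explicit write: the only place a new key is ever inserted.
        self._blueprint_exemptions.setdefault(blueprint_name, set()).add(scope)

    def exemptions_for(self, blueprint_name):
        # Pure read: an unknown blueprint yields an empty set, with no mutation.
        return self._blueprint_exemptions.get(blueprint_name, set())
```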

alisaifee (Owner) commented
Oh wow, I completely missed the defaultdict and this all makes sense now. Also, interesting that this hasn't been raised by anyone before! I'll work on a fix asap.

alisaifee (Owner) commented
Fix will be available in 3.7.0

baydoun0 (Author) commented May 19, 2024

Awesome! Already bumped on our end and deploying to production as we speak -- fingers crossed 🤞

alisaifee (Owner) commented
@baydoun0 please close the issue if/when you've verified that it no longer exists in your environment! Thank you.

baydoun0 (Author) commented
I can confirm that this is no longer happening in production, so the bug is resolved! Thank you so much for fixing it this fast!
