HMA: Enable gunicorn task scheduling #1565

Merged
merged 2 commits into from May 2, 2024
15 changes: 14 additions & 1 deletion hasher-matcher-actioner/src/OpenMediaMatch/app.py
@@ -48,6 +48,10 @@ def _is_werkzeug_reloaded_process():
     return os.environ.get("WERKZEUG_RUN_MAIN") == "true"


+def _is_gunicorn():
+    return "gunicorn" in os.environ.get("SERVER_SOFTWARE", "")
+
+
 def _setup_task_logging(app_logger: logging.Logger):
     """Clownily replace module loggers with our own"""
     fetcher.logger = app_logger.getChild("Fetcher")
@@ -106,9 +110,18 @@ def create_app() -> flask.Flask:
     scheduler: APScheduler | None = None

     with app.app_context():
+        # For Werkzeug/debug deployments:
         # We only run apscheduler in the "outer" reloader process, else we'll
         # have multiple executions of the scheduler in debug mode
Comment on lines 114 to 115
Contributor

You may consider adding a note about gunicorn production code, and the dangers/desirability of running multiple schedulers there.

-        if _is_werkzeug_reloaded_process() and not running_migrations:
+        #
+        # For Gunicorn/production deployments:
+        # There is currently no check for multiple schedulers running.
+        # DO NOT RUN multiple workers with TASK_FETCHER=True or TASK_INDEXER=True -
+        # running multiple instances of these tasks may cause database conflicts
+        # or other undesirable behavior
+        if (
+            _is_werkzeug_reloaded_process() or _is_gunicorn()
+        ) and not running_migrations:
             now = datetime.datetime.now()
             scheduler = dev_apscheduler.get_apscheduler()
             scheduler.init_app(app)