-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Issue #604 in django-celery-beat: Celery Beat Crashing at the End of Daylight Savings #7901
base: main
Are you sure you want to change the base?
Conversation
can you rebase on top top of master again please? |
Done rebasing. |
I am not sure why builds are failing, can you check please? are the tests passing locally? |
36c6fc3
to
405ddfd
Compare
I am a little confused as to what the behavior should be for Celery beat. Currently, the only unit test that is failing is from https://github.com/celery/celery/blob/master/t/unit/app/test_schedules.py#L441. This unit test was added to deal with #1604. This is where the confusion lies. During the end of DST, my understanding is that we do want the tasks to run twice. For example, a Celery task runs at 1:30 PDT and 1:30 PST. From our business perspective, this makes the most sense. However, the problem in #1604 seems to be that they do not want to the tasks to run twice. What should be the expected behavior? Should we add a configuration parameter to control this? If so, what is the default parameter? |
I will come back to this soon |
can you please elaborate why a task should run twice? |
can you please revisit this? I want to know more of your thoughts |
For instance, let us assume that we have a task that we want to run every hour. An example of such important task is sending a timely email every hour at the 30 minute mark. Then, we would want this task to run both at 1:30 AM PDT AND 1:30 AM PST. In our business, we send such timely emails for many of our features hourly at different minute marks and having one hour of downtime is detrimental. |
I think running a task multiple times in different timezone should be allowed. |
Thanks, @auvipy ! In that case, keep us posted on the progress of the review. |
can't review it as the builds are failing and there are merge conflicts. we need those two resolved before moving further. |
Can you run the builds again? |
re started and builds are failing |
can you restart this? |
Yes. Yesterday, we discovered a potential bug with this change that was exposed by the latest DST transition. I will summarize it and fix it when I have time. |
Root Cause of BugIn the existing logic of the Celery source code, it will convert the This works for periodic tasks that run every X minutes, but the logic seems to be incorrect if the beat task uses a An If the Saturday in the example was prior to the transition and the Sunday was afterwards, then we expect the Symptoms of BugThe first time that the Celery beat task will run will be an hour earlier than the scheduled time. Then, the Celery beat task will run again at the expected time. This causes the Celery beat task to run an additional time unnecessarily. Proposed FixWe can check if
|
I think I fixed the issue. We can see whether the CI passes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following test fails as per CI report. we need to fix it to make everything green again
=================================== FAILURES ===================================
________________________________ test_remaining ________________________________
def test_remaining():
# Relative
remaining(datetime.utcnow(), timedelta(hours=1), relative=True)
"""
The upcoming cases check whether the next run is calculated correctly
"""
eastern_tz = ZoneInfo("US/Eastern")
tokyo_tz = ZoneInfo("Asia/Tokyo")
eastern_tz_pytz = pytz.timezone("US/Eastern")
tokyo_tz_pytz = pytz.timezone("Asia/Tokyo")
# Case 1: `start` in UTC and `now` in other timezone
start = datetime.now(ZoneInfo("UTC"))
now = datetime.now(eastern_tz)
delta = timedelta(hours=1)
assert str(start.tzinfo) == str(ZoneInfo("UTC"))
assert str(now.tzinfo) == str(eastern_tz)
rem_secs = remaining(start, delta, now).total_seconds()
# assert remaining time is approximately equal to delta
assert rem_secs == pytest.approx(delta.total_seconds(), abs=1)
# Case 2: `start` and `now` in different timezones (other than UTC)
start = datetime.now(eastern_tz)
now = datetime.now(tokyo_tz)
delta = timedelta(hours=1)
assert str(start.tzinfo) == str(eastern_tz)
assert str(now.tzinfo) == str(tokyo_tz)
rem_secs = remaining(start, delta, now).total_seconds()
assert rem_secs == pytest.approx(delta.total_seconds(), abs=1)
"""
Case 3: DST check
Suppose start (which is last_run_time) is in EST while next_run is in EDT,
then check whether the `next_run` is actually the time specified in the
start (i.e. there is not an hour diff due to DST).
In 2019, DST starts on March 10
"""
start = datetime(day=9, month=3, year=2019, hour=10, minute=0, tzinfo=eastern_tz) # EST
now = datetime(day=11, month=3, year=2019, hour=1, minute=0, tzinfo=eastern_tz) # EDT
delta = ffwd(hour=10, year=2019, microsecond=0, minute=0, second=0, day=11, weeks=0, month=3)
# `next_actual_time` is the next time to run (derived from delta)
next_actual_time = datetime(
day=11, month=3, year=2019, hour=10, minute=0, tzinfo=eastern_tz) # EDT
assert start.tzname() == "EST"
assert now.tzname() == "EDT"
assert next_actual_time.tzname() == "EDT"
rem_time = remaining(start, delta, now)
next_run = now + rem_time
assert next_run == next_actual_time
"""
Case 4: DST check (ZoneInfo, timedelta)
Suppose start (which is last_run_time) is in PDT while next_run is in PST,
Check whether there is an hour added to the time between now and the `next_run`
In 2022, DST ends on Nov 6
"""
start = datetime(day=6, month=11, year=2022, hour=1, minute=15, tzinfo=eastern_tz, fold=0)
now = datetime(day=6, month=11, year=2022, hour=1, minute=34, tzinfo=eastern_tz, fold=1)
ends_in = timedelta(minutes=80)
next_actual_time = datetime(day=6, month=11, year=2022, hour=1, minute=35, tzinfo=eastern_tz, fold=1)
assert start.tzname() == "EDT"
assert now.tzname() == "EST"
assert next_actual_time.tzname() == "EST"
rem_time = remaining(start, ends_in, now)
print(start + ends_in - now)
next_run = now + rem_time
assert next_run == next_actual_time
E AssertionError: assert datetime.datetime(2022, 11, 6, 2, 35, tzinfo=ZoneInfo(key='US/Eastern')) == datetime.datetime(2022, 11, 6, 1, 35, tzinfo=ZoneInfo(key='US/Eastern'), fold=1)
t/unit/utils/test_time.py:193: AssertionError
for more information, see https://pre-commit.ci
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #7901 +/- ##
==========================================
+ Coverage 87.24% 87.33% +0.09%
==========================================
Files 148 148
Lines 18637 18526 -111
Branches 3199 3167 -32
==========================================
- Hits 16260 16180 -80
+ Misses 2080 2060 -20
+ Partials 297 286 -11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
It seems unit tests are now passing. I restarted the failing pypy build as it seems to be a network issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need to check/fix the lint issues as well. thanks for all your efforts here!! I will come back to this next morning to check if it is merge able with 5.3.5 release
Integration tests are now passing |
t/unit/utils/test_time.py:136:5: F841 local variable 'tokyo_tz_pytz' is assigned to but never used |
Will try to push lint fixes |
I might consider this for 5.3.6 if it do not break backward compat |
@Nusnus can we consider this for a patch release? |
I haven't fully reviewed the entire PR, but from the original description:
It looks like the other PR isn't ready, besides this PR having failures in CI. That being said, if this work can be baked enough, I support merging on, but we'll have it for v5.4 if it will be ready by then. |
Yeah it's better off for 5.4 |
celery/utils/time.py
Outdated
@@ -218,12 +218,15 @@ def remaining( | |||
~datetime.timedelta: Remaining time. | |||
""" | |||
now = now or datetime.utcnow() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://docs.python.org/3/library/datetime.html#datetime.datetime.utcnow is deprecated in Python 3.12
Dependency
celery/django-celery-beat#605
Note: The two PRs are codependent and need to be merged together.
Description
These changes are necessary to fix celery/django-celery-beat#604 and is a dependency for celery/django-celery-beat#605. Both issues have more detailed information about the problem and how testing was performed for the changes.