For about 40 minutes the other night, the scheduler crashed with this error:
AssertionError: next schedule shouldn't be earlier
  File "airflow/models/dag.py", line 916, in next_dagrun_info
    info = self.timetable.next_dagrun_info(
  File "airflow/timetables/interval.py", line 87, in next_dagrun_info
    earliest = self._skip_to_latest(earliest)
  File "airflow/timetables/interval.py", line 154, in _skip_to_latest
    raise AssertionError("next schedule shouldn't be earlier")
The crash occurred on 3/12/2023, which is also the date daylight saving time took effect here in the US.
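To make the timing concrete, here is a minimal sketch of what the spring-forward transition does to a wall-clock schedule. It assumes US Pacific time (the report only says "the US") and hard-codes the 2023 transition, so it is only valid around that date. `to_local` and `wall_clock_to_utc` are hypothetical helpers written for this illustration, not Airflow APIs.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US Pacific offsets, chosen only for illustration.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")
# Spring forward: 02:00 PST on 2023-03-12 becomes 03:00 PDT (10:00 UTC).
TRANSITION_UTC = datetime(2023, 3, 12, 10, 0, tzinfo=timezone.utc)


def to_local(moment_utc):
    """Convert an aware UTC datetime to Pacific wall-clock time near the transition."""
    zone = PST if moment_utc < TRANSITION_UTC else PDT
    return moment_utc.astimezone(zone)


def wall_clock_to_utc(year, month, day, hour):
    """Map a wall-clock hour to UTC, or return None if that hour was skipped."""
    for zone in (PST, PDT):
        candidate = datetime(year, month, day, hour, tzinfo=zone).astimezone(timezone.utc)
        # Accept the candidate only if that offset was really in force then.
        if to_local(candidate).utcoffset() == zone.utcoffset(None):
            return candidate
    return None


midnight = wall_clock_to_utc(2023, 3, 12, 0)
four_am = wall_clock_to_utc(2023, 3, 12, 4)
gap = four_am - midnight
skipped = wall_clock_to_utc(2023, 3, 12, 2)

print(gap)      # 3:00:00
print(skipped)  # None
```

Under these assumptions, two adjacent "every 4 hours" wall-clock boundaries are only 3 hours apart in UTC on that day, and 02:00 local time does not exist at all. Schedule arithmetic that mixes wall-clock and UTC steps can therefore disagree with itself around the transition.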
What you think should happen instead
The scheduler shouldn't crash around this time, no matter the cron timetable.
How to reproduce
I was able to reconstruct the call stack from our monitoring and reproduce the crash in a simple unit test:
Note that I am mocking out the current time with the time of one of our crashes, and all the other variables came from the call stack that the scheduler used. I don't have control over any of those other inputs.
Running this with pytest yields the same assertion error:
self = <airflow.timetables.interval.CronDataIntervalTimetable object at 0x7f8adb26d400>
earliest = DateTime(2021, 12, 1, 8, 0, 0, tzinfo=Timezone('UTC'))
    def _skip_to_latest(self, earliest: DateTime | None) -> DateTime:
        """Bound the earliest time a run can be scheduled.
        The logic is that we move start_date up until one period before, so the
        current time is AFTER the period end, and the job can be created...
        This is slightly different from the delta version at terminal values.
        If the next schedule should start *right now*, we want the data interval
        that start now, not the one that ends now.
        """
        current_time = DateTime.utcnow()
        last_start = self._get_prev(current_time)
        next_start = self._get_next(last_start)
        if next_start == current_time:  # Current time is on interval boundary.
            new_start = last_start
        elif next_start > current_time:  # Current time is between boundaries.
            new_start = self._get_prev(last_start)
        else:
>           raise AssertionError("next schedule shouldn't be earlier")
E           AssertionError: next schedule shouldn't be earlier
/home/airflow/.local/lib/python3.9/site-packages/airflow/timetables/interval.py:154: AssertionError
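The three branches in the output above can be modelled outside Airflow. The sketch below is a simplified stand-in and assumes a fixed 4-hour UTC grid. `prev_boundary` and `next_boundary` are hypothetical replacements for the timetable's `_get_prev` and `_get_next`, and the grid origin reuses the `earliest` value from the output purely as an assumption. `one_hour_later` is a contrived, deliberately inconsistent helper and is not a claim about how Airflow's helper actually misbehaved.

```python
from datetime import datetime, timedelta, timezone

PERIOD = timedelta(hours=4)
# Assumption: grid origin borrowed from the `earliest` value in the pytest output.
ORIGIN = datetime(2021, 12, 1, 8, 0, tzinfo=timezone.utc)


def prev_boundary(moment):
    """Latest grid point strictly before `moment`."""
    steps = (moment - ORIGIN) // PERIOD
    boundary = ORIGIN + steps * PERIOD
    return boundary - PERIOD if boundary == moment else boundary


def next_boundary(moment):
    """Earliest grid point strictly after `moment`."""
    steps = (moment - ORIGIN) // PERIOD
    return ORIGIN + (steps + 1) * PERIOD


def skip_to_latest(current_time, get_prev=prev_boundary, get_next=next_boundary):
    """Mirror the branch structure shown in the pytest output."""
    last_start = get_prev(current_time)
    next_start = get_next(last_start)
    if next_start == current_time:  # Current time is on an interval boundary.
        return last_start
    if next_start > current_time:  # Current time is between boundaries.
        return get_prev(last_start)
    raise AssertionError("next schedule shouldn't be earlier")


def one_hour_later(moment):
    """Contrived, inconsistent stand-in for get_next (illustration only)."""
    return moment + timedelta(hours=1)


between = datetime(2023, 3, 12, 10, 15, tzinfo=timezone.utc)
print(skip_to_latest(between))  # start of the last complete interval
try:
    skip_to_latest(between, get_next=one_hour_later)
except AssertionError as exc:
    print("raised:", exc)
```

With helpers that agree with each other, `get_next(get_prev(t))` is never earlier than `t`, so the `else` branch is unreachable. Reaching it in production suggests the two helpers returned inconsistent boundaries around the transition.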
Apache Airflow version
2.5.1
What happened
We have a DAG with a recurring cron set to run every 4 hours, set up like this:
Operating System
Ubuntu 20.04.3 LTS (Focal Fossa)
Versions of Apache Airflow Providers
n/a
Deployment
Docker-Compose
Deployment details
No response
Anything else
No response
Are you willing to submit PR?
Code of Conduct