Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Account for change in UTC offset when performing next schedule calculations #30083

Closed
wants to merge 14 commits into from
Closed

Account for change in UTC offset when performing next schedule calculations #30083

wants to merge 14 commits into from

Conversation

timkpaine
Copy link
Contributor

Closes: #7999

When performing a next value calculation using a cron schedule with a non-fixed interval (e.g. * 1,2,3 * * *), changes in timezone are not properly accounted for. The functions _get_next and _get_prev do not account for the delta between the current and scheduled values' respective timezones. This PR fixes that, and accordingly fixes the closed but unfixed #7999 issue.

Note that this may want to be considered a breaking change, even though it is a fix in the behavior.

@joshzana
Copy link

@timkpaine I just logged #30088 and happened to find this PR right after logging. Does your fix resolve my issue as well?

@timkpaine
Copy link
Contributor Author

@joshzana technically my PR makes the dag run an hour earlier during e.g. the recent DST transition, since the current code is 1 hour delayed. But I ran your repro and it passed without error, so I'm going to go with "yes"

@timkpaine timkpaine changed the title fixes #7999 - account for change in UTC offset when performing next schedule calculations Account for change in UTC offset when performing next schedule calculations - fixes #7999 Mar 14, 2023
airflow/timetables/_cron.py Outdated Show resolved Hide resolved
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (wiht the small nit - @uranusjr - I think second opinion would be great.

@potiuk potiuk linked an issue Mar 15, 2023 that may be closed by this pull request
2 tasks
@uranusjr uranusjr changed the title Account for change in UTC offset when performing next schedule calculations - fixes #7999 Account for change in UTC offset when performing next schedule calculations Mar 16, 2023
@uranusjr
Copy link
Member

Many things to fix. Actually the test failure is DST related, which strongly indicates the logic is wrong (in the sense of not being backward compatible).

@timkpaine
Copy link
Contributor Author

Many things to fix. Actually the test failure is DST related, which strongly indicates the logic is wrong (in the sense of not being backward compatible).

Yep I think that's expected, current behavior does not respect DST, new behavior will shift cron schedules to continue to run on the correct hours. Let me add a second test for the other side of DST just to be sure, but if there are tests that assume the old behavior I'll need to adjust those as well, and presumably some docs (which I think I looked at and they were already slightly wrong)

@uranusjr
Copy link
Member

I want to be cautious breaking compatibility here; since there are tests ensuring the previous behaviour, there is a high chance someone is depending on it, and their work should not be interrupted.

I would suggest the new DST behaviour should be implemented in a new Timetable class, instead of changing how the cron syntax works.

@timkpaine
Copy link
Contributor Author

timkpaine commented Mar 16, 2023

changing how the cron syntax works

*fixing the cron parsing. If your cron says "* 9,10" but once a year runs on 10,11 and 8,10, it's not "changing how the syntax works", it's fixing a long-standing bug (possibly breaking users that rely on that bug). It also isn't an undoable change: if you want absolute time, use an absolute time timezone like UTC.

I don't think a separate timetable is necessary, and certainly having the default timetable when using cron be a broken one is not preferable. But maybe a flag on the existing cron timetable is acceptable? Something obvious like adjust_for_dst and possibly even a warning if you have a schedule that will be affected and don't set this flag. Most existing workarounds that I'm familiar with involve running +/- an hour and checking the real time to see if you should run, which would continue to work.

@uranusjr
Copy link
Member

But maybe a flag on the existing cron timetable is acceptable? Something obvious like adjust_for_dst and possibly even a warning if you have a schedule that will be affected and don't set this flag.

The thing here is the semantic of schedule="<cron expr>" (or schedule_interval, they are the same) cannot change (until Airflow 3.0) due to compatibility policies. And you can’t dismiss the current behaviour as a bug; the fact it is ensured by test means the behaviour is delibrately designed by someone, and likely relied on someone as a feature.

@potiuk
Copy link
Member

potiuk commented Mar 17, 2023

But maybe a flag on the existing cron timetable is acceptable? Something obvious like adjust_for_dst and possibly even a warning if you have a schedule that will be affected and don't set this flag.

The thing here is the semantic of schedule="<cron expr>" (or schedule_interval, they are the same) cannot change (until Airflow 3.0) due to compatibility policies. And you can’t dismiss the current behaviour as a bug; the fact it is ensured by test means the behaviour is delibrately designed by someone, and likely relied on someone as a feature.

Yeah. agree it's something we should be careful about, but having "adjust_standard_cron_spec_for_dst" (or smth like that) flag in the DAG and raising deprecation warning in case if someone uses "regular" cron and has this flag "false" sounds completely doable and way better as a path for potential breaking change in case we have Airflow 3 than having a separate timetable IMHO.

@timkpaine
Copy link
Contributor Author

timkpaine commented Mar 23, 2023

I agree completely that the bugged behavior is asserted by tests, and that we should be careful with a fix. I think a minor version with a new flag to restore the current behavior would be best, followed by a minor version with a new flag to get the fixed behavior would be next best, followed by waiting until a 3.0, but that's just my personal opinion.

But, I do think the behavior is bugged or at the very least unexpected. Its unclear to me what the code for fixing DST does (@uranusjr you wrote this code so I would've thought by "fix for DST" you meant "if I want to run at 9am and 10am local time, run at 9am and 10am local time even after a DST change, but I could be wrong, in which case im not sure what "fix DST" means. Maybe "if you wanted to run at 9am local time before DST, run at 10am local time after"? but in either case, it seems like either "0 9 *" and "0 9,10 *" should both run at 9am local always, or run at 9am/10am in the same way. Right now one will adjust by 1 hour, and the other will not).

Screenshot 2023-03-23 at 12 12 33

The thing here is the semantic of schedule="<cron expr>" (or schedule_interval, they are the same)

This is incorrect. schedule="<cron expr>" is sometimes schedule interval, and sometimes schedule fixed. This is the crux of the usage problem, certain cron schedules (given non-fixed hours like 1,2 or 1-3) behave as interval, others with single hours behave as fixed.

And you can’t dismiss the current behaviour as a bug

I think it's almost certainly a bug. From the airflow docs themselves:

Screenshot 2023-03-23 at 12 01 36

This is incorrect, it only applies to non-fixed cron schedules not fixed cron schedules. Also this is not what "respect daylight savings time" means to most readers I would assume.

From astronomer.io docs:

Screenshot 2023-03-23 at 12 02 22

Also incorrect, similar reasons.

Some other samplings from the internet:
Incorrect post:
Screenshot 2023-03-23 at 12 05 47

It does behave like that depending on the type of cron schedule you use.

StackOverflow showing how to account for the bug by running extra tasks on either side of the intended tasks, clearly not the expected or desired behavior:
https://stackoverflow.com/questions/43662571/how-to-properly-handle-daylight-savings-time-in-apache-airflow

the fact it is ensured by test means the behaviour is delibrately designed by someone, and likely relied on someone as a feature.

Completely agree, it is always annoying when bugs (or if we want to agree to disagree, "unexpected behavior") become relied on as features but given that common understanding would expect airflow to account for DST, a minority of users relying on certain behavior should be deprioritized over a majority of users expecting another, with obvious requirements of documenting/announcing and versioning so that all users have a path forward.

@github-actions
Copy link

github-actions bot commented May 8, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label May 8, 2023
@github-actions github-actions bot closed this May 14, 2023
@ashb ashb reopened this Jul 25, 2023
@ashb
Copy link
Member

ashb commented Jul 25, 2023

Not respecting local TZ around DST change was a bug that was introduced with our switch to timetables.

Prior to that change the DST was meant to be reflected. So this is fixing a bug that will bite some people -- it's just that using local timezones isn't all that common.

@timkpaine If you are able and willing could you add a news fragment to warn users of this change, phrased something as "this reverts the TZ behavoiur around DST transitions to the pre $VER behaviour", then we can get this in.

@timkpaine
Copy link
Contributor Author

timkpaine commented Jul 25, 2023

after a few months away, I think all this code is unnecessary. You can simply force _should_fix_dst to False. The code that I'm "fixing" here was deliberately introduced in order to make sure something that runs at e.g. 9am runs at 10am during DST. To revert it, you don't need to add additional adjustment code, you just need to undo it:

    @cached_property
    def _should_fix_dst(self) -> bool:
        # Deprecated
        return False

    def _get_next(self, current: DateTime) -> DateTime:
        """Get the first schedule after specified time."""
        naive = make_naive(current, self._timezone)
        cron = croniter(self._expression, start_time=naive)
        scheduled = cron.get_next(datetime.datetime)
        return convert_to_utc(make_aware(scheduled, self._timezone))

    def _get_prev(self, current: DateTime) -> DateTime:
        """Get the first schedule before specified time"""
        naive = make_naive(current, self._timezone)
        cron = croniter(self._expression, start_time=naive)
        scheduled = cron.get_prev(datetime.datetime)
        return convert_to_utc(make_aware(scheduled, self._timezone))

I'm happy to make this change if desired.

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Jul 26, 2023
@eladkal eladkal added this to the Airflow 2.7.0 milestone Jul 26, 2023
@eladkal eladkal added the type:improvement Changelog: Improvements label Jul 26, 2023
@pankajkoti pankajkoti self-requested a review August 7, 2023 06:48
uranusjr and others added 3 commits November 14, 2023 21:44
This is private interface anyway so we don't need to keep it around.
Also move the logic into DAG.is_fixed_time_schedule() so it is not
broken by the removal of the logic.
Comment on lines +3 to +13
Given a cron expression resulting in an interval-based schedule, such as
``0 9-16 * * *``, job after a timezone change (e.g. entering or exiting daylight
saving time) used to be scheduled with an offset to the change in timezone. For
example, when operating in *America/New_York*, the above schedule would result
in a run at 10am on Monday morning after entering DST, instead of the previous
9am.

This was intentional to account for the change in timezone introduced in Airflow
2.2, but caused confusion to several use cases. The functionality is now
reverted to its pre-2.2 state, and timetables will now always run at the same
*clock time*, e.g. 9am in the above example.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we should state that this was a bug that we are fixing, not simply a change that we are reverting. We have observed that it's not simply that the schedule doesn't run at the correct clock time, but that the dag actually gets stuck and fails to run.

Additionally I'm a bit confused by the 9am / 10am example. Are you saying that the behavior before this fix would be that the dag would run at 10am until DST transitions back, 6 months later? Or only the first run?

Comment on lines 822 to 847
# def test_following_previous_schedule_daily_dag_cet_to_cest(self):
# """
# Make sure DST transitions are properly observed
# """
# local_tz = pendulum.timezone("Europe/Zurich")
# start = local_tz.convert(datetime.datetime(2018, 3, 25, 2), dst_rule=pendulum.PRE_TRANSITION)

utc = timezone.convert_to_utc(start)
# utc = timezone.convert_to_utc(start)

dag = DAG("tz_dag", start_date=start, schedule="0 3 * * *")
# dag = DAG("tz_dag", start_date=start, schedule="0 3 * * *")

prev = dag.previous_schedule(utc)
prev_local = local_tz.convert(prev)
# prev = dag.previous_schedule(utc)
# prev_local = local_tz.convert(prev)

assert prev_local.isoformat() == "2018-03-24T03:00:00+01:00"
assert prev.isoformat() == "2018-03-24T02:00:00+00:00"
# assert prev_local.isoformat() == "2018-03-24T03:00:00+01:00"
# assert prev.isoformat() == "2018-03-24T02:00:00+00:00"

_next = dag.following_schedule(utc)
next_local = local_tz.convert(_next)
# _next = dag.following_schedule(utc)
# next_local = local_tz.convert(_next)

assert next_local.isoformat() == "2018-03-25T03:00:00+02:00"
assert _next.isoformat() == "2018-03-25T01:00:00+00:00"
# assert next_local.isoformat() == "2018-03-25T03:00:00+02:00"
# assert _next.isoformat() == "2018-03-25T01:00:00+00:00"

prev = dag.previous_schedule(_next)
prev_local = local_tz.convert(prev)
# prev = dag.previous_schedule(_next)
# prev_local = local_tz.convert(prev)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems something needs to be addressed here. Should these tests be removed? Or uncommented? Or do we have a decision to make about what the behavior should be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test actually almost passes, only fails in the final previous_schedule call. Since previous_schedule is deprecated anyway, I think it is OK to just remove that part.

I’m also converting it into a timetable test to not rely on following_schedule, which is also deprecated.

assert cron_interval._get_prev(last_run).timestamp() == cron_fixed._get_prev(last_run).timestamp()


def test_cron_interval_dst_next_leaving():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably verify the DST transition behavior with the CronTriggerTimetable class also. perhaps parameterizing would be an easy way to do that.

@dstandish
Copy link
Contributor

@timkpaine you still interested in working on this or you want someone else to take over?

@timkpaine
Copy link
Contributor Author

I'm still working on it but I don't mind others pushing to my fork, the big outstanding items are make sure the two blocks of tests are fully uncommented, delete the redundant tests I added, and switch from pendulum internals to date time fold.

uranusjr and others added 3 commits November 24, 2023 14:29
Co-authored-by: Daniel Standish <[email protected]>
Pendulum 3 will remove the dst_rule flag altogether, so this changes the
code to use the fold flag instead, which will be the only way moving
forward, and has been supported in Pendulum 2 for a long time anyway.
@uranusjr
Copy link
Member

uranusjr commented Nov 27, 2023

Hmm, after some more tests I’m not sure we have the right fix here. Using Switzerland as an example (it is +1/+2, making the mental arithmetic a bit easier), DST was entered in 2023 at 1am UTC (2→3am local) on March 26, so for 0 3 * * *` (every day at 3am), the intervals should be

  • UTC 2023-03-24T02:00:00 to 2023-03-25T02:00:00 (no DS so +1 for both)
  • UTC 2023-03-25T02:00:00 to 2023-03-26T01:00:00 (entering DS, the interval ends one hour early!)
  • UTC 2023-03-26T01:00:00 to 2023-03-27T01:00:00 (inside DST)

It is important for the second interval to end on 1am UTC since the wall clock in Switzerland goes to 3am at this moment. But from my testing the second interval does not end correctly with the current implementation in this PR. We need more work here.

@timkpaine
Copy link
Contributor Author

timkpaine commented Nov 27, 2023

"0 3 * * *" shouldn't be affected by this PR as that's a fixed timetable not an interval timetable I thought? For that schedule it should run at 3am local throughout, so in UTC it should shift by 1 hour

"0 9-16 * * 1-5", timezone=pendulum.timezone("America/New_York")
)
# Fixed cron
cron_fixed = CronDataIntervalTimetable("0 9 * * 1-5", timezone=pendulum.timezone("America/New_York"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the test below need to get rid of 1-5 so that they're fixed not interval

@uranusjr
Copy link
Member

"0 3 * * *" shouldn't be affected by this PR as that's a fixed timetable not an interval timetable I thought?

A cron expression can be either fixed or interval—those are orthogontal concepts. Using cron as an interval timetable specifically means intervals are generated from adjacent times. This change breaks CronDataIntervalTimetable.

@uranusjr
Copy link
Member

uranusjr commented Dec 3, 2023

I posted some details in #35887. Unfortunately I don’t think it is technically possible to fix this in Airflow.

@uranusjr
Copy link
Member

uranusjr commented Dec 4, 2023

Closing in favour of #35887. That PR now passes test cases in this one.

@uranusjr uranusjr closed this Dec 4, 2023
@timkpaine
Copy link
Contributor Author

timkpaine commented Dec 4, 2023

I don't think that's sufficient, it looks like the commit above when I disabled the second block of code in get next which didn't end up working in all cases. I'm going to keep on this one as I think it will be more complete, please reopen for now.

@uranusjr
Copy link
Member

uranusjr commented Dec 5, 2023

Sure, although I am relatively sure this would not work well without support from croniter.

@uranusjr uranusjr reopened this Dec 5, 2023
@timkpaine
Copy link
Contributor Author

See #35887 (comment)

@timkpaine timkpaine closed this Dec 5, 2023
@ephraimbuddy ephraimbuddy removed this from the Airflow 2.8.0 milestone Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AssertionError: next schedule shouldn't be earlier around DST change Incorrect DAG scheduling after DST
9 participants