BUG: Fixed DataFrame.resample() generating errors when bin edges fall on DST transitions #65314

Open
presidentjacob wants to merge 23 commits into pandas-dev:main from presidentjacob:bugfix/dataframe

@presidentjacob

Problem Description
DataFrame.resample() raises NonExistentTimeError on timezone-aware DataFrames when a bin edge falls inside a DST gap. It should instead handle the gap gracefully by shifting the edge forward by 1 hour.

Root Cause
The bin-edge construction in resampling assumed that calendar-day anchors such as local midnight always exist in the target timezone. Around DST transitions where midnight is a nonexistent local time (for example Africa/Cairo on April 26th, where 00:00:00 does not exist), that assumption fails and edge generation produces invalid boundaries.

Fix
The fix strips the timezone before normalizing, which avoids hitting the nonexistent local time. The returned tz-naive timestamps are then re-localized by the date_range call in _get_time_bins, which passes nonexistent="shift_forward".

In _get_timestamp_range_edges in pandas/core/resample.py:

# Strip the timezone before normalizing so that a nonexistent
# local midnight is never constructed.
if first.tz is not None:
    first = first.tz_localize(None)
    first = first.normalize()
else:
    first = first.normalize()

if last.tz is not None:
    last = last.tz_localize(None)
    last = last.normalize()
else:
    last = last.normalize()

# Widen the range by one period so the data falls inside the bins.
if closed == "left":
    first = Timestamp(freq.rollback(first))
else:
    first = Timestamp(first - freq)

last = Timestamp(last + freq)
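To see what edges this logic yields, the snippet can be wrapped in a standalone function and run on sample values. The function name `naive_range_edges`, the MonthBegin frequency, and the 2024 dates are all assumptions chosen for illustration; the real function in pandas/core/resample.py has a different signature and more branches.

```python
import pandas as pd


def naive_range_edges(first, last, freq, closed="left"):
    """Hypothetical helper mirroring the edge logic in the snippet above."""
    # Drop the timezone (if any) so normalize() never lands on a
    # nonexistent local midnight.
    if first.tz is not None:
        first = first.tz_localize(None)
    if last.tz is not None:
        last = last.tz_localize(None)
    first, last = first.normalize(), last.normalize()

    if closed == "left":
        # Keep `first` if it already sits on an anchor, else roll back to one.
        first = pd.Timestamp(freq.rollback(first))
    else:
        # Right-closed bins need one extra period before the first value.
        first = pd.Timestamp(first - freq)
    last = pd.Timestamp(last + freq)
    return first, last


freq = pd.offsets.MonthBegin()
start = pd.Timestamp("2024-04-01 09:00", tz="Africa/Cairo")
end = pd.Timestamp("2024-04-26 10:30", tz="Africa/Cairo")

left_edges = naive_range_edges(start, end, freq, closed="left")
right_edges = naive_range_edges(start, end, freq, closed="right")
print(left_edges)
print(right_edges)
```

Both results are tz-naive, so the caller still has to re-localize them, which is where nonexistent="shift_forward" comes in.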

@Liam3851
Contributor

If I understand correctly, this appears to also be a suggested fix for issues #40517, #30378, and #62602. Note also the related (but unmerged!) PRs #41043 and #62633. The discussion in issue #40517 was coalescing around a universal shift_forward in normalize as the correct solution.

Development

Successfully merging this pull request may close these issues.

BUG: DataFrame.resample() fails across DST transitions with NonExistentTimeError in timezone-aware data

3 participants