[BUG] examples/tutorial/01-preprocess.ipynb: Convert timestamp from datetime - NotImplementedError: cuDF does not yet support timezone-aware datetimes #777

Open
zwei2016 opened this issue Apr 25, 2024 · 0 comments
Labels
bug (Something isn't working), status/needs-triage


Bug description

Running this cell from examples/tutorial/01-preprocess.ipynb raises a NotImplementedError:

raw_df['event_time_dt'] = raw_df['event_time'].astype('datetime64[s]')
raw_df['event_time_ts'] = raw_df['event_time_dt'].astype('int')
raw_df.head()

NotImplementedError Traceback (most recent call last)
Cell In[4], line 4
1 #import datetime
2 #raw_df['event_time'] = cudf.to_datetime(raw_df['event_time'], format='%Y-%m-%d %H:%M:%S')
----> 4 raw_df['event_time_dt'] = raw_df['event_time'].astype('datetime64[s]')
5 raw_df['event_time_ts']= raw_df['event_time_dt'].astype('int')
6 raw_df.head()

File ~/miniconda3/lib/python3.10/site-packages/nvtx/nvtx.py:116, in annotate.__call__.<locals>.inner(*args, **kwargs)
113 @wraps(func)
114 def inner(*args, **kwargs):
115 libnvtx_push_range(self.attributes, self.domain.handle)
--> 116 result = func(*args, **kwargs)
117 libnvtx_pop_range(self.domain.handle)
118 return result

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/series.py:2102, in Series.astype(self, dtype, copy, errors)
2100 else:
2101 dtype = {self.name: dtype}
-> 2102 return super().astype(dtype, copy, errors)

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/indexed_frame.py:5009, in IndexedFrame.astype(self, dtype, copy, errors)
5007 except Exception as e:
5008 if errors == "raise":
-> 5009 raise e
5010 return self
5012 return self._from_data(data, index=self._index)

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/indexed_frame.py:5006, in IndexedFrame.astype(self, dtype, copy, errors)
5003 raise ValueError("invalid error value specified")
5005 try:
-> 5006 data = super().astype(dtype, copy)
5007 except Exception as e:
5008 if errors == "raise":

File ~/miniconda3/lib/python3.10/site-packages/nvtx/nvtx.py:116, in annotate.__call__.<locals>.inner(*args, **kwargs)
113 @wraps(func)
114 def inner(*args, **kwargs):
115 libnvtx_push_range(self.attributes, self.domain.handle)
--> 116 result = func(*args, **kwargs)
117 libnvtx_pop_range(self.domain.handle)
118 return result

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/frame.py:272, in Frame.astype(self, dtype, copy)
270 @_cudf_nvtx_annotate
271 def astype(self, dtype, copy: bool = False):
--> 272 result_data = {
273 col_name: col.astype(dtype.get(col_name, col.dtype), copy=copy)
274 for col_name, col in self._data.items()
275 }
277 return ColumnAccessor(
278 data=result_data,
279 multiindex=self._data.multiindex,
(...)
283 verify=False,
284 )

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/frame.py:273, in <dictcomp>(.0)
270 @_cudf_nvtx_annotate
271 def astype(self, dtype, copy: bool = False):
272 result_data = {
--> 273 col_name: col.astype(dtype.get(col_name, col.dtype), copy=copy)
274 for col_name, col in self._data.items()
275 }
277 return ColumnAccessor(
278 data=result_data,
279 multiindex=self._data.multiindex,
(...)
283 verify=False,
284 )

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/column/column.py:1002, in ColumnBase.astype(self, dtype, copy)
1000 return col.as_decimal_column(dtype)
1001 elif np.issubdtype(cast(Any, dtype), np.datetime64):
-> 1002 return col.as_datetime_column(dtype)
1003 elif np.issubdtype(cast(Any, dtype), np.timedelta64):
1004 return col.as_timedelta_column(dtype)

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/column/string.py:5749, in StringColumn.as_datetime_column(self, dtype, format)
5742 return cast(
5743 "cudf.core.column.DatetimeColumn",
5744 column.column_empty(
5745 len(self), dtype=out_dtype, masked=True
5746 ),
5747 )
5748 else:
-> 5749 format = datetime.infer_format(
5750 self.apply_boolean_mask(self.notnull()).element_indexing(0)
5751 )
5753 if format.endswith("%z"):
5754 raise NotImplementedError(
5755 "cuDF does not yet support timezone-aware datetimes"
5756 )

File ~/miniconda3/lib/python3.10/site-packages/cudf/core/column/datetime.py:108, in infer_format(element, **kwargs)
106 if fmt is not None:
107 if "%z" in fmt or "%Z" in fmt:
--> 108 raise NotImplementedError(
109 "cuDF does not yet support timezone-aware datetimes"
110 )
111 if ".%f" not in fmt:
112 # For context read:
113 # pandas-dev/pandas#52418
114 # We cannot rely on format containing only %f
115 # c++/libcudf expects .%3f, .%6f, .%9f
116 # Logic below handles those cases well.
117 return fmt

NotImplementedError: cuDF does not yet support timezone-aware datetimes
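
The last frame shows why the cast fails: cuDF infers a format from the first non-null string, and the inferred format contains a timezone directive (%z or %Z). The sketch below restates that guard as a standalone function so the condition is easy to see. The name reject_timezone_formats and the sample format strings are hypothetical, chosen for illustration; this is not cuDF's code.

```python
def reject_timezone_formats(fmt: str) -> str:
    """Standalone restatement of the guard seen in the traceback (hypothetical helper)."""
    # Same condition as the final frame: any timezone directive is refused.
    if "%z" in fmt or "%Z" in fmt:
        raise NotImplementedError(
            "cuDF does not yet support timezone-aware datetimes"
        )
    return fmt


# Assumed sample formats: one naive, one carrying a timezone name.
for candidate in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M:%S %Z"):
    try:
        reject_timezone_formats(candidate)
        print(candidate, "is accepted")
    except NotImplementedError as exc:
        print(candidate, "is rejected:", exc)
```

If the strings in event_time carry a timezone marker, inference would yield a format of the second kind, which matches the error reported here.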

Steps/Code to reproduce bug

  1. In examples/tutorial/01-preprocess.ipynb, run raw_df['event_time'].astype('datetime64[s]') on the string column event_time.

Expected behavior

The cast completes without error and event_time_dt holds datetime64[s] values.

Environment details

  • Platform: WSL2 on Windows 11
  • Python version: 3.10.14
  • cuDF version: '24.04.00'
  • nvtabular version: '23.08.00'
  • PyTorch version (GPU): '2.2.2+cu121'

Additional context

Workaround: parsing event_time with cudf.to_datetime and an explicit format before the cast avoids the error.

# Added line: parse the strings with an explicit format
raw_df['event_time'] = cudf.to_datetime(raw_df['event_time'], format='%Y-%m-%d %H:%M:%S')
raw_df['event_time_dt'] = raw_df['event_time'].astype('datetime64[s]')
raw_df['event_time_ts'] = raw_df['event_time_dt'].astype('int')
raw_df.head()
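
To sanity-check the event_time_ts values, a CPU-only sketch using the standard library can compute the expected epoch seconds for a single string. It assumes values shaped like '2019-10-01 00:00:00 UTC' (a hypothetical sample; the issue does not show the data) and that the times are in UTC. The helper name to_epoch_seconds is also illustrative.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"  # same format string as in the workaround


def to_epoch_seconds(value: str) -> int:
    """Parse 'YYYY-MM-DD HH:MM:SS[ suffix]' and return seconds since the Unix epoch."""
    # Assumption: anything after the first 19 characters is a timezone
    # suffix such as ' UTC' and can be ignored.
    naive = datetime.strptime(value[:19], FORMAT)
    # Assumption: the timestamps are in UTC.
    return int(naive.replace(tzinfo=timezone.utc).timestamp())


sample = "2019-10-01 00:00:00 UTC"  # hypothetical value
print(sample, "->", to_epoch_seconds(sample))
```

Comparing a few rows of raw_df against this helper is one way to confirm that the explicit format parsed the column as intended.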
zwei2016 added the bug and status/needs-triage labels on Apr 25, 2024