Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incompatibility between pandas.infer_freq and pandas.to_timedelta #14398

Closed
idanivanov opened this issue Oct 11, 2016 · 3 comments
Closed

Incompatibility between pandas.infer_freq and pandas.to_timedelta #14398

idanivanov opened this issue Oct 11, 2016 · 3 comments
Labels

Comments

@idanivanov
Copy link

idanivanov commented Oct 11, 2016

A small, complete example of the issue

The output of pd.infer_freq is "D" while pd.to_timedelta expects "1D".

import pandas as pd
dates = pd.date_range(start='2016-10-01', end='2016-10-10', freq='1D')
freq = pd.infer_freq(dates)
delta = pd.to_timedelta(freq)

Expected Output

delta = Timedelta('1 days 00:00:00')

>>> import pandas as pd
>>> dates = pd.date_range(start='2016-10-01', end='2016-10-10', freq='1D')
>>> dates
DatetimeIndex(['2016-10-01', '2016-10-02', '2016-10-03', '2016-10-04',
               '2016-10-05', '2016-10-06', '2016-10-07', '2016-10-08',
               '2016-10-09', '2016-10-10'],
              dtype='datetime64[ns]', freq='D')
>>> freq = pd.infer_freq(dates)
>>> freq
'D'
>>> delta = pd.to_timedelta(freq)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/iivanov/anaconda2/lib/python2.7/site-packages/pandas/util/decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/iivanov/anaconda2/lib/python2.7/site-packages/pandas/tseries/timedeltas.py", line 102, in to_timedelta
    box=box, errors=errors)
  File "/home/iivanov/anaconda2/lib/python2.7/site-packages/pandas/tseries/timedeltas.py", line 148, in _coerce_scalar_to_timedelta_type
    result = tslib.convert_to_timedelta(r, unit, errors)
  File "pandas/tslib.pyx", line 2941, in pandas.tslib.convert_to_timedelta (pandas/tslib.c:52631)
  File "pandas/tslib.pyx", line 3256, in pandas.tslib.convert_to_timedelta64 (pandas/tslib.c:56028)
  File "pandas/tslib.pyx", line 3188, in pandas.tslib.parse_timedelta_string (pandas/tslib.c:54877)
ValueError: unit abbreviation w/o a number

Output of pd.show_versions()

## INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-42-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 26.1.1
Cython: 0.24.1
numpy: 1.11.1
scipy: 0.18.0
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.2
lxml: 3.6.4
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.40.0
pandas_datareader: None

@chris-b1
Copy link
Contributor

pd.to_timedelta can only handle fixed deltas (basically days and finer) - it isn't an API guarantee that frequency strings can be parsed as a delta, and in fact some frequencies can't be converted to a timedelta.

In [254]: dates = pd.bdate_range(start='2014-01-01', periods=10)

In [255]: pd.infer_freq(dates)
Out[255]: 'B'

If you need to convert a frequency string into a DateOffset object, which can be a fixed or relative delta, use the to_offset function.

In [253]: from pandas.tseries.frequencies import to_offset

In [256]: to_offset('D')
Out[256]: <Day>

In [257]: to_offset('2D')
Out[257]: <2 * Days>

In [258]: to_offset('B')
Out[258]: <BusinessDay>

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Oct 11, 2016

BTW, you can convert such an offset to a timedelta for certain types:

In [19]: pd.to_timedelta(to_offset('D'))
Out[19]: Timedelta('1 days 00:00:00')

In [20]: pd.to_timedelta(to_offset('B'))
...
ValueError: Invalid type for timedelta scalar: <class 'pandas.tseries.offsets.BusinessDay'>

(or alternative to_offset('D').delta)
As @chris-b1 noted above, relative offsets cannot be converted to a timedelta, so this also errors.

@idanivanov
Copy link
Author

Thank you for the comments! It makes sense now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants