BUG: not reproducible error FloatingPointError: overflow encountered in multiply in the following sequence: read_csv followed by to_datetime with pandas version 2.2.2
#58419
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
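A minimal sketch of the loop described in this report. The column name `ts`, the `unit="s"` and `errors="coerce"` arguments, the iteration count, and the helper name `count_failures` are assumptions for illustration only; they are not taken from the attached data.csv.

```python
import pandas as pd


def count_failures(csv_path, column="ts", iterations=100):
    """Re-read the CSV and convert one column on every iteration.

    Returns the number of iterations that raised FloatingPointError.
    `column` is a hypothetical column name holding epoch seconds.
    """
    failures = 0
    for _ in range(iterations):
        # The read happens inside the loop on purpose: the report says the
        # error goes away when the frame is read once and deep-copied.
        df = pd.read_csv(csv_path)
        try:
            pd.to_datetime(df[column], unit="s", errors="coerce")
        except FloatingPointError:
            # Intermittent error named in the issue title.
            failures += 1
    return failures
```

Calling `count_failures("data.csv", column=..., iterations=10000)` with the relevant column of the attached file would then report how often the error shows up in a run.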
Issue Description
With pandas 2.2.2, I sometimes get the error FloatingPointError: overflow encountered in multiply (I don't get this error with pandas 2.1.4).
The error is not repeatable, hence the loop. I reduced the input file as much as possible while still triggering the error, which is why the attached CSV file has 200 rows. I don't know whether the issue comes from read_csv (I got the same problem with read_parquet) or from to_datetime. If read_csv is moved outside the loop and I make a deepcopy at the beginning of each iteration, the problem does not occur, so my hunch is that it is linked to the reading process (read_csv in the example).
Expected Behavior
I expect the loop body to behave the same way on every iteration: either it works every time or it fails every time.
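For comparison, the variant reported not to fail (read once, deep-copy per iteration) could be sketched as follows. As before, the column name `ts`, the `unit="s"` and `errors="coerce"` arguments, and the helper name `count_failures_with_copy` are hypothetical and chosen only for illustration.

```python
import copy

import pandas as pd


def count_failures_with_copy(csv_path, column="ts", iterations=100):
    """Read the CSV once, then convert a fresh deep copy on every iteration.

    Returns the number of iterations that raised FloatingPointError.
    `column` is a hypothetical column name holding epoch seconds.
    """
    base = pd.read_csv(csv_path)  # single read, outside the loop
    failures = 0
    for _ in range(iterations):
        df = copy.deepcopy(base)  # independent copy of the data each time
        try:
            pd.to_datetime(df[column], unit="s", errors="coerce")
        except FloatingPointError:
            failures += 1
    return failures
```

If the read step is involved, as suspected above, this count should stay at zero while the read-inside-the-loop version occasionally fails on the same file.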
Installed Versions
INSTALLED VERSIONS
commit : d9cdd2e
python : 3.11.8.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-105-generic
Version : #115~20.04.1-Ubuntu SMP Mon Apr 15 17:33:04 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 65.5.0
pip : 24.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 16.0.0
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None
data.csv