Evaluation of polars.datetime returns null when dealing with nanoseconds #16124

Open
2 tasks done
marenwestermann opened this issue May 8, 2024 · 2 comments
Labels
A-timeseries Area: date/time functionality bug Something isn't working P-low Priority: low python Related to Python Polars

Comments

@marenwestermann
Contributor

marenwestermann commented May 8, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

>>> import polars as pl
>>> a = pl.datetime(2024, 1, 1, 2, 2, 2, 123456789)
>>> pl.select(a)
shape: (1, 1)
┌──────────────┐
│ datetime     │
│ ---          │
│ datetime[μs] │
╞══════════════╡
│ null         │
└──────────────┘
>>> b = pl.datetime(2024, 1, 1, 2, 2, 2, 123456789, time_unit='ns')
>>> pl.select(b)
shape: (1, 1)
┌──────────────┐
│ datetime     │
│ ---          │
│ datetime[ns] │
╞══════════════╡
│ null         │
└──────────────┘

Log output

No response

Issue description

It is possible to include nanoseconds when creating an expression with polars.datetime. However, when the expression gets evaluated, the result is null (see examples above).

Expected behavior

A warning should be raised that polars.datetime cannot be evaluated if nanoseconds are included. Additionally, the option "ns" might need to be removed from the documentation of the parameter time_unit.
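
The requested check could look roughly like the guard below, run before the expression is built. This is a plain-Python sketch, not Polars code: `check_microsecond` and `MAX_MICROSECOND` are hypothetical names, and it assumes the seventh argument is a microsecond count that should lie in 0..999999.

```python
import warnings

# Assumption: the seventh argument of pl.datetime is a microsecond count.
MAX_MICROSECOND = 999_999


def check_microsecond(microsecond: int) -> bool:
    """Return True if the value is in range; otherwise warn and return False.

    Hypothetical helper for illustration only, not part of Polars.
    """
    if 0 <= microsecond <= MAX_MICROSECOND:
        return True
    warnings.warn(
        f"microsecond={microsecond} is outside 0..{MAX_MICROSECOND}; "
        "the resulting datetime may evaluate to null",
        UserWarning,
        stacklevel=2,
    )
    return False
```

With this guard, the value 123456789 from the example above would trigger the warning, while 123456 would pass silently.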

Installed versions

--------Version info---------
Polars:               0.20.25
Index type:           UInt32
Platform:             macOS-14.4.1-arm64-arm-64bit
Python:               3.12.3 (v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:47) [Clang 13.0.0 (clang-1300.0.29.30)]

----Optional dependencies----
adbc_driver_manager:  0.11.0
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            0.17.3
fastexcel:            0.10.4
fsspec:               2023.12.2
gevent:               24.2.1
hvplot:               0.10.0
matplotlib:           3.8.4
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             3.1.2
pandas:               2.2.2
pyarrow:              16.0.0
pydantic:             2.7.1
pyiceberg:            0.6.1
pyxlsb:               1.0.10
sqlalchemy:           2.0.30
torch:                <not installed>
xlsx2csv:             0.8.2
xlsxwriter:           3.2.0
@marenwestermann marenwestermann added bug Something isn't working needs triage Awaiting prioritization by a maintainer python Related to Python Polars labels May 8, 2024
@marenwestermann
Contributor Author

ping @MarcoGorelli

@MarcoGorelli MarcoGorelli added A-timeseries Area: date/time functionality P-low Priority: low and removed needs triage Awaiting prioritization by a maintainer labels May 9, 2024
@datenzauberai
Contributor

datenzauberai commented May 17, 2024

The problem here is that time_unit determines the internal representation of the polars.datetime, not the unit of the seventh parameter (which is always microseconds).

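import polars as pl

# The seventh argument is microseconds in both cases; time_unit only sets the stored precision.
ms = pl.datetime(1990, 12, 31, 10, 0, 59, 999999, time_unit="ms").alias("ms")
ns = pl.datetime(1990, 12, 31, 10, 0, 59, 999999, time_unit="ns").alias("ns")
pl.select(ms, ns)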

However, I think this should not silently return null or overflow when microseconds > 999999, and the current behavior is not even consistent between the two:

# overflows to 1991-01-01 00:00:00.200
pl.select(pl.datetime(1990, 12, 31, 23, 59, 59, 1200000, time_unit="ns").alias("ns"))
# returns null
pl.select(pl.datetime(1990, 12, 31, 23, 59, 58, 1200000, time_unit="ns").alias("ns"))

It should consistently do one of the three: fail, return null, or overflow.
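
For comparison, Python's standard-library `datetime.datetime` takes the "fail" option: a `microsecond` outside 0..999999 raises `ValueError`. The sketch below shows that, together with a hypothetical helper `split_nanoseconds` (not a Polars function) that splits a sub-second nanosecond count into the microsecond part that fits the seventh argument and the leftover nanoseconds.

```python
from datetime import datetime


def split_nanoseconds(nanoseconds: int) -> tuple[int, int]:
    """Split a sub-second nanosecond count into (microseconds, leftover nanoseconds).

    Hypothetical helper for illustration only, not part of Polars.
    """
    if not 0 <= nanoseconds < 1_000_000_000:
        raise ValueError("nanoseconds must be in 0..999999999")
    return divmod(nanoseconds, 1_000)


# The stdlib constructor rejects an out-of-range microsecond instead of
# overflowing or returning a null value.
try:
    datetime(1990, 12, 31, 23, 59, 59, 1_200_000)
    stdlib_rejected = False
except ValueError:
    stdlib_rejected = True

# 123456789 ns -> 123456 microseconds plus 789 leftover nanoseconds.
microsecond, leftover_ns = split_nanoseconds(123_456_789)
```

The microsecond part can be passed as the seventh argument of `pl.datetime`. The leftover nanoseconds would have to be added separately, for example via `pl.duration(nanoseconds=...)` on a `time_unit="ns"` datetime; that combination is an assumption and has not been checked against the Polars version in this report.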

Projects
Status: Ready
Development

No branches or pull requests

3 participants