Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Raise TypeError for float->timedelta promotion #16592

Merged
merged 1 commit into from Jun 19, 2020

Conversation

anirudh2290
Copy link
Member

Fixes #16566

@anirudh2290 anirudh2290 changed the title BUG: Raise TypeError for f32->timedelta promotion BUG: Raise TypeError for float->timedelta promotion Jun 12, 2020
@seberg
Copy link
Member

seberg commented Jun 16, 2020

@jorisvandenbossche in my branch on array-coercion, the one failure that is obviously related to the change (I expect most others are just due to the master branch/other issues), is a test like this:

    def test_infer_dtype_timedelta_with_na(self, na_value, delta):
        # starts with nan
        arr = np.array([na_value, delta])
        assert lib.infer_dtype(arr, skipna=True) == "timedelta"

        arr = np.array([na_value, delta, na_value])
        assert lib.infer_dtype(arr, skipna=True) == "timedelta"

Roughly, this test fails on my branch, because it currently promotes the other way around [delta, na_value] (so I can swap that, and it should be the same). This PR will deprecate the whole thing, which here would mean that in the future you will get an object array. I assume pandas would be happy about the general fix?

But, it also means you would get deprecation/future warnings in this type of array coercion... so the more general question is, whether we try to hack a more userfriendly warning (I did something similar in my start on string+numeric promotion), or whether we rather just rip off the band-aid here, considering that it is inconsistent and rather a bug to begin with.

I am starting to favor that approach. With regard to array-coercion, it will just make things work (by returning object arrays), which does not seem terrible. And since it was an error previously, it seems unlikely that someone try:/excepted: it.

Edit: Ops, embarrassing, meant to ping @jbrockmendel on this one, since I think he is more into the datetime stuff...

@jbrockmendel
Copy link
Contributor

what are na_value and delta in the failing test(s)?

@seberg
Copy link
Member

seberg commented Jun 16, 2020

Ah sorry, its np.nan and an arbitrary timedelta (nothing special). The problem is that sometimes np.promote_types(float, timedelta64) gives timedelta64, and sometimes it gives an error (which for array-coercion means object and object is what pandas needs I guess)

@jbrockmendel
Copy link
Contributor

Can you post a traceback? its hard to guess where lib.infer_dtype would go wrong here

@seberg
Copy link
Member

seberg commented Jun 16, 2020

Oh, my bad again, but I guess that also proofs that pandas is just not dependend on NumPy here. The line that changes – that I am worried about – is the np.array() line...

@seberg
Copy link
Member

seberg commented Jun 16, 2020

Sorry to be clear:

In [2]: np.array([np.nan, np.timedelta64(1, "D")])  
Out[2]: array([nan, numpy.timedelta64(1,'D')], dtype=object)

In [3]: np.array([np.timedelta64(1, "D"), np.nan])     
ValueError: Could not convert object to NumPy timedelta

Is the current behaviour. This PR makes to always return object. If we were to deprecate that though, we might end up with a warning in the second case. Which seems undesirable?

@jbrockmendel
Copy link
Contributor

This PR makes to always return object

This seems reasonable to me. It isnt obvious how that would break anything within pandas

@seberg
Copy link
Member

seberg commented Jun 19, 2020

@anirudh2290 I think I would like to try and put this in. Maybe we should add a short release note just in case?

@anirudh2290
Copy link
Member Author

@seberg done !

float->timedelta promotion will raise a TypeError
-------------------------------------------------
``np.promote_types("float32", "m8")`` and other float to timedelta promotion
which used to return "m8" (timedelta) before, will now raise a TypeError.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if we mention that it aligns it with np.promote_types("m8", "float32") to ease peoples mind that it really should not be a large change. E.g. could also smuggle the word "consistently" in there ;). "Float and timedelta promotion consistently raises TypeError"?

But, this is also fine, so maybe I will just merge, or update myself later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks ! i have made a change to address this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Anirudh, lets put this in.

@seberg
Copy link
Member

seberg commented Jun 19, 2020

Some strange test failure. I think they must be due to these changes being based on a not quite up-to-date branch. In any case nothing to worry about/unrelated to the change.

@seberg seberg merged commit 2f156d8 into numpy:master Jun 19, 2020
@seberg
Copy link
Member

seberg commented Jun 19, 2020

Ehhh. My bad, I completely missed that this is also the (bad?) behaviour for all or some integers:

In [1]: np.promote_types("m8", "uint64") 
TypeError: invalid type promotion

In [2]: np.promote_types("uint64", "m8")  
Out[2]: dtype('<m8')

And I guess we can again simply change it without any real downstream impact, since ufuncs don't use promotion often to begin with, and even if, have the loops defined correctly for dtypes...

@anirudh2290
Copy link
Member Author

Ehhh. My bad, I completely missed that this is also the (bad?) behaviour for all or some integers:

In [1]: np.promote_types("m8", "uint64") 
TypeError: invalid type promotion

In [2]: np.promote_types("uint64", "m8")  
Out[2]: dtype('<m8')

And I guess we can again simply change it without any real downstream impact, since ufuncs don't use promotion often to begin with, and even if, have the loops defined correctly for dtypes...

aah, sure. let me open another pr for this !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: timedelta and float promotion sometimes works
3 participants