Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Allow cast from cat to extension dtype #28762

Merged
merged 11 commits into from
Oct 8, 2019
Merged

BUG: Allow cast from cat to extension dtype #28762

merged 11 commits into from
Oct 8, 2019

Conversation

dsaxton
Copy link
Member

@dsaxton dsaxton commented Oct 2, 2019

Modifies categorical.astype to allow for casting to extension dtypes. Also fixes the merge issue identified in #28668.

@dsaxton
Copy link
Member Author

dsaxton commented Oct 3, 2019

Getting this mypy error: pandas/core/arrays/categorical.py:527: error: Argument 1 to "array" has incompatible type "Categorical"; expected "Sequence[object]".

Should pandas.core.construction.array be able to accept type Categorical?

@WillAyd
Copy link
Member

WillAyd commented Oct 3, 2019

Hmm have we ever considered making the ExtensionArray definition align with any of the Python abcs? To align with the Sequence interface would need __reversed__, __contains__, index, and count. Collection would only require __contains__ out of those

@TomAugspurger any thoughts? Would certainly help the type system

@TomAugspurger
Copy link
Contributor

I thought it was an instance of abc.Sequence, but apparently not. Apparently, ndarray's aren't Sequences either...

I don't have a strong opinion here.

@WillAyd
Copy link
Member

WillAyd commented Oct 3, 2019

I opened #28770 as a follow up to that. @dsaxton for now I think you can just # type: ignore that and ref the open issue

pandas/core/arrays/categorical.py Outdated Show resolved Hide resolved
pandas/tests/extension/test_categorical.py Outdated Show resolved Hide resolved
@jschendel jschendel added Bug Categorical Categorical Data Type ExtensionArray Extending pandas with custom dtypes or arrays. labels Oct 4, 2019
@jschendel jschendel added this to the 1.0 milestone Oct 4, 2019
@@ -520,6 +520,8 @@ def astype(self, dtype: Dtype, copy: bool = True) -> ArrayLike:
if dtype == self.dtype:
return self
return self._set_dtype(dtype)
if is_extension_array_dtype(dtype):
return array(self, dtype=dtype, copy=copy) # type: ignore # GH 28770
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the following an elif

pandas/tests/extension/test_categorical.py Show resolved Hide resolved
pandas/tests/extension/test_categorical.py Show resolved Hide resolved
pandas/tests/extension/test_categorical.py Outdated Show resolved Hide resolved
pandas/tests/extension/test_categorical.py Show resolved Hide resolved
pandas/tests/reshape/merge/test_merge.py Outdated Show resolved Hide resolved
pandas/tests/reshape/merge/test_merge.py Outdated Show resolved Hide resolved
pandas/tests/reshape/merge/test_merge.py Show resolved Hide resolved
@@ -217,6 +217,7 @@ Categorical

- Added test to assert the :func:`fillna` raises the correct ValueError message when the value isn't a value from categories (:issue:`13628`)
- Bug in :meth:`Categorical.astype` where ``NaN`` values were handled incorrectly when casting to int (:issue:`28406`)
- Bug in :meth:`Categorical.astype` not allowing for casting to extension dtypes (:issue:`28668`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add another note about the merge issue (which was the OP); IIRC its the same issue number. otherwise lgtm, ping on green.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback Included a note for the merge issue, let me know if it looks good and thanks for the review

@jreback jreback merged commit 48f1a67 into pandas-dev:master Oct 8, 2019
@jreback
Copy link
Contributor

jreback commented Oct 8, 2019

thanks @dsaxton

MarcoGorelli pushed a commit to MarcoGorelli/pandas that referenced this pull request Oct 8, 2019
MarcoGorelli pushed a commit to MarcoGorelli/pandas that referenced this pull request Nov 2, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Categorical Categorical Data Type ExtensionArray Extending pandas with custom dtypes or arrays.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge error on Categorical Interval columns
5 participants