[REA] How to remove tags? #1855

Open
AresDan opened this issue Jul 4, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@AresDan

AresDan commented Jul 4, 2023

Hello,

How can I remove tags from the schema while building a preprocessing workflow with NVTabular? I want to extract the last element of a list column that was sliced, but the LIST tag is carried along with it, and I couldn't find any function that removes it.

Thank you in advance.

@karlhigley karlhigley added the enhancement New feature or request label Jul 5, 2023
@karlhigley
Contributor

@jperez999 @radekosmulski @rnyak Since we have operators that add tags, operators that remove them also seem like something we should have. This may also be a case where one of the operators (ListSlice?) should remove the list tag when the result is a scalar column. Would one (or more) of you be up for tackling this issue?
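
To make the request concrete, here is a toy model of the behaviour being discussed, written in plain Python. It does not use Merlin's schema classes: the dict-of-tag-sets schema and the `remove_tags` helper are hypothetical and only illustrate dropping one tag from one column while leaving everything else untouched.

```python
def remove_tags(schema, column, tags):
    """Return a copy of `schema` with `tags` removed from `column`.

    `schema` is a toy mapping of column name -> set of tag strings. It is a
    stand-in for illustration, not the Merlin Schema class.
    """
    updated = {name: set(col_tags) for name, col_tags in schema.items()}
    updated[column] -= set(tags)
    return updated


# A sliced column that now holds one value per row but still carries "list".
schema = {
    "item_id-list": {"categorical", "list"},
    "item_id-last": {"categorical", "list"},
}
fixed = remove_tags(schema, "item_id-last", ["list"])
```

An operator along these lines would mirror the existing tag-adding operators, which is the symmetry this comment is asking for.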

@rnyak
Contributor

rnyak commented Jul 6, 2023

@karlhigley agreed this can be a useful feature, let's sync.

@rnyak
Contributor

rnyak commented Jul 6, 2023

@AresDan maybe as a workaround you can do something like this:

  • if you are using the groupby op to generate the item-id-list column, you can add last to the aggregations, like ("item_id": ["list", "count", "last"],), so that it automatically creates a column holding the last item id and won't tag item_id-last as LIST.
  • then remove the last item from the item-id-list column.
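
A plain-Python sketch of what this workaround computes, for anyone who wants to see the shapes involved. It is not NVTabular code: the `session_id` and `timestamp` field names and the sample rows are assumptions, and the `-` separator in the output names simply follows the spelling used in this thread.

```python
from collections import defaultdict

# Hypothetical interaction log; field names are illustrative only.
rows = [
    {"session_id": 1, "timestamp": 1, "item_id": 10},
    {"session_id": 1, "timestamp": 2, "item_id": 11},
    {"session_id": 1, "timestamp": 3, "item_id": 12},
    {"session_id": 2, "timestamp": 1, "item_id": 20},
]


def aggregate_sessions(rows):
    """Emulate the "list", "count" and "last" aggregations per session."""
    grouped = defaultdict(list)
    for row in sorted(rows, key=lambda r: (r["session_id"], r["timestamp"])):
        grouped[row["session_id"]].append(row["item_id"])

    out = []
    for session_id, items in grouped.items():
        out.append({
            "session_id": session_id,
            # Second step of the workaround: drop the last item from the list.
            "item_id-list": items[:-1],
            "item_id-count": len(items),
            # One value per session, so this column is a scalar, not a list.
            "item_id-last": items[-1],
        })
    return out


sessions = aggregate_sessions(rows)
```

Note that a session with a single interaction ends up with an empty list and a count of 1, which matters for any length-based filtering done afterwards.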

@AresDan
Author

AresDan commented Jul 6, 2023

> @AresDan may be as a workaround you can do something like that:
>
>   • if you are using groupby op to generate item-id-list col, then you can add last like ("item_id": ["list", "count", "last"],) so that it will automatically create a col of last item-id and it wont tag item_id-last as LIST.
>   • then remove the last item from the item-id-list column.

This is what I tried as well, and it worked. However, when I filter item-id-list to keep only lists of length 2 or more, I also need to filter item-id-last somehow, and I can't, because that feature has length 1. If I don't filter item-id-last, then item-id-list and item-id-last end up with different shapes when I join them, and NaN values appear in the final table.
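
One way to picture a fix is to filter on session length once, on the row that holds both columns, before the two columns are separated; the list and the last-item column then cannot go out of step. The sketch below is plain Python rather than NVTabular, and the row layout and the `item_id-count` column are assumptions carried over from the workaround quoted above.

```python
# Hypothetical per-session rows, as the groupby workaround might produce them
# (the last item has already been removed from the list column).
sessions = [
    {"session_id": 1, "item_id-list": [10, 11], "item_id-count": 3, "item_id-last": 12},
    {"session_id": 2, "item_id-list": [], "item_id-count": 1, "item_id-last": 20},
    {"session_id": 3, "item_id-list": [30], "item_id-count": 2, "item_id-last": 31},
]


def filter_min_length(sessions, min_length=2):
    """Keep whole rows whose session has at least `min_length` items.

    Filtering the row (not the list column alone) keeps the list and the
    scalar last-item column aligned, so a later join has no missing values.
    """
    return [s for s in sessions if s["item_id-count"] >= min_length]


kept = filter_min_length(sessions)
lists = {s["session_id"]: s["item_id-list"] for s in kept}
lasts = {s["session_id"]: s["item_id-last"] for s in kept}
```

Whether the same ordering can be expressed directly in an NVTabular workflow, for example by filtering rows on the count column before selecting the list and last columns, is not confirmed anywhere in this thread.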
