[BUG] Throw warning if reserved column is used #1845

Open
bschifferer opened this issue Jun 21, 2023 · 0 comments · May be fixed by #1846
Labels: bug (Something isn't working), P2

@bschifferer (Contributor) commented:

Describe the bug
It seems that a column cannot be named labels when it is passed to the Categorify op: the op silently leaves such a column unchanged.

Steps/Code to reproduce bug

import cudf
import nvtabular as nvt
from nvtabular.ops import *


df = cudf.DataFrame({'labels': [10,11,12]})

feat = ['labels'] >> nvt.ops.Categorify()

workflow = nvt.Workflow(feat)

dataset = nvt.Dataset(df, cpu=False)
workflow.fit(dataset)
workflow.transform(dataset).compute()

The output is the original, unencoded input: [10, 11, 12].

Expected behavior
The output should be the categorified column.

If labels cannot be used as a column name in Categorify, we should at least raise a warning (or even an error) saying so.
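The kind of check requested here could look roughly like the sketch below. It is not NVTabular code: the helper name warn_on_reserved_columns and the RESERVED_COLUMN_NAMES set are hypothetical, and the set contains only labels because that is the one name this issue reports as a problem.

```python
import warnings

# Assumption: "labels" is the only reserved name. It comes from this issue,
# not from an authoritative list in the NVTabular source.
RESERVED_COLUMN_NAMES = frozenset({"labels"})


def warn_on_reserved_columns(column_names, reserved=RESERVED_COLUMN_NAMES):
    """Warn if any requested column name clashes with a reserved name.

    Returns the list of clashing names so callers can decide whether to
    escalate the warning to an error.
    """
    clashes = [name for name in column_names if name in reserved]
    if clashes:
        warnings.warn(
            f"Column name(s) {clashes} are reserved and may be left "
            "untransformed; consider renaming them first.",
            UserWarning,
            stacklevel=2,
        )
    return clashes
```

Calling warn_on_reserved_columns(['labels']) before building the workflow would emit a UserWarning instead of silently returning the input unchanged. A stricter variant could raise ValueError when the returned list is non-empty.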

Environment details
I tested this in the pytorch:22.12 container. From reading the NVT code, it seems that labels is a special column name:

https://github.com/NVIDIA-Merlin/NVTabular/blob/main/nvtabular/ops/categorify.py#L1645

@bschifferer bschifferer added the bug Something isn't working label Jun 21, 2023
@karlhigley karlhigley added this to the Merlin 23.07 milestone Jun 21, 2023