[BUG] IndexError: list index out of range #1872

Oussamakhammassi · 2024-01-29T09:22:49Z

I'm working on a Transformers4Rec project, and I want to do the preprocessing and the encoding of categorical features in my SQL framework before the table is passed to the model, without using the categorify() function. But I get this error when I debug the .from_schema() function:

rnyak · 2024-01-29T14:45:10Z

@Oussamakhammassi "without using categorify() function"

TF4Rec models are designed to read from the schema file. The Categorify op is critical because it provides the number of unique categories for a given column (categorical feature), and it automatically adds the Categorical tag in the schema.

If you don't use the Categorify op, you need to create a proper schema file yourself. A schema should have proper tags for all categorical and continuous features: tags such as categorical, continuous, is_list, is_ragged, etc.
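To make the point above concrete, here is a rough, library-free sketch of the per-column information a hand-written schema has to carry when the encoding was done elsewhere (for example in SQL). It is not the Merlin or NVTabular schema format: the `describe_column` helper, the dictionary keys, and the sample columns are all hypothetical, and the `min`/`max` fields are only an assumed stand-in for the cardinality information that Categorify would otherwise record.

```python
from typing import Dict, List, Sequence, Union

Scalar = Union[int, float]
Value = Union[Scalar, Sequence[Scalar]]


def describe_column(name: str, values: Sequence[Value], categorical: bool) -> Dict[str, object]:
    """Summarise one already-encoded column (hypothetical helper, not a Merlin API).

    The result mirrors the tags mentioned in the thread (categorical,
    continuous, is_list, is_ragged) plus the observed value range.
    """
    # A column is list-valued if any row holds a sequence instead of a scalar.
    is_list = any(isinstance(v, (list, tuple)) for v in values)
    rows: List[List[Scalar]] = [
        list(v) if isinstance(v, (list, tuple)) else [v] for v in values
    ]
    flat = [x for row in rows for x in row]
    if not flat:
        raise ValueError(f"column {name!r} has no values to describe")
    # Ragged means the per-row lists do not all share one length.
    row_lengths = {len(row) for row in rows}
    return {
        "name": name,
        "tags": ["categorical" if categorical else "continuous"],
        "is_list": is_list,
        "is_ragged": is_list and len(row_lengths) > 1,
        "min": min(flat),  # assumption: smallest encoded id / value seen
        "max": max(flat),  # assumption: largest encoded id / value seen
    }


# Hypothetical session data, already integer-encoded outside NVTabular.
columns = {
    "item_id": [[3, 7, 7], [1], [12, 4]],
    "price": [[9.5, 3.0, 3.0], [1.25], [7.0, 2.0]],
    "user_country": [2, 5, 2],
}
categorical_names = {"item_id", "user_country"}

schema_sketch = [
    describe_column(name, values, name in categorical_names)
    for name, values in columns.items()
]

for entry in schema_sketch:
    print(entry)
```

Each entry would still have to be translated by hand into the actual schema file that the model reads; the sketch only shows which facts need to be known for every column before that file can be written.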

You can check out one of the NVTabular notebooks here to see what a schema file looks like.

Oussamakhammassi · 2024-02-01T08:41:36Z

Can you give me a link to a tutorial on how to create a proper schema file?

rnyak · 2024-02-01T21:17:10Z

@Oussamakhammassi Sorry, we don't have a tutorial on how to create a schema file if you don't use NVTabular. If you use NVTabular, it is created automatically. What you can do is run one of the toy examples, take the resulting schema file, and change the numbers, feature names, and types to match your dataset.

Run this example up to cell 10; you will see that a schema file has been saved to disk. You can try modifying it.

Oussamakhammassi added the "bug" (Something isn't working) label on Jan 29, 2024
