Issue in train_ful, test_full, dev_full files #24

sajidaraz · 2021-03-13T05:00:09Z

I prepared the data following the dataproc_mimic_III.ipynb notebook and got six files: train_50, test_50, dev_50, train_full, test_full, and dev_full. I am having a problem with train_full, test_full, and dev_full: train_full contains 8686 unique labels, test_full contains 4075, and dev_full contains 3009. I don't know why the number of unique labels differs between the files, or how to make them equal so that I can train my model.

Kindly help me.

airingzhang · 2021-03-18T04:44:53Z

This is because some of the codes occur only once, so none of the three splits contains every unique code.
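To see the effect described above, here is a minimal sketch on made-up data. The `splits` dictionary and its codes are hypothetical stand-ins for the real files, not values taken from MIMIC-III. It counts the unique codes in each split and lists the codes that occur only once overall, which can therefore appear in at most one split.

```python
from collections import Counter

# Hypothetical toy data: each row is the list of codes for one document.
splits = {
    "train": [["401.9", "250.00"], ["401.9", "038.9"]],
    "dev": [["401.9", "V45.81"]],
    "test": [["250.00", "96.71"]],
}

# Unique codes seen in each split.
unique_per_split = {
    name: {code for row in rows for code in row} for name, rows in splits.items()
}

# How often each code occurs across all splits combined.
code_counts = Counter(
    code for rows in splits.values() for row in rows for code in row
)

# Codes that occur exactly once can live in only one split.
singletons = {code for code, n in code_counts.items() if n == 1}

# The full label space is the union over the three splits.
all_codes = set().union(*unique_per_split.values())

for name, codes in unique_per_split.items():
    missing = len(all_codes - codes)
    print(f"{name}: {len(codes)} unique codes, {missing} missing from this split")
print("codes occurring once:", sorted(singletons))
```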

sajidaraz · 2021-03-20T05:41:03Z

Can you kindly guide me on how to make the label sets the same size, so that we can train a model? The model does not accept different label dimensions for y_train, y_test, and y_valid.

airingzhang · 2021-03-20T16:55:27Z

I am not the author, but I guess this is actually the intended setting of this task (the full-label scenario): the training set does not see all of the unique labels.
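One common way to get label vectors of the same width for all three files is to build a single label index from the union of the codes in train, dev, and test, then encode every split against that index. The sketch below is not the repository's own code. It assumes each row's labels are stored as one semicolon-separated string, and the helper names (`parse_labels`, `build_label_index`, `multi_hot`) and the sample codes are hypothetical.

```python
def parse_labels(label_string):
    """Split a semicolon-separated label string into a list of codes."""
    return [code for code in label_string.split(";") if code]


def build_label_index(*label_columns):
    """Map every code seen in any split to a fixed column position."""
    codes = sorted(
        {code for column in label_columns for s in column for code in parse_labels(s)}
    )
    return {code: i for i, code in enumerate(codes)}


def multi_hot(label_column, label_index):
    """Encode each row as a 0/1 vector over the shared label index."""
    vectors = []
    for s in label_column:
        vec = [0] * len(label_index)
        for code in parse_labels(s):
            vec[label_index[code]] = 1
        vectors.append(vec)
    return vectors


# Hypothetical label columns standing in for train_full, dev_full, test_full.
train_labels = ["401.9;250.00", "401.9;038.9"]
dev_labels = ["401.9;V45.81"]
test_labels = ["250.00;96.71"]

label_index = build_label_index(train_labels, dev_labels, test_labels)
y_train = multi_hot(train_labels, label_index)
y_valid = multi_hot(dev_labels, label_index)
y_test = multi_hot(test_labels, label_index)

# Every split now has the same number of label columns.
print(len(y_train[0]), len(y_valid[0]), len(y_test[0]))
```

With a shared index, codes that never occur in the training set simply become all-zero columns in y_train, which matches the full-label setting described above.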

monk1337 · 2021-10-10T16:47:01Z

@sajidaraz @sarahwie Have you found a solution?
