
Error with RandomEdgeSplit for multilabel edge classification #9262
Open
sadrahkm opened this issue Apr 30, 2024 · 4 comments

sadrahkm commented Apr 30, 2024

🐛 Describe the bug

Recently, I've been dealing with a multi-label edge classification problem. In other words, an edge can have more than one label. So I implemented a simple GNN model to see if I get good results or not.

I have 935 label types and have encoded them using sklearn's MultiLabelBinarizer. I have checked and I'm sure that all the encoded labels are 0 or 1.
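
As an illustration (the label names below are made up, not the actual dataset), the encoding step looks something like this:

from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical example of the encoding described above: each edge's label set
# becomes a binary indicator row (935 columns in the real dataset).
mlb = MultiLabelBinarizer()
edge_label = mlb.fit_transform([{"A", "B"}, {"B"}, {"C"}])
# edge_label has shape (num_edges, num_classes) and contains only 0s and 1s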

But after splitting the edges with RandomLinkSplit, I noticed that there are more than two label values in the validation and test sets. In the train set the labels are 0 and 1, but in the validation set there are 0, 1, and 2, which makes things difficult. The following screenshot shows this: the first cell is the original data encoded with MultiLabelBinarizer, and the next three cells are the train/val/test sets, respectively, produced by the RandomLinkSplit call shown in the code block below.

[screenshot: edge_label values of the original data and of the train/val/test splits]

For example, I want to compute the AUC score during testing. I have attached the code and the error I got. I don't know what I should do, or why the edge-splitting transform returns more than two label values; I think it should only contain 0 or 1. I would appreciate your help with this.

import torch_geometric.transforms as T

# Split the edges into train/val/test sets
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    disjoint_train_ratio=None,
    add_negative_train_samples=False,
)
train_data, val_data, test_data = transform(data)


import torch
from torchmetrics.classification import MultilabelAUROC

@torch.no_grad()
def test(data):
    model.eval()
    z = model.encode(data.x, data.edge_index)      # node embeddings
    out = model.decode(z, data.edge_label_index)   # per-edge predictions

    # Macro-averaged AUROC over the 935 labels
    ml_auroc = MultilabelAUROC(num_labels=935, average="macro", thresholds=None)
    auc = ml_auroc(out.cpu(), data.edge_label.cpu())

    return auc

for epoch in range(1, 100):
    loss = train()
    val_auc = test(val_data)
    print(f'Epoch: {epoch:03d}, Loss: {loss:.4f}, Val AUC: {val_auc:.4f}')

Running this raises:

RuntimeError: Detected the following values in `target`: tensor([0, 1, 2]) but expected only the following values [0, 1].

Versions

Collecting environment information...
PyTorch version: 2.2.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Debian GNU/Linux 12 (bookworm) (x86_64)
GCC version: (Debian 12.2.0-14) 12.2.0
Clang version: Could not collect
CMake version: version 3.25.1
Libc version: glibc-2.36

Python version: 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] (64-bit runtime)
Python platform: Linux-6.1.0-20-amd64-x86_64-with-glibc2.36
...

sadrahkm added the bug label Apr 30, 2024

keeganq commented May 6, 2024

I was able to reproduce this problem with a minimal example. The root cause is that even when add_negative_train_samples=False, negative sampling still occurs for the val and test splits.

# From the RandomLinkSplit source: negative edges are always counted for the
# val/test splits, regardless of add_negative_train_samples.
num_neg_train = 0
if self.add_negative_train_samples:
    if num_disjoint > 0:
        num_neg_train = int(num_disjoint * self.neg_sampling_ratio)
    else:
        num_neg_train = int(num_train * self.neg_sampling_ratio)
num_neg_val = int(num_val * self.neg_sampling_ratio)
num_neg_test = int(num_test * self.neg_sampling_ratio)
num_neg = num_neg_train + num_neg_val + num_neg_test

Unfortunately, this not only adds negative edges to val_data and test_data, it also means that their edge labels are incremented by 1, whereas the train_data labels are left unchanged. In your example, label 1 in val_data corresponds to label 0 in train_data, and so on; label 0 in val_data indicates a negative link.
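
To illustrate the shift with made-up numbers (this is only a sketch of the behaviour described above, not actual library output):

import torch

# Hypothetical val_data.edge_label after RandomLinkSplit with negative sampling:
# positive edges keep their multi-label rows shifted up by one, and the appended
# all-zero rows correspond to the sampled negative edges.
val_edge_label = torch.tensor([[2, 1, 1],   # positive edge, originally [1, 0, 0]
                               [1, 2, 1],   # positive edge, originally [0, 1, 0]
                               [0, 0, 0]])  # sampled negative edge

neg_mask = (val_edge_label == 0).all(dim=1)   # rows of zeros = negative edges
original = val_edge_label[~neg_mask] - 1      # undo the +1 shift -> back to {0, 1}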

This seems like a very confusing kwarg, and possibly an unintended result. I'd be happy to submit a PR to try to fix this.


keeganq commented May 6, 2024

@sadrahkm A quick workaround is to pass the kwarg neg_sampling_ratio=0. to T.RandomLinkSplit. This will prevent negative sampling for the validation and test sets, and will also preserve the original labels in your dataset.
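
A minimal sketch of this workaround, applied to the split from the original report:

import torch_geometric.transforms as T

# Workaround: disable negative sampling for all splits so that val/test keep
# the original 0/1 multi-label targets instead of the shifted labels.
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    neg_sampling_ratio=0.0,            # no negative edges for val/test
    add_negative_train_samples=False,  # no negative edges for train
)
train_data, val_data, test_data = transform(data)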

sadrahkm (Author) commented

Thank you @keeganq for your help.

Right, I hadn't noticed that the add_negative_train_samples option only applies to the training samples, and that negative sampling is still performed automatically for the validation/test sets.

Yes, setting neg_sampling_ratio=0 fixes it. But I think this should be clarified in the documentation to avoid this kind of confusion.

sadrahkm (Author) commented

However, if we actually want negative samples for the train/val/test sets, the problem remains. In that case we would have to set add_negative_train_samples=True together with, say, neg_sampling_ratio=2.0, and then the val/test sets would again contain more than two label values, as I described in the problem statement.
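
For reference, this is the configuration described above that would re-trigger the label shift (a sketch, not a recommendation):

import torch_geometric.transforms as T

# With negative sampling enabled for every split, val/test edge labels are
# shifted again and contain values outside {0, 1}, as described in the issue.
transform = T.RandomLinkSplit(
    num_val=0.1,
    num_test=0.1,
    neg_sampling_ratio=2.0,           # two negative edges per positive edge
    add_negative_train_samples=True,  # also sample negatives for the train split
)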
