
For discriminative loss, is the true NCE batch size the number of masked patches? #26

Open
hillup opened this issue Oct 18, 2023 · 2 comments
Labels
question Further information is requested

Comments

hillup commented Oct 18, 2023

[screenshot: loss computation code]
In this piece of code, the loss appears to be computed at the granularity of individual samples.


hillup commented Oct 18, 2023

So even if you increase the number of GPUs, contrastive learning will not see more negative examples.

@YuanGongND YuanGongND added the question Further information is requested label Dec 16, 2023
YuanGongND (Owner) commented

For discriminative loss, is the true NCE batch size the number of masked patches?

In line 347 in your screenshot, the NCE loss is accumulated over all samples in the batch, but the negative samples all come from the same spectrogram. I.e., say B=12 (you have 12 spectrograms in a batch), each spectrogram has 512 patches, and you mask 400 of them. Then the number of negative samples is always 400-1=399, but the NCE loss won't update until it goes through all 12 spectrograms.

So even if you increase the number of gpus, contrastive learning will not see more negative examples.

The number of negative samples will always be #masked_patches-1.
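To make the behavior concrete, here is a minimal NumPy sketch of a per-spectrogram InfoNCE loss as described above. The function name, shapes, and temperature are illustrative assumptions, not the repository's actual code; the point is that each masked patch's negatives are the other M-1 masked patches of the *same* spectrogram, so increasing B (or the number of GPUs) only averages over more spectrograms without adding negatives.

```python
import numpy as np

def nce_loss_per_spectrogram(pred, target, temperature=0.1):
    """Hypothetical sketch of per-spectrogram InfoNCE.

    pred, target: (B, M, D) arrays -- B spectrograms, M masked patches
    each, D-dimensional embeddings. For patch j, the positive is
    target[i, j] and the negatives are the other M-1 patches of the
    SAME spectrogram i; patches from other spectrograms are never used.
    """
    B, M, D = pred.shape
    # L2-normalize embeddings so the dot product is a cosine similarity
    pred = pred / np.linalg.norm(pred, axis=-1, keepdims=True)
    target = target / np.linalg.norm(target, axis=-1, keepdims=True)
    total = 0.0
    for i in range(B):
        sim = pred[i] @ target[i].T / temperature  # (M, M) similarities
        # Softmax cross-entropy with the diagonal as positives:
        # each row has exactly M-1 negatives, all from spectrogram i.
        log_z = np.log(np.exp(sim).sum(axis=1))
        total += np.mean(log_z - np.diag(sim))
    # Averaging over B spectrograms; the negative count stays M-1
    # no matter how large the (effective) batch is.
    return total / B
```

With this structure, gathering patches across GPUs (e.g. via `torch.distributed.all_gather` in a PyTorch implementation) would be required to actually grow the negative pool; simply enlarging the batch does not.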
