regarding dataset prep scripts and audioset splits #3

Open
Moadab-AI opened this issue Mar 17, 2022 · 4 comments
Labels
question Further information is requested

Comments

@Moadab-AI

Moadab-AI commented Mar 17, 2022

Hi,

I couldn't find in any of your recent publications on AudioSet how you split the unbalanced (or even balanced) train segments into train and validation sets for hyperparameter tuning; I'm trying to replicate your results. Also, the Dropbox link you have listed for the PSLA experiments is down.
On another note, regarding FSD50K: could you elaborate on what the "forbidden" classes are and why they are excluded? Also, could you explain the purpose of this comment in prep_fsd50k.py when generating the JSON files?
"# only apply to the vocal sound data"

Thanks

@YuanGongND
Owner

Hi there,

Can you point me to the code that is related to the "forbidden" classes? Thanks!

-Yuan

@Moadab-AI
Author

Sorry, my bad, that was not in your code but in the original FSD50K release:
FSD50k/FSD50K.ground_truth/analyze_dataset.py

@YuanGongND
Owner

YuanGongND commented Mar 17, 2022

That's fine. As for your other questions:

I couldn't find in any of your recent publications on AudioSet how you split the unbalanced (or even balanced) train segments into train and validation sets for hyperparameter tuning.

No, we don't use a validation set for the AudioSet experiments. I think that is a common setting for most papers using AudioSet (e.g., this paper, see the footnote on page 1237; you can check the code of other papers to verify this point). Practically, it is non-trivial to sample a meaningful validation set due to label co-occurrence. For this reason, we did not report the best model, but instead reported the average performance of the last few model checkpoints during training. The model performance is not sensitive to most hyperparameters, and we didn't tune most of them in the paper. On FSD50K, which does have a validation split, the PSLA methods work equally well.
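To illustrate the reporting scheme described above, here is a minimal sketch of averaging the evaluation metric over the last few checkpoints instead of picking a single best model on a validation set. The function name and the example score list are hypothetical, not taken from the repository's code:

```python
def average_last_k(checkpoint_scores, k=5):
    """Return the mean evaluation score (e.g., mAP) over the
    last k checkpoints of a training run.

    checkpoint_scores: list of per-epoch evaluation scores,
    in training order (hypothetical data for illustration).
    """
    last = checkpoint_scores[-k:]  # handles runs shorter than k
    return sum(last) / len(last)

# Hypothetical per-epoch mAP values for a training run:
scores = [0.40, 0.42, 0.43, 0.431, 0.429, 0.432, 0.433]
print(round(average_last_k(scores, k=5), 4))  # → 0.431
```

Averaging over the tail of training reduces the variance of the reported number without requiring a held-out validation split to select a checkpoint.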

For the missing link: it seems to be a blank link, and I cannot remember whether I had something on Dropbox. I will try to fix it when I have some time.

For the "# only apply to the vocal sound data" comment: that might be a mistake. Did you see anything wrong with the prepared JSON file for the eval set? FYI, we collected a VocalSound dataset that will be released soon, and we ran some experiments combining FSD50K and VocalSound; that's why the comment is there. I may have forgotten to remove it when cleaning up and uploading the code. If you don't see an issue with the output JSON file, you can safely ignore the comment.

-Yuan

@YuanGongND YuanGongND added the question Further information is requested label Mar 17, 2022
@Moadab-AI
Author

Thanks for the explanation.
