Clarifications for creating sequential batches instead of random sampling. #112

Open
Raviteja-banda opened this issue Jan 13, 2023 · 0 comments

Raviteja-banda commented Jan 13, 2023

Please correct me if I'm wrong:
In speaker_id.py, the function create_batches_rnd() selects a batch of random samples and then takes a random 200 ms chunk from each one (the chunk length is 3200 samples at a sampling rate of 16000 Hz). This way, the same file may be selected in two different batches, and some files may never be selected at all. As a result, the model might never be trained on some labels. Am I correct in saying so?
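
To make the concern concrete, here is a minimal sketch of how many files go unseen under random selection. It is not the repository's code: it assumes files are drawn independently and uniformly with replacement, and the names count_unseen_files and expected_unseen_fraction are hypothetical helpers introduced only for this illustration.

```python
import random


def count_unseen_files(n_files, batch_size, n_batches, seed=0):
    """Simulate random batch selection and count files that were never picked.

    Assumption: each batch draws `batch_size` file indices uniformly at
    random, with replacement, independently of earlier batches.
    """
    rng = random.Random(seed)
    seen = [False] * n_files
    for _ in range(n_batches):
        for _ in range(batch_size):
            seen[rng.randrange(n_files)] = True
    return seen.count(False)


def expected_unseen_fraction(n_files, total_draws):
    """Probability that one given file is missed by all `total_draws` draws."""
    return (1.0 - 1.0 / n_files) ** total_draws


if __name__ == "__main__":
    n_files, batch_size, n_batches = 1000, 10, 100
    unseen = count_unseen_files(n_files, batch_size, n_batches)
    fraction = expected_unseen_fraction(n_files, batch_size * n_batches)
    print(f"unseen files in simulation: {unseen} of {n_files}")
    print(f"expected unseen fraction:   {fraction:.3f}")
```

Under these assumptions, drawing as many samples as there are files leaves roughly a third of the files untouched, but the unseen fraction shrinks geometrically as more batches are drawn, so whether labels are actually missed depends on the total number of batches used in training.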

What is the effect of selecting the files in sequence instead of at random? That way, each audio file is selected exactly once, and all audio samples are guaranteed to be used for training. It also reduces the number of batches, and hence the training time.
Could someone please clarify these things?
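
For reference, the sequential alternative I have in mind could look like the sketch below. This is only an illustration under my own assumptions, not code from the repository: sequential_batches and random_chunk_start are hypothetical names, and the random chunk within each file is kept as in the current scheme.

```python
import random


def sequential_batches(n_files, batch_size, shuffle=True, seed=0):
    """Yield lists of file indices so that every file appears exactly once.

    With shuffle=True the order is randomized once per pass, but no file
    is repeated or skipped. The last batch may be smaller than batch_size.
    """
    order = list(range(n_files))
    if shuffle:
        random.Random(seed).shuffle(order)
    for start in range(0, n_files, batch_size):
        yield order[start:start + batch_size]


def random_chunk_start(signal_len, chunk_len, rng):
    """Pick a start index so that a chunk of chunk_len fits in the signal."""
    return rng.randrange(signal_len - chunk_len + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    chunk_len = 3200  # 200 ms at 16000 Hz, as described above
    for batch in sequential_batches(n_files=10, batch_size=4):
        # Assumption for the demo: every file is 1 s long (16000 samples).
        starts = [random_chunk_start(16000, chunk_len, rng) for _ in batch]
        print(batch, starts)
```

Note that in this sketch each file still contributes only one 200 ms chunk per pass, so how much of the audio inside each file is covered would depend on how many passes are run.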
