Model fails to converge on transfer to audio backtesting problem #19
Comments
Hi there, do you mean you fine-tuned our pretrained model for a regression task? What do you mean by this?
-Yuan
Thank you for your reply! I mainly use this dataset for fine-tuning; I extracted the audio from it (https://chalearnlap.cvc.uab.cat/dataset/24/description/). Each audio is a 15-second speech clip, and an MLP is attached after the model to map the model's final output to shape (batch_size, 5), where 5 corresponds to the regression values of the five personality traits for each audio.
In my experiments, I tried adjusting the learning rate and other parameters, tried removing the masking and mixup in the data preprocessing, set input_tdim to 1530 to match my audio length and label_dim to 512, and finally performed the regression prediction through the following code: nn.Sequential(
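For reference, a minimal sketch of the kind of regression head described above; the layer sizes and hidden width here are assumptions, since the original snippet is truncated:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: map a 512-dim model output (label_dim=512 as
# described above) to 5 personality-trait scores. The hidden size of
# 128 is an assumption, not taken from the original code.
regression_head = nn.Sequential(
    nn.Linear(512, 128),
    nn.ReLU(),
    nn.Linear(128, 5),  # one regression value per trait
)

features = torch.randn(4, 512)     # stand-in for the backbone output
preds = regression_head(features)  # shape: (batch_size, 5)
print(preds.shape)
```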
There are a few things:
Is Sigmoid common for regression? Setting label_dim to 512 (for classification) and then adding a few dense layers seems redundant. You can just change the last MLP layer to a regression head. ssast/src/models/ast_models.py, Lines 166 to 167 in a1a3eec
But I know very little about your task; you will need to tune the parameters yourself. For some networks, we use a larger learning rate for the MLP layer because it is randomly initialized while the other parameters are pretrained. I mainly answer questions related to what we presented in the paper, and it is hard for me to answer questions about new tasks or uses of the model. -Yuan
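The larger learning rate for the randomly initialized MLP layer mentioned above can be set with optimizer parameter groups. A minimal sketch, where the modules and learning-rate values are assumptions for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical two-part model: a pretrained backbone plus a freshly
# initialized regression head (both stand-ins for illustration).
backbone = nn.Linear(512, 512)
head = nn.Linear(512, 5)

# Smaller LR for the pretrained weights, larger LR for the random head.
optimizer = torch.optim.Adam([
    {"params": backbone.parameters(), "lr": 1e-5},
    {"params": head.parameters(), "lr": 1e-3},
])

print([g["lr"] for g in optimizer.param_groups])
```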
Another minor point is that you said there are 5 regression values, but
Dear Yuan and authors,
First of all, thank you for your paper. I recently migrated your pretrained model to a regression task in personality computing. After attaching several fully connected layers to your original model, the predicted values only stay at a very low level within a small interval during training, with no meaningful change. Have you done any related regression experiments? What could be causing this?
Sorry to bother you with my question, and thank you very much for reading it.
yang