
-np.inf in mask_3d causes numerical instability ? #10

Open
chiragjn opened this issue Dec 20, 2017 · 1 comment

@chiragjn

I have found that using -np.inf in the inter-attention module (the attend part) often leads to NaN loss values, even with gradient clipping or very low learning rates. Replacing it with a large finite value such as -1e18 fixes it in my case.

Could there be an error in the masking applied before the attention scores are computed?

@erickrf (Owner) commented Dec 20, 2017

I never ran into this problem, and I have tried the code with different datasets. Are you sure you provided the correct sentence sizes?

If you're sure this happens, could you provide more information about your training setup, such as the data and some of the hyperparameters?
