Q's on Performer & Text Classification #19

Open
Muennighoff opened this issue Feb 8, 2021 · 1 comment

Comments

@Muennighoff

Thanks for the great work. I had a couple of questions while trying to reproduce the Performer results on byte-level text classification:

  1. Which kernel function are you using (softmax approximation or ReLU)?
  2. I found training to be very unstable. Do you take the final model after 20K steps, or the best checkpoint?
  3. With the learning rate scheduler you use, the learning rate is 0 when the first step is 0, isn't it? Shouldn't the training loop instead start at step 1, i.e. for step in range(1, X), in https://github.com/google-research/long-range-arena/blob/main/lra_benchmarks/text_classification/train.py?
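
To make question 3 concrete, here is a minimal sketch. It assumes the schedule is a product of a constant base rate, a linear warmup factor, and a reciprocal-square-root decay. The function name and the default values are hypothetical and not taken from the repository. Under that assumption, the warmup factor is step / warmup_steps, so an update taken at step 0 is scaled by zero.

```python
import math


def linear_warmup_rsqrt_decay(step, base_lr=0.05, warmup_steps=8000):
    """Hypothetical schedule: base_lr * linear_warmup * rsqrt_decay.

    This is an assumed form for illustration only; the defaults are
    placeholders, not the values used in the benchmark configs.
    """
    # Linear warmup: ramps from 0 at step 0 up to 1 at warmup_steps.
    warmup = min(1.0, step / warmup_steps)
    # Reciprocal square-root decay, held constant during warmup.
    decay = 1.0 / math.sqrt(max(step, warmup_steps))
    return base_lr * warmup * decay


# Compare a loop that starts at step 0 with one that starts at step 1.
for first_step in (0, 1):
    lr = linear_warmup_rsqrt_decay(first_step)
    print(f"first step = {first_step}: learning rate = {lr:.3e}")
```

If the schedule has this form, starting at step 0 only wastes the first update, since every later step gets a nonzero rate. Starting the loop at 1 removes that one no-op update.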

Looking forward to the implementations of the other models, thanks!

@jinfengr

FYI: the implementations of all models are available now.
