Question regarding Pathfinder and Listops performance #60

Open
LeoXinhaoLee opened this issue Sep 25, 2023 · 2 comments

@LeoXinhaoLee

Hi, thank you for releasing the code for this inspiring work! When I tried to reproduce the Transformer and Linear Transformer results on the Pathfinder32 and ListOps tasks, I ran into the following problems:

(1) On Pathfinder32, both Transformer and Linear Transformer reached only about 50% accuracy. When I replaced the fixed positional encoding (from the official config) with a learnable positional embedding, Transformer reached around 70%, but Linear Transformer stayed at 50%.

(2) On ListOps, Transformer reached only about 17% accuracy, with either the fixed positional encoding (official config) or a learnable positional embedding.
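For readers unfamiliar with the two options compared above, here is a minimal, dependency-free sketch of a fixed sinusoidal positional encoding next to a learnable positional embedding. It is an illustration only: the function names, the `stddev` default, and the plain-list representation are assumptions, not the repository's actual implementation or config.

```python
import math
import random


def sinusoidal_positions(seq_len, d_model):
    """Fixed (non-trainable) sinusoidal table with shape [seq_len][d_model]."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each (sin, cos) pair of dimensions shares one frequency.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def init_learnable_positions(seq_len, d_model, stddev=0.02, seed=0):
    """Randomly initialized table; in a real model these entries would be
    trainable parameters updated by the optimizer (hypothetical init scale)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, stddev) for _ in range(d_model)]
            for _ in range(seq_len)]


def add_positions(token_embeddings, position_table):
    """Add position information to token embeddings element-wise."""
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(token_embeddings, position_table)]
```

The only structural difference is that the first table is computed once and never updated, while the second is a parameter the model can adapt during training.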

Thank you very much for your help!

lucaslingle commented Jan 7, 2024

For ListOps, I think the checkpointing is broken somehow. The average accuracy across runs after reloading a checkpoint appears to be the same as that of a randomly initialized model.
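As a point of reference for "the same as a randomly initialized model": Pathfinder is a binary task, so uniform guessing gives 50%, and ListOps has 10 output classes, so uniform guessing gives 10%. A quick way to put a reported accuracy in context is to compare it against both the uniform-chance rate and the majority-class rate of the evaluation labels. The helper below is a hypothetical sketch, and the label distribution of the actual test set is not given in this thread.

```python
from collections import Counter


def chance_baselines(labels, num_classes):
    """Return (uniform_chance, majority_class_rate) for a list of labels."""
    if not labels:
        raise ValueError("labels must be non-empty")
    uniform = 1.0 / num_classes
    # Accuracy of a model that always predicts the most frequent label.
    majority = Counter(labels).most_common(1)[0][1] / len(labels)
    return uniform, majority
```

An accuracy that sits at or near either number suggests the model is not using its input, which is consistent with untrained or incorrectly restored weights.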

However, you can get the correct result for the trained model by evaluating on the test set at the end of training, rather than saving the model to a checkpoint and reloading it.
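One way to check whether a save/reload round trip is the culprit is to compare the in-memory parameters at the end of training with the parameters restored from disk. The sketch below uses `pickle` and nested dicts of float lists purely as stand-ins; the repository's real checkpoint format and API are not shown in this thread, and all names here are hypothetical.

```python
import os
import pickle
import tempfile


def save_params(params, path):
    """Stand-in for the training script's checkpoint writer."""
    with open(path, "wb") as f:
        pickle.dump(params, f)


def load_params(path):
    """Stand-in for the training script's checkpoint reader."""
    with open(path, "rb") as f:
        return pickle.load(f)


def max_abs_diff(a, b):
    """Largest absolute difference between two nested dicts of float lists."""
    if isinstance(a, dict):
        if not isinstance(b, dict) or a.keys() != b.keys():
            raise ValueError("parameter trees have different keys")
        return max((max_abs_diff(a[k], b[k]) for k in a), default=0.0)
    if len(a) != len(b):
        raise ValueError("parameter leaves have different lengths")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


# Toy parameters standing in for the trained model state.
trained = {"dense": {"kernel": [0.5, -1.25], "bias": [0.1]}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ckpt.pkl")
    save_params(trained, path)
    restored = load_params(path)

# A healthy round trip restores exactly what was saved.
assert max_abs_diff(trained, restored) == 0.0
```

If the difference is nonzero, or the restored tree matches a fresh initialization instead of the trained state, the reload path is restoring the wrong parameters, and evaluating in memory at the end of training sidesteps the problem.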

@lucaslingle

@LeoXinhaoLee

By the way, did you change any other settings for Pathfinder32? I tried your suggestion but I am still getting only 50% accuracy for a vanilla Transformer, even with learnable position embeddings.

Thanks for any insights you can provide!

2 participants