Question regarding Pathfinder and Listops performance #60

LeoXinhaoLee · 2023-09-25T01:24:17Z

Hi, thank you for releasing the code for this inspiring work! While trying to reproduce the Transformer and Linear Transformer results on the Pathfinder32 and ListOps tasks, I ran into the following problems:

(1) Transformer and Linear Transformer both reached only about 50% accuracy on the Pathfinder32 task. When I replaced the fixed positional encoding (from the official config) with a learnable positional embedding, Transformer reached around 70%, but Linear Transformer stayed at 50%.

(2) On the ListOps task, Transformer reached only about 17% accuracy with either the fixed positional encoding (official config) or a learnable positional embedding.
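For reference when reading these numbers: Pathfinder is a binary classification task and ListOps has 10 output classes, so uniform guessing gives 50% and 10% respectively. A minimal sketch of the two usual "no learning" baselines follows; the function names and the toy label list are made up for illustration and are not taken from the repository.

```python
from collections import Counter


def uniform_chance_accuracy(num_classes):
    """Expected accuracy when guessing uniformly at random."""
    return 1.0 / num_classes


def majority_class_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Pathfinder: 2 classes, ListOps: 10 classes.
pathfinder_chance = uniform_chance_accuracy(2)
listops_chance = uniform_chance_accuracy(10)

# Toy labels (hypothetical), only to show how the majority baseline is computed.
toy_labels = [0, 0, 1, 2]
toy_majority = majority_class_accuracy(toy_labels)
```

If the label distribution of a split is skewed, the majority-class baseline can sit above uniform chance, so it is worth computing on the actual evaluation labels before concluding that a model has learned something.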

Thank you very much for your help!

lucaslingle · 2024-01-07T04:49:48Z

For ListOps, I think the checkpointing is broken somehow. The average performance across runs appears to match that of a randomly initialized model.

However, you can get the correct result for the trained model by evaluating on the test set at the end of training, rather than saving the model to a checkpoint and reloading it.
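One way to narrow down a problem like this is a round-trip check: save the in-memory parameters, reload them, and compare the two. The sketch below only illustrates the idea with plain Python and `pickle`; it is not the repository's checkpointing code, and `save_checkpoint`, `load_checkpoint`, `max_abs_diff`, and `roundtrip_ok` are hypothetical helpers operating on nested dicts of float lists.

```python
import os
import pickle
import tempfile


def save_checkpoint(params, path):
    """Write the parameter tree to disk (stand-in for a real checkpointer)."""
    with open(path, "wb") as f:
        pickle.dump(params, f)


def load_checkpoint(path):
    """Read a parameter tree back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


def max_abs_diff(a, b):
    """Largest absolute difference between two nested dicts of float lists."""
    if isinstance(a, dict):
        assert a.keys() == b.keys(), "parameter trees have different keys"
        return max((max_abs_diff(a[k], b[k]) for k in a), default=0.0)
    assert len(a) == len(b), "parameter leaves have different lengths"
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def roundtrip_ok(params, tol=0.0):
    """True if params survive a save/reload cycle within the tolerance."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "ckpt.pkl")
        save_checkpoint(params, path)
        restored = load_checkpoint(path)
    return max_abs_diff(params, restored) <= tol


# Toy "trained" and "freshly initialized" parameters (made-up values).
trained = {"dense": {"w": [0.1, -0.2], "b": [0.0]}}
fresh = {"dense": {"w": [0.0, 0.0], "b": [0.0]}}

params_survive = roundtrip_ok(trained)
differs_from_init = max_abs_diff(trained, fresh) > 0.0
```

If the restored parameters match a fresh initialization rather than the trained ones, the restore step is probably writing into or returning the wrong state. If they match the trained parameters, the discrepancy is more likely in how the evaluation script builds or feeds the model.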

lucaslingle · 2024-01-07T08:50:08Z

@LeoXinhaoLee

By the way, did you change any other settings for Pathfinder32? I tried your suggestion but I am still getting only 50% accuracy for a vanilla Transformer, even with learnable position embeddings.

Thanks for any insights you can provide!
