
In the MMoE method, do the experts have to be trained, or can a frozen model (e.g., GPT-3, BERT) be used as an expert? #933

Open
Harzva opened this issue Jun 14, 2023 · 1 comment

Harzva commented Jun 14, 2023

Describe the question
Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts

In the MMoE method, do the experts have to be trained, or can a frozen pretrained model such as GPT-3 or BERT be used as an expert?

Thank you very much!

@wangzhen38 (Collaborator) commented:

We reproduced this model with PaddlePaddle following the source code of the paper, so it cannot directly use another frozen model as an expert, but it does support warm starting from a model saved in previous epochs.
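
For illustration only (this is not the repository's code), here is a minimal PaddlePaddle sketch of the two ideas above: freezing an expert sub-network by disabling gradients, and warm starting from parameters saved in an earlier run. The `Expert` class, its dimensions, and the checkpoint path are all hypothetical.

```python
# Minimal sketch, assuming a toy expert sub-network; not part of PaddleRec.
import paddle
import paddle.nn as nn
import paddle.nn.functional as F


class Expert(nn.Layer):  # hypothetical expert sub-network
    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.fc = nn.Linear(input_dim, hidden_dim)

    def forward(self, x):
        return F.relu(self.fc(x))


expert = Expert(input_dim=16, hidden_dim=8)

# Warm start: load parameters saved in a previous epoch (path is hypothetical).
# state_dict = paddle.load("output/epoch_3/expert.pdparams")
# expert.set_state_dict(state_dict)

# Freezing: stop gradients so the expert's weights are not updated by the optimizer.
for param in expert.parameters():
    param.stop_gradient = True
```

Note that warm starting (resuming from previously saved weights, then continuing to train) is different from plugging in a frozen external model as an expert, which the current implementation does not support directly.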
