multiprocess training and training details #5

Open
zhyunlong opened this issue Sep 9, 2021 · 10 comments

Comments

@zhyunlong

How can I use multi-process training? Can I use stable-baselines' multiprocessing interface?
I would also like to know the training details needed to reproduce the sequence tagging results. I ran the script and did not get the same good performance.
Thanks a lot.

@rajcscw
Owner

rajcscw commented Sep 9, 2021

Hey, for multi-process training, refer to the stable-baselines interface, since nlp-gym does not provide implementations of RL algorithms. For the hyperparameter settings for DQN and PPO, please refer to our paper: https://arxiv.org/pdf/2011.08272.pdf
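
For readers following the pointer to stable-baselines, here is a minimal sketch of how vectorized environments could be set up for multi-process training. It assumes the TensorFlow-based stable-baselines package and its `SubprocVecEnv` class; `make_env_fns`, `build_vec_env`, and the `make_env` factory are hypothetical names introduced for illustration and are not part of nlp-gym.

```python
from typing import Callable, List


def make_env_fns(make_env: Callable[[int], object], n_envs: int) -> List[Callable[[], object]]:
    """Build one zero-argument environment factory per worker process.

    `make_env` is a hypothetical user-supplied function that takes a worker
    rank and returns a fresh environment instance.
    """
    def bind(rank: int) -> Callable[[], object]:
        # Bind `rank` now so every factory keeps its own value.
        return lambda: make_env(rank)

    return [bind(rank) for rank in range(n_envs)]


def build_vec_env(env_fns: List[Callable[[], object]]):
    """Wrap the factories in a SubprocVecEnv (one subprocess per factory).

    Assumption: stable-baselines is installed. The import is done lazily so
    the helper above can be used and tested without it.
    """
    from stable_baselines.common.vec_env import SubprocVecEnv
    return SubprocVecEnv(env_fns)
```

Each worker needs its own environment instance, which is why the sketch passes factories rather than ready-made environments. As far as the stable-baselines documentation states, not every algorithm accepts multiple environments (PPO2 does, DQN does not), so check the algorithm you plan to use.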

@zhyunlong
Author

Thanks for your reply. I see hyperparameters in the training script (train_seq_tagging.py). Does that mean I can just run the script to get the same results? How many steps does the model need to learn?
I really appreciate your reply.

@rajcscw
Owner

rajcscw commented Sep 11, 2021

Hey, you can train the agent for 1e+6 steps in total. You can do this as follows:

for i in range(int(1e+2)):
    model.learn(total_timesteps=int(1e+4), reset_num_timesteps=False)
    eval_model(model, env)

Also, make sure to use the PPO algorithm, which gave the best results.
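
The loop above splits 1e+6 steps into 100 rounds of 1e+4 steps with an evaluation after each round. A minimal sketch of the same idea as a reusable helper follows; `train_in_rounds` and the `evaluate` callback are hypothetical names, and the only assumption about `model` is that it exposes `learn(total_timesteps=..., reset_num_timesteps=...)` as in the snippet above.

```python
from typing import Callable, List


def train_in_rounds(
    model,
    evaluate: Callable[[object], float],
    n_rounds: int = 100,
    steps_per_round: int = 10_000,
) -> List[float]:
    """Train in fixed-size rounds and record one evaluation score per round.

    With the defaults this amounts to 100 * 10_000 = 1e6 training steps.
    """
    scores = []
    for _ in range(n_rounds):
        # reset_num_timesteps=False keeps the step counter running across
        # rounds instead of restarting it on every call to learn().
        model.learn(total_timesteps=steps_per_round, reset_num_timesteps=False)
        scores.append(evaluate(model))
    return scores
```

Keeping the per-round scores makes it easier to see whether performance is still improving or has plateaued before the full step budget is spent.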

@zhyunlong
Author

Thanks a lot!

@zhyunlong
Author

Another question: in nlp-gym/nlp_gym/data_pools/custom_seq_tagging_pools.py, the class 'CONLLNerTaggingPool' has no attribute '_get_dataset_from_corpus'. Did you forget to include this code?

@xkianteb

xkianteb commented Oct 1, 2021

Below is the method that is missing:

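```python
# Needs `from flair import datasets` at module level.
@staticmethod
def _get_dataset_from_corpus(corpus, split):
    # Note: the `corpus` argument is overwritten; CoNLL-03 is always loaded.
    corpus = datasets.CONLL_03()
    if split == 'train':
        return corpus.train
    elif split == 'val':
        return corpus.dev
    elif split == 'test':
        return corpus.test
    else:
        # Fail loudly instead of silently returning None.
        raise ValueError(f"Unknown split: {split!r}")
```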
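
A possible variant, shown only as a sketch: the same split lookup written as a table, using the corpus object that is passed in rather than reloading it. The function name `get_split` and the `SPLIT_TO_ATTR` table are hypothetical; the only assumption about `corpus` is that it exposes `train`, `dev`, and `test` attributes, as in the snippet above.

```python
# Maps the split names used in the snippet above to corpus attribute names.
SPLIT_TO_ATTR = {"train": "train", "val": "dev", "test": "test"}


def get_split(corpus, split: str):
    """Return the requested split from an already-loaded corpus object."""
    try:
        attr = SPLIT_TO_ATTR[split]
    except KeyError:
        valid = ", ".join(sorted(SPLIT_TO_ATTR))
        raise ValueError(f"Unknown split {split!r}; expected one of: {valid}") from None
    return getattr(corpus, attr)
```

Passing the corpus in keeps the loading step (which may involve reading the dataset from disk) in one place, so the helper can be called once per split without repeating it.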

@rajcscw
Owner

rajcscw commented Oct 1, 2021

Yes @zhyunlong, you are right, I missed that function during refactoring. @xkianteb Thanks for the snippet, that is the missing implementation 👍

@xkianteb

xkianteb commented Oct 1, 2021

@rajcscw Would you be open to me being a contributor to the repo? I would like to add a few more tasks.

@rajcscw
Owner

rajcscw commented Oct 1, 2021

Sure @xkianteb, sounds like a good idea. What tasks do you have in mind? If you are on discord/twitter, feel free to reach me at rajkumar_rrk and we can have a quick chat.
