BART training time #1525

Closed
sunilitggu opened this issue Dec 19, 2019 · 1 comment
@sunilitggu
May I know how much time BART pre-training took, and on which GPU configuration? I can see the paper says 500K steps with a batch size of 8k, but I would like to know how long that took. Many thanks.

@ngoyal2707
Contributor

ngoyal2707 commented Dec 19, 2019

The time depends on the type and number of GPUs. We trained for around 11-12 days on 256 GPUs.
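A rough back-of-envelope check of that figure (assuming exactly 500K steps and ~11.5 days of wall-clock time, the midpoint of the range above; these are assumptions, not reported numbers):

```python
# Back-of-envelope throughput estimate. The 11.5-day figure is an
# assumption (midpoint of the reported 11-12 days), not an exact number.
steps = 500_000
days = 11.5

steps_per_day = steps / days                 # average optimizer steps per day
seconds_per_step = days * 24 * 3600 / steps  # average wall-clock seconds per step

print(round(steps_per_day))        # ~43478 steps/day
print(round(seconds_per_step, 2))  # ~1.99 s per step across all 256 GPUs
```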

facebook-github-bot pushed a commit that referenced this issue Dec 28, 2020
Summary:
Before:
```
2020-12-23 11:46:16 | INFO | fairseq_cli.eval_lm | num. model params: 353781760
2020-12-23 11:46:21 | INFO | fairseq.data.data_utils | loaded 89663978 examples from: /private/home/sshleifer/data-bin/new_hybrid_data/train
```
After:
```
2020-12-23 11:46:16 | INFO | fairseq_cli.eval_lm | num. model params: 353,781,760
2020-12-23 11:46:21 | INFO | fairseq.data.data_utils | loaded 89,663,978 examples from: /private/home/sshleifer/data-bin/new_hybrid_data/train
```
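The change itself is just thousands separators in the log output. In Python that is the `","` option of the format-spec mini-language, which is most likely what the patch applies (a sketch, not the actual fairseq logging code):

```python
# Sketch of the formatting change: Python's format spec inserts thousands
# separators with the "," option. This mirrors the log lines above but is
# not the actual fairseq implementation.
def fmt(n: int) -> str:
    return f"{n:,}"

print(fmt(353781760))  # 353,781,760
print(fmt(89663978))   # 89,663,978
```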

Pull Request resolved: fairinternal/fairseq-py#1525

Test Plan:
Run `fairseq-eval-lm` or `fairseq-train` and inspect the logs.
For example,
```
export dd2=/private/home/sshleifer/data-bin/new_hybrid_data
export m=/private/home/myleott/models/public_models/LM/roberta_lm.me_fp16.bm_none.tps1024.transformer_lm_gpt2_small.share.adam.b2_0.98.eps1e-08.cl0.0.lr0.003.wu3000.dr0.1.atdr0.1.wd0.01.ms2.uf4.mu100000.s1.ngpu64/model.pt
fairseq-eval-lm $dd2 \
    --path $m \
    --sample-break-mode complete --gen-subset train \
    --tokens-per-sample 3072 --max-tokens 3072 --context-window 2560 --softmax-batch 1024 --fp16
```

Reviewed By: myleott

Differential Revision: D25693004

Pulled By: sshleifer

fbshipit-source-id: bfeb93fc6607cca2cb7a6e820f51e174d02d1f62
harkash pushed a commit to harkash/fairseq that referenced this issue Feb 23, 2021
sshleifer added a commit that referenced this issue Apr 7, 2021