How can I pretrain llama3 with a 128k context length on 8*A100? #683
Comments
My config was pasted below; the bodies of most `dict(...)` fields were cut off when posting, so only the top-level structure is shown:

```python
import torch
from mmengine.config import read_base            # assumed import; read_base() is used below
from xtuner.dataset import process_hf_dataset
from xtuner.engine.hooks import DatasetInfoHook  # assumed import for the hook used below
from xtuner.engine.runner import TrainLoop       # assumed import for the loop used below

with read_base():
    ...  # base-config imports truncated in the original post

pretrained_model_name_or_path = '/opt/218/models/Meta-Llama-3-8B-Instruct-continue_pre'
data_path = './train_128k_1000.jsonl'
batch_size = 1  # per_device
save_steps = 100
# Evaluate the generation performance during the training
evaluation_freq = 1000

tokenizer = dict(...)         # contents truncated in the original post
model = dict(...)             # contents truncated in the original post
train_dataset = dict(...)     # contents truncated in the original post
train_dataloader = dict(...)  # contents truncated in the original post

train_cfg = dict(type=TrainLoop, max_epochs=max_epochs)
custom_hooks = [dict(type=DatasetInfoHook, tokenizer=tokenizer)]
default_hooks = dict(...)     # contents truncated in the original post
env_cfg = dict(...)           # contents truncated in the original post

visualizer = None
log_level = 'INFO'
load_from = None
resume = False
randomness = dict(seed=None, deterministic=False)
log_processor = dict(by_epoch=False)
```
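The truncated config above does not show any long-context settings. For reference, here is a minimal sketch of the knobs that usually decide whether a 128k-token run fits in memory on 8×A100; the variable names follow the stock xtuner configs, but the concrete values are assumptions for illustration, not the poster's actual settings:

```python
# Sketch only: typical long-context knobs from stock xtuner configs.
# Values below are assumptions for illustration.
max_length = 131072            # 128k tokens per packed sample
pack_to_max_length = True      # pack shorter samples into full-length sequences
use_varlen_attn = True         # variable-length flash attention to avoid padding cost
sequence_parallel_size = 8     # shard each 128k sequence across all 8 GPUs
accumulative_counts = 1        # gradient accumulation steps
batch_size = 1                 # per device, as in the posted config
```

With sequence parallelism each GPU only holds `max_length / sequence_parallel_size` tokens of activations; the run is then typically launched with a sharded optimizer, e.g. one of the DeepSpeed presets exposed by the xtuner CLI (`--deepspeed deepspeed_zero3` or similar), rather than plain data parallelism.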
According to the chart in the README this setup should be trainable, but I keep hitting OOM.
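For context on why OOM is plausible without sequence parallelism and optimizer sharding, a rough back-of-envelope estimate (approximations, not measurements, assuming Llama-3-8B in bf16 with Adam):

```python
# Rough memory estimate for full fine-tuning of an ~8B-parameter model.
params = 8.03e9                 # approximate parameter count of Llama-3-8B
gib = 1024 ** 3
weights_bf16 = params * 2 / gib            # ~15 GiB
grads_bf16   = params * 2 / gib            # ~15 GiB
adam_fp32    = params * (4 + 4 + 4) / gib  # fp32 master weights + 2 moments, ~90 GiB
model_states = weights_bf16 + grads_bf16 + adam_fp32
print(f"model states alone: ~{model_states:.0f} GiB")   # ~120 GiB before any activations
# This only fits across 8 x 80 GB A100s with ZeRO-style sharding, and the activations
# of a full 131072-token sequence held on one GPU add substantially more, which is why
# long-context recipes also shard the sequence itself (sequence parallelism).
```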