Hi, I am encountering an issue similar to #421: I attempted to fine-tune a model using the AlphaFold-Multimer pretrained parameters (`resume_from_jax_params: params_model_1_multimer_v3.npz`). However, the loss is unusually high, as if the model were training from a fresh initialization rather than from the pretrained parameters.
Here are the arguments I used:
```shell
python ~~~/train_openfold.py \
    ~~~/train/data_dir \
    ~~~/train/alignment_dir \
    ~~~/data/template_mmcif_dir/ \
    ~~~/save/ \
    2021-09-30 \
    --train_mmcif_data_cache_path ~~~/train_mmcif_test.json \
    --precision bf16 \
    --val_data_dir ~~~/valid/data_dir \
    --val_alignment_dir ~~~/valid/alignment_dir \
    --val_mmcif_data_cache_path ~~~/valid_mmcif_test.json \
    --kalign_binary_path ~~~/bin/kalign \
    --obsolete_pdbs_file_path ~~~/pdb_mmcif/obsolete.dat \
    --template_release_dates_cache_path ~~~/template_mmcif_cache.json \
    --seed 622 \
    --replace_sampler_ddp True \
    --checkpoint_every_epoch \
    --resume_from_jax_params ~~~/openfold/resources/params/params_model_1_multimer_v3.npz \
    --log_performance True \
    --script_modules False \
    --train_epoch_len 200 \
    --log_lr \
    --config_preset "model_1_multimer_v3" \
    --gpus 1 \
    --num_processes 16 \
    --strategy ddp \
    --num_nodes=1 \
    --deepspeed_config ~~~/deepspeed_config.json
```
I suspect that in trainer.py, the training might be initializing from scratch due to the following lines:
```python
trainer.fit(
    model_module,
    datamodule=data_module,
    ckpt_path=ckpt_path,  # ckpt_path is None since I use args.resume_from_jax_params
)
```
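Before blaming `ckpt_path=None`, it may help to confirm whether the pretrained weights ever reached the model. A minimal, self-contained sanity check of that idea (the `torch.nn.Linear` module and the `pretrained` dict below are placeholders, not OpenFold's real modules or parameter names):

```python
# Hypothetical sanity check: verify that loading pretrained parameters
# actually changed the model's weights before trainer.fit() runs.
import numpy as np
import torch

model = torch.nn.Linear(3, 3, bias=False)
before = model.weight.detach().clone()  # snapshot of the random init

# Stand-in for loading args.resume_from_jax_params:
pretrained = {"weight": np.full((3, 3), 0.5, dtype=np.float32)}
with torch.no_grad():
    model.weight.copy_(torch.from_numpy(pretrained["weight"]))

# If the load worked, the weights must differ from the random init.
assert not torch.equal(model.weight, before)
print("pretrained weights applied")
```

If the analogous check on the real model shows unchanged weights after startup, the `.npz` was never imported and the high initial loss would be explained.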
Could this be why the model seems to be training from scratch rather than fine-tuning? If so, how can I convert the `.npz` file (`params_model_1_multimer_v3.npz`) to `.ckpt` format? Is there a script available for this conversion? Thank you for your assistance.
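For reference, the mechanical part of such a conversion is small: a JAX `.npz` is a flat mapping of names to arrays, and a Lightning `.ckpt` is a pickled dict with a `"state_dict"` entry. The sketch below shows only that wrapping step; the key names (`"linear/weights"`) are made up, and the hard part, mapping AlphaFold's parameter names onto OpenFold's module names, is not done here (OpenFold's `import_jax_weights_` utility handles that mapping in-memory):

```python
# Sketch: wrap flat JAX .npz arrays into a minimal Lightning-style .ckpt.
# NOTE: this does NOT rename parameters to OpenFold's module names.
import numpy as np
import torch

def npz_to_ckpt(npz_path: str, ckpt_path: str) -> dict:
    """Load flat JAX arrays and save them as a checkpoint-shaped dict."""
    jax_params = np.load(npz_path)
    state_dict = {
        name: torch.from_numpy(np.asarray(jax_params[name]))
        for name in jax_params.files
    }
    # Lightning checkpoints are plain dicts with a "state_dict" entry.
    checkpoint = {"state_dict": state_dict, "epoch": 0, "global_step": 0}
    torch.save(checkpoint, ckpt_path)
    return checkpoint

# Round trip with a dummy file (hypothetical parameter name):
np.savez("/tmp/dummy_params.npz",
         **{"linear/weights": np.ones((2, 2), dtype=np.float32)})
ckpt = npz_to_ckpt("/tmp/dummy_params.npz", "/tmp/dummy.ckpt")
```

Without the name mapping, a checkpoint produced this way would not load into the OpenFold model, so a renaming step in the style of `import_jax_weights_` would still be required.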