Is CPU acceleration failed? #26

Open
yanchenmochen opened this issue Aug 25, 2022 · 3 comments
Comments

@yanchenmochen

Yesterday I ran some experiments on a server using
./run_alphafold.sh -d /dataset/ -o result -p monomer -m model_2 -i input/T1061.fasta
and the log left me confused. T1061 is 949 AA long.
```
I0822 07:33:00.806264 140553952322112 jackhmmer.py:133] Launching subprocess "/opt/conda/bin/jackhmmer -o /dev/null -A /tmp/tmpxbrk9wt6/output.sE 0.0001 -E 0.0001 --cpu 8 -N 1 input/T1061.fasta /dataset//uniref90/uniref90.fasta"
I0822 07:33:01.157015 140553952322112 utils.py:36] Started Jackhmmer (uniref90.fasta) query
I0822 07:37:27.058227 140553952322112 utils.py:40] Finished Jackhmmer (uniref90.fasta) query in 265.901 seconds
I0822 07:37:27.072012 140553952322112 jackhmmer.py:133] Launching subprocess "/opt/conda/bin/jackhmmer -o /dev/null -A /tmp/tmpnn6am537/output.sE 0.0001 -E 0.0001 --cpu 8 -N 1 input/T1061.fasta /dataset//mgnify/mgy_clusters_2018_12.fa"
I0822 07:37:27.439405 140553952322112 utils.py:36] Started Jackhmmer (mgy_clusters_2018_12.fa) query
I0822 07:42:42.192071 140553952322112 utils.py:40] Finished Jackhmmer (mgy_clusters_2018_12.fa) query in 314.752 seconds
I0822 07:42:42.364272 140553952322112 hhsearch.py:85] Launching subprocess "/opt/conda/bin/hhsearch -i /tmp/tmpog4q4684/query.a3m -o /tmp/tmpog40/pdb70"
I0822 07:42:42.712445 140553952322112 utils.py:36] Started HHsearch query
I0822 07:44:18.199999 140553952322112 utils.py:40] Finished HHsearch query in 95.487 seconds
I0822 07:44:18.555797 140553952322112 hhblits.py:128] Launching subprocess "/opt/conda/bin/hhblits -i input/T1061.fasta -cpu 4 -oa3m /tmp/tmpz9oq 1000000 -realign_max 100000 -maxfilt 100000 -min_prefilter_hits 1000 -d /dataset//bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_optst30_2018_08"
I0822 07:44:19.050110 140553952322112 utils.py:36] Started HHblits query

I0822 09:01:02.278290 140553952322112 utils.py:40] Finished HHblits query in 4603.228 seconds
```
feature extraction spend time: 5305.185729026794
feature extraction Completed succesfully

I printed the feature extraction time and found that 5305 seconds is almost equal to the sum of the individual database search times. According to the article, though, I expected the feature extraction time to be roughly equal to the HHblits search time alone. Can you explain this discrepancy?
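For reference, the arithmetic behind that observation can be checked with a short script. This is only a sketch: the durations are copied from the "Finished ... query in N seconds" lines of the log above, and the dictionary keys are labels chosen here for illustration, not names used by the tool. The four searches add up to about 5279 seconds, which leaves roughly 26 seconds of the reported total unaccounted for by the logged queries.

```python
# Durations in seconds, copied from the "Finished ... query" log lines.
# The keys are illustrative labels, not identifiers from the pipeline.
search_seconds = {
    "jackhmmer_uniref90": 265.901,
    "jackhmmer_mgnify": 314.752,
    "hhsearch": 95.487,
    "hhblits_bfd": 4603.228,
}

# Value printed as "feature extraction spend time" in the output above.
reported_total = 5305.185729026794

# If the searches run one after another, their durations simply add up.
sequential_sum = sum(search_seconds.values())

# Whatever is left over is time spent outside the four logged queries.
overhead = reported_total - sequential_sum

# Fraction of the whole feature extraction spent in HHblits alone.
hhblits_share = search_seconds["hhblits_bfd"] / reported_total

print(f"sum of logged searches: {sequential_sum:.3f} s")
print(f"time outside the logged searches: {overhead:.3f} s")
print(f"HHblits share of reported total: {hhblits_share:.1%}")
```

The timestamps in the log tell the same story: each search starts only after the previous one has finished, so the run shown here is sequential.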

@Zuricho
Owner

Zuricho commented Aug 25, 2022

5305 should be the sum of all MSA search times (it actually includes the PDB search too, but that is fast). In the article, we state that HHblits is the most time-consuming step; that does not mean the time for the featurization step should equal the time for HHblits.

@Zuricho
Owner

Zuricho commented Aug 25, 2022

If you mean the CPU acceleration, we actually have not added that to this repo yet. I will upload it soon.

@yanchenmochen
Author

> If you mean the CPU acceleration, we actually have not added that to this repo yet. I will upload it soon.

Oops, I mean the step described as going from Sequential Tasks to Parallel Tasks in ParaFold: running the three independent sequential MSA searches in parallel. That way, we can reduce the time needed to complete the MSA search. @Zuricho
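To make the idea concrete, here is a minimal sketch of launching three independent commands at the same time and timing each one. It is not ParaFold's implementation, which per the reply above had not been uploaded at the time. The helpers `run_timed`, `run_in_parallel`, and `placeholder` are hypothetical names, and the commands are short sleeps standing in for the two Jackhmmer runs and the HHblits run; the real command lines in the log are truncated, so they are not reproduced here.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_timed(name, command):
    """Run one command to completion and return (name, elapsed seconds)."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return name, time.perf_counter() - start


def run_in_parallel(commands):
    """Start all commands at once; each thread just waits on its child process."""
    with ThreadPoolExecutor(max_workers=len(commands)) as pool:
        futures = [pool.submit(run_timed, n, c) for n, c in commands.items()]
        return dict(f.result() for f in futures)


def placeholder(seconds):
    """Hypothetical stand-in for a real search: a process that only sleeps."""
    return [sys.executable, "-c", f"import time; time.sleep({seconds})"]


# Assumption: these three searches do not depend on each other's output.
commands = {
    "jackhmmer_uniref90": placeholder(0.2),
    "jackhmmer_mgnify": placeholder(0.3),
    "hhblits_bfd": placeholder(0.5),
}

wall_start = time.perf_counter()
elapsed = run_in_parallel(commands)
wall = time.perf_counter() - wall_start

for name, seconds in elapsed.items():
    print(f"{name}: {seconds:.2f} s")
print(f"wall clock: {wall:.2f} s, sum of runs: {sum(elapsed.values()):.2f} s")
```

With real searches the wall-clock time would be bounded below by the slowest one, which in the log above is HHblits at 4603.228 seconds, so the possible saving is limited to the time of the shorter searches. The log also shows `--cpu 8` for each Jackhmmer run and `-cpu 4` for HHblits, so running all three at once presumably needs enough cores and disk bandwidth to avoid slowing each search down.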
