Computation of N_eff when the align stage (but not the couplings stage) is run #251

Open
berkalpay opened this issue Oct 7, 2020 · 3 comments

@berkalpay

if "align" in config["stages"] and "couplings" not in config["stages"]:
    config["align"]["compute_num_effective_seqs"] = True

This may be unexpected behavior since the corresponding config file parameter is overridden. Additionally, computing N_eff in the align stage may significantly lengthen computation when the couplings stage is expected to be run at a later point.
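To make the behavior concrete, here is a minimal sketch that applies the same condition to a plain dictionary. The function name `apply_neff_override` and the sample config are hypothetical and only for illustration; just the two-line condition is taken from the snippet above.

```python
import copy


def apply_neff_override(config):
    """Return a copy of config with the align-only override applied.

    Hypothetical helper; it wraps the condition quoted in this issue.
    """
    config = copy.deepcopy(config)
    if "align" in config["stages"] and "couplings" not in config["stages"]:
        config["align"]["compute_num_effective_seqs"] = True
    return config


# Illustrative config only, not a complete EVcouplings configuration
user_config = {
    "stages": ["align"],
    "align": {"compute_num_effective_seqs": False},
}

effective = apply_neff_override(user_config)

# The user asked for False, but the align-only run ends up with True
print(effective["align"]["compute_num_effective_seqs"])
```

With both `align` and `couplings` listed in `stages`, the same function leaves the user's setting untouched.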

@thomashopf
Contributor

thomashopf commented Oct 16, 2020

Hi @berkalpay, I agree this may seem unexpected (and I am normally not happy with having explicit overrides in the code like that). At the same time, the reason this was put in is that it makes the common use case, where you first explore the number of available sequences without calculating couplings, a lot more convenient. Before the override was added, the unexpected behaviour was that N_eff was not available after running align only, which is a major inconvenience.

Do you have a use case where you just run the alignment, but do not care about N_eff?

@berkalpay
Author

Hi @thomashopf. Yes, my code runs the stages of EVcouplings separately; it is useful for my pipeline for the stages to be modular (e.g. for parallelization). So, in my use case the couplings stage is often run immediately after the align stage, and I would like to avoid calculating N_eff in the align stage - this can add many hours to what should be a quick step. Unfortunately, there seems to be no way to get around the override, so I have forked the repository in the meantime and removed the lines of code above.

thomashopf self-assigned this Oct 26, 2020
@thomashopf
Contributor

@berkalpay I understand... this really bends the way the command-line application is meant to be used. The intended way to combine the individual stages/components of the pipeline differently (a scenario I very much anticipated, e.g. for different pipeline runners) is to call the underlying Python functions directly, at whatever level of abstraction is needed.

But anyways, to accommodate your off-label use case, we could add an optional command-line flag that turns off the override. I've flagged this as an enhancement for now.
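One way such an opt-out could look is sketched below. The flag name `--no-neff-override`, the `build_parser` helper, and the `apply_neff_override` function are all assumptions made for illustration; none of them are taken from EVcouplings, and the actual command-line application may be structured quite differently.

```python
import argparse
import copy


def apply_neff_override(config, enabled=True):
    """Apply the align-only N_eff override unless it has been disabled.

    Hypothetical helper; `enabled` stands in for the proposed opt-out.
    """
    config = copy.deepcopy(config)
    align_only = (
        "align" in config["stages"] and "couplings" not in config["stages"]
    )
    if enabled and align_only:
        config["align"]["compute_num_effective_seqs"] = True
    return config


def build_parser():
    """Build a parser with a hypothetical flag that disables the override."""
    parser = argparse.ArgumentParser(description="Sketch of an opt-out flag")
    parser.add_argument(
        "--no-neff-override",
        action="store_true",
        help="keep compute_num_effective_seqs as set in the config file",
    )
    return parser


# Simulate a command line that passes the hypothetical flag
args = build_parser().parse_args(["--no-neff-override"])

# Illustrative config only, not a complete EVcouplings configuration
config = {
    "stages": ["align"],
    "align": {"compute_num_effective_seqs": False},
}

result = apply_neff_override(config, enabled=not args.no_neff_override)

# The user's setting is preserved when the flag is given
print(result["align"]["compute_num_effective_seqs"])
```

Keeping the override on by default would preserve the convenient align-only behavior described earlier in this thread, while pipelines that run the stages separately could opt out explicitly.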
