Don't store program and data location in config files #42

Open
duboism opened this issue Feb 12, 2023 · 2 comments

Comments

@duboism

duboism commented Feb 12, 2023

Hello,

IIUC, the phigaro config file is, among other things, used to specify the location of some programs used by the software (hmmsearch and prodigal) and the location of the pVOG database.

I think that the location of the software should not be stored in a config file. Phigaro should rather use the programs found in the environment (with prior checks of course, and maybe logging which binaries are used) and have a CLI option to specify them. This is more flexible and simplifies the installation. If I understand correctly, this is easy to do with the sh package.
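To make the idea concrete, here is a minimal sketch of such a lookup using only the standard library (`shutil.which`) rather than sh. The function name `resolve_program`, the `override` argument (standing for a value that would come from a CLI option) and the logger name are hypothetical and not part of phigaro.

```python
import logging
import shutil

logger = logging.getLogger("phigaro.programs")  # hypothetical logger name


def resolve_program(name, override=None):
    """Return the path of an executable, or raise if it cannot be found.

    `override` stands for a location given on the command line (hypothetical
    option); when it is None, the program is looked up on PATH.
    """
    candidate = override if override is not None else name
    path = shutil.which(candidate)  # None if missing or not executable
    if path is None:
        raise FileNotFoundError(
            f"Cannot find executable {candidate!r}: "
            "add it to PATH or pass its location explicitly"
        )
    # Log which binary is actually used, as suggested above.
    logger.info("Using %s: %s", name, path)
    return path
```

Calling `resolve_program("hmmsearch")` and `resolve_program("prodigal")` once at startup would perform the check before any analysis begins.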

Similarly, I think that the location of the pVOG database should be based on an environment variable with a CLI option.
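As a rough sketch of what I mean, assuming a hypothetical `--pvog-path` option and an environment variable named `PVOG_HOME` (neither exists in phigaro today), the CLI value would take precedence and the environment would be the fallback:

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(prog="phigaro-sketch")  # illustrative only
    parser.add_argument(
        "--pvog-path",  # hypothetical option name
        default=None,
        help="pVOG database location (defaults to the PVOG_HOME variable)",
    )
    return parser


def resolve_pvog_path(cli_value, environ=None):
    """Prefer the CLI value, then fall back to PVOG_HOME from the environment."""
    environ = os.environ if environ is None else environ
    if cli_value:
        return cli_value
    try:
        return environ["PVOG_HOME"]
    except KeyError:
        raise SystemExit(
            "pVOG location not set: pass --pvog-path or export PVOG_HOME"
        ) from None
```

Passing `environ` explicitly is only there to make the function easy to test; in normal use it reads `os.environ`.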

Of course the other parameters could stay in the config file.

Any thoughts?

@PollyTikhonova
Collaborator

Hello @duboism, I regularly bump into problems with the configuration of bash or zsh environments when I run qsub, or when I automate things through Python subprocess / os.system. The root of these problems, I believe, lies in the fact that those commands use a shell (bash/zsh...) different from the one I use when I launch commands manually. I was never able to resolve them (even when I gave the subprocess library the direct path to bash), so I end up writing full paths to the programs. Considering all of this, I think that writing one additional config path is easier than exporting two more paths for prodigal and hmmer.
I would be grateful if you shared your thoughts on these problems. Maybe if I understood more about how variables behave across bash/zsh/sh/etc., I would see moving these variables out of the config as a better alternative.

@duboism
Author

duboism commented Feb 12, 2023

Hi,

Thanks for your quick answer.

I'm not sure I understand your problem with qsub. Do you mean that phigaro can't find hmmsearch/prodigal on the execution nodes?

The main problem lies in the fact that the location of the programs or the data can vary from one machine to the other (for instance between my laptop and the cluster) or can change over time. It's generally easier and more flexible to manipulate the environment than to manipulate config files.

For instance, imagine that I want to compare 2 versions of the pVOG database. With environment variables, I just have to modify the variable that points to the location of the data (say PVOG_HOME) in the current shell or the qsub batch file. If the location is written in a config file, I will have to have 2 different files, pay attention that other parameters are the same in all files, pass the correct one to phigaro, etc.
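Here is a small sketch of that workflow driven from Python. The child process is only a stand-in for a phigaro run (it prints the `PVOG_HOME` it sees), and the two paths are hypothetical. The point is that the variable is overridden per invocation while everything else stays identical.

```python
import os
import subprocess
import sys

# Stand-in for a phigaro run: a child process that reports the PVOG_HOME it sees.
CHILD = [sys.executable, "-c", "import os; print(os.environ['PVOG_HOME'])"]


def run_with_pvog(pvog_home):
    """Run the stand-in command with PVOG_HOME overridden for this call only."""
    env = {**os.environ, "PVOG_HOME": pvog_home}
    result = subprocess.run(
        CHILD, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    for location in ("/data/pvog_v1", "/data/pvog_v2"):  # hypothetical paths
        print(run_with_pvog(location))
```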

Similarly, if I install a new version of prodigal, I would need to update all the various configuration files in order to use it, whereas when using the environment I just have to update the PATH variable.

On the technical side:

  • For the programs, phigaro has nothing to do: if, say, hmmsearch is in a directory listed in the PATH variable (whether you added it yourself or with something like Environment Modules), the sh library will find and execute it (there is no need to pass the whole path of the binary).
  • For the location of the pVOG database, you simply have to pass the value of os.environ["PVOG_HOME"].
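The PATH lookup in the first point is not specific to sh. As an illustration, the sketch below does the same with the standard library's `subprocess` on a POSIX system. It uses the running Python interpreter as a stand-in for hmmsearch or prodigal, and `run_by_name` is a hypothetical helper.

```python
import os
import subprocess
import sys

# Use the running interpreter as a stand-in for hmmsearch or prodigal.
tool_dir, tool_name = os.path.split(sys.executable)


def run_by_name(name, search_dir):
    """Run `name` without its directory; PATH supplies the location (POSIX)."""
    env = {
        **os.environ,
        "PATH": search_dir + os.pathsep + os.environ.get("PATH", ""),
    }
    # Only the bare program name is passed here, never the full path.
    result = subprocess.run(
        [name, "-c", "print('found')"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With this, `run_by_name(tool_name, tool_dir)` executes the program purely through the PATH entry, which is the behaviour a cluster job gets once its batch file sets PATH.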
