New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add support for named conda environments #494
Conversation
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
snakemake/parser.py
Outdated
class CondaEnv(RuleKeywordState): | ||
@property | ||
def keyword(self): | ||
return "conda_env" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be condaenv
but using snakecase to make it clearer
snakemake/utils.py
Outdated
CondaEnvTuple = namedtuple( | ||
"CondaEnvTuple", | ||
"is_named env" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be removed, and instead of using is_named
, we just using isinstance(conda_env, str)
in its stead.
Thank you! I'll try to review this ASAP. Just give me some additional time as there is a deadline tomorrow and I need to release some urgent fixes in Varlociraptor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! This is a very good start. In order to make the code even clearer, I think it might be better to implement an OO hierarchy for the specification of conda envs, i.e.:
class CondaEnvSpec(ABC):
@abstractmethod
def expand_wildcards(self, wildcards):
...
@abstractmethod
def get_conda_env(self):
...
class CondaEnvFile(CondaEnvSpec):
...
class CondaEnvName(CondaEnvSpec):
...
Does that make sense?
Further, may I ask you to format the PR with black? |
@johanneskoester I'll try to disentangle the reliance on an environment file existing, as that's what I was trying to make work. Likely I'll get to this in early September when I have some time off for side-projects :) |
I take it that this still isn't resolved? Just was trying to figure out calling up a conda env for a single rule today! |
@conchoecia I haven't had time to get around to this, and have stuck with boiler plate code for now. I like to store all my conda environments in one place using In the snakefile I use the following shell boilerplate: rule:
params:
conda_env = "some_prebuilt_and_named_env"
shell:
"""
set +eu
&& PS1=dummy
&& . $(conda info --base)/etc/profile.d/conda.sh
&& conda activate {params.conda_env};
set -eu
...
""" |
a701c2e
to
31b8b2e
Compare
@johanneskoester I really tried to reason about how I think the latest set of changes are good compromise for a minimal set of changes to get it working. Thank-you for your consideration! |
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information |
@johanneskoester tests pass, this is ready to go! |
Just adding a +1 for this feature - is there a timeline for implementation? |
I'm also interested in this. I also don't need snakemake to manage my conda environments for me. Given that the code has already been helpfully written, can we get traction on this? |
Any updates? I'm looking forward to using Snakemake with complex (but standardized!) environments like this one: |
Chiming in to say this would be very useful in a couple of my more esoteric workflows with external non-conda deps. |
This would be useful enhancement of snakemake functionality! Any updates on the timeline for this feature? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for pointing me again to this. Sorry for the delay. This slipped my attention, because, to be honest, I personally rather favor to have the yaml in the workflow definition, because I think this is much more transparent and adaptable (at least for the average non-expert user). Nevertheless, Snakemake best practices are not a religion and if we can make a change that fits your habits without totally rewriting the way Snakemake works, I am fine with giving it a try.
Nevertheless, I don't want to merge this in the current form, because of the hack using something considered to be a file path as a name instead. I really think the OO solution is worth giving a try. But I know, this is quite some refactoring work, and I can understand that you probably also don't have the time for this. Given the fact that I forgot this PR for such a long time and you patiently waited for it, I will have a look into this today. Maybe I can come up with a solution quickly.
Thank you @johanneskoester for giving it a shot. I commend you for making reproducibility easy and the first choice when running Snakemake, but also understand the pragmatic choices that other people (including myself) make that require alternatives. Like you said, we shouldn’t become dogmatic about the right way to ensure reproducibility. In some cases, I want Snakemake just to be the excellent workflow engine that it is, and not force me down a specific path to build my environment. I’d argue that it’s great both are included in Snakemake, but should be allowed to be decoupled if wanted. I did try to figure out how to implement the OO approach, but like I said, there’s a lot of coupling in the code and as a newcomer to the code it was quite daunting. I’d love to learn and contribute more (along with our team) as we really love Snakemake and use it in our day to day across a large set of clients. Seeing how you’d approach this feature could help us understand the code, and later we can add more functionality and/or code docs to help our future selves. Thank you for your consideration! |
Thanks for returning to this PR, @johanneskoester I'm glad we waited to revisit this issue. Much has changed, so my use case for named environments is different. My user story: most rules I write call a single program (bwa) or some unix tools (grep + sed). I find it easy to
But sometimes I use Qiime2. Each version has +400 dependencies, and is installed with conda into an environment with a fixed name. Because the conda environment for each release of is consistent, calling Before mamba, installing more copies of qiime2 was very slow. But times have changed! I'll try putting this into an snakemake conda env and see how it goes! |
So, I think I have successfully finalized an object oriented flavor of this feature in PR #1340. Hence, closing in favor of that PR. Thanks for your help on this, for your patience, and for sharing your use cases! And sorry for being slow with the review. |
Use the conda_env directive
Implements: #352