--update ignored for packages not in existing lockfile #386

Open
timsnyder-siv opened this issue Mar 6, 2023 · 2 comments

@timsnyder-siv

Observed Behavior

Adding a package to an input specs file and then asking conda-lock to --update that package causes the entire environment to be re-solved; the --update request is silently ignored.

Expected Behavior

I would expect one of two reasonable outcomes:

  1. conda-lock errors out and says that it cannot update the lockfile for something not already in the lockfile
  2. conda-lock does the best it can to solve the environment in a way that is consistent with the requirements, and fails if the solver can't find a solution.

Acknowledgement of Difficulty

If you try to ask conda (22.11.1) to update a package that isn't already installed, you get a message like:

-bash-4.2$  conda update fsspec

PackageNotInstalledError: Package is not installed in prefix.
  prefix: /opt/conda
  package name: fsspec

A typical user will then likely just run conda install fsspec without realizing that it might be impossible to give the solver a set of requirements that reproduces this exact set of packages, because any version requirements used when previously creating or installing are not maintained across conda invocations (unless they are actually 'pinned').

So, I'm definitely not asking you to make it possible for conda-lock to give me an inconsistent environment. There will be plenty of cases where incrementally adding a new package just can't be solved. However, please fail loudly when that happens.

Concrete Example

In my case, I want to add fsspec to https://github.com/firesim/firesim/blob/a81e3725c11406ea740c3f1a893d344e049f458f/conda-reqs.conda-lock.yml. To do that, I:

  1. add fsspec to https://github.com/firesim/firesim/blob/a81e3725c11406ea740c3f1a893d344e049f458f/conda-reqs.yaml
  2. conda-lock -f conda-reqs.yaml -p linux-64 --lockfile conda-reqs.conda-lock.yml --update fsspec

NOTE: I don't necessarily install the package into the conda environment, because conda-lock doesn't use the contents of the currently active environment in its calculation of the lockfile, does it?

However, I find that unless the package you're trying to add is already in the lockfile, the intersection() at

to_update = set(update).intersection(conda_locked)

filters the added package out of to_update, so conda-lock effectively re-solves the entire environment instead of using the update code path.
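
A minimal sketch of that filtering follows. The lockfile contents are made-up stand-ins (the package names and versions are illustrative assumptions, not taken from the firesim lockfile); only the intersection() expression comes from the line quoted above.

```python
# Hypothetical stand-ins: packages already in the lockfile and the --update request.
conda_locked = {"python": "3.10.9", "numpy": "1.24.2"}
update = ["fsspec"]

# Same expression as quoted above; iterating a dict yields its keys.
to_update = set(update).intersection(conda_locked)

# The requested names that the intersection silently discards.
dropped = set(update).difference(conda_locked)

print(to_update)  # empty: nothing is left to update
print(dropped)    # contains "fsspec"
```

With to_update empty, nothing in the request distinguishes this run from one without --update at all.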

Also, I think it is mildly confusing to suggest step 1 is installing the package into the currently active environment because it implies that having the packages installed somehow influences how conda-lock creates the lockfile, which I don't think is correct. If you are using a lockfile to reproducibly create an environment, it would probably be better to create the updated lockfile and then use it to create the environment. I haven't tried installing on top of an existing environment to see what happens but I'd expect it to be the same as removing the environment completely and reinstalling everything in the lockfile (but perhaps with optimizations for leaving unchanged packages alone).

Originally posted by @timsnyder-siv in #355 (comment)

@srilman
Contributor

srilman commented Mar 6, 2023

I believe the --update flag is only officially supported for one very specific use case. Assume the following:

  • You have some number of input source files (environment.yaml, pyproject.toml, ...) that you have not modified
  • You have a preexisting lockfile built from the source files
  • There is a dependency already defined in your source files that you would like to update to the latest version, while satisfying the constraint defined in the source file
  • You want to minimize the number of changes made to your lockfile

Only in this situation is it recommended to use the --update flag. There is an open feature request to support other forms of updates (#370), but we're currently in the discussion phase. If you have any ideas, please let us know!

Apologies for the confusion! I noticed that the documentation around the --update flag is a bit sparse and a little misleading; we can definitely make it clearer. In addition, I think we can add some checks during runtime to ensure that the source files have not been changed. Maybe we could use the source hash in the lockfile? I know there is a --check-input-hash argument that does something similar, but that's opt-in.
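
As a rough illustration of the "have the source files changed?" idea, here is a minimal sketch. It hashes the raw bytes of the source files purely for demonstration and is not meant to mirror how conda-lock computes the hash it stores in lockfiles; sources_digest and the file contents are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Iterable


def sources_digest(paths: Iterable[Path]) -> str:
    """Hypothetical helper: one SHA-256 digest over all source files."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        # Include the file name so renames also change the digest.
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "environment.yaml"
    src.write_text("dependencies:\n  - numpy\n")
    recorded = sources_digest([src])  # what would be stored at lock time

    unchanged_before = sources_digest([src]) == recorded

    # Simulate the user adding a package to the source file.
    src.write_text("dependencies:\n  - numpy\n  - fsspec\n")
    unchanged_after = sources_digest([src]) == recorded
```

A check like this could run before the update path is taken, so that a modified source file leads to an explicit message rather than a silent full re-solve.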

@timsnyder-siv
Author

timsnyder-siv commented Mar 6, 2023

Thanks for the reply, @srilman! You could add a simple check at the code I pointed out in the original post:

to_update = set(update).intersection(conda_locked)

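If my use case isn't supported, I would suggest that you do something like:

    from warnings import warn
    # difference() accepts any iterable (e.g. a dict of locked packages), unlike the `-` operator
    for name in set(update).difference(conda_locked):
        warn(f"--update only works for previously locked packages. Ignoring {name}")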

I'd almost rather that you raise RuntimeError() with the message instead of ignoring the input.
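
The stricter variant could look roughly like the sketch below. require_locked is a hypothetical helper name, not an existing conda-lock function, and the package names are illustrative.

```python
from typing import Iterable, Set


def require_locked(update: Iterable[str], locked_names: Iterable[str]) -> Set[str]:
    """Return the names to update; raise if any is absent from the lockfile."""
    requested = set(update)
    missing = requested.difference(locked_names)
    if missing:
        raise RuntimeError(
            "--update only works for previously locked packages: "
            + ", ".join(sorted(missing))
        )
    return requested


# Illustrative call: fsspec is not among the locked names, so this raises.
try:
    require_locked(["fsspec"], {"python", "numpy"})
except RuntimeError as err:
    message = str(err)
```

Raising here makes the unsupported request impossible to miss, at the cost of aborting runs that previously completed with a full re-solve.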

Thanks!
