[Core feature] UX improvement: support local pip-installable packages in ImageSpec and pyflyte run --remote #5343

Open
2 tasks done
cosmicBboy opened this issue May 9, 2024 · 3 comments
Labels
enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers

Comments

@cosmicBboy
Contributor

Motivation: Why do you think this is important?

Say I have the following flytekit project folder structure:

/my_project
    /src
        __init__.py
        module_a.py
        ... # etc
    tasks.py
    workflows.py
    setup.py

Where src is a pure python internal library that I can independently run/test. Then I want to import modules from src into my flytekit tasks.py and workflows.py files.

When I build my ImageSpec, I currently have to roll my own solution to install src as a pip package (maybe via github).

Then, when I want to quickly iterate on it with pyflyte run --remote, I use the --copy-all flag to make sure src is in the PYTHONPATH of the container running my tasks.

Goal: What should the final outcome look like, ideally?

flytekit UX improvement: I would love to be able to support local pip-installable packages in ImageSpec, something like:

ImageSpec(
    packages=[...],  # python packages
    local_packages=["src"],  # proposed argument, does not exist yet
)

Here src is a local package that I want to keep separate from my flytekit tasks and workflows. When developing locally, I can pip install -e src, and tasks.py and workflows.py can import from it.

ImageSpec build time

At image build time, this should understand that I want to copy the src package into the image and bake it into the image itself.
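As a rough sketch of what "bake it into the image" could mean, assuming a Dockerfile-based builder and assuming each local_packages entry is a directory that pip can install (it has its own setup.py or pyproject.toml). The function name and the /root destination are hypothetical; this is not how ImageSpec builds images today.

```python
from pathlib import PurePosixPath
from typing import List


def render_local_package_steps(local_packages: List[str], image_root: str = "/root") -> List[str]:
    """Return Dockerfile lines that copy each local package into the image and install it.

    Hypothetical helper: assumes every entry is a pip-installable directory
    located inside the Docker build context.
    """
    steps: List[str] = []
    for pkg in local_packages:
        # Place the package under image_root, keeping only its folder name.
        dest = PurePosixPath(image_root) / PurePosixPath(pkg).name
        steps.append(f"COPY {pkg} {dest}")
        # A regular (non-editable) install, so the package is part of the image.
        steps.append(f"RUN pip install --no-cache-dir {dest}")
    return steps


if __name__ == "__main__":
    for line in render_local_package_steps(["src"]):
        print(line)
```

The install is non-editable on purpose: the image should carry a fixed copy of src, while iteration-time changes would be handled separately.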

Iteration time

local_packages in ImageSpec implies that when I pyflyte run --remote with local changes, the package should somehow also be available in the container that’s running on remote, as part of fast registration.

Describe alternatives you've considered

The current workarounds for this are --copy-all at iteration time and pip installing via github at ImageSpec build time (there are probably more workarounds here).

Propose: Link/Inline OR Additional context

This is part of a broader pain point of the flyte-maintainer-approved way of building a flytekit project. The tutorials/guides in our docs sort of assume you're working on a single file, or some simpler structure that doesn't include having a python project that's separate from flyte task/workflow modules.

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@cosmicBboy cosmicBboy added enhancement New feature or request untriaged This issues has not yet been looked at by the Maintainers labels May 9, 2024
@thomasjpfan
Member

Having an ImageSpec(local_packages=...) is nice once the local pip-installable package is set.

For rapid iteration, I think building an image every time code changes is a bit more overhead. I'm thinking of this user story:

  1. They already have a container_image set to a predefined string
  2. They have a pip-installable Python library locally that they are iterating on.
  3. They want their local python library installed in their task environment.

What do you think of having a new --install-local flag that installs the local Python library before the task executes?

pyflyte run --remote --install-local
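A minimal sketch of what such a pre-execution step might do, assuming the local library has already been made available inside the container at a known path. The --install-local flag is only proposed in this thread; the helper names and the example path below are hypothetical.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_install_command(package_dir: Path, with_deps: bool = False) -> List[str]:
    """Build a pip command that installs package_dir into the current interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if not with_deps:
        # Assumption: the predefined container image already ships the dependencies,
        # so only the library under iteration needs to be installed.
        cmd.append("--no-deps")
    cmd.append(str(package_dir))
    return cmd


def install_local(package_dir: Path) -> None:
    """Run the install before the task executes; raises CalledProcessError on failure."""
    subprocess.run(build_install_command(package_dir), check=True)


if __name__ == "__main__":
    # Hypothetical location of the uploaded library inside the task container.
    print(build_install_command(Path("/root/my_local_lib")))
```

Using sys.executable -m pip rather than a bare pip keeps the install tied to the interpreter that will run the task.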

@fg91
Member

fg91 commented May 9, 2024

Notes from discussion in contribs' sync:

This could be incorporated into fast registration. A user would specify e.g. pyflyte run --local-package /some/path1/package1 --local-package /some/path2/package2 .... These packages would then be copied into the code tarball (or into separate tarballs). The pod entrypoint pyflyte-fast-execute would create a directory called e.g. /home/flyte/additional-packages, add this directory to the PYTHONPATH, and finally unpack the additional package tarballs into this directory.

The benefit of this mechanism over ImageSpec would be that it can be used regardless of how images are built. Some organizations build images purely in CI instead of using ImageSpec.
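The mechanism described above could be sketched roughly as follows, using only the standard library. The function names are hypothetical and this is not the actual pyflyte-fast-execute code; it only illustrates packing a local package into a tarball on the client and unpacking it into a directory that is put on the import path in the pod.

```python
import sys
import tarfile
import tempfile
from pathlib import Path
from typing import Iterable


def pack_local_package(package_dir: Path, tarball_path: Path) -> Path:
    """Client side: archive one local package directory into a gzipped tarball."""
    with tarfile.open(tarball_path, "w:gz") as tar:
        # arcname keeps only the package's own folder name inside the archive.
        tar.add(package_dir, arcname=package_dir.name)
    return tarball_path


def unpack_additional_packages(tarballs: Iterable[Path], target_dir: Path) -> Path:
    """Pod side: extract each tarball into target_dir and make it importable."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for tarball in tarballs:
        with tarfile.open(tarball, "r:gz") as tar:
            tar.extractall(target_dir)
    # Equivalent to adding the directory to PYTHONPATH for this process.
    if str(target_dir) not in sys.path:
        sys.path.insert(0, str(target_dir))
    return target_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        pkg = root / "local" / "package1"
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("VALUE = 42\n")

        tarball = pack_local_package(pkg, root / "package1.tar.gz")
        # Stand-in for e.g. /home/flyte/additional-packages in the pod.
        unpack_additional_packages([tarball], root / "additional-packages")

        import package1

        print(package1.VALUE)
```

Because the packages are only placed on the import path and never built, this sketch covers pure-Python packages only.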

@thomasjpfan
Member

@fg91 With multiple local packages, do any of the Python libraries have code that needs to be compiled? For example, a Polars extension plugin that requires Rust to build.

In any case, I think there are enough Python-only packages out there that --local-package ... is a significant improvement.
