bug: [python] subdeps breaking wmill import #3596

Open
erickvneri opened this issue Apr 23, 2024 · 4 comments
@erickvneri
Contributor

erickvneri commented Apr 23, 2024

Describe the bug

Note: I'm raising this as an issue because I don't have enough Rust expertise to fix it and submit the fix myself.

Resources

  1. Pinning dependencies and Requirements

Summary

Yesterday we submitted a PR that fixes import parsing so that haystack imports load haystack-ai instead of haystack. The release worked perfectly and allowed us to test a few flows.

However, a few of Haystack's modules require additional dependencies. We followed the official documentation referenced above, and it seems we broke our wmill pip installation.

This was reproduced several times on a localhost environment.

To reproduce

  1. Create a flow and an inline Python action.
  2. Copy/paste the following code into the action:

     # requirements:
     # dependency
     # transformers==4.40.0
     # accelerate==0.29.3
     # torch==2.2.2
     import wmill
     import haystack

     def main():
         pass

  3. At some point, you should see the following error message:
ExecutionErr: ExitCode: 1, last log lines:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/tmp/windmill/wk-default-77e24f9bbede-VtJBM/018f0c3d-6288-a8e3-0320-d3952c60bfbe/wrapper.py", line 9, in <module>
    from u.talentgenius.job_recommendations.branchone_2 import step_0 as inner_script
  File "/tmp/windmill/wk-default-77e24f9bbede-VtJBM/018f0c3d-6288-a8e3-0320-d3952c60bfbe/u/talentgenius/job_recommendations/branchone_2/step_0.py", line 7, in <module>
    import wmill
ModuleNotFoundError: No module named 'wmill'

Expected behavior

ExecutionErr: ExitCode: 1, last log lines:
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/tmp/windmill/wk-default-77e24f9bbede-VtJBM/018f0c3d-6288-a8e3-0320-d3952c60bfbe/wrapper.py", line 9, in <module>
    from u.talentgenius.job_recommendations.branchone_2 import step_0 as inner_script
  File "/tmp/windmill/wk-default-77e24f9bbede-VtJBM/018f0c3d-6288-a8e3-0320-d3952c60bfbe/u/talentgenius/job_recommendations/branchone_2/step_0.py", line 7, in <module>
    import wmill
ModuleNotFoundError: No module named 'wmill'

Browser information

Version 1.65.114 Chromium: 124.0.6367.60 (Official Build) (64-bit)

Application version

1.313.0 (2024-04-23)

Additional Context

It seems wmill is broken per action, i.e. other actions with different import configurations aren't affected and can use the wmill module properly.

erickvneri added the bug (Something isn't working) label Apr 23, 2024
@rubenfiszel
Contributor

you're simply missing wmill in your list of requirements

@erickvneri
Contributor Author

I'll give you an update ASAP. I tried importing wmill in a different order, but somehow I was still missing either haystack or wmill... it was a weird loop that in some cases didn't even finish execution/installation.

@erickvneri
Contributor Author

Hi, @rubenfiszel

Here are some steps that you can follow to replicate the loop:

version: CE v1.317.1-7-g094f50cdc

This works great, and explicitly listing wmill in the requirements wasn't necessary. Also, haystack-ai was installed properly.

    import wmill
    from haystack import Document

    hugging_face_token = wmill.get_variable("u/user/HUGGING_FACE_TOKEN")

    def main():
        pass

Errors begin to appear when I try to access haystack's submodules:

  1. This one raises an error because I need additional deps:

     import wmill
     from haystack import Document
     from haystack.components.rankers import TransformersSimilarityRanker
    
     hugging_face_token = wmill.get_variable("u/user/HUGGING_FACE_TOKEN")
    
     def main():
         docs = [Document(content="Paris"), Document(content="Berlin")]
         ranker = TransformersSimilarityRanker()
         ranker.warm_up()
         ranker.run(query="City in France", documents=docs, top_k=1)
    
     {
         "error": {
             "name": "ImportError",
             "stack": "  File \"/tmp/windmill/wk-default-77e24f9bbede-vupvl/018f16cd-a026-3255-0d79-e81aa130865f/u/user/haystack_transformers_similarity_ranker.py\", line 11, in main\n    ranker = TransformersSimilarityRanker()\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n  File \"/tmp/windmill/cache/pip/haystack-ai==2.0.1/haystack/core/component/component.py\", line 132, in __call__\n    instance = super().__call__(*args, **kwargs)\n               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n  File \"/tmp/windmill/cache/pip/haystack-ai==2.0.1/haystack/components/rankers/transformers_similarity.py\", line 92, in __init__\n    torch_and_transformers_import.check()\n\n  File \"/tmp/windmill/cache/pip/lazy-imports==0.3.1/lazy_imports/try_import.py\", line 107, in check\n    raise ImportError(message) from exc_value\n",
             "message": "Failed to import 'accelerate'. Run 'pip install transformers[torch,sentencepiece]'. Original error: No module named 'accelerate'"
         }
     }
    
  2. Add the subdependencies (I switched from the transformers[torch,sentencepiece] format to the following three libraries, since that format raises the same error):

     # requirements:
     # transformers==4.40.0
     # accelerate==0.29.3
     # torch==2.2.2
     import wmill
     from haystack import Document
     from haystack.components.rankers import TransformersSimilarityRanker
    
     hugging_face_token = wmill.get_variable("u/user/HUGGING_FACE_TOKEN")
    
     def main():
         docs = [Document(content="Paris"), Document(content="Berlin")]
         ranker = TransformersSimilarityRanker()
         ranker.warm_up()
         ranker.run(query="City in France", documents=docs, top_k=1)
    
    
     job 018f16bd-bf74-496f-f78e-952f5f937a26 on worker wk-default-77e24f9bbede-vupvl (tag: python3)
    
     --- PYTHON CODE EXECUTION ---
    
     Traceback (most recent call last):
     File "<frozen runpy>", line 198, in _run_module_as_main
     File "<frozen runpy>", line 88, in _run_code
     File "/tmp/windmill/wk-default-77e24f9bbede-vupvl/018f16bd-bf74-496f-f78e-952f5f937a26/wrapper.py", line 9, in <module>
         from u.user import haystack_transformers_similarity_ranker as inner_script
     File "/tmp/windmill/wk-default-77e24f9bbede-vupvl/018f16bd-bf74-496f-f78e-952f5f937a26/u/user/haystack_transformers_similarity_ranker.py", line 6, in <module>
         import wmill
     ModuleNotFoundError: No module named 'wmill'
    
  3. Based on that, I'll try to install wmill explicitly:

     # requirements:
     # transformers==4.40.0
     # accelerate==0.29.3
     # torch==2.2.2
     # wmill
     import wmill
     from haystack import Document
     from haystack.components.rankers import TransformersSimilarityRanker
    
     hugging_face_token = wmill.get_variable("u/user/HUGGING_FACE_TOKEN")
    
     def main():
         docs = [Document(content="Paris"), Document(content="Berlin")]
         ranker = TransformersSimilarityRanker()
         ranker.warm_up()
         ranker.run(query="City in France", documents=docs, top_k=1)
    
    
     job 018f16d1-52fd-584d-033e-008f5ba8b0ea on worker wk-default-8d69dd904b5b-4vDkL (tag: python3)
    
     --- PYTHON CODE EXECUTION ---
    
     Traceback (most recent call last):
     File "<frozen runpy>", line 198, in _run_module_as_main
     File "<frozen runpy>", line 88, in _run_code
     File "/tmp/windmill/wk-default-8d69dd904b5b-4vDkL/018f16d1-52fd-584d-033e-008f5ba8b0ea/wrapper.py", line 9, in <module>
         from u.user import haystack_transformers_similarity_ranker as inner_script
     File "/tmp/windmill/wk-default-8d69dd904b5b-4vDkL/018f16d1-52fd-584d-033e-008f5ba8b0ea/u/user/haystack_transformers_similarity_ranker.py", line 7, in <module>
         from haystack import Document
     ModuleNotFoundError: No module named 'haystack'
    
  4. At this point, Windmill requires me to list haystack-ai explicitly again:

     # requirements:
     # transformers==4.40.0
     # accelerate==0.29.3
     # torch==2.2.2
     # wmill
     # haystack-ai
     import wmill
     from haystack import Document
     from haystack.components.rankers import TransformersSimilarityRanker
    
     hugging_face_token = wmill.get_variable("u/user/HUGGING_FACE_TOKEN")
    
    
     def main():
         docs = [Document(content="Paris"), Document(content="Berlin")]
         ranker = TransformersSimilarityRanker()
         ranker.warm_up()
         ranker.run(query="City in France", documents=docs, top_k=1)
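
Steps 1 to 4 each surfaced one missing module at a time. As a debugging aid, a minimal sketch like the following could report every unresolved module in a single run. `missing_modules` is a hypothetical helper, not part of wmill or haystack; it relies only on the standard library's `importlib.util.find_spec`, and the list of names to check is an assumption taken from the imports and requirements in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found on the current import path."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for dotted names with a missing parent package,
            # or for modules with a broken __spec__.
            found = False
        if not found:
            missing.append(name)
    return missing


def main():
    # Module names taken from this thread; adjust to the script under test.
    return missing_modules(["wmill", "haystack", "transformers", "accelerate", "torch"])
```

Because this script only imports from the standard library, it should run even when the third-party packages failed to install, and its result shows which ones are absent from the job's environment.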
    

It is at this point that Windmill loops forever, and this is the info I can get from the browser:

  1. Preview Request:

     POST http://localhost/api/w/localhost/jobs/run/preview -> 200
    
     Response
     {"path":"u/user/haystack_transformers_similarity_ranker","content":"# requirements:\n# transformers==4.40.0\n# accelerate==0.29.3\n# torch==2.2.2\n# wmill\n# haystack-ai\nimport wmill\nfrom haystack import Document\nfrom haystack.components.rankers import TransformersSimilarityRanker\n\nhugging_face_token = wmill.get_variable(\"u/user/HUGGING_FACE_TOKEN\")\n\n\ndef main():\n    docs = [Document(content=\"Paris\"), Document(content=\"Berlin\")]\n    ranker = TransformersSimilarityRanker()\n    ranker.warm_up()\n    ranker.run(query=\"City in France\", documents=docs, top_k=1)\n","args":{},"language":"python3","tag":null}
    
  2. Get Update Request:

     GET http://localhost/api/w/localhost/jobs_u/getupdate/018f16d8-5d88-8c03-f7cd-261453042728?running=true&log_offset=220 -> 200
    
     Response
     {
         "running": null,
         "completed": null,
         "new_logs": "",
         "log_offset": 220,
         "mem_peak": 5716,
         "flow_status": null
     }
    
  3. Cancel Request:

     POST http://localhost/api/w/localhost/jobs_u/queue/cancel/018f16d8-5d88-8c03-f7cd-261453042728 -> 200
    

This is as deep as I've gotten in debugging the installation.

Let me know if you need extra info to replicate the issue.

@rubenfiszel
Contributor

On your example 4, I do not think it's looping forever; it's just resolving all the dependencies, which can take up to a few minutes depending on your network and CPU. It works for me, and it works on app.windmill.dev except that it exceeds the storage space.
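
To tell a slow dependency resolution apart from a real hang, one option is to poll with a generous deadline instead of cancelling early. The sketch below is only an illustration: `wait_for_job` and `fetch_update` are hypothetical names, `fetch_update` stands for any callable that returns a dict shaped like the getupdate response shown earlier in this thread, and treating a truthy `completed` field as "job finished" is an assumption.

```python
import time


def wait_for_job(fetch_update, timeout_s=900.0, interval_s=2.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reports completion or the deadline passes.

    Returns (finished, logs), where logs concatenates every 'new_logs'
    chunk seen while polling.
    """
    deadline = clock() + timeout_s
    logs = []
    while True:
        update = fetch_update()
        if update.get("new_logs"):
            logs.append(update["new_logs"])
        if update.get("completed"):  # assumption: truthy once the job is done
            return True, "".join(logs)
        if clock() >= deadline:
            return False, "".join(logs)
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; in normal use the defaults apply and the 15-minute default timeout leaves room for large packages such as torch.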
