radical hangs when running from conda environment #660

Open
JMGilbert opened this issue Feb 6, 2024 · 4 comments

@JMGilbert

I have been using facts successfully for quite some time and have recently been attempting to merge the environment for facts with the environment for another application. Despite successfully running facts using venv, when I attempt to run it using a conda environment (using identical packages), the session hangs indefinitely:

EnTK session: re.session.404b4a68-c525-11ee-a75e-cfdd3d6ffb24
Creating AppManager
Setting up ZMQ queues                                                         ok
AppManager initialized                                                        ok
Validating and assigning resource manager                                     ok
Setting up ZMQ queues                                                        n/a

This is also the output of the verbose.log file. I have noticed that when I check the virtual environment generated at ~/radical.pilot.sandbox/ve.local.localhost.1.46.2/, there is no activate in the bin folder. When I manually set up an environment under the same name using venv: python3 -m venv ~/radical.pilot.sandbox/ve.local.localhost.1.46.2/ (and install radical.entk in that venv), there is an activate file in the folder and the program runs without error from the conda environment.
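The missing `activate` script described above can be checked for programmatically. This is a minimal sketch, not part of RADICAL-Pilot: the helper name `missing_venv_files` and the list of required files are assumptions chosen for illustration.

```python
from pathlib import Path


def missing_venv_files(ve_path, required=("bin/python3", "bin/activate")):
    """Return the entries of `required` that do not exist under `ve_path`.

    `required` is an assumed, illustrative list of files that a usable
    virtual environment on Linux/macOS is expected to contain.
    """
    root = Path(ve_path).expanduser()
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    # Path taken from the report above; adjust to the local sandbox.
    ve = "~/radical.pilot.sandbox/ve.local.localhost.1.46.2"
    print(missing_venv_files(ve))
```

An empty list means both files are present. A result containing `bin/activate` matches the half-created environment described in this report.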

Here is my environment.yml file:

name: radical-testing
channels:
  - conda-forge
  - defaults
dependencies:
  - python==3.9
  - setuptools
  - pip
  - wheel
  - virtualenv
  - pip:
      - radical.entk
      - pyyaml==6.0

I have also zipped and attached my re.session. Thanks in advance for any help you can provide!

re.session.zip

@andre-merzky
Member

Thanks for reporting! The session seems to hang because the pilot never comes up. Would you please also create a zip or tarball of the pilot sandbox and attach that here? Thank you!

@JMGilbert
Author

Attached is my radical pilot sandbox. My computer is not allowing me to zip some of the files from the venv, so I have attached a screenshot of those files (the contents of radical.pilot.sandbox/ve.local.localhost.1.46.2/bin). In the radical pilot sandbox are three sessions:

  • Two of the sessions I killed with a keyboard interrupt:
    • re.session.7296a7a8-c531-11ee-be5d-3f86565debf6
    • re.session.de68d350-c530-11ee-8700-1d9ac2b0884f
  • The third session I exported while it was hanging:
    • re.session.de68d350-c530-11ee-8700-1d9ac2b0884f

Let me know if there's anything else you need. Thanks!

Screenshot_59

radical.pilot.sandbox.zip

@andre-merzky
Member

Thanks @JMGilbert , that helped.

The pilot bootstrapper failed because it could use neither python3 -m venv nor virtualenv to prepare the pilot environment. From <session_id>/pilot.0000/bootstrap_0.out:

# -------------------------------------------------------------------
#
# Create ve with venv
# cmd: python3 -m venv /root/radical.pilot.sandbox/ve.local.localhost.1.46.2
#
The virtual environment was not created successfully because ensurepip is not
available.  On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.

    apt-get install python3-venv

You may need to use sudo with that command.  After installing the python3-venv
package, recreate your virtual environment.

Failing command: ['/root/radical.pilot.sandbox/ve.local.localhost.1.46.2/bin/python3', '-Im', 'ensurepip', '--upgrade', '--default-pip']

#
# ERROR
# no fallback command available
#
# -------------------------------------------------------------------

# -------------------------------------------------------------------
#
# Create ve with system virtualenv
# cmd: virtualenv /root/radical.pilot.sandbox/ve.local.localhost.1.46.2
#
/root/radical.pilot.sandbox/re.session.de68d350-c530-11ee-8700-1d9ac2b0884f/pilot.0000//bootstrap_0.sh: line 597: virtualenv: command not found
#
# ERROR
# no fallback command available
#
# -------------------------------------------------------------------

# -------------------------------------------------------------------
#
# Download virtualenv tgz
# cmd: curl -1 -k -L -O 'https://files.pythonhosted.org/packages/1c/c2/7516ea983fc37cec2128e7cb0b2b516125a478f8fc633b8f5dfa849f13f7/virtualenv-16.7.12.tar.gz'
#
/root/radical.pilot.sandbox/re.session.de68d350-c530-11ee-8700-1d9ac2b0884f/pilot.0000//bootstrap_0.sh: line 597: curl: command not found
#
# ERROR
# no fallback command available
#
# -------------------------------------------------------------------
ERROR: couldn't download virtualenv via curl
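The three strategies in the log above (venv with ensurepip, a `virtualenv` executable, and a download via `curl`) can be probed before starting a session. This is a rough sketch under assumptions: the helper name `bootstrap_options` is hypothetical, and it inspects the interpreter that runs the script, which may differ from the `python3` the bootstrapper finds on its own PATH.

```python
import importlib.util
import shutil


def bootstrap_options():
    """Report which of the three strategies seen in bootstrap_0.out look usable.

    Assumption: the current interpreter and PATH are the same ones the
    pilot bootstrapper would see.
    """
    has_venv = importlib.util.find_spec("venv") is not None
    has_ensurepip = importlib.util.find_spec("ensurepip") is not None
    return {
        # `python3 -m venv` needs both the venv and ensurepip modules
        "venv": has_venv and has_ensurepip,
        # `virtualenv <path>` needs the executable on PATH
        "virtualenv": shutil.which("virtualenv") is not None,
        # the tarball download needs curl on PATH
        "curl": shutil.which("curl") is not None,
    }


if __name__ == "__main__":
    for name, ok in bootstrap_options().items():
        print(f"{name:12s} {'ok' if ok else 'missing'}")
```

If all three entries are reported missing, the bootstrapper has no way to create the pilot environment, which is the situation shown in the log.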

These errors should have led to a failing pilot, not a hanging one; I'll investigate why that is happening. Meanwhile, there are two ways to address the issue above: either make sure that one of the options attempted by the bootstrapper works, or instruct the pilot to use a manually created virtualenv. The latter can be any virtualenv in which radical.pilot and its dependencies are installed. A simple way is to use the same virtualenv in which the client side is running. To do so, you can drop the following config file into ~/.radical/pilot/configs/resource_local.json:

{
    "localhost": {
        "rp_version"    : "installed",
        "virtenv_mode"  : "local"
    }
}

That config will override these two settings: virtenv_mode=local triggers the use of the client-side (local) virtualenv, and rp_version=installed means the pilot will not attempt to install RP in that env but will just use what is already installed.
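For scripted setups, the same file can be written from Python. This is only a sketch: the file name, directory, and the two keys come from the comment above, while the helper name `write_resource_config` is hypothetical. Note that it replaces any existing `resource_local.json` in the target directory.

```python
import json
from pathlib import Path

# Settings copied from the config shown above.
CONFIG = {
    "localhost": {
        "rp_version": "installed",
        "virtenv_mode": "local",
    }
}


def write_resource_config(config_dir="~/.radical/pilot/configs"):
    """Write resource_local.json into `config_dir` and return its path.

    Any existing resource_local.json in that directory is overwritten.
    """
    path = Path(config_dir).expanduser() / "resource_local.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(CONFIG, indent=4) + "\n")
    return path
```

Calling `write_resource_config()` with no arguments targets the directory named above. Passing a different directory is useful for a dry run before touching the real configuration.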

@JMGilbert
Author

Got it working now -- this was very helpful, thank you!
