Theano compilelock issues with MultiPool #105

Open
adrn opened this issue Jul 15, 2020 · 7 comments

Comments

@adrn
Owner

adrn commented Jul 15, 2020

See email by Song Wang:

with schwimmbad.MultiPool() as pool:
    joker_mcmc = tj.TheJoker(prior_mcmc, pool=pool, random_state=rnd)
    mcmc_init = joker_mcmc.setup_mcmc(data, samples)

with schwimmbad.MultiPool() as pool:
    joker = tj.TheJoker(prior, pool=pool, random_state=rnd)
    prior_samples = prior.sample(size=10000, random_state=rnd)
    samples = joker.rejection_sample(data, prior_samples, max_posterior_samples=256)

throws a warning:

"INFO (theano.gof.compilelock): Waiting for existing lock by process '' (I am process '')
INFO (theano.gof.compilelock): To manually release the lock, delete ***/lock_dir"

@dfm
Collaborator

dfm commented Jul 15, 2020

Perhaps you already know this, but I normally use a hack to set the compiledir using os.getpid(). It's possible that this could also be handled by pre-compiling the required theano functions and then passing those around.

@adrn
Owner Author

adrn commented Jul 15, 2020

Oh right (and for context, this was an email I got from a user). Do you have an example you could share?

@dfm
Collaborator

dfm commented Jul 15, 2020

Something like the following can work:

import os
from multiprocessing import Pool

os.environ["THEANO_FLAGS"] = f"compiledir={os.getpid()}"

import theano
import theano.tensor as tt


def func(x):
    x_ = tt.dscalar()
    return theano.function([x_], [x_ * x_])(x)


if __name__ == "__main__":
    with Pool(4) as pool:
        print(list(pool.map(func, range(10))))

@dfm
Collaborator

dfm commented Jul 15, 2020

Or...

from multiprocessing import Pool
import theano
import theano.tensor as tt


if __name__ == "__main__":
    x_ = tt.dscalar()
    func = theano.function([x_], [x_ * x_])
    with Pool(4) as pool:
        print(list(pool.map(func, range(10))))

@AstroSong

Thanks @adrn @dfm

When I add this "os.environ" line to my script, the warning stops flooding the screen and appears only a fixed number of times (equal to the number of processes set in the Pool). However, the code appears to stall, although the CPU is busy. I waited for more than 20 minutes, and none of the processes completed the rejection sampling step. It seems to need a very long time to move on to the next step. Is it still not running in parallel?

Another strange case: when I set the number of processes to 2, the code runs, but it skips the MCMC part.

My computer has 10 cores. Is it OK to set the number of processes to 4?

I also tried opening two or three terminals, each running a single-process version of the code. It works, but it does not save much time. It seems that the different terminals are not running fully in parallel.

@adrn
Owner Author

adrn commented Jul 28, 2020

@AstroSong Strange! Could you share a minimum working example script, and send the versions of schwimmbad & thejoker that you are using? What platform are you on? Thanks!

python -c "import schwimmbad; print(schwimmbad.__version__)"
python -c "import thejoker; print(thejoker.__version__)"

@AstroSong

@adrn If I use the second code example from @dfm above, it works without any warning. But if I use the first one, the warning is still there, as follows:

INFO (theano.gof.compilelock): Waiting for existing lock by unknown process (I am process '37974')
INFO (theano.gof.compilelock): Waiting for existing lock by unknown process (I am process '37975')
INFO (theano.gof.compilelock): To manually release the lock, delete /home/song/k2_4/joker/test/37878/lock_dir
INFO (theano.gof.compilelock): To manually release the lock, delete /home/song/k2_4/joker/test/37878/lock_dir
INFO (theano.gof.compilelock): Waiting for existing lock by unknown process (I am process '37974')
INFO (theano.gof.compilelock): To manually release the lock, delete /home/song/k2_4/joker/test/37878/lock_dir
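
A note on the log above: the lock directory ends in 37878, while the waiting processes are 37974 and 37975, which suggests that the workers all share one compile directory named after the parent's PID. That would be expected if the pool forks its workers after Theano was already imported and configured in the parent. Below is a minimal sketch of one possible workaround, assuming Theano is not imported before the pool starts. The names `worker_compiledir`, `init_worker`, `main`, and the base directory are hypothetical and chosen for illustration; they are not part of Theano, schwimmbad, or thejoker.

```python
import os
from multiprocessing import Pool


def worker_compiledir(base_dir, pid=None):
    """Return a compile directory path that is unique to one process."""
    pid = os.getpid() if pid is None else pid
    return os.path.join(base_dir, f"theano_{pid}")


def init_worker(base_dir):
    # Runs once inside each worker, so os.getpid() is the worker's own PID.
    # Assumption: Theano has not been imported yet in this process. Theano
    # reads THEANO_FLAGS at import time, so a later change has no effect.
    os.environ["THEANO_FLAGS"] = f"compiledir={worker_compiledir(base_dir)}"


def func(x):
    # Import Theano lazily so that it picks up this worker's THEANO_FLAGS.
    import theano
    import theano.tensor as tt

    x_ = tt.dscalar()
    return theano.function([x_], [x_ * x_])(x)


def main(base_dir="/tmp/theano_compiledirs"):  # hypothetical location
    # The initializer is called in every worker before it receives any task.
    with Pool(4, initializer=init_worker, initargs=(base_dir,)) as pool:
        print(list(pool.map(func, range(10))))
```

To try it, call `main()` from under an `if __name__ == "__main__":` guard. This only helps when Theano is first imported inside the workers; a library that imports Theano at module import time in the parent would likely still need a different approach, such as the precompiled function shown earlier in the thread.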

python -c "import schwimmbad; print(schwimmbad.__version__)" => 0.3.1
python -c "import thejoker; print(thejoker.__version__)" => 1.1

In addition, I previously updated thejoker using git+https://github.com/adrn/thejoker, but I got an error:

  File "/usr/local/python3/lib/python3.8/site-packages/thejoker/prior.py", line 320, in sample
    with random_state_context(random_state):
  File "/usr/local/python3/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/usr/local/python3/lib/python3.8/site-packages/thejoker/utils.py", line 299, in random_state_context
    np.random.seed(integers(random_state, 2**32-1))  # HACK
  File "/usr/local/python3/lib/python3.8/site-packages/thejoker/utils.py", line 30, in <lambda>
    integers = lambda obj, *args, **kwargs: obj.integers(*args, **kwargs)
AttributeError: 'numpy.random.mtrand.RandomState' object has no attribute 'integers'

My numpy version is 1.19.0.
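
For context on the traceback above: in NumPy, `integers` is a method of the newer `numpy.random.Generator` (created with `numpy.random.default_rng`), while the legacy `numpy.random.RandomState` provides `randint` instead. The following is a minimal sketch of a helper that accepts either kind of object. It is not thejoker's code, and `draw_integers` is a hypothetical name used only for illustration.

```python
def draw_integers(rng, low, high=None, size=None):
    """Draw random integers from either NumPy random-number API.

    `rng` is assumed to be a numpy.random.Generator or a legacy
    numpy.random.RandomState; the choice is made by duck typing.
    """
    if hasattr(rng, "integers"):
        # numpy.random.Generator, e.g. from numpy.random.default_rng(seed)
        return rng.integers(low, high, size=size)
    # Legacy numpy.random.RandomState has randint but no integers method.
    return rng.randint(low, high, size=size)
```

With such a helper, both `draw_integers(np.random.default_rng(42), 2**32 - 1)` and `draw_integers(np.random.RandomState(42), 2**32 - 1)` would be accepted on a typical 64-bit Linux setup. Alternatively, passing a `Generator` as `random_state` may avoid the error altogether, assuming that is what the development version of thejoker expects.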
