I have also asked a question on Stack Overflow, with slight differences.
A simplified case:
```python
# lsof.py — Python 3.11, joblib 1.4 (also tested with 1.4.2)
from joblib import Parallel, delayed
import time
import pandas as pd


class Tasker:
    def __init__(self):
        self.data = pd.Series([])

    def run(self):
        time.sleep(10)
        return 1.0


def get_num_of_opened_files() -> tuple[int, int]:
    """Return (total open files, open .so files) as counted by lsof."""
    from subprocess import run
    return int(run('lsof | wc -l', shell=True, capture_output=True, text=True).stdout.strip()), \
        int(run('lsof | grep \\.so$ | wc -l', shell=True, capture_output=True, text=True).stdout.strip())


tasker = Tasker()
f0, s0 = get_num_of_opened_files()
# Dispatch 32 sleeping tasks to 32 loky workers; the generator is consumed later.
xs = Parallel(n_jobs=32, return_as='generator')(delayed(tasker.run)() for _ in range(32))
time.sleep(2)
f1, s1 = get_num_of_opened_files()
print(f'Opened files: before {f0}, after {f1}, delta all {f1 - f0}, delta so: {s1 - s0}', flush=True)
print(sum(xs))
```
Running the above script prints something like:

```
>> python lsof.py
>> Opened files: before 13924, after 77428, delta all 63504, delta so: 40012
```
joblib opened roughly 60,000 additional files!
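Most of that delta is likely per-process duplication rather than 60,000 distinct files: `lsof` prints one row per process/file pair, so a shared library memory-mapped by all 32 loky workers is counted 32 times. A minimal sketch to check this (it assumes the file path is the last whitespace-separated column of `lsof` output, which holds for common Linux builds but is not guaranteed):

```python
# Sketch: distinguish total lsof rows from unique file paths.
from subprocess import run

def count_open_files() -> tuple[int, int]:
    out = run('lsof', shell=True, capture_output=True, text=True).stdout
    # Skip the header row; take the last column as the file path
    # (an assumption about this lsof build's output format).
    paths = [line.rsplit(None, 1)[-1] for line in out.splitlines()[1:] if line]
    # If unique stays roughly flat while total scales with n_jobs, the
    # growth is per-worker duplication of the same memory-mapped libraries.
    return len(paths), len(set(paths))

total, unique = count_open_files()
print(f'lsof rows: {total}, unique paths: {unique}')
```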
And if I run 10 programs like this at the same time, joblib warns:

```
UserWarning: A worker stopped while some jobs were given to the executor. This can be caused by a too short worker timeout or by a memory leak.
```
**Or it even raises an error (on a server with 2 TB of free memory):**

```
A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.
```
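For what it's worth, a possible workaround while this is investigated, assuming the real tasks are I/O-bound like the `time.sleep` repro and so release the GIL, is to keep everything in one process with the threading backend. This is a sketch of one mitigation using joblib's `parallel_config` (available since 1.3), not a fix for the underlying descriptor growth:

```python
# Sketch: run the same tasks with the threading backend so no extra
# processes (and no per-worker re-import of pandas and its .so files)
# are created. Suitable only for tasks that release the GIL; CPU-bound
# work would serialize on it. Reuses the Tasker instance from the repro.
from joblib import Parallel, delayed, parallel_config

with parallel_config(backend='threading'):
    xs = Parallel(n_jobs=32, return_as='generator')(
        delayed(tasker.run)() for _ in range(32)
    )
    print(sum(xs))
```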