multiprocessing.pool.MaybeEncodingError #193

Open
glorioushonor opened this issue Mar 26, 2023 · 8 comments
Comments

@glorioushonor

Good job! When I run the script python -m scripts.render_batch -debug -headless, I get the following error:

Start Rendering thuman2 with 36 views, 512x512 size.
Output dir: ./debug/thuman2_36views
Rendering types: ['light', 'normal', 'depth']
  0%|                                                                                                                                                                                    | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/media/vision/linjie/ICON/scripts/render_batch.py", line 254, in <module>
    for _ in tqdm(
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
multiprocessing.pool.MaybeEncodingError: Error sending result: '<multiprocessing.pool.ExceptionWithTraceback object at 0x7f5f10a6ed90>'. Reason: 'ValueError('ctypes objects containing pointers cannot be pickled')'

I think it is related to the number of GPUs. You used two GPUs, while I am on a four-GPU server where I can only use GPU number 3, and I don't know how to change that setting.

@MalignusCN

Multiprocessing may be hiding the real bug. First modify render_batch so that it runs without any multiprocessing and just executes render_subject directly; then the underlying error should show up.
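
A minimal sketch of why the pool reports MaybeEncodingError instead of the underlying exception: a worker's exception has to be pickled to travel back to the parent process, and pickling fails when the exception carries ctypes pointers. The RuntimeError below is only a stand-in (an assumption for illustration) for whatever the worker actually raised; it uses nothing beyond the standard library.

```python
import ctypes
import pickle

# A ctypes pointer, standing in for a native handle held by the real exception.
ptr = ctypes.pointer(ctypes.c_int(42))


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error pickle raised."""
    try:
        pickle.dumps(obj)
        return None
    except Exception as exc:
        return exc


# An ordinary exception pickles fine, so a pool could send it to the parent.
plain = try_pickle(ValueError("plain message"))

# An exception whose args include a ctypes pointer cannot be pickled.
with_ptr = try_pickle(RuntimeError("stand-in for the worker's error", ptr))

print(plain)
print(type(with_ptr).__name__, with_ptr)
```

The second result is the same ValueError text quoted in the Reason field of the traceback above, which suggests the message describes the transport failure rather than the original bug.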

@lucas-jay

Hello, I encountered a similar problem before and resolved it by setting rs_rate in render_batch.py to 1.0.

@glorioushonor
Author

Thanks for your kind suggestion, but it didn't work for me.

@lucas-jay

Or maybe you can temporarily change the multiprocessing part to run in a single process to find the bug, like:
[image]
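
Since the screenshot is not available in text form, here is a hedged sketch of the general idea, not the actual code from the image. The render_subject below is a hypothetical stand-in with a simplified signature, and the num_workers switch is an assumption for illustration; the real function in render_batch.py takes different arguments.

```python
from multiprocessing import Pool


def render_subject(subject):
    # Hypothetical stand-in for the real render_subject in render_batch.py.
    return f"rendered {subject}"


def run(subjects, num_workers=0):
    """Render every subject; num_workers <= 0 runs in-process for debugging."""
    if num_workers <= 0:
        # Plain loop: any exception surfaces directly with its own traceback,
        # instead of being pickled and re-raised by the pool.
        return [render_subject(s) for s in subjects]
    with Pool(num_workers) as pool:
        return list(pool.imap(render_subject, subjects))


if __name__ == "__main__":
    print(run(["0000", "0001"]))
```

Once the single-process run succeeds, passing a positive num_workers restores the parallel path.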

@glorioushonor
Author

Thank you very much, you are right. Another error was causing the multiprocessing error. With multiprocessing disabled, the following error now appears. I have searched for it on Google but have not found a solution yet. I'll keep exploring until I figure it out.

Start Rendering thuman2 with 36 views, 512x512 size.
Output dir: ./debug/thuman2_36views
Rendering types: ['light', 'normal', 'depth']
  0%|                                                                                                                                | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/media/vision/linjie/ICON copy 2/scripts/render_batch.py", line 254, in <module>
    for _ in tqdm(
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/site-packages/tqdm/std.py", line 1180, in __iter__
    for obj in iterable:
  File "/media/vision/linjie/ICON copy 2/scripts/render_batch.py", line 57, in render_subject
    initialize_GL_context(width=size, height=size, egl=egl)
  File "/media/vision/linjie/ICON copy 2/lib/renderer/gl/init_gl.py", line 23, in initialize_GL_context
    create_opengl_context((width, height))
  File "/media/vision/linjie/ICON copy 2/lib/renderer/gl/glcontext.py", line 89, in create_opengl_context
    egl_display = egl.eglGetDisplay(egl.EGL_DEFAULT_DISPLAY)
  File "/media/vision/linjie/.conda/envs/ICON/lib/python3.8/site-packages/OpenGL/platform/baseplatform.py", line 402, in __call__
    return self( *args, **named )
  File "src/errorchecker.pyx", line 58, in OpenGL_accelerate.errorchecker._ErrorChecker.glCheckError
OpenGL.error.GLError: GLError(
        err = 12300,
        baseOperation = eglGetDisplay,
        cArguments = (
                <OpenGL._opaque.EGLNativeDisplayType_pointer object at 0x7fd5b9323240>,
        ),
        result = <OpenGL._opaque.EGLDisplay_pointer object at 0x7fd5b9323140>
)
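
For readers decoding the err value above: EGL error codes are usually listed in hexadecimal, so converting the decimal number makes it searchable. The lookup below is a small sketch using the error constants from the Khronos EGL header; the helper name egl_error_name is made up for illustration, and the thread does not establish what the code means for this particular setup.

```python
# EGL error codes as defined in the Khronos EGL header (egl.h).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(code):
    """Map a numeric EGL error code to its symbolic name (hypothetical helper)."""
    return EGL_ERRORS.get(code, f"unknown (0x{code:04X})")


print(hex(12300), egl_error_name(12300))  # 0x300c EGL_BAD_PARAMETER
```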

@glorioushonor
Author

Hi, sorry to bother you. I have reproduced the results, but the data provided by the author has expired. Could we exchange quantitative experimental results and other implementation details? If you don't mind, I would like to contact you by email or add you as a QQ friend to discuss further. Thank you.

@xiaoniujz

I also encountered this problem. I replaced my NVIDIA driver with the NVIDIA Display Driver, and the error disappeared.

@msverma101

I tried using a single GPU as mentioned, but when I check the GPU usage it shows 0%.
