Process is blocked in drawn_humanoid_pose_estimator #285

Open

CaNaRdEoS opened this issue Apr 30, 2024 · 1 comment

@CaNaRdEoS

I run `python image_to_animation.py drawings/garlic.png garlic_out`.
The script gets stuck on the POST call: `resp = requests.post("http://localhost:8080/predictions/drawn_humanoid_pose_estimator", files=data_file, verify=False)`
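
For reference, here is a minimal sketch of that request with a `timeout` added, so it fails with an exception instead of hanging while the backend worker keeps dying. The URL and `verify=False` are as quoted above; the file path, the multipart field name `data`, and the timeout value are assumptions for illustration.

```python
import requests

# Sketch of the quoted request with a timeout added, so a dead TorchServe
# worker surfaces as an exception instead of an indefinite hang.
# "drawings/garlic.png" and the field name "data" are assumptions here;
# the real script builds its own `data_file` payload.
with open("drawings/garlic.png", "rb") as f:
    data_file = {"data": f}
    resp = requests.post(
        "http://localhost:8080/predictions/drawn_humanoid_pose_estimator",
        files=data_file,
        verify=False,
        timeout=120,  # seconds; raises requests.exceptions.Timeout instead of blocking forever
    )
print(resp.status_code)
```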

I have a Docker container with 20 GB of RAM. I don't think it's a RAM problem, since it only uses 13 GB.

Here are the last lines of the Docker log:

```
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG - Backend worker process died.
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG - Traceback (most recent call last):
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_service_worker.py", line 263, in <module>
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     worker.run_server()
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_service_worker.py", line 231, in run_server
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     self.handle_connection(cl_socket)
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_service_worker.py", line 194, in handle_connection
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     service, result, code = self.load_model(msg)
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -                             ^^^^^^^^^^^^^^^^^^^^
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_service_worker.py", line 131, in load_model
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     service = model_loader.load(
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -               ^^^^^^^^^^^^^^^^^^
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_loader.py", line 108, in load
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     module, function_name = self._load_handler_file(handler)
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-30T10:18:45,112 [INFO ] epollEventLoopGroup-5-32 org.pytorch.serve.wlm.WorkerThread - 9000 Worker disconnected. WORKER_STARTED
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/ts/model_loader.py", line 153, in _load_handler_file
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     module = importlib.import_module(module_name)
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-30T10:18:45,112 [DEBUG] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerThread - System state is : WORKER_STARTED
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/importlib/__init__.py", line 90, in import_module
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     return _bootstrap._gcd_import(name[level:], package, level)
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
2024-04-30T10:18:45,112 [DEBUG] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerThread - Backend worker monitoring thread interrupted or backend worker process died., responseTimeout:120sec
java.lang.InterruptedException: null
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2056) ~[?:?]
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2133) ~[?:?]
	at java.util.concurrent.ArrayBlockingQueue.poll(ArrayBlockingQueue.java:432) ~[?:?]
	at org.pytorch.serve.wlm.WorkerThread.run(WorkerThread.java:229) [model-server.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:829) [?:?]
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
2024-04-30T10:18:45,112 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap_external>", line 995, in exec_module
2024-04-30T10:18:45,113 [WARN ] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.BatchAggregator - Load model failed: drawn_humanoid_pose_estimator, error: Worker died.
2024-04-30T10:18:45,113 [DEBUG] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerThread - W-9000-drawn_humanoid_pose_estimator_1.0 State change WORKER_STARTED -> WORKER_STOPPED
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
2024-04-30T10:18:45,113 [WARN ] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerThread - Auto recovery failed again
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/tmp/models/616337bc7e8e4f0396a01d96d6b2a8ed/mmpose_handler.py", line 8, in <module>
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     from mmpose.apis import (inference_bottom_up_pose_model,
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -   File "/opt/conda/lib/python3.12/site-packages/mmpose/__init__.py", line 24, in <module>
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG -     assert (mmcv_version >= digit_version(mmcv_minimum_version)
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout MODEL_LOG - AssertionError: MMCV==1.7.2 is used but incompatible. Please install mmcv>=1.3.8, <=1.7.0.
2024-04-30T10:18:45,113 [INFO ] W-9000-drawn_humanoid_pose_estimator_1.0-stdout org.pytorch.serve.wlm.WorkerLifeCycle - Stopped Scanner - W-9000-drawn_humanoid_pose_estimator_1.0-stdout
2024-04-30T10:18:45,113 [WARN ] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-drawn_humanoid_pose_estimator_1.0-stderr
2024-04-30T10:18:45,113 [WARN ] W-9000-drawn_humanoid_pose_estimator_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - terminateIOStreams() threadName=W-9000-drawn_humanoid_pose_estimator_1.0-stdout
2024-04-30T10:18:45,134 [WARN ] W-9015-drawn_humanoid_pose_estimator_1.0-stderr MODEL_LOG - /opt/conda/lib/python3.12/site-packages/mmcv/__init__.py:20: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
```


These seem to be Java errors. How can I fix them?
@yihleego (Contributor)

You can try limiting CPU and memory, e.g. `docker run --cpus 4 -m 8g`.
