Error trying to recognize or register person face #5

Open
nurmukhametdaniyar opened this issue Feb 20, 2024 · 6 comments

Comments

@nurmukhametdaniyar

I followed the instructions in the README and the Docker containers start fine, but I get an error whenever I try to recognize or register a person:

failed to retrieve the metadata: [StatusCode.UNAVAILABLE] failed to connect to all addresses; last error: UNKNOWN: ipv4:127.0.0.1:8081: Failed to connect to remote host: Connection refused

Error getting model info

I'm not sure how to solve this and have spent the last couple of days on it.

@SamSamhuns
Owner

This happens when requests are sent to the server before the container for the uvicorn_trt_server:latest image has started hosting the face detection models with the Triton server. Can you check that the container is running and that the models are properly hosted inside it, using docker logs NAME_OF_CONTAINER (in this case, docker logs uvicorn_trt_server_cont)?
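A minimal sketch of checking this from the client side before sending any requests, assuming Python is available on the host. It only tests whether something accepts TCP connections on a given address, such as the 127.0.0.1:8081 from the error above. The function names `port_is_open` and `wait_for_port` are made up for illustration and are not part of this project. An open port does not prove that the models loaded, so the container logs remain the authoritative check.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unreachable hosts.
        return False


def wait_for_port(host: str, port: int, total_wait_s: float = 60.0,
                  interval_s: float = 2.0) -> bool:
    """Poll host:port until it accepts connections or total_wait_s elapses."""
    deadline = time.monotonic() + total_wait_s
    while True:
        if port_is_open(host, port):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_port("127.0.0.1", 8081)` before the first recognize or register request would return False if the Triton process never came up or has already exited, which matches the "Connection refused" symptom.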

@nurmukhametdaniyar
Author

nurmukhametdaniyar commented Feb 20, 2024

@SamSamhuns I see. For some reason it is not able to load the models:

I0220 12:03:16.163007 51 server.cc:264] Waiting for in-flight requests to complete.

I0220 12:03:16.163282 51 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences

I0220 12:03:16.163726 51 server.cc:295] All models are stopped, unloading models

I0220 12:03:16.163843 51 server.cc:302] Timeout 30: Found 3 live models and 0 in-flight non-inference requests

I0220 12:03:16.165292 51 tensorflow.cc:2729] TRITONBACKEND_ModelInstanceFinalize: delete instance state

I0220 12:03:16.165433 51 tensorflow.cc:2668] TRITONBACKEND_ModelFinalize: delete model state

I0220 12:03:16.166342 51 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state

I0220 12:03:16.451797 51 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state

I0220 12:03:16.454315 51 model_lifecycle.cc:579] successfully unloaded 'arcface_resnet18_110' version 1

I0220 12:03:16.587559 51 model_lifecycle.cc:579] successfully unloaded 'facenet_trtserver' version 1

I0220 12:03:17.168129 51 server.cc:302] Timeout 29: Found 1 live models and 0 in-flight non-inference requests

I0220 12:03:17.352380 51 model_lifecycle.cc:579] successfully unloaded 'face_detection_postprocess' version 1

I0220 12:03:18.171294 51 server.cc:302] Timeout 28: Found 0 live models and 0 in-flight non-inference requests

error: creating server: Internal - failed to load all models

@SamSamhuns
Owner

Please post the full logs from the error, using docker logs uvicorn_trt_server_cont.

@nurmukhametdaniyar
Author

I think I found the problem: I am using an M1 Mac, which has no Nvidia GPU and therefore cannot use CUDA.

W0220 12:27:58.050274 54 pinned_memory_manager.cc:236] Unable to allocate pinned system memory, pinned memory pool will not be available: no CUDA-capable device is detected

I0220 12:27:58.050860 54 cuda_memory_manager.cc:115] CUDA memory pool disabled

I0220 12:27:58.055310 54 model_lifecycle.cc:459] loading: arcface_resnet18_110:1

I0220 12:27:58.055458 54 model_lifecycle.cc:459] loading: face-reidentification-retail-0095:1

I0220 12:27:58.055516 54 model_lifecycle.cc:459] loading: face_detection_0204:1

I0220 12:27:58.055578 54 model_lifecycle.cc:459] loading: face_detection_postprocess:1

I0220 12:27:58.055642 54 model_lifecycle.cc:459] loading: facenet_trtserver:1

E0220 12:27:58.056370 54 model_lifecycle.cc:597] failed to load 'face_detection_0204' version 1: Invalid argument: unable to find 'libtriton_openvino.so' for model 'face_detection_0204', searched: app/triton_server/models/face_detection_0204/1, app/triton_server/models/face_detection_0204, /opt/tritonserver/backends/openvino

E0220 12:27:58.056363 54 model_lifecycle.cc:597] failed to load 'face-reidentification-retail-0095' version 1: Invalid argument: unable to find 'libtriton_openvino.so' for model 'face-reidentification-retail-0095', searched: app/triton_server/models/face-reidentification-retail-0095/1, app/triton_server/models/face-reidentification-retail-0095, /opt/tritonserver/backends/openvino

I0220 12:27:58.066352 54 onnxruntime.cc:2459] TRITONBACKEND_Initialize: onnxruntime

I0220 12:27:58.066396 54 onnxruntime.cc:2469] Triton TRITONBACKEND API version: 1.10

I0220 12:27:58.066399 54 onnxruntime.cc:2475] 'onnxruntime' TRITONBACKEND API version: 1.10

I0220 12:27:58.066402 54 onnxruntime.cc:2505] backend configuration:

{"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}

/home/triton-server/venv/lib/python3.8/site-packages/pydantic/_internal/fields.py:151: UserWarning: Field "model_name" has conflict with protected namespace "model".

You may be able to resolve this warning by setting model_config['protected_namespaces'] = ().

warnings.warn(

I0220 12:28:00.323930 54 tensorflow.cc:2536] TRITONBACKEND_Initialize: tensorflow

I0220 12:28:00.324065 54 tensorflow.cc:2546] Triton TRITONBACKEND API version: 1.10

I0220 12:28:00.324072 54 tensorflow.cc:2552] 'tensorflow' TRITONBACKEND API version: 1.10

I0220 12:28:00.324076 54 tensorflow.cc:2576] backend configuration:

{"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}}

I0220 12:28:00.324242 54 python_be.cc:1856] TRITONBACKEND_ModelInstanceInitialize: face_detection_postprocess_0 (CPU device 0)

I0220 12:28:00.516557 54 model_lifecycle.cc:694] successfully loaded 'face_detection_postprocess' version 1

I0220 12:28:00.521232 54 onnxruntime.cc:2563] TRITONBACKEND_ModelInitialize: arcface_resnet18_110 (version 1)

I0220 12:28:00.522941 54 onnxruntime.cc:666] skipping model configuration auto-complete for 'arcface_resnet18_110': inputs and outputs already specified

I0220 12:28:00.523689 54 onnxruntime.cc:2606] TRITONBACKEND_ModelInstanceInitialize: arcface_resnet18_110_0 (CPU device 0)

I0220 12:28:00.893132 54 tensorflow.cc:2642] TRITONBACKEND_ModelInitialize: facenet_trtserver (version 1)

I0220 12:28:00.894961 54 model_lifecycle.cc:694] successfully loaded 'arcface_resnet18_110' version 1

2024-02-20 12:28:00.896630: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:00.992433: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }

2024-02-20 12:28:00.994066: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:01.036530: E tensorflow/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected

2024-02-20 12:28:01.036597: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (a5543553b3b8): /proc/driver/nvidia/version does not exist

2024-02-20 12:28:01.310714: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:354] MLIR V1 optimization pass is not enabled

2024-02-20 12:28:01.358767: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.

2024-02-20 12:28:02.613673: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:03.136287: I tensorflow/cc/saved_model/loader.cc:325] SavedModel load for tags { serve }; Status: success: OK. Took 2239755 microseconds.

I0220 12:28:03.429492 54 tensorflow.cc:2691] TRITONBACKEND_ModelInstanceInitialize: facenet_trtserver_0 (CPU device 0)

W0220 12:28:03.430826 54 tensorflow.cc:834] GPU Execution Accelerator will be ignored for model instance on CPU

2024-02-20 12:28:03.430997: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:03.477543: I tensorflow/cc/saved_model/reader.cc:89] Reading meta graph with tags { serve }

2024-02-20 12:28:03.477623: I tensorflow/cc/saved_model/reader.cc:130] Reading SavedModel debug info (if present) from: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:03.650165: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.

2024-02-20 12:28:04.859342: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: app/triton_server/models/facenet_trtserver/1/model.savedmodel

2024-02-20 12:28:05.212270: I tensorflow/cc/saved_model/loader.cc:325] SavedModel load for tags { serve }; Status: success: OK. Took 1781321 microseconds.

I0220 12:28:05.213535 54 model_lifecycle.cc:694] successfully loaded 'facenet_trtserver' version 1

E0220 12:28:05.215350 54 model_repository_manager.cc:487] Invalid argument: ensemble 'ensemble_face_arcface' depends on 'face_detection_0204' which has no loaded version

E0220 12:28:05.215807 54 model_repository_manager.cc:487] Invalid argument: ensemble 'ensemble_face_face_reid' depends on 'face-reidentification-retail-0095' which has no loaded version

E0220 12:28:05.215817 54 model_repository_manager.cc:487] Invalid argument: ensemble 'ensemble_face_facenet' depends on 'face_detection_0204' which has no loaded version

I0220 12:28:05.220091 54 server.cc:563]

+------------------+------+

| Repository Agent | Path |

+------------------+------+

+------------------+------+

I0220 12:28:05.220224 54 server.cc:590]

+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Backend | Path | Config |

+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |

| python | /opt/tritonserver/backends/python/libtriton_python.so | {"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |

| tensorflow | /opt/tritonserver/backends/tensorflow2/libtriton_tensorflow2.so | {"cmdline":{"auto-complete-config":"true","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |

+-------------+-----------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0220 12:28:05.220770 54 server.cc:633]

+-----------------------------------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Model | Version | Status |

+-----------------------------------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| arcface_resnet18_110 | 1 | READY |

| face-reidentification-retail-0095 | 1 | UNAVAILABLE: Invalid argument: unable to find 'libtriton_openvino.so' for model 'face-reidentification-retail-0095', searched: app/triton_server/models/face-reidentification-retail-0095/1, app/triton_server/models/face-reidentification-retail-0095, /opt/tritonserver/backends/openvino |

| face_detection_0204 | 1 | UNAVAILABLE: Invalid argument: unable to find 'libtriton_openvino.so' for model 'face_detection_0204', searched: app/triton_server/models/face_detection_0204/1, app/triton_server/models/face_detection_0204, /opt/tritonserver/backends/openvino |

| face_detection_postprocess | 1 | READY |

| facenet_trtserver | 1 | READY |

+-----------------------------------+---------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0220 12:28:05.223044 54 tritonserver.cc:2264]

+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| Option | Value |

+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

| server_id | triton |

| server_version | 2.29.0 |

| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace logging |

| model_repository_path[0] | app/triton_server/models |

| model_control_mode | MODE_NONE |

| strict_model_config | 0 |

| rate_limit | OFF |

| pinned_memory_pool_byte_size | 268435456 |

| response_cache_byte_size | 0 |

| min_supported_compute_capability | 6.0 |

| strict_readiness | 1 |

| exit_timeout | 30 |

+----------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0220 12:28:05.223066 54 server.cc:264] Waiting for in-flight requests to complete.

I0220 12:28:05.223075 54 server.cc:280] Timeout 30: Found 0 model versions that have in-flight inferences

I0220 12:28:05.223317 54 server.cc:295] All models are stopped, unloading models

I0220 12:28:05.223326 54 server.cc:302] Timeout 30: Found 3 live models and 0 in-flight non-inference requests

I0220 12:28:05.223921 54 onnxruntime.cc:2640] TRITONBACKEND_ModelInstanceFinalize: delete instance state

I0220 12:28:05.224416 54 tensorflow.cc:2729] TRITONBACKEND_ModelInstanceFinalize: delete instance state

I0220 12:28:05.224479 54 tensorflow.cc:2668] TRITONBACKEND_ModelFinalize: delete model state

I0220 12:28:05.392863 54 onnxruntime.cc:2586] TRITONBACKEND_ModelFinalize: delete model state

I0220 12:28:05.394893 54 model_lifecycle.cc:579] successfully unloaded 'arcface_resnet18_110' version 1

I0220 12:28:05.489371 54 model_lifecycle.cc:579] successfully unloaded 'facenet_trtserver' version 1

I0220 12:28:06.225798 54 server.cc:302] Timeout 29: Found 1 live models and 0 in-flight non-inference requests

I0220 12:28:06.530202 54 model_lifecycle.cc:579] successfully unloaded 'face_detection_postprocess' version 1

I0220 12:28:07.229122 54 server.cc:302] Timeout 28: Found 0 live models and 0 in-flight non-inference requests

error: creating server: Internal - failed to load all models

@SamSamhuns
Owner

I will try running it on a Mac; the original development was on a Linux machine. However, the missing GPU shouldn't be an issue, since all models are set to use CPUs. The actual error might be the OpenVINO library issue: failed to load 'face_detection_0204' version 1: Invalid argument: unable to find 'libtriton_openvino.so' for model 'face_detection_0204', searched:
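For long logs like the one above, a small script can pull out the lines that matter. This is only a sketch: the regular expressions are written against the message formats visible in this thread and may not match other Triton versions, and `summarize_triton_log` is a hypothetical helper name, not something the project ships.

```python
import re

# Patterns derived from the log lines pasted in this thread.
FAILED_LOAD = re.compile(
    r"failed to load '(?P<model>[^']+)' version (?P<version>\d+): (?P<reason>.*)"
)
MISSING_LIB = re.compile(r"unable to find '(?P<lib>libtriton_[^']+\.so)'")
ENSEMBLE_DEP = re.compile(
    r"ensemble '(?P<ensemble>[^']+)' depends on '(?P<model>[^']+)' "
    r"which has no loaded version"
)


def summarize_triton_log(log_text: str) -> dict:
    """Map failed models to the missing backend library (or the raw reason),
    and blocked ensembles to the model they were waiting on."""
    failed = {}
    blocked = {}
    for line in log_text.splitlines():
        match = FAILED_LOAD.search(line)
        if match:
            lib = MISSING_LIB.search(match.group("reason"))
            failed[match.group("model")] = (
                lib.group("lib") if lib else match.group("reason")
            )
            continue
        match = ENSEMBLE_DEP.search(line)
        if match:
            blocked[match.group("ensemble")] = match.group("model")
    return {"failed_models": failed, "blocked_ensembles": blocked}
```

Feeding it the output of docker logs uvicorn_trt_server_cont (saved to a file or captured as a string) would, for the log in this thread, point at libtriton_openvino.so for both OpenVINO models and list the three ensembles that depend on them.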

@SamSamhuns
Owner

@nurmukhametdaniyar, it seems the OpenVINO backend is not supported on Mac M1 chips. I've raised an issue in that GitHub repo to ask about this.
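If it helps anyone hitting the same thing, here is a rough pre-flight check that could be run on the host (not inside the container) before starting the stack. It is a sketch under two assumptions: that Python runs natively on the host, where Apple Silicon Macs report "Darwin" and "arm64", and that the OpenVINO limitation described above still applies. `is_apple_silicon` is a hypothetical helper, not part of this project.

```python
import platform
from typing import Optional


def is_apple_silicon(system: Optional[str] = None,
                     machine: Optional[str] = None) -> bool:
    """Return True when the host looks like an Apple Silicon Mac.

    The arguments default to the values reported by the running
    interpreter; they can be passed explicitly for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def openvino_warning(system: Optional[str] = None,
                     machine: Optional[str] = None) -> str:
    """Return a warning string for Apple Silicon hosts, else an empty string."""
    if is_apple_silicon(system, machine):
        return ("Host appears to be an Apple Silicon Mac; the OpenVINO-backed "
                "models in this thread failed to load on such a machine.")
    return ""
```

A Python build running under Rosetta reports "x86_64" instead, so a False result here is not a guarantee that the host is an Intel machine.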
