[Windows] Support CPU shared memory (Client/Frontend) #7048

fpetrini15 · 2024-03-27T18:12:20Z

Goal: Support CPU shared memory between the server and client for Windows

Sub-goals: Modify L0_shared_memory to run on bare-metal Windows using only Python.

Client changes: triton-inference-server/client#551

Some things to note:

Once I can verify that the Linux tests pass using only the Python script, I will remove test.sh.
L0_shared_memory uses a graphdef model by default. I swapped it for a Python model so that the test is supported on both Windows and Linux. I still need to go back and investigate how the model ends up in L0_shared_memory (it is not generated by a script) and remove it.
Some of the default paths need to change to reflect the testing environment; I will update them before merging.

nv-kmcgill53

This review is a bit ramble-y, but the problem is very tricky as well. You've done a great job so far; I'm teasing out the nuances so you can provide a good template for how to program for multiple OSes in a sustainable way.

docs/protocol/extension_shared_memory.md

qa/common/util.py

src/shared_memory_manager.cc

src/shared_memory_manager.h

rmccorm4 · 2024-04-11T18:08:38Z

qa/L0_shared_memory/shared_memory_test.py

-            triton_client = httpclient.InferenceServerClient(_url, verbose=True)
+    # Custom setup method to allow passing of parameters
+    def _setUp(self, protocol, log_file_path):
+        self._tritonserver_ipaddr = os.environ.get("TRITONSERVER_IPADDR", "localhost")


Does this need to be configurable in practice? Do we expect to use shared memory for anything other than a co-located server on localhost?

fpetrini15: TBD. On the Windows testing side it is currently passed in as a variable and differs from "localhost". I'm still trying to get a CI pipeline up to see the new behavior for this test in particular, and will remove it if there is no issue.
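To make the question concrete, here is a minimal sketch of how the environment override might feed the client URL. The helper name `server_url` is hypothetical, and the ports 8000 (HTTP) and 8001 (gRPC) are assumed to be the server defaults; the actual test may build its URL differently.

```python
import os


def server_url(protocol, env=os.environ):
    """Build a host:port string for the given protocol (hypothetical helper)."""
    # Fall back to localhost when TRITONSERVER_IPADDR is not set,
    # matching the default used in the diff above.
    host = env.get("TRITONSERVER_IPADDR", "localhost")
    # Assumed default ports: 8000 for HTTP, 8001 for gRPC.
    port = 8001 if protocol == "grpc" else 8000
    return f"{host}:{port}"


if __name__ == "__main__":
    print(server_url("http"))
    print(server_url("grpc"))
```

If the override turns out to be unnecessary in CI, the lookup collapses to a constant and the helper can be dropped.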

rmccorm4 · 2024-04-11T18:09:58Z

qa/L0_shared_memory/shared_memory_test.py

+        self._build_model_repo()
+        self._build_server_args()
+        self._shared_memory_test_server_log = open(log_file_path, "w")
+        self._server_process = util.run_server(


How does util.run_server interact with test.sh also starting the server? Is there a conflict or issue there?

fpetrini15: I don't believe they should overlap. For this test my ultimate goal is to remove test.sh entirely.

rmccorm4: Isn't this also getting run in the Linux case that runs test.sh? Or are there changes on the GitLab side to not run test.sh at all?

fpetrini15: Ah, I see your point. There are changes on the GitLab side such that test.sh will not run at all for Windows. I will attempt to change the Linux test case so that it also does not run test.sh.
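For readers unfamiliar with the pattern under discussion, the following is a rough sketch of a test owning the server lifecycle from Python instead of from test.sh. The functions `run_server` and `stop_server` here are illustrative stand-ins, not the actual qa/common/util.py implementation, and the command in the usage section is a placeholder for the real server executable.

```python
import subprocess
import sys


def run_server(command, log_file_path):
    """Start a server process and redirect its output to a log file."""
    log_file = open(log_file_path, "w")
    process = subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)
    return process, log_file


def stop_server(process, log_file, timeout=30):
    """Terminate the process, escalating to kill if it does not exit in time."""
    process.terminate()
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
    finally:
        log_file.close()


if __name__ == "__main__":
    # Placeholder command standing in for the server executable and its arguments.
    proc, log = run_server([sys.executable, "-c", "print('ready')"], "server.log")
    proc.wait(timeout=30)
    stop_server(proc, log)
```

With this shape, only one owner starts the server, so a shell wrapper that also launches it would need to be skipped or removed to avoid two instances competing for the same ports.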

rmccorm4 · 2024-04-11T18:13:01Z

qa/L0_shared_memory/shared_memory_test.py

+            backend_dir = "C:\\opt\\tritonserver\\backends"
+            model_dir = "C:\\opt\\tritonserver\\qa\\L0_shared_memory\\models"
+            self._server_executable = "C:\\opt\\tritonserver\\bin\\tritonserver.exe"


Probably more of a random note or follow-up, but I was under the impression that something like pathlib.Path("/opt/tritonserver/backends") would translate to "C:\\opt\\tritonserver\\backends" for free when run on Windows. If so, you could probably condense the cases to work for both.

Did you see otherwise?

fpetrini15: No, I believe you are right. ATM they are set to my local path and were hardcoded for convenience. They need to be modified, and will be once I determine the CI environment.
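A quick way to check the path question without a Windows machine is PureWindowsPath, which applies Windows path rules on any OS (Path on a Windows host follows the same rules). The sketch below suggests that separators are converted but no drive letter is added, so a POSIX-style string ends up rooted on the current drive rather than explicitly on C:. The paths are taken from the diff; the `install_root` variable is introduced only for illustration.

```python
from pathlib import PureWindowsPath

# A POSIX-style string interpreted with Windows path rules.
as_windows = PureWindowsPath("/opt/tritonserver/backends")
print(as_windows)              # \opt\tritonserver\backends
print(repr(as_windows.drive))  # '' (no drive letter is inferred)

# Forward slashes are accepted, so an explicit drive can still be written portably.
with_drive = PureWindowsPath("C:/opt/tritonserver/backends")
print(with_drive)              # C:\opt\tritonserver\backends

# Joining segments avoids hand-written separators in either style.
install_root = PureWindowsPath("C:/opt/tritonserver")  # illustrative root
server_executable = install_root / "bin" / "tritonserver.exe"
print(server_executable)       # C:\opt\tritonserver\bin\tritonserver.exe
```

So the two platform cases can likely share the path segments, with only the root differing between Windows and Linux.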

rmccorm4 · 2024-04-11T18:25:36Z

qa/L0_shared_memory/shared_memory_test.py

+    # Constant members
+    shared_memory_test_client_log = Path(os.getcwd()) / "client.log"
+    model_dir_path = Path(os.getcwd()) / "models"
+    model_source_path = Path(os.getcwd()).parents[0] / "python_models/add_sub/model.py"


Future follow-up as we expand Python utilities for CI testing, but it might be nice to have some kind of utils.relative_path([path, to, thing]).

For example, maybe something like this:

model_dir_path = utils.relative_path("models")
model_source_path = utils.relative_path("..", "python_models", "add_sub", "model.py")
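One possible shape for such a helper is sketched below. No `relative_path` utility exists in the source; the name comes from the suggestion above, and the `base` parameter is an added assumption to make the function testable. It collapses ".." segments lexically with os.path.normpath, which does not resolve symlinks.

```python
import os
from pathlib import Path


def relative_path(*parts, base=None):
    """Join parts onto base (default: current working directory) and normalize.

    Hypothetical helper; ".." segments are collapsed lexically, without
    touching the filesystem.
    """
    root = Path(base) if base is not None else Path.cwd()
    return Path(os.path.normpath(root.joinpath(*parts)))


if __name__ == "__main__":
    model_dir_path = relative_path("models")
    model_source_path = relative_path("..", "python_models", "add_sub", "model.py")
    print(model_dir_path)
    print(model_source_path)
```

Because the result is a Path, callers can keep using the `/` operator on it, and the same call works unchanged on Windows and Linux.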

nv-kmcgill53

LGTM. Great work on this!

src/shared_memory_manager.h

src/shared_memory_manager.cc

GuanLuo

Left some comments; they can be addressed in the future PR that adds cleanup logic.

src/shared_memory_manager.cc

fpetrini15 mentioned this pull request Mar 27, 2024

[Windows] Support CPU shared memory (Client/Frontend) triton-inference-server/client#551

Open

github-advanced-security bot found potential problems Mar 27, 2024

nv-kmcgill53 requested changes Apr 3, 2024

fpetrini15 force-pushed the fpetrini-win-cpu-shm branch from 86b0362 to 8b63bab on April 4, 2024 19:28

github-advanced-security bot found potential problems Apr 4, 2024

qa/L0_shared_memory/shared_memory_test.py (fixed)

nv-kmcgill53 reviewed Apr 9, 2024

fpetrini15 requested a review from GuanLuo April 9, 2024 18:26

fpetrini15 added 10 commits April 11, 2024 10:18

Initial commit

47e89e2

Fix GPU case

09496cc

Validate offset

1930402

Open shm file, don't create

9e6adee

Intermmediate commit: Major test restructuring.

123ef0e

Formatting

17a9f57

Gitbot Fixes

c68a3a0

Major software bloat refactor. Opaque shm file handle

e9add6a

Fixes for Unix and handle-agnostic get restructure

955ecb9

Review comments. Passing ShmFile pointer instead of void

a5b6b7e

fpetrini15 force-pushed the fpetrini-win-cpu-shm branch from 15f94bb to a5b6b7e on April 11, 2024 17:18

rmccorm4 reviewed Apr 11, 2024

rmccorm4 mentioned this pull request Apr 11, 2024

Validate the memory requested for the infer request is not out of bounds #7083

Merged

Merge remote-tracking branch 'origin/main' into fpetrini-win-cpu-shm

7405f83

nv-kmcgill53 previously approved these changes Apr 11, 2024

src/shared_memory_manager.h (resolved)

src/shared_memory_manager.cc (outdated, resolved)

GuanLuo reviewed Apr 11, 2024

src/shared_memory_manager.cc (outdated, resolved)

src/shared_memory_manager.cc (outdated, resolved)

Open backing file to validate shared memory

126f7dc

fpetrini15 dismissed nv-kmcgill53’s stale review via 126f7dc April 13, 2024 20:50
