DAOS-14834 control: Enable parallel server->engine dRPCs (#14193) #14367

Merged: 1 commit, May 17, 2024

@mjmac mjmac commented May 14, 2024

The idea here is to remove the bottleneck in daos_server that serializes
dRPC calls, to enable daos_server to pass along multiple dRPC calls even
if the first one hasn't yet returned. In the current master branch, we
have a single dRPC client structure that uses RW locks to control access
to its internals. dRPC calls that take a long time can potentially impede
other commands.

My proposed solution is to create a new drpc.ClientConnection for each
command that needs to be sent to the daos_engine. Each command is handled
on its own connection. We were using a connect->send->disconnect pattern
on the client connection anyway.

Required-githooks: true
Change-Id: Ibe5a03d28ecc5099b5827ef22fbbace9e3d8b963
Signed-off-by: Kris Jacque <kris.jacque@intel.com>

Bug-tracker data:
Ticket title is 'LRZ: complete control system hang-up'
Status is 'Resolved'
Labels: 'LRZ,md_on_ssd,scrubbed'
https://daosio.atlassian.net/browse/DAOS-14834

@daosbuild1 daosbuild1 left a comment

LGTM. No errors found by checkpatch.

@mjmac mjmac requested a review from kjacque May 14, 2024 20:04

@kjacque kjacque left a comment

LGTM

@daosbuild1 daosbuild1 left a comment

LGTM. No errors found by checkpatch.

@daosbuild1
Test stage Functional on EL 8.8 completed with status FAILURE. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-14367/3/display/redirect

@mjmac mjmac force-pushed the dev/mjmac/pdRPC-backport branch from 3190405 to 238973f Compare May 16, 2024 14:20

@daosbuild1 daosbuild1 left a comment

LGTM. No errors found by checkpatch.

@mjmac mjmac merged commit 264cda4 into google/2.4 May 17, 2024
34 checks passed
@mjmac mjmac deleted the dev/mjmac/pdRPC-backport branch May 17, 2024 13:15