[EPIC] v1 data path performance enhancement #6600

derekbit · 2023-08-28T04:58:21Z

Is your feature request related to a problem? Please describe (👍 if you like this request)

This epic is used to track improvements of the v1 data path.

Frontend
- [POC] Investigate Longhorn with userspace block device (ublk) #5159
Backend
- [IMPROVEMENT] investigate v1 volume backend improvement #6590
- [IMPROVEMENT] Improve the replica rebuilding using different data transmission protocol #5002 (replica rebuild)
Investigate performance bottleneck
- [IMPROVEMENT] Investigate performance bottleneck in v1 data path #8436

Describe the solution you'd like

Describe alternatives you've considered

Additional context

cc @longhorn/dev-data-plane

innobead · 2023-08-28T05:01:01Z

The primary improvements would be:

- frontend <-> engine: introduce ublk (requires kernel support, at least Linux 6.1)
- engine/replica <-> replica: introduce a new protocol (such as QUIC) to replace the in-house TCP-based protocol.

The purpose is to make v1 suitable for resource-constrained environments such as edge.

derekbit · 2024-04-30T04:26:12Z

@Kampadais improved the v1 data path based on my previous PoC.
longhorn/longhorn-engine#1067

I think we can work on the improvement in v1.8.0 together and replace the iSCSI frontend with the new one.
WDYT @PhanLe1010 @Kampadais

cc @innobead

PhanLe1010 · 2024-04-30T04:29:33Z

Agreed that we can implement it in 1.8.0. Because I am working on v1 performance-related topics, I can help drive this ticket if that is OK, Derek.

derekbit · 2024-04-30T04:32:37Z

> Agreed that we can implement it in 1.8.0. Because I am working on v1 performance-related topics, I can help drive this ticket if that is OK, Derek.

Sure! ;) BTW, the author of ubdsrv, Ming, is friendly and interested in Longhorn. Feel free to discuss with him if we encounter any issues.

Kampadais · 2024-04-30T22:45:58Z

I would like to work on it more. I am currently experimenting and investigating why my multiple-connections approach can't match ublk's performance.

PhanLe1010 · 2024-04-30T22:47:09Z

Sure, @Kampadais. I am happy to follow your lead!

shuo-wu · 2024-05-03T05:06:37Z

Recently Phan and I tried to make some improvements to the v1 backend data path:

1. Avoid unnecessary memory allocations during the revision counter increment (for Write operations): Fix some v1 data path bottlenecks longhorn-engine#1085
2. Increase the number of connections between the engine and replicas: Fix some v1 data path bottlenecks longhorn-engine#1085
3. Cache the disk size so that there is no need to invoke the fstat system call for each Read operation: [IMPROVEMENT] Investigate performance bottleneck in v1 data path #8436 (comment)
4. Use channels rather than a mutex lock for the revision counter increment
5. Increase the revision counter and write the actual data simultaneously
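As an illustration of the disk-size caching idea, here is a minimal Go sketch. It is not the code from the linked PRs: `cachedSizeFile`, `Resize`, and `run` are hypothetical names, and the sketch assumes every resize of the backing file goes through the wrapper so the cached value never goes stale.

```go
package main

import (
	"fmt"
	"os"
	"sync/atomic"
)

// cachedSizeFile is a hypothetical wrapper around a backing file. It records
// the file size once and reuses it for bounds checks instead of calling
// Stat (fstat) on every read.
type cachedSizeFile struct {
	f    *os.File
	size atomic.Int64
}

func newCachedSizeFile(f *os.File) (*cachedSizeFile, error) {
	info, err := f.Stat() // the only fstat on the read path
	if err != nil {
		return nil, err
	}
	c := &cachedSizeFile{f: f}
	c.size.Store(info.Size())
	return c, nil
}

// ReadAt validates the requested range against the cached size, then reads.
func (c *cachedSizeFile) ReadAt(p []byte, off int64) (int, error) {
	size := c.size.Load()
	if off < 0 || off+int64(len(p)) > size {
		return 0, fmt.Errorf("read [%d, %d) exceeds cached size %d", off, off+int64(len(p)), size)
	}
	return c.f.ReadAt(p, off)
}

// Resize changes the file size and refreshes the cache. Any code path that
// resizes the file must go through here, or the cache becomes stale.
func (c *cachedSizeFile) Resize(n int64) error {
	if err := c.f.Truncate(n); err != nil {
		return err
	}
	c.size.Store(n)
	return nil
}

// run exercises the wrapper on a temporary file and returns the cached size.
func run() (int64, error) {
	f, err := os.CreateTemp("", "replica-*.img")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := f.Truncate(4096); err != nil {
		return 0, err
	}
	c, err := newCachedSizeFile(f)
	if err != nil {
		return 0, err
	}
	buf := make([]byte, 512)
	if _, err := c.ReadAt(buf, 0); err != nil {
		return 0, err
	}
	if _, err := c.ReadAt(buf, 4096); err == nil {
		return 0, fmt.Errorf("expected the out-of-range read to fail")
	}
	if err := c.Resize(8192); err != nil {
		return 0, err
	}
	if _, err := c.ReadAt(buf, 4096); err != nil {
		return 0, err
	}
	return c.size.Load(), nil
}

func main() {
	size, err := run()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("cached size:", size)
}
```

The trade-off is one atomic load per read in place of a system call, at the cost of having to keep the cache in sync on every expansion or truncation.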

Note that only the first two ideas work as expected. The remaining three ideas do not produce better results in the fio test, even though CPU profiling shows a lower CPU usage percentage for the R/W functions. We suspect that iSCSI tgt may be the main bottleneck: it may consume all the CPU resources saved by the last three ideas, which leads to zero improvement in the fio test.

If we plan to introduce ublk or another kind of frontend that performs better than iSCSI tgt, we can redo the test for the last three ideas. For example:

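|      | ublk + idea 1,2 | ublk + idea 1,2,3 | ublk + idea 1,2,3,4,5 |
| ---- | --------------- | ----------------- | --------------------- |
| IOPS |                 |                   |                       |
| CPU  |                 |                   |                       |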
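For reference, the "channels rather than a mutex lock" idea can be sketched in Go as a counter owned by a single goroutine. This is only an illustration under assumed names (`RevisionCounter`, `Increment`, `Close`), not the implementation that was benchmarked.

```go
package main

import (
	"fmt"
	"sync"
)

// RevisionCounter is a hypothetical counter owned by a single goroutine.
// Callers send a reply channel and receive the new value; no mutex is needed
// because only the owner goroutine ever touches the value.
type RevisionCounter struct {
	requests chan chan int64
	done     chan struct{}
}

func NewRevisionCounter(start int64) *RevisionCounter {
	rc := &RevisionCounter{
		requests: make(chan chan int64),
		done:     make(chan struct{}),
	}
	go func() {
		defer close(rc.done)
		value := start
		for reply := range rc.requests {
			value++
			reply <- value
		}
	}()
	return rc
}

// Increment asks the owner goroutine for the next revision and waits for it.
func (rc *RevisionCounter) Increment() int64 {
	reply := make(chan int64, 1) // buffered so the owner never blocks on the reply
	rc.requests <- reply
	return <-reply
}

// Close stops the owner goroutine. Increment must not be called afterwards.
func (rc *RevisionCounter) Close() {
	close(rc.requests)
	<-rc.done
}

func main() {
	rc := NewRevisionCounter(0)
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				rc.Increment()
			}
		}()
	}
	wg.Wait()
	// 8 writers x 1000 increments, plus this final call.
	fmt.Println("final revision:", rc.Increment())
	rc.Close()
}
```

Each increment here still costs a channel round trip and a small allocation for the reply channel, so it may not beat an uncontended mutex; measuring both variants behind the same frontend would be needed to tell.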

cc @PhanLe1010

DamiaSan · 2024-05-03T07:34:14Z

Could the bottleneck be the revision counter itself? I mean writing the revision counter to disk.
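To make the question concrete, here is a minimal Go sketch of a revision counter that is persisted on every increment. The 8-byte little-endian layout and the names `revisionFile` and `persistDemo` are assumptions for illustration only, not Longhorn's actual on-disk format or code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// revisionFile is a hypothetical on-disk revision counter.
type revisionFile struct {
	f     *os.File
	value uint64
	buf   [8]byte // reused for every update, so Increment does not allocate
}

// Increment bumps the counter and persists it. This adds one small disk
// write to every data write, which is the cost the question is about.
func (r *revisionFile) Increment() (uint64, error) {
	next := r.value + 1
	binary.LittleEndian.PutUint64(r.buf[:], next)
	if _, err := r.f.WriteAt(r.buf[:], 0); err != nil {
		return r.value, err
	}
	r.value = next
	return next, nil
}

// Load reads the persisted value back from the file.
func (r *revisionFile) Load() (uint64, error) {
	var b [8]byte
	if _, err := r.f.ReadAt(b[:], 0); err != nil {
		return 0, err
	}
	return binary.LittleEndian.Uint64(b[:]), nil
}

// persistDemo increments a counter n times (n >= 1) on a temporary file and
// returns the value read back from disk.
func persistDemo(n int) (uint64, error) {
	f, err := os.CreateTemp("", "revision-*.counter")
	if err != nil {
		return 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	r := &revisionFile{f: f}
	for i := 0; i < n; i++ {
		if _, err := r.Increment(); err != nil {
			return 0, err
		}
	}
	return r.Load()
}

func main() {
	got, err := persistDemo(1000)
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Println("persisted revision:", got)
}
```

Timing a write workload with and without the `WriteAt` call in `Increment` would be one way to check how much of the per-write cost comes from persisting the counter.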

derekbit added labels Aug 28, 2023: kind/feature (feature request, new feature); Epic

innobead added this to the v1.7.0 milestone Aug 28, 2023

innobead added labels Aug 28, 2023: highlight (important feature/issue to highlight); priority/0 (must be fixed in this release, managed by PO); area/v1-data-engine (v1 data engine, iSCSI tgt); area/replica (volume replica where data is placed); area/edge (edge related)

innobead mentioned this issue Sep 14, 2023

[IMPROVEMENT] Improve the replica rebuilding using different data transmission protocol #5002

Open

innobead assigned derekbit and DamiaSan Sep 14, 2023

innobead unassigned derekbit and DamiaSan Feb 26, 2024

innobead modified the milestones: v1.7.0, v1.8.0 Feb 26, 2024

innobead added the area/environment-low-spec label (use low-spec hardware resources like HDD for the Longhorn disk) Feb 27, 2024

This was referenced Mar 15, 2024

[TASK] Investigate data plane optimization potential #3446

Closed

[TASK] IO performance measurement #3377

Closed

Kampadais mentioned this issue Mar 27, 2024

Ublk longhorn/longhorn-engine#1067

Open

innobead assigned PhanLe1010 Apr 30, 2024

shuo-wu mentioned this issue May 3, 2024

Improve replica revision counter longhorn/longhorn-engine#1097

Open

derekbit mentioned this issue May 17, 2024

[IMPROVEMENT] investigate v1 volume backend improvement #6590

Closed
