Performance Difference between RWX and RWO volumes #6964
I3lacx started this conversation in Show and tell
Hey all,
for some internal testing I ran a couple of experiments on our cluster comparing the performance of RWX and RWO volumes. Since this might be of interest to others, I thought I would share it here:
The experiments were done in an artificial setting, though with interesting results. I also ran an actual workload (training a deep learning model), but could not observe any significant difference there. All experiments were run on our Kubernetes cluster, consisting of 6 worker and 4 data nodes.
The artificial experiments were done using the fio command, testing the write speed of different operations. Different block sizes (bs) were used, as they have a drastic impact on performance. Additionally, different sizes were used, mainly to keep the experiments at a manageable scale.
Overview
The graph below shows an overview of the findings. Here RWO = ReadWriteOnce and RWX = ReadWriteMany, while "local" / "remote" indicates whether the volume was located on the same physical node as the pod. For example, when the pod runs on worker5 but the volume is located on data1, the I/O still works but goes through the network.
IOPS are the I/O operations per second, whereas bandwidth (in MiB/s) and duration (in seconds) have their usual meaning. Duration is probably the most representative value.
RWO vs RWX: An important difference is that the filesystem for RWO is ext4 and for RWX is nfs4. Because RWX uses a network file system, it carries considerable overhead, which is only necessary to make such remote storage possible.
Conclusion
Details
When conducting artificial experiments it is important to remember:
Setup
To test this, I ran a benchmark following: https://cloud.google.com/compute/docs/disks/benchmarking-pd-performance
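For anyone who wants to reproduce something similar, here is a minimal Python sketch of how such a parameterized fio write test could be planned. It is not the script used for this post: the flag selection is an assumption loosely modeled on the linked guide, the mount path `/mnt/test-volume` and the function names are hypothetical, and pairing the size and block size values in order is a guess, since the post does not state how they were combined.

```python
import shlex

# Values taken from the post; pairing them in order is an assumption.
SIZES = ["6G", "2G", "400M"]
BLOCK_SIZES = ["1M", "32K", "1K"]
RUNS = 10  # the post states that each experiment was run 10 times


def fio_write_command(test_dir, size, bs, name="write_test"):
    """Build a fio argument list for a sequential write test.

    The chosen flags are an assumption; the exact flags used for
    the post are not known.
    """
    return [
        "fio",
        f"--name={name}",
        f"--directory={test_dir}",
        "--rw=write",
        f"--size={size}",
        f"--bs={bs}",
        "--ioengine=libaio",
        "--direct=1",
        "--output-format=json",
    ]


def planned_commands(test_dir):
    """Return one (run index, command) entry per repetition of each experiment."""
    plan = []
    for size, bs in zip(SIZES, BLOCK_SIZES):
        for run in range(RUNS):
            plan.append((run, fio_write_command(test_dir, size, bs)))
    return plan


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    # "/mnt/test-volume" is a hypothetical mount point of the volume under test.
    for run, cmd in planned_commands("/mnt/test-volume"):
        print(run, shlex.join(cmd))
```

Printing the commands first makes it easy to check the plan before anything is written to the volume; each list could then be handed to `subprocess.run` inside a pod that has the RWO or RWX volume mounted.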
I replaced the variables with 3 values, SIZE=[6G, 2G, 400M] and BS=[1M, 32K, 1K], so 3 different experiments were performed. Each was run 10 times with a simple bash script looping over all options and preparing the output in a nice format for visualization.
Detailed Results
The results are as described in the conclusion. Here are all the results visualized as bar charts with the actual values. Note the standard deviation shown on the bar charts. Also note that the duration is actually higher for RWX, but in the comparison the difference in duration was flipped (to fit the graph better).
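As a rough illustration of how a bar height and its error bar can be derived from repeated runs, the sketch below computes the mean and sample standard deviation of a list of durations. The function name is hypothetical and the numbers are placeholders for illustration only, not measurements from these experiments.

```python
from statistics import mean, stdev


def summarize(durations_s):
    """Return (mean, sample standard deviation) of repeated run durations in seconds."""
    return mean(durations_s), stdev(durations_s)


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT results from the post.
    example_runs = [10.0, 12.0, 14.0]
    avg, sd = summarize(example_runs)
    print(f"duration: {avg:.1f} s +/- {sd:.1f} s")
```

A large standard deviation relative to the mean suggests that differences between two configurations should be read with caution.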