Hierarchical bandwidth and operations rate limits. #16205
base: master
Some first-pass questions/comments: Maybe I missed it, but do you specify the units anywhere? Are the If you specify 100MB/s and the system is idle with 8GB/s bandwidth available, will it still only use 100MB/s? What happens if someone specifies a Does anything bad happen if you cap it to something super low, like 1 byte/sec? The Is
Sure, I'll add that.
Correct.
Correct. The lowest limit is always enforced. The same applies if the parent has a lower limit than the child, or than its children combined.
It cannot be lower than the resolution (which is 16 per second), so it will be rounded up to 16. It will also allocate a large number of slots to keep the history, in this case one slot per byte of each pending request.
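The rounding mentioned above can be illustrated with a minimal sketch. The names `RESOLUTION` and `enforceable_limit` are hypothetical and do not come from the patch; only the value 16 per second is taken from the comment. The sketch assumes that limits above the floor are left unchanged, and it does not model the slot allocation.

```python
# Illustrative only: names are hypothetical, not taken from the patch.
RESOLUTION = 16  # lowest enforceable rate per second, per the comment above


def enforceable_limit(configured: int) -> int:
    """Raise a configured per-second limit to the resolution floor.

    Assumption: values at or above the floor are used as configured.
    """
    return max(configured, RESOLUTION)


print(enforceable_limit(1))     # a 1 byte/sec cap is raised to the floor
print(enforceable_limit(4096))  # a limit above the floor is unchanged
```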
Sure.
It was already counted when falling back to read/write, but not in the block cloning case. I think it should be counted there too, so I just added it.
I did some hand testing of this and it works just as described 👍
I also verified that it worked for multithreaded writes, and that a top-level dataset's values correctly overrode its children's values.
Thank you! BTW, you can use suffixes for the limit_bw_* properties, e.g., limit_bw_write=200M.
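As a rough illustration of what a suffixed value such as 200M stands for, here is a small sketch. It is not the parser ZFS uses; `parse_limit` is a hypothetical helper, and it assumes power-of-1024 multipliers, as ZFS applies to its other size-valued properties.

```python
# Illustrative parser, not the code ZFS uses.
# Assumption: suffixes are binary (power-of-1024) multipliers.
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_limit(value: str) -> int:
    """Convert a value like '200M' into a plain number of bytes per second."""
    value = value.strip().upper()
    if value and value[-1] in SUFFIXES:
        return int(value[:-1]) * SUFFIXES[value[-1]]
    return int(value)


print(parse_limit("200M"))  # bytes per second for limit_bw_write=200M
```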
# This run file contains all of the common functional tests. When
# adding a new test consider also adding it to the sanity.run file
# if the new test runs to completion in only a few seconds.
Did you want this change included in this PR?
Not really part of this PR, but I was hoping to smuggle it in instead of creating a separate PR for it:)
ce9c37a to 61f0f95
Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net>
Introduce six new properties: limit_{bw,op}_{read,write,total}.
The limit_bw_* properties limit the read, write, or combined bandwidth, respectively, that a dataset and its descendants can consume. Limits are applied to both file systems and ZFS volumes.
The configured limits are hierarchical, just like quotas; i.e., even if a higher limit is configured on the child dataset, the parent's lower limit will be enforced.
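The hierarchy rule can be sketched as taking the lowest limit configured anywhere on the path from the dataset up to the pool root. This is a simplified model with hypothetical names (`limits`, `ancestors`, `effective_limit`) and made-up dataset names, not the patch's implementation. Since a parent's limit also covers its children combined, the result is an upper bound for a single dataset rather than a guaranteed rate.

```python
from typing import Dict, Iterator, Optional

# Hypothetical configuration: dataset name -> configured limit in bytes/sec,
# or None when no limit is set on that dataset.
limits: Dict[str, Optional[int]] = {
    "pool": None,
    "pool/home": 100 * 1024**2,
    "pool/home/alice": 500 * 1024**2,
}


def ancestors(dataset: str) -> Iterator[str]:
    """Yield the dataset itself, then each of its parents, child first."""
    parts = dataset.split("/")
    for i in range(len(parts), 0, -1):
        yield "/".join(parts[:i])


def effective_limit(dataset: str, limits: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the lowest limit configured on the dataset or any ancestor."""
    configured = [limits.get(name) for name in ancestors(dataset)]
    configured = [value for value in configured if value is not None]
    return min(configured) if configured else None


# The child's higher limit does not help: the parent's lower limit wins.
print(effective_limit("pool/home/alice", limits))
```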
The limits are applied at the VFS level, not at the disk level. The dataset is charged for each operation even if no disk access is required (e.g., due to caching, compression, deduplication, or NOP writes) or if the operation will cause more traffic (due to the copies property, mirroring, or RAIDZ).
Read bandwidth consumption is based on:
- read-like syscalls, e.g., aio_read(2), pread(2), preadv(2), read(2), readv(2), sendfile(2)
- syscalls like getdents(2) and getdirentries(2)
- reading via mmapped files
- zfs send
Write bandwidth consumption is based on:
- write-like syscalls, e.g., aio_write(2), pwrite(2), pwritev(2), write(2), writev(2)
- writing via mmapped files
- zfs receive
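To make the accounting concrete, the sketch below charges the logical size of each request, which is what VFS-level enforcement implies: the amount is the same whether or not the data comes from cache and regardless of copies, mirroring, or RAIDZ. It assumes that a read or a write also counts toward the combined limit_bw_total limit; `charge` and `usage` are hypothetical names, not part of the patch.

```python
from collections import Counter


def charge(usage: Counter, kind: str, nbytes: int) -> None:
    """Account a request's logical size against the matching counters.

    Assumption: every read or write also counts toward limit_bw_total.
    """
    if kind not in ("read", "write"):
        raise ValueError(f"unknown request kind: {kind}")
    usage[f"limit_bw_{kind}"] += nbytes
    usage["limit_bw_total"] += nbytes


usage = Counter()
charge(usage, "read", 4096)   # e.g. read(2), whether or not it hits the cache
charge(usage, "write", 8192)  # e.g. write(2), regardless of copies or RAIDZ
print(dict(usage))
```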
The limit_op_* properties limit the number of read, write, or combined metadata operations, respectively, that a dataset and its descendants can generate.
Read operations consumption is based on:
- read-like syscalls, where the number of operations is equal to the number of blocks being read (never less than 1)
- reading via mmapped files, where the number of operations is equal to the number of pages being read (never less than 1)
- syscalls accessing metadata: readlink(2), stat(2)
Write operations consumption is based on:
- write-like syscalls, where the number of operations is equal to the number of blocks being written (never less than 1)
- writing via mmapped files, where the number of operations is equal to the number of pages being written (never less than 1)
- syscalls modifying a directory's content: bind(2) (UNIX-domain sockets), link(2), mkdir(2), mkfifo(2), mknod(2), open(2) (file creation), rename(2), rmdir(2), symlink(2), unlink(2)
- syscalls modifying metadata: chflags(2), chmod(2), chown(2), utimes(2)
- updating the access time of a file when reading it
Just like limit_bw_* limits, the limit_op_* limits are also hierarchical and applied at the VFS level.
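The "number of blocks or pages, never less than 1" rule can be sketched as a ceiling division with a floor of one. `op_count` is a hypothetical helper; the 128 KiB block size matches the default ZFS recordsize, the 4 KiB page size is an assumption, and the sketch assumes the I/O starts on a block or page boundary.

```python
def op_count(nbytes: int, unit_size: int) -> int:
    """Operations charged for an I/O of nbytes split into unit_size chunks.

    Uses integer ceiling division and never returns less than 1.
    Assumption: the I/O starts on a unit boundary.
    """
    return max(1, -(-nbytes // unit_size))


BLOCK_SIZE = 128 * 1024  # default recordsize; other values are possible
PAGE_SIZE = 4096         # assumed page size for mmapped access

print(op_count(1, BLOCK_SIZE))           # a tiny read still costs one operation
print(op_count(300 * 1024, BLOCK_SIZE))  # spans three 128 KiB blocks
print(op_count(10000, PAGE_SIZE))        # spans three 4 KiB pages
```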