Feature request: enable issue_discards=1 in /etc/lvm/lvm.conf #123

Closed
sammcj opened this issue Jan 3, 2019 · 20 comments

Comments

@sammcj

sammcj commented Jan 3, 2019

issue_discards should be enabled by default in /etc/lvm/lvm.conf. This allows TRIM/DISCARD and SCSI UNMAP IOCTLs through the LVM layer, which is very important if you use SSDs and, in many cases, with thin-provisioned iSCSI storage.

From lvm.conf:

"Configuration option devices/issue_discards.
Issue discards to PVs that are no longer used by an LV.
Discards are sent to an LV's underlying physical volumes when the LV
is no longer using the physical volumes' space, e.g. lvremove,
lvreduce. Discards inform the storage that a region is no longer
used. Storage that supports discards advertise the protocol-specific
way discards should be issued by the kernel (TRIM, UNMAP, or
WRITE SAME with UNMAP bit set). Not all storage will support or
benefit from discards, but SSDs and thinly provisioned LUNs
generally do. If enabled, discards will only be issued if both the
storage and kernel provide support."

Patch for change:

--- /etc/lvm/lvm.conf	2019-01-03 11:36:22.379998897 +1100
+++ /etc/lvm/lvm.conf.discards	2019-01-03 14:35:14.624804878 +1100
@@ -298,7 +298,7 @@
 	# benefit from discards, but SSDs and thinly provisioned LUNs
 	# generally do. If enabled, discards will only be issued if both the
 	# storage and kernel provide support.
-	issue_discards = 0
+	issue_discards = 1
 }

 # Configuration section allocation.

This is an improvement suggestion following a brief discussion on the XCP-ng forums, based on my XenServer and CentOS tuning experience.
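
As a sanity check before (or after) flipping the setting, one can confirm that the kernel actually sees discard support on the devices backing the SR; a minimal sketch, with a placeholder device name:

  # Non-zero DISC-GRAN / DISC-MAX means the device advertises TRIM/UNMAP
  # support to the kernel (/dev/sdb is a placeholder device name).
  lsblk --discard /dev/sdb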

@sammcj sammcj changed the title enable issue_discards=1 in /etc/lvm/lvm.conf Feature request: enable issue_discards=1 in /etc/lvm/lvm.conf Jan 3, 2019
@sammcj
Author

sammcj commented Jan 3, 2019

FYI, we have been doing this with Puppet as follows:

  # Enable discard / unmap / trim at the LVM level
  file_line {'issue_discards':
    ensure   => 'present',
    line     => '  issue_discards = 1',
    path     => '/etc/lvm/lvm.conf',
    match    => '^\s*issue_discards\s*=\s*0',
    multiple => false,
    replace  => true,
  }

@stormi
Member

stormi commented Jan 3, 2019

Do you know why it's not the default in lvm.conf? Are there drawbacks to this setting?

@sammcj
Copy link
Author

sammcj commented Jan 6, 2019

No idea. I wouldn't be surprised if it were just a historical thing; it certainly annoys me having to change it on every machine I build.

AFAIK there are no drawbacks: if the underlying block device is not capable of the IOCTL, it's simply not issued.

@Ultra2D

Ultra2D commented Jan 11, 2019

Do you make any other changes for this to have any effect? This does not seem to work on lvmoiscsi when removing a VDI from an SR.

According to https://discussions.citrix.com/topic/378764-reclaiming-freed-space/#comment-1931283, lvremove is called without discards unless you do a "Reclaim freed space", which sends discards even if issue_discards = 0 in lvm.conf. Reclaim freed space does work.

It also suggests changing lvutil.py, and indeed, after changing lvutil.py it does work for me.

adding the config_param "issue_discards=1" to the lvutil.py remove function, which makes it automatic and transparent

It would be nice to have an SR parameter (maybe an sr-config key) like auto-trim=true, so folks with cooperative storage platforms can maintain the benefit of thin provisioning at scale.
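
For illustration only (I don't have the exact lvutil.py change to hand), the effect is roughly equivalent to using LVM's per-command config override when the LV is removed; the VG/LV names below are placeholders:

  # Override lvm.conf for this one call so discards are issued on removal.
  lvremove -f --config 'devices { issue_discards = 1 }' VG_XenStorage-<sr-uuid>/<lv-name>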

@sammcj
Author

sammcj commented Jan 13, 2019

Ah yes, sorry, lvutil.py is the other part I forgot in this suggestion. Thank you for picking that up 👍

@Supermathie

@sammcj I believe you are misinterpreting this option - issue_discards tells LVM to actively discard any PV extents no longer mapped to an LV.

Discard commands issued on a filesystem can be passed down through an LV regardless of this setting.
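
A quick way to see that distinction in practice (assuming a filesystem mounted on top of an LV; the mount point is a placeholder):

  # Filesystem-level discards are passed down through LVM to the device
  # regardless of issue_discards; this only affects blocks the FS has freed.
  fstrim -v /mnt/data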

@benapetr

I am now testing it with a CEPH RBD backend. It is certainly promising, as CEPH also benefits greatly from this.

@benapetr

It most definitely works with CEPH and is very useful. Just changing this option in both /etc/lvm/lvm.conf and master/lvm.conf made space reclaiming possible: deleting any virtual disk immediately frees space in the CEPH pool, which didn't happen before unless you ran "Reclaim freed space" on the SR. Enabling this makes manual free space reclaiming unnecessary.

@benapetr

I think the main reason Xen doesn't go with this is simply that, in theory, for some storage vendors discarding unused blocks may have a "performance impact" on the storage, or at least that's what they say:

Note: Reclaiming freed space is an intensive operation and can affect the performance of the storage array. You should perform this operation only when space reclamation is required on the array. Citrix recommends that you schedule this work outside the peak array demand hours

So in theory, you may not want to discard unused blocks left by deleted VDIs (LVs) immediately, but rather during some scheduled night job. However, I believe that in most other cases (SSDs, or even CEPH as in my case) this is irrelevant, as SSDs don't suffer any performance penalty when TRIM is issued. I don't know of any enterprise storage arrays that actually do either; this may very well be an ancient left-over from times when storage really had issues with discards.
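
For the scheduled approach, if I recall correctly the same reclamation that XenCenter's "Reclaim freed space" performs can be scripted and run off-peak via the trim plugin (a sketch; the UUIDs are placeholders):

  # Reclaim freed space on an LVM-based SR outside peak hours, e.g. from cron.
  xe host-call-plugin host-uuid=<host-uuid> plugin=trim fn=do_trim args:sr_uuid=<sr-uuid>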

@stormi
Member

stormi commented Sep 15, 2020

I have raised your questions to the developers of the storage stack in the XAPI project: xapi-project/sm#514

@stormi
Member

stormi commented Sep 15, 2020

Quoting the answer:

It is set to 0 by default due to data corruption caused by SANs that implement the discard functionality incorrectly so it isn't safe to have on by default.
The performance issue is also valid. Some SANs are understood to lock an entire LUN or RAID set in order to complete discard which is obviously undesirable with active dynamic workloads.

More on the linked issue.

@nagilum99

nagilum99 commented Sep 16, 2020

Any chance to know which ones work safely and which don't?
Edit: The QuickSpecs for our HPE MSA2040 say: "UNMAP reclaims space that is no longer on a thinly provisioned VMFS volume", so technically it should be fine. But from what I read, it's limited to unmaps of (at least) 4 MB per block (https://community.hpe.com/t5/msa-storage/msa2042-unmap-on-esx6-5/m-p/6976465#M10916).

@stormi
Member

stormi commented Sep 16, 2020

Any chance to know which ones work safely and which don't?

No, there is no such list.

@benapetr

benapetr commented Sep 16, 2020

They do have valid points though. I do believe that some SAN storage (especially older units) that works fine now with discards disabled may start malfunctioning when you suddenly enable them, so indeed, forcing this on everyone via an update or patch is probably not wanted here.

It would be best if there were an option to enable/disable this (probably at the storage level, cluster-wide?) so that users can decide per storage whether they want it enabled.

I was reading more about this topic with respect to SSDs. I strongly recommend this: https://web.archive.org/web/20100227234254/http://www.devwhy.com/blog/2009/8/4/from-write-down-to-the-flash-chips.html, which describes in detail how TRIM works on SSDs. After reading this and many more articles, I came to the conclusion that issuing discards too frequently could indeed have a negative performance effect on SSD drives, which is also why most Linux filesystems do not recommend continuous trim, but rather a scheduled fstrim weekly or monthly (that is a different use case though; here we are talking about deleting whole VHDs rather than deleting files on a filesystem, which would probably happen much less often).
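
For reference, the scheduled variant most distributions recommend is simply the util-linux timer rather than the discard mount option (a sketch, assuming a systemd-based guest):

  # Run fstrim periodically (weekly by default) instead of continuous discard.
  systemctl enable --now fstrim.timer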

I have to admit I didn't have SSDs in mind at all when writing about this, as my use case is a CEPH backend, which seems to only benefit from this, with no performance impact whatsoever and no data corruption issues (or at least I haven't observed any so far; CEPH has supported this for quite some time, and other virtualization platforms such as Proxmox or OpenStack use it): https://ceph.io/geen-categorie/openstack-and-ceph-rbd-discard/

@nagilum99

It would also be interesting to know which storage types XCP-ng/CH 'cleans up'. All block storage? iSCSI, FC... local (SAS)?
AFAIK TRIM is the (S-)ATA command, and in the SCSI world (SAS, FC, iSCSI) the equivalent is UNMAP (generically called discard).

@nagilum99

Any chance to know which ones work safely and which don't?

No, there is no such list.

But it's reasonable to assume that if it works with VMware, there's no reason it should break things under Linux (XCP-ng...)?

@benapetr

benapetr commented Sep 22, 2020

No, the behaviour would be the same, no matter the OS.

TRIM and discard are virtually the same operation from the OS-level point of view; just as you said, "TRIM" is the term used in the SSD world. Technically you are just notifying the underlying storage that a data segment (identified by a starting block and length) is no longer used and can be discarded, which on thin-provisioned storage means reducing used space (which can then be used by another volume, resulting in much more efficient storage usage).
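
A concrete way to see that this really is all a discard is (a sketch; the device path, offset and length are placeholders, and the command destroys data in that range, so only run it on volumes that are no longer used):

  # Tell the underlying storage that a range (start + length) is unused.
  blkdiscard --offset 0 --length 1GiB /dev/VG_name/LV_name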

@brendanhoar

In LVM, as stated earlier, issue_discards=1 only performs a trim when deallocating extents/chunks from a PV. In practice, this means that if issue_discards=1:

  1. When a normal LV is removed from a VG, discards are issued for the space in the PV that the LV relinquished.
  2. When a thin LV is removed from a thin pool, discards ARE NOT ISSUED for any space the LV relinquished, as it is still allocated to the thin pool. A workaround is to explicitly invoke blkdiscard on the thin LV device before removing it from the pool, after the associated VM has shut down (a sketch follows this list).
  3. When a thin pool is removed from a VG, discards are issued for the space in the PV that the thin pool relinquished (a thin pool effectively lives inside an object similar to a normal LV).
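
A sketch of the workaround from point 2 (volume names are placeholders; whether the discards also reach the PV depends on the thin pool's discards=passdown setting):

  # Discard the thin LV's blocks back to the pool before removing it;
  # lvremove alone will not do this even with issue_discards=1.
  blkdiscard /dev/VG_XenStorage-<sr-uuid>/<thin-lv-name>
  lvremove -f VG_XenStorage-<sr-uuid>/<thin-lv-name>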

Why is the default 0?

Likely to prevent long I/O stalls when a large object (a normal LV or a thin pool) is removed and the space is discarded, along with the serious side effects long I/O stalls can cause in production systems.

B

@sammcj sammcj closed this as not planned on Nov 13, 2022
@Fohdeesha
Member

Do you make any other changes for this to have any effect? This does not seem to work on lvmoiscsi when removing a VDI from an SR.

According to https://discussions.citrix.com/topic/378764-reclaiming-freed-space/#comment-1931283, lvremove is called without discards unless you do a "Reclaim freed space", which sends discards even if issue_discards = 0 in lvm.conf. Reclaim freed space does work.

It also suggests changing lvutil.py, and indeed, after changing lvutil.py it does work for me.

adding the config_param "issue_discards=1" to the lvutil.py remove function, which makes it automatic and transparent

It would be nice to have an SR parameter (maybe an sr-config key) like auto-trim=true, so folks with cooperative storage platforms can maintain the benefit of thin provisioning at scale.

@Ultra2D Can you outline exactly where/what changes you made to lvutil.py? In usual Citrix fashion, the thread you linked with the details (https://discussions.citrix.com/topic/378764-reclaiming-freed-space/#comment-1931283) no longer exists.

@Ultra2D

Ultra2D commented Jan 19, 2024

No, sorry, I'm no longer using XCP-ng and have no notes on this either.
