Different dimethyl labels on different lysines within same peptide #1083

Open
p-bell opened this issue Apr 17, 2023 · 4 comments

Comments

@p-bell

p-bell commented Apr 17, 2023

Hi,

Thanks again for developing these excellent tools.

I'm analysing dimethyl-labelled samples, and I'm running into issues with search space + some incorrect PSMs. Perhaps this is addressed by another feature but if not, I might be able to suggest a couple of solutions:

Problem:
I'm labelling at the protein level to look for neo-N-termini, so I have to include heavy or light dimethyl mods on the peptide N-terminus, in addition to the dimethyl-modified lysines (which are not cleaved by trypsin). Including heavy and light dimethyl as variable mods on lysine and the peptide N-terminus makes the number of modified peptides huge when the maximum number of dimethyl mods on K is set to >2. Add semi-specific cleavage at Arg, plus oxidised methionine and deamidation on Q or N, and it becomes a lengthy search on my workstation.

This is not the biggest problem, though. With these set as variable mods, some PSMs that pass 1% FDR contain lysines with a heavy label and other lysines with a light label within the same peptide. These are incorrect assignments, and they likely interfere with the FDR calculation and the thresholds applied. Is this a known issue?
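To put rough numbers on the problem described above, here is a minimal sketch that counts label assignments when every labelling site (each lysine plus the peptide N-terminus) can independently be unmodified, light, or heavy. The function name and the `max_mods` cap are illustrative assumptions, not search-engine parameters.

```python
from itertools import product

def count_label_states(n_sites, max_mods):
    """Count variable-mod assignments over n_sites labelling sites.

    Each site is 'none', 'light' or 'heavy'; assignments with more than
    max_mods labelled sites are skipped. Returns (total, mixed), where
    'mixed' counts assignments that carry both light and heavy labels.
    """
    total = 0
    mixed = 0
    for states in product(("none", "light", "heavy"), repeat=n_sites):
        n_labelled = sum(state != "none" for state in states)
        if n_labelled > max_mods:
            continue
        total += 1
        if "light" in states and "heavy" in states:
            mixed += 1
    return total, mixed

# Two lysines plus the N-terminus, with up to three labels allowed.
total, mixed = count_label_states(n_sites=3, max_mods=3)
```

For three sites this gives 27 assignments, 12 of which mix light and heavy labels, before oxidation, deamidation, or semi-specific cleavage are layered on top.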

Solutions?:

  1. Is it possible (via an existing or new feature) to implement 'semi'-fixed modifications to help identify dimethylated peptides? That is, inherit the fixed mods (e.g. alkylation on Cys), then apply +28 on K (light) or +34 on K (heavy) as 'semi-fixed' mods, with true variable mods (oxidised Met etc.) a level below these. This would likely reduce the search space considerably and reduce the risk of misidentifying peptides with mixed labels.
    [Schematic: semi-fixed-mods]

  2. Or alternatively perhaps a parameter for mass-shift searches that allows you to specify the mass shift based on the number of lysines (or any specific amino acid) in each peptide?
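The second idea, a mass shift that scales with the number of lysines, can be sketched as follows. The function name is hypothetical, and the shifts are the nominal +28 and +34 values quoted in this thread; exact monoisotopic masses depend on the reagents used and would need to be substituted.

```python
# Nominal per-lysine shifts as quoted in the thread (assumption: replace
# with exact monoisotopic masses for the reagents actually used).
NOMINAL_SHIFTS = {"light": 28, "heavy": 34}

def label_mass_shifts(peptide, shifts=NOMINAL_SHIFTS, residue="K"):
    """Return the total label mass shift per channel for one peptide.

    The shift is the per-residue label mass multiplied by the number of
    occurrences of `residue`, so every lysine carries the same label.
    """
    n_residues = peptide.count(residue)
    return {channel: n_residues * mass for channel, mass in shifts.items()}

# A peptide with two lysines yields one candidate shift per channel.
shifts = label_mass_shifts("AKDEKLR")
```

With two lysines this produces one light and one heavy candidate (nominally +56 and +68) instead of every per-site combination.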

Thanks and best wishes,
Pete

@fcyu
Member

fcyu commented Apr 17, 2023

Hi Pete,

Thank you very much for your feedback.

If I understand correctly, you want "variable modification groups", such that variable modifications from different groups cannot occur on the same peptide. I think that would be a useful setting for chemical labelling. We will discuss it internally and (hopefully) implement it.
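One way to picture such a group constraint is the sketch below. It is purely illustrative and does not describe any existing implementation; the function name and the representation of an assignment as a site-to-group mapping are assumptions.

```python
from itertools import combinations

def grouped_assignments(sites, groups):
    """Enumerate label assignments in which one peptide uses a single group.

    `sites` are the positions that can carry a label and `groups` are the
    mutually exclusive label groups. Each assignment maps the labelled
    sites to one group; the empty mapping is the unlabelled peptide.
    """
    assignments = [{}]
    for group in groups:
        for n_labelled in range(1, len(sites) + 1):
            for chosen in combinations(sites, n_labelled):
                assignments.append({site: group for site in chosen})
    return assignments

# Two labelling sites (positions 3 and 7) and two groups.
candidates = grouped_assignments([3, 7], ["light", "heavy"])
```

For two sites this yields 7 assignments rather than the 9 produced when light and heavy vary independently, and none of them mixes labels.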

Best,

Fengchao

@p-bell
Author

p-bell commented Apr 17, 2023

Hi Fengchao,

Thanks for your quick reply. Yes, the solution you describe would be really useful for the issue of incorrect PSMs with different labels on the same peptide.

For the related but separate issue regarding the large search space, it might be a good approach if different chemical labels could be treated as 2 separate bins ('semi-fixed' in the schematic) for generation of the modified peptide database used in searches.

In the case of dimethyl labelling of Lys, labelling efficiency is almost 100%, and could be treated as a fixed mod for searches to drastically reduce search space... but the reason we can't do that at present is because we would only detect 1 label type (either heavy or light).

The workaround I've tried is to set +28 on Lys (light) as fixed, then 'overlabel' with a variable +6 on Lys for the heavy label. Even so, the number of modified peptides is huge (hundreds of GB), there is a limit to the number of +6 Lys variable mods that can be included in a search, and we still run into mixed peptides with heavy- and light-labelled Lys on the same peptide.

It would be fantastic if something could be implemented to address these issues.

Thanks again,
Pete

@fcyu
Member

fcyu commented Apr 17, 2023

> For the related but separate issue regarding the large search space, it might be a good approach if different chemical labels could be treated as 2 separate bins ('semi-fixed' in the schematic) for generation of the modified peptide database used in searches.
>
> In the case of dimethyl labelling of Lys, labelling efficiency is almost 100%, and could be treated as a fixed mod for searches to drastically reduce search space... but the reason we can't do that at present is because we would only detect 1 label type (either heavy or light).

I don't think it is a "fixed" modification, because the C-terminal lysine is unmodified according to your earlier statement: "...in addition to dimethyl modified lysines (which are not cleaved by trypsin)".

Best,

Fengchao

@p-bell
Author

p-bell commented Apr 18, 2023

Sorry for the confusion. All lysines are labelled, which is why it would be good to treat dimethyl on lysine as a fixed mod.

In this experiment, dimethyl labelling is performed at the protein level, so trypsin cleaves only at Arg because the lysines are blocked. Each peptide can therefore have multiple labelled Lys.

Only a subset of peptide N-termini are labelled, so peptide N-terminal dimethyl labelling should be treated as variable.
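The labelling model described here can be sketched as below, assuming cleavage after every Arg (ignoring any proline rule and missed cleavages for simplicity). Both function names are hypothetical and the example sequence is made up.

```python
def cleave_after_arg(protein):
    """Split a sequence after every R; blocked lysines are not cleaved."""
    peptides = []
    start = 0
    for index, residue in enumerate(protein):
        if residue == "R":
            peptides.append(protein[start:index + 1])
            start = index + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal remainder
    return peptides

def label_candidates(peptide, channels=("light", "heavy")):
    """List (channel, labelled lysine count, N-terminus labelled) candidates.

    All lysines carry the channel's label (treated as fixed within the
    channel); only the peptide N-terminus is treated as variable.
    """
    n_lysines = peptide.count("K")
    return [
        (channel, n_lysines, nterm_labelled)
        for channel in channels
        for nterm_labelled in (False, True)
    ]

peptides = cleave_after_arg("MKAKRGDKSRLLK")
candidates = label_candidates(peptides[0])
```

Under this model each peptide has four label candidates (two channels times two N-terminal states), regardless of how many lysines it contains.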
