Analysis of a dilution series

You can view the notebook in your browser at this link or click on the Jupyter notebook file (MAN1353_peptides_proteins.ipynb) to see the notebook in your browser. It may take a minute for the page to render and load, so please be patient.

The sample is a mouse brain prep that was digested, split into 6 aliquots, digested, and TMT labeled. A mixture was created with relative volumes of 25 to 20 to 15 to 10 to 5 to 2.5. The mixture was run on a Thermo Fusion using the synchronous precursor scan MS3 method:

McAlister, G.C., Nusinow, D.P., Jedrychowski, M.P., Wühr, M., Huttlin, E.L., Erickson, B.K., Rad, R., Haas, W. and Gygi, S.P., 2014. MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Analytical chemistry, 86(14), pp.7150-7158.

The data were processed with MSConvert from Proteowizard to extract the MS2 and MS3 scans. The MS2 scans were processed with Comet to identify the peptides. Accurate mass and target/decoy methods were used to filter PSMs to a 1% FDR. Parsimonious protein inference and protein grouping were used to create protein and peptide reports. Shared or unique peptide status was determined based on the final list of identified proteins. Unique peptides were used for quantification.

Chambers, M.C., Maclean, B., Burke, R., Amodei, D., Ruderman, D.L., Neumann, S., Gatto, L., Fischer, B., Pratt, B., Egertson, J. and Hoff, K., 2012. A cross-platform toolkit for mass spectrometry and proteomics. Nature biotechnology, 30(10), p.918.

Eng, J.K., Jahan, T.A. and Hoopmann, M.R., 2013. Comet: an open‐source MS/MS sequence database search tool. Proteomics, 13(1), pp.22-24.

Wilmarth, P.A., Riviere, M.A. and David, L.L., 2009. Techniques for accurate protein identification in shotgun proteomic studies of human, mouse, bovine, and chicken lenses. Journal of ocular biology, diseases, and informatics, 2(4), pp.223-234.

The analysis looks at the properties (mostly variance) of the reporter ion data at three levels: PSMs, peptides, and proteins. The PSMs are the raw MS3 scan data; peptide sequences have not been determined. The PSMs will include non-identifiable scans, contaminants, decoys, incorrect scans, and correct scans. The peptides are much cleaner. They are derived from 1% FDR PSMs. Contaminants and decoys have been removed, and shared peptides have been excluded. Peptides also have some limited degree of aggregation. Multiple MS2 scans and different charge states have been combined.

Reporter ions for proteins are aggregated (summed) from the reporter ions of all constituent peptides. Only the unique peptides are used in the sums. There is a great reduction in the number of proteins compared to the number of peptides or PSMs. This greatly simplifies analyses. The analysis also demonstrates that the aggregation improves the quality of the reporter ion measurements.

An interesting aside is looking at the normalization functions in edgeR. edgeR is a useful RNA-Seq statistical package that I have used in other work in this and related repositories. The normalization factors that edgeR reports can seem confusing and they are explained here.

Robinson, M.D., McCarthy, D.J. and Smyth, G.K., 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1), pp.139-140.

Robinson, M.D. and Oshlack, A., 2010. A scaling normalization method for differential expression analysis of RNA-seq data. Genome biology, 11(3), p.R25.

The January 2019 update has some better notebook organization, better R scripting, and more use of tidyverse and ggplot2.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
.ipynb_checkpoints		.ipynb_checkpoints
.gitignore		.gitignore
LICENSE		LICENSE
MAN1353_peptides_proteins.html		MAN1353_peptides_proteins.html
MAN1353_peptides_proteins.ipynb		MAN1353_peptides_proteins.ipynb
MAN1353_peptides_proteins.r		MAN1353_peptides_proteins.r
README.md		README.md
grouped_peptide_summary_TMT_8.csv		grouped_peptide_summary_TMT_8.csv
grouped_protein_summary_TMT_8.csv		grouped_protein_summary_TMT_8.csv
psm_tmt.csv		psm_tmt.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.ipynb_checkpoints

.ipynb_checkpoints

.gitignore

.gitignore

LICENSE

LICENSE

MAN1353_peptides_proteins.html

MAN1353_peptides_proteins.html

MAN1353_peptides_proteins.ipynb

MAN1353_peptides_proteins.ipynb

MAN1353_peptides_proteins.r

MAN1353_peptides_proteins.r

README.md

README.md

grouped_peptide_summary_TMT_8.csv

grouped_peptide_summary_TMT_8.csv

grouped_protein_summary_TMT_8.csv

grouped_protein_summary_TMT_8.csv

psm_tmt.csv

psm_tmt.csv

Repository files navigation

Analysis of a dilution series

About

Releases

Packages

Languages

License

pwilmart/Dilution_series

Folders and files

Latest commit

History

Repository files navigation

Analysis of a dilution series

About

Topics

Resources

License

Stars

Watchers

Forks

Languages