Sprint 11 Task List #456
Comments
Notes on "Add automated report generation for the highly consequential variants (with far fewer annotations"
Needs to be easily digestible - either a saved subset of the full,
* Adds support for SomaScan files in the API, and adds tests for the API endpoint.
2024-04-19 Sprint 10 Retro
Overall what has been accomplished:
Summary for Sprint 11 Work
Proteomics Statistical Methods
This is an ambitious list. If items roll over, they roll over to the next sprint.
Deliverable that we're aiming at over the next 2 sprints: get the work/results in Erik Johnson's and Thomas Wingo's hands.
Proteomics API
PRS
Nothing was achieved; all work rolls over. @akotlar will take over until Cristina is back, best effort. Expecting that the initial PRS solution is done by Sprint 11 end, so a delay of 3 weeks. PRS excitement is high from Dave Cutler, Elizabeth Leslie's group (potentially, as informed by Julien, her lead bioinformatic analyst), and IBDGC.
Infrastructure and bystro webapp
Further improvements on hold, with the possible exception of migrating from zip file downloads to either tar downloads, an improved/fixed zip download, or individual file downloads rather than zipping.
What went well
What didn't
What is 1 thing that we will do differently this sprint.
2024-04-23 Proteomics Topic Meeting
Austin:
2024-04-30 Proteomics Topic Meeting
Domain Adaptation:
2024-05-03 Proteomics Topic Meeting
ProteomicsPipelineDemonstration.ipynb.zip
2024-05-07 Proteomics Topic Meeting
Dennis got blocked by annotator installation (to create a dev instance); he is running into installation issues, which are being documented and fixed.
Ilha is working this week on covariance estimation methods:
Alex - on track for proteomics data; initial analysis on 300 sample CSF TMT + SomaScan, then the 400 and 900 sample datasets that Thomas/Nick shared.
Austin - will share the Jupyter notebook demonstrating SPPCA on neuroscience data.
Common variant topic meeting
Austin/POE:
Rare variant topic meeting
Austin is trying to prove that rare variant analysis is inherently impossible outside Mendelian traits. He is showing that if you have many rare variants and bound their effects (in terms of P(Disease|variant))... when having any mutation has a tiny effect, the population variance in having disease goes to 0, which is to say everyone has identical risk of having disease.
2025-05-08
Austin - working on NeurIPS paper
2025-05-10 Weekly Meeting
Agenda
Discussion
Singular value shrinkers are still WIP - working on a version that handles any sample size
Due date for Sprint 11 - May 16th.
General
Proteomics
Datasets: https://www.synapse.org/#!Synapse:syn53420674.1/datasets/, https://www.synapse.org/#!Synapse:syn31822992/wiki/617907
PRS
Goal for Sprint 11: Have a PRS C+T running through the webapp (with display of results in webapp potentially in sprint 12)
Provisional for Sprint 13: Have this deployed to IBDGC (we'll need information from them for what they'll find useful in terms of GWAS summary stats)
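For orientation, PRS C+T (clumping and thresholding) can be sketched as below. This is a minimal illustration, not the bystro implementation: the function names, the default thresholds, and the use of a precomputed dense r^2 matrix are all assumptions, and real clumping also restricts comparisons to a genomic distance window, which is omitted here.

```python
import numpy as np

def clump_and_threshold(pvals, ld_r2, p_threshold=1e-3, r2_threshold=0.1):
    """Return indices of variants kept after P-value thresholding and greedy LD clumping.

    pvals: (n_variants,) GWAS P-values.
    ld_r2: (n_variants, n_variants) pairwise LD r^2 (hypothetical dense matrix).
    """
    pvals = np.asarray(pvals, dtype=float)
    ld_r2 = np.asarray(ld_r2, dtype=float)
    # Thresholding: only variants passing the P-value cutoff are candidates,
    # visited from most to least significant.
    order = np.argsort(pvals, kind="stable")
    candidates = [int(i) for i in order if pvals[i] <= p_threshold]
    kept = []
    for i in candidates:
        # Clumping: drop a variant in LD with an already-kept, more significant one.
        if all(ld_r2[i, j] < r2_threshold for j in kept):
            kept.append(i)
    return kept

def prs_scores(dosages, betas, kept):
    """Score each individual as the sum of beta * dosage over the kept variants."""
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    return dosages[:, kept] @ betas[kept]

# Toy data: variants 0 and 1 are in strong LD; variant 2 fails the P-value cutoff.
pvals = [1e-6, 1e-5, 0.5, 1e-4]
ld_r2 = np.eye(4)
ld_r2[0, 1] = ld_r2[1, 0] = 0.8
betas = [0.2, 0.1, 0.3, -0.1]
dosages = [[2, 1, 0, 1], [0, 2, 2, 2]]

kept = clump_and_threshold(pvals, ld_r2)
scores = prs_scores(dosages, betas, kept)
```

In the toy data, variant 1 is clumped away by variant 0 and variant 2 is removed by the threshold, so only variants 0 and 3 contribute to each individual's score.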
Add back in AD GWAS summary statistics for hg19 - 2024/05/06 for PR
Add in LD map for clumping in hg38 - @cristinaetrv - 2024/05/06 for PR
Optimizing PRS C+T for performance - @cristinaetrv - 2024/05/06 for PR
(sprint 12) Need annotation for ancestry in AD stat summary - @cristinaetrv
Add batch processing for PRS C+T workflow with dosage matrix, to address memory issues - @cristinaetrv - 2024/05/08 for PR
Automatically launch PRS after ancestry from API server - @akotlar - 2024/05/16 for PR
(stretch) Take in ancestry PCs as PRS-CS covariates @austinTalbot7241993 @akotlar - April 24
(stretch) Take in top hit from ancestry, convert to superpop, connect to LD map for the corresponding pop for LD clumping; expectation is that we will at least have this in progress in sprint 11 - @cristinaetrv - 2024/05/16
(stretch) Add readme for AD GWAS sum stats @cristinaetrv - 2024/05/16
(stretch) Display basic PRS results in webapp (table with individuals and their score) - @akotlar - 2024/05/16
(sprint 12) Finish PRS-CS standard way without Langevin Dynamics @austinTalbot7241993
(sprint 12) Weight PRS scores by gnomAD allele frequencies for specific ancestries and the corresponding ancestry probability:
Beta*dosage - 2*( sum_over_superpop_ancestry { maf_gnomad_in_ancestry * p_ancestry } ) - TBD
Important to IBDGC (and likely other consortia).
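A rough sketch of the ancestry-weighted score follows. The formula in the note is not parenthesized, so the grouping used here is an assumption: the dosage is centered by its expected value under the individual's ancestry mix, 2 * sum(maf * p_ancestry), and the centered dosage is then multiplied by beta. The function name, argument names, and array layouts are hypothetical, and the frequencies are assumed to refer to the effect allele.

```python
import numpy as np

def ancestry_centered_prs(dosages, betas, maf_by_pop, ancestry_probs):
    """PRS with dosages centered by ancestry-weighted expected dosage.

    dosages:        (n_samples, n_variants) effect-allele dosages in [0, 2].
    betas:          (n_variants,) effect sizes.
    maf_by_pop:     (n_pops, n_variants) effect-allele frequency per superpopulation.
    ancestry_probs: (n_samples, n_pops) per-sample ancestry probabilities (rows sum to 1).
    """
    dosages = np.asarray(dosages, dtype=float)
    betas = np.asarray(betas, dtype=float)
    maf_by_pop = np.asarray(maf_by_pop, dtype=float)
    ancestry_probs = np.asarray(ancestry_probs, dtype=float)
    # sum over superpops of maf_in_ancestry * p_ancestry, per sample and variant
    expected_freq = ancestry_probs @ maf_by_pop
    # Assumed grouping: beta * (dosage - 2 * expected frequency)
    centered = dosages - 2.0 * expected_freq
    return centered @ betas

# Toy data: 2 samples, 2 variants, 2 superpopulations.
maf_by_pop = [[0.1, 0.5], [0.3, 0.25]]
ancestry_probs = [[1.0, 0.0], [0.5, 0.5]]
dosages = [[1, 1], [2, 0]]
betas = [0.5, 1.0]

weighted_scores = ancestry_centered_prs(dosages, betas, maf_by_pop, ancestry_probs)
```

If the literal reading is intended instead (the frequency term subtracted without being scaled by beta), only the last line of the function changes.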
Covariance Matrix Estimation
Overall goal: improve network analysis, regression, clustering, and anything else that relies on a covariance matrix. The empirical covariance matrix is a poor estimator, especially at small sample sizes.
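To illustrate why the empirical estimator struggles at small sample sizes, here is a minimal sketch using simple linear shrinkage toward a scaled identity. This is a stand-in technique chosen for brevity, not the singular value shrinker under development; the function name and the fixed `alpha` are assumptions (data-driven choices such as Ledoit-Wolf exist, e.g. `sklearn.covariance.LedoitWolf`).

```python
import numpy as np

def shrink_covariance(X, alpha=0.2):
    """Linear shrinkage: (1 - alpha) * S + alpha * mu * I, with mu = trace(S) / p.

    X: (n_samples, n_features) data matrix.
    alpha: fixed shrinkage intensity in [0, 1] (hypothetical default).
    """
    X = np.asarray(X, dtype=float)
    S = np.cov(X, rowvar=False)  # empirical covariance, shape (p, p)
    p = S.shape[0]
    mu = np.trace(S) / p         # average variance, keeps the trace unchanged
    return (1.0 - alpha) * S + alpha * mu * np.eye(p)

# With fewer samples than features, the empirical covariance is rank deficient.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 50))  # n = 10 samples, p = 50 features
S = np.cov(X, rowvar=False)
S_shrunk = shrink_covariance(X, alpha=0.2)

rank_S = np.linalg.matrix_rank(S)             # at most n - 1
min_eig = np.linalg.eigvalsh(S_shrunk).min()  # strictly positive after shrinkage
```

The shrunk estimate is invertible even when n < p, which is what downstream regressions and network analyses typically need.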
Infrastructure
Post IBDGC Tasks