DNA Methylation-based Prediction
of Midlife Dementia Risk

Background · Research Questions · Workflow · Contributors

This repository contains the scripts used for the thesis project "DNA Methylation-based Prediction of Midlife Dementia Risk", carried out as part of the Master's programme in Systems Biology at Maastricht University.

Background

The LIBRA and CAIDE midlife dementia risk scores have been developed for the early identification of people with an increased dementia risk. As DNA methylation may act as the molecular link between lifestyle/environment and the biological processes governing health and disease, DNA methylation data might be used as an alternative way to predict midlife dementia risk. However, the high dimensionality of DNA methylation data often makes parameter optimization and model training computationally infeasible without prior feature selection.

Research Questions

Can a robust and computationally feasible feature selection method be established to reduce the dimensionality of the data?
Can reliable DNA methylation-based models for the prediction of a person’s LIBRA, CAIDE1, and CAIDE2 scores and risk factors be constructed in a general population cohort?
Does extending the dementia risk score models with polygenic risk scores (PGSs) of dementia risk factors, subtypes, and comorbidities improve predictive power?
Can the LIBRA, CAIDE1, CAIDE2, and risk factor models be used to predict dementia and cognitive impairment status in dementia-associated cohorts?
What biological processes are captured by the most important features of the risk factor models?

Workflow

An overview of the applied methodological workflow is shown in the figure below. The workflow consists of the following steps:

I. Pre-processing

Pre-processing of phenotype data (PhenotypeProcessing)
Pre-processing of genomics data (GenotypeProcessing)
Pre-processing of DNA methylation data (MethylationProcessing)

II. Evaluation of Feature Selection Methods

Evaluation of eight different feature selection methods (FeatureSelection):

Variance-based feature selection
- β-values
- M-values
- β-values corrected for cell type composition
- M-values corrected for cell type composition
S-score-based feature selection
PCA-based feature selection
Kennard-Stone-like feature selection
Correlation-based feature selection
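
To illustrate why variance-based selection is evaluated on both β-values and M-values, here is a minimal sketch in plain Python. It is not the code from the FeatureSelection folder. The function names, the clamping constant `eps`, and the toy probe IDs are hypothetical; the only fixed ingredient is the standard logit transform M = log2(β / (1 - β)).

```python
import math
from statistics import pvariance


def beta_to_m(beta, eps=1e-6):
    """Convert a beta-value in [0, 1] to an M-value: log2(beta / (1 - beta))."""
    # Clamp to (eps, 1 - eps) so that beta = 0 or 1 does not break the log.
    # The value of eps is an assumption made for this sketch.
    b = min(max(beta, eps), 1 - eps)
    return math.log2(b / (1 - b))


def top_variance_features(data, n_keep, use_m_values=False):
    """Return the n_keep probe IDs with the largest variance across samples.

    data: dict mapping probe ID -> list of beta-values (one per sample).
    """
    variances = {}
    for probe, betas in data.items():
        values = [beta_to_m(b) for b in betas] if use_m_values else list(betas)
        variances[probe] = pvariance(values)
    ranked = sorted(variances, key=lambda p: variances[p], reverse=True)
    return ranked[:n_keep]


# Toy data: three hypothetical probes measured in four samples.
toy = {
    "cg_mid": [0.40, 0.60, 0.45, 0.55],   # moderate spread around 0.5
    "cg_low": [0.01, 0.03, 0.02, 0.05],   # small absolute spread near 0
    "cg_high": [0.90, 0.91, 0.90, 0.91],  # almost constant
}

print(top_variance_features(toy, 1))                     # selection on beta-values
print(top_variance_features(toy, 1, use_m_values=True))  # selection on M-values
```

On this toy data the two scales pick different probes: the β scale favours the probe that varies around 0.5, while the M scale stretches differences near 0 and 1 and therefore favours the low-methylation probe.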

III. Prediction of Dementia Risk

Prediction of the dementia risk scores, categories, and factors (MachineLearning).

IV. Model Validation & Interpretation

Biological interpretation of the best-performing risk factor models (ModelInterpretation)
Validation of the established models in independent dementia-associated cohorts (ModelValidation).
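
One way to quantify how well a predicted risk score separates people with and without dementia in a validation cohort is the area under the ROC curve (AUC). The sketch below is an assumption about how such a check could look, not the evaluation used in ModelValidation. The function name and the toy scores and labels are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted risk scores (higher = higher predicted risk).
    labels: 1 for cases (e.g. dementia), 0 for controls.
    Tied case/control pairs count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy example: four hypothetical individuals.
predicted = [0.9, 0.8, 0.3, 0.2]
status = [1, 0, 1, 0]
print(auc(predicted, status))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases and controls.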

Contributors

Jarno Koetsier¹*, Rachel Cavill², and Ehsan Pishva³

¹ Faculty of Science and Engineering (FSE), Maastricht University

² Department of Advanced Computing Sciences (DACS), Faculty of Science and Engineering (FSE), Maastricht University

³ Department of Psychiatry and Neuropsychology, School for Mental Health and Neuroscience (MHeNS), Faculty of Health, Medicine and Life Sciences (FHML), Maastricht University

*Feel free to contact me via email: j.koetsier@student.maastrichtuniversity.nl
