Sample size with repeated observations #415

Open
ldehoyos opened this issue Jan 29, 2024 · 1 comment

Comments

@ldehoyos

I want to compute LDSC heritability and genetic correlation from a genome-wide meta-analysis that includes repeated observations, and I am wondering what value to use for the sample size.

For example, in some studies the same individuals were included at age 20 and again at age 25. Here is an example for one SNP to illustrate:

Study    Phenotype  Age  N
Study_1  Phenotype  20   1000
Study_1  Phenotype  25   900
Study_2  Phenotype  24   2000
Study_3  Phenotype  24   3000
Study_4  Phenotype  25   2000
Study_5  Phenotype  20   2500
Study_5  Phenotype  23   2500

What sample size should I use?

  • Should I use the N for the independent individuals? 1000+2000+3000+2000+2500 = 10500
  • Or the N of observations? sum(N) = 13900

Note that in this example the difference in N is small (13900 - 10500 = 3400), but in my real dataset the difference is substantial, e.g. Nind = 60,000 vs. Nobs = 400,000.
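
To make the two candidate values explicit, here is a minimal sketch that computes both from the example rows. It assumes that repeated waves within a study cover the same people, so the largest wave N is taken as that study's number of unique individuals, and that different studies share no individuals. The function name `sample_sizes` is only illustrative.

```python
from collections import defaultdict

# (study, age, N) rows copied from the example table
rows = [
    ("Study_1", 20, 1000),
    ("Study_1", 25, 900),
    ("Study_2", 24, 2000),
    ("Study_3", 24, 3000),
    ("Study_4", 25, 2000),
    ("Study_5", 20, 2500),
    ("Study_5", 23, 2500),
]


def sample_sizes(rows):
    """Return (N of independent individuals, N of observations)."""
    per_study = defaultdict(list)
    for study, _age, n in rows:
        per_study[study].append(n)
    # Every row counts once toward the number of observations.
    n_obs = sum(n for ns in per_study.values() for n in ns)
    # Assumption: the largest wave approximates the unique individuals in a study.
    n_ind = sum(max(ns) for ns in per_study.values())
    return n_ind, n_obs


n_ind, n_obs = sample_sizes(rows)
print(n_ind, n_obs, n_obs - n_ind)  # 10500 13900 3400
```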

Thanks in advance,
L

@aksarkar

aksarkar commented May 8, 2024

@ldehoyos I think the only correct way to do this analysis is to run a separate GWAS of the phenotype at each age, which avoids "repeated observations". Then you should compute heritability/genetic correlation for each phenotype/age combination.
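
As a rough illustration of what per-age stratification would mean for N, here is a minimal sketch using the example rows from the question. It assumes the studies do not share individuals, so within one age stratum every person appears at most once and the stratum N is a plain sum. The function name `n_per_age` is hypothetical.

```python
from collections import defaultdict

# (study, age, N) rows copied from the example in the question
rows = [
    ("Study_1", 20, 1000),
    ("Study_1", 25, 900),
    ("Study_2", 24, 2000),
    ("Study_3", 24, 3000),
    ("Study_4", 25, 2000),
    ("Study_5", 20, 2500),
    ("Study_5", 23, 2500),
]


def n_per_age(rows):
    """Sum N within each age, giving one sample size per age-specific GWAS."""
    totals = defaultdict(int)
    for _study, age, n in rows:
        totals[age] += n
    return dict(sorted(totals.items()))


per_age = n_per_age(rows)
for age, n in per_age.items():
    print(f"age {age}: N = {n}")
```

Each age-specific N would then go with the summary statistics for that age only, rather than a single pooled N.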
