Do you apply any transformation for the continuous trait in case it does not follow a normal distribution? #136

Open
neginmb opened this issue Jan 29, 2021 · 9 comments

@neginmb

neginmb commented Jan 29, 2021

No description provided.

@johnlees
Collaborator

We don't, no. Have a look at warpedlmm. If you want, you can take the warped phenotype output from that package and use it as the phenotype input to pyseer.
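
For context, here is a rough sketch of what transforming a phenotype before the association test can look like. Note that this is not warpedlmm: warpedlmm learns the warping function from the data, whereas the sketch below swaps in a much cruder, fixed technique (a rank-based inverse normal transform). The sample names, values and the function `rank_inverse_normal` are all hypothetical, and the tab-separated layout at the end is only an assumption about a typical phenotype table, so check the pyseer docs for the exact format it expects.

```python
import statistics


def rank_inverse_normal(values, offset=0.5):
    """Map values to standard-normal quantiles by rank (ties share an average rank)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        # Find the end of the current group of tied values.
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # 1-based average rank of the group
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    normal = statistics.NormalDist()
    # (rank - offset) / n always lies strictly between 0 and 1.
    return [normal.inv_cdf((r - offset) / n) for r in ranks]


# Hypothetical, strongly right-skewed phenotype values.
phenotypes = {
    "sample_1": 0.1,
    "sample_2": 0.5,
    "sample_3": 2.0,
    "sample_4": 40.0,
    "sample_5": 900.0,
}

transformed = rank_inverse_normal(list(phenotypes.values()))

# Assumed layout: header line, then one "sample<TAB>value" line per sample.
lines = ["samples\tphenotype"]
lines += [f"{name}\t{value:.4f}" for name, value in zip(phenotypes, transformed)]
```

The transform keeps the ordering of the samples but discards the distances between them, which is one reason a learned warping may be preferable.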

@neginmb
Author

neginmb commented Feb 1, 2021

Thanks for your answer, but what about a generalized linear model (GLM)? Do you fit one for continuous traits? If so, what are the link function and the variance function?

@johnlees
Collaborator

johnlees commented Feb 2, 2021

No, we don't fit a GLM. We just use OLS:
https://github.com/mgalardini/pyseer/blob/master/pyseer/model.py#L284-L292

Statsmodels supports GLMs, so you could easily change this line if you wanted to use a different link or family:
https://www.statsmodels.org/stable/generated/statsmodels.genmod.generalized_linear_model.GLM.html?highlight=glm#statsmodels.genmod.generalized_linear_model.GLM

The LMM is from limix/fastlmm, which I believe uses an identity (linear) link and Gaussian errors. This is more difficult to modify, but the limix package has a few possible alternatives.

@snowformatics

I have a similar situation and I am wondering whether I could use the mean calculated warped phenotypes of my repeated measurements with different environments instead of BLUEs? Any idea?

Thanks
Stefanie

@sydelstan

sydelstan commented Oct 19, 2021

does the continuous trait need to be normally distributed? is there a test to confirm the assumptions of the OLS are met?

@johnlees
Collaborator

> does the continuous trait need to be normally distributed? is there a test to confirm the assumptions of the OLS are met?

@sydelstan OLS doesn't assume the responses/traits are normally distributed; it assumes the errors are, which you can check using the residuals. You can plot lots of useful diagnostics in R by doing something like:

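# y: phenotype vector, x: a single predictor
model <- glm(y ~ x, family=gaussian())
# draws the standard diagnostic plots (residuals vs fitted, normal Q-Q, etc.)
plot(model)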

But doing this for all of your predictors is more difficult. I would generally just use warpedlmm.

@johnlees
Collaborator

> I have a similar situation and I am wondering whether I could use the mean calculated warped phenotypes of my repeated measurements with different environments instead of BLUEs? Any idea?

@snowformatics sorry, I missed your message, which I now see is from a while back. This is certainly beyond the scope of pyseer. My intuition would be to set up a small simulation study in this case. For more complex situations like this, an LMM with the greater flexibility of limix, or even lme4, or a Bayesian/MCMC version (stan is probably not a bad idea) might be more helpful.

@anhvu989

anhvu989 commented Oct 3, 2023

Hello,
Does the continuous phenotype need to be normally distributed when using LMM model? Because I read somewhere that the requirements for LMM model are phenotypes and residuals need to follow a normal distribution.

I am new to Pyseer and came across a publication (https://www.mdpi.com/2076-2607/10/7/1366) running Pyseer and it has residuals plotted and normality checked for satisfying assumptions for using LMM.
Is it possible to extract residuals from Pyseer results for this plotting purpose?

Thank you very much.

@johnlees
Collaborator

johnlees commented Oct 3, 2023

> Hello, Does the continuous phenotype need to be normally distributed when using LMM model? Because I read somewhere that the requirements for LMM model are phenotypes and residuals need to follow a normal distribution.
>
> I am new to Pyseer and came across a publication (https://www.mdpi.com/2076-2607/10/7/1366) running Pyseer and it has residuals plotted and normality checked for satisfying assumptions for using LMM. Is it possible to extract residuals from Pyseer results for this plotting purpose?
>
> Thank you very much.

I think this is mostly covered in the comments above. I would recommend warpedlmm if you are worried about this, but typically it's unlikely to be a big problem.
If you wanted residuals, this would be a bit manual, but you could run phenotype prediction (see the docs) and compare the predicted values with the observed ones to get an idea about this.
