Support for arbitrary design matrices and contrast vectors #213

Open
grst opened this issue Nov 28, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@grst

grst commented Nov 28, 2023

Is your feature request related to a problem? Please describe.
Most linear models support passing designs as design matrices and contrasts as contrast vectors. This is the "smallest common denominator" for specifying designs and it's useful

  • for more complex designs and comparisons that aren't covered by a simple [column, baseline, treatment] triplet
  • for writing wrapper functions (e.g. multi-condition comparisons) that use PyDESeq2 as one of multiple backends and already build model matrices and contrast vectors from more user-friendly input such as formulae.

Describe the solution you'd like

  • DeseqDataSet should take a design matrix
  • DeseqStats should take a contrast vector with one value per fitted coefficient, such as [0, -1, 1].
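
To illustrate what a contrast vector with one value per fitted coefficient means, here is a minimal sketch using only NumPy. It is not PyDESeq2 code: ordinary least squares stands in for the actual model fit, and the design matrix, the values in `y`, and all variable names are made up for illustration.

```python
import numpy as np

# Toy design matrix: intercept plus indicator columns for conditions B and C
# (condition A is the reference level). One row per sample.
design = np.array(
    [
        [1, 0, 0],  # sample from condition A
        [1, 0, 0],  # sample from condition A
        [1, 1, 0],  # sample from condition B
        [1, 1, 0],  # sample from condition B
        [1, 0, 1],  # sample from condition C
        [1, 0, 1],  # sample from condition C
    ],
    dtype=float,
)

# Hypothetical per-sample values for a single gene.
y = np.array([1.0, 1.0, 2.0, 2.0, 4.0, 4.0])

# Ordinary least squares is an assumption standing in for the real fit.
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

# One weight per fitted coefficient: (C vs A) minus (B vs A), i.e. C vs B.
contrast = np.array([0.0, -1.0, 1.0])
effect = float(contrast @ beta)
```

The contrast is just a dot product with the coefficient vector, which is why it can express comparisons that a [column, baseline, treatment] triplet cannot, such as differences between two non-reference levels or weighted averages of several coefficients.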

Additional context
Discussed at the scverse hackathon in Cambridge.

CC @const-ae @emdann

@BorisMuzellec
Collaborator

Hi @grst @const-ae @emdann, is there a consensus regarding what would be most convenient? I'm assuming we want to use formulaic?

I won't have the bandwidth to implement this feature on my own in the next few weeks, but if anyone wants to give it a try, I'm happy to help them.

@grst
Author

grst commented Dec 4, 2023

I don't even think you'd need to deal with formulaic/patsy in PyDESeq2, at least initially. Either tool generates a design matrix (which advanced users could also create manually) which should be the input for PyDESeq2.
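
As a rough sketch of what "creating a design matrix manually" could look like, the following builds one with plain pandas. The metadata, sample names, and column names are hypothetical, and how such a matrix would actually be passed to PyDESeq2 is left open here, since that interface is what this issue is asking for.

```python
import pandas as pd

# Hypothetical sample metadata with a single three-level factor.
metadata = pd.DataFrame(
    {"condition": ["A", "A", "B", "B", "C", "C"]},
    index=[f"sample{i}" for i in range(1, 7)],
)

# One indicator column per non-reference level; "A" is dropped as the baseline.
dummies = pd.get_dummies(
    metadata["condition"], prefix="condition", drop_first=True, dtype=float
)

# Prepend an intercept column so there is one column per model coefficient.
design = dummies.copy()
design.insert(0, "intercept", 1.0)
```

A formula-based tool would be expected to produce an equivalent matrix from the same metadata, possibly with different column names.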

@const-ae

const-ae commented Dec 4, 2023

I agree with Gregor that the easiest change might be to simply allow some way to provide a design matrix and then just skip the step build_design_matrix at https://github.com/owkin/PyDESeq2/blob/main/pydeseq2/dds.py#L249. Of course, longer term I think it would be great to save the user from converting data + formula to a design matrix and do it internally, but in the end it's just syntactic sugar :)

@jeandut
Contributor

jeandut commented Apr 24, 2024

PR #181 implements the ability to pass a design matrix directly. For now, however, the matrix needs to follow the pydeseq2 naming conventions used in further preprocessing, namely the _vs_ syntax.
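
For anyone preparing a hand-built matrix for that branch, a small helper along these lines might be handy. It is only a sketch: the helper name `add_vs_suffix` is hypothetical, and the exact column pattern `<factor>_<level>_vs_<reference>` is an assumption about what the _vs_ syntax looks like, so check it against the branch before relying on it.

```python
import pandas as pd


def add_vs_suffix(design: pd.DataFrame, factor: str, reference: str) -> pd.DataFrame:
    """Rename '<factor>_<level>' columns to '<factor>_<level>_vs_<reference>'.

    The target naming pattern is an assumption, not confirmed PyDESeq2 behaviour.
    """
    prefix = f"{factor}_"
    renamed = {
        col: f"{col}_vs_{reference}"
        for col in design.columns
        if col.startswith(prefix) and "_vs_" not in col
    }
    return design.rename(columns=renamed)


# Hypothetical hand-built design matrix with "A" as the reference level.
design = pd.DataFrame(
    {
        "intercept": [1.0, 1.0, 1.0],
        "condition_B": [0.0, 1.0, 0.0],
        "condition_C": [0.0, 0.0, 1.0],
    }
)
named = add_vs_suffix(design, factor="condition", reference="A")
```

Columns that do not belong to the factor (such as the intercept) and columns that already contain `_vs_` are left untouched.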

@jeandut
Contributor

jeandut commented Apr 24, 2024

Don't hesitate to play with the branch and give feedback on its limitations.
