[Feature request] one vs. all others #168

Open
acoteataltius opened this issue Aug 23, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@acoteataltius

I'd like to be able to input a contrast design (or otherwise choose design factors) to do a one-vs-all comparison within a particular "condition" that has more than two levels. For example, if my column "condition" has levels A, B, C, and D, compare A vs. B, C, and D combined.

Something like these options in R DESeq2:
design <- ~0 + condition
contrast = c(1, -1/3, -1/3, -1/3)
or, in list form:
contrast = list(c("conditionA"),
                c("conditionB", "conditionC", "conditionD"))

Would it be possible to do something where if you leave the second option blank in contrast, like:
contrast = ['condition', 'A', '']
it compares A with all other samples?
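For reference, the numeric contrast in the R snippet above can be sketched in plain Python. This is only an illustration of the arithmetic, assuming a no-intercept design with one coefficient per level; the helper names `one_vs_rest_contrast` and `apply_contrast` are hypothetical and are not part of pydeseq2.

```python
def one_vs_rest_contrast(levels, target):
    """Weights for 'target vs. the average of all other levels'.

    Assumes a no-intercept design with one coefficient per level,
    listed in the same order as `levels`.
    """
    if target not in levels:
        raise ValueError(f"{target!r} is not one of {levels}")
    if len(levels) < 2:
        raise ValueError("need at least two levels to build a contrast")
    n_others = len(levels) - 1
    return [1.0 if lvl == target else -1.0 / n_others for lvl in levels]


def apply_contrast(weights, coefficients):
    """Weighted sum of per-level coefficients (e.g. log fold changes)."""
    return sum(w * b for w, b in zip(weights, coefficients))


weights = one_vs_rest_contrast(["A", "B", "C", "D"], "A")
# Hypothetical per-level coefficients, for illustration only.
effect = apply_contrast(weights, [2.0, 1.0, 1.0, 1.0])
```

With four levels the weights are 1 for A and -1/3 for each of B, C, and D, so the result is the coefficient of A minus the mean of the other three.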

@acoteataltius acoteataltius changed the title one vs. all others [Feature request] one vs. all others Aug 23, 2023
@BorisMuzellec BorisMuzellec added the enhancement New feature or request label Aug 31, 2023
@BorisMuzellec
Collaborator

Hi @acoteataltius, that would be a convenient feature to have indeed.

It's not available in pydeseq2 yet, but I'm adding it to our feature wishlist. I'll give it a go when I have time, but I'm also happy to help anyone opening a PR. Not sure what would be the best way to implement it from a user perspective (maybe a one_vs_all boolean argument?).

In the meantime it seems that it would be possible to obtain the same results by manually setting the contrast_vector attribute after initializing the DeseqStats object, but I'm not 100% sure about this either.
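A rough sketch of what such a manual contrast vector could look like for a design with an intercept and a reference level. The column naming `<factor>_<level>_vs_<ref>` and the helper `one_vs_rest_reference_coded` are assumptions made for illustration, and whether assigning the result to contrast_vector gives correct results is not confirmed in this thread.

```python
def one_vs_rest_reference_coded(columns, factor, levels, target, ref):
    """Weights over design-matrix columns for 'target vs. mean of the rest'.

    Assumes reference coding: the mean of the reference level is the
    intercept, and the mean of any other level is
    intercept + coefficient of '<factor>_<level>_vs_<ref>'.
    The intercept cancels out, so its weight is always 0.
    """
    others = [lvl for lvl in levels if lvl != target]
    weight_by_level = {target: 1.0}
    for lvl in others:
        weight_by_level[lvl] = -1.0 / len(others)

    weights = []
    for col in columns:
        weight = 0.0
        for lvl, w in weight_by_level.items():
            # The reference level has no column of its own.
            if lvl != ref and col == f"{factor}_{lvl}_vs_{ref}":
                weight = w
        weights.append(weight)
    return weights


# Assumed column names, for illustration only.
columns = ["intercept", "condition_B_vs_A", "condition_C_vs_A", "condition_D_vs_A"]
levels = ["A", "B", "C", "D"]
a_vs_rest = one_vs_rest_reference_coded(columns, "condition", levels, "A", "A")
b_vs_rest = one_vs_rest_reference_coded(columns, "condition", levels, "B", "A")
# Untested idea from the comment above: stat_res.contrast_vector = a_vs_rest
```

Note that the weights differ from the no-intercept case: when the target is the reference level, it contributes only through the intercept, which cancels.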

@GalaMichal

Hi @BorisMuzellec, I'd like to ask whether it is even possible to compare all vs. all: basically, treating each level of the condition factor as a separate group, without setting any of them as the reference (e.g. healthy).

Something like in R DESeq2:
design <- ~0 + condition

@BorisMuzellec
Collaborator

BorisMuzellec commented Dec 15, 2023

@GalaMichal there is unfortunately no direct way to do this as of yet. This relates to #213.

However, I think it is possible to obtain the same design matrix using pydeseq2.utils.build_design_matrix with no intercept but an expanded design, and to use it in your DeseqDataSet like this:

import pydeseq2.dds as ds
from pydeseq2.utils import build_design_matrix

# counts and metadata are your own pandas DataFrames
dds = ds.DeseqDataSet(counts=counts, metadata=metadata, design_factors="condition")

# This is where you replace the design matrix
dds.obsm["design_matrix"] = build_design_matrix(
    metadata=dds.obs,
    design_factors=dds.design_factors,
    expanded=True,
    intercept=False,
)

# And then you should be able to carry on as usual
dds.deseq2()  # etc.

Let me know if this works!
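Conceptually, a no-intercept, expanded design has one indicator column per level, so no level is singled out as the reference. The sketch below builds such a matrix in plain Python to show its shape; `cell_means_design` is a hypothetical helper, and the exact columns that build_design_matrix returns may be named or ordered differently.

```python
def cell_means_design(conditions):
    """One indicator column per level, no intercept column.

    Returns the sorted level names and one row per sample, with a 1
    in the column of that sample's level and 0 elsewhere.
    """
    levels = sorted(set(conditions))
    rows = [[1 if cond == lvl else 0 for lvl in levels] for cond in conditions]
    return levels, rows


# Hypothetical sample annotations, for illustration only.
levels, rows = cell_means_design(["A", "B", "A", "C"])
for row in rows:
    print(dict(zip(levels, row)))
```

Each row contains exactly one 1, which is why every coefficient can be read as the mean of its own level rather than as a difference from a reference.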

@GalaMichal

@BorisMuzellec thank you for the quick response. Unfortunately, it doesn't work.
dds.deseq2() runs, but stat_res = DeseqStats(dds) raises KeyError: 'Condition_1_vs_Condition_1'

The same error occurs when I try, for example, stat_res = DeseqStats(dds, contrast=("Conditions", "Condition_1", "Condition_2"))
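One possible alternative for all-vs-all comparisons, not confirmed anywhere in this thread, is to keep the default design and enumerate every pair of levels, running one contrast per pair. The sketch below only generates the (factor, level, level) triples in the same form as the contrast argument quoted above; `pairwise_contrasts` is a hypothetical helper, and the level names are placeholders.

```python
from itertools import combinations


def pairwise_contrasts(factor, levels):
    """All unordered pairs of levels as (factor, level_a, level_b) triples."""
    unique_levels = sorted(set(levels))
    return [(factor, a, b) for a, b in combinations(unique_levels, 2)]


# Placeholder level names, for illustration only.
contrasts = pairwise_contrasts("condition", ["A", "B", "C"])
for contrast in contrasts:
    # Each triple could then be passed as the contrast of a separate
    # DeseqStats run (assumption: the default design is left untouched).
    print(contrast)
```

For k levels this produces k(k-1)/2 comparisons, so multiple-testing considerations across pairs are left to the user.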
