Gene name hyphens get filtered out when using cob.from_table() #66

Open
MeeshCompBio opened this issue Oct 6, 2017 · 3 comments

@MeeshCompBio
Contributor

I have a data matrix that looks like this, which serves as input to COB.from_table:

R1M-C2 R2M-C2 R3M-C2 R1T-C2
BDIBD21-3.1G0000700 0.440474 0.255481 0.312441
BDIBD21-3.1G0000800 1.41546 2.19172 2.00877
BDIBD21-3.1G0000900 0.0210054 0 0.0714931
BDIBD21-3.1G0001000 0 0 0

That command strips the hyphen from the gene names before testing membership, producing something like BDIBD213.1G000290. As a result, all of the genes are filtered out, since they no longer match the RefGen gene names, which include the hyphen.
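To illustrate the failure mode, here is a minimal sketch. The pattern `STRIP_PATTERN` and the helper `sanitize` are hypothetical stand-ins; the actual regex in camoco/Expr.py may differ. The gene IDs are taken from the table above.

```python
import re

# Hypothetical sanitizing pattern (an assumption, not the actual Camoco regex):
# drop every character that is not a letter, a digit, or a dot.
STRIP_PATTERN = re.compile(r"[^A-Za-z0-9.]")


def sanitize(name):
    """Return the gene name with disallowed characters removed."""
    return STRIP_PATTERN.sub("", name)


# Reference gene IDs keep their hyphen.
refgen_ids = {"BDIBD21-3.1G0000700", "BDIBD21-3.1G0000800"}

# Gene IDs as they appear in the input table.
table_ids = ["BDIBD21-3.1G0000700", "BDIBD21-3.1G0000800"]

# Sanitizing before the membership test removes the hyphen ...
sanitized = [sanitize(g) for g in table_ids]

# ... so none of the sanitized IDs are found in the reference set.
kept = [g for g in sanitized if g in refgen_ids]

print(sanitized)
print(kept)
```

With a pattern like this, every hyphenated ID fails the membership test even though the original names match the reference exactly.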

@monprin
Contributor

monprin commented Oct 6, 2017

So the place it is getting filtered out is right here at line 256:

https://github.com/schae234/Camoco/blob/master/camoco/Expr.py

I don't know whether I added this or someone else did. If you change the regexp it should work, but I can't speak to the downstream effects of that. I looked, and I remember having issues with column labels in bcolz (see the solution to that problem a couple of lines above), but not specifically in the columns themselves.

@schae234
Member

It looks like this code was introduced because of HDF5, not bcolz. I think we can just remove this regex and be fine.

But we should add a regression test, since I am not sure what bcolz can store in terms of strings.
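A regression test along these lines might look like the sketch below. The helper `filter_genes` is a hypothetical stand-in for the membership filter, not part of the Camoco API, and the test does not exercise bcolz storage itself; a real test would go through COB.from_table with a RefGen fixture.

```python
def filter_genes(table_ids, refgen_ids):
    """Hypothetical stand-in for the membership filter.

    Keeps the IDs that appear in the reference set, without modifying them.
    """
    return [g for g in table_ids if g in refgen_ids]


def test_hyphenated_gene_ids_survive_filtering():
    refgen_ids = {"BDIBD21-3.1G0000700", "BDIBD21-3.1G0000800"}
    table_ids = [
        "BDIBD21-3.1G0000700",
        "BDIBD21-3.1G0000800",
        "NOT-IN-REFGEN",  # made-up ID that should be filtered out
    ]

    kept = filter_genes(table_ids, refgen_ids)

    # Hyphenated IDs present in the reference must be retained unchanged.
    assert kept == ["BDIBD21-3.1G0000700", "BDIBD21-3.1G0000800"]
    # The hyphen must still be there after filtering.
    assert all("-" in g for g in kept)


test_hyphenated_gene_ids_survive_filtering()
```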

@monprin
Contributor

monprin commented Oct 10, 2017

Okay, cool.

I don't recall noticing any weirdness, but I may have been protected by that regex.

@schae234 schae234 added this to the v0.6.0 milestone Feb 6, 2018