add-methylation level not working #48

Open
knoedlerj opened this issue Jan 14, 2020 · 12 comments

Comments

@knoedlerj

When I run add-methylation-level with an input tsv (genes that are differentially expressed, for which I'm interested in getting the total methylation level), methylpy generates a blank output file.

@yupenghe
Owner

Would you mind sharing the top few lines of your input file?

@knoedlerj
Author

Sure thing! The input tsv file starts with:

chr1 59764278 59878081 NM_007561 0 +
chr1 193301993 193343878 NM_008484 0 +
chr1 42695767 42700209 NM_008900 0 +
chr1 95587681 95667594 NM_009183 0 -
chr1 84036292 84284645 NM_001003948 0 -
chr1 93478992 93509732 NM_010891 0 +

@yupenghe
Owner

The format looks fine. There are a few possible causes.

  • I notice that you are using "chr1". Please make sure the chromosomes in the allc file are named the same way.
  • Please double check that the input file is tab-separated.
  • You will need to add a header line to the file; otherwise the first line of the file will be treated as the header.
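
The three checks above can be scripted before running methylpy. The following is a minimal sketch, not part of methylpy: the function name `check_region_lines` is hypothetical, and it assumes the allc file is tab-separated with the chromosome name in the first column.

```python
def check_region_lines(region_lines, allc_lines):
    """Return a list of human-readable problems found in the region file.

    Both arguments are iterables of text lines (for example, open file handles).
    """
    problems = []
    lines = [line.rstrip("\n") for line in region_lines if line.strip()]
    if not lines:
        return ["region file is empty"]

    # Tab check: every line needs at least chromosome, start and end fields.
    if any(len(line.split("\t")) < 3 for line in lines):
        problems.append("some lines have fewer than 3 tab-separated fields")

    # Header check: a first line whose 2nd and 3rd fields are integers
    # is probably a data row rather than a header.
    first = lines[0].split("\t")
    has_header = not (len(first) >= 3 and first[1].isdigit() and first[2].isdigit())
    if not has_header:
        problems.append("first line looks like data; add a header line")

    # Chromosome naming check: names must match those used in the allc file.
    allc_chroms = {line.split("\t", 1)[0] for line in allc_lines if line.strip()}
    data_lines = lines[1:] if has_header else lines
    region_chroms = {line.split("\t", 1)[0] for line in data_lines}
    missing = sorted(region_chroms - allc_chroms)
    if missing:
        problems.append("chromosomes not found in allc file: " + ", ".join(missing))
    return problems
```

An empty list means none of the three problems was detected; it does not guarantee that methylpy will accept the file.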

@knoedlerj
Author

Thanks, that seems to have worked! However, now only some of the entries actually get their methylation levels calculated - currently trying to figure out why.

@knoedlerj
Author

Update - it's only calculating levels for about 10% of the intervals listed and nothing obvious seems different about those intervals (these samples have about 25x coverage so there should be information on most of them). Has this behavior been reported before?

@yupenghe
Owner

I don't think so. It would be a great help for debugging if you could share a subset of the data that reproduces this issue.

@knoedlerj
Author

Can do - I can supply a subset of one of the allc files and the tsv. Even the reduced allc is pretty big (414 MB compressed) - how would you like me to send it? Thank you very much for your assistance!

@yupenghe
Owner

If you can set up a link for me to download the data, that will be fine. FTP, Google Drive, etc. will work for me.

@knoedlerj
Author

@yupenghe
Owner

Thanks. I think the problem is that the input tsv file is not sorted. You can use the commands below to sort the file while keeping the header line at the top.

head -n 1 POA_allpairwisegenes.tsv > header
tail -n +2 POA_allpairwisegenes.tsv | sort -k 1,1 -k 2,2g -k 3,3g | cat header - > POA_allpairwisegenes.reformatted.tsv
rm header

The problem should be solved with the sorted file. Please let me know if it works.
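
For anyone who prefers to check or apply the same ordering from Python, here is a minimal sketch. The function names are hypothetical and not part of methylpy. It orders by chromosome name as text, then by start and end as numbers, which should match the `sort` command above when it runs in the C locale. It assumes the first line is a header and every data line has at least three tab-separated fields.

```python
def sort_key(line):
    """Order by chromosome name (as text), then start, then end (as numbers)."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], float(fields[1]), float(fields[2]))


def sort_regions(lines):
    """Return the header line followed by the data lines in sorted order."""
    header = lines[0]
    data = [line for line in lines[1:] if line.strip()]
    return [header] + sorted(data, key=sort_key)


def is_sorted(lines):
    """True if the data lines (everything after the header) are already sorted."""
    keys = [sort_key(line) for line in lines[1:] if line.strip()]
    return all(a <= b for a, b in zip(keys, keys[1:]))
```

Running `is_sorted` on the lines of the region file first shows whether sorting is needed at all; `sort_regions` returns a new list and leaves the input untouched.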

@knoedlerj
Author

Looks like it worked, thank you!! Now to figure out what it all means...

@coralzhang

coralzhang commented Apr 7, 2020

I have a similar issue, but my output tsv file contains only the columns from the bed file; the methylation level columns are empty.
This is the test bed file:
chromosome start end
2 1 40001
2 40001 80001
2 80001 120001
2 120001 160001
This is the output I get:
chromosome start end methylation_level_ACA methylation_level_ACB methylation_level_pCMT3-RNAiA methylation_level_pCMT3-RNAiB
2 1 40001
2 40001 80001
2 80001 120001
2 120001 160001

I checked that my files use tabs as the delimiter and that the allc files are not empty:
2 1647 - CAT 2 20 1
2 1649 + CGT 10 12 1
2 1650 - CGT 16 21 1
2 1653 + CCT 0 12 0
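
One way to narrow this down is to confirm, independently of methylpy, that allc sites actually fall inside the bed intervals. The following is a minimal sketch under several assumptions. It takes the allc columns to be chromosome, position, strand, context, methylated reads, total reads and a methylation call, as the rows above suggest. It treats intervals as inclusive at both ends. `summarize_regions` is a hypothetical helper, and the ratio it reports is a plain sum(mc) / sum(cov), which may differ from how methylpy defines the methylation level.

```python
def summarize_regions(regions, allc_lines):
    """Map each (chrom, start, end) tuple to (site count, sum(mc) / sum(cov)).

    The ratio is None for regions that contain no covered sites.
    """
    totals = {region: [0, 0, 0] for region in regions}  # sites, mc, cov
    for line in allc_lines:
        fields = line.split()
        if len(fields) < 6 or not fields[1].isdigit():
            continue  # skip header or malformed lines
        chrom, pos = fields[0], int(fields[1])
        mc, cov = int(fields[4]), int(fields[5])
        # Simple nested scan; fine for a small test, slow for whole genomes.
        for (r_chrom, start, end), total in totals.items():
            if chrom == r_chrom and start <= pos <= end:
                total[0] += 1
                total[1] += mc
                total[2] += cov
    return {
        region: (sites, mc / cov if cov else None)
        for region, (sites, mc, cov) in totals.items()
    }
```

If a region reports zero sites here even though the allc file has data for that chromosome, a naming or coordinate mismatch between the two files is a likely place to look.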


3 participants