gunzip -c chr*.fa.masked > hg19.fa #13

Open
zhenzuo2 opened this issue Sep 14, 2018 · 5 comments

@zhenzuo2

zhenzuo2 commented Sep 14, 2018

Hello,

When I run

gunzip -c chr*.fa.masked > hg19.fa

in the "Genomic sequence" section of CENTIPEDE.tutorial, I get the following errors:

gzip: chr10.fa.masked: not in gzip format
gzip: chr11.fa.masked: not in gzip format
gzip: chr11_gl000202_random.fa.masked: not in gzip format
...

Am I doing it wrong? Thanks.

@slowkow
Owner

slowkow commented Sep 15, 2018

Yes, gzip will throw that error when you try to decompress (gunzip) a file that is not compressed.

You only need to decompress a file when it is compressed (it will have the .gz file extension).
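One way to avoid guessing from the file extension is to ask gzip itself. The sketch below is only an illustration, not part of the tutorial: the function names `is_gzip` and `cat_maybe_gzip` are made up for this example, and it assumes a POSIX shell with `gzip` on the PATH. It uses `gzip -t` (the integrity test) to decide, per file, whether to decompress or copy the file through unchanged.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the tutorial).

# is_gzip FILE: succeed only if FILE passes gzip's integrity test,
# which fails with "not in gzip format" for plain-text files.
is_gzip() {
    gzip -t "$1" 2>/dev/null
}

# cat_maybe_gzip FILE...: write every file to stdout, decompressing
# only the ones that really are gzip-compressed.
cat_maybe_gzip() {
    for f in "$@"; do
        if is_gzip "$f"; then
            gunzip -c "$f"
        else
            cat "$f"
        fi
    done
}
```

With these defined, `cat_maybe_gzip chr*.fa.masked > hg19.fa` should work whether or not the downloaded files are compressed.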

@slowkow
Owner

slowkow commented Sep 15, 2018

I think you caught a typo in my tutorial! Thanks for reporting it.

I haven't tested it, but I think the correct command is:

cat chr*.fa.masked > hg19.fa

Good luck! Let me know how it goes. Thanks again for reporting the problem.
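Since the command above is untested, a quick sanity check may help. The following is a rough sketch, assuming a POSIX shell with `grep` and `awk`; the function names `count_fasta_records` and `concat_and_check` are invented for this example. It concatenates the inputs and then confirms that the combined file has as many FASTA header lines (lines starting with `>`) as the input files do in total.

```shell
#!/bin/sh
# Hypothetical helpers (names are not from the tutorial).

# count_fasta_records FILE...: print the number of FASTA header lines,
# counted per file and then summed.
count_fasta_records() {
    grep -c '^>' "$@" | awk -F: '{ n += $NF } END { print n + 0 }'
}

# concat_and_check OUT FILE...: concatenate FILEs into OUT, then verify
# that OUT contains as many records as the inputs combined.
concat_and_check() {
    out=$1
    shift
    cat "$@" > "$out" || return 1
    expected=$(count_fasta_records "$@")
    actual=$(count_fasta_records "$out")
    if [ "$expected" -ne "$actual" ]; then
        echo "record count mismatch: expected $expected, got $actual" >&2
        return 1
    fi
    echo "$out: $actual records"
}
```

Usage would be `concat_and_check hg19.fa chr*.fa.masked`. Counting per input file matters here: if an input lacks a trailing newline, the next file's header gets glued onto its last sequence line, and the check reports a mismatch.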

@zhenzuo2
Author

Thank you so much for your prompt reply. Your tutorial is the best tutorial I can find online. It helps a lot for beginners like me.

I also have a question about where to download ENCFF001UUQ.narrowPeak.gz and ENCFF001UUQ_gt8.narrowPeak.gz. You explained it in the DNase-Seq data section, but I am still not sure where to find them. Thanks.

@slowkow
Owner

slowkow commented Sep 16, 2018

It looks like the particular files I used in the tutorial have been archived.

You should find a different study that interests you:

https://www.encodeproject.org/search/?type=Experiment&assay_term_name=DNase-seq&replicates.library.biosample.donor.organism.scientific_name=Homo+sapiens

Here is one possible example, showing the links to the BAM file and the narrowPeak bed file:

[Image: 2018-09-16_17-54-45]

@zhenzuo2
Author

Thanks a lot. It makes sense now.

Best,

Zhen
