No CDS found (proteinToGenome) #106

Open
Zupanova opened this issue Jul 17, 2020 · 5 comments

@Zupanova

Zupanova commented Jul 17, 2020

Hello,
I am trying to map protein coordinates to genome coordinates for Arabidopsis thaliana. I tried both ensDbFromGtf() and ensDbFromGff() to create the EnsDb object. As the protein ID I used the given transcript IDs ("tx_id").
Although the ID and CDS coordinates are definitely contained in the file (I checked manually), I keep getting this message:
Fetching CDS for 1 proteins ... 0 found
Checking CDS and protein sequence lengths ... 0/0 OK
Warning message:
In proteinToGenome(x, EDB_Ath, id = "name", idType = "tx_id") :
No CDS found for: AT1G01470.1

What could be the cause?

P.S. I also tried ensDbFromAH(), but there was no GTF file given for A. thaliana among the 7 found.

Greetings

@jorainer
Owner

Hi, the problem is that the protein annotations needed for proteinToGenome cannot be imported from GTF files. These are only available if the EnsDb is created directly with the Ensembl Perl API. If you tell me the Ensembl version you need (or better, the Ensembl Genomes release version, e.g. 47), I will build the EnsDb for you.
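One way to confirm this for a given database is to ask it whether it carries protein annotations at all. The following is only a minimal sketch: `has_protein_annotations` is a hypothetical helper name, and it assumes the ensembldb package is installed; `hasProteinData()` is the ensembldb accessor it wraps.

```r
# Sketch (hypothetical helper): report whether an EnsDb object contains
# the protein annotations that proteinToGenome() relies on.
has_protein_annotations <- function(edb) {
  if (!requireNamespace("ensembldb", quietly = TRUE)) {
    stop("the ensembldb package is required")
  }
  ensembldb::hasProteinData(edb)
}

# Usage, with EDB_Ath being the EnsDb built via ensDbFromGtf()/ensDbFromGff():
# has_protein_annotations(EDB_Ath)
```

For an EnsDb built from a GTF or GFF file this would be expected to report that no protein data is available, in line with the explanation above.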

@Zupanova
Author

Hi, thanks for the reply. It would be very nice if you could do that for me. I am working with Arabidopsis thaliana TAIR10 sequences, so release version 47 on Ensembl.
I still have one question: for what purpose can the functions ensDbFromGtf() and ensDbFromGff() be used?
Greetings

@jorainer
Owner

You can download the EnsDb SQLite file from here. You can then simply pass the file name (including the path) of this file to the EnsDb constructor to load the resource (i.e. edb <- EnsDb(<sqlite file>)) and use it right away.
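Put together, loading the downloaded file and repeating the mapping could look roughly like the sketch below. The helper names `load_edb` and `map_protein_range`, the file path, and the protein coordinates in the usage lines are assumptions for illustration only; the `proteinToGenome()` arguments mirror the call from the original report.

```r
# Sketch (hypothetical helpers): load an EnsDb from a downloaded SQLite
# file and map a within-protein range to genomic coordinates.
load_edb <- function(sqlite_file) {
  if (!file.exists(sqlite_file)) {
    stop("EnsDb SQLite file not found: ", sqlite_file)
  }
  ensembldb::EnsDb(sqlite_file)
}

map_protein_range <- function(edb, tx_id, start, end) {
  # proteinToGenome() needs protein annotations; fail early if they are missing.
  if (!ensembldb::hasProteinData(edb)) {
    stop("this EnsDb has no protein annotations")
  }
  # The range name carries the transcript ID, matching id = "name".
  rng <- IRanges::IRanges(start = start, end = end, names = tx_id)
  ensembldb::proteinToGenome(rng, edb, id = "name", idType = "tx_id")
}

# Usage (path and coordinates are placeholders):
# edb <- load_edb("path/to/downloaded_EnsDb.sqlite")
# map_protein_range(edb, "AT1G01470.1", start = 5, end = 20)
```

Checking for the file before calling the constructor gives a clear error when the path is wrong.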

@Zupanova
Author

OK. Thank you very much for your help.

@jorainer
Owner

Closing this issue now, feel free to re-open if needed.
