different result comparing to prodigal #42

Open
zzsunday opened this issue Sep 4, 2023 · 3 comments
Labels
question Further information is requested

Comments


zzsunday commented Sep 4, 2023

Hello, something strange is happening.
When I annotate with prokka (version 1.14.6), which uses Prodigal to predict CDS, the region from position 49243 to 50055 is reported as one complete CDS. But when I use bakta (version 1.7.0), which uses Pyrodigal to predict CDS, that CDS is split into two separate CDS. I also checked with BLAST, and the result shows that positions 49243 to 50055 match a specific gene at 100%.

Attached files:
prokka.gff: prokka result
bakta.gff: bakta result
ndm5.out: blastn result
19D44.fasta: sequence to annotate (query sequence)
ndm5.fasta: the ndm5 gene sequence
Archive.zip
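
For anyone wanting to reproduce the comparison, here is a minimal sketch (standard library only) that lists the CDS features overlapping a given region in a GFF3 file, so the prokka and bakta calls can be put side by side. The function name `cds_in_region` is hypothetical, and the sketch assumes both files are tab-separated GFF3 with 1-based inclusive coordinates and an optional trailing `##FASTA` section; it has not been run against the attached files.

```python
from typing import List, Optional, Tuple

Feature = Tuple[str, int, int, str]  # (seqid, start, end, strand)


def cds_in_region(gff_path: str, start: int, end: int,
                  seqid: Optional[str] = None) -> List[Feature]:
    """Return every CDS feature that overlaps the closed interval [start, end].

    Hypothetical helper: assumes standard 9-column, tab-separated GFF3.
    """
    hits: List[Feature] = []
    with open(gff_path) as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # embedded sequences follow; no more feature lines
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            fields = line.rstrip("\r\n").split("\t")
            if len(fields) < 9 or fields[2] != "CDS":
                continue
            if seqid is not None and fields[0] != seqid:
                continue
            f_start, f_end = int(fields[3]), int(fields[4])
            # Two closed intervals overlap when each starts before the other ends.
            if f_start <= end and f_end >= start:
                hits.append((fields[0], f_start, f_end, fields[6]))
    return hits


# Example usage with the file names from this issue (paths are assumptions):
# for name in ("prokka.gff", "bakta.gff"):
#     print(name, cds_in_region(name, 49243, 50055))
```

If the two tools disagree as described, the first file should yield one feature spanning the region and the second should yield two shorter ones.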

Owner

althonos commented Sep 5, 2023

Hi @zzsunday, Pyrodigal and Prodigal are expected to give different results unless you compiled Prodigal from source, because Pyrodigal fixes several bugs in Prodigal that affect gene scoring. Otherwise, it was tested extensively before being integrated into bakta (see #21).

@althonos althonos added the question Further information is requested label Sep 5, 2023
Author

zzsunday commented Sep 5, 2023

Thank you @althonos,
but another question has come up.
I assembled the genome with SPAdes and annotated the assembly with bakta. I also extracted one specific contig, contig40 (named NODE_40_length_21390_cov_105.006 in the SPAdes output), from the assembly and annotated it with bakta on its own. Interestingly, I got two different CDS predictions for the same sequence (positions 5813 to 9110).
Attached files:
101.fasta: SPAdes assembly result
101.gff3: SPAdes assembly result annotated by bakta
contig40.fasta: single contig extracted from the SPAdes assembly result (named NODE_40_length_21390_cov_105.006 there)
contig40.gff3: single contig annotated by bakta
Uploading Archive.zip…

Owner

althonos commented Sep 8, 2023

Were you running in meta or single mode?
