fixstart stringency #163

Open
rjsorr opened this issue Apr 25, 2020 · 1 comment

Comments


rjsorr commented Apr 25, 2020

Hi, I want to run fixstart so that each contig starts at the coi gene of the mitochondrion. I have provided 40 coi genes in nucleotide format, broadly representing my phylum of interest. When I run fixstart, it finds a provided coi gene in only about 10% of the input contigs before defaulting to prodigal. All contigs are complete mitochondrial genomes from the same phylum, so every one of them should contain coi. Basically, I want it to start from the coi gene or nothing, even dropping prodigal if need be. So, two questions:

  1. How do I lower the stringency so that it finds the coi gene using the coi nucleotide sequences (database) I provided, and what are good settings? I have tried a range of --min_id and --mincluster values, down to 10 for both, with no real improvement (though this is not easy to quantify with an input of 400 mitochondria). I could of course increase the size of the coi database, but this seems defeatist. Would it not be more logical to provide a protein sequence for stringency purposes?
  2. Can prodigal be turned off? Or would it be possible, at a later date, to let the user give the program a choice of mt genes (most people seem to choose coi as the default start gene for mitogenomes)?

cheers


rjsorr commented Apr 25, 2020

Replying to myself: after a bit of playing around, --min_id 25 with --mincluster 20 (the default) turned out to be the least stringent setting for my dataset, but this is hard to establish fully because the results aren't consistent. If I execute the same command multiple times, the program gives different results. My conclusion, at least, is that the database needs to be larger... shame!
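
For anyone trying to quantify the run-to-run differences, here is a minimal Python sketch that compares the FASTA output of two runs and lists the contigs whose sequence changed. It is not part of fixstart. The function names `read_fasta` and `changed_contigs` are made up for illustration, and it assumes that contig IDs are identical across runs and that a different start position shows up as a different sequence string.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {contig_id: sequence} (ID = first word of the header)."""
    records: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = []
        elif current is not None:
            # Upper-case so that soft-masked bases compare equal
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def changed_contigs(run_a: Dict[str, str], run_b: Dict[str, str]) -> List[str]:
    """Return sorted IDs of contigs that differ between two runs or are missing from one."""
    names = set(run_a) | set(run_b)
    return sorted(name for name in names if run_a.get(name) != run_b.get(name))
```

To use it, open the output FASTA from each run (for example hypothetical files `run1.fasta` and `run2.fasta`), pass each file handle to `read_fasta`, and pass the two resulting dictionaries to `changed_contigs`. An empty list means the two runs rotated every contig identically.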

I still need an answer to the last part of question 1 and to question 2.

regards
