fixstart stringency #163

rjsorr · 2020-04-25T12:26:22Z

Hi, I want to run fixstart so that each contig starts at the mitochondrial coi gene. I have provided 40 coi genes in nucleotide format, broadly representing my phylum of interest. When I run fixstart, it finds a provided coi gene in only approx. 10% of the input contigs before defaulting to prodigal (all contigs are complete mitochondrial genomes from the same phylum, so they all contain coi). Basically, I want it to start from the coi gene or nothing, even dropping prodigal if need be. So, two questions:

1. How do I lower the stringency so that it finds the coi gene using the coi nucleotide (database) sequences provided (what are good settings)? I have tried a range of --min_id and --mincluster values, down to 10 for both, with no real improvement (not easy to quantify with an input of 400 mitochondria). I could of course increase the size of the coi database, but this seems defeatist. Would it not be more logical to provide a protein sequence, for stringency purposes?
2. Can prodigal be turned off? Or is it possible to provide the program with a choice of mt genes at a later date (most people seem to choose coi as the default start gene for mitogenomes)?

cheers

rjsorr · 2020-04-25T16:07:23Z

Replying to myself: after a bit of playing around, --min_id 25 with --mincluster 20 (default) was the least stringent combination for my dataset, although this is not easy to establish fully because the results aren't consistent. If I execute the same command multiple times, the program gives different results. The conclusion, at least, is that the database needs to be larger... shame!
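For anyone wanting to repeat this kind of comparison, here is a rough sketch of how the parameter sweep could be scripted rather than run by hand. It only builds the command lines and does not run them. The file names, the output directory and the helper name `fixstart_commands` are made up for illustration. It assumes the usual `circlator fixstart [options] <assembly.fasta> <outprefix>` invocation, with the gene set passed via `--genes_fa`.

```python
from itertools import product


def fixstart_commands(assembly, genes_fa, min_ids, minclusters, out_dir="sweep"):
    """Build one `circlator fixstart` argument list per (min_id, mincluster) pair.

    Hypothetical helper: nothing is executed here, the lists are only returned
    so they can be inspected or passed to subprocess.run() later.
    """
    commands = []
    for min_id, mincluster in product(min_ids, minclusters):
        # Hypothetical output prefix, one per parameter combination.
        prefix = f"{out_dir}/id{min_id}_mc{mincluster}"
        commands.append([
            "circlator", "fixstart",
            "--genes_fa", genes_fa,            # FASTA of start genes (the coi set here)
            "--min_id", str(min_id),           # minimum percent identity of the match
            "--mincluster", str(mincluster),   # cluster length setting discussed above
            assembly, prefix,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder file names; substitute the real assembly and coi database.
    for cmd in fixstart_commands("mitogenomes.fasta", "coi_nt.fasta",
                                 [25, 50, 70], [10, 20]):
        print(" ".join(cmd))
```

Each printed line can be pasted into a shell, or the lists passed to `subprocess.run`. The output directory presumably has to exist first. Running each combination more than once would also show how much the results vary between identical runs.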

Still need an answer to the last part of 1 and 2.

regards
