Trouble testing on chromosomes #32

Closed
Liqueurdefehling opened this issue Apr 15, 2022 · 9 comments
Labels
help wanted Extra attention is needed

Comments

@Liqueurdefehling

Hi,
I am testing Platon 1.6 on the E. coli chromosome (accession number CP027572.1) as well as on the bacterial chromosomes CP045233.1 and CP011509.1.
platon [-c] --db /env/ig/biobank/by-soft/platon/1.6/db/ --output …/test_ecoli_c/ --verbose …/ecoli.fasta
There is no output when running in accuracy mode. When launched in -c mode, I get a table with one row, the ID being the sequence ID and the RDS being negative; the chromosome.fasta file is empty, whereas the sequence is in the plasmid.fasta file.
The same thing happens when I try an input file containing both chromosome and plasmid sequences: all sequences end up in the plasmid.fasta file.
Any idea what I might be missing?
Best regards

@oschwengers
Owner

Hi @Liqueurdefehling,
Though this might sound confusing at first, it is actually the expected behavior. Platon was designed to classify draft contigs and thus extract plasmid-borne contigs. To do so, one can adjust the sensitivity/specificity trade-off by running Platon in either sensitivity, accuracy or specificity mode via the --mode parameter.

In addition to the normal operation described above, one can also use Platon to characterize (NOT classify) all plasmids via --characterize.

In that context, the above behavior is expected: in characterization mode, Platon executes the full characterization pipeline, which is why all contigs are handled as plasmid-borne. I agree that in this case the output might be misleading and that it deserves some improvement.
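
To make the difference between the two ways of running Platon concrete, here is a minimal sketch that only assembles the two command lines and prints them, without executing anything. It uses only options mentioned in this thread (--db, --output, --mode, --characterize, --verbose). The helper name `build_platon_cmd` and all paths are hypothetical placeholders, not part of Platon.

```python
import shlex

# Mode names as listed in this thread for the --mode parameter.
MODES = ("sensitivity", "accuracy", "specificity")


def build_platon_cmd(fasta, db, output, mode=None, characterize=False, verbose=False):
    """Assemble a Platon command line as an argv list (nothing is executed).

    Hypothetical helper for illustration only.
    """
    if mode is not None and mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["platon", "--db", db, "--output", output]
    if characterize:
        # Characterization: describe every contig, no plasmid/chromosome call.
        cmd.append("--characterize")
    elif mode is not None:
        # Classification: predict plasmid-borne contigs in the chosen mode.
        cmd += ["--mode", mode]
    if verbose:
        cmd.append("--verbose")
    cmd.append(fasta)
    return cmd


# Placeholder paths; replace with your own database, output and input locations.
classify = build_platon_cmd("contigs.fasta", "db/", "out_classify/", mode="accuracy")
characterize = build_platon_cmd("contigs.fasta", "db/", "out_characterize/", characterize=True)

print(shlex.join(classify))
print(shlex.join(characterize))
```

Writing the two runs to separate output directories keeps the classification result and the characterization table side by side for comparison.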

oschwengers added the "help wanted" label on Apr 19, 2022
@Gian77

Gian77 commented Jul 20, 2022

Thank you for the explanation, it makes sense now. I had similar results using the --characterize option: all the contigs were written into the <prefix>.plasmid.fasta file while the <prefix>.chromosome.fasta file was empty. Can this be changed so that the plasmids that had hits are automatically written into a file for further use?
Great tool anyway, thanks a lot!
G

@Gian77

Gian77 commented Jul 20, 2022

I have tried to compare the two outputs with (bottom, second cat) and without (top, first cat) the --characterize option, and I am not sure how to interpret the result. Why are the 2 contigs NODE_5 and NODE_11, which were included in the <prefix>.plasmid.fasta, not marked as having any plasmid hits, even when using the --characterize option? Thanks a lot. G
[Image: Screenshot from 2022-07-20 15-46-53]

@oschwengers
Owner

oschwengers commented Aug 2, 2022

Hi @Gian77,
the --characterize option simply conducts all characterization tasks without filtering contigs or making any plasmid/chromosome prediction. It's just a convenience option to characterize all contigs.

If you'd like to predict plasmid-borne contigs, then you should use Platon in the default mode without --characterize. In your example, NODE_5 and NODE_11 are predicted to be plasmid-borne.

@Gian77

Gian77 commented Sep 16, 2022

Hey @oschwengers,

thanks for the explanation, very useful. I am still confused, though, about what the # Plasmid Hits field means in the --characterize mode of Platon. I have several contigs with a 1 in that field in characterize mode; shouldn't they match what is predicted in the default mode?

Thanks much!
Gian

@oschwengers
Owner

Hi @Gian77,
well, it depends. Sure, a small contig can have a BLAST+ hit against a reference plasmid. But this might also be a small part of a mobile element or a fragment thereof, for example an IS, a transposon or even just a transposase. To filter out these potential false positives, Platon screens for contigs with a sufficiently high RDS. Only after this initial screening step are the remaining contigs characterized. This also speeds up the entire process significantly.
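
One way to see which contigs are affected is to list those that report at least one plasmid hit in the characterization table but do not appear in the default-mode result. The sketch below assumes both tables are tab-separated and carry the "ID", "RDS" and "# Plasmid Hits" columns named in this thread. The layout is an assumption, and the contig names and values are made up for illustration.

```python
import csv
import io


def read_table(handle):
    """Read a tab-separated result table into {contig ID: row}."""
    return {row["ID"]: row for row in csv.DictReader(handle, delimiter="\t")}


def hits_without_prediction(characterized, predicted):
    """Contig IDs with >= 1 plasmid hit that are absent from the default-mode table."""
    return sorted(
        contig_id
        for contig_id, row in characterized.items()
        if int(row["# Plasmid Hits"]) > 0 and contig_id not in predicted
    )


# Made-up example tables (NOT real Platon output); in practice, open the two TSV files.
characterize_tsv = (
    "ID\tRDS\t# Plasmid Hits\n"
    "contig_A\t-3.2\t1\n"
    "contig_B\t1.5\t2\n"
    "contig_C\t-8.0\t0\n"
)
default_tsv = (
    "ID\tRDS\t# Plasmid Hits\n"
    "contig_B\t1.5\t2\n"
)

characterized = read_table(io.StringIO(characterize_tsv))
predicted = read_table(io.StringIO(default_tsv))

# contig_A has a plasmid hit but is not in the default-mode table.
print(hits_without_prediction(characterized, predicted))  # → ['contig_A']
```

Contigs returned by this comparison are the ones where a plasmid hit alone did not lead to a plasmid prediction, for instance because of a fragment of a mobile element as described above.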

oschwengers pinned this issue on Sep 21, 2022
@Gian77

Gian77 commented Sep 21, 2022

Hello @oschwengers,
Thanks for the explanation. So this means that the characterization may not be 100% correct for the reasons you mention above, while the default mode is correct since it is performed after screening for possible false positives. In the end I should trust the default mode results, correct?
Thanks a lot,
Gian

@oschwengers
Owner

Well, not exactly. The characterization is correct in terms of the descriptions. This step does not classify by any means, it merely provides all information on all contigs.
For an actual classification (chromosome/plasmid), you should use Platon in the default (accuracy) mode.

@Gian77

Gian77 commented Sep 24, 2022

OK @oschwengers, I will look into the manual. I think I did not specify accuracy mode when I ran it. Thanks a lot,
Gian
