Same results whether or not hmmer/blast results included #172
Comments
Hi,
If the hmmer/pfam info and/or blast hit info was used to guide selection,
then that info should be embedded into both the gff3 file and the headers
of the transdecoder final peptide file.
I agree that it would be peculiar for the results to be exactly the same in
using vs. not using that info.
Note: be sure to use the latest TransDecoder if you're using hmmsearch
rather than hmmscan. Earlier versions were compatible only with hmmscan
results, but the latest version is compatible with both, in case that's
part of the issue.
best,
~b
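One rough way to run the check described above is to scan the headers of the final peptide file for Pfam accessions. The sketch below is only an illustration: it assumes the Pfam evidence shows up in the header as a standard Pfam accession (`PF` followed by five digits), and the function name `summarize_pep_headers` is made up for this example. Blast hit identifiers depend on the database that was searched, so a different pattern would have to be passed in for those.

```python
import re

# Pfam accessions have the form PF + five digits, optionally with a version
# suffix (e.g. PF00069.28). Assumption: they appear verbatim in the header.
PFAM_RE = re.compile(r"PF\d{5}(?:\.\d+)?")


def summarize_pep_headers(lines, pattern=PFAM_RE):
    """Return (total_headers, headers_matching_pattern) for FASTA lines."""
    total = 0
    annotated = 0
    for line in lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        total += 1
        if pattern.search(line):
            annotated += 1
    return total, annotated
```

Calling it with an open file handle on the `.pep` file gives the two counts. If the second count is zero, the homology evidence probably did not make it into the prediction step.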
On Sun, Mar 19, 2023 at 11:22 AM kdarragh1994 wrote:
I have run two trinity assemblies through the TransDecoder pipeline and
was surprised to find that in one case the exact same number of transcripts
was found whether or not I included the blast and hmmer information, and in
the other case, more transcripts (~1000 more) were found when including the
blast and hmmer results. Is this normal? Is there any way to check that the
hmmer and blast hits were definitely used to make the final prediction?
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas <http://broad.mit.edu/~bhaas>
Thank you, I think that was it, I now see the information in the gff3 file. Is it normal for there to be more transcripts retained following inclusion of pfam and blast info? I was expecting most transcripts to be removed resulting in a more reduced dataset.
Sounds good.
The blast and pfam evidence mostly just adds coding sequences that
otherwise didn't have sufficient coding metrics to be included based on
sequence composition alone, so it's totally normal to expect more rather
than fewer to be included. It's a way to better ensure potentially
important sequences aren't excluded.
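To see which ORFs were added by the homology evidence, the record IDs of the two final peptide files can be compared. This is a minimal sketch, assuming both runs started from the same set of candidate ORFs so that the IDs are comparable. The helper names `fasta_ids` and `compare_runs` are hypothetical.

```python
def fasta_ids(lines):
    """Return the set of record IDs (first token after '>') in FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def compare_runs(without_evidence, with_evidence):
    """Compare ORF IDs from a run without and a run with pfam/blast evidence."""
    base = fasta_ids(without_evidence)
    guided = fasta_ids(with_evidence)
    return {
        "gained": guided - base,  # present only when evidence was supplied
        "lost": base - guided,    # present only in the run without evidence
        "shared": base & guided,
    }
```

If the explanation above holds, "gained" should contain the extra ORFs and "lost" should be empty or nearly so.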
Perfect, thank you!