colnames and EC number #776

Open
SueFletcher opened this issue Jan 8, 2024 · 7 comments

@SueFletcher

SueFletcher commented Jan 8, 2024

Hello,
I want to thank you for this amazing tool.
I tried it and it ran really fast.
I used these commands, as you suggested:

downloading the tool

wget http://github.com/bbuchfink/diamond/releases/download/v2.1.8/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz

creating a diamond-formatted database file

./diamond makedb --in reference.fasta -d reference

running a search in blastp mode

./diamond blastp -d reference -q queries.fasta -o matches.tsv

running a search in blastx mode

./diamond blastx -d reference -q reads.fasta -o matches.tsv

Now I'm wondering what the column names are for my output file:
PNEG_00003T0 sp|Q09895|YAI8_SCHPO 45.7 387 195 7 1 373 1 386 7.08e-114 340
PNEG_00003T0 sp|Q9Y282|ERGI3_HUMAN 38.2 387 215 8 8 383 8 381 4.22e-83 261

Second question: how do I set parameters such as e-value, qcov_hsp_perc, etc.?
Final question: how can I determine EC numbers from this output file?
Thank you in advance!

@bbuchfink
Owner

The columns are explained here: https://github.com/bbuchfink/diamond/wiki/1.-Tutorial
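
As a quick illustration, the twelve fields in the sample rows match the default BLAST-style tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The sketch below prepends that header to a copy of the two rows from the question. The file names are made up for the example, and it assumes the search was run without a custom output format.

```shell
# Scratch copy of the two rows from the question, so the snippet runs on its
# own. The file names sample_matches.tsv / sample_with_header.tsv are made up.
printf 'PNEG_00003T0\tsp|Q09895|YAI8_SCHPO\t45.7\t387\t195\t7\t1\t373\t1\t386\t7.08e-114\t340\n' > sample_matches.tsv
printf 'PNEG_00003T0\tsp|Q9Y282|ERGI3_HUMAN\t38.2\t387\t215\t8\t8\t383\t8\t381\t4.22e-83\t261\n' >> sample_matches.tsv

# Field names of the default BLAST-style tabular format, in column order.
header='qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'

# Write the header as one tab-separated line, then append the hits unchanged.
{
  printf '%s\n' "$header" | tr ' ' '\t'
  cat sample_matches.tsv
} > sample_with_header.tsv
```

Read this way, the first sample row is a hit of PNEG_00003T0 against sp|Q09895|YAI8_SCHPO with 45.7% identity over an alignment of length 387, an e-value of 7.08e-114 and a bit score of 340.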

Second question: how do I set parameters such as e-value, qcov_hsp_perc, etc.?

All options are explained in the Wiki.
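
For orientation only: to my knowledge the search-time thresholds are set with options such as --evalue, --id and --query-cover, but the Wiki is the authority on exact names and semantics. An existing result file can also be filtered after the fact. The sketch below does that with awk on the default 12-column output; the thresholds and file names are arbitrary examples, and query coverage cannot be filtered this way because the default columns do not include the query length.

```shell
# Search-time variant (not executed here; check the Wiki for the options):
#   ./diamond blastp -d reference -q queries.fasta -o matches.tsv \
#       --evalue 1e-5 --id 40 --query-cover 50

# Scratch copy of the two rows from the question (made-up file name).
printf 'PNEG_00003T0\tsp|Q09895|YAI8_SCHPO\t45.7\t387\t195\t7\t1\t373\t1\t386\t7.08e-114\t340\n' > sample_matches.tsv
printf 'PNEG_00003T0\tsp|Q9Y282|ERGI3_HUMAN\t38.2\t387\t215\t8\t8\t383\t8\t381\t4.22e-83\t261\n' >> sample_matches.tsv

# Keep hits with e-value <= 1e-5 (column 11) and identity >= 40% (column 3).
# "+ 0" forces awk to compare the values as numbers, not strings.
awk -F '\t' -v max_evalue=1e-5 -v min_pident=40 \
  '($11 + 0) <= (max_evalue + 0) && ($3 + 0) >= (min_pident + 0)' \
  sample_matches.tsv > sample_filtered.tsv
```

With these example thresholds only the first sample row survives, since the second has 38.2% identity.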

Final question: how can I determine EC numbers from this output file?

These mappings can be downloaded from sites such as UniProt. I'm not aware of a tool to do this though.
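
One way this could be done by hand, as a rough sketch: extract the accession from the subject ID (the middle part of sp|Q09895|YAI8_SCHPO) and look it up in a two-column accession-to-EC table. The mapping file below is hypothetical, its name and layout are assumptions, and the EC value in it is a placeholder rather than a real annotation; a real table would have to be obtained from a resource such as UniProt.

```shell
# Scratch copy of the two rows from the question (made-up file name).
printf 'PNEG_00003T0\tsp|Q09895|YAI8_SCHPO\t45.7\t387\t195\t7\t1\t373\t1\t386\t7.08e-114\t340\n' > sample_matches.tsv
printf 'PNEG_00003T0\tsp|Q9Y282|ERGI3_HUMAN\t38.2\t387\t215\t8\t8\t383\t8\t381\t4.22e-83\t261\n' >> sample_matches.tsv

# HYPOTHETICAL mapping table: accession <TAB> EC number.
# "x.x.x.x" is a placeholder, not the real EC number of this protein.
printf 'Q09895\tx.x.x.x\n' > sample_acc2ec.tsv

# Load the mapping first, then append the EC number (or NA) as a 13th column.
awk -F '\t' -v OFS='\t' '
  NR == FNR { ec[$1] = $2; next }   # first file: accession -> EC
  {
    split($2, id, "[|]")            # sp|Q09895|YAI8_SCHPO -> id[2] = Q09895
    acc = id[2]
    print $0, ((acc in ec) ? ec[acc] : "NA")
  }
' sample_acc2ec.tsv sample_matches.tsv > sample_matches_ec.tsv
```

Hits whose accession is missing from the table get NA, so unmapped proteins stay visible instead of being dropped silently.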

@SueFletcher
Author

@bbuchfink Thank you, I didn't notice that.
For clustering, there is diamond cluster for proteins; is there a solution for clustering nucleotide sequences?

@bbuchfink
Owner

There are other tools that can do that, but diamond works only on proteins.

@SueFletcher
Author

@bbuchfink Thank you.
Otherwise, can I still apply blastx using swissprot by applying this command: ./diamond blastx -d swissprot -q queries.fasta -o matches.tsv? Or does only ./diamond blastp -d swissprot -q queries.fasta -o matches.tsv work?

@SueFletcher
Author

@bbuchfink Sorry again, but in this command: diamond cluster -d INPUT_FILE -o OUTPUT_FILE --approx-id 30 -M 64G
how can I pass the path of my protein FASTA input file?
-d is for the database and -o is for the output :/

@AMbioinformatics

Supported input formats are FASTA and DIAMOND (.dmnd), so you can also provide your FASTA file after -d.

@bbuchfink
Owner

Otherwise, can I still apply blastx using swissprot by applying this command:

Yes, of course you can use the blastx mode of diamond.
