colnames and EC number #776

SueFletcher · 2024-01-08T10:15:11Z

Hello,
I want to thank you for this amazing tool.
I tried it and it ran really fast.
I used these commands, as you suggested:

downloading the tool

wget http://github.com/bbuchfink/diamond/releases/download/v2.1.8/diamond-linux64.tar.gz
tar xzf diamond-linux64.tar.gz

creating a diamond-formatted database file

./diamond makedb --in reference.fasta -d reference

running a search in blastp mode

./diamond blastp -d reference -q queries.fasta -o matches.tsv

running a search in blastx mode

./diamond blastx -d reference -q reads.fasta -o matches.tsv

Now I'm wondering what the column names are for my output file:
PNEG_00003T0 sp|Q09895|YAI8_SCHPO 45.7 387 195 7 1 373 1 386 7.08e-114 340
PNEG_00003T0 sp|Q9Y282|ERGI3_HUMAN 38.2 387 215 8 8 383 8 381 4.22e-83 261

Second question: how do I set parameters such as e-value, qcov_hsp_perc, etc.?
Final question: how can I determine the EC number from this output file?
Thank you in advance!

bbuchfink · 2024-01-08T13:19:45Z

The columns are explained here: https://github.com/bbuchfink/diamond/wiki/1.-Tutorial
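
For the two rows pasted above, the twelve fields match the standard BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). A minimal Python sketch for attaching those names when reading the file could look like the following; the function name `read_hits` is only illustrative, and the sketch assumes the search was run without a custom `--outfmt` field list.

```python
import csv
import io

# Field order of the default tabular output (standard BLAST tabular layout).
# Assumption: no custom --outfmt field list was requested.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def read_hits(handle):
    """Yield one dict per alignment row, keyed by column name."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        for name in INT_FIELDS:
            hit[name] = int(hit[name])
        for name in FLOAT_FIELDS:
            hit[name] = float(hit[name])
        yield hit


# First row from the question, tab-separated as in matches.tsv.
sample = (
    "PNEG_00003T0\tsp|Q09895|YAI8_SCHPO\t45.7\t387\t195\t7\t1\t373\t1\t386"
    "\t7.08e-114\t340\n"
)
hits = list(read_hits(io.StringIO(sample)))
print(hits[0]["sseqid"], hits[0]["pident"], hits[0]["evalue"])
```

For a real run, `open("matches.tsv", newline="")` would replace the `io.StringIO` sample.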

how to parameter the tool in term of e-value, qcov_hsp_perc etc

All options are explained in the Wiki.
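
As a rough illustration, `qcov_hsp_perc` is a BLAST+ option name; the closest DIAMOND options appear to be `--evalue`, `--query-cover` and `--id`, but please check the Wiki for the exact semantics in your version. The sketch below only assembles the argument list; the helper name `diamond_blastp_args` and the threshold values are made up for the example.

```python
def diamond_blastp_args(db, query, out, evalue=None, query_cover=None, min_identity=None):
    """Build an argument list for a blastp run with optional filters.

    Assumption: --evalue, --query-cover and --id are the relevant DIAMOND
    options; verify them against the Wiki for the installed version.
    """
    args = ["./diamond", "blastp", "-d", db, "-q", query, "-o", out]
    if evalue is not None:
        args += ["--evalue", str(evalue)]
    if query_cover is not None:
        args += ["--query-cover", str(query_cover)]
    if min_identity is not None:
        args += ["--id", str(min_identity)]
    return args


# Illustrative thresholds only, not recommendations.
args = diamond_blastp_args(
    "reference", "queries.fasta", "matches.tsv", evalue=1e-10, query_cover=50
)
print(" ".join(args))
```

Passing the list to `subprocess.run(args, check=True)` would start the search.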

final question how I could determine EC number from this output file !!

These mappings can be downloaded from sites such as UniProt. I'm not aware of a tool that does this, though.
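
Without a dedicated tool, one option is a small script that pulls the accession out of each subject ID (for example `Q09895` from `sp|Q09895|YAI8_SCHPO`) and looks it up in a downloaded mapping table. The sketch below assumes a hypothetical two-column, tab-separated file (accession, EC number) with a header line; the actual layout depends on what you export, and the EC value shown is a placeholder, not a real annotation.

```python
import csv
import io


def accession_from_sseqid(sseqid):
    """Return the accession from an ID like 'sp|Q09895|YAI8_SCHPO'.

    IDs that do not have three '|'-separated parts are returned unchanged.
    """
    parts = sseqid.split("|")
    return parts[1] if len(parts) == 3 else sseqid


def load_ec_table(handle):
    """Read a hypothetical 'accession<TAB>EC number' table into a dict."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header line
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def annotate_with_ec(sseqids, ec_by_accession):
    """Map each subject ID to its EC number, or '' when none is listed."""
    return {s: ec_by_accession.get(accession_from_sseqid(s), "") for s in sseqids}


# Placeholder table: 'x.x.x.x' stands in for whatever the download contains.
table = io.StringIO("Entry\tEC number\nQ09895\tx.x.x.x\n")
ec_by_accession = load_ec_table(table)
result = annotate_with_ec(
    ["sp|Q09895|YAI8_SCHPO", "sp|Q9Y282|ERGI3_HUMAN"], ec_by_accession
)
print(result)
```

Subject IDs missing from the table come back with an empty string, so hits without an EC annotation are easy to spot.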

SueFletcher · 2024-01-08T13:36:11Z

@bbuchfink Thank you, I didn't notice that.
For clustering, there is diamond cluster for proteins. Is there a solution for clustering nucleotide sequences?

bbuchfink · 2024-01-08T13:39:06Z

There are other tools that can do that, but diamond works only on proteins.

SueFletcher · 2024-01-08T13:47:44Z

@bbuchfink Thank you.
Otherwise, can I still run blastx against SwissProt with this command: ./diamond blastx -d swissprot -q queries.fasta -o matches.tsv ? Or does only ./diamond blastp -d swissprot -q queries.fasta -o matches.tsv work?

SueFletcher · 2024-01-08T14:30:01Z

@bbuchfink Sorry again, but in this command: diamond cluster -d INPUT_FILE -o OUTPUT_FILE --approx-id 30 -M 64G
how can I pass the path of my protein FASTA input file?
-d is for the database and -o is for the output :/

AMbioinformatics · 2024-01-09T19:03:00Z

Supported formats are FASTA and DIAMOND (.dmnd), so you can also pass your FASTA file after -d.

bbuchfink · 2024-01-10T10:42:30Z

otherwise I still can apply blastx using swissprot by applying this command:

Yes, of course you can use the blastx mode of diamond.
