Database Setup #8
Comments
Hi Aditya,
Sorry for the delayed reply - it's been a busy few days of deadlines.
This is a good question. There are some instructions on this here:
https://github.com/sharpton/shotmap/blob/master/docs/build_shotmap_searchdb.pl.md
See in particular this note:
-r, --refdb=/PATH/TO/REFERENCE/FLATFILES (REQUIRED argument) NO DEFAULT VALUE
Location of the protein family reference data. Each family must have a HMM
(if running HMMER tools) or a set of protein sequences, in fasta
format, that are members of the family (if running blast-like tools).
Files in this directory should correspond to an individual family, with
the prefix of the file being the family identifier (e.g., IPR020405) and
the suffix should either be .hmm (for HMMs) or .fa (for protein sequences).
These files can be placed in any subdirectory structure within this upper
level directory; shotmap will recurse through all subdirectories and append
all appropriate .hmm or .fa files to the list of families that will be
incorporated into the search database.
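To make the layout in that note concrete, here is a minimal Python sketch that walks a reference directory the way the note describes and groups files by family identifier. It is not shotmap's own code; the function name `list_family_files` is hypothetical, and treating everything before the final `.hmm` or `.fa` as the family identifier is my reading of the note.

```python
import os


def list_family_files(refdb, suffixes=(".hmm", ".fa")):
    """Return {family_id: [paths]} for every .hmm/.fa file under refdb.

    Hypothetical helper: it only mirrors the layout described in the
    --refdb note and is not part of shotmap.
    """
    families = {}
    # os.walk recurses through every subdirectory below refdb.
    for dirpath, _dirnames, filenames in os.walk(refdb):
        for name in sorted(filenames):
            family_id, suffix = os.path.splitext(name)
            if suffix in suffixes:
                path = os.path.join(dirpath, name)
                families.setdefault(family_id, []).append(path)
    return families
```

Running this on your directory before building the search database is a quick way to check that every file has the prefix and suffix you expect.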
In short, you want to create a directory of fasta files, where each file
represents a distinct KO and contains the KEGG sequences that are members of
that KO. Each file should be named with the KO identifier as its prefix and
.fa as its suffix.
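As a rough illustration of building that directory, the sketch below splits one fasta file into per-KO files. It rests on assumptions that may not match your KEGG download: that you have a single protein fasta whose header lines begin with the gene identifier, and a tab-separated mapping file with the gene identifier in the first column and a bare KO identifier (e.g., K00001) in the second. The function names and file layout are hypothetical, and none of this is part of shotmap.

```python
import os
from collections import defaultdict


def read_gene_to_ko(mapping_path):
    """Read an ASSUMED two-column, tab-separated gene -> KO mapping."""
    gene_to_kos = defaultdict(set)
    with open(mapping_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2 or not fields[0] or not fields[1]:
                continue  # skip blank or malformed lines
            gene_to_kos[fields[0]].add(fields[1])
    return gene_to_kos


def split_fasta_by_ko(fasta_path, mapping_path, out_dir):
    """Append each sequence to out_dir/<KO>.fa; return sequences per KO."""
    gene_to_kos = read_gene_to_ko(mapping_path)
    os.makedirs(out_dir, exist_ok=True)
    counts = defaultdict(int)

    def flush(record, kos):
        # Write one complete fasta record to every KO it belongs to.
        for ko in kos:
            with open(os.path.join(out_dir, ko + ".fa"), "a") as out:
                out.writelines(record)
            counts[ko] += 1

    record, kos = [], ()
    with open(fasta_path) as fasta:
        for line in fasta:
            if line.startswith(">"):
                flush(record, kos)  # finish the previous record
                tokens = line[1:].split()
                gene_id = tokens[0] if tokens else ""
                kos = sorted(gene_to_kos.get(gene_id, ()))
                record = []
            # Guarantee a trailing newline so appended records stay separate.
            record.append(line if line.endswith("\n") else line + "\n")
    flush(record, kos)  # the last record has no following header
    return dict(counts)
```

Because files are opened in append mode, start with an empty output directory. Sequences without a KO assignment are skipped. The resulting directory is what you would pass as `--refdb`.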
Does this answer your question? If not, I'm happy to help with additional
questions.
On Sun, Feb 19, 2017 at 2:54 PM, Aditya Bandla <***@***.***> wrote:
Hi
Sorry for the rather naive question, but could you point me to any
tutorials on how to set up, for example, a KEGG protein family database to
use with shotmap?
After having done quite a lot of reading, I am still a bit lost at
this step.
Best,
Aditya
--
Thomas J. Sharpton
Assistant Professor
Department of Microbiology
Department of Statistics
Oregon State University
(541) 737-8623
thomas.sharpton@gmail.com
@tjsharpton
lab.sharpton.org