[New pipeline] DeNovoRepeatLib #32

Open
Juke34 opened this issue May 18, 2020 · 0 comments

Juke34 (Collaborator) commented May 18, 2020

See #17 for the general picture.

Maybe this can be merged with the DeNovoRepeatLib pipeline (see #33).

The purpose of DeNovoRepeatLib is to build a de novo repeat library for a genome.
There are two approaches. Should we use only the standard one, or should we run both solutions in parallel? We could provide an option to choose.
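
One way the "option to choose" could look is sketched below in Python. Everything here is hypothetical: the program name, the `--approach` flag, and its values `standard`, `edta`, and `both` are assumptions made for illustration, and the real pipeline may expose this choice quite differently in whatever workflow language it ends up using.

```python
import argparse

# Hypothetical option values; "both" would run the two solutions side by side.
APPROACHES = ("standard", "edta", "both")


def build_parser():
    """Build a command-line parser with a hypothetical --approach option."""
    parser = argparse.ArgumentParser(prog="denovo_repeat_lib")
    parser.add_argument("--genome", required=True, help="genome FASTA file")
    parser.add_argument(
        "--approach",
        choices=APPROACHES,
        default="standard",
        help="which repeat-library approach to run",
    )
    return parser


def selected_approaches(approach):
    """Expand the option value into the list of approaches to run."""
    if approach == "both":
        return ["standard", "edta"]
    return [approach]


def main(argv):
    args = build_parser().parse_args(argv)
    for name in selected_approaches(args.approach):
        # Placeholder: a real pipeline would launch the matching workflow here.
        print(f"would run the {name} approach on {args.genome}")


# Example invocation with explicit arguments (the file name is a placeholder).
main(["--genome", "genome.fa", "--approach", "both"])
```

Defaulting to `standard` keeps the current behaviour unless a user explicitly asks for EDTA or for both.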

solution 1 (standard):
Input: a genome FASTA file; an existing library (e.g. Dfam or RepBase) to classify the de novo repeats (assign family names); and a protein database (Swiss-Prot eukaryote/prokaryote) to remove potential protein-coding sequences from the repeats.
Output: a repeat library FASTA file
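
To illustrate the role of the protein database in this solution, here is a deliberately simplified Python sketch. It assumes that an earlier search step has already produced a set of repeat IDs with protein hits, and it drops those entries whole from the library. The function names, the sequence IDs, and the drop-whole-entry behaviour are all assumptions for illustration; the actual tools mentioned in this issue may filter in a more refined way.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_protein_like(records, protein_hit_ids):
    """Keep only records whose ID (first word of the header) has no protein hit."""
    for header, seq in records:
        seq_id = header.partition(" ")[0]
        if seq_id not in protein_hit_ids:
            yield header, seq


# Hypothetical de novo repeat library and hypothetical protein-hit IDs.
raw_library = ">rep1#LTR/Gypsy\nACGT\nACGT\n>rep2#Unknown\nTTTT\n"
protein_hits = {"rep2#Unknown"}

filtered = list(drop_protein_like(parse_fasta(raw_library), protein_hits))
for header, seq in filtered:
    print(f">{header}\n{seq}")
```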

For the detailed approach, see the wiki of the annotation cluster repo here and a more condensed description in this post on Biostars.

TransposonPSI is now in bioconda.
protexcluder is available in the nanjiang conda channel; it should be moved into bioconda.
Be careful with the BLAST version (protexcluder needs particular ones).

solution 2: Use EDTA, which is available in conda and consequently as a biocontainer.
