egluckthaler/starfish

starfish is a modular toolkit for giant mobile element annotation. Built primarily for annotating Starship elements in fungal genomes, it can be easily adapted to find any large mobile element (≥6 kb) that shares the same basic architecture as a fungal Starship or a bacterial integrative and conjugative element: a "captain" gene with zero or more "cargo" genes downstream of its 3' end. It is particularly well suited for annotating low-copy-number elements in a content-independent manner.
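To make the captain-plus-cargo architecture concrete, here is a minimal Python sketch of how such an element could be modelled. It is an illustration only, not how starfish represents elements internally: the `Gene` and `Element` classes, their fields, and the 1-based inclusive coordinates are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List

MIN_ELEMENT_LENGTH = 6_000  # the >=6 kb lower bound mentioned above


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def is_downstream_of_3prime(captain: Gene, other: Gene) -> bool:
    """True if `other` lies entirely past the captain's 3' end.

    On the "+" strand the 3' end is the captain's `end` coordinate;
    on the "-" strand it is the captain's `start` coordinate.
    """
    if captain.strand == "+":
        return other.start > captain.end
    return other.end < captain.start


@dataclass
class Element:
    """Hypothetical element: explicit boundaries, one captain, any cargo."""
    start: int
    end: int
    captain: Gene
    cargo: List[Gene] = field(default_factory=list)

    def length(self) -> int:
        return self.end - self.start + 1

    def fits_architecture(self) -> bool:
        # Large enough, and every cargo gene sits downstream of the
        # captain's 3' end. Zero cargo genes is allowed.
        return self.length() >= MIN_ELEMENT_LENGTH and all(
            is_downstream_of_3prime(self.captain, g) for g in self.cargo
        )


if __name__ == "__main__":
    captain = Gene("captain", 1000, 3000, "+")
    element = Element(900, 9000, captain, [Gene("cargoA", 4000, 5000, "+")])
    print(element.length(), element.fits_architecture())
```

The strand check is the part that is easiest to get wrong: "downstream of the 3' end" means higher coordinates for a captain on the forward strand and lower coordinates for one on the reverse strand.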

Overview

The starfish workflow is organized into three main modules: Gene Finder, Element Finder, and Region Finder. Each has dedicated commands that are typically run sequentially. Auxiliary commands that provide additional utilities and generate visualizations are also available through the command line. Several useful stand-alone scripts are located in the /scripts directory.

Documentation

Head to our GitHub Wiki for useful resources, including installation instructions, a manual with important details and considerations for each command, and a step-by-step tutorial. If you run into difficulties, please open an issue on GitHub.

Citations and dependencies

Many starfish commands have dependencies that are stand-alone programs in their own right. If you use starfish in your research, please contact us as it has not yet been published.

Please cite the starfish manuscript in addition to any dependencies you may have used (see the table below for a guide). For example:

We used starfish v1.0.0 (Gluck-Thaler and Vogan 2023) in conjunction with metaeuk (Karin et al. 2020), mummer4 (Marcais et al. 2018), and blastn (Camacho et al. 2009) to annotate and visualize giant mobile elements.

| Command | Dependency | Citation |
| --- | --- | --- |
| annotate, augment | metaeuk, hmmer, bedtools | Karin et al. 2020, Eddy 2011, Quinlan and Hall 2010 |
| insert, extend | blastn, mummer4 | Camacho et al. 2009, Marcais et al. 2018 |
| flank | cnef | Ayad et al. 2018 |
| sim | sourmash | Pierce et al. 2019 |
| group | mcl | Enright et al. 2002 |
| *-viz | circos, gggenomes, mummer4, mafft, minimap2 | Krzywinski et al. 2009, Hackl and Ankenbrand 2022, Marcais et al. 2018, Katoh and Standley, Li 2018 |
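
If you want to assemble the citation list for a given run programmatically, the table can be encoded as a small lookup. The following is a sketch that simply transcribes the table above; the function names are made up for this example, the pairing of each dependency with its citation assumes the two columns are listed in the same order, and any command ending in `-viz` is treated as matching the `*-viz` row.

```python
from typing import Iterable, List

# Dependencies per command, transcribed from the table above.
COMMAND_DEPENDENCIES = {
    "annotate": ["metaeuk", "hmmer", "bedtools"],
    "augment": ["metaeuk", "hmmer", "bedtools"],
    "insert": ["blastn", "mummer4"],
    "extend": ["blastn", "mummer4"],
    "flank": ["cnef"],
    "sim": ["sourmash"],
    "group": ["mcl"],
}

# Shared by every command whose name ends in "-viz" (the "*-viz" row).
VIZ_DEPENDENCIES = ["circos", "gggenomes", "mummer4", "mafft", "minimap2"]

# Citation per dependency, assuming the table lists both in the same order.
DEPENDENCY_CITATIONS = {
    "metaeuk": "Karin et al. 2020",
    "hmmer": "Eddy 2011",
    "bedtools": "Quinlan and Hall 2010",
    "blastn": "Camacho et al. 2009",
    "mummer4": "Marcais et al. 2018",
    "cnef": "Ayad et al. 2018",
    "sourmash": "Pierce et al. 2019",
    "mcl": "Enright et al. 2002",
    "circos": "Krzywinski et al. 2009",
    "gggenomes": "Hackl and Ankenbrand 2022",
    "mafft": "Katoh and Standley",  # no year given in the table
    "minimap2": "Li 2018",
}


def dependencies_for(command: str) -> List[str]:
    """Return the dependencies for one command; unknown commands give []."""
    if command.endswith("-viz"):
        return VIZ_DEPENDENCIES
    return COMMAND_DEPENDENCIES.get(command, [])


def citations_for(commands: Iterable[str]) -> List[str]:
    """Return de-duplicated 'tool (citation)' strings in first-seen order."""
    seen: List[str] = []
    for command in commands:
        for dep in dependencies_for(command):
            entry = f"{dep} ({DEPENDENCY_CITATIONS[dep]})"
            if entry not in seen:
                seen.append(entry)
    return seen


if __name__ == "__main__":
    for line in citations_for(["annotate", "insert", "extend"]):
        print(line)
```

The starfish manuscript itself is not included in the lookup and still needs to be cited alongside whatever this returns.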

License and acknowledgements

Please cite our work if you use starfish in your research:

Gluck-Thaler, E., & Vogan, A. A. (2023). Systematic identification of cargo-carrying genetic elements reveals new dimensions of eukaryotic diversity. bioRxiv, 2023-10.

starfish is an open source tool available under the GNU Affero General Public License version 3.0 or greater. This work was supported by funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement (grant number 890630).