eFISHent

A command-line tool to facilitate the creation of eFISHent single-molecule RNA fluorescence in-situ hybridization (RNA smFISH) oligonucleotide probes.

Description

eFISHent helps you design eFISHent RNA smFISH oligonucleotide probes. Its key features are:

One-line installation using conda (available through bioconda*)
Automatic gene sequence download from NCBI when providing a gene and species name (or pass a FASTA file)
Filtering steps to remove low-quality probes, including off-targets, frequently occurring short-mers, secondary structures, etc.
Mathematical or greedy optimization to ensure highest coverage

* New releases take a while to appear on bioconda. The easiest approach is therefore to install the dependencies with conda and then install eFISHent itself using pip.
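To give a feel for the greedy optimization mentioned in the feature list, here is a minimal sketch of one common greedy strategy: sort candidate probes by end position and keep each probe that starts after the previous one plus a spacer. This is an illustration only, not necessarily how eFISHent implements it. The function name `greedy_select`, the `(start, end)` tuple layout, and the `min_spacing` parameter are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: (start, end) on the target, end exclusive.
Probe = Tuple[int, int]


def greedy_select(candidates: List[Probe], min_spacing: int = 2) -> List[Probe]:
    """Keep non-overlapping probes, always taking the one that ends earliest."""
    selected: List[Probe] = []
    next_free = 0  # first position at which another probe may start
    for start, end in sorted(candidates, key=lambda probe: probe[1]):
        if start >= next_free:
            selected.append((start, end))
            next_free = end + min_spacing
    return selected


if __name__ == "__main__":
    candidates = [(0, 20), (10, 30), (22, 42), (25, 45), (44, 64)]
    print(greedy_select(candidates))
```

A greedy pass like this is fast but only looks at one probe at a time. A mathematical optimization model can instead weigh all candidates jointly, which is presumably why both options are offered.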

Installation

eFISHent is tested on macOS and Linux with Python versions 3.8 - 3.10. Unfortunately, due to the bioinformatics dependencies, Windows is not supported. For Windows users, we recommend installing "Windows Subsystem for Linux (WSL)" (Windows 10, Windows 11) or using a fully fledged virtual machine. Using a conda environment, install eFISHent as follows:

# Create an environment and install all dependencies (e.g. python)
conda env create bbquercus/efishent  

# Activate environment
conda activate efishent

# Install efishent via pypi
pip install efishent

Updates can then be installed from PyPI (pip install --upgrade efishent).
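If you are unsure whether a machine matches the tested configurations listed above, a small pre-flight check can help. The sketch below is not part of eFISHent. The helper name `is_supported` is hypothetical, and the check only encodes the platforms and Python versions stated in this section.

```python
import platform
import sys

# Python 3.8, 3.9 and 3.10, as stated in the installation notes.
SUPPORTED_MINORS = range(8, 11)


def is_supported(version=sys.version_info, system=None) -> bool:
    """Return True on Linux or macOS with Python 3.8 - 3.10."""
    system = platform.system() if system is None else system
    on_tested_os = system in ("Linux", "Darwin")  # "Darwin" is macOS
    return on_tested_os and version[0] == 3 and version[1] in SUPPORTED_MINORS


if __name__ == "__main__":
    print("Tested configuration:", is_supported())
```

Inside WSL, `platform.system()` reports `Linux`, so the check passes there as well.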

Usage

A detailed usage guide can be found on the GitHub wiki, but here is a quick example:

eFISHent --reference-genome <reference-genome> --gene-name <gene> --organism-name <organism>
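When calling eFISHent from a script, for example to loop over several genes, it can be convenient to assemble the command in Python. The sketch below uses only the three options shown in the example above. The helper name `build_command` and the sample values (`genome.fa`, `ACTB`, `Homo sapiens`) are placeholders chosen for illustration.

```python
import shlex
from typing import List


def build_command(reference_genome: str, gene_name: str, organism_name: str) -> List[str]:
    """Assemble the eFISHent call from the quick example as an argument list."""
    return [
        "eFISHent",
        "--reference-genome", reference_genome,
        "--gene-name", gene_name,
        "--organism-name", organism_name,
    ]


if __name__ == "__main__":
    # Placeholder inputs; replace with your own genome file, gene and organism.
    command = build_command("genome.fa", "ACTB", "Homo sapiens")
    print(shlex.join(command))
    # To run it for real (requires eFISHent to be installed):
    # import subprocess
    # subprocess.run(command, check=True)
```

Passing the arguments as a list avoids quoting problems with organism names that contain a space. On the command line, such a name has to be quoted.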

Component overview

eFISHent is built modularly from the following components:

Index creation workflow:

Bowtie index
Jellyfish indices
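Jellyfish is a k-mer counter, so a Jellyfish index is essentially a table of how often each k-mer occurs in the reference. The toy sketch below shows the idea in pure Python on a short string. It is not how Jellyfish works internally and would not scale to a genome. The function names `count_kmers` and `max_kmer_count` are hypothetical.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in the sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def max_kmer_count(probe: str, table: Counter, k: int) -> int:
    """Return the highest reference count among the k-mers of a probe."""
    probe = probe.upper()
    counts = (table[probe[i:i + k]] for i in range(len(probe) - k + 1))
    return max(counts, default=0)


if __name__ == "__main__":
    table = count_kmers("ACGTACGTAA", k=4)
    print(table.most_common(2))
    print(max_kmer_count("ACGTA", table, k=4))
```

A probe whose k-mers appear many times in the reference is more likely to bind elsewhere, which is the motivation for a k-mer filtering step.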

Probe filtering workflow:

Download / prepare sequences
Generate candidate probes
Filter with basic filters
Align probes to reference genome
Filter based on alignment score and uniqueness
Filter recurring k-mers
Filter based on secondary structure prediction
Create final list of probes
Write final list of probes to file with report
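The first steps of the workflow above, candidate generation followed by a basic filter, can be sketched as a sliding window over the target sequence. This is a simplified illustration, not eFISHent's implementation. The list above does not say which basic filters are applied, so GC content is used here purely as a stand-in, and the names `candidate_probes`, `gc_fraction`, `basic_filter` and the threshold values are assumptions.

```python
from typing import Iterable, Iterator, List, Tuple

Candidate = Tuple[int, str]  # (start position, subsequence)


def candidate_probes(target: str, min_length: int, max_length: int) -> Iterator[Candidate]:
    """Yield every window of an allowed length along the target."""
    for length in range(min_length, max_length + 1):
        for start in range(len(target) - length + 1):
            yield start, target[start:start + length]


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def basic_filter(
    candidates: Iterable[Candidate], min_gc: float = 0.2, max_gc: float = 0.8
) -> List[Candidate]:
    """Keep candidates whose GC fraction lies within [min_gc, max_gc]."""
    return [(start, seq) for start, seq in candidates if min_gc <= gc_fraction(seq) <= max_gc]


if __name__ == "__main__":
    windows = candidate_probes("AAAACCCC", min_length=4, max_length=4)
    print(basic_filter(windows))
```

Cheap sequence-only filters like this one run before alignment because they shrink the candidate set that the more expensive steps have to process.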

Probe set analysis plotting:

Create a simple overview of the key parameters

TODO

Add more detailed documentation as wiki page(s)
- Add links to genomes and RNAseq databases
- Add examples from multiple sources
- Add benchmarks for deltaG, counts
Add mathematical description for model (in wiki?)
Add probe set analysis txt file with off-target locations / potentially harmful probes
