Commit

Merge pull request #34 from zktuong/devel
0.0.21
zktuong committed Nov 16, 2020
2 parents 4fa21b5 + 1c6a039 commit 02ecd46
Showing 177 changed files with 19,750 additions and 3,162 deletions.
55 changes: 28 additions & 27 deletions README.md
@@ -2,10 +2,10 @@

![dandelion_logo](notebooks/img/dandelion_logo.png)

Version = 0.0.17
Version = 0.0.21

## Intro
Hi there! I have put together a python package for analyzing single cell BCR/V(D)J data from 10x Genomics 5' solution! It streamlines the pre-processing of immcantation tools for single-cell BCR analysis and includes a couple of functions for visualization.
Hi there! I have put together a python package for analyzing single cell BCR/V(D)J data from 10x Genomics 5' solution! It streamlines the pre-processing, leveraging some tools from immcantation suite, and integrates with scanpy/anndata for single-cell BCR analysis. It also includes a couple of functions for visualization.

Overview

@@ -26,41 +26,40 @@ I would reccomend instaling this in order:
```bash
# in bash/zsh terminal
# create a conda environment with specific modules
conda create --name dandelion python=3.7
conda create --name dandelion python=3.7
conda activate dandelion

# the following two are what's required by scanpy
# the following are normal installation intructions for scanpy
conda install seaborn scikit-learn statsmodels numba pytables
conda install -c conda-forge python-igraph leidenalg
conda install -c conda-forge python-igraph leidenalg
pip install scanpy

# these are required by dandelion
conda install distance scikit-bio joblib plotnine jupyter adjustText
conda install -c bioconda igblast blast changeo presto
conda install -c conda-forge distance joblib plotnine adjustText jupyter
conda install -c bioconda igblast blast # if this doesn't work, download them manually (see below)
conda install -c conda-forge rpy2
# or pip install rpy2
# If it fails because it's compiling using clang, first, work out where the path is to your gcc compiler (use brew to install gcc if needed):
# then run
# env CC=/path/to/location/of/bin/gcc-9 pip install rpy2

# Use pip to install the following with --no-cache-dir --upgrade if necessary
pip install networkx polyleven scrublet scanpy

# and then lastly install this
pip install git+https://github.com/zktuong/dandelion.git

# for the development branch, run this:
pip install git+https://github.com/zktuong/dandelion.git@devel
````

dandelion also requires some R packages intalled.
`dandelion` also requires some R packages intalled.
```R
# in R
install.packages(c("optparse", "alakazam", "tigger", "airr", "shazam"))
```
or the following if using conda to manage R:
```bash
# in bash/zsh terminal
conda install -c bioconda r-optparse
conda install -c conda-forge r-alakazam r-tigger r-airr r-shazam
conda install -c conda-forge r-optparse r-alakazam r-tigger r-airr r-shazam
```

The package should now be properly installed and when starting up jupyter notebook in the virtual environment, the kernel `python3` should work. Otherwise, you might need to add it manually:
@@ -74,6 +73,7 @@ Last but not least, you will need to download the database folder in this reposi

So for example, if I unpack into `~/Documents`
```bash
# in bash/zsh terminal
# set up environmental variables in ~/.bash_profile
export GERMLINE=~/Documents/dandelion/database/germlines/
export IGDATA=~/Documents/dandelion/database/igblast/
@@ -83,30 +83,28 @@ export BLASTDB=~/Documents/dandelion/database/blast/
## External softwares
While blast and igblast executables are managed through conda, you can also download [igblast](https://ftp.ncbi.nih.gov/blast/executables/igblast/release/LATEST/) and [blast+](https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/) manually, and store the softwares somewhere accessible. Just make sure to set the paths to them appropriately.
```bash
# in bash/zsh terminal
# unpack where relevant and export the path to the softwares, e.g. ~/Documents/
echo 'export PATH=~/Documents/software/bin:$PATH' >> ~/.bash_profile
source ~/.bash_profile
```
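
The setup steps above (exporting `GERMLINE`, `IGDATA` and `BLASTDB`, and putting the blast/igblast binaries on `PATH`) can be sanity-checked from Python before starting an analysis. The following is only a minimal sketch and not part of `dandelion`: the helper name `check_setup` is hypothetical, and the executable names `igblastn` and `blastn` are an assumption about which binaries need to be reachable.

```python
import os
import shutil

# Environment variables exported in ~/.bash_profile in the steps above.
REQUIRED_ENV_VARS = ("GERMLINE", "IGDATA", "BLASTDB")
# Assumed executable names (from the igblast and blast+ distributions).
REQUIRED_EXECUTABLES = ("igblastn", "blastn")


def check_setup(env=None, which=shutil.which, isdir=os.path.isdir):
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only, not a dandelion function.
    """
    env = os.environ if env is None else env
    problems = []
    # Each variable must be set and must point to an existing directory.
    for name in REQUIRED_ENV_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not isdir(os.path.expanduser(value)):
            problems.append(f"{name} points to a missing directory: {value}")
    # Each executable must be discoverable on PATH.
    for exe in REQUIRED_EXECUTABLES:
        if which(exe) is None:
            problems.append(f"{exe} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_setup():
        print("WARNING:", problem)
```

Running the script in the activated `dandelion` environment prints one warning per missing variable, directory or executable, and prints nothing if every check passes.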

## Requirements
## Basic Requirements
Python packages
```python
# conda
python==3.7.6 (conda-forge)
numpy==1.18.4 (conda-forge)
pandas==1.0.3 (conda-forge)
distance==0.1.3 (conda-forge)
python>=3.7,<3.8 (conda-forge)
numpy>=1.18.4 (conda-forge)
pandas>=1.0.3 (conda-forge)
distance>=0.1.3 (conda-forge)
joblib==0.14.1 (conda-forge)
jupyter==1.0.0 (conda-forge)
scikit-learn==0.23.0 (conda-forge)
numba==0.48.0 (conda-forge)
scikit-learn>=0.23.0 (conda-forge)
numba>=0.48.0 (conda-forge)
pytables==3.6.1 (conda-forge)
seaborn==0.10.1 (conda-forge)
leidenalg==0.8.0 (conda-forge)
plotnine==0.6.0 (conda-forge)
scikit-bio==0.5.6 (conda-forge)
changeo==1.0.0 (bioconda)
presto==0.6.0 (bioconda)
seaborn>=0.10.1 (conda-forge)
leidenalg>=0.8.0 (conda-forge)
plotnine>=0.6.0 (conda-forge)
# Other executables (through conda)
blast>=2.10.0 (bioconda) # depends on the database version as well
@@ -115,7 +113,10 @@ igblast==1.15.0 (bioconda)
# pip
anndata>=0.7.1
scanpy>=1.4.6
scrublet==0.2.1
scrublet>=0.2.1
scikit-bio>=0.5.6
changeo>=1.0.0
presto>=0.6.0
polyleven>=0.5
networkx>=2.4
rpy2>=3.3.2
@@ -131,7 +132,7 @@ ggplot2
```

## Acknowledgements
I would like to acknowledge the contributions from Dr. Ondrej Suschanek, Dr. Benjamin Stewart, Dr. Rachel Bashford-Rogers and Prof. Menna Clatworthy who helped with the initial conception of the project and for all discussions.
I would like to acknowledge the contributions from Dr. Ondrej Suschanek, Dr. Benjamin Stewart, Dr. Rachel Bashford-Rogers and Prof. Menna Clatworthy, who helped with the initial conception of the project and for all discussions.

I would also like to acknowledge Dr. Jongeun Park, Dr. Cecilia-Dominguez Conde, Dr. Hamish King, Dr. Krysztof Polanksi and Dr. Peng He with whom I have had very useful discussions. I would also like to thank my wife who helped name the package, because she thought the plots looked like a dandelion =D.

4 changes: 2 additions & 2 deletions bin/tigger-genotype.R
@@ -61,7 +61,7 @@ opt_list <- list(make_option(c("-d", "--db"), dest="DB",
help=paste("Output directory. Will be created if it does not exist.",
"\n\t\tDefaults to the current working directory.")),
make_option(c("-f", "--format"), dest="FORMAT", default=FORMAT,
help=paste("File format. One of 'airr' (default) or 'changeo'.")),
help=paste("File format. One of 'blast' (default), 'changeo' or 'airr'.")),
make_option(c("-p", "--nproc"), dest="NPROC", default=NPROC,
help=paste("Number of subprocesses for multiprocessing tools.",
"\n\t\tDefaults to the available processing units.")))
@@ -101,7 +101,7 @@ if (opt$FORMAT == "changeo") {
junction_length <- "junction_length"
sequence_alignment <- "sequence_alignment"
ext <- "tsv"
} else if (opt$FORMAT == "airr") {
} else if (opt$FORMAT %in% "airr") {
db <- airr::read_rearrangement(opt$DB)
v_call <- "v_call"
j_call <- "j_call"
4 changes: 2 additions & 2 deletions dandelion/plotting/__init__.py
@@ -2,6 +2,6 @@
# @Author: Kelvin
# @Date: 2020-05-13 23:21:45
# @Last Modified by: Kelvin
# @Last Modified time: 2020-08-17 18:22:17
# @Last Modified time: 2020-11-13 20:10:00

from ._plotting import random_palette, clone_network, barplot, stackedbarplot, spectratype, clone_rarefaction
from ._plotting import random_palette, clone_network, barplot, stackedbarplot, spectratype, clone_rarefaction, clone_overlap
