Skip to content

trvinh/fCAT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

17 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

fCAT

PyPI version License: GPL v3 Build Status

Python package for fCAT, a feature-aware Completeness Assessment Tool

Table of Contents

How to install

fCAT tool is distributed as a python package called fcat. It is compatible with Python ≥ v3.7.

You can install fcat using pip:

# python3 -m pip install fcat
python3 -m pip install git+https://github.com/BIONF/fCAT

or, in case you do not have admin rights, and don't use package systems like Anaconda to manage environments you need to use the --user option:

# python3 -m pip install --user fcat
python3 -m pip install --user git+https://github.com/BIONF/fCAT

and then add the following line to the end of your ~/.bashrc or ~/.bash_profile file, restart the current terminal to apply the change (or type source ~/.bashrc):

export PATH=$HOME/.local/bin:$PATH

Usage

The complete process of fCAT can be done using one function fcat

fcat --coreDir /path/to/fcat_data --coreSet eukaryota --refspecList "HOMSA@9606@2" --querySpecies /path/to/query.fa [--annoQuery /path/to/query.json] [--outDir /path/to/fcat/output]

where eukaryota is name of the fCAT core set (equivalent to BUSCO set); HOMSA@9606@2 is the reference species from that core set that will be used for the ortholog search; query is the name of species of interest. If --annoQuery not specified, fCAT fill do the feature annotation for the query proteins using FAS tool.

Output

You will find the output in the /path/to/fcat/output/fcatOutput/eukaryota/query/ folder, where /path/to/fcat/output/ could be your current directory if you not specified --outDir when running fcat. The following important output files/folders can be found:

- all_summary.txt: summary of the completeness assessment using all 4 score modes
- fdogOutput.tar.gz: a zipped file of the ortholog search result
- mode_1, mode_2, mode_3 and mode_4: detailed output for each score mode
- phyloprofileOutput: output phylogenetic profile data that can be used with [PhyloProfile tool](https://github.com/BIONF/PhyloProfile)

For internal use

fCAT algorithm consists of 3 main steps:

  1. Calculate group-specific cutoffs for a core set

Please make sure that the R dependencies are available before running this function.

fcat.cutoff --coreDir /path/to/fcat_data --coreSet eukaryota
  1. Search for orthologs in a gene set of interst and create phylogenetic profiles
fcat.ortho --coreDir /path/to/fcat_data --coreSet eukaryota --refspecList "HOMSA@9606@2" --querySpecies /path/to/query.fa --annoQuery /path/to/query.json
  1. Create report for completeness assessment
fcat.report --coreDir /path/to/fcat_data --coreSet eukaryota --outDir /path/to/fcat/output --queryID queryID --mode 1

Bugs

Any bug reports or comments, suggestions are highly appreciated. Please open an issue on GitHub or be in touch via email.

Contributors

Contact

For further support or bug reports please contact: tran@bio.uni-frankfurt.de

About

Python package for fCAT, a feature-aware completeness assessment tool

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages