This is a series of scripts that parse and extract information about each carbohydrate-binding module (CBM) family from the Carbohydrate-Active enZYmes (CAZy) database.
CAZy is an online database created in 1998 that holds genomic, structural and biochemical information about Carbohydrate-Active Enzymes (CAZymes) and their associated modules.
The database covers the following classes:
- Glycoside Hydrolases (GH)
- GlycosylTransferases (GT)
- Polysaccharide Lyases (PL)
- Carbohydrate Esterases (CE)
- Auxiliary Activities (AA)
- Carbohydrate-Binding Modules (CBM)
Consolidates the 'Activity in Family' information from each CBM page into a single Excel file. If 'Activity in Family' is not populated for a family, the 'Note' information is used in its place.
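The fallback from 'Activity in Family' to 'Note' can be illustrated with a minimal pandas sketch. This is not the script's actual code: the function name `consolidate_activity`, the column names `family`, `activity` and `note`, and the sample values are all assumptions made for illustration.

```python
import pandas as pd


def consolidate_activity(records):
    """Build one table of CBM activities, falling back to the note text.

    `records` is assumed to be a list of dicts with the hypothetical keys
    'family', 'activity' and 'note', one dict per CBM page.
    """
    df = pd.DataFrame(records, columns=["family", "activity", "note"])
    activity = df["activity"]
    # A field counts as unpopulated if it is missing or only whitespace.
    unpopulated = activity.isna() | (activity.astype(str).str.strip() == "")
    # Where the activity is unpopulated, take the note instead.
    df["activity"] = activity.mask(unpopulated, df["note"])
    return df[["family", "activity"]]


# Placeholder values, not real CAZy data.
records = [
    {"family": "CBM_A", "activity": "activity text", "note": ""},
    {"family": "CBM_B", "activity": "", "note": "note text"},
    {"family": "CBM_C", "activity": None, "note": "another note"},
]
table = consolidate_activity(records)
print(table)
```

The resulting frame could then be written out with `table.to_excel(...)`, which needs an Excel writer engine such as openpyxl to be installed in addition to the dependencies listed here.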
Downloads and extracts each CBM listed in the CAZy database across bacteria, archaea, viruses and eukaryota.
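As a rough sketch of how the downloads might be enumerated, the snippet below pairs every CBM family with each of the four taxonomic groups. The function `build_targets` and the URL template are hypothetical and stand in for whatever addresses the script really requests.

```python
from itertools import product

# The four taxonomic groups named above.
KINGDOMS = ("bacteria", "archaea", "viruses", "eukaryota")


def build_targets(families, template, kingdoms=KINGDOMS):
    """Return one target string per (family, kingdom) pair.

    `template` is a hypothetical format string containing the
    placeholders {family} and {kingdom}.
    """
    return [
        template.format(family=family, kingdom=kingdom)
        for family, kingdom in product(families, kingdoms)
    ]


# Placeholder template; the real CAZy addresses are not assumed here.
targets = build_targets(["CBM1", "CBM2"], "https://example.org/{family}_{kingdom}.txt")
for url in targets:
    print(url)
    # With python3-wget installed, each file could be fetched with:
    # wget.download(url, out="downloads/")
```

Generating the full list up front keeps the download loop simple and makes it easy to skip files that are already present.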
Generic functions used across the other scripts.
- pandas 1.5.2
- python3-wget 3.2
Written in Python 3.10.9