Coda-Research-Group/AlphaFind

AlphaCharges

AlphaFind: Discover structure similarity across the entire known proteome

AlphaFind is a web-based search engine for structure-based search across the entire AlphaFold Protein Structure Database. It accepts a UniProt ID, PDB ID, or gene symbol as input and returns the most similar proteins found in AlphaFold DB, with an option to run an additional search that extends and refines the results. Search results are grouped by source organism and displayed along with several similarity metrics. 3D visualizations of the structural superposition of the proteins are provided, and text filters can be used to find specific organisms or UniProt IDs. For details about the methodology and usage, please see the manual. The website is free and open to all users, and no login is required.
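The grouping and text filtering described above can be illustrated with a short Python sketch. It is not taken from the AlphaFind codebase: the `Hit` record, its fields (including the use of a TM-score as the similarity metric), the function names, and the sample IDs and scores are all assumptions made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical search result; field names are assumptions."""
    uniprot_id: str
    organism: str
    tm_score: float  # assumed similarity metric, higher means more similar


def filter_hits(hits, text):
    """Keep hits whose organism or ID contains the text (case-insensitive)."""
    needle = text.lower()
    return [
        h for h in hits
        if needle in h.organism.lower() or needle in h.uniprot_id.lower()
    ]


def group_by_organism(hits):
    """Group hits by organism, most similar first within each group."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit.organism].append(hit)
    for members in groups.values():
        members.sort(key=lambda h: h.tm_score, reverse=True)
    return dict(groups)


# Placeholder IDs and made-up scores, for illustration only.
hits = [
    Hit("EX0001", "Homo sapiens", 0.98),
    Hit("EX0002", "Mus musculus", 0.95),
    Hit("EX0003", "Homo sapiens", 0.71),
]

if __name__ == "__main__":
    for organism, members in group_by_organism(hits).items():
        print(organism, [h.uniprot_id for h in members])
```

Calling `filter_hits(hits, "mus")` on the sample data keeps only the `Mus musculus` entry, which mirrors how a text filter narrows the result list to one organism.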

The vector embeddings and model weights used in AlphaFind are available from the Czech national repository under the record "AlphaFind: Discover structure similarity across the entire known proteome – data and model". This project uses USalign.

Code Structure

The codebase is divided into three folders:

  • training (model training, index building)
  • api (backend)
  • ui (frontend)

See the README.md files in each folder for more details.

Running locally

Steps:

  1. Clone this repository
  2. Run ./run.sh in your terminal
  3. Open http://localhost:8081 in your browser
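Once `./run.sh` has been started, a small Python check can confirm that something is answering on the local port before opening the browser. This is a minimal sketch, not part of the repository: the `is_up` helper is hypothetical, and it only assumes the UI is served over plain HTTP at the address given in the steps above.

```python
import urllib.error
import urllib.request


def is_up(url="http://localhost:8081", timeout=2.0):
    """Return True if an HTTP server answers at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status code.
        return True
    except (OSError, ValueError):
        # Connection refused, timeout, DNS failure, or malformed URL.
        return False


if __name__ == "__main__":
    print("AlphaFind UI reachable:", is_up())
```

Startup time is not stated in the README, so a negative result shortly after launching may simply mean the services are still starting.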

The training/data/cifs folder contains a small subset of the AlphaFold DB comprising 109 proteins. The full AlphaFold DB can be downloaded from here.
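To verify that the bundled subset is complete after cloning, the structure files can be counted with a few lines of Python. This is a sketch under assumptions: the `count_cif_files` helper is hypothetical, and the file extensions (`.cif` or `.cif.gz`) are a guess, since the README does not state how the files are named.

```python
from pathlib import Path


def count_cif_files(folder):
    """Count files ending in .cif or .cif.gz directly inside folder."""
    # Assumption: structures are stored as .cif or gzip-compressed .cif.gz.
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.name.endswith((".cif", ".cif.gz"))
    )


if __name__ == "__main__":
    folder = Path("training/data/cifs")
    if folder.is_dir():
        print(f"{count_cif_files(folder)} structure files in {folder}")
    else:
        print(f"{folder} not found; run this from the repository root")
```

If the naming assumption holds, the count for the bundled data should match the 109 proteins mentioned above.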

License

MIT license