Python Scripts for Haim Lab (Microbiology Research) Projects

MrColinHan/Haim-Lab-Projects

Python Code for Haim Lab Projects

HIV Volatility project:

  • Divide a fasta file into multiple fasta files based on identifiers
  • Divide a csv file into multiple csv/fasta files based on identifiers
  • Calculate the volatility
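
As a rough illustration of the first item above, here is a minimal sketch of grouping FASTA records by an identifier taken from the header. It is not the repository's script. The function names (`parse_fasta`, `group_by_identifier`, `to_fasta`), the `.` delimiter, and the sample headers are all assumptions made for this example; adjust the delimiter and field index to match the real header layout.

```python
from collections import defaultdict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_identifier(text, delimiter=".", field=0):
    """Group records by one delimiter-separated field of the header.

    Assumption: the identifier of interest is a single field of the header.
    """
    groups = defaultdict(list)
    for header, seq in parse_fasta(text):
        key = header.split(delimiter)[field]
        groups[key].append((header, seq))
    return dict(groups)


def to_fasta(records):
    """Serialize (header, sequence) pairs back to FASTA text."""
    return "".join(f">{h}\n{s}\n" for h, s in records)


# Hypothetical input: headers and sequences are invented for illustration.
sample = ">B.KR.2004.X1\nACGT\nAC\n>C.ZA.2005.X2\nGGTT\n>B.US.2001.X3\nTTAA\n"
groups = group_by_identifier(sample)
```

Each value in `groups` can then be passed to `to_fasta` and written to its own file, one file per identifier.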

HIV AA Frequency project:

  • Amino Acids Distribution
  • In-host Amino Acids Distribution
  • Delta of Amino Acids Distribution
  • Euclidean Distance
  • Automate Log Conversion Process
  • Distribution Percentage Cutoff
  • Translate an Excel/CSV row into a Python list
  • Filter out accession numbers from B.KR in a .NWK format file
  • Convert AA sequence from csv to fasta
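
The distribution, delta, and Euclidean distance items above can be sketched roughly as follows. This is only one plausible reading of those items, not the lab's implementation. It assumes aligned amino acid sequences, a 0-based position, and that characters outside the 20 standard amino acids (gaps, for example) are left out of the total. All names and sample sequences are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_distribution(sequences, position):
    """Fraction of each standard amino acid at a 0-based alignment position."""
    counts = Counter(seq[position] for seq in sequences if position < len(seq))
    # Assumption: non-standard characters are excluded from the denominator.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def delta(dist_a, dist_b):
    """Per-amino-acid change in frequency going from dist_a to dist_b."""
    return {aa: dist_b[aa] - dist_a[aa] for aa in AMINO_ACIDS}


def euclidean_distance(dist_a, dist_b):
    """Euclidean distance between two amino acid distributions."""
    return math.sqrt(sum((dist_a[aa] - dist_b[aa]) ** 2 for aa in AMINO_ACIDS))


# Invented sequences for illustration only.
group_a = ["MKT", "MRT", "MKS", "MKT"]
group_b = ["MRT", "MRS", "MKT", "MRT"]

dist_a = aa_distribution(group_a, position=1)
dist_b = aa_distribution(group_b, position=1)
change = delta(dist_a, dist_b)
distance = euclidean_distance(dist_a, dist_b)
```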

SARS-CoV-2 project:

  • Extract sequences
  • Clean up sequences
  • CSV conversion

Flu project:

  • NEW APPROACH: clean data by searching for non-ACTG characters and removing the corresponding sequences directly in a nucleotide file
  • Calculate FD and stdev with multiple selection options (e.g. Position, seasons, group, country, ...)
  • Calculate the difference between different FD profiles
  • Match sequence with all attributes
  • Match sequence with groups (grouped by newick tree)
  • Assign Hydropathy Value
  • Combine several fasta format files into one fasta file
  • Remove 8 characters & dash before each dash in a Newick format file
  • Group Amino Acids
  • Equally distribute a stdev_output into 2 groups
  • Extract position pairs based on a given range of p_value in a co-volatility matrix
  • Compare two groups of position pairs and extract overlapping pairs
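
The data-cleaning approach in the first item of this list (dropping any sequence that contains a non-ACTG character) could look something like the sketch below. It is an illustration under assumptions, not the repository's code: records are taken to be `(header, sequence)` pairs, lowercase bases are treated as valid, and gap characters such as `-` count as non-ACTG. The function name and sample records are hypothetical.

```python
import re

# Matches any character other than the four unambiguous nucleotides.
NON_ACTG = re.compile(r"[^ACTG]")


def split_clean_and_rejected(records):
    """Separate records into those with only A/C/T/G and those without."""
    clean, rejected = [], []
    for header, seq in records:
        # Assumption: case is ignored, so "actg" is accepted.
        if NON_ACTG.search(seq.upper()):
            rejected.append((header, seq))
        else:
            clean.append((header, seq))
    return clean, rejected


# Invented records for illustration only.
records = [
    ("s1", "ACTGAC"),
    ("s2", "ACNTGA"),  # contains the ambiguity code N
    ("s3", "actg"),
    ("s4", "AC-TG"),   # contains a gap character
]
clean, rejected = split_clean_and_rejected(records)
```

Keeping the rejected records in a separate list makes it easy to review what was removed before overwriting the nucleotide file.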

HCV project:

  • (New version) Find Accession Numbers that contain special characters (e.g. #, $)
  • Remove Accession Numbers that contain special characters
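
A minimal sketch of the two items above, finding and then removing accession numbers with special characters, is shown below. The README only gives `#` and `$` as examples, so the definition used here (anything other than letters, digits, underscore, or period) is an assumption, as are the function name and the sample accession strings.

```python
import re

# Assumption: "special" means any character outside letters, digits, "_" and ".".
SPECIAL = re.compile(r"[^A-Za-z0-9_.]")


def partition_accessions(accessions):
    """Return (kept, flagged), where flagged entries contain a special character."""
    kept, flagged = [], []
    for acc in accessions:
        (flagged if SPECIAL.search(acc) else kept).append(acc)
    return kept, flagged


# Invented accession strings for illustration only.
accessions = ["AB123456.1", "EU#48291", "M62321", "D$0934"]
kept, flagged = partition_accessions(accessions)
```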