
Supervised Machine Learning project for HIV-1 cleavage sites


martinaturchini/HIV-1-cleavage-project


HIV-1-cleavage-prediction

Supervised Machine Learning project for predicting HIV-1 cleavage sites. The project compares the results obtained with three classifiers: Multi-Layer Perceptron, Logistic Regression, and k-NN.
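As a rough illustration of this kind of comparison, the sketch below cross-validates the three classifier types with scikit-learn. It is not the project's code: the data are synthetic (generated with make_classification, not the HIV-1 data set), and the hyperparameters, the 5-fold split, and the accuracy metric are all assumptions chosen only to make the example run.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a binary classification data set
# (assumption: this is NOT the HIV-1 cleavage data).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8, random_state=0
)

# Hyperparameters below are illustrative guesses, not the project's settings.
classifiers = {
    "Multi-Layer Perceptron": MLPClassifier(
        hidden_layer_sizes=(32,), max_iter=1000, random_state=0
    ),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
}

# Mean 5-fold cross-validated accuracy for each classifier.
scores = {}
for name, clf in classifiers.items():
    scores[name] = cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean()
    print(f"{name}: {scores[name]:.3f}")
```

Scores on synthetic data say nothing about how the classifiers rank on the real data set; the point is only the shared evaluation loop, which keeps the comparison between models on equal footing.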

Description

To run the code, first download the data set (https://github.com/martinaturchini/HIV-1-cleavege-/blob/main/12859_2022_5017_MOESM2_ESM.xlsx) and save it in "/content/gdrive/My Drive".

"This work shows that HIV-1 cleavage sites can be studied and predicted from a combination of sequence information, including amino acid binary profiles, bond composition, and physicochemical properties. The best performance is obtained with the Multi-Layer Perceptron classifier, and similar results can be obtained with the Logistic Regression classifier. The k-NN classifier performs worst among the methods analyzed in this work. The results could be improved by using a larger data set, which could be obtained via data augmentation techniques."
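To make the "amino acid binary profiles" feature concrete, here is a minimal sketch of a one-hot encoding for a peptide. The function name binary_profile, the alphabet ordering, and the 8-residue example peptide are assumptions for illustration; the project's actual encoding may differ.

```python
# The 20 standard amino acids as one-letter codes
# (assumption: alphabetical ordering, chosen only for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_profile(peptide):
    """Return a flat 0/1 list with 20 entries per residue (one-hot encoding)."""
    profile = []
    for residue in peptide.upper():
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        one_hot = [0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(residue)] = 1
        profile.extend(one_hot)
    return profile


# Illustrative 8-residue peptide (hypothetical input, not taken from the data set file).
features = binary_profile("SQNYPIVQ")
print(len(features), sum(features))
```

Each residue contributes exactly one 1 among its 20 positions, so an 8-residue peptide yields a 160-dimensional vector that can be passed directly to any of the three classifiers.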

The same methods used in this project could be applied to a binary classification problem in high-energy physics (HEP), for example to determine whether a candidate particle belongs to the background or not.