GianCarloMilanese/dsim_project

Digital Signal and Image Recognition project

Team members

Tasks

  • Audio:
    • Digit recognition: from 0 to 9
    • Speaker recognition: among the two of us, our girlfriends, and 4 speakers from the free-spoken-digit dataset
  • Images:
    • Face recognition: among the two of us and Gian Carlo's family members
  • Retrieval:
    • Face similarity: find the 10 VIPs who are most similar to us

Project structure

The project is structured as follows:

  • Audio:
    • 0_record_audio.ipynb: this notebook can be used to quickly record the current user's voice
    • 1_Data augmentation pipeline.ipynb: here we show the two augmentation strategies we implemented (random noise and pitch shift) and the empirical tests we ran to find their best values
    • 2_train_classifiers.ipynb: this is the core notebook, where we load the tracks, build predictive models, and identify the best "model + data representation" combination for each of the two predictive tasks (digit and speaker recognition). The best models are stored in the best_models directory for later reuse
    • 3_test_model_audio.ipynb: here you can test the two best models we created
  • Images:
    • 0_take_pictures.ipynb: this notebook can be used to quickly take pictures of the current user
    • 1_preprocess_pictures.ipynb: here we show the preprocessing functions that transform input images so that the predictive models are more effective
    • 2_train_model.ipynb: this is the core notebook for the Images part, where we train various models to find the best face recognition model. The final models are stored in the models directory
    • 3_test_model_images.ipynb: here you can test the best models and find out who you are most similar to
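
The two augmentation strategies listed for the Audio notebooks (random noise and pitch shift) can be sketched roughly as follows. This is a minimal illustration and not the code from the notebook: the function names, the `noise_factor` parameter, and the resampling-based pitch shift are all assumptions. The naive resampling used here also changes the clip duration, whereas a dedicated routine such as `librosa.effects.pitch_shift` keeps the duration fixed.

```python
import numpy as np


def add_random_noise(signal, noise_factor=0.005, rng=None):
    """Add Gaussian noise to a 1-D signal.

    noise_factor is a hypothetical parameter name; it scales the noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(len(signal))
    return signal + noise_factor * noise


def naive_pitch_shift(signal, n_steps):
    """Crude pitch shift by n_steps semitones via linear-interpolation resampling.

    Played back at the original sample rate, the result sounds higher
    (n_steps > 0) or lower (n_steps < 0), but its length changes as well.
    """
    rate = 2.0 ** (n_steps / 12.0)  # frequency ratio for n_steps semitones
    new_length = int(round(len(signal) / rate))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, new_length)
    return np.interp(new_positions, old_positions, signal)


# Example: a one-second 440 Hz tone sampled at 8 kHz (synthetic data)
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

noisy = add_random_noise(tone, noise_factor=0.01, rng=np.random.default_rng(0))
shifted = naive_pitch_shift(tone, n_steps=12)  # one octave up
```

Values such as the noise factor or the number of semitones would be the kind of settings tuned empirically in the augmentation notebook.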

We highly encourage you to have a look at our demo GIFs, which show how the model behaves when recognising different people!

  • Retrieval:
    • 0_choose_model.ipynb: here we evaluate different models on our own dataset to choose the one to use for finding the celebrities most similar to a given picture. The model with the best scores made very inconsistent predictions on the celebrities dataset, so MobileNet was used for the final retrieval task
    • 1_retrieval.ipynb: this notebook can be used to find the top 10 celebrities most similar to a given query picture
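
As a rough illustration of the top-10 retrieval step, the sketch below ranks gallery images by cosine similarity to a query. It assumes each image has already been turned into a feature vector (for example by a MobileNet backbone). The cosine metric, the function name `top_k_similar`, and the random stand-in features are assumptions for illustration. The notebook may use a different distance or index.

```python
import numpy as np


def top_k_similar(query, gallery, k=10):
    """Return the indices and cosine similarities of the k gallery rows closest to query."""
    query = query / np.linalg.norm(query)
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = gallery @ query          # cosine similarity against every gallery row
    order = np.argsort(-sims)[:k]   # highest similarity first
    return order, sims[order]


# Synthetic stand-ins for image features: 100 "celebrity" vectors of size 128
rng = np.random.default_rng(0)
gallery = rng.standard_normal((100, 128))
# The query is a slightly perturbed copy of gallery row 42
query = gallery[42] + 0.01 * rng.standard_normal(128)

indices, scores = top_k_similar(query, gallery, k=10)
```

In this synthetic example the first returned index is the row the query was derived from. With real features, the returned indices would map back to celebrity pictures.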
