A project to predict the neural visual responses to naturalistic scenes using machine learning. It was my capstone project of the Machine learning course from the MSc in Systems Biology at Maastricht University.


Img2brain


Predicting the neural responses to visual stimuli of naturalistic scenes using machine learning

Table of contents:

- About the project
- Dataset
- Feature engineering
- Machine learning models
- How to set up the environment to run the code?
- Structure of the repository
- Credits
- Further details
- Contact

About the project

The goal of this project is to employ machine learning techniques to predict the neural visual responses triggered by naturalistic scenes. These computational models aim to replicate the process by which neuronal activity encodes visual stimuli from the external environment. The following figure gives a schematic representation of the brain encoding and decoding processes.


Brain encoding and decoding in fMRI. Obtained from [1].

Visual encoding models based on fMRI data employ algorithms that transform image pixels into model features and map these features to brain activity. This framework enables the prediction of neural responses from images. The following figure illustrates the mapping between the pixel, feature, and brain spaces.


The general architecture of visual encoding models consists of three spaces (the input space, the feature space, and the brain activity space) and two in-between mappings. Obtained from [2].

Dataset

The data for this project is part of the Natural Scenes Dataset (NSD), a massive dataset of 7T fMRI responses to images of natural scenes drawn from the COCO dataset. The training dataset consists of brain responses measured at 10,000 brain locations (voxels) to 8857 images (in JPG format) for one subject. The 10,000 voxels are distributed along the visual pathway and may encode perceptual and semantic features in different proportions. The test dataset comprises 984 images (in JPG format), and the goal is to predict the brain responses to these images.

You can access the dataset through Zenodo with the following DOI: 10.5281/zenodo.7979730.

The training dataset was split into training and validation partitions with an 80/20 ratio. The training partition was used to train the models, and the validation partition was used to evaluate the models. The test dataset was used to make predictions with the best model on unseen data.
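The split described above can be sketched with scikit-learn's `train_test_split`. The array shapes here are small synthetic stand-ins for the real data (8857 images, 10,000 voxels), not the actual dataset:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the real data: image features (X) and
# voxel responses (y); shapes are kept small for brevity.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 30))   # image features
y = rng.standard_normal((100, 50))   # voxel responses

# 80/20 train/validation split, as described above
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_val.shape)  # (80, 30) (20, 30)
```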

Feature engineering

The raw pixel representation of the images is very high dimensional: the original images are 425x425 pixels with 3 channels (RGB), which yields 425x425x3 = 541,875 features per image. To obtain a lower-dimensional representation, I used the activations of different layers of pre-trained CNNs. In this case, I tried various layers of four pre-trained CNNs available in the torchvision package: AlexNet, VGG16, ResNet50, and InceptionV3.

The feature representations of the images were obtained by passing the images through the pre-trained CNNs and extracting the output of the desired layer. The resulting feature vectors were still very large, so I applied PCA to reduce them to a set of 30 features. The PCA was fit on the training image features only and then used to transform the training, validation, and test image features.

I evaluated each feature representation by training a simple linear regression model to predict the brain activity of the voxels from it. The best feature representation was the one with the highest encoding accuracy (i.e., the median correlation between the predicted and measured brain activity across voxels) on the validation set.
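The encoding accuracy metric can be written as a small helper (the function name `encoding_accuracy` is my own, not from the project's code): a Pearson correlation per voxel, then the median over voxels.

```python
import numpy as np

def encoding_accuracy(y_true, y_pred):
    """Median over voxels of the per-voxel Pearson correlation
    between measured and predicted responses (helper name assumed)."""
    corrs = [
        np.corrcoef(y_true[:, v], y_pred[:, v])[0, 1]
        for v in range(y_true.shape[1])
    ]
    return float(np.median(corrs))

# Sanity check: perfect predictions give a correlation of ~1 per voxel
rng = np.random.default_rng(0)
y = rng.standard_normal((20, 5))
print(encoding_accuracy(y, y))  # ~1.0
```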

You can find the code for this part of the project here.

Machine learning models

I trained six different machine learning algorithms (linear regression as the base model, ridge regression, lasso regression, elastic net regression, k-nearest neighbors regressor, and decision tree regressor) to predict the brain activity of the voxels from the feature representation of the images. The learning task was a multioutput regression problem: the input is the feature representation of the images and the output is the brain activity of all the voxels. Each regressor maps from the feature space to a single voxel, so there is a separate encoding model per voxel, leading to voxelwise encoding models. Therefore, every model trained with this dataset comprises 10,000 independent regression models with n coefficients each (where n is the number of features). As in the previous section, the best model was the one with the highest encoding accuracy on the validation set.
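The voxelwise setup falls out of scikit-learn's multioutput support for free: fitting one `LinearRegression` on a 2D target matrix is equivalent to fitting an independent model per voxel. A sketch with synthetic data (all shapes are stand-ins):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((80, 30))   # PCA features of the images
Y = rng.standard_normal((80, 50))   # responses of 50 stand-in voxels

# One multioutput fit...
multi = LinearRegression().fit(X, Y)

# ...matches a separate fit for any single voxel, e.g. voxel 0
single = LinearRegression().fit(X, Y[:, 0])
assert np.allclose(multi.coef_[0], single.coef_)

print(multi.coef_.shape)  # (50, 30): one coefficient vector per voxel
```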

The best model was the lasso regression with an encoding accuracy of 0.2417 on the validation set. The best hyperparameters of the lasso regression model were alpha=0.01 and the default max_iter=1000. This model was trained with the feature representation of the images obtained from the layer features.12 of the AlexNet CNN. The feature representation of the images was reduced to 100 features using PCA. Although the encoding accuracy of the best model was low, it is a starting point to build upon.
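The winning model can be reproduced in outline with scikit-learn's `Lasso`, which also handles multioutput targets by fitting one coefficient vector per voxel. Synthetic data stands in for the real features and responses here:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic stand-ins: 80 training images, 30 PCA features, 50 voxels
rng = np.random.default_rng(0)
X_train = rng.standard_normal((80, 30))
y_train = rng.standard_normal((80, 50))

# Hyperparameters reported above: alpha=0.01, default max_iter=1000
model = Lasso(alpha=0.01, max_iter=1000)
model.fit(X_train, y_train)

print(model.coef_.shape)  # (50, 30): one row of coefficients per voxel
```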

Check out the code for this part of the project here.

How to set up the environment to run the code?

I used conda to create a virtual environment with the libraries and dependencies required to run the code. To set it up, clone this GitHub repository, open a terminal, move to the folder containing the repository, and run the following commands:

# Create the conda virtual environment
$ conda env create -f img2brain_env.yml

# Activate the conda virtual environment
$ conda activate img2brain_env

Then, you can open the Jupyter Notebook with the IDE of your choice and run the code.

Structure of the repository

The main files and directories of this repository are:

| File | Description |
| --- | --- |
| EDA_feateng_modelbuild_img2brain.ipynb | Jupyter notebook with the EDA, feature engineering, creation of the machine learning models, performance metrics of all models, and evaluation of the best model |
| LassoRegressor_alpha0.01_img2brain.bin | Bin file of the best model |
| img2brain_env.yml | File with the libraries and dependencies to create the conda virtual environment |
| img2brain_report.pdf | Report with a detailed explanation of the project |
| Results/ | Folder with performance metrics and other outputs of the machine learning models |
| Scripts_plots/ | Folder with the scripts to create the plots of the report |
| img/ | Folder with images and GIFs |

Credits

Further details

More details about the biological background of the project, the interpretation of the results, and ideas for further work are available in this pdf report.

Contact

If you have comments or suggestions about this project, you can open an issue in this repository, or email me at sebasar1245@gmail.com.
