Skip to content

Computer vision task to do early detection of COVID-19 using X-Ray chest image. This AI model built with ConvNeXt architecture

License

Notifications You must be signed in to change notification settings

adhiiisetiawan/covid19-recognition

Repository files navigation

COVID-19 Recognition

python PyTorch Lightning Config: Hydra Template


Description

The COVID-19 pandemic has highlighted the importance of early detection of the disease. In many cases, chest X-rays can play a crucial role in identifying the presence of COVID-19, as well as differentiating between COVID-19, normal and pneumonia cases. This project aims to develop a deep learning model for recognizing COVID-19 positive cases from X-Ray chest images using the ConvNeXt architecture. The purpose of this project is to provide a tool for early detection of COVID-19 using chest X-rays. The model will be trained on a dataset of X-Ray images and will be evaluated based on its accuracy and other performance metrics. This tool can be used by healthcare professionals to make informed decisions in the management of COVID-19 patients. This project also aims to test the performance of ConvNeXt architecture on chest X-Ray image datasets.


Dataset

The dataset used for training and testing the model consists of X-Ray chest images of COVID-19 positive, normal, and pneumonia cases. The dataset was obtained from the following research papers:

  • Covid Image Data Collection by Joseph Paul Cohen et al. arXiv
  • COVID-19 Image Data Collection: Prospective Predictions are the Future by Joseph Paul Cohen et al. arXiv

They also provide a GitHub link that can be found here. Dataset that is already preprocessed also can be downloaded from kaggle.


Architecture

This project using ConvNeXt architecture, ConvNeXt is a modification of the ResNet architecture for deep ConvNets. It introduces the concept of "grouped convolutions" which allow for a more efficient utilization of computation resources such as memory and computation power. The main features of ConvNeXt include:

  1. Grouped Convolutions: The main difference from ResNet is that ConvNeXt uses grouped convolutions instead of standard convolutions in the residual blocks. Grouped convolutions split the input channels into groups, and apply a separate set of filters to each group. This allows for more computation to be done with a smaller number of parameters, which helps reduce overfitting.

  2. Bottleneck Design: ConvNeXt uses a bottleneck design, where the number of filters in the 3x3 convolution is reduced, and then increased again using a 1x1 convolution. This design helps reduce the number of parameters and computational costs.

  3. Stem and Head: ConvNeXt has a stem and head component, which includes several convolutional layers to extract features from the input image and several fully connected layers to make the final prediction.

  4. Dense Connections: ConvNeXt uses dense connections, which means that every residual block is connected to every other block in the network. This helps reduce the vanishing gradient problem that can occur in deep ConvNets.

Overall, the ConvNeXt architecture aims to balance efficiency and performance, and has been shown to achieve state-of-the-art results on various computer vision tasks such as image classification and object detection.


Main Technologies

PyTorch Lightning - a lightweight PyTorch wrapper for high-performance AI research. Think of it as a framework for organizing your PyTorch code.

Hydra - a framework for elegantly configuring complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line.


Project Structure

The directory structure of new project looks like this:

├── configs                   <- Hydra configuration files
│   ├── callbacks                <- Callbacks configs
│   ├── data                     <- Data configs
│   ├── debug                    <- Debugging configs
│   ├── experiment               <- Experiment configs
│   ├── extras                   <- Extra utilities configs
│   ├── hparams_search           <- Hyperparameter search configs
│   ├── hydra                    <- Hydra configs
│   ├── local                    <- Local configs
│   ├── logger                   <- Logger configs
│   ├── model                    <- Model configs
│   ├── paths                    <- Project paths configs
│   ├── trainer                  <- Trainer configs
│   │
│   ├── eval.yaml             <- Main config for evaluation
│   └── train.yaml            <- Main config for training
│
├── data                   <- Project data
│
├── logs                   <- Logs generated by hydra and lightning loggers
│
├── notebooks              <- Jupyter notebooks for the project
│
├── src                    <- Source code
│   ├── data                     <- Lightning datamodules
│   ├── models                   <- Lightning models
│   └── utils                    <- Utility scripts
│
├── .env.example              <- Example of file for storing private environment variables
├── .gitignore                <- List of files ignored by git
├── requirements.txt          <- File for installing python dependencies
├── README.md
├── eval.py                   <- Run evaluation
└── train.py                  <- Run training

How to run

Install dependencies

# clone project
git clone https://github.com/adhiiisetiawan/covid19-recognition
cd covid19-recognition

# [OPTIONAL] create python environment
python3 -m venv [env-name]

# activate environment
source [env-name]/bin/activate

# install requirements
pip install -r requirements.txt

Note: Before run training, you must download the dataset first from kaggle and put in the data folder. You also need to extract the dataset. For another approach, you need to register a kaggle API to download the folder with code. But, I recomended to download manually since using kaggle API need some effort to configure.

Train model with default configuration, default configuration using ConvNeXt Base

# train on CPU
python src/train.py trainer=cpu

# train on GPU
python src/train.py trainer=gpu

Train model with specific architecture

# train on CPU
python src/train.py trainer=cpu model=convnext_small

# train on GPU
python src/train.py trainer=gpu model=convnext_tiny

Train model using wandb logger

# train on CPU
python src/train.py trainer=cpu model=convnext_small logger=wandb

# train on GPU
python src/train.py trainer=gpu model=convnext_tiny logger=wandb

Performance and Results

Here's the performance of the model using Covid-19 X-Ray Dataset. I just provide three ConvNeXt models (tiny, small, base) because of limited resources. Details about graphic accuracy and loss can be found on wandb report.

name acc #params model
ConvNeXt-T 96.9 28M model
ConvNeXt-S 95.4 50M model
ConvNeXt-B 93.8 89M model

Acknowledgement

  • This project is built using the Lightning Hydra template.
  • Thanks to Joseph Paul Cohen et al. for providing the dataset. Paper 1 | Paper 2 | GitHub
  • Thanks to pranavraikokte for providing the preprocessed dataset in Kaggle

Reference

@article{cohen2020covid,
  title={COVID-19 image data collection},
  author={Joseph Paul Cohen and Paul Morrison and Lan Dao},
  journal={arXiv 2003.11597},
  url={https://github.com/ieee8023/covid-chestxray-dataset},
  year={2020}
}
@article{cohen2020covidProspective,
  title={COVID-19 Image Data Collection: Prospective Predictions Are the Future},
  author={Joseph Paul Cohen and Paul Morrison and Lan Dao and Karsten Roth and Tim Q Duong and Marzyeh Ghassemi},
  journal={arXiv 2006.11988},
  url={https://github.com/ieee8023/covid-chestxray-dataset},
  year={2020}
}
@Article{liu2022convnet,
  author  = {Zhuang Liu and Hanzi Mao and Chao-Yuan Wu and Christoph Feichtenhofer and Trevor Darrell and Saining Xie},
  title   = {A ConvNet for the 2020s},
  journal = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year    = {2022},
}

About

Computer vision task to do early detection of COVID-19 using X-Ray chest image. This AI model built with ConvNeXt architecture

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published