This project uses Convolutional Neural Networks (CNNs) to automatically identify colon cancer from histopathological images. It is designed to be reproducible, easy to deploy, and usable across multiple platforms.
- 🔗 GitHub Repository: Identifying-Colon-Cancer-Using-Deep-Learning
- 📊 Kaggle Notebook: Colon Cancer Classifier on Kaggle
- 🐳 Docker Image: Coming soon
├── Colon_cancer.ipynb # Jupyter Notebook with training pipeline
├── train.csv # Dataset with labeled image paths for training
├── pred.csv # Dataset with image paths for inference
├── best_checkpoint.model # Saved PyTorch model checkpoint
├── graphs/ # Visualizations and training plots
├── example.csv # A sample dataset format (for reference)
├── requirements.txt # Dependencies file (install this first)
├── Dockerfile # Docker configuration to replicate environment
└── README.md # Project documentation
- Data Preparation
  - train.csv: labeled images (image_id, label)
  - pred.csv: for model inference
  - Image preprocessing includes resizing, normalization, and augmentation.
- Model Architecture
  - CNN built in PyTorch, optionally with transfer learning (e.g., ResNet18)
  - Loss: CrossEntropyLoss, Optimizer: Adam
  - Checkpointing is used to store the best model
- Evaluation
  - Accuracy, Precision, Recall metrics
  - Visuals: Loss vs Epoch, Accuracy vs Epoch
  - Prediction samples visualized for interpretability
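The evaluation metrics listed above can be illustrated with a small, dependency-free sketch. It is not the notebook's own code: the function name `binary_metrics`, the label encoding (1 = Cancer, 0 = Non-Cancer), and the sample labels are assumptions made for illustration only.

```python
from typing import Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> dict:
    """Return accuracy, precision, and recall for binary labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    return {
        "accuracy": correct / len(pairs),
        # Fall back to 0.0 when a denominator is empty (no positive predictions or labels).
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical labels: 1 = Cancer, 0 = Non-Cancer (this encoding is an assumption).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For a screening task like this one, recall on the cancer class is usually the metric to watch most closely, since a false negative means a missed cancerous sample.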
- Clone the repository:
git clone https://github.com/Arpangpta/Identifying-Colon-Cancer-Using-Deep-Learning.git
cd Identifying-Colon-Cancer-Using-Deep-Learning
- Set up virtual environment:
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Launch Jupyter Notebook:
jupyter notebook Colon_cancer.ipynb
- Build the Docker image:
docker build -t colon-cancer-dl .
- Run the container:
docker run -p 8888:8888 colon-cancer-dl
Access Jupyter Notebook at localhost:8888
- train.csv: columns image_id, label
- pred.csv: column image_id (no labels)
- Example schema available in example.csv
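As a rough illustration of the two schemas, the sketch below checks that a CSV has the expected columns before it is used. The helper name `read_rows`, the sample file names, and the label values are hypothetical; the actual contents of train.csv and pred.csv may differ.

```python
import csv
import io


def read_rows(handle, required_columns):
    """Read a CSV into a list of dicts, failing early if a required column is missing."""
    reader = csv.DictReader(handle)
    missing = set(required_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Hypothetical file contents; real image_id values and label encoding may differ.
train_text = "image_id,label\nimg_001.jpg,0\nimg_002.jpg,1\n"
pred_text = "image_id\nimg_101.jpg\nimg_102.jpg\n"

train_rows = read_rows(io.StringIO(train_text), ["image_id", "label"])
pred_rows = read_rows(io.StringIO(pred_text), ["image_id"])

print(train_rows[0])
print(len(pred_rows))
```

With real files, the same helper can be called on an open file handle, for example `with open("train.csv", newline="") as f: read_rows(f, ["image_id", "label"])`.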
Image Sample | True Label | Predicted Label |
---|---|---|
(image) | Non-Cancer | Non-Cancer |
(image) | Cancer | Cancer |
The best trained model is saved as:
best_checkpoint.model
You can load it in PyTorch using:
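import torch
# Assumes the checkpoint holds the full pickled model (not just a state_dict),
# so the model class must be importable. weights_only needs PyTorch 1.13+ and
# must be False on 2.6+ to unpickle a full model; only load trusted files.
model = torch.load("best_checkpoint.model", map_location="cpu", weights_only=False)
model.eval()  # switch to inference mode (disables dropout, freezes batch-norm stats)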
- 👤 Arpan Gupta
- 📓 Notebook: Kaggle
This project is licensed under the MIT License. See the LICENSE file for details.