Object detection and segmentation using the PennFudanPed/ dataset

This folder contains data and code samples for object detection and object segmentation. The code was adapted from the PyTorch TorchVision Object Detection Finetuning Tutorial and from David Macêdo's GitHub repository. The intent is to cover every stage of the object detection and segmentation pipeline as a programming exercise, although not every aspect can be covered. It uses pre-trained models from PyTorch and the Penn-Fudan Database for Pedestrian Detection and Segmentation.

Models and tools used

Python 3, [PyTorch](https://pytorch.org/).
Mask R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Models and pre-trained weights

Datasets

Penn-Fudan Database for Pedestrian Detection and Segmentation

Deep learning courses

Deep Learning courses

Links to tutorials, useful information

David Macêdo
TorchVision Object Detection Finetuning Tutorial. Good explanation about the use of classes for custom datasets.
TorchVision Instance Segmentation Finetuning Tutorial
Instance Segmentation with PyTorch and Mask R-CNN
Object Detection using PyTorch Faster R-CNN MobileNetV3.
Object Detection Tutorial with Torchvision.
Object detection reference training scripts. Reference training scripts for object detection.

Pytorch visualization utils Torchvision

Pytorch Models and pre-trained weights

Models and pre-trained weights

Pytorch tensors

Introduction to PyTorch Tensors (with video)
TORCH.TENSOR
PyTorch PIL to Tensor and vice versa
Pytorch Converting tensors to images
Good tutorial about Numpy. Introduction to NumPy and OpenCV
Data transfer to and from PyTorch
Beginners guide to Tensor operations in PyTorch.

Conversions between image formats

PIL.Image to Tensor. Converting an image to a Torch Tensor in Python
Numpy to PIL. Convert a NumPy array to an image
Plot torch.Tensor using OpenCV
How do I display a single image in PyTorch?

Installing tools

Torchvision utilities and Tensors

Torchvision examples

Using the PyTorch library to show images and masks.

File	Description
torchvision_01.py	Reads a .PNG image from PennFudanPed/ with the torchvision library, applies transformations on GPU or CPU, and shows the result on screen.
torchvision_02.py	Takes instance segmentation mask images, converts them from Tensor to Pillow images, and then merges the masks into a single image.

Use of tensors and transformation of tensors and images

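Basic examples using the image transforms offered by torchvision. The same conversion can be called in two ways: as a function from torchvision.transforms.functional, or as a transform object from torchvision.transforms.

# Functional form; tensor_img is an image tensor of shape (C, H, W)
import torchvision.transforms.functional as F
p_img_01 = F.to_pil_image(tensor_img)
p_img_01.show()

# Class form; gives the same result as the functional call
import torchvision.transforms as T
transform = T.ToPILImage()
p_img_01 = transform(tensor_img.cpu())  # PIL images are held in CPU memory
p_img_01.show()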

File	Description
tensor_conversion_pytorch.py	Reads images with read_image() and converts them; basic pipeline.
tensor_conversion_pil.py	Reads images with PIL.Image.open() and converts them; basic pipeline.
tensor_conversion_opencv.py	Reads images with OpenCV cv2.imread() and converts them; basic pipeline.

Connecting tensor conversion with deep learning models. The examples use Mask R-CNN (from torchvision.models.detection import maskrcnn_resnet50_fpn, instantiated as maskrcnn_resnet50_fpn(pretrained=True)). The result is a converted binary mask.

File	Description
tensor_conversion_01.py	Reads images with read_image() and converts them.
tensor_conversion_02.py	Reads images with PIL.Image.open() and converts them.
tensor_conversion_03.py	Reads images with cv2.imread() and converts them.
tensor_conversion_opencv_fasterrcnn.py	Reads images with cv2.imread(), converts them for the Faster R-CNN model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models.
tensor_conversion_opencv_fasterrcnn_02.py	Reads images with cv2.imread(), converts them for the Faster R-CNN V2 model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models.
tensor_conversion_opencv_maskrcnn.py	Reads images with cv2.imread(), converts them for the Mask R-CNN model, and returns the result in OpenCV format. A good example of conversions in a pipeline with models.
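
The conversions these scripts chain together can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: a synthetic NumPy array stands in for the result of cv2.imread() (height x width x 3, uint8, BGR order), a hand-written tensor stands in for one soft mask from Mask R-CNN, and all three helper names are hypothetical.

```python
import numpy as np
import torch


def bgr_to_tensor(bgr: np.ndarray) -> torch.Tensor:
    """OpenCV image (H, W, 3) uint8 BGR -> float tensor (3, H, W) RGB in [0, 1]."""
    rgb = bgr[:, :, ::-1].copy()  # reverse channels; copy() removes negative strides
    return torch.from_numpy(rgb).permute(2, 0, 1).float().div(255.0)


def tensor_to_bgr(rgb_tensor: torch.Tensor) -> np.ndarray:
    """Float tensor (3, H, W) RGB in [0, 1] -> OpenCV image (H, W, 3) uint8 BGR."""
    rgb = rgb_tensor.detach().cpu().mul(255.0).round().clamp(0, 255).to(torch.uint8)
    return rgb.permute(1, 2, 0).numpy()[:, :, ::-1].copy()


def soft_mask_to_binary(soft_mask: torch.Tensor, threshold: float = 0.5) -> np.ndarray:
    """Soft mask (1, H, W) with values in [0, 1] -> uint8 mask (H, W) of 0 or 255."""
    binary = soft_mask.detach().cpu().squeeze(0) >= threshold
    return binary.to(torch.uint8).mul(255).numpy()


# Synthetic 2x3 BGR image standing in for cv2.imread() output.
bgr = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
tensor = bgr_to_tensor(bgr)
restored = tensor_to_bgr(tensor)

# Synthetic soft mask standing in for one entry of the model's "masks" output.
soft = torch.tensor([[[0.1, 0.9], [0.5, 0.4]]])
mask = soft_mask_to_binary(soft)

print(tensor.shape, restored.shape, mask.dtype)
```

The 0.5 threshold is the value commonly used for Mask R-CNN soft masks; a different cut-off may suit other images better.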

This link explains data type conversion:

Creating and training a U-Net model with PyTorch for 2D & 3D semantic segmentation: Dataset building [1/4]

Model pipelines for bounding box (BBOX) and mask segmentation (MASK)

Training models

File	Description
./train_scripts/main_free_gpu_cache.py	Tool to free GPU memory.
./train_scripts/main_training_code.py	Code to train a people detector on the PennFudanPed/ dataset. The script produces a weights file in .pth format.
./train_scripts/tv-training-code_corrected.py	Original code to train a people detector on the PennFudanPed/ dataset. The script produces a weights file in .pth format.
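
Two steps named in the table, freeing GPU memory and writing a .pth weights file, can be sketched as below. This is a minimal illustration rather than the scripts' code: a tiny `nn.Linear` stands in for the detection model, and `free_gpu_cache` and `save_weights` are hypothetical helper names.

```python
import gc
import os
import tempfile

import torch
from torch import nn


def free_gpu_cache() -> None:
    """Release cached GPU memory that PyTorch no longer needs."""
    gc.collect()  # drop unreachable Python objects that may still hold tensors
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # hand cached, unused blocks back to the driver


def save_weights(model: nn.Module, path: str) -> None:
    """Write only the learned parameters (the state_dict) to a .pth file."""
    torch.save(model.state_dict(), path)


# A tiny stand-in model; the real scripts train a torchvision detection model.
model = nn.Linear(4, 2)
weights_path = os.path.join(tempfile.mkdtemp(), "stand_in_weights.pth")

save_weights(model, weights_path)
free_gpu_cache()
print(os.path.getsize(weights_path) > 0)
```

Saving the state_dict instead of the whole model object keeps the file independent of the Python class layout, at the cost of having to rebuild the model before loading.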

Evaluation

Testing bounding box (BBOX) and mask segmentation (MASK) models on a sequence of images from PennFudanPed/.

File	Description
eval_pennfudanpen_bbox_01.py	Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model.
eval_pennfudanpen_mask_01.py	Detects people in the PennFudanPed/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model.

Testing bounding box (BBOX) and mask segmentation (MASK) models on a sequence of regular images.

File	Description
eval_story_rgb_bbox_01.py	Detects people in the story_rgb/ dataset with the pretrained torchvision.models.detection.fasterrcnn_resnet50_fpn model.
eval_story_rgb_mask_01.py	Detects apples in the story_rgb/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model.
IMPORTANT! eval_story_rgb_mask_02.py	Detects apples in the story_rgb/ dataset with the pretrained torchvision.models.detection.maskrcnn_resnet50_fpn model and saves the results in an output/ folder.

Checking the trained weights stored in a .pth file with a Mask R-CNN model.

File	Description
main_evaluate_pennfudanpen_code.py	Detects people in random images from the PennFudanPed/ dataset, using the torchvision.models.detection.maskrcnn_resnet50_fpn model with trained weights loaded from a .pth file.
main_evaluate_people_code.py	Detects people in test images, using the torchvision.models.detection.maskrcnn_resnet50_fpn model with trained weights loaded from a .pth file.
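
Loading trained weights from a .pth file usually follows the pattern sketched below. It is an illustration, not the evaluation scripts themselves: a tiny `nn.Linear` stands in for the detector so the example runs without downloads, and `load_weights` is a hypothetical helper name. For the real scripts, the model passed in would presumably be a maskrcnn_resnet50_fpn built with the same number of classes used during training.

```python
import os
import tempfile

import torch
from torch import nn


def load_weights(model: nn.Module, path: str, device: torch.device) -> nn.Module:
    """Load a state_dict from a .pth file and prepare the model for inference."""
    # map_location lets weights saved on a GPU load on a CPU-only machine.
    state_dict = torch.load(path, map_location=device)
    model.load_state_dict(state_dict)
    model.to(device)
    model.eval()  # switch off training-only behavior such as dropout
    return model


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for a trained model whose weights were written by a training script.
trained = nn.Linear(4, 2)
weights_path = os.path.join(tempfile.mkdtemp(), "stand_in_weights.pth")
torch.save(trained.state_dict(), weights_path)

# A fresh model with the same architecture receives the stored parameters.
restored = load_weights(nn.Linear(4, 2), weights_path, device)
print(restored.training)
```

`load_state_dict` raises an error when the architecture does not match the file, which is a useful early check that the right .pth file was picked.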

Webcam examples (RGB cameras)

File	Description
webcam_basic_loop_01.py	Basic loop that extracts frames from the webcam, without object detection.
webcam_obj_detect_01.py	A simple object detector; its performance is limited.
webcam_obj_detect_02.py	Demo of BBOX object detection. It reads a stream from a webcam and detects objects.
webcam_obj_detect_pre_bbox.py	Demo of BBOX object detection with the default pre-trained Mask R-CNN model.
webcam_obj_detect_pre_mask.py	Demo of MASK object detection with the default pre-trained Mask R-CNN model.
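
The basic frame loop behind these demos can be sketched as follows. This is an assumption about the general shape of such a loop, not the code of webcam_basic_loop_01.py; `read_frames` and `main` are hypothetical names. OpenCV is imported only inside `main()`, so the frame generator can be tried without a camera.

```python
from typing import Any, Iterator, Optional


def read_frames(capture: Any, max_frames: Optional[int] = None) -> Iterator[Any]:
    """Yield frames from an object with an OpenCV-style read() method."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:  # camera unplugged or stream ended
            break
        yield frame
        count += 1


def main(camera_index: int = 0) -> None:
    """Show the webcam stream until 'q' is pressed (requires opencv-python)."""
    import cv2  # imported here so read_frames() works without OpenCV installed

    capture = cv2.VideoCapture(camera_index)
    try:
        for frame in read_frames(capture):
            # An object detector would process `frame` at this point.
            cv2.imshow("webcam", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Calling `main()` opens the default camera. Keeping the loop in a generator makes it easy to swap the webcam for a video file or a list of test frames.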

Requirements

Hardware and software stack used

Ubuntu 20.04.3 LTS 64-bit.
Windows 10
Intel® Core™ i7-8750H CPU @ 2.20GHz × 12.
GeForce GTX 1050 Ti Mobile.
Python 3.8.10

Editing tools

Python stack environment

Create the environment

python3 -m pip install python-venv
pip3 install python-venv
python -m venv ./object_detector_tutorial_venv
source ./object_detector_tutorial_venv/bin/activate
python --version
pip install --upgrade pip

Installing libraries

pip install -r requirements_windows.txt

Installing in Windows 10

pip install opencv-python

Installing on Ubuntu 20.04 LTS

Install Python tools

sudo apt install python3-pip
sudo apt install python3.8-venv

Installing CUDA toolkit (Linux notes)

Removing any previous NVIDIA installation

sudo rm /etc/apt/sources.list.d/cuda*
sudo apt remove --autoremove nvidia-cuda-toolkit
sudo apt remove --autoremove nvidia-*
sudo rm -rf /usr/local/cuda*
sudo apt-get purge nvidia*
sudo apt-get update
sudo apt-get autoremove
sudo apt-get autoclean

Install nvidia-cuda-toolkit

Download the current toolkit from the NVIDIA website.

Installing driver

sudo apt-get update
sudo ubuntu-drivers autoinstall
# nvidia-driver-470 (driver package)

Checking CUDA version installed

nvcc --version
nvidia-smi
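
The same checks can also be run from Python, which is convenient inside the virtual environment. A minimal sketch using the standard library plus an optional PyTorch check; `tool_output` and `cuda_report` are hypothetical helper names, not part of the repository.

```python
import shutil
import subprocess
from typing import List, Optional


def tool_output(command: List[str]) -> Optional[str]:
    """Return the command's stdout, or None if it is missing or fails."""
    if shutil.which(command[0]) is None:
        return None  # executable is not on PATH
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, check=True, timeout=30
        )
    except (subprocess.SubprocessError, OSError):
        return None
    return result.stdout


def cuda_report() -> dict:
    """Collect what nvcc, nvidia-smi and PyTorch say about CUDA."""
    report = {
        "nvcc": tool_output(["nvcc", "--version"]),
        "nvidia-smi": tool_output(["nvidia-smi"]),
        "torch_cuda_available": None,
        "torch_cuda_version": None,
    }
    try:
        import torch  # optional: only reported when PyTorch is installed
    except ImportError:
        return report
    report["torch_cuda_available"] = torch.cuda.is_available()
    report["torch_cuda_version"] = torch.version.cuda
    return report


for key, value in cuda_report().items():
    print(key, "->", "not found" if value is None else str(value).strip()[:60])
```

A mismatch between the toolkit version reported by nvcc and the CUDA version PyTorch was built with is not necessarily a problem, since PyTorch binaries ship their own CUDA runtime; the driver reported by nvidia-smi is what must be recent enough.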


juancarlosmiranda/object_detector_tutorial
