Sub-folders containing model-specific information #4

Open · wants to merge 22 commits into main
Empty file added .gitattributes
Empty file.
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
*.pth
*.h5
*.pt
3 changes: 3 additions & 0 deletions .idea/.gitignore
Member


We should exclude IDE configuration from being checked in to the repo -- I recommend removing the .idea directory and its contents

11 changes: 11 additions & 0 deletions .idea/FathomNet.iml
5 changes: 5 additions & 0 deletions .idea/codeStyles/codeStyleConfig.xml
6 changes: 6 additions & 0 deletions .idea/inspectionProfiles/profiles_settings.xml
4 changes: 4 additions & 0 deletions .idea/misc.xml
8 changes: 8 additions & 0 deletions .idea/modules.xml
7 changes: 7 additions & 0 deletions .idea/other.xml
6 changes: 6 additions & 0 deletions .idea/vcs.xml

Binary file added Fish_and_Squid_Detector/Figures/augmentations.png
Binary file added Fish_and_Squid_Detector/Figures/data_dirs.png
Binary file added Fish_and_Squid_Detector/Figures/yolov5_dirs.png
328 changes: 328 additions & 0 deletions notebooks/Training_on_FathomNet_Custom_Data.ipynb
@@ -0,0 +1,328 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/github/heinsense2/AIO_CaseStudy/blob/main/notebooks/Training_on_FathomNet_Custom_Data.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "jpTpijONjCPL"
},
"source": [
"## Custom training using YOLOv5 on Fathomnet custom dataset\n",
"\n",
"This notebook explains how to train a custom dataset using YOLOv5 to recognize different marine species presnt in the Monterey bay. This notebook serves as a guideline to produce the results presented in the paper\n",
"\n",
" *Demystifying image-based machine learning: a practical guide to automated analysis of imagery using modern machine learning tools*, \n",
"\n",
"\n",
"The data is prepared using code available [here](https://github.com/heinsense2/AIO_CaseStudy/tree/main/data/scripts).\n",
"\n",
"NOTE: If you wish to use this notebook, you will need to make changes to refer to the datast locations and python script parameters amoung others.\n",
"\n",
"Here are the relevant steps:\n",
"\n",
"* Create the dataset and annotations (labels). Organize directories.\n",
"* Export dataset to YOLOv5\n",
"* Train YOLOv5 to recognize the objects (marine animals) in our dataset\n",
"* Evaluate our YOLOv5 model's performance\n",
"* Run inference to view the model at work\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "uruljSVEk_hc"
},
"source": [
"# 1. Install requirements"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "xaA_HZi8lK4U"
},
"outputs": [],
"source": [
"# Clone YOLOv5\n",
"!git clone https://github.com/ultralytics/yolov5 # clone repo\n",
"%cd yolov5\n",
"%pip install -qr requirements.txt # install dependencies\n",
"%pip install torch==1.8.1 torchvision==0.9.1\n",
"\n",
"import torch\n",
"import os\n",
"from IPython.display import Image, clear_output # to display images\n",
"\n",
"print(f\"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})\")\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "KXnRWZ2nnQuD"
},
"source": [
"# 2. Assemble Dataset\n",
"\n",
"To train our model, we need to assemble a dataset of representative images with bounding boxes around the objects we want to detect. Our dataset must be in YOLOv5 format.\n",
"\n",
"The Fathomnet data is downloaded and prepared using code available [here](https://github.com/heinsense2/AIO_CaseStudy/tree/main/data/scripts).\n",
"\n",
"\n",
"When usig Google Colab, it is recommended to have the data available on Google Drive. So we need to first mount our Google Drive.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "Vncfn_fqoeLW"
},
"outputs": [],
"source": [
"# Mount Google Drive\n",
"from google.colab import drive\n",
"drive.mount('/content/gdrive')\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "07QSJM5iuvar"
},
"outputs": [],
"source": [
"# List the directory where the data resides\n",
"# For example:\n",
"#!ls \"/content/gdrive/My Drive/data\""
]
},
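{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check of the YOLOv5-format dataset, the cell below counts image/label pairs and prints one label file. In YOLOv5 format, each image has a matching `.txt` file in which every line is `class x_center y_center width height`, with coordinates normalized to the image size. The `data_dir` path and the `images/train` / `labels/train` layout are examples; adjust them to match your prepared dataset.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sanity-check the prepared dataset (example path -- adjust to your layout)\n",
"import glob\n",
"\n",
"data_dir = '/content/gdrive/My Drive/data/pre_2012/yolov5'  # example location\n",
"images = glob.glob(data_dir + '/images/train/*')\n",
"labels = glob.glob(data_dir + '/labels/train/*.txt')\n",
"print(f'{len(images)} training images, {len(labels)} label files')\n",
"\n",
"# Peek at one label file: each line is 'class x_center y_center width height'\n",
"if labels:\n",
"    with open(labels[0]) as f:\n",
"        print(f.read())"
]
},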
{
"cell_type": "markdown",
"metadata": {
"id": "X7yAi9hd-T4B"
},
"source": [
"# 3. Train Our Custom YOLOv5 model\n",
"\n",
"We are able to pass a number of arguments to ```train.py```. To see all the settable arguments:\n",
"\n",
">>\n",
"```python\n",
"python train.py -h\n",
"```\n",
"\n",
"Here is what we used:\n",
"- **img:** define input image size\n",
"- **batch:** determine batch size\n",
"- **epochs:** define the number of training epochs. We use 300. (Note: typically 300-1000 are used although 1000+ are common here!)\n",
"- **data:** Our dataset locaiton is saved in the `data.location`\n",
"- **weights:** specify a path to weights to start transfer learning from. Here we choose the pretrained model yolov5s, which is the small and quickest one.\n",
"- **cache:** cache images for faster training\n",
"\n",
"The commad to train in general will have the form:\n",
"\n",
">>\n",
"```python\n",
"python train.py --img 640 --batch 16 --epochs 300 --data {data.directory}/{domain}.yaml --weights yolov5s.pt --cache\n",
"```\n"
]
},
{
"cell_type": "markdown",
"source": [
"Below is an example on how to train YOLOv5 on data from years prior to 2012 using pretrained ```python\n",
"--weights yolov5s.pt```. Image size is set to 640 and we use batch size of 16.\n"
],
"metadata": {
"id": "K2pVR3QcAGdb"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "LQttoa-D2mgn"
},
"outputs": [],
"source": [
"# Train using pre-2012 data\n",
"!python train.py --img 640 --batch 16 --epochs 300 --data ../data/pre_2012/yolov5/pre_2012.yaml --weights yolov5s.pt --cache"
]
},
{
"cell_type": "markdown",
"source": [
"To train on the other spatial/depth and temporal regions using the command above, you only need to change the ```--data``` parameter to point to the corresponding yaml file.\n",
"\n",
"In this notebook, we will continue using pre-2012 data as example code."
],
"metadata": {
"id": "ZIqMTTCGdoeM"
}
},
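{
"cell_type": "markdown",
"metadata": {},
"source": [
"For example, a training run on the post-2012 data would look like the cell below. The yaml filename is illustrative; use whichever yaml file your data-preparation scripts actually produced.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Train using post-2012 data (illustrative yaml path -- adjust to your dataset)\n",
"#!python train.py --img 640 --batch 16 --epochs 300 --data ../data/post_2012/yolov5/post_2012.yaml --weights yolov5s.pt --cache"
]
},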
{
"cell_type": "markdown",
"metadata": {
"id": "AcIRLQOlA14A"
},
"source": [
"# Evaluate Custom YOLOv5 Detector Performance\n",
"All results are logged by default to runs/train, with a new experiment directory created for each new training (runs/train/exp2, runs/train/exp3)\n",
"\n",
"Training losses and performance metrics are saved to Tensorboard and also to a CSV logfile results.csv\n",
"\n",
"If you are new to these metrics, the one you want to focus on is `mAP_0.5` - learn more about mean average precision [here](https://blog.roboflow.com/mean-average-precision/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "1jS9_BxdBBHL"
},
"outputs": [],
"source": [
"# Start tensorboard\n",
"# Launch after you have started training\n",
"# logs save in the folder \"runs/train/exp*\"\n",
"%load_ext tensorboard\n",
"%tensorboard --logdir runs"
]
},
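{
"cell_type": "markdown",
"metadata": {},
"source": [
"Alternatively, you can read the metrics from `results.csv` directly with pandas, as sketched below. This assumes the training results were saved to `runs/train/exp`; YOLOv5 pads its CSV column names with spaces, so they are stripped first, and the exact column names may vary between YOLOv5 versions.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect training metrics from the CSV log (assumes results in runs/train/exp)\n",
"import pandas as pd\n",
"\n",
"results = pd.read_csv('runs/train/exp/results.csv')\n",
"results.columns = results.columns.str.strip()  # column names are padded with spaces\n",
"print(results.columns.tolist())\n",
"\n",
"# Plot mAP@0.5 over the training epochs if the column is present\n",
"if 'metrics/mAP_0.5' in results.columns:\n",
"    results.plot(x='epoch', y='metrics/mAP_0.5')"
]
},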
{
"cell_type": "markdown",
"source": [
"You can also validate the trained detection model on the test or out-of-domain datasets by using the val.py script in YOLOv5 and the corresponding yaml file.\n",
"\n",
"For the pre-2012 trained model, the following commands will validate it on the test or out-of-domain datasets, assuming the results were saved to ```runs/train/exp```."
],
"metadata": {
"id": "ffwecYhRKF8Q"
}
},
{
"cell_type": "code",
"source": [
"# Validate model on test dataset\n",
"!python val.py --data ../data/pre_2012/yolov5/pre_2012.yaml --weights runs/train/exp/weights/best.pt --task test\n",
"\n",
"# Validate model on out-of-domain (post-2012) dataset\n",
"!python val.py --data ../data/post_2012/yolov5/post_2012_as_out_of_domain.yaml --weights runs/train/exp/weights/best.pt --task test\n",
"\n"
],
"metadata": {
"id": "o-_FjSzJKqlv"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "jtmS7_TXFsT3"
},
"source": [
"#Run Inference With Trained Weights on New Images\n",
"\n",
"Once the model is trained, you can run inference witht the best pretrained checkpoint. `best.pt` on test (`images/test`) or out-of-domain (`all/images`) images.\n",
"\n",
"`--conf 0.65` only detections where the confidence level of it being an actual detection is 0.65 or greater.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "NkJrK89z_O0q"
},
"outputs": [],
"source": [
"# Run detection on pre-2012 images\n",
"!python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.65 --source \"../data/pre_2012/yolov5/images/test\"\n",
"\n",
"# Run detection on out-of-domain (post-2012) images\n",
"!python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.65 --source \"../data/post_2012/yolov5/all/images\"\n",
"\n"
]
},
{
"cell_type": "markdown",
"source": [
"To visualize the detection results, there are many options. One simple way using IPython display is to presented below."
],
"metadata": {
"id": "htF5pmgWldBO"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "e9h22JUt_m56"
},
"outputs": [],
"source": [
"# Display inference on ALL resulting detection images\n",
"\n",
"import glob\n",
"from IPython.display import Image, display\n",
"\n",
"# Assume resulting images from detection are in runs/detect/exp\n",
"for imageName in glob.glob('/content/yolov5/runs/detect/exp/*.png'): #assuming PNG\n",
" display(Image(filename=imageName))\n",
" print(\"\\n\")"
]
},
{
"cell_type": "markdown",
"source": [
"# Export Trained Weights for Future Inference\n",
"You can now export the trained weights from our detector for inference on your device elsewhere.\n"
],
"metadata": {
"id": "KAXD9i3I91CV"
}
},
{
"cell_type": "code",
"source": [
"#export your model's weights for future use\n",
"from google.colab import files\n",
"files.download('./runs/train/exp/weights/best.pt')"
],
"metadata": {
"id": "yfgdDBYC-ztS"
},
"execution_count": null,
"outputs": []
}
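,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To use the exported weights outside this notebook, one option is to load them through PyTorch Hub, as in the sketch below. This assumes `best.pt` has been copied to the working directory and that the yolov5 dependencies can be fetched; the image path is a placeholder.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Load the exported weights via PyTorch Hub (assumes best.pt is in the working directory)\n",
"import torch\n",
"\n",
"model = torch.hub.load('ultralytics/yolov5', 'custom', path='best.pt')\n",
"results = model('path/to/some_image.jpg')  # placeholder image path\n",
"results.print()"
]
}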
],
"metadata": {
"accelerator": "GPU",
"colab": {
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
},
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 0
}