Merge pull request #8 from BMW-InnovationLab/dev-yolov4

Yolov3-v4 Training automation
BMW-InnovationLab · Jul 16, 2020 · 821e9a3
2 parents 7a5e42d + 647a3aa
commit 821e9a3
Showing 22 changed files with 1,925 additions and 92 deletions.
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
-# YOLOv3 Training Automation API for Linux
-
-This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed!  You provide your labeled dataset and you can start the training right away and monitor it in many different ways like TensorBoard or a custom REST API and GUI. Training with YOLOv3 has never been so easy.
+# YOLOv4-v3 Training Automation API for Linux
 
+This repository allows you to get started with training a state-of-the-art Deep Learning model with little to no configuration needed!  You provide your labeled dataset and you can start the training right away and monitor it in many different ways like TensorBoard or a custom REST API and GUI. Training with YOLOv4 has never been so easy.
+This repository is also compatible with YOLOv3 training.
 ![](swagger_yolo_training.png)
 
 ## Prerequisites
@@ -22,7 +22,7 @@ chmod +x scripts/install_docker.sh && source scripts/install_docker.sh
 Prepare the docker image with all weights for GPU usage
 
 ```bash
-sudo docker build -f docker/Dockerfile -t darknet_yolo_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 --build-arg DOWNLOAD_ALL=1 .
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 --build-arg DOWNLOAD_ALL=1 .
 ```
 ![terminal_example](gifs/1.gif)
 
@@ -38,20 +38,35 @@ The environment is dockerized to run on GPU or CPU.
 For GPU, you need to build the image in the following way:
 
 ```bash
-sudo docker build -f docker/Dockerfile -t darknet_yolo_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 .
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 .
 ```
 
+If you have a Volta, Xavier, Turing, or newer GPU, build with CUDNN_HALF=1:
+
+```bash
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=1 --build-arg OPENCV=1 .
+```
+If you are behind a proxy, pass your proxy settings as build arguments:
+```bash
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_gpu:1 --build-arg GPU=1 --build-arg CUDNN=1 --build-arg CUDNN_HALF=1 --build-arg OPENCV=1 --build-arg http_proxy='' --build-arg https_proxy='' .
+```
 For CPU only, you can run the same command while setting GPU=0 CUDNN=0 and naming it darknet_yolo_cpu:1 for clarity.
 
 ```bash
-sudo docker build -f docker/Dockerfile -t darknet_yolo_cpu:1 --build-arg GPU=0 --build-arg CUDNN=0 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 .
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_cpu:1 --build-arg GPU=0 --build-arg CUDNN=0 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 .
+```
+If you want multi-core CPU training, you can add OPENMP=1 to the build:
+
+```bash
+sudo docker build -f docker/Dockerfile -t darknet_yolov4_cpu:1 --build-arg GPU=0 --build-arg CUDNN=0 --build-arg CUDNN_HALF=0 --build-arg OPENCV=1 --build-arg OPENMP=1 .
 ```
 
 ## Preparing your dataset
 
 We provided a `sample_dataset` to show how your data should be structured in order to start the training seamlessly.
 The `train_config.json` file found in `sample_dataset` is a copy of the template `config/train_config.json.template` with needed modifications.  The template can as well be copied as is while making sure to remove the '.template' from the name.
-You can also provide your own `train.txt` and `test.txt` to specify which images will be used for training and which ones are for testing.  If not provided, the dataset will be split according to the `data/train_ratio` (by default 80% train 20% test)
+You can also provide your own `train.txt` and `test.txt` to specify which images will be used for training and which ones are for testing.  If not provided, the dataset will be split according to the `data/train_ratio` (by default 80% train 20% test).
+If you are training with **YOLOv4**, make sure to set `model/model-name` in `train_config.json` to `yolov4` instead of `yolov3`. YOLOv4-specific hyperparameters ("mosaic", "blur") are also available.
 
 ## Starting the training
 
@@ -78,23 +93,23 @@ dogs-dataset_20191110_14:21:41
 ├── config
 │   ├── obj.data
 │   ├── obj.names
-│   └── yolov3.cfg
+│   └── yolov4.cfg
 ├── test.txt
 ├── train.txt
 ├── weights
 │   ├── initial.weights
-│   ├── yolov3_10000.weights
-│   ├── yolov3_1000.weights
-│   ├── yolov3_2000.weights
-│   ├── yolov3_3000.weights
-│   ├── yolov3_4000.weights
-│   ├── yolov3_5000.weights
-│   ├── yolov3_6000.weights
-│   ├── yolov3_7000.weights
-│   ├── yolov3_8000.weights
-│   ├── yolov3_9000.weights
-│   ├── yolov3_best.weights
-│   └── yolov3_last.weights
+│   ├── yolov4_10000.weights
+│   ├── yolov4_1000.weights
+│   ├── yolov4_2000.weights
+│   ├── yolov4_3000.weights
+│   ├── yolov4_4000.weights
+│   ├── yolov4_5000.weights
+│   ├── yolov4_6000.weights
+│   ├── yolov4_7000.weights
+│   ├── yolov4_8000.weights
+│   ├── yolov4_9000.weights
+│   ├── yolov4_best.weights
+│   └── yolov4_last.weights
 ├── yolo_events.log
 └── yolo_events.log.1
 ```
@@ -132,38 +147,30 @@ Some of the elements are specific to YOLO itself like saturation, hue, rotation,
 ## Benchmark
 
 <table>
-  <tr>
-    <th></th>
-    <th colspan="4">OS</th>
-  </tr>
-  <tr>
-    <td></td>
-    <td>Windows</td>
-    <td colspan="3">Ubuntu</td>
-  </tr>
-  <tr>
-    <td></td>
-    <td>CPU</td>
-    <td colspan="2">CPU</td>
-    <td>GPU</td>
-  </tr>
-  <tr>
-    <td></td>
-    <td>Intel Xeon CPU 2.3 GHz</td>
-    <td>Intel Xeon CPU 2.3 GHz</td>
-    <td>Intel Core i9-7900 3.3 GHz</td>
-    <td>GeForce GTX 1080</td>
-  </tr>
-  <tr>
-    <td>pascalvoc_dataset</td>
-    <td>0.793 second/image</td>
-    <td>0.885 second/image</td>
-    <td>0.295 second/image</td>
-    <td>0.0592 second/image</td>
-  </tr>
+    <thead align="center">
+        <tr>
+            <th></th>
+            <th colspan=3>Ubuntu</th>
+        </tr>
+    </thead>
+    <thead align="center">
+        <tr>
+            <th>Network\Hardware</th>
+            <th>Intel Xeon CPU 2.3 GHz</th>
+            <th>Intel Core i9-7900 3.3 GHz</th>
+            <th>Tesla V100</th>
+        </tr>
+    </thead>
+    <tbody align="center">
+        <tr>
+            <td>COCO Dataset</td>
+            <td>0.259 seconds/image</td>
+            <td>0.281 seconds/image</td>
+            <td>0.0691 seconds/image</td>
+        </tr>
+    </tbody>
 </table>
 
-
 ## Preparing weights
 
 Default yolo weights are provided on the [official website](https://pjreddie.com/darknet/yolo/).
@@ -174,7 +181,7 @@ Change your current working directory to be inside the repo. The following steps
 yolov3.weights
 
 ```bash
-wget https://pjreddie.com/media/files/yolov3.weights -P config/darknet/yolo_default_weights
+wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov3.weights -P config/darknet/yolo_default_weights
 ```
 
 yolov3-tiny.weights
@@ -189,6 +196,13 @@ darknet53.conv.74
 wget https://pjreddie.com/media/files/darknet53.conv.74 -P config/darknet/yolo_default_weights
 ```
 
+ yolov4.weights
+
+```bash
+wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights -P config/darknet/yolov4_default_weights
+```
+
+
 
 ## Known Issues
 
@@ -198,6 +212,8 @@ Issue related to darknet itself can be filed in [the correct repo](https://githu
 - If during training you see nan values for avg (loss) field - then training goes wrong, but if nan is in some other lines - then training goes well.
 - If error Out of memory occurs then you should try increasing subdivisions to 16, 32 or 64 or have a smaller image size.
 - If training finishes immediately without any error you should decrease batch size and subdivisions.
+- If training on multiple GPUs produces nan values, try decreasing the learning rate to 0.00065.
+
 
 ## Acknowledgements
 
@@ -208,3 +224,5 @@ Issue related to darknet itself can be filed in [the correct repo](https://githu
 Lynn Nassif, Beirut, Lebanon
 
 Nour Azzi, Beirut, Lebanon
+
+Hadi Koubeissy, Beirut, Lebanon
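
The README text in this diff says that when `train.txt` and `test.txt` are not provided, the dataset is split according to `data/train_ratio` (80% train, 20% test by default). The sketch below shows one way such a split could work. It is not the repository's implementation: the function name `split_dataset`, its arguments, and the use of an integer percentage for the ratio are assumptions made for illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): shuffle a list of image
# paths and write the first N percent to train.txt and the rest to test.txt.
#   $1: text file with one image path per line
#   $2: training share as an integer percentage (assumed format, e.g. 80)
#   $3: output directory for train.txt and test.txt
split_dataset() {
    local list_file=$1 ratio_percent=$2 out_dir=$3
    local total train_count

    total=$(wc -l < "$list_file")
    # Integer arithmetic: the training count is rounded down.
    train_count=$(( total * ratio_percent / 100 ))

    # Shuffle once so both output files come from the same random order.
    shuf "$list_file" > "$out_dir/shuffled.txt"
    head -n "$train_count" "$out_dir/shuffled.txt" > "$out_dir/train.txt"
    tail -n +"$(( train_count + 1 ))" "$out_dir/shuffled.txt" > "$out_dir/test.txt"
    rm "$out_dir/shuffled.txt"
}
```

With a list of 10 image paths and a ratio of 80, this writes 8 paths to `train.txt` and 2 to `test.txt`, with no path appearing in both files. `shuf` comes from GNU coreutils, which is normally available on the Linux systems this README targets.
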
diff --git a/config/darknet/yolo_default_cfgs/yolov3-tiny.cfg b/config/darknet/yolo_default_cfgs/yolov3-tiny.cfg
@@ -15,6 +15,7 @@ saturation = 1.5
 exposure = 1.5
 hue=.1
 
+letter_box=1
 learning_rate=0.001
 burn_in=1000
 max_batches = 500200

diff --git a/config/darknet/yolo_default_cfgs/yolov3.cfg b/config/darknet/yolo_default_cfgs/yolov3.cfg
@@ -15,6 +15,7 @@ saturation=1.5
 exposure=1.5
 hue=.1
 
+letter_box=1
 learning_rate=0.001
 burn_in=1000
 max_batches=500200

diff --git a/config/darknet/yolo_default_weights/download_weights.sh b/config/darknet/yolo_default_weights/download_weights.sh
@@ -6,11 +6,11 @@ else
     if test $1 -eq 1; then
         echo "Downloading Weights"
         echo "Downloading yolov3.weights"
-        wget https://pjreddie.com/media/files/yolov3.weights -P $2
+        wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov3.weights -P $2 -4
         echo "Downloading yolov3-tiny.weights"
-        wget https://pjreddie.com/media/files/yolov3-tiny.weights -P $2
+        wget https://pjreddie.com/media/files/yolov3-tiny.weights -P $2 -4 
         echo "Downloading darknet53.conv.74"
-        wget https://pjreddie.com/media/files/darknet53.conv.74 -P $2
+        wget https://pjreddie.com/media/files/darknet53.conv.74 -P $2 -4 
         echo "All weights downloaded!"
     fi
 fi
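
The `download_weights.sh` change above appends `-4` to each `wget` call (wget's `--inet4-only` option) and switches the `yolov3.weights` URL to the AlexeyAB release. The sketch below restates that pattern with the target directory quoted and the URLs in a loop. The function name `weights_download_commands` is an assumption, and the function only prints the commands instead of running them, so it can be inspected without network access.

```bash
#!/usr/bin/env bash

# Hypothetical dry-run helper (not part of the repository): print the wget
# commands that the script in this diff would run.
#   $1: 1 to list all downloads, anything else to list nothing
#   $2: directory the weights would be saved to (wget -P)
weights_download_commands() {
    local download_all=$1 target_dir=$2
    [ "$download_all" -eq 1 ] || return 0

    local url
    for url in \
        "https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov3.weights" \
        "https://pjreddie.com/media/files/yolov3-tiny.weights" \
        "https://pjreddie.com/media/files/darknet53.conv.74"; do
        # %q shell-quotes each argument, so a directory containing spaces
        # still yields a valid command line; -4 forces IPv4, as in the diff.
        printf 'wget %q -P %q -4\n' "$url" "$target_dir"
    done
}
```

Calling `weights_download_commands 1 config/darknet/yolo_default_weights` prints one `wget` line per weights file. Whether the real script should quote `$2` the same way is a judgment call for the maintainers.
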