- Cover the 4th, 5th, 6th, 7th, and 8th feature maps with default boxes (4 sets), giving 7308 boxes (8732 in the paper) for each category.
- Pinpoint the default boxes that match a ground-truth box, i.e. those whose IoU with it is higher than 0.5.
- Finally, overlay every matched box from the 4th through 8th feature maps and extract the prediction boxes with non-maximum suppression (NMS).
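
The matching and suppression steps above can be sketched in plain NumPy. This is only an illustration, not the code used in this repository: the function names `iou`, `match_default_boxes`, and `nms` are hypothetical, boxes are assumed to be `(x_min, y_min, x_max, y_max)` rows, and the NMS threshold of 0.45 is an assumed value (only the 0.5 matching threshold comes from the text).

```python
import numpy as np


def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes.

    All boxes are assumed to be (x_min, y_min, x_max, y_max).
    """
    inter_w = np.clip(
        np.minimum(box[2], boxes[:, 2]) - np.maximum(box[0], boxes[:, 0]), 0, None
    )
    inter_h = np.clip(
        np.minimum(box[3], boxes[:, 3]) - np.maximum(box[1], boxes[:, 1]), 0, None
    )
    inter = inter_w * inter_h
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def match_default_boxes(gt_box, default_boxes, threshold=0.5):
    """Indices of default boxes whose IoU with the ground-truth box exceeds threshold."""
    return np.where(iou(gt_box, default_boxes) > threshold)[0]


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    order = np.argsort(scores)[::-1]  # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep


default_boxes = np.array([[0.0, 0.0, 1.0, 1.0],
                          [0.0, 0.0, 0.5, 0.5],
                          [0.1, 0.1, 1.0, 1.0]])
gt_box = np.array([0.0, 0.0, 1.0, 1.0])
matched = match_default_boxes(gt_box, default_boxes)

candidates = np.array([[0.0, 0.0, 1.0, 1.0],
                       [0.05, 0.05, 1.0, 1.0],
                       [2.0, 2.0, 3.0, 3.0]])
kept = nms(candidates, np.array([0.9, 0.8, 0.7]))
```

In a full detector, NMS would typically be run once per category over the boxes gathered from all feature maps.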
This file contains the coordinates of every prior_box (default box). There are 7308 boxes.
Each box contains x_min, y_min, x_max, y_max, variance_1, variance_2, variance_3, variance_4.
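
As a rough illustration of that layout, the sketch below splits an array with eight values per box into coordinates and variances. The function name `split_prior_boxes` is hypothetical, and the demo array is a synthetic stand-in with placeholder values, not the contents of the real file; how the file itself is loaded is not shown here.

```python
import numpy as np


def split_prior_boxes(priors):
    """Split an (N, 8) prior-box array into (N, 4) coordinates and (N, 4) variances.

    Column order is assumed to follow the description above:
    x_min, y_min, x_max, y_max, variance_1, variance_2, variance_3, variance_4.
    """
    priors = np.asarray(priors, dtype=np.float32)
    if priors.ndim != 2 or priors.shape[1] != 8:
        raise ValueError("expected an array of shape (N, 8)")
    return priors[:, :4], priors[:, 4:]


# Synthetic stand-in for the real file: 7308 identical rows with placeholder values.
row = np.array([0.1, 0.1, 0.3, 0.3, 0.1, 0.1, 0.2, 0.2], dtype=np.float32)
demo_priors = np.tile(row, (7308, 1))

coords, variances = split_prior_boxes(demo_priors)
```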
Gradient-weighted Class Activation Mapping (Grad-CAM) is a visualization technique for understanding what a Convolutional Neural Network responds to. In more detail, it uses the gradients of any target concept (say, the logits for 'dog' or even a caption) flowing into the final convolutional layer to produce a coarse localization map that highlights the regions of an image that are important for predicting the concept. Furthermore, by combining this localization map with the Guided Backpropagation output, it yields a high-resolution visualization. There are roughly two algorithm flows: Class Activation Mapping (CAM) and Guided Backpropagation. CAM is one of the fundamental ideas behind Grad-CAM.
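
The coarse localization map described above can be sketched in NumPy as follows. This is a minimal illustration and not the code in `conduct_gradcam.py`: the function name `grad_cam` is hypothetical, and the feature maps and gradients are assumed to have already been extracted from the final convolutional layer as `(H, W, K)` arrays.

```python
import numpy as np


def grad_cam(feature_maps, gradients):
    """Compute a coarse Grad-CAM map.

    feature_maps: (H, W, K) activations of the final convolutional layer.
    gradients:    (H, W, K) gradients of the target score w.r.t. those activations.
    Returns an (H, W) map scaled to [0, 1].
    """
    # Channel weights: global average of the gradients over the spatial axes.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)
    # Weighted sum of the feature maps, followed by ReLU.
    cam = np.maximum((feature_maps * weights).sum(axis=-1), 0.0)
    # Normalize for display; skip if the map is all zeros.
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


# Tiny hand-made example: two channels on a 2x2 grid.
activations = np.zeros((2, 2, 2))
activations[..., 0] = [[1.0, 2.0], [3.0, 4.0]]
activations[..., 1] = 1.0
grads = np.zeros((2, 2, 2))
grads[..., 0] = 1.0   # the target score increases with channel 0
grads[..., 1] = -1.0  # and decreases with channel 1

cam = grad_cam(activations, grads)
```

To display the result, the map is usually resized to the input image size and overlaid on it as a heatmap.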
- marknet_module/run_marknet.py: model execution script (PyTorch model)
- marknet_module/model/MarkNet.py: MarkNet model (PyTorch implementation)
- marknet_module/utils/create_dataset_csv.py: writes out the CSV file of training image paths
- marknet_module/utils/conduct_gradcam.py: Grad-CAM module
- marknet_module/utils/data_loader.py: input pipeline module (PyTorch implementation)
Weights are ported from the original models and are available here. You need weights_SSD300.hdf5; weights_300x300_old.hdf5 is for the old version of the architecture with a 3x3 convolution for pool6.
This code was tested with Keras v1.2.2, TensorFlow v1.0.0, and OpenCV v3.1.0-dev.