Base-architecture : ResNet 50
Dataset : CIFAR-10 (https://www.cs.toronto.edu/~kriz/cifar.html)
The key idea is to emphasize relevant information and suppress the rest.
- In neural entworks, information is compressed in the form of feature map.
- Feature map X Attention -> Refined feature map
- ResNet50 stage 3 output feature map
- Features are averaged over channel axis and normalized per layer statistics
Generalizable attention module
- Can be adapted to any convolutional neural networks
- Global Avg. Pooling -> Squeeze -> FC*2 (Hu et al., CVPR 2018)
- Increase a little learnable parameters
- Implement attention module
- Report the followings for each of it and compare them - Numbers to Report
- Top-1 and Top-5 Errors
- of the baseline
- with channel attention
- Parameters and Flops
- Analyze the learned features using Grad-CAM method (http://arxiv.org/abs/1610.02391)
- New createive Idea