Frozen #10996

Open: wants to merge 14 commits into base: master
6 changes: 6 additions & 0 deletions README.md
@@ -1,3 +1,9 @@
# This is my fork of the official TensorFlow repository.
For my work with DeepLab and the RUGD dataset, please go to research/deeplab, where you can find more information.
#
#
#

<div align="center">
<img src="https://storage.googleapis.com/tf_model_garden/tf_model_garden_logo.png">
</div>
333 changes: 11 additions & 322 deletions research/deeplab/README.md
@@ -1,325 +1,14 @@
# DeepLab: Deep Labelling for Semantic Image Segmentation
# Welcome to my fork of official DeepLab repository
I uploaded my new files and my changes to the official DeepLab files here. For the official DeepLab README, please see README_original.md <br />

**To new and existing DeepLab users**: We have released a unified codebase for
dense pixel labeling tasks in TensorFlow2 at https://github.com/google-research/deeplab2.
Please consider switching to the newer codebase for better support.
This repository contains my bachelor thesis work. The aim of the thesis was to build a semantic segmentation model that can recognize various types of surfaces in nature. Robots used by BUT researchers (https://www.vut.cz/en/) may use my trained model to decide whether a robot can traverse a given type of terrain.<br />

DeepLab is a state-of-the-art deep learning model for semantic image segmentation,
where the goal is to assign semantic labels (e.g., person, dog, cat and so on)
to every pixel in the input image. Current implementation includes the following
features:
To train the model, I used the RUGD dataset (http://rugd.vision/).<br />
To see examples of RUGD images segmented by the trained model, please go to datasets/rugd/exp/train_on_trainval_set/vis/segmentation_results/<br />
After some small changes (commenting or uncommenting lines) in test_video.py and test_video2.py, you can try the trained model on your own images or video.<br />
video_z_datasetu.mp4 is a video of a test ride with BUT's robot. <br />
If you want to train the model on the RUGD dataset with 8-bit images, you can use the color palette RUGD_vsetky_farby.pal with add_palete.bat so that all images share the same colormap.<br />
vis.py is my modified version of vis_orig.py (which is the original vis.py, renamed).<br />
There are two new files in the utils directory, get_dataset_colormap.py and save_annotation.py. They contain code for working with the RUGD dataset.<br />
In datasets, there are two modified files, data_generator.py and build_voc2012_data_Vlastne_obrazky.py, which are used to create the TFRecord files for RUGD.
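As a rough illustration of the colormap step described above, a color-mapped label image can be converted to class indices before TFRecord creation. This is a minimal pure-Python sketch; the colors and class names below are hypothetical placeholders, not the actual RUGD palette:

```python
# Hypothetical palette mapping RGB color -> class index
# (placeholder values, NOT the real RUGD colormap).
PALETTE = {
    (0, 0, 0): 0,      # void
    (0, 255, 0): 1,    # grass (placeholder)
    (128, 64, 0): 2,   # dirt (placeholder)
}

def mask_to_labels(mask):
    """Convert a 2-D grid of RGB tuples into a 2-D grid of class indices."""
    return [[PALETTE[pixel] for pixel in row] for row in mask]

mask = [[(0, 255, 0), (0, 0, 0)],
        [(128, 64, 0), (0, 255, 0)]]
print(mask_to_labels(mask))  # [[1, 0], [2, 1]]
```

In practice the real conversion operates on image arrays, but the core idea is the same: every pixel color must map to exactly one class index, which is why all images need to share the same colormap.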

1. DeepLabv1 [1]: We use *atrous convolution* to explicitly control the
resolution at which feature responses are computed within Deep Convolutional
Neural Networks.

2. DeepLabv2 [2]: We use *atrous spatial pyramid pooling* (ASPP) to robustly
segment objects at multiple scales with filters at multiple sampling rates
and effective fields-of-views.

3. DeepLabv3 [3]: We augment the ASPP module with *image-level feature* [5, 6]
to capture longer range information. We also include *batch normalization*
[7] parameters to facilitate the training. In particular, we apply atrous
convolution to extract output features at different output strides during
training and evaluation, which efficiently enables training BN at output
stride = 16 and attains a high performance at output stride = 8 during
evaluation.

4. DeepLabv3+ [4]: We extend DeepLabv3 to include a simple yet effective
decoder module to refine the segmentation results especially along object
boundaries. Furthermore, in this encoder-decoder structure one can
arbitrarily control the resolution of extracted encoder features by atrous
convolution to trade-off precision and runtime.
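The atrous-convolution behaviour underlying the four variants above can be sketched in a few lines of plain Python (an illustration, not code from this repository): a k x k filter dilated by rate r covers the same span as a filter of size k + (k - 1)(r - 1), which is how resolution can be controlled without adding parameters.

```python
def effective_kernel_size(k: int, rate: int) -> int:
    """Span covered by a k x k filter dilated with the given atrous rate."""
    return k + (k - 1) * (rate - 1)

# A 3x3 filter with rate 1 is an ordinary convolution.
print(effective_kernel_size(3, 1))  # 3
# With rate 2 it covers the same span as a 5x5 filter,
# enlarging the field-of-view at no extra parameter cost.
print(effective_kernel_size(3, 2))  # 5
print(effective_kernel_size(3, 4))  # 9
```

ASPP applies several such rates in parallel, so one layer sees objects at several effective scales at once.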

If you find the code useful for your research, please consider citing our latest
works:

* DeepLabv3+:

```
@inproceedings{deeplabv3plus2018,
title={Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation},
author={Liang-Chieh Chen and Yukun Zhu and George Papandreou and Florian Schroff and Hartwig Adam},
booktitle={ECCV},
year={2018}
}
```

* MobileNetv2:

```
@inproceedings{mobilenetv22018,
title={MobileNetV2: Inverted Residuals and Linear Bottlenecks},
author={Mark Sandler and Andrew Howard and Menglong Zhu and Andrey Zhmoginov and Liang-Chieh Chen},
booktitle={CVPR},
year={2018}
}
```

* MobileNetv3:

```
@inproceedings{mobilenetv32019,
title={Searching for MobileNetV3},
author={Andrew Howard and Mark Sandler and Grace Chu and Liang-Chieh Chen and Bo Chen and Mingxing Tan and Weijun Wang and Yukun Zhu and Ruoming Pang and Vijay Vasudevan and Quoc V. Le and Hartwig Adam},
booktitle={ICCV},
year={2019}
}
```

* Architecture search for dense prediction cell:

```
@inproceedings{dpc2018,
title={Searching for Efficient Multi-Scale Architectures for Dense Image Prediction},
author={Liang-Chieh Chen and Maxwell D. Collins and Yukun Zhu and George Papandreou and Barret Zoph and Florian Schroff and Hartwig Adam and Jonathon Shlens},
booktitle={NIPS},
year={2018}
}

```

* Auto-DeepLab (also called hnasnet in core/nas_network.py):

```
@inproceedings{autodeeplab2019,
title={Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic
Image Segmentation},
author={Chenxi Liu and Liang-Chieh Chen and Florian Schroff and Hartwig Adam
and Wei Hua and Alan Yuille and Li Fei-Fei},
booktitle={CVPR},
year={2019}
}

```


In the current implementation, we support adopting the following network
backbones:

1. MobileNetv2 [8] and MobileNetv3 [16]: A fast network structure designed
for mobile devices.

2. Xception [9, 10]: A powerful network structure intended for server-side
deployment.

3. ResNet-v1-{50,101} [14]: We provide both the original ResNet-v1 and its
'beta' variant where the 'stem' is modified for semantic segmentation.

4. PNASNet [15]: A powerful network structure found by neural architecture
search.

5. Auto-DeepLab (called HNASNet in the code): A segmentation-specific network
backbone found by neural architecture search.

This directory contains our TensorFlow [11] implementation. We provide codes
allowing users to train the model, evaluate results in terms of mIOU (mean
intersection-over-union), and visualize segmentation results. We use PASCAL VOC
2012 [12] and Cityscapes [13] semantic segmentation benchmarks as an example in
the code.
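For reference, the mIOU metric mentioned above can be computed from per-class intersection and union counts. This is a minimal pure-Python sketch over flattened label lists, not the evaluation code shipped in this directory:

```python
def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union over classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and g == c for p, g in zip(pred, gt))
        union = sum(p == c or g == c for p, g in zip(pred, gt))
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious)

pred = [0, 0, 1, 1]
gt   = [0, 1, 1, 1]
# class 0: IoU = 1/2, class 1: IoU = 2/3, mean = 7/12
print(mean_iou(pred, gt, 2))
```

Real evaluation code accumulates these counts over whole images via a confusion matrix, but the per-class ratio is the same.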

Some segmentation results on Flickr images:
<p align="center">
<img src="g3doc/img/vis1.png" width=600></br>
<img src="g3doc/img/vis2.png" width=600></br>
<img src="g3doc/img/vis3.png" width=600></br>
</p>

## Contacts (Maintainers)

* Liang-Chieh Chen, github: [aquariusjay](https://github.com/aquariusjay)
* YuKun Zhu, github: [yknzhu](https://github.com/YknZhu)
* George Papandreou, github: [gpapan](https://github.com/gpapan)
* Hui Hui, github: [huihui-personal](https://github.com/huihui-personal)
* Maxwell D. Collins, github: [mcollinswisc](https://github.com/mcollinswisc)
* Ting Liu: github: [tingliu](https://github.com/tingliu)

## Table of Contents

Demo:

* <a href='https://colab.sandbox.google.com/github/tensorflow/models/blob/master/research/deeplab/deeplab_demo.ipynb'>Colab notebook for off-the-shelf inference.</a><br>

Running:

* <a href='g3doc/installation.md'>Installation.</a><br>
* <a href='g3doc/pascal.md'>Running DeepLab on PASCAL VOC 2012 semantic segmentation dataset.</a><br>
* <a href='g3doc/cityscapes.md'>Running DeepLab on Cityscapes semantic segmentation dataset.</a><br>
* <a href='g3doc/ade20k.md'>Running DeepLab on ADE20K semantic segmentation dataset.</a><br>

Models:

* <a href='g3doc/model_zoo.md'>Checkpoints and frozen inference graphs.</a><br>

Misc:

* Please check <a href='g3doc/faq.md'>FAQ</a> if you have some questions before reporting the issues.<br>

## Getting Help

To get help with issues you may encounter while using the DeepLab TensorFlow
implementation, create a new question on
[StackOverflow](https://stackoverflow.com/) with the tag "tensorflow".

Please report bugs (i.e., broken code, not usage questions) to the
tensorflow/models GitHub [issue
tracker](https://github.com/tensorflow/models/issues), prefixing the issue name
with "deeplab".

## License

All the code in the deeplab folder is covered by the [LICENSE](https://github.com/tensorflow/models/blob/master/LICENSE)
under tensorflow/models. Please refer to the LICENSE for details.

## Change Logs

### March 26, 2020
* Supported EdgeTPU-DeepLab and EdgeTPU-DeepLab-slim on Cityscapes.
**Contributor**: Yun Long.

### November 20, 2019
* Supported MobileNetV3 large and small model variants on Cityscapes.
**Contributor**: Yukun Zhu.


### March 27, 2019

* Supported using different loss weights on different classes during training.
**Contributor**: Yuwei Yang.


### March 26, 2019

* Supported ResNet-v1-18. **Contributor**: Michalis Raptis.


### March 6, 2019

* Released the evaluation code (under the `evaluation` folder) for image
parsing, a.k.a. panoptic segmentation. In particular, the released code supports
evaluating the parsing results in terms of both the parsing covering and
panoptic quality metrics. **Contributors**: Maxwell Collins and Ting Liu.


### February 6, 2019

* Updated decoder module to exploit multiple low-level features with different
output_strides.

### December 3, 2018

* Released the MobileNet-v2 checkpoint on ADE20K.


### November 19, 2018

* Supported NAS architecture for feature extraction. **Contributor**: Chenxi Liu.

* Supported hard pixel mining during training.


### October 1, 2018

* Released MobileNet-v2 depth-multiplier = 0.5 COCO-pretrained checkpoints on
PASCAL VOC 2012, and Xception-65 COCO pretrained checkpoint (i.e., no PASCAL
pretrained).


### September 5, 2018

* Released Cityscapes pretrained checkpoints with found best dense prediction cell.


### May 26, 2018

* Updated ADE20K pretrained checkpoint.


### May 18, 2018
* Added builders for ResNet-v1 and Xception model variants.
* Added ADE20K support, including colormap and pretrained Xception_65 checkpoint.
* Fixed a bug on using non-default depth_multiplier for MobileNet-v2.


### March 22, 2018

* Released checkpoints using MobileNet-V2 as network backbone and pretrained on
PASCAL VOC 2012 and Cityscapes.


### March 5, 2018

* First release of DeepLab in TensorFlow including deeper Xception network
backbone. Included checkpoints that have been pretrained on PASCAL VOC 2012
and Cityscapes.

## References

1. **Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs**<br />
Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille (+ equal
contribution). <br />
[[link]](https://arxiv.org/abs/1412.7062). In ICLR, 2015.

2. **DeepLab: Semantic Image Segmentation with Deep Convolutional Nets,**
**Atrous Convolution, and Fully Connected CRFs** <br />
Liang-Chieh Chen+, George Papandreou+, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille (+ equal
contribution). <br />
[[link]](http://arxiv.org/abs/1606.00915). TPAMI 2017.

3. **Rethinking Atrous Convolution for Semantic Image Segmentation**<br />
Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam.<br />
[[link]](http://arxiv.org/abs/1706.05587). arXiv: 1706.05587, 2017.

4. **Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation**<br />
Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam.<br />
[[link]](https://arxiv.org/abs/1802.02611). In ECCV, 2018.

5. **ParseNet: Looking Wider to See Better**<br />
Wei Liu, Andrew Rabinovich, Alexander C Berg<br />
[[link]](https://arxiv.org/abs/1506.04579). arXiv:1506.04579, 2015.

6. **Pyramid Scene Parsing Network**<br />
Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia<br />
[[link]](https://arxiv.org/abs/1612.01105). In CVPR, 2017.

7. **Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift**<br />
Sergey Ioffe, Christian Szegedy <br />
[[link]](https://arxiv.org/abs/1502.03167). In ICML, 2015.

8. **MobileNetV2: Inverted Residuals and Linear Bottlenecks**<br />
Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen<br />
[[link]](https://arxiv.org/abs/1801.04381). In CVPR, 2018.

9. **Xception: Deep Learning with Depthwise Separable Convolutions**<br />
François Chollet<br />
[[link]](https://arxiv.org/abs/1610.02357). In CVPR, 2017.

10. **Deformable Convolutional Networks -- COCO Detection and Segmentation Challenge 2017 Entry**<br />
Haozhi Qi, Zheng Zhang, Bin Xiao, Han Hu, Bowen Cheng, Yichen Wei, Jifeng Dai<br />
[[link]](http://presentations.cocodataset.org/COCO17-Detect-MSRA.pdf). ICCV COCO Challenge
Workshop, 2017.

11. **Tensorflow: Large-Scale Machine Learning on Heterogeneous Distributed Systems**<br />
M. Abadi, A. Agarwal, et al. <br />
[[link]](https://arxiv.org/abs/1603.04467). arXiv:1603.04467, 2016.

12. **The Pascal Visual Object Classes Challenge – A Retrospective,** <br />
Mark Everingham, S. M. Ali Eslami, Luc Van Gool, Christopher K. I. Williams, John
Winn, and Andrew Zisserman. <br />
[[link]](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/). IJCV, 2014.

13. **The Cityscapes Dataset for Semantic Urban Scene Understanding**<br />
Cordts, Marius, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler, Rodrigo Benenson, Uwe Franke, Stefan Roth, Bernt Schiele. <br />
[[link]](https://www.cityscapes-dataset.com/). In CVPR, 2016.

14. **Deep Residual Learning for Image Recognition**<br />
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. <br />
[[link]](https://arxiv.org/abs/1512.03385). In CVPR, 2016.

15. **Progressive Neural Architecture Search**<br />
Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy. <br />
[[link]](https://arxiv.org/abs/1712.00559). In ECCV, 2018.

16. **Searching for MobileNetV3**<br />
Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam. <br />
[[link]](https://arxiv.org/abs/1905.02244). In ICCV, 2019.