
Graph Reasoned Multi-Scale Road Segmentation in Remote Sensing Imagery



🚀 Presented at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 2023 Conference in Pasadena, California, USA 🚀

Example: Rapid City-Scale Road Network Extraction (Las Vegas)

Road network predictions on a small region of Las Vegas with geo-referenced 1300x1300x3 RGB images from the ArcGIS World Imagery Map Service (ground resolution of 0.3 m² / pixel) using QGIS. The image below is a 9x6 grid of tiles.


Areal inference speed: ~650 km² / hour / GPU
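
As a rough sanity check on this figure, the arithmetic below estimates how long the 9x6 example grid would take at the quoted rate. It reads the stated ground resolution as 0.3 m per pixel side, so each 1300x1300 tile would cover 390 m x 390 m. That reading is an assumption made for illustration, not something confirmed by the authors.

```python
# Back-of-the-envelope estimate. ASSUMPTION: 0.3 m per pixel side.
TILE_PX = 1300            # tile width/height in pixels (from the text)
GSD_M = 0.3               # assumed metres per pixel side
GRID = (9, 6)             # tiles in the example figure (from the text)
RATE_KM2_PER_HOUR = 650   # quoted areal inference speed per GPU

tile_side_km = TILE_PX * GSD_M / 1000.0           # side length of one tile
tile_area_km2 = tile_side_km ** 2                 # area of one tile
grid_area_km2 = tile_area_km2 * GRID[0] * GRID[1] # area of the whole grid
seconds = grid_area_km2 / RATE_KM2_PER_HOUR * 3600

print(f"tile area: {tile_area_km2:.4f} km^2")
print(f"grid area: {grid_area_km2:.2f} km^2")
print(f"estimated inference time: {seconds:.0f} s")
```

Under that assumption the 54-tile grid covers roughly 8.2 km² and would take on the order of 45 seconds on one GPU.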

Paper link: Graph Reasoned Multi-Scale Road Segmentation in Remote Sensing Imagery

How to Run

1. Dataset Instructions

DeepGlobe (Kaggle account required),
MassachusettsRoads (Kaggle account required),
Spacenet (AWS account required).

Once you download either the DeepGlobe or the Massachusetts Roads datasets, extract their contents into a "DeepGlobe" or "MassachusettsRoads" folder respectively in the Datasets folder.

For Spacenet, the procedure is a bit more involved...

We need the images in 8-bit format.
After downloading AOIs 2-5 (Vegas, Paris, Shanghai, Khartoum), go to the CRESI repository and select "SpaceNet 5 Baseline Part 1 - Data Prep".
Convert the images as described in the link, then create continuous masks. Binarize these masks so that each pixel is either 0 or 1, and place them in /Datasets/Spacenet/trainval_labels/train_masks/
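
The binarization step can be sketched as follows. This is a minimal illustration with NumPy, assuming that any pixel with a value above zero in the continuous mask counts as road. The function name binarize_mask and the threshold of 0 are assumptions for this example, not part of the repository.

```python
import numpy as np

def binarize_mask(mask: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Return a uint8 mask that is 1 where mask > threshold and 0 elsewhere."""
    # ASSUMPTION: every non-zero pixel of the continuous mask is road.
    return (mask > threshold).astype(np.uint8)

# Toy continuous mask standing in for one loaded from disk.
continuous = np.array([[0, 35, 0], [0, 255, 120]], dtype=np.uint8)
binary = binarize_mask(continuous)
print(binary.tolist())  # [[0, 1, 0], [0, 1, 1]]
```

Reading and writing the actual mask files is left out on purpose, since the file format depends on how the masks were produced.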

Next, locate the "PS-MS" folder in each corresponding AOI_#_<city> directory.
Move all image files in each of these "PS-MS" folders to /Datasets/Spacenet/trainval/.
Likewise, locate the "MUL-PanSharpen" folder in each corresponding AOI_#_<city>_Roads_Test_Public directory and move all of these image files to /Datasets/Spacenet/test/
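
The two moves above can be scripted. The sketch below uses only the Python standard library and assumes the AOI directories sit directly under a single download folder. The folder name spacenet_downloads and the helper names are hypothetical, so adjust them to your own layout.

```python
import shutil
from pathlib import Path

def find_train_folders(root: Path) -> list:
    """PS-MS folders inside AOI_#_<city> directories (test directories excluded)."""
    return sorted(p for p in root.glob("AOI_*/PS-MS")
                  if p.is_dir() and "Roads_Test_Public" not in p.parent.name)

def find_test_folders(root: Path) -> list:
    """MUL-PanSharpen folders inside AOI_#_<city>_Roads_Test_Public directories."""
    return sorted(p for p in root.glob("AOI_*_Roads_Test_Public/MUL-PanSharpen")
                  if p.is_dir())

def collect_images(folders, dest: Path) -> int:
    """Move every file in the given folders into dest; return how many were moved."""
    dest.mkdir(parents=True, exist_ok=True)
    moved = 0
    for folder in folders:
        for image in sorted(folder.iterdir()):
            if image.is_file():
                shutil.move(str(image), str(dest / image.name))
                moved += 1
    return moved

downloads = Path("spacenet_downloads")  # HYPOTHETICAL download location
if downloads.is_dir():
    n_train = collect_images(find_train_folders(downloads), Path("Datasets/Spacenet/trainval"))
    n_test = collect_images(find_test_folders(downloads), Path("Datasets/Spacenet/test"))
    print(f"moved {n_train} train/val images and {n_test} test images")
```

The script moves rather than copies, matching the instructions, so keep the original archives if you may need the untouched files again.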

2. Setup

Create an environment with anaconda: conda create --name <your_env_name> python=3.9
Next, activate your environment: conda activate <your_env_name>
Install dependencies from pip: pip install -r requirements.txt
Install dependencies from conda:
conda install gdal
conda install pytorch=1.13.0 torchvision=0.14 pytorch-cuda=11.6 -c pytorch -c nvidia

Now we will create our cropped images for each train/val/test part (where applicable) of a chosen Dataset.
In the console enter: python -d Datasets -cs 512 -j <name of dataset> (-cs is the crop-size)
The dataset name should be identical to the ones in the Dataset Instructions section. Wait approximately 15 minutes.
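
To give a feel for what a crop size of 512 means on a 1300x1300 tile, the sketch below computes crop windows. It assumes non-overlapping crops with the last row and column shifted back so they stay inside the image. The repository's own cropping script may use a different stride or overlap, and the function names here are made up for illustration.

```python
def crop_origins(length: int, crop: int) -> list[int]:
    """Start offsets of crops of size `crop` that together cover `length` pixels."""
    if length <= crop:
        return [0]  # a smaller image yields one window and would need padding
    starts = list(range(0, length - crop + 1, crop))
    if starts[-1] + crop < length:
        starts.append(length - crop)  # shift the final crop back into bounds
    return starts

def crop_boxes(width: int, height: int, crop: int = 512) -> list[tuple]:
    """(left, top, right, bottom) boxes in row-major order."""
    return [(x, y, x + crop, y + crop)
            for y in crop_origins(height, crop)
            for x in crop_origins(width, crop)]

boxes = crop_boxes(1300, 1300, crop=512)
print(len(boxes), boxes[0], boxes[-1])  # 9 (0, 0, 512, 512) (788, 788, 1300, 1300)
```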

Cropped Image Disk Space:
DeepGlobe ~= 24.3GB
MassachusettsRoads ~= 9.71GB
Spacenet ~= 25GB
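
Before cropping, it may be worth checking that the target drive has enough room. The sketch below compares free space against the figures listed above using shutil.disk_usage. It treats the sizes as decimal gigabytes, which is an assumption, and has_room is a hypothetical helper name.

```python
import shutil

# Approximate cropped-image sizes from the list above, in GB.
CROPPED_GB = {"DeepGlobe": 24.3, "MassachusettsRoads": 9.71, "Spacenet": 25.0}

def has_room(dataset: str, path: str = ".") -> bool:
    """True if the drive holding `path` has at least the listed space free."""
    free_gb = shutil.disk_usage(path).free / 1e9  # ASSUMPTION: 1 GB = 1e9 bytes
    return free_gb >= CROPPED_GB[dataset]

for name in CROPPED_GB:
    print(f"{name}: {'ok' if has_room(name) else 'not enough free space'}")
```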

3. Training

All training was performed on a single NVIDIA GeForce RTX 2080 Ti (11GB VRAM).
See the cfg.json file to ensure that the training settings are appropriate for your rig.
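
One quick way to review the memory-sensitive settings is to list every batch_size entry in cfg.json. The layout of that file is not shown here, so the sketch below makes no assumption about its structure and simply searches the whole JSON tree. The helper find_key is hypothetical.

```python
import json
from pathlib import Path

def find_key(node, key, path=""):
    """Yield (dotted_path, value) for every occurrence of `key` in nested JSON data."""
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}.{name}" if path else name
            if name == key:
                yield child, value
            yield from find_key(value, key, child)
    elif isinstance(node, list):
        for i, value in enumerate(node):
            yield from find_key(value, key, f"{path}[{i}]")

cfg_path = Path("cfg.json")
if cfg_path.is_file():
    cfg = json.loads(cfg_path.read_text())
    for where, value in find_key(cfg, "batch_size"):
        print(f"{where} = {value}")
```

If training runs out of GPU memory on a smaller card, lowering the reported batch sizes is the usual first thing to try.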

To train the model from scratch, run:
python -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name>

Example: python -m ConvNeXt_UPerNet_DGCN_MTL -d MassachusettsRoads -e MassachusettsRoads

To resume the training of a model:
python -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -r ./Experiments/<experiment_name>/model_best.pth.tar

To fine-tune a pre-trained model on a new dataset:
python -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -rd ./Experiments/<experiment_name>/model_best.pth.tar

For example, one can use pre-trained MassachusettsRoads model weights to start training for DeepGlobe or Spacenet to speed up convergence.

4. Evaluation

Back up your log files (*.txt) in ./Experiments/<experiment_name>/

Once training ends (default: 120 epochs), evaluate the Precision, Recall, F1, IoU (relaxed) and IoU (accurate) metrics by running:
python -m ConvNeXt_UPerNet_DGCN_MTL -d <dataset_name> -e <experiment_name> -r ./Experiments/<experiment_name>/model_best.pth.tar

The evaluation script uses elements from the utils folder of [3].

This will create a ./Experiments/<experiment_name>/images_eval folder in which each file shows (clockwise) the original image, its label, a feature heat-map and the stitched prediction. Note that for MassachusettsRoads, you should use a validation batch_size of 3 (cfg.json) when creating the images.
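
For readers unfamiliar with the reported metrics, the sketch below computes pixel-wise Precision, Recall, F1 and IoU from a pair of binary masks using their standard definitions. It is an illustration only: the repository's evaluation code (based on [3]) may differ in detail, and the relaxed IoU variant is not covered here. The function name road_metrics is hypothetical.

```python
import numpy as np

def road_metrics(pred: np.ndarray, label: np.ndarray) -> dict:
    """Pixel-wise precision, recall, F1 and IoU for two binary masks."""
    pred = pred.astype(bool)
    label = label.astype(bool)
    tp = int(np.logical_and(pred, label).sum())    # road predicted and present
    fp = int(np.logical_and(pred, ~label).sum())   # road predicted, not present
    fn = int(np.logical_and(~pred, label).sum())   # road missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}

# Toy masks: 2 true positives, 1 false positive, 1 false negative.
pred = np.array([[1, 1, 0], [0, 1, 0]])
label = np.array([[1, 0, 0], [0, 1, 1]])
metrics = road_metrics(pred, label)
print({k: round(v, 3) for k, v in metrics.items()})
```

On the toy masks, precision, recall and F1 all come out at 2/3 and IoU at 0.5.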

To evaluate the APLS metric refer to this link.

5. Results

You may also refer to this link for better viewing.

6. References

[1] N. Weir et al., “SpaceNet MVOI: A Multi-View Overhead Imagery Dataset”, 2019 IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 992-1001, doi: 10.1109/ICCV.2019.00108.

[2] I. Demir et al., “DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images”, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2018, pp. 172-17209, doi: 10.1109/CVPRW.2018.00031.

[3] A. Batra, S. Singh, G. Pang, S. Basu, C. V. Jawahar and M. Paluri, “Improved Road Connectivity by Joint Learning of Orientation and Segmentation”, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10377-10385, doi: 10.1109/CVPR.2019.01063.

[4] L. Zhang et al., “Dual Graph Convolutional Network for Semantic Segmentation”, 2019 British Machine Vision Conference (BMVC), 2019.

[5] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, S. Xie, “A ConvNet for the 2020s”, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 11976-11986.

[6] T. Xiao, Y. Liu, B. Zhou, Y. Jiang, J. Sun, “Unified perceptual parsing for scene understanding”. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11209, pp. 432–448. Springer, Cham (2018).

[7] A. Van Etten, D. Lindenbaum, T. Bacastow, “SpaceNet: A Remote Sensing Dataset and Challenge Series”, 2018.

[8] V. Mnih, “Machine Learning for Aerial Image Labeling”, PhD Dissertation, University of Toronto, 2013.

[9] W.G.C. Bandara, J.M.J. Valanarasu, V.M. Patel, “SPIN Road Mapper: Extracting Roads from Aerial Images via Spatial and Interaction Space Graph Reasoning for Autonomous Driving”, arXiv preprint arXiv:2109.07701, 2021.