Prepare Datasets

Download BDD100K

Please first download the images and annotations from the official website. For more details about the dataset, please refer to the official documentation.

On the official download page, the required data and annotations for each task are:

  • image tagging set:
    • images: 100K Images
    • annotations: Detection 2020 Labels
  • object detection set:
    • images: 100K Images
    • annotations: Detection 2020 Labels
  • pose estimation set:
    • images: 100K Images
    • annotations: Pose Estimation Labels
  • instance segmentation set:
    • images: 10K Images
    • annotations: Instance Segmentation
  • semantic segmentation set:
    • images: 10K Images
    • annotations: Semantic Segmentation
  • panoptic segmentation set:
    • images: 10K Images
    • annotations: Panoptic Segmentation
  • drivable area set:
    • images: 100K Images
    • annotations: Drivable Area
  • box tracking (MOT) set:
    • images: MOT 2020 Images
    • annotations: MOT 2020 Labels
  • segmentation tracking (MOTS) set:
    • images: MOTS 2020 Images
    • annotations: MOTS 2020 Labels

We list all the tasks here for completeness, but you only need to download the images and labels for the task you are interested in.

Convert Annotations

For object detection, pose estimation, instance segmentation, and panoptic segmentation, please convert the official annotation files to COCO style with the scripts provided by BDD100K.

First, uncompress the downloaded annotation file and you will obtain a folder named bdd100k.
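Before converting, it can help to confirm that the extracted folder contains the label files a task needs. The following is a minimal sketch, not part of the BDD100K toolkit: `check_label_inputs` is a hypothetical helper name, and the paths in the usage comment are the detection inputs used in the conversion commands in this guide.

```shell
# Report which expected label inputs exist under the extracted folder.
# Usage: check_label_inputs ROOT REL_PATH...
# Example (detection inputs, relative to the extracted bdd100k folder):
#   check_label_inputs bdd100k \
#       labels/det_20/det_train.json labels/det_20/det_val.json
# Returns 0 only if every listed path exists.
check_label_inputs() {
    root="$1"
    shift
    rc=0
    for rel in "$@"; do
        if [ -e "${root}/${rel}" ]; then
            echo "found   ${rel}"
        else
            echo "missing ${rel}"
            rc=1
        fi
    done
    return "${rc}"
}
```

A non-zero return value suggests that the corresponding annotation archive has not been downloaded or was extracted to a different location.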

To convert the detection set, you can run:

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m det \
    -i bdd100k/labels/det_20/det_${SET_NAME}.json \
    -o bdd100k/jsons/det_${SET_NAME}_cocofmt.json

To convert the pose estimation set, you can run:

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m pose \
    -i bdd100k/labels/pose_21/pose_${SET_NAME}.json \
    -o bdd100k/jsons/pose_${SET_NAME}_cocofmt.json

To convert the instance segmentation set, you can either run (for bitmasks):

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m ins_seg --only-mask \
    -i bdd100k/labels/ins_seg/bitmasks/${SET_NAME} \
    -o bdd100k/jsons/ins_seg_${SET_NAME}_cocofmt.json \
    [--nproc ${NUM_PROCESS}]

or run (for RLEs):

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m ins_seg \
    -i bdd100k/labels/ins_seg/rles/ins_seg_${SET_NAME}.json \
    -o bdd100k/jsons/ins_seg_${SET_NAME}_cocofmt.json \
    [--nproc ${NUM_PROCESS}]
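The square brackets around `--nproc ${NUM_PROCESS}` appear to mark the argument as optional; they are not typed literally. As a rough sketch, assuming `--nproc` sets the number of worker processes, `NUM_PROCESS` could be derived from the number of online CPUs. Using `getconf` here is our own choice, not something the BDD100K scripts require.

```shell
# Pick a worker count for --nproc from the number of online CPUs.
# getconf _NPROCESSORS_ONLN is available on Linux and macOS; fall back to 1.
NUM_PROCESS="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"
echo "Using ${NUM_PROCESS} worker processes"

# The flag is then passed without the square brackets, e.g.:
#   python -m bdd100k.label.to_coco -m ins_seg \
#       -i bdd100k/labels/ins_seg/rles/ins_seg_${SET_NAME}.json \
#       -o bdd100k/jsons/ins_seg_${SET_NAME}_cocofmt.json \
#       --nproc "${NUM_PROCESS}"
```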

To convert the panoptic segmentation set, you can run (for bitmasks):

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco_panseg \
    -i bdd100k/labels/pan_seg/bitmasks/${SET_NAME} \
    -o bdd100k/jsons/pan_seg_${SET_NAME}_cocofmt.json \
    --pan-mask-base bdd100k/jsons/pan_seg/masks/${SET_NAME} \
    [--nproc ${NUM_PROCESS}]
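The `--pan-mask-base` option points at a directory under bdd100k/jsons that does not exist in a fresh download. The sketch below creates it for both splits beforehand. It assumes the converter does not create the directory on its own; if it does, `mkdir -p` is harmless. `prepare_pan_mask_dirs` is a hypothetical helper name.

```shell
# Create the panoptic mask output directories for both splits.
# Usage: prepare_pan_mask_dirs [ROOT]   (ROOT defaults to bdd100k)
prepare_pan_mask_dirs() {
    root="${1:-bdd100k}"
    for SET_NAME in train val; do
        mkdir -p "${root}/jsons/pan_seg/masks/${SET_NAME}" || return 1
    done
}
```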

For box and segmentation tracking, you can also convert the per-video annotations into a single COCO-style JSON annotation file.

To convert the box tracking set, you can run:

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m box_track \
    -i bdd100k/labels/box_track_20/${SET_NAME} \
    -o bdd100k/jsons/box_track_${SET_NAME}_cocofmt.json

To convert the segmentation tracking set, you can either run (for bitmasks):

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m seg_track --only-mask \
    -i bdd100k/labels/seg_track_20/bitmasks/${SET_NAME} \
    -o bdd100k/jsons/seg_track_${SET_NAME}_cocofmt.json \
    [--nproc ${NUM_PROCESS}]

or run (for RLEs):

mkdir -p bdd100k/jsons
python -m bdd100k.label.to_coco -m seg_track \
    -i bdd100k/labels/seg_track_20/rles/${SET_NAME} \
    -o bdd100k/jsons/seg_track_${SET_NAME}_cocofmt.json \
    [--nproc ${NUM_PROCESS}]

The ${SET_NAME} placeholder in the commands above can be either train or val.
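Since each command has to be run once per split, a small loop can cover both. The following sketch does this for the detection set only. `convert_det_all` and the `DRY_RUN` variable are hypothetical names introduced for illustration; the conversion command itself is the one shown above for detection.

```shell
# Convert the detection labels for every split.
# Usage: convert_det_all [ROOT]   (ROOT defaults to bdd100k)
# With DRY_RUN=1 the commands are printed instead of executed.
convert_det_all() {
    root="${1:-bdd100k}"
    for SET_NAME in train val; do
        in_json="${root}/labels/det_20/det_${SET_NAME}.json"
        out_json="${root}/jsons/det_${SET_NAME}_cocofmt.json"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "python -m bdd100k.label.to_coco -m det -i ${in_json} -o ${out_json}"
        else
            mkdir -p "${root}/jsons" || return 1
            python -m bdd100k.label.to_coco -m det \
                -i "${in_json}" -o "${out_json}" || return 1
        fi
    done
}
```

Running `DRY_RUN=1 convert_det_all` first is a cheap way to review the generated paths before starting the actual conversion. The other tasks would follow the same pattern with their own input and output paths.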

Symlink the Data

It is recommended to symlink the dataset root to $bdd100k-models/data. If your folder structure is different, you may need to change the corresponding paths in each config file, which is not recommended. Our full folder structure is as follows:
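As an illustration of the symlink step, the sketch below links an existing download into the repository. `link_bdd100k` is a hypothetical helper, and both arguments are placeholders for your own paths. The dataset root should be passed as an absolute path, because a relative symlink target is resolved relative to the link's location.

```shell
# Link an existing BDD100K download into the repository's data folder.
# Usage: link_bdd100k DATASET_ROOT REPO_ROOT
#   DATASET_ROOT: absolute path of the extracted bdd100k folder
#   REPO_ROOT:    path of the bdd100k-models checkout
link_bdd100k() {
    dataset_root="$1"
    repo_root="$2"
    if [ ! -d "${dataset_root}" ]; then
        echo "dataset root not found: ${dataset_root}" >&2
        return 1
    fi
    mkdir -p "${repo_root}/data" || return 1
    # -s: create a symbolic link; -f: replace an existing link;
    # -n: do not follow an existing link at the destination.
    ln -sfn "${dataset_root}" "${repo_root}/data/bdd100k"
}
```

After linking, the dataset contents should be reachable under data/bdd100k inside the repository, matching the tree shown here.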

bdd100k-models
└── data
    └── bdd100k
        ├── images
        │   ├── 100k
        │   │   ├── train
        │   │   └── val
        │   ├── 10k
        │   │   ├── train
        │   │   └── val
        │   ├── track
        │   │   ├── train
        │   │   ├── val
        │   │   └── test
        │   └── seg_track_20
        │       ├── train
        │       ├── val
        │       └── test
        ├── labels
        │   ├── det_20
        │   │   ├── det_train.json
        │   │   └── det_val.json
        │   ├── pose_21
        │   │   ├── pose_train.json
        │   │   └── pose_val.json
        │   ├── ins_seg
        │   │   ├── bitmasks
        │   │   │   ├── train
        │   │   │   └── val
        │   │   ├── colormaps
        │   │   ├── polygons
        │   │   └── rles
        │   ├── sem_seg
        │   │   ├── masks
        │   │   │   ├── train
        │   │   │   └── val
        │   │   ├── colormaps
        │   │   ├── polygons
        │   │   └── rles
        │   ├── pan_seg
        │   │   ├── bitmasks
        │   │   │   ├── train
        │   │   │   └── val
        │   │   ├── colormaps
        │   │   ├── polygons
        │   │   └── rles
        │   ├── drivable
        │   │   ├── masks
        │   │   │   ├── train
        │   │   │   └── val
        │   │   ├── colormaps
        │   │   ├── polygons
        │   │   └── rles
        │   ├── box_track_20
        │   │   ├── train
        │   │   └── val
        │   └── seg_track_20
        │       ├── bitmasks
        │       │   ├── train
        │       │   └── val
        │       ├── colormaps
        │       │   ├── train
        │       │   └── val
        │       ├── polygons
        │       │   ├── train
        │       │   └── val
        │       └── rles
        │           ├── train
        │           └── val
        └── jsons
            ├── det_train_cocofmt.json
            ├── det_val_cocofmt.json
            ├── pose_train_cocofmt.json
            ├── pose_val_cocofmt.json
            ├── ins_seg_train_cocofmt.json
            ├── ins_seg_val_cocofmt.json
            ├── pan_seg_train_cocofmt.json
            ├── pan_seg_val_cocofmt.json
            ├── pan_seg
            │   └── masks
            │       ├── train
            │       └── val
            ├── box_track_train_cocofmt.json
            ├── box_track_val_cocofmt.json
            ├── seg_track_train_cocofmt.json
            └── seg_track_val_cocofmt.json