mseg-dataset/mseg-mturk

This repo contains the Amazon Mechanical Turk (AMT) workflow scripts for the paper:

MSeg: A Composite Dataset for Multi-domain Semantic Segmentation (CVPR 2020, Official Repo) [PDF]
John Lambert*, Zhuang Liu*, Ozan Sener, James Hays, Vladlen Koltun
Presented at CVPR 2020. Link to MSeg Video (3min)

This repo is the fourth of 4 repos that introduce our work. It provides utilities to perform large-scale Mechanical Turk re-labeling.

  • mseg-api: utilities to download the MSeg dataset, prepare the data on disk in a unified taxonomy, and map labels to the unified taxonomy on the fly during training.
  • mseg-semantic: provides HRNet-W48 pre-trained models and training code (sufficient to train a winning entry on the WildDash benchmark)

One additional repo will be introduced in August 2020:

  • mseg-panoptic: provides Panoptic-FPN and Mask-RCNN training, based on Detectron2

Dependencies

Install the mseg module from mseg-api.

Install the MSeg-mturk module:

  • mseg_mturk can be installed as a Python package using

      pip install -e /path_to_root_directory_of_the_repo/
    

Make sure that you can run `import mseg_mturk` in Python, and you are good to go!
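As a quick sanity check of the two installs above, the following minimal sketch reports whether each module can be found by the Python import system. The helper name `is_importable` is hypothetical and is not part of mseg-api or mseg_mturk; only the module names `mseg` and `mseg_mturk` come from this README.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located (hypothetical helper)."""
    # find_spec returns None when no top-level module with this name is installed.
    return importlib.util.find_spec(module_name) is not None


# Report the status of both dependencies named in this README.
for name in ("mseg", "mseg_mturk"):
    status = "OK" if is_importable(name) else "MISSING"
    print(f"{name}: {status}")
```

If either line reports MISSING, re-run the corresponding `pip install -e` step from that repo's root directory.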

Repo Contents

This repository contains the following items:

  • mseg_mturk: Python module with HIT publishing + evaluation scripts
  • hit_html: auto-populated HTML to render HIT UI page
  • image_elements: auto-populated HTML element for each mask
  • instruction_files: auto-populated instruction HTML pages for workers
  • template_html: template HTML code that is used to auto-populate HIT specifications
  • tests: unit tests

Work Statistics

  • Total time spent relabeling: 1.34 years of uninterrupted work.

Most time-intensive tasks:

  • 106 days (~3.5 months) for COCO "person",
  • 87 days (~3 months) for IDD "rider",
  • 20 days for COCO "table",
  • 19 days for COCO "chair",
  • 19 days for COCO "counter".
  • ...

Workflow Overview

We design a careful workflow to ensure a high quality bar for annotations submitted by Mechanical Turk workers.

Our re-labeling workflow proceeds in 9 main stages:

(1) Hand-classify sentinels for each task, and create a BatchResult class with a SentinelHIT specification.
(2) Run `mseg_mturk/publish_tasks.py` to generate the HIT HTML, HIT CSV, and instructions HTML. Sentinels are embedded into the 100-image HIT CSV.
(3) Submit the HIT on Amazon Mechanical Turk (AMT).
(4) Analyze the accuracy of each submitted HIT using `mseg_mturk/eval_result.py`.
	For each HIT, check whether each of its 100 images is a sentinel.
	If it is a sentinel, check correctness. Compute mean accuracy per HIT.
	Set the status in WorkerHITResult for each HIT to 'Approved' or 'Rejected'
	based on a 100% accuracy cutoff.

(5) Enter the WorkerHITResult decisions into the 'analyzed' version of the CSV. Upload the analyzed CSV to MTurk, and re-assign rejected jobs.
(6) Analyze multinomial worker agreement. For those HITs that were approved,
	make a list of assigned labels per URL. Also record the number of approved
	observations per image.
(7) Take the mode of the approved labels, and consider this the relabeled category.
(8) Manually review batch quality.
(9) Record the relabeled list for each (dataset, original_classname) tuple.
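
The sentinel check, approval decision, and mode-based aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `mseg_mturk/eval_result.py`: the function names, the dict-based HIT representation (image URL mapped to the worker's label), and the tie-breaking rule (first label seen wins) are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def hit_sentinel_accuracy(answers: Dict[str, str], sentinels: Dict[str, str]) -> float:
    """Fraction of the sentinel images in one HIT that the worker labeled correctly."""
    graded = [url for url in answers if url in sentinels]
    if not graded:
        raise ValueError("HIT contains no sentinel images")
    n_correct = sum(answers[url] == sentinels[url] for url in graded)
    return n_correct / len(graded)


def hit_status(accuracy: float, cutoff: float = 1.0) -> str:
    """Approve a HIT only if its sentinel accuracy reaches the cutoff (100% by default)."""
    return "Approved" if accuracy >= cutoff else "Rejected"


def aggregate_labels(
    hits: List[Dict[str, str]], sentinels: Dict[str, str], cutoff: float = 1.0
) -> Dict[str, Tuple[str, int]]:
    """Map each non-sentinel URL to (mode label, number of approved observations)."""
    votes: Dict[str, List[str]] = defaultdict(list)
    for answers in hits:
        accuracy = hit_sentinel_accuracy(answers, sentinels)
        if hit_status(accuracy, cutoff) != "Approved":
            continue  # rejected HITs contribute no votes
        for url, label in answers.items():
            if url not in sentinels:
                votes[url].append(label)
    # Assumption: ties are broken in favor of the label seen first.
    return {
        url: (Counter(labels).most_common(1)[0][0], len(labels))
        for url, labels in votes.items()
    }
```

For example, with a single sentinel `{"s1": "cat"}` and four HITs of which one mislabels `s1`, only the three approved HITs vote, and each remaining URL receives the most frequent of its three approved labels.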

Class Exemplar Images

Via Google Drive, we provide access to the class exemplar images we provided to MTurk annotators in their instructions: animals, rug vs. carpet, cabinet, nightstand, desk, chest-of-drawers, wardrobe, curtain vs. shower curtain, mountain vs. hill vs. snow, fence vs. guardrail, and all other shattered classes.

Example MTurk UIs

Citing MSeg

If you find this code useful for your research, please cite:

@InProceedings{MSeg_2020_CVPR,
author = {Lambert, John and Liu, Zhuang and Sener, Ozan and Hays, James and Koltun, Vladlen},
title = {{MSeg}: A Composite Dataset for Multi-domain Semantic Segmentation},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2020}
}

Acknowledgements

Many thanks to Qifeng Chen for his base AMT workflow, which he shared with us. We are also grateful to the Amazon Mechanical Turk workers who completed 1.34 years of uninterrupted annotation to make MSeg happen!
