CountNet

Using supervised deep learning to count people in images. The CountNet model is based on the U-Net architecture with multiscale inception blocks in the decoder. For more details, have a look at the Project Report.
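
As a rough sketch of the multiscale idea, a decoder block can combine parallel convolutions with different receptive fields. The PyTorch snippet below is illustrative only; the kernel sizes and channel split are assumptions, not the exact layers used in CountNet.

# Illustrative inception-style block: parallel convolutions with different
# receptive fields, concatenated along the channel dimension.
import torch
import torch.nn as nn

class InceptionBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        branch_channels = out_channels // 4
        self.branch1 = nn.Conv2d(in_channels, branch_channels, kernel_size=1)
        self.branch3 = nn.Conv2d(in_channels, branch_channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(in_channels, branch_channels, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(in_channels, branch_channels, kernel_size=7, padding=3)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # Every branch sees the same input at a different scale.
        return self.act(torch.cat(
            [self.branch1(x), self.branch3(x), self.branch5(x), self.branch7(x)],
            dim=1))

if __name__ == "__main__":
    block = InceptionBlock(64, 64)
    print(block(torch.randn(1, 64, 120, 160)).shape)  # torch.Size([1, 64, 120, 160])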

Datasets

This repository includes four ready-to-use datasets. Any new dataset requires its own derived dataset class (see datasets.py; a sketch follows the list below). So far, the CountNet model has only been trained successfully on the Mall dataset.

  • Mall: diverse illumination conditions and crowd densities, with severe perspective distortion and severe occlusions. The video sequence consists of 2000 frames of size 640x480 with 6000 labeled pedestrian instances, split into a training set (1600 frames) and an evaluation set (400 frames).
  • UCF_CC_50: includes a wide range of densities and diverse scenes with varying perspective distortion. It contains a total of 50 images with an average of 1280 individuals per image. Due to the limited number of images, it is recommended to define a cross-validation protocol for training and testing; however, the dataset is currently split into a training (40) and a test (10) set.
  • Shanghai Tech (2 parts: A and B): consists of 1198 images with 330,165 annotated heads. It is a challenging dataset with diverse scene types and varying density levels.
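
Below is a minimal sketch of what a derived dataset class could look like, assuming a PyTorch-style Dataset interface and .npy density-map files; the actual base class, file layout, and transform handling in datasets.py may differ.

# Hypothetical dataset class returning (image, density-map) pairs.
import os
import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class MyCrowdDataset(Dataset):
    def __init__(self, root, transform=None):
        self.root = root
        self.transform = transform
        self.image_files = sorted(os.listdir(os.path.join(root, "images")))

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        name = self.image_files[idx]
        image = Image.open(os.path.join(self.root, "images", name)).convert("RGB")
        # Assumption: density maps are stored as .npy files with the same stem.
        density = np.load(os.path.join(self.root, "density_maps",
                                       os.path.splitext(name)[0] + ".npy"))
        if self.transform is not None:
            # Assumption: transforms are applied jointly to image and density map.
            image, density = self.transform(image, density)
        return image, density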

Generating ground-truth

Before training the model for the first time, the ground-truth density maps need to be created. They are not included in this repository because they take up around 1 GB of disk space (when generated for all four datasets). We provide a script for automatic ground-truth generation. Run it via:

cd CountNet/datasets
python generate_density_maps.py

(This might take a while.)
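
For reference, the standard recipe for such density maps is to place a unit impulse at every annotated head position and blur it with a Gaussian kernel, so that the map integrates to the person count. The sketch below illustrates this idea; it is not necessarily identical to what generate_density_maps.py does (e.g., the kernel width sigma is an assumption).

# Standard density-map construction: impulses at head positions, Gaussian blur.
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(shape, head_points, sigma=4.0):
    """shape: (H, W); head_points: iterable of (x, y) pixel coordinates."""
    impulses = np.zeros(shape, dtype=np.float32)
    for x, y in head_points:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < shape[0] and 0 <= col < shape[1]:
            impulses[row, col] += 1.0
    return gaussian_filter(impulses, sigma=sigma)

# Example: 3 annotated heads -> the density map sums to ~3.
dm = density_map((480, 640), [(100, 50), (320, 240), (600, 400)])
print(dm.sum())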

Configuration

We aimed to design the project such that the pipeline of configuring the datasets and the training, starting a training run, and evaluating previous runs is easy to use and produces reproducible results. For example, we use YAML for most configurations, which makes it easy to adjust parameters, and each configuration is stored alongside the model output.

The preprocessing transformations applied to the image data (e.g., downscaling, random crops) can be configured in the dataset configuration.

The core of the training and validation configuration is the run configuration.
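
Since the configurations are plain YAML, they are straightforward to load and inspect programmatically. The snippet below is an illustrative sketch; the file name and keys shown are assumptions, not the exact schema used by run.py.

# Illustrative: reading a YAML run configuration into a nested dict.
import yaml

with open("run_cfg.yml") as f:  # file name is an assumption
    cfg = yaml.safe_load(f)

# Nested YAML mappings become plain dicts, so parameters are easy to
# inspect, adjust, and store alongside the model output.
print(cfg.get("Trainer", {}).get("load_from"))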

Training the model

Start a training run via:

python run.py

This automatically reads in all current dataset and training configurations and stores a checkpoint as well as the configuration in an output folder.

Note that a previous checkpoint can be loaded via the Trainer.load_from entry in the run configuration; training then continues from that checkpoint.
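
Behind an entry like Trainer.load_from, the usual PyTorch checkpointing pattern looks roughly as follows; the actual Trainer may store additional state in its checkpoints.

# Illustrative checkpoint save/load for resuming training.
import torch

def save_checkpoint(path, model, optimizer, epoch):
    torch.save({"model": model.state_dict(),
                "optimizer": optimizer.state_dict(),
                "epoch": epoch}, path)

def load_checkpoint(path, model, optimizer):
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    optimizer.load_state_dict(ckpt["optimizer"])
    return ckpt["epoch"]  # resume training from the following epoch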

Validating the model

To validate a model trained previously, specify the model checkpoint to be loaded as the Trainer.validate_run entry in the run configuration. Then run the validation script:

python validate.py

We also provide various evaluation and plotting utilities in the plotting module.
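
As a small evaluation sketch: because a density map integrates to the predicted count, a metric such as the mean absolute error (MAE) can be computed directly from the predicted maps. The snippet below is illustrative; the utilities in the plotting module may report additional metrics.

# Illustrative: predicted counts and MAE from density maps.
import numpy as np

def mae(predicted_maps, true_counts):
    """predicted_maps: list of 2D arrays; true_counts: list of ints."""
    predicted_counts = [float(np.sum(dm)) for dm in predicted_maps]
    errors = [abs(p - t) for p, t in zip(predicted_counts, true_counts)]
    return sum(errors) / len(errors)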

Related Papers & Articles

Surveys on crowd counting:

Models:
