Estimation of Rate Control Parameters for Video Coding Using CNN

Rate control is essential to ensure efficient video delivery. Typical rate-control algorithms rely on bit allocation strategies to distribute bits appropriately among frames. As reference frames are essential for exploiting temporal redundancies, intra frames are usually assigned a larger portion of the available bits. We propose an accurate method to estimate the number of bits and the quality of intra frames, which can be used for bit allocation in a rate-control scheme. The algorithm is based on deep learning: networks are trained using the original frames as inputs, while the distortions and sizes of the compressed frames after encoding are used as ground truths.

(Figure: overview of the proposed approach.)
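
As a rough illustration of this idea (a minimal sketch, not the authors' architecture), a CNN regressor in the TF 1.x Keras API could map an original 128x128 luma CTU to one prediction per QP level, with the measured distortions or bit counts of the encoded CTU as training targets:

import tensorflow as tf

def build_ctu_regressor(width=128, height=128, levels=4):
    # Input: original (uncompressed) luma CTU; output: one value per QP level.
    inputs = tf.keras.Input(shape=(height, width, 1))
    x = tf.keras.layers.Conv2D(32, 3, padding='same', activation='relu')(inputs)
    x = tf.keras.layers.MaxPool2D(4)(x)
    x = tf.keras.layers.Conv2D(64, 3, padding='same', activation='relu')(x)
    x = tf.keras.layers.MaxPool2D(4)(x)
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    outputs = tf.keras.layers.Dense(levels)(x)
    model = tf.keras.Model(inputs, outputs)
    # Regression against the ground-truth distortions (or bit counts) per QP.
    model.compile(optimizer='adam', loss='mse')
    return model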

Check our blog post: Machine learning for video coding optimisation.

Publication

The paper was published at the 2018 IEEE Visual Communications and Image Processing (VCIP) conference and is available on IEEE Xplore and arXiv.

Please cite this work as:

@inproceedings{Santamaria2018,
  author = {M. Santamaria and E. Izquierdo and S. Blasi and M. Mrak},
  booktitle = {2018 IEEE Visual Communications and Image Processing (VCIP)},
  title = {Estimation of Rate Control Parameters for Video Coding Using CNN},
  year = {2018},
  doi = {10.1109/VCIP.2018.8698721},
  ISSN = {1018-8770},
  month = dec,
}

How to use

Dependencies

The code is compatible with Python 3.5.2 and TensorFlow 1.14.0.

Install all dependencies with:

pip install -r requirements.txt

YUV files are read with the VMAF reader.
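
For reference, this is how a planar YUV 4:2:0 8-bit frame (as produced in the data preparation step below) is laid out on disk; the repository itself uses the VMAF reader, so this numpy sketch is purely illustrative:

import numpy as np

def read_yuv420_frame(path, width, height, frame_index=0):
    # One 4:2:0 8-bit frame: a full-size Y plane followed by quarter-size U and V planes.
    frame_size = width * height * 3 // 2
    with open(path, 'rb') as f:
        f.seek(frame_index * frame_size)
        raw = np.frombuffer(f.read(frame_size), dtype=np.uint8)
    y = raw[:width * height].reshape(height, width)
    u = raw[width * height:width * height * 5 // 4].reshape(height // 2, width // 2)
    v = raw[width * height * 5 // 4:].reshape(height // 2, width // 2)
    return y, u, v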

Data preparation

Training data is generated from the MSCOCO dataset.

Encoding data

  1. Convert the images to YUV 4:2:0 8-bit and make sure the dimensions width x height are in the file name, e.g. ffmpeg -i Lenna.png -pix_fmt yuv420p Lenna_512x512.yuv
  2. Encode the images with a video codec, such as HEVC, using different QPs.
  3. Collect the bits per CTU in the format x y bits (x and y indicate the starting position of the CTU).
  4. Keep the reconstructed YUV files, adding both the dimensions width x height and the QP to the file name, e.g. Lenna_512x512_QP_22.yuv.
  5. For each image, reformat the bits into a single text file with one line per CTU: x y bits_qp_1 bits_qp_2 ... bits_qp_n (see the sketch after this list).
  6. Re-organise the data as follows:
data
├── original
│   └── img1_512x512.yuv
├── reconstruction
│   └── img1
│       ├── img1_512x512_QP_22.yuv
│       ├── img1_512x512_QP_27.yuv
│       ├── img1_512x512_QP_32.yuv
│       └── img1_512x512_QP_37.yuv
└── bits
    └── img1.txt

Check the example in resources.
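
As a sketch of step 5 (the per-QP input file names below are hypothetical; adapt them to however your encoder logs the bits), the per-QP x y bits logs can be merged into the single x y bits_qp_1 ... bits_qp_n file like this:

def merge_bit_logs(per_qp_paths, output_path):
    # per_qp_paths must be ordered consistently, e.g. QP 22, 27, 32, 37.
    merged = {}
    for path in per_qp_paths:
        with open(path) as f:
            for line in f:
                x, y, bits = line.split()
                merged.setdefault((int(x), int(y)), []).append(bits)
    with open(output_path, 'w') as out:
        for (x, y), bits in sorted(merged.items()):
            out.write('{} {} {}\n'.format(x, y, ' '.join(bits)))

# Hypothetical usage:
# merge_bit_logs(['img1_QP_22.txt', 'img1_QP_27.txt', 'img1_QP_32.txt', 'img1_QP_37.txt'],
#                'data/bits/img1.txt')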

Creating datasets

You need to create two different datasets, one for each model.

For the distortion model dataset, complete and run the following command:

python3 cnn-rate-distortion --task prepare_data --data_name [dataset name] --model_name distortion --input_dir [dir original yuv] --label_dir [dir reconstructed yuv] --width [CTU width] --height [CTU height] --levels [num QPs] --h5_dir [output dir]

Example:

python3 cnn-rate-distortion --task prepare_data --data_name Lenna --model_name distortion --input_dir ./resources/data/original --label_dir ./resources/data/reconstruction --width 128 --height 128 --levels 4 --h5_dir ./resources

For the rate model dataset, complete and run the following command:

python3 cnn-rate-distortion --task prepare_data --data_name [dataset name] --model_name rate --input_dir [dir original yuv] --label_dir [dir bits] --width [CTU width] --height [CTU height] --levels [num QPs] --h5_dir [output dir]

Example:

python3 cnn-rate-distortion --task prepare_data --data_name Lenna --model_name rate --input_dir ./resources/data/original --label_dir ./resources/data/bits --width 128 --height 128 --levels 4 --h5_dir ./resources

Training

Once the datasets are created, train the models by completing and running the following command:

python3 cnn-rate-distortion --task train --model_name [model name] --data_name [dataset name] --epochs [max epochs] --batch_size [imgs per batch] --loss_type [function] --width [CTU width] --height [CTU height] --levels [num QPs] --archive_dir [output dir] --h5_dir [dir to dataset]
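
A hypothetical example continuing the Lenna dataset above (the epoch count, batch size and --loss_type value are illustrative assumptions; check the code for the options it supports):

python3 cnn-rate-distortion --task train --model_name distortion --data_name Lenna --epochs 100 --batch_size 32 --loss_type mse --width 128 --height 128 --levels 4 --archive_dir ./archive --h5_dir ./resources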

Testing

Complete and run the following command to test the trained models:

python3 cnn-rate-distortion --task test --model_name [model name] --data_name [dataset name] --batch_size [imgs per batch] --loss_type [function] --width [CTU width] --height [CTU height] --levels [num QPs] --archive_dir [output dir] --test_data [path to test dataset]
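
A hypothetical example mirroring the training run above (the --test_data path is left as a placeholder for the dataset file produced during data preparation; the other values are illustrative assumptions):

python3 cnn-rate-distortion --task test --model_name distortion --data_name Lenna --batch_size 32 --loss_type mse --width 128 --height 128 --levels 4 --archive_dir ./archive --test_data [path to test dataset]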

Acknowledgements

This work was co-supported by the Engineering and Physical Sciences Research Council of the UK, through an iCASE studentship in collaboration with the School of Electronic Engineering and Computer Science, Queen Mary University of London.

Contact

If you have a general question about our work or code that may be of interest to other researchers, please use the issues section. Alternatively, drop us an e-mail at maria.santamaria@bbc.co.uk.