bcosculluela/checkerboard-artifacts-analysis

Checkerboard artifacts analysis

This is an extension of my master's thesis, "Development of a multiscale deep neural network for ECG denoising using large synthetic and real databases". In this project I developed and trained a U-Net-type network to remove noise from electrocardiogram (ECG) signals.
The architecture is based on the one proposed by Défossez et al. in [1] for speech enhancement, with changes made to adapt it to ECG denoising.

As I delved deeper into convolutional neural networks, I became aware of their potential, but also of the problems that arise from their use. One of the major problems associated with deconvolution layers (also known as transposed convolutions) is the presence of checkerboard artifacts in the output signal [2][3][4]. These artifacts can arise both in the forward pass of deconvolution layers and in the backward pass of convolutional layers, where computing the gradients is itself a deconvolution. For more details on their origin and comprehensive illustrations, see https://distill.pub/2016/deconv-checkerboard/.

Some of the recommendations given by these authors to avoid artifacts are:
- Using a kernel size that is divisible by the stride.
- Separating upsampling from convolution, i.e., first resizing the input with an interpolation technique and then applying a convolutional layer.
In addition, deconvolutions with stride=1 are quite effective at damping the artifacts [2].
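
The first recommendation can be checked with a few lines of Python. The sketch below is an illustration only, not code from this project: `overlap_counts` is a hypothetical helper that counts how many input samples contribute to each output sample of an unpadded 1D transposed convolution. When the kernel size is not divisible by the stride, the counts alternate, which is the uneven overlap described in [2].

```python
def overlap_counts(n_inputs, kernel_size, stride):
    """Count how many input samples contribute to each output sample
    of a 1D transposed convolution without padding (hypothetical helper)."""
    out_len = (n_inputs - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(n_inputs):
        start = i * stride  # first output sample touched by input i
        for j in range(start, start + kernel_size):
            counts[j] += 1
    return counts


# Kernel size 3 is not divisible by stride 2: interior counts alternate.
uneven = overlap_counts(4, kernel_size=3, stride=2)
# Kernel size 4 is divisible by stride 2: interior counts are uniform.
even = overlap_counts(4, kernel_size=4, stride=2)

print(uneven)  # [1, 1, 2, 1, 2, 1, 2, 1, 1]
print(even)    # [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

An even overlap does not guarantee an artifact-free output, since the learned kernel weights can still be unbalanced, but it removes the structural cause.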

Following these guidelines and experimenting, a new architecture has emerged.

Thus, there are two architectures: the original one, which uses deconvolution layers, and a modified one designed to avoid these artifacts, which replaces the deconvolutions with upsampling layers followed by ‘1x1’ convolutions.
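
The difference between the two decoder variants can be illustrated with a small, framework-agnostic NumPy sketch. It is not the code used in this project: the function names, the kernel values, the single channel and the upsampling factor of 2 are all assumptions made for illustration, and nearest-neighbour interpolation is assumed because the interpolation method is not stated here. Fed a constant input, the transposed convolution produces a periodic pattern, whereas upsampling followed by a ‘1x1’ convolution keeps the output constant.

```python
import numpy as np


def transposed_conv1d(x, kernel, stride):
    """Single-channel 1D transposed convolution without padding."""
    n, k = len(x), len(kernel)
    y = np.zeros((n - 1) * stride + k)
    for i, value in enumerate(x):
        # Each input sample adds a scaled copy of the kernel to the output.
        y[i * stride : i * stride + k] += value * kernel
    return y


def upsample_then_pointwise_conv(x, weight, factor):
    """Nearest-neighbour upsampling followed by a '1x1' convolution.

    x has shape (channels_in, time); weight has shape
    (channels_out, channels_in), i.e. a kernel of length 1 that only
    mixes channels.
    """
    x_up = np.repeat(x, factor, axis=-1)
    return weight @ x_up


x = np.ones(6)  # constant input: any ripple in the output is an artifact

# Illustrative values: kernel size 3 with stride 2 gives uneven overlap.
y_deconv = transposed_conv1d(x, np.ones(3), stride=2)
y_up = upsample_then_pointwise_conv(x[np.newaxis, :], np.array([[0.5]]), factor=2)

print(y_deconv[2:-2])  # interior samples alternate between 2 and 1
print(y_up[0])         # all samples equal 0.5
```

The ‘1x1’ kernel never overlaps neighbouring samples, so by construction it cannot introduce the uneven-overlap pattern; the trade-off is that all temporal smoothing is left to the interpolation step.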

Both models have been trained on the same dataset in order to study the presence of artifacts. The frequency content of the original and estimated signals has been analysed by means of spectrograms.
This experiment is inspired by the work carried out in: https://github.com/DolbyLaboratories/neural-upsampling-artifacts-audio [5].
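
As a rough idea of what such a spectrogram comparison looks like, the following sketch computes magnitude spectrograms with NumPy. It does not reproduce the analysis of this repository: `magnitude_spectrogram`, the frame length, the hop size, the sampling rate `fs` and the synthetic signals are all assumptions chosen for illustration. The "estimated" signal is a clean sinusoid plus a small alternating pattern with a period of two samples, which mimics the kind of periodic artifact a stride-2 upsampling layer can leave behind.

```python
import numpy as np


def magnitude_spectrogram(signal, frame_len=256, hop=128):
    """Return |STFT| with shape (frame_len // 2 + 1, n_frames).

    Hypothetical helper: Hann-windowed frames, no padding at the edges.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T


fs = 360  # assumed sampling rate in Hz, not taken from the repository
n = np.arange(2 * fs)  # two seconds of samples

clean = np.sin(2 * np.pi * 5 * n / fs)       # stand-in for the original signal
artifact = 0.1 * np.where(n % 2 == 0, 1.0, -1.0)  # period-2 ripple
estimated = clean + artifact                  # stand-in for a model output

spec_clean = magnitude_spectrogram(clean)
spec_est = magnitude_spectrogram(estimated)

# Average excess magnitude per frequency bin; the ripple shows up as a
# tonal component in the highest bin, i.e. at fs / 2.
excess = (spec_est - spec_clean).mean(axis=1)
peak_bin = int(np.argmax(excess))
print(peak_bin, peak_bin * fs / 256)
```

In a spectrogram plot, such an artifact appears as a horizontal line at a fixed frequency that is present in the estimated signal but absent from the original one.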

References

[1] Défossez, A., Synnaeve, G., & Adi, Y. (2020). Real Time Speech Enhancement in the Waveform Domain. ArXiv, abs/2006.12847.
[2] Odena, A., Dumoulin, V., & Olah, C. (2016). Deconvolution and Checkerboard Artifacts. Distill. http://doi.org/10.23915/distill.00003
[3] Aitken, A.P., Ledig, C., Theis, L., Caballero, J., Wang, Z., & Shi, W. (2017). Checkerboard artifact free sub-pixel convolution: A note on sub-pixel convolution, resize convolution and convolution resize. ArXiv, abs/1707.02937.
[4] Sugawara, Y., Shiota, S., & Kiya, H. (2019). Checkerboard artifacts free convolutional neural networks. APSIPA Transactions on Signal and Information Processing, 8, e9. doi:10.1017/ATSIP.2019.2
[5] Pons, J., Pascual, S., Cengarle, G., & Serrà, J. (2021). Upsampling Artifacts in Neural Audio Synthesis. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, pp. 3005-3009. doi:10.1109/ICASSP39728.2021.9414913
