Halcyon (Instantaneous pitch estimator, ICASSP-2016)

Abstract

The package contains a Matlab (R2012b) implementation of the instantaneous pitch estimation algorithm 'Halcyon'.

The algorithm decomposes the signal into subband components and uses their instantaneous representations to evaluate a candidate generating function. The possible range of pitch variation is assumed to be proportional to the pitch value. To obtain accurate estimates that are robust to rapid variations, the signal is analyzed on a different time scale for each candidate. The algorithm shows good frequency resolution for pitch-modulated sounds and performs well in both clean and noisy conditions.

Citation

A short algorithm description is given in

Azarov, E., Vashkevich, M. and Petrovsky, A., "Instantaneous Pitch Estimation Algorithm Based on Multirate Sampling", In Proc. ICASSP 2016, pp. 4970-4974.

@inproceedings{Azarov-16,
author={E. {Azarov} and M. {Vashkevich} and A. {Petrovsky}},
title={Instantaneous pitch estimation algorithm based on multirate sampling},
year={2016},
booktitle={2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={4970-4974},
doi={10.1109/ICASSP.2016.7472623}}

Halcyon (example)

Period candidate generation function (PCGF)

For period candidate generation we use an autocorrelation-based measure. The following figure compares the PCGFs used in the RAPT, IRAPT and Halcyon algorithms.

Figure 1. Period candidates generation. (a) – source signal, (b) – NCCF, (c) – instantaneous model-based NCCF (IRAPT), (d) – proposed period candidate generation function
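For orientation, the baseline in panel (b), the normalized cross-correlation function (NCCF), can be sketched in a few lines. This is a plain-Python illustration of the standard NCCF definition, not the code from this package and not the proposed PCGF. The function name `nccf`, the window length, the lag range and the test tone are all assumptions chosen for the example.

```python
import math

def nccf(x, window, max_lag):
    """NCCF of the first `window` samples of x against lagged copies.

    Returns a list with one value per lag in 0..max_lag.
    """
    if len(x) < window + max_lag:
        raise ValueError("need at least window + max_lag samples")
    ref = x[:window]
    e0 = sum(v * v for v in ref)            # energy of the reference frame
    values = []
    for lag in range(max_lag + 1):
        seg = x[lag:lag + window]
        ek = sum(v * v for v in seg)        # energy of the lagged frame
        num = sum(a * b for a, b in zip(ref, seg))
        denom = math.sqrt(e0 * ek)
        values.append(num / denom if denom > 0 else 0.0)
    return values

# Hypothetical test tone: 200 Hz sine sampled at 8 kHz (period = 40 samples).
fs = 8000
f0 = 200.0
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]

values = nccf(x, window=160, max_lag=60)
min_lag = 20                                 # skip the trivial peak at lag 0
best_lag = max(range(min_lag, len(values)), key=lambda k: values[k])
estimated_f0 = fs / best_lag
```

The strongest peak away from lag 0 gives the period candidate. For real speech, several peaks usually compete, which is why pitch trackers keep multiple candidates per frame instead of only the maximum.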

Experimental results

The proposed technique is compared with other pitch estimation algorithms in terms of gross pitch error (GPE, %) and mean fine pitch error (MFPE, %).
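As a rough guide to how these two measures are commonly computed, the sketch below evaluates GPE and MFPE for a pair of pitch tracks. It is written in Python for illustration and is not taken from this package. The 20% gross-error threshold, the convention that a reference value of 0 marks an unvoiced frame, and the name `pitch_errors` are assumptions; the exact definitions used for the reported results are given in the paper.

```python
def pitch_errors(reference, estimate, gross_threshold=0.2):
    """Return (GPE, MFPE) in percent for two equal-length pitch tracks in Hz.

    Assumptions: frames with reference <= 0 are unvoiced and skipped;
    a gross error is a relative deviation above `gross_threshold`.
    """
    if len(reference) != len(estimate):
        raise ValueError("tracks must have the same length")
    voiced = 0
    gross = 0
    fine = []                                # relative errors of non-gross frames
    for ref, est in zip(reference, estimate):
        if ref <= 0:
            continue
        voiced += 1
        rel = abs(est - ref) / ref
        if rel > gross_threshold:
            gross += 1
        else:
            fine.append(rel)
    if voiced == 0:
        raise ValueError("no voiced frames in the reference")
    gpe = 100.0 * gross / voiced
    mfpe = 100.0 * sum(fine) / len(fine) if fine else 0.0
    return gpe, mfpe

# Made-up tracks: the last frame is unvoiced, the fourth is an octave error.
reference = [100.0, 100.0, 200.0, 200.0, 0.0]
estimate = [101.0, 99.0, 204.0, 100.0, 150.0]
gpe, mfpe = pitch_errors(reference, estimate)
```

In this toy example one of four voiced frames is a gross error, and the fine error is averaged over the remaining three frames only, so a single octave jump does not distort the MFPE.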

To explore the time resolution of the algorithms and their robustness against pitch variations, we synthesized artificial signals with pitch changing in the range from 100 to 350 Hz. All measurements were separated into six groups by variation rate: 0–0.3, 0.3–0.6, 0.6–0.9, 0.9–1.2, 1.2–1.5, >1.5 percent of pitch change per millisecond. Averaged errors are shown in Figure 2.

Figure 2. Performance for artificial signals
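The sketch below shows one possible way to build such a test signal and to assign each measurement to a variation-rate group. It is a Python illustration only: the sinusoidal pitch contour, the modulation rate, the number of harmonics, the half-open group boundaries and all function names are assumptions, since the README does not describe how the signals were actually synthesized.

```python
import math

# Upper edges of the first five groups, in percent of pitch change per ms.
GROUP_EDGES = [0.3, 0.6, 0.9, 1.2, 1.5]

def variation_group(rate):
    """Group index 0..5 for a variation rate in %/ms (assumed half-open bins)."""
    for i, edge in enumerate(GROUP_EDGES):
        if rate < edge:
            return i
    return len(GROUP_EDGES)

def synthesize(fs, duration, f_lo=100.0, f_hi=350.0, mod_hz=2.0, harmonics=5):
    """Harmonic signal whose pitch swings sinusoidally between f_lo and f_hi."""
    mid = 0.5 * (f_lo + f_hi)
    dev = 0.5 * (f_hi - f_lo)
    pitch, signal = [], []
    phase = 0.0
    for n in range(int(fs * duration)):
        f0 = mid + dev * math.sin(2 * math.pi * mod_hz * n / fs)
        pitch.append(f0)
        phase += 2 * math.pi * f0 / fs       # integrate instantaneous frequency
        signal.append(sum(math.sin(k * phase) / k
                          for k in range(1, harmonics + 1)))
    return pitch, signal

def variation_rates(pitch, fs):
    """Relative pitch change between consecutive samples, in percent per ms."""
    samples_per_ms = fs / 1000.0
    return [100.0 * abs(b - a) / a * samples_per_ms
            for a, b in zip(pitch, pitch[1:])]

pitch, signal = synthesize(fs=8000, duration=0.5)
rates = variation_rates(pitch, fs=8000)
groups = [variation_group(r) for r in rates]
```

Because the true pitch contour is known sample by sample, each estimate can be scored against it and the errors averaged per group, which is the kind of breakdown shown in the figure.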

For the natural speech experiments, the PTDB-TUG speech database was used. The averaged results for clean speech are given in Table 1.

Table 1. Performance for natural speech

Algorithm	Male GPE, %	Male MFPE, %	Female GPE, %	Female MFPE, %
RAPT	3.69	1.74	6.07	1.18
YIN	3.18	1.39	3.96	0.84
SWIPE'	0.756	1.51	4.27	0.80
PEFAC	20.521	1.383	31.192	0.972
IRAPT	1.63	1.61	3.78	0.98
Halcyon	0.743	1.268	3.600	1.039
