This repository has been archived by the owner on Aug 28, 2018. It is now read-only.

Rocaloid/CVESVP

CVESVP

CVE Spectral Voice Processing Library

Todo & Plan List

#### Step 1: Voice (Low-Level) Modeling

  • Spectrum filter utilities
  • Implement PSOLAIterlyzer(in: Wave, out: PulseList)
  • Implement PSOLAItersizer(in: Position DataFrame, out: Wave)
  • Implement VOTFromWave based on STFTIterlyzer(in: Wave, out: Position)
  • Implement Sinusoid structure, List_Sinusoid, and Sinusoid_ToReal
  • Implement HNMFrame and HNMContour structures and List_HNMFrame
  • Implement SinusoidIterlyzer(in: Wave, out: List_Sinusoid)
  • Implement SinusoidItersizer(in: Position Sinusoid, out: Wave)
  • Implement HNMIterlyzer(in: Wave, out: List_HNMFrame)
  • Glottal pulse & phase reconstruction in SinusoidItersizer
  • Implement HNMItersizer based on SinusoidItersizer(in: Position HNMFrame, out: Wave)
  • Implement F0Iterlyzer(in: Wave, out: Real)
  • Phase control points for SinusoidItersizer
  • SNT analysis
  • Turbulent Noise reconstruction in HNMItersizer
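
As an illustration of the sinusoidal model that several items above build on, here is a minimal Python sketch of turning one frame of sinusoid parameters into real samples by additive synthesis. It is not the library's C implementation: the `Sinusoid` field names and the `sinusoid_to_real` signature are assumptions made for this example only.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Sinusoid:
    """Hypothetical frame layout: parallel lists, one entry per partial."""
    freq: List[float]  # frequency of each partial in Hz
    ampl: List[float]  # linear amplitude of each partial
    phse: List[float]  # initial phase of each partial in radians


def sinusoid_to_real(frame: Sinusoid, size: int, sample_rate: float) -> List[float]:
    """Sum stationary cosines for every partial in the frame (additive synthesis)."""
    out = [0.0] * size
    for f, a, p in zip(frame.freq, frame.ampl, frame.phse):
        step = 2.0 * math.pi * f / sample_rate  # phase advance per sample
        for n in range(size):
            out[n] += a * math.cos(step * n + p)
    return out


if __name__ == "__main__":
    # Three harmonics of a 220 Hz tone with decaying amplitudes.
    frame = Sinusoid(freq=[220.0, 440.0, 660.0],
                     ampl=[0.5, 0.25, 0.125],
                     phse=[0.0, 0.0, 0.0])
    wave = sinusoid_to_real(frame, 1024, 44100.0)
    print(len(wave), max(wave))
```

The sketch treats each partial as stationary within the frame; a real synthesizer would also interpolate parameters between frames, which is what the phase-related items in this list address.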

#### Step 2: Structural Changes & Minor Improvements

  • Replace CDSP2_If_Debug_Check with RAssert
  • Move the List types down into CVEDSP2
  • Conversion between HNMFrame and HNMContour
  • Implement F0FromWave_YIN
  • Implement GainIterfector in CVEDSP2(in: Wave Wave, out: Wave)
  • Implement MixIterfector in CVEDSP2(in: Wave Wave, out: Wave)
  • Implement PulseItersizer in CVEDSP2(in: Position, out: Wave)
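
For the F0FromWave_YIN item, the following is a rough Python sketch of the core YIN steps described by De Cheveigné and Kawahara (difference function, cumulative mean normalized difference, absolute threshold, parabolic interpolation). The function name `yin_f0`, its parameters, and the default values are assumptions for illustration and do not reflect the CVESVP API.

```python
import math


def yin_f0(wave, sample_rate, f_min=60.0, f_max=500.0, threshold=0.1):
    """Estimate F0 in Hz for one frame of samples (simplified YIN sketch)."""
    tau_min = int(sample_rate / f_max)
    tau_max = int(sample_rate / f_min)
    window = len(wave) - tau_max  # number of sample pairs compared per lag
    if window <= 0 or tau_min < 1:
        raise ValueError("frame too short or search range invalid")

    # Difference function d(tau): squared difference at each lag.
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((wave[j] - wave[j + tau]) ** 2 for j in range(window))

    # Cumulative mean normalized difference d'(tau).
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0.0 else 1.0

    # Absolute threshold: take the first dip below the threshold and
    # follow it down to its local minimum.
    tau = tau_min
    while tau <= tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 <= tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            break
        tau += 1
    else:
        # No dip found: fall back to the global minimum in the search range.
        tau = min(range(tau_min, tau_max + 1), key=lambda t: cmnd[t])

    # Parabolic interpolation on the raw difference function around the minimum.
    period = float(tau)
    if 0 < tau < tau_max:
        a, b, c = diff[tau - 1], diff[tau], diff[tau + 1]
        denom = a - 2.0 * b + c
        if denom > 0.0:
            period = tau + 0.5 * (a - c) / denom
    return sample_rate / period


if __name__ == "__main__":
    sr = 8000.0
    frame = [math.sin(2.0 * math.pi * 200.0 * n / sr) for n in range(1024)]
    print(yin_f0(frame, sr))
```

This version runs in quadratic time per frame, which keeps it readable; it omits the "best local estimate" step of the published method.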

#### Step 3: Voice Manipulation

  • Timbre adjustment related to pitch scaling
  • Implement GenKlatt based on FWindow
  • Implement EpRParam structure
  • Implement EpRParam_ToHNMContour
  • EpR fitting algorithm
  • EpR manipulation utilities
  • PSOLA manipulation utilities

#### Others/External

  • Implement VMaxIndex, VMinIndex, VMaxEI, VMinEI, VLog in RFNL
  • More interpolation kernels (Cubic, Sinc) in RFNL
  • Default analysis window for _F0.rc/Window Cache
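
To make the interpolation-kernel item concrete, here is a small Python sketch of two common choices: a four-point cubic (Catmull-Rom) kernel and a Hann-windowed sinc kernel. Which cubic variant and which window RFNL would use is not stated in this list, so both choices, along with the function names and the `half_width` parameter, are assumptions for illustration.

```python
import math


def cubic_interp(y0, y1, y2, y3, t):
    """Catmull-Rom interpolation between y1 and y2 for t in [0, 1]."""
    a = -0.5 * y0 + 1.5 * y1 - 1.5 * y2 + 0.5 * y3
    b = y0 - 2.5 * y1 + 2.0 * y2 - 0.5 * y3
    c = -0.5 * y0 + 0.5 * y2
    return ((a * t + b) * t + c) * t + y1


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def sinc_interp(samples, position, half_width=8):
    """Evaluate samples at a fractional position with a Hann-windowed sinc."""
    center = math.floor(position)
    total = 0.0
    for n in range(center - half_width + 1, center + half_width + 1):
        if 0 <= n < len(samples):  # samples outside the buffer count as zero
            x = position - n
            window = 0.5 * (1.0 + math.cos(math.pi * x / half_width))
            total += samples[n] * sinc(x) * window
    return total


if __name__ == "__main__":
    data = [math.sin(2.0 * math.pi * 0.05 * n) for n in range(64)]
    print(cubic_interp(data[19], data[20], data[21], data[22], 0.5))
    print(sinc_interp(data, 20.5))
```

The cubic kernel needs only four neighbours and is cheap; the windowed sinc uses `2 * half_width` neighbours and trades speed for a flatter frequency response.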

Bibliography

  • Serra, X. 1989. "A System for Sound Analysis/Transformation/Synthesis based on a Deterministic plus Stochastic Decomposition" Ph.D. Thesis. Stanford University.

  • Sanjaume, Jordi Bonada. Voice processing and synthesis by performance sampling and spectral models. Diss. Universitat Pompeu Fabra, 2008.

  • Quatieri, Thomas F., and R. McAulay. "Phase coherence in speech reconstruction for enhancement and coding applications." Acoustics, Speech, and Signal Processing, 1989. ICASSP-89., 1989 International Conference on. IEEE, 1989.

  • De Cheveigné, Alain, and Hideki Kawahara. "YIN, a fundamental frequency estimator for speech and music." The Journal of the Acoustical Society of America 111.4 (2002): 1917-1930.

  • Childers, Donald G., and C. K. Lee. "Vocal quality factors: Analysis, synthesis, and perception." The Journal of the Acoustical Society of America 90.5 (1991): 2394-2410.