Voice Morphing




Voice morphing is the transition of one speech signal into another. The morphed signal carries the same information content as the two input speech signals but has a different pitch, which is determined by the morphing algorithm. To achieve this, each signal is first converted into a representation in which the pitch and the spectral envelope are encoded on orthogonal axes. Individual components of the two signals are then matched, and their amplitudes are interpolated to produce a new speech signal. Finally, the representation of this new signal is converted back into an acoustic waveform. This project describes the signal representations required to effect the morph, as well as the techniques used to match the signal components, interpolate the amplitudes, and invert the new representation back into an acoustic waveform.
INTRODUCTION
Like image morphing, voice morphing aims to preserve the shared characteristics of the starting and final signals while generating a smooth transition between them. In image morphing, the in-between images all show one face smoothly changing its shape and texture until it turns into the target face. A speech morph should have the same property: one speech signal should change smoothly into another, keeping the characteristics that the starting and ending signals share while gradually changing the others. The properties of a speech signal that matter most here are its pitch and its envelope (formant) information. The two are present in convolved form in the speech signal, so an efficient method is needed to extract each of them. We have adopted a simple approach, cepstral analysis, to extract the pitch and formant information from each signal. The processing needed to obtain the morphed speech signal then includes cross-fading of the envelope information, dynamic time warping to match the major signal features (pitch), and signal re-estimation to convert the morphed speech signal back into an acoustic waveform.
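
To make the cepstral step more concrete, here is a minimal sketch in Python with NumPy of how pitch and envelope information might be separated for a single frame of speech. It is an illustration under stated assumptions, not the implementation used in this project: the function name cepstral_split, the 2 ms lifter cutoff, and the 60 to 400 Hz pitch search range are all hypothetical choices made for the example.

```python
import numpy as np

def cepstral_split(frame, sample_rate, cutoff_ms=2.0, f0_range=(60.0, 400.0)):
    """Split one speech frame into a smooth log-spectral envelope and a pitch estimate.

    All parameter names and default values are assumptions for this sketch.
    """
    n = len(frame)

    # Window the frame, then take the log magnitude spectrum. The logarithm
    # turns the convolution of excitation and vocal-tract filter into a sum.
    windowed = frame * np.hanning(n)
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-10)

    # The real cepstrum is the inverse transform of the log magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag).real

    # Envelope: keep only the low-quefrency part (and its mirror image, since
    # the real cepstrum is symmetric) and transform back to the log spectrum.
    cutoff = int(cutoff_ms * 1e-3 * sample_rate)
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    lifter[n - cutoff + 1:] = 1.0
    envelope = np.fft.fft(cepstrum * lifter).real

    # Pitch: the strongest cepstral peak inside the plausible pitch-period range.
    q_min = int(sample_rate / f0_range[1])
    q_max = int(sample_rate / f0_range[0])
    peak = q_min + int(np.argmax(cepstrum[q_min:q_max]))
    f0 = sample_rate / peak

    return envelope, f0
```

The low-quefrency coefficients describe the slowly varying spectral envelope, while the pitch shows up as a peak at the quefrency equal to the pitch period. Once the two are on separate axes like this, the envelopes of two signals can be cross-faded independently of their pitch.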

