Vocoder Manual, Published in ? date ? by EMS, written by ?


  • INTRODUCTION
    The EMS Vocoder (voice-coder) is a 22-channel vocoder that has been specifically designed to process speech and other sounds in a variety of new and interesting ways.
    In the following chapters, I will explain its many functions and describe how to operate the machine.
    I would now like to discuss the way in which natural speech mechanisms work and how the Vocoder analyses them. The speech production system is a very elaborate machine.
    When vowel sounds are being voiced, that is, when the vocal cords are oscillating, it acts like a reed instrument. When whistle sounds are being generated, it acts like a wind instrument. It can also generate articulated noise sounds. In brief, the speech production system can be described as a network of variable resonators that can be excited either by the vocal cords, at a variable pitch and amplitude, or by a flow of air from the lungs, producing noise or whistles. This system is described in Fig. 2.
    fig 2


    The vocal cords, airflow and the resonators are all controlled by the brain and manipulated in such a way as to produce intelligible speech (except in the case of politicians).
    Consider, for example, the following piece of speech: "EMS Vocoder", Fig. 3.
    fig 3


    Note that the 'S' sounds (I will call them the unvoiced parts of speech) are generated by noise excitation, and that the rest of the words (these I will call the voiced parts) are generated by vocal cord excitation. Also note that the pitch of the vocal cord excitation changes during the words.
    Now, this information is only part of the data necessary to reconstruct the speech. If you were to listen to a noise source and an oscillator being controlled in the same way as they are in the vocal tract, you would not hear anything speech-like at all. The mechanism that transforms this sound into speech is the set of resonators: the throat, nose and mouth, in conjunction with the tongue. Therefore, to be able to reproduce speech, the Vocoder has to analyse the signal in three ways. It must decide whether the speech is voiced or unvoiced, it must calculate the pitch of the voiced portions, and it must continuously analyse the frequency spectrum of the speech so as to determine the operation of the set of resonators (see Fig. 4).
    fig 4
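    The three analysis tasks can be illustrated in software. The sketch below is only an illustration under stated assumptions and does not describe the circuitry of the EMS Vocoder: it assumes a digitised frame of samples, uses the zero-crossing rate for the voiced/unvoiced decision, autocorrelation for the pitch, and a single-frequency correlation as a stand-in for each analysing filter channel. All function names, the 0.25 threshold and the 60 to 400 Hz pitch range are hypothetical choices made for this example.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0))
    return crossings / (len(frame) - 1)

def estimate_pitch(frame, sample_rate, min_hz=60.0, max_hz=400.0):
    """Return the pitch whose lag gives the largest autocorrelation (assumed range)."""
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

def band_level(frame, centre_hz, sample_rate):
    """Amplitude of the frame at one frequency: a stand-in for one analysis channel."""
    w = 2.0 * math.pi * centre_hz / sample_rate
    re = sum(x * math.cos(w * n) for n, x in enumerate(frame))
    im = sum(x * math.sin(w * n) for n, x in enumerate(frame))
    return 2.0 * math.hypot(re, im) / len(frame)

def analyse_frame(frame, sample_rate, centres_hz):
    """Return (voiced, pitch_hz or None, per-channel levels) for one frame."""
    # Noise-like (unvoiced) sounds change sign far more often than voiced ones.
    voiced = zero_crossing_rate(frame) < 0.25  # assumed threshold
    pitch = estimate_pitch(frame, sample_rate) if voiced else None
    levels = [band_level(frame, c, sample_rate) for c in centres_hz]
    return voiced, pitch, levels
```

    Calling analyse_frame on successive short frames of speech would give the three streams of control data named above: a voiced/unvoiced flag, a pitch value, and one level per channel.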


    Having done this, we have all the information necessary to reproduce the speech, but the important thing is that in doing so, we can change some of the parameters and so get some interesting effects. For instance, in the forthcoming example we exchange the vocal cords and noise source for the output of an organ and get a talking organ.
    To be able to reproduce the speech, the Vocoder uses an electronic model of the vocal tract, Fig. 5.
    fig 5


    In this model a noise source and an oscillator (VCO) can be turned on and off, and the VCO can be controlled in pitch. Their outputs are used to excite the synthesis filter bank. This is a model of the set of resonators, and it is controlled by the data from the analysing filter bank. Figures 4 and 5 show the analysis and synthesis sections of the Vocoder respectively, and together they form the main part of the machine.
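    The synthesis model can be sketched in the same spirit. The code below is a minimal illustration, not the EMS design: it assumes digital samples, a pulse train standing in for the VCO, uniform random numbers for the noise source, and a simple two-pole resonator for each channel of the synthesis filter bank. The function names, the 100 Hz bandwidth and the example centre frequencies are hypothetical.

```python
import math
import random

def excitation(n_samples, voiced, pitch_hz, sample_rate, seed=0):
    """Pulse train at pitch_hz when voiced (VCO stand-in), otherwise noise."""
    if voiced:
        period = max(1, round(sample_rate / pitch_hz))
        return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def band_pass(signal, centre_hz, bandwidth_hz, sample_rate):
    """Two-pole resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius, below 1
    a1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesise_frame(carrier, channel_levels, centres_hz, sample_rate,
                     bandwidth_hz=100.0):
    """Filter the carrier into bands and weight each band by its analysed level."""
    out = [0.0] * len(carrier)
    for level, centre in zip(channel_levels, centres_hz):
        band = band_pass(carrier, centre, bandwidth_hz, sample_rate)
        for i, sample in enumerate(band):
            out[i] += level * sample
    return out
```

    Because synthesise_frame accepts any list of samples as its carrier, passing in a recording of an organ instead of the output of excitation corresponds to the talking organ effect mentioned earlier.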










    fig 17

