There is a new voice synth chip on the market: SpeakJet.

ADPCM for Highly Intelligible Speech Synthesis

Steve Ciarcia

Copyright © 1983 Steven A. Ciarcia. All rights reserved. June 1983 © BYTE Publications Inc.

posted 1 February 2002

Steve Ciarcia
POB 582
Glastonbury, CT 06033
Special thanks to Bill Curlew for his software expertise.

Some new integrated circuits from Oki Semiconductor compress digitized speech data efficiently.




Figure 1: Functional block diagram of a digitized speech-reproduction system that employs pulse-code modulation.



The ear is sensitive, and too coarse a reproduction will sound unnatural or even unintelligible.


Figure 2: Waveform sampling by pulse-code modulation (PCM). The interval between samples is T0;
the sampling frequency is the reciprocal of the interval. Each sample of PCM data consists of N bits;
the leftmost is the most significant bit and the rightmost is the least significant bit.
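
As a rough illustration of the sampling described in figure 2, the sketch below quantizes a waveform into N-bit PCM codes taken every T0 seconds. The 8-kHz rate, the 12-bit width, the 440-Hz test tone, and the function names are illustrative assumptions, not values taken from the article.

```python
import math

def pcm_sample(signal, duration_s, sample_rate_hz, n_bits):
    """Sample signal(t), expected in -1.0..1.0, into unsigned N-bit PCM codes."""
    t0 = 1.0 / sample_rate_hz          # interval between samples (T0)
    top_code = (1 << n_bits) - 1       # largest code an N-bit sample can hold
    codes = []
    for k in range(int(round(duration_s * sample_rate_hz))):
        v = max(-1.0, min(1.0, signal(k * t0)))       # clip to the input range
        codes.append(round((v + 1.0) / 2.0 * top_code))
    return codes

def pcm_bit_rate(sample_rate_hz, n_bits):
    """Bits per second needed to store or transmit the PCM stream."""
    return sample_rate_hz * n_bits

def tone(t):
    """Hypothetical test input: a 440-Hz sine wave."""
    return math.sin(2 * math.pi * 440 * t)

codes = pcm_sample(tone, 0.01, 8000, 12)
print(len(codes), "samples,", pcm_bit_rate(8000, 12), "bits per second")
```

The bit rate grows with both the sampling frequency and N, which is the cost the differential methods in the later figures try to reduce.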








Figure 3a: Waveform sampling by delta modulation. Each sample of the source waveform is tested to see
if its amplitude is higher or lower (within the resolution of a fixed quantization value Δr, delta-r)
than that of the previous sample. If the amplitude is higher, the single-bit delta-modulated
encoding value is set to 1; if lower, the encoding value is set to 0.
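
The one-bit encoding of figure 3a can be sketched as follows. This version compares each sample with the decoder's running estimate of the previous sample, which is a common way to keep encoder and decoder in step; that detail, the function names, and the sample values are assumptions made for illustration.

```python
def delta_encode(samples, step, start=0.0):
    """Emit 1 when the sample is above the running estimate, otherwise 0."""
    bits, estimate = [], start
    for s in samples:
        bit = 1 if s > estimate else 0
        estimate += step if bit else -step   # move by the fixed quantization value
        bits.append(bit)
    return bits

def delta_decode(bits, step, start=0.0):
    """Rebuild the waveform by stepping up on 1 and down on 0."""
    out, estimate = [], start
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out

bits = delta_encode([1, 2, 3, 2, 1], step=1)
print(bits)                         # [1, 1, 1, 0, 0]
print(delta_decode(bits, step=1))   # [1.0, 2.0, 3.0, 2.0, 1.0]
```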


Figure 3b: Two potential problems occurring in delta modulation. When the source waveform changes too rapidly,
the fixed quantization value may be too small to express the full change in the input; this slope overload
causes a compliance error. Conversely, when there is little change in the input waveform (at the extreme, a DC
signal), vertical deflection in the quantization value results in granular noise in the output.
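
Both problems in figure 3b show up in a few lines of code. The sketch below is a minimal delta-modulation tracker under assumed values (a step of 1.0, a sudden jump to 10.0, and a DC level of 0.5); the function name is hypothetical.

```python
def delta_track(samples, step, start=0.0):
    """Return the waveform a delta-modulation decoder would reconstruct."""
    out, estimate = [], start
    for s in samples:
        estimate += step if s > estimate else -step
        out.append(estimate)
    return out

# Slope overload: the input jumps to 10.0 at once, but the output
# can climb only one step per sample and lags behind.
overload = delta_track([10.0] * 6, step=1.0)
print(overload)   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Granular noise: the input is constant, yet the output must move
# up or down by a full step on every sample.
granular = delta_track([0.5] * 6, step=1.0)
print(granular)   # [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```

A larger step shortens the lag in the first case but makes the hunting in the second case louder, which is why a single fixed quantization value is a compromise.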



Competing Digitizing Methods






Figure 4: Differential pulse-code modulation (DPCM) is an attempt to reduce the amount of data stored or
transmitted, as compared with regular PCM. For each sample, the difference between the previous
PCM code and the current code is expressed in terms of a fixed quantization value Δr (delta-r),
which must be chosen with attention to the characteristics of the source waveform.
If too large or too small a quantization value is used, compliance errors occur.
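
A minimal DPCM sketch along the lines of figure 4 follows. The 4-bit signed code width, the step of 8, the sample values, and the function names are assumptions chosen for illustration; the last sample is deliberately too large a jump for the fixed step, so the reconstruction falls short of it.

```python
def dpcm_encode(pcm, step, code_bits=4):
    """Encode each PCM value as a clamped multiple of a fixed step."""
    lo = -(1 << (code_bits - 1))
    hi = (1 << (code_bits - 1)) - 1
    codes, predicted = [], 0
    for value in pcm:
        code = max(lo, min(hi, round((value - predicted) / step)))
        predicted += code * step     # track what the decoder will rebuild
        codes.append(code)
    return codes

def dpcm_decode(codes, step):
    """Accumulate the coded differences back into PCM values."""
    out, value = [], 0
    for code in codes:
        value += code * step
        out.append(value)
    return out

pcm = [0, 16, 48, 40, 400]
codes = dpcm_encode(pcm, step=8)
print(codes)                        # [0, 2, 4, -1, 7]
print(dpcm_decode(codes, step=8))   # [0, 16, 48, 40, 96]
```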




ADPCM is a specialized form of PCM that offers significantly improved intelligibility at lower data rates.









Figure 5: Adaptive differential pulse-code modulation (ADPCM) improves upon DPCM by dynamically varying
the quantization between samples, depending upon their rate of change, while maintaining a low bit rate,
condensing 12-bit PCM samples into only 3 or 4 bits. In ADPCM, each sample's encoding is derived
by a procedure that includes the following steps. A PCM-value differential dn is obtained by subtracting
the previous PCM-code value from the current value. The quantization value Δn (delta-n) is obtained by
multiplying the previous quantization value by a coefficient and by the absolute value of the previous
PCM-code value. The PCM-value differential is then expressed in terms of the quantization value and encoded
in four bits. The mathematical relations are shown in figure 5a; figure 5b shows
a typical encoded waveform.

Figure 5a

Figure 5b
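
The adaptive idea in figure 5 can be sketched in code, with several loud assumptions. The relations of figure 5a are not reproduced here, so this is not the Oki chip's algorithm. It is a generic adaptive quantizer in which the step is multiplied, after every sample, by a coefficient selected by the magnitude of the previous 4-bit code. The multiplier table, the step limits, and all names are hypothetical.

```python
# Hypothetical coefficients, indexed by the 3-bit magnitude of the previous code:
# small codes shrink the step, large codes enlarge it.
MULTIPLIERS = [0.9, 0.9, 0.9, 0.9, 1.2, 1.6, 2.0, 2.4]
STEP_MIN, STEP_MAX = 16.0, 1024.0    # assumed limits on the quantization value

def _next_step(step, magnitude):
    """Adapt the quantization value for the next sample."""
    return max(STEP_MIN, min(STEP_MAX, step * MULTIPLIERS[magnitude]))

def _rebuild(predicted, code, step):
    """Shared reconstruction rule, so encoder and decoder stay in step."""
    sign = -1 if code & 8 else 1
    return predicted + sign * ((code & 7) + 0.5) * step

def adpcm_encode(pcm):
    """Encode signed PCM values as 4-bit codes (1 sign bit, 3 magnitude bits)."""
    codes, predicted, step = [], 0.0, STEP_MIN
    for value in pcm:
        diff = value - predicted                      # PCM-value differential
        magnitude = min(7, int(abs(diff) / step))
        code = (8 if diff < 0 else 0) | magnitude
        predicted = _rebuild(predicted, code, step)
        step = _next_step(step, magnitude)
        codes.append(code)
    return codes

def adpcm_decode(codes):
    """Rebuild approximate PCM values from the 4-bit codes."""
    out, predicted, step = [], 0.0, STEP_MIN
    for code in codes:
        predicted = _rebuild(predicted, code, step)
        step = _next_step(step, code & 7)
        out.append(predicted)
    return out

codes = adpcm_encode([2000] * 20)     # a sudden jump, then a steady level
decoded = adpcm_decode(codes)
print(codes[:4], round(decoded[-1]))
```

With a jump like this, the step grows quickly while the codes saturate and shrinks again once the output has caught up, which is the behavior a fixed quantization value cannot provide. The sketch does not clamp the reconstructed value to a 12-bit range.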









Figure 6: Functional block diagram of the Oki Semiconductor MSM5218RS ADPCM integrated circuit.


Photo 2



Figure 7a

Figure 7b

Figure 7: An ADPCM speech analysis and synthesis (storage and reproduction) circuit built around the Oki MSM5218RS chip.
A low-cost 8-bit A/D converter is used in place of a higher-resolution and more costly 12-bit converter. The Oki
MSM5204RS 8-bit CMOS A/D converter, used in this circuit, contains a successive-capacitor-ladder conversion system.
It also incorporates a sample-and-hold stage that enables direct input of rapidly changing analog signals. An external
clock signal provides timing for the chip; the clock's frequency is not critical and can be anywhere from 450 to 500 kHz. The
frequency bandwidth of the signal input to the A/D converter is limited by an active low-pass filter, IC2, an Oki
ALP-2 filter with a 1.7-kHz cutoff frequency and attenuation of 18 dB per octave above the cutoff frequency.
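
To get a feel for the filter figures in the caption, the sketch below estimates attenuation from the stated cutoff and slope alone. It uses the idealized straight-line rolloff (0 dB up to the cutoff, then 18 dB for every doubling of frequency); a real filter's response near the cutoff will differ, and the function name is hypothetical.

```python
import math

CUTOFF_HZ = 1700.0              # from the figure 7 caption
SLOPE_DB_PER_OCTAVE = 18.0      # from the figure 7 caption

def stopband_attenuation_db(freq_hz):
    """Idealized attenuation: flat to the cutoff, then a constant slope."""
    if freq_hz <= CUTOFF_HZ:
        return 0.0
    octaves_above_cutoff = math.log2(freq_hz / CUTOFF_HZ)
    return SLOPE_DB_PER_OCTAVE * octaves_above_cutoff

for f in (1000.0, 3400.0, 6800.0):
    print(f, "Hz:", stopband_attenuation_db(f), "dB")
```

Under this approximation a component one octave above the cutoff (3.4 kHz) is reduced by 18 dB and one two octaves above (6.8 kHz) by 36 dB.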




Figure 8: Algorithm of the LOAD routine in the program of listing 1. Used with the circuit
of figure 7, LOAD takes analog voice signals from a microphone or other
source and stores them in ADPCM-encoded form in user memory.
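
Listing 1 is not reproduced here, so the following is only a sketch of the kind of bookkeeping a routine like LOAD implies: packing 4-bit ADPCM codes into bytes of memory and estimating how much speech a buffer can hold. The two-codes-per-byte layout, the high-nibble-first order, the 8-kHz sampling rate, and all function names are assumptions, not details from the listing.

```python
def pack_nibbles(codes):
    """Pack 4-bit codes two per byte, first code in the high nibble (assumed order)."""
    buf = bytearray()
    for i in range(0, len(codes), 2):
        high = codes[i] & 0x0F
        low = codes[i + 1] & 0x0F if i + 1 < len(codes) else 0   # pad an odd tail
        buf.append((high << 4) | low)
    return bytes(buf)

def unpack_nibbles(buf, count):
    """Recover the first `count` 4-bit codes from a packed buffer."""
    codes = []
    for byte in buf:
        codes.extend(((byte >> 4) & 0x0F, byte & 0x0F))
    return codes[:count]

def seconds_of_speech(memory_bytes, sample_rate_hz, bits_per_code=4):
    """How long a recording fits in the given amount of memory."""
    return memory_bytes * 8 / (sample_rate_hz * bits_per_code)

packed = pack_nibbles([7, 7, 4, 8, 1])
print(packed.hex())
print(unpack_nibbles(packed, 5))
print(seconds_of_speech(32 * 1024, 8000), "seconds in 32K bytes")
```

The same unpacking step is what a playback routine would need before handing codes to the synthesis chip.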



Listing 1

Listing 1 continued



One significant aspect of ADPCM speech synthesis is the ease of producing a custom vocabulary.

Photo 3



Figure 10: A voice-reproduction circuit built around the Oki MSM5205RS speech synthesis chip.
This circuit is useful in applications where you need a fairly inexpensive means of
reproducing a custom vocabulary. You can store your vocabulary with the circuit
of figure 7 and load the encoded speech into this simple circuit for output.



Figure 9: Algorithm of the DUMP routine from listing 1.



References



Parts



