|Publication number||US3706929 A|
|Publication date||Dec 19, 1972|
|Filing date||Jan 4, 1971|
|Priority date||Jan 4, 1971|
|Publication number||US 3706929 A, US 3706929A, US-A-3706929, US3706929 A, US3706929A|
|Inventors||Munnich Robert F, Robinson John L|
|Original Assignee||Philco Ford Corp|
United States Patent Robinson et al.
COMBINED MODEM AND VOCODER PIPELINE PROCESSOR 324/77 A, 77 B; 325/30, 38 B, 38 R, 15; 178/67; 340/148, 152
References Cited
UNITED STATES PATENTS
3,403,227 9/1968 Malm 179/1 SA
3,360,610 12/1967 Flanagan 179/1 SA
ABSTRACT
A digital pipeline processing system for implementing both vocoder and data modem functions. The pipeline capability is provided by three circulating memories and two associated arithmetic units. The first circulating memory and its arithmetic unit implement the functions of vocoder spectrum analysis, modem modulation, and modem demodulation. The second circulating memory and its arithmetic unit implement the functions of vocoder pitch extraction, vocoder parameter filtering, and vocoder speech synthesis. The third circulating memory is used for temporary storage of data while computations are carried out by the other circulating memories and their arithmetic units. The processing system also comprises a control unit which provides timing and gating signals for control of the remainder of the processor, an impulse response synthesizer which provides sinusoids used in speech synthesis, encoding and decoding circuitry for formatting data, and a plurality of read-only memories for permanent storage of functions required by the processor.
4 Claims, 13 Drawing Figures
BACKGROUND OF THE INVENTION
The invention herein described was made in the course of or under a contract or subcontract thereunder, with the U.S. Army Electronics Command, Fort Monmouth, N.J.
This invention relates to a pipeline digital processor for implementation of both vocoder and data modem functions. Vocoder systems function to transmit speech signals in a coded manner to reduce the transmission bandwidth which would otherwise be required if the speech were transmitted in an uncoded manner. Thus a vocoder system includes both a transmit terminal to analyze the characteristics of the speech wave to be encoded and to encode the speech wave, and a receive terminal to synthesize, from the coded signal sent to it, a reconstruction of the original speech wave.
Data modems function to facilitate the transmission of data (for example from a speech vocoder) over a transmission medium. Thus a modem includes both a transmit terminal to convert the encoded data into a modulating signal used to modulate a carrier, and a receive terminal to demodulate the received signal and thereby recover the transmitted data.
Both vocoder and modem equipment are therefore required for transmission of speech signals in an efficient high performance system. Prior art systems have provided separate digital hardware to implement the vocoder and the modem functions. A digital processor which can use the same hardware to implement both these functions would result in a substantial savings in equipment. Moreover a substantial decrease in processing time can be achieved by performing these functions in a pipeline processor. The latter processor differs from a conventional digital computer where a complete cycle time to retrieve a number from memory, perform an operation on it and return it to memory is required before the next operation can begin. In the pipeline processor, data is continuously circulated through memory and multiple arithmetic units, i.e., retrieval of a second operand begins before the first result has been returned to memory, and the arithmetic unit starts working on a second set of operands before the results from the first set are returned to memory. Thus by choosing the sequence of mathematical operations or algorithms so that the pipeline can be kept full, processing time is greatly reduced.
A key factor in the construction of the pipeline processor which can implement both the vocoder and the modem functions is the choice of algorithms to be implemented. Only by proper choice of the algorithms which are to be implemented to synthesize the vocoder and modem can the simultaneous objectives of (1) producing digital apparatus which can implement both the vocoder and modem and (2) making possible a sequence of mathematical operations amenable to pipeline processing, be achieved.
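The timing argument in the preceding paragraph can be illustrated with a rough cycle count. This is only a sketch: the three equal-length stages (retrieve, operate, return) and the function names are assumptions made for illustration and are not taken from the patent.

```python
def sequential_cycles(n_operations, stages=3):
    """Conventional computer: each operation finishes all stages before the next starts."""
    return n_operations * stages


def pipelined_cycles(n_operations, stages=3):
    """Pipeline kept full: one result per cycle once the first operand has passed through."""
    if n_operations == 0:
        return 0
    return stages + n_operations - 1


# Example: one pass over an 88-word circulating memory (88 is the memory size given in the text).
print(sequential_cycles(88), pipelined_cycles(88))
```

Under these assumptions the gain approaches the number of stages as the pipeline stays full for longer runs of operations.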
Accordingly, an object of this invention is to provide a digital processor which can implement the functions required of both the vocoder and the modem.
Another object is to provide a pipeline digital processor adapted to function as a speech vocoder.
Another object is to provide a pipeline digital processor adapted to function as a data modem.
Another object is to provide a pipeline digital processor adapted to function as a combined vocoder and modem.
SUMMARY OF INVENTION
In accordance with the invention, these objects are achieved by a digital apparatus which functions in two operating modes. In the transmit mode the input signal to the apparatus is representative of a speech wave in the time domain and in the receive mode the input signal is representative of the spectral density and the pitch frequency of a speech wave. In either mode input signals are applied to an analog to digital converter where they are sampled and converted to binary form, and then applied to the pipeline processor portion of the apparatus. The pipeline processor comprises first and second portions each of which comprises a circulating memory and an arithmetic unit.
When the apparatus is arranged in the "transmit" mode, the first portion of the pipeline processor computes the spectral density of the speech wave and generates a binary representation of a modulated carri-
DRAWING
FIG. 1 is a simplified block diagram of the invention.
FIG. 2 is a more detailed block diagram of the impulse response synthesizer 46 of FIG. 1.
FIG. 3 is a more detailed block diagram of a first portion of the pipeline processor of FIG. 1.
FIG. 4 is a more detailed block diagram of a second portion of the pipeline processor of FIG. 1.
FIG. 5 is a flow diagram depicting a process for autocorrelation pitch extraction.
FIG. 6 is a flow diagram depicting a process for vocoder spectrum analysis.
FIG. 7 is a flow diagram depicting a process for modem tone synthesis.
FIG. 8 is a flow diagram depicting a process for modem modulation.
FIG. 9 is a flow diagram depicting a process for modem demodulation.
FIG. 10 is a flow diagram depicting a process for vocoder synthesis.
FIGS. 11a and 11b are more detailed block diagrams of vocoder and modem data processors 94 and 96, respectively.
FIG. 12 is a more detailed block diagram of a portion of control unit 52 of FIG. 1.
DESCRIPTION OF THE INVENTION
Before entering upon a detailed description of the invention, the concept upon which the invention is based is described briefly.
In order to achieve a pipeline processor which performs the functions of both a vocoder and a data modem on a real time basis, it is necessary to choose those mathematical algorithms for implementation of these functions which allow use of digital apparatus compatible with both functions. In the preferred embodiment of the invention, implementation by the following algorithms of a channel vocoder and a differentially-coherent phase-shift-keyed frequency-division-multiplexed (DCPSK/FDM) modem has been chosen:
Pitch Extraction: Autocorrelation pitch extraction
Spectrum Analysis: A discrete Fourier transform (DFT) with triangular weighting
Voiced Synthesis: Impulse response synthesis using table lookup
Unvoiced Synthesis: Heterodyned noise algorithm
High Frequency Modem:
Modulation: Tone synthesis by table lookup
Phase Detection: The discrete Fourier transform (DFT)
Differential Phase Calculation: Vector multiplication
Diversity Combining: Vector addition
Channel vocoders attempt to reproduce the short time power spectrum of the speech waveform. The conventional channel vocoder comprises a pitch extractor to measure the pitch or fundamental frequency of the speech wave and a bank of filters (or its digital equivalent) to measure the spectral content of the speech wave. Presence or absence of the pitch signal or a test based on the speech energy in the filter bank can be used to indicate the presence of voiced or unvoiced sounds. The signals are then transmitted to the receiver for reconstruction of the speech waveform. Excitation derived from a pitch modulated pulse generator for voiced synthesis, or a broadband noise generator for unvoiced synthesis, is applied to a bank of filters in the receiver identical to that used in the transmitter. These filter outputs are amplitude modulated by the received signals which define the spectral content of the speech wave, and combined to provide a reconstruction of the speech wave.
In the digital processor of the invention, the functions of the channel vocoder are accomplished in the following manner:
Pitch extraction is accomplished by processing the speech signal in accordance with an autocorrelation pitch extractor algorithm wherein the autocorrelation function of the incoming signal is computed, and the pitch period is estimated by measuring the distance between autocorrelation peaks.
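The autocorrelation approach described above can be sketched in a few lines. This is an illustrative sketch, not the patented circuit: the lag search range of 20 to 90 samples, the 125 Hz test tone, and all function names are assumptions. Only the 8,250 Hz sampling rate comes from the text.

```python
import math

SAMPLE_RATE_HZ = 8250  # sampling rate stated in the text


def autocorrelation(samples, max_lag):
    """r[k] = sum of samples[n] * samples[n - k], with the same number of terms for every lag k."""
    return [
        sum(samples[n] * samples[n - k] for n in range(max_lag, len(samples)))
        for k in range(max_lag + 1)
    ]


def estimate_pitch_period(samples, min_lag, max_lag):
    """Return the lag (in samples) at which the autocorrelation is largest."""
    r = autocorrelation(samples, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])


# Hypothetical test input: a 125 Hz tone, whose period is 8250 / 125 = 66 samples.
signal = [math.sin(2 * math.pi * 125 * n / SAMPLE_RATE_HZ) for n in range(420)]
period = estimate_pitch_period(signal, min_lag=20, max_lag=90)
pitch_hz = SAMPLE_RATE_HZ / period
```

Here the single largest coefficient is taken as the peak, which matches the later description of adder 60 acting as a comparator; real speech would need a more careful peak test.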
Spectrum analysis is accomplished by computing the discrete Fourier transform (DFT) of the incoming speech wave. As is well known, the computation of the DFT involves an integration process. To assure that significant temporal changes are accounted for, the spectrum is computed by analyzing a portion of the incoming signal as seen through a triangular time window or weighting function. This processing is equivalent to analyzing the speech waveform with a bank of 16 analyzing filters.
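A single channel of such a windowed DFT might be sketched as follows. The window length of 150 samples, the 550 Hz test tone, and the function names are assumptions chosen for illustration. The patent states only that the window is triangular and that its width differs per channel.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8250  # sampling rate stated in the text


def triangular_window(length):
    """Triangle rising linearly from 0 to 1 at the midpoint and falling back toward 0."""
    half = length // 2
    return [min(n, length - n) / half for n in range(length)]


def windowed_dft_bin(samples, freq_hz):
    """Complex DFT coefficient of the windowed samples at one analysis frequency."""
    window = triangular_window(len(samples))
    acc = 0j
    for n, (x, w) in enumerate(zip(samples, window)):
        acc += x * w * cmath.exp(-2j * math.pi * freq_hz * n / SAMPLE_RATE_HZ)
    return acc


# Hypothetical input: a 550 Hz cosine observed through a 150-sample window.
tone = [math.cos(2 * math.pi * 550 * n / SAMPLE_RATE_HZ) for n in range(150)]
on_channel = windowed_dft_bin(tone, 550)
off_channel = windowed_dft_bin(tone, 1650)
```

The real and imaginary parts of each coefficient correspond to the cosine and sine multiplications described later for ROM 72.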
The voicing decision is made by a conventional energy balancing type of process wherein the energies in different portions of the spectrum are compared to preset thresholds and to each other. If the thresholds are exceeded and the energies in the different portions of the spectrum bear the correct relationship to each other, the decision that voiced signals are present is made.
Voiced synthesis is accomplished by synthesizing the impulse response of each channel by table lookup and then multiplying each channel by its corresponding amplitude parameter which is derived from the spectrum analysis performed on the original speechwave.
Unvoiced synthesis is accomplished by a heterodyne noise algorithm wherein each channel is modulated by a noise-like signal and then processed as in the case of voiced signals.
The algorithms used to implement the modem functions result in 16 frequency division multiplexed (FDM) channels of information carrying data representative of the results of the spectral analysis of the speech wave and the pitch frequency information. The data is carried in each channel by means of four-phase differentially coherent phase shift keyed (DCPSK) modulation of the carrier.
Modulation is accomplished by a table lookup algorithm wherein the value of every quantized sample of each tone corresponding to each channel is calculated in advance and stored in permanent storage. The modulated signal bearing the voice information is then synthesized by computing the sequence of addresses required to generate each tone with its proper phase shift, and then retrieving the stored samples. The stored samples are then added to form the modem output.
In the demodulation process, the digital equivalent of supplying a filter bank to separate the transmitted signals is accomplished by computing the discrete Fourier transform (DFT) of the composite of received signals. This DFT algorithm results in a series of complex frequency coefficients representative of the amplitude and phase of each of the tones.
The DCPSK modulation of each tone is demodulated by a vector multiplication algorithm wherein the differential phase vector is computed by calculating the vector product of a complex frequency coefficient and the complex conjugate of the previously received coefficient.
Diversity combining, if required, is accomplished by vector addition, wherein the real and imaginary parts of the differential phase vectors of each of the channels to be combined are summed separately. In the preferred embodiment of the invention, maximal ratio combining is implemented. This technique is described in: D. G. Brennan, "Linear Diversity Combining Techniques," Proceedings of the IRE, Vol. 47, No. 6, pages 1075-1101, June 1959.
The pipeline processor of the invention operates half-duplex, i.e., in either a transmitting or receiving mode.
In the "transmit" mode the input signal to the processor is a speech wave and the processor performs the vocoder function of speech wave analysis and the modem function of generating a modulating signal which carries the results of the speech wave analysis. This modulating signal becomes the output signal of the processor in this mode and can be used in the modulator of a conventional communication transmitting system.
In the receive mode the input signal to the processor is a speech information bearing communication signal such as may be derived from a conventional communications receiver. The same type of modulation used in the transmit mode must be used in this mode. The processor performs the modem function of demodulation of the input signal and the vocoder function of synthesis of the speech wave. The output signal of the processor in this mode may be used to drive conventional voice reproduction circuitry.
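The vector multiplication step can be illustrated with complex numbers. This is a sketch under assumptions: the decision rule (nearest multiple of 90 degrees) and the function names are hypothetical, since the text does not state the nominal phase positions or how a phase maps back to a bit pair.

```python
import cmath
import math


def differential_phase_vector(current, previous):
    """Product of the current coefficient and the conjugate of the previous one.

    Its angle equals the phase change between the two received coefficients.
    """
    return current * previous.conjugate()


def quadrant_index(vector):
    """Nearest multiple of 90 degrees, returned as an index 0..3 (assumed decision rule)."""
    return round(cmath.phase(vector) / (math.pi / 2)) % 4


# Example: the previous coefficient has an arbitrary phase of 0.3 rad; the tone then advances by 90 degrees.
previous = cmath.rect(2.0, 0.3)
current = cmath.rect(1.5, 0.3 + math.pi / 2)
decision = quadrant_index(differential_phase_vector(current, previous))
```

Because only the phase difference matters, the unknown absolute carrier phase (0.3 rad in the example) cancels out.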
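Combining by vector addition, as described above, amounts to summing real and imaginary parts separately. The sketch below is illustrative only, and the branch values are invented. It also shows why a plain sum of differential phase vectors favors the stronger branch: each vector's length grows with the received amplitude of its branch, which is one way to read the maximal ratio remark.

```python
import cmath
import math


def combine(differential_vectors):
    """Sum the real parts and the imaginary parts of the branch vectors separately."""
    real = sum(v.real for v in differential_vectors)
    imag = sum(v.imag for v in differential_vectors)
    return complex(real, imag)


# Hypothetical branches: a strong branch indicating +90 degrees and a weak, corrupted branch indicating 0 degrees.
strong = cmath.rect(4.0, math.pi / 2)
weak = cmath.rect(1.0, 0.0)
combined = combine([strong, weak])
combined_quadrant = round(cmath.phase(combined) / (math.pi / 2)) % 4
```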
FIG. 1 is a block diagram which shows the general organization of the pipeline processor of the invention.
At the heart of the processor are three circulating memories 62, 66 and 88, respectively designated memory No. 1, memory No. 2 and memory No. 3. Memory 62 along with its arithmetic unit comprising multiplier 58 and adder 60, and memory 88 together with its arithmetic unit comprising multiplier 78 and adders 84 and 86 function as first and second portions of the pipeline processor, respectively. Memory 66 is used to store data for the pipeline processor and has no arithmetic unit of its own. The functions performed by each portion of the pipeline processor in both the transmit and receive modes are listed within the blocks representative of memories 62, 66 and 88 and are described in detail hereinafter. In the preferred embodiment of the invention, these memories each have a capacity of 88 words. The word lengths are 12, 12, and 16 bits for memories 62, 66 and 88, respectively.
Transmit Mode by an eight-digit binary word.
A/D circuit 44 is coupled to multiplier 58, adder 60 and memory 62 via data switches 54 and 56, and to memory 66 via data switch 64, all of which form a first portion of the pipeline processor. The data switches are gating circuits which function under control of control unit 52 to transfer data at the proper time and to the proper circuitry during each mode of operation. The first portion of the pipeline processor performs pitch extraction by computing the autocorrelation function of the incoming speech wave and measuring the period between autocorrelation peaks. Each autocorrelation coefficient, as will be discussed subsequently, is the sum of a number of currently-received speech samples multiplied by an equal number of preceding samples. Memory 66 stores samples of the incoming speech wave. Stored samples are then transferred via data switch 56 to multiplier 58 where they become the preceding samples to be multiplied by current samples of the incoming speech wave. The products are summed in adder 60, and the resulting autocorrelation coefficients are accumulated in memory 62. After all the autocorrelation coefficients have been calculated, they are read out of memory 62 and sequentially passed through adder 60 which functions as a comparator to find the largest value (i.e., the peak of the autocorrelation function) among them. Adder 60 is coupled to pitch selection circuitry 63, where the pitch frequency is determined and assigned a six-bit code. Pitch selection circuitry 63 is coupled to voicing decision circuitry 95, where the pitch bits are all changed to zero only if it has been determined that the speech signal is unvoiced. Otherwise the pitch bits are not changed from the six-bit code that was initially assigned. Voicing decision circuitry 95 is coupled to vocoder data processor 96.
A/D circuit 44 is also coupled, via data switch 74, to multiplier 78, adders 84 and 86, and memory 88 all of trum analysis is carried out by computing the DFT of the product of the incoming speech wave and a triangular weighting function (or window). The values of the window weighting functions are computed in advance and stored in memory 88 in the following manner: Window ROM (read only memory) 68, which stores values of window increments (one increment for each of the 16 vocoder channels) necessary to generate samples of the window, is coupled by data switch 88 to adder 84. Memory 88, which stores the instantaneous values of the window function, is also coupled to adder 84. Beginning, in memory 88, with an instantaneous window value of zero, the increment for each channel is sequentially added in adder 84 to and then subtracted from the instantaneous window values stored in memory 88 so that a complete set of sampled values of the triangular function is generated for each channel. The width of these windows, which differs for each channel, is controlled by control unit 52 which specifies the number of additions and subtractions to be performed for each channel. The output of memory 88 is coupled by data switch 76 to multiplier 78 where the samples of incoming speech signal are multiplied by the window function. The DFT computation, as will be discussed subsequently, requires a multiplication of the input signal-window function product by sine and cosine waves. Quantized samples of the required sine and cosine waves are stored in ROM 72. Memory 88, which stores the sequences of addresses necessary to select the correct sequence of samples, is coupled to ROM 72 for the selection of samples. ROM 72 is also coupled by data switch 76 to multiplier 78, where multiplication of the input signal-window product by the sine and cosine waves takes place.
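The add-then-subtract generation of the triangular window described above can be sketched as a running accumulator. The increment and step count used in the example are invented. In the patent the increment comes from the window ROM and the number of additions and subtractions from the control unit.

```python
def triangular_window_by_increments(increment, steps_up):
    """Build window samples by adding a fixed increment, then subtracting it, starting from zero."""
    value = 0
    samples = []
    for _ in range(steps_up):   # rising half: repeated additions
        value += increment
        samples.append(value)
    for _ in range(steps_up):   # falling half: the same number of subtractions
        value -= increment
        samples.append(value)
    return samples


# Hypothetical channel: increment of 3 with 4 additions and 4 subtractions.
window = triangular_window_by_increments(increment=3, steps_up=4)
```

A larger increment with fewer steps gives a narrower window of the same peak height, which is consistent with the per-channel window widths mentioned in the text.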
The result of the parameters can be converted into three-bit words, each word representative of one-half the logarithm of the sum of the squares of the real and imaginary parts of each complex frequency coefficient; i.e., the frequency synthesizer output is representative of the logarithm of the magnitude of each of the complex frequency coefficients. Logarithmic steps are chosen because the human aural perception of loudness is proportional to the logarithm of sound energy. A three-bit code is chosen because it provides quantum steps at approximately 4 dB intervals, thereby conforming to conventional vocoder practice. This process is repeated for each of the 16 vocoder channels. ROM 92 is coupled to encoder 93 where the data is encoded into conventional 2,400 bits per second (Bps) or 1,200 Bps (delta coded) formats. Encoder 93 supplies its output signal to vocoder data processor 96 where the vocoder ammation (three bits per channel, except for the 16th channel whose least significant bit is clamped to One and used as a synchronization bit) and (b) six bits of pitch information, is stored in parallel register 500.
Upon occurrence of a read down pulse generated by a 2,400 Hz clock 506 and a divider circuit 502, the vocoder frame is transferred in parallel via gate 504 into vocoder data register 508. When the modem is acting as a modulator, the data in vocoder data register 508 is then shifted serially into modem data register 510, under control of clock 506, at a 2,400 bits per second (BPS) rate. When 32 bits of data, which comprise a modem data frame, have been accumulated in modem data register 510, the entire 32-bit modem frame is transferred, under control of clock 506 and divider 512, via gate 514 into parallel register 516.
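The three-bit logarithmic coding of a channel amplitude can be sketched as below. The 4 dB step and the three-bit range come from the text. The 0 dB floor, the clamping behavior, and the function name are assumptions for illustration only.

```python
import math

STEP_DB = 4.0   # approximate step size stated in the text
LEVELS = 8      # a three-bit code gives eight levels


def quantize_log_magnitude(coefficient, floor_db=0.0):
    """Map a complex frequency coefficient to a 3-bit code in roughly 4 dB steps.

    floor_db (the level assigned to code 0) is a hypothetical parameter.
    """
    power = coefficient.real ** 2 + coefficient.imag ** 2
    if power <= 0.0:
        return 0
    # 10*log10(power) equals 20*log10(magnitude): half the log of the sum of squares, in dB terms.
    level_db = 10.0 * math.log10(power)
    code = int((level_db - floor_db) // STEP_DB)
    return max(0, min(LEVELS - 1, code))


code = quantize_log_magnitude(complex(3, 4))  # magnitude 5, about 14 dB
```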
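The regrouping of 54-bit vocoder frames into 32-bit modem frames through a serial bit stream can be sketched as follows. The function name and the placeholder bit pattern are hypothetical. The frame sizes are those given in the text.

```python
VOCODER_FRAME_BITS = 54
MODEM_FRAME_BITS = 32


def repack(frames, out_size):
    """Serialize the input frames into one bit stream and cut it into frames of out_size bits.

    Bits that do not yet fill a complete output frame are left out of the result.
    """
    stream = [bit for frame in frames for bit in frame]
    usable = len(stream) - len(stream) % out_size
    return [stream[i:i + out_size] for i in range(0, usable, out_size)]


# 16 vocoder frames hold 864 bits, which is exactly 27 modem frames.
vocoder_frames = [[(f + i) % 2 for i in range(VOCODER_FRAME_BITS)] for f in range(16)]
modem_frames = repack(vocoder_frames, MODEM_FRAME_BITS)
```

The same helper run with an output size of 54 models the reverse regrouping used in the receive mode.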
When modem modulation is required, each 32-bit modem frame is transferred to ROM 68 and the second portion of the pipeline processor, where modulation is performed. In this process, each 32-bit modem frame is divided into 16 pairs of bits. Each bit pair is then used to determine the differential phase shift to be applied to one of 16 modem tones. The binary representations of the 16 modem tones, each bearing vocoder information via the quadrature DCPSK modulation, are then combined to form a composite FDM modem output signal. In the preferred embodiment of the invention, the 16 modem tones are:
11  2,035 Hz
12  2,145 Hz
13  2,255 Hz
14  2,365 Hz
15  2,475 Hz
16  2,585 Hz
(It should be noted that because the 54-bit vocoder frames are converted first into a 2,400 BPS bit stream and then, in sequential order, into 32-bit modem frames, these 16 modem channels do not correspond with regard to data content on a one for one basis with the 16 vocoder channels.) Modem modulation is accomplished by cosine table look-up using the cosine table stored in ROM 72 (see FIG. 1). Memory 88, which is coupled to ROM 72, stores the ROM cosine table addresses for each modem channel. MODNO and CHNO ROM 68, which contains the channel address increments for generating a tone representative of each channel as well as introducing the four phase shifts required for four-phase DCPSK modulation, is coupled by data switch to adder 84 where the necessary increments are added to the cosine table ROM 72 addresses stored in memory 88. The ROM 68 addresses necessary to access the proper channel address increments are determined from the 16 bit pairs transferred to ROM 68 from modem data processor 94 and the channel number transferred to ROM 68 from control unit 52. ROM 72 is coupled by data switch 47 to accumulator 48 where the sample values of each of the required tones are accumulated. These sample values are combined in accumulator 48 to produce the samples of the composite modem output signal representative of the 16 FDM modem channels to be transmitted. In the preferred embodiment of the invention each sample is represented by an eight-bit word. Accumulator 48 is coupled to D/A 50 where the composite modem signal is converted to analog form. D/A 50 may be coupled to the modulator of a conventional communications system (not shown).
Receive Mode
In the receive mode, analog signals modulated by speech information in the manner just described are received by communications receiver 42 (see FIG. 1), which may be of conventional form. Communications receiver 42 is coupled to A/D 44 where the analog received signal is sampled periodically and a binary representation of each analog sample is generated. As in the transmit mode, the sampling rate is 8,250 Hz, and each sample is represented by an eight-digit binary word.
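The division of a 32-bit modem frame into 16 bit pairs, each selecting a differential phase shift for one tone, can be sketched as below. The bit-pair-to-phase table is an assumption (a Gray-coded assignment chosen for illustration), because the text does not give the actual assignment. Phases are tracked as multiples of 90 degrees.

```python
# Hypothetical mapping from a bit pair to a phase advance in units of 90 degrees.
# The patent text does not state the real assignment; this table is an assumption.
PHASE_ADVANCE = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def bit_pairs(frame_bits):
    """Split a 32-bit modem frame into 16 consecutive bit pairs, one per tone."""
    if len(frame_bits) != 32:
        raise ValueError("a modem frame holds 32 bits")
    return [(frame_bits[i], frame_bits[i + 1]) for i in range(0, 32, 2)]


def advance_phases(previous_phases, frame_bits):
    """Differential encoding: each tone's new phase is its old phase plus the selected shift."""
    return [
        (phase + PHASE_ADVANCE[pair]) % 4
        for phase, pair in zip(previous_phases, bit_pairs(frame_bits))
    ]


frame = [0, 1] * 16
phases = advance_phases([0] * 16, frame)   # every tone advances by one quarter turn
phases = advance_phases(phases, frame)     # and by another quarter turn on the next frame
```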
The interconnection of the major components of the pipeline processor of FIG. 1 is the same in the receive mode as it is in the transmit mode. However, in the receive mode different functions are performed and different data switches are activated.
In the second portion of the pipeline processor, comprising multiplier 78, adders 84 and 86, and memory 88, modem demodulation is performed. The initial step in demodulation is the separation of the composite FDM signal into the 16 separate modem channels. This is accomplished by a DFT analysis of the composite signal and is carried out in the same manner and by the same apparatus (multiplier 78, adders 84 and 86, memory 88, ROM 68, ROM 72, and data switches 76 and 80) as is used for the spectrum analysis performed in the transmit mode. The resulting frequency coefficients which define each of the modem channels are stored initially in memory 88 and then in memory 66 which is coupled from memory 88 by data switch 64. This makes available at the same instant a current set of frequency coefficients (in memory 88) and the previously received set of coefficients (in memory 66) which, as will be discussed subsequently, is necessary for execution of the differential phase algorithm used to demodulate the DCPSK modem signal and recover the vocoder channel information. Memories 66 and 88 are coupled to multiplier 78 by data switches 74 and 76 respectively for the multiplication steps required in the differential phase computation. Multiplier 78 is coupled to accumulator 90 where the addition steps required in differential phase computation are carried out. Accumulator 90 is coupled to modem data processor 94 for storing of each 32-bit modem frame recovered by the demodulation process.
The transfer of data from modem to vocoder operation can best be described with reference to FIG. 11b which presents a more detailed block diagram of the vocoder and modem data processors 94 and 96 of FIG. 1. The processing components of FIG. 11b are the same as those of FIG. 11a and are therefore similarly numbered. However the logical interconnection (via data switches which are not shown) is different. Accumulator 90 (FIG. 1) is coupled to parallel register 510. Upon occurrence of a read down pulse generated by clock 506 and divider circuit 512, each 32-bit modem frame is transferred in parallel via gate 514 into modem data register 516. Modem data register 516 is coupled to vocoder data register 500 into which data is transferred at a 2,400 BPS rate under control of clock 506. When 54 bits of data, which comprise a vocoder frame, have been accumulated in vocoder data register 500, the entire 54-bit frame is transferred, under control of clock 506 and divider 502, via gate 504 into parallel register 508.
Vocoder data processor 96 (FIG. 1) is coupled to decoder 97 which converts pitch frequency (which by convention is transmitted) to pitch period (which is required in impulse response synthesizer 46). In addition, when the 1,200 BPS delta coded mode is used, decoder 97 decodes the delta modulation.
Decoder 97 is coupled to linear interpolator 61 where parameter filtering, the first step in the speech synthesis process, is performed. Parameter filtering is necessary to remove a 44.4 Hz noise component which results from vocoder data being supplied to the synthesizer at a rate of 44.4 frames per second (2,400 BPS / 54 bits per frame).
Impulse response synthesizer 46 (shown in more detail in FIG. 2), which synthesizes the impulse response of each vocoder channel, is coupled to multiplier 58 by data switch 54. Memory 62, which stores the vocoder amplitude parameters, is also coupled by data switch 56 to multiplier 58 where the product of the impulse response and the amplitude parameter of each vocoder channel is obtained. Multiplier 58 is coupled by data switch 47 to accumulator 48 where digital samples representative of the composite synthesized voice signal are formed. Accumulator 48 is coupled to D/A 50 where the analog composite of the speech signal is formed. D/A 50 may be coupled to a speech transducer (not shown) of conventional form.
Unvoiced sounds are indicated by the presence of zeros in all of the six pitch-representative bit positions of the vocoder frame. Those zeros are detected by circuitry in impulse response synthesizer 46, which gates the first portion of the processor into the unvoiced synthesis mode. In this mode, the channel impulse responses are not used. Instead binary samples of sine waves at the center frequency of each channel are generated by cosine ROM 72, which is coupled to multiplier 58 by data switch 56. Memory 62, which stores the vocoder amplitude parameters, is also coupled by data switch 56 to multiplier 58, where the samples of the sine waves are modulated by the vocoder amplitude parameters. Noise generator 59 is coupled to adder 60 for further modulation of each of the sine waves. This processing produces, in effect, a band of noise centered at each vocoder channel frequency, and modulated by the appropriate channel amplitude parameter. The remainder of unvoiced synthesis processing by accumulator 48 and D/A 50 is the same as in voiced synthesis.
Control unit 52, which contains gating and timing circuitry of conventional form, is coupled to all data switches and shift registers to control the flow of all data in the system.
Structural Details of Components of FIG. 1 System
FIGS. 2 through 4 and 12 show in more detail the structure of the major components of the system of FIG. 1. (The constant inputs shown in these figures, viz, One or Zero, 0, K, 106 are internally generated by connecting the gate inputs to appropriate constant voltage levels.)
FIG. 2 Impulse Response Synthesizer 46
The impulse response synthesizer 46 of FIG. 1 is shown in detail in FIG. 2. Pitch counter 103 receives samples of filtered pitch from memory 62 (FIG. 1). Pitch counter 103 is coupled to pitch logic network 105 which transmits pulses to impulse flip-flop (IMPFF) 112 when each new pitch period should commence. Impulse ROM 116 stores the impulse responses for each of the 16 vocoder channels. In the preferred embodiment of the invention, the impulse response for each vocoder channel is represented by samples which are read out of memory at an 8,250 Hz rate. The addresses in ROM 116 which are to be accessed are generated by combining a channel time signal supplied by control unit 52 (FIG. 1) to addend register (ADR) 100, with index numbers which circulate in the five-word circulating memory consisting of adder 101, index logic network 110 and memory 114. Each word in this loop circulates once per channel time. The channel time, 6.7 microseconds, is the time it takes to perform five calculations and make five data shifts. (See subsequent discussion of control unit 52 for further discussion of timing.) Memory 114 is coupled to and controls the readout of data from impulse ROM 116. Pitch pulses supplied by pitch logic network 105 to IMPFF 112 cause IMPFF 112 to change to the set condition. IMPFF 112, which is coupled to index logic network 110, then causes the addresses in the circulating memory to be incremented by one. IMPFF 112 is then cleared. Each address of the circulating memory continues to be incremented by one until all the impulse responses are read out of ROM 116. At that time the index numbers in the circulating memory are reset to correspond to the addresses until the arrival of the next pitch pulse. Impulse ROM 116 is coupled to accumulator 119 which stores the samples of the impulse responses generated during each channel time.
FIG. 3 First Portion of Pipeline Processor
FIG. 3 shows the first portion of the pipeline processor in more detail. The interconnection of addend register (ADR) 126, augend register (AGR) 128, adder 130, and summer 132 which together are comprised in accumulator 48; buffers 144 and 146, ADR 148 and ADR 154, AGR 150 and AGR 158, adder 152 and summer 156 which together are comprised in adder 60; and multiplicand register (MC) 134, multiplier registers (MP) 136 and 142, multiplier 138 and product register 140 which together are comprised in multiplier 58 are shown. Buffers 160, 162, 164, 168, 170 and 172, which are connected to memory 66, serve to provide, along with memory 66, a 91-word circulating loop for computation of the autocorrelation coefficients during pitch extraction, and to allow for reorganization of words in memory 66 during modem differential phase computations. Reorganization of the words is required to change from calculating the real part of the differential phase vector to calculating the imaginary part. Pitch selection circuit 63, which comprises selection logic network 157, modulo 90 counter 155, pitch register 159 and ROM 161, determines, during the transmit mode, the pitch frequency from the autocorrelation data transferred to circuit 63 from the first portion of the pipeline processor.
Parameter filtering during voice synthesis operatio in the receive mode is accomplished by linear interpolator network 61 in conjunction with memory 62. As will be discussed subsequently, linear interpolation is accomplished by determining the difference between successive transmitted samples of vocoder data and then adding a portion of the difference to subsequent samples within a frame. The vocoder data frames are supplied to buffer 151 by decoder 97. Buffer 151 is coupled in turn to buffers 149, 147, and 145 which provide suitable storage and time delay necessary for the interpolator processing. Bufiers 151 and 147 are also coupled to adder 143 where the difference between successive samples is determined. Shift networks 137 and 139 and adder 141 then compute the required fraction of this difference, which is to be added to subsequent vocoder data. Shift network 139 is coupled to memory 62 where this addition takes place.
FIG. 4 Second Portion of Pipeline Processor FIG. 4 shows the second portion of the pipeline processor in more detail. Cosine ROM 72 comprises ROM 182, which stores 150 binary samples representative of a sinusoid. The frequency of the sinusoid produced is dependent upon the order in which the samples are read out, and the phase is dependent on which sample is selected as the starting point. Tone index register (TXR) 174, which contains the address of the next sample to be read out, is coupled to logic network 178. Modulation index register (MXR) 176, which contains the next address necessary to generate a tone having the proper phase to implement modem DCPSK modulation, is also coupled to logic network 178. Logic network 178, which is coupled to index register (XR) 180, selects (depending on whether a vocoder or modem tone is required) either the contents of TXR 174 or MXR 176 to be loaded into XR 180. XR 180 is coupled to ROM 182 for selection of the next sample to be read out. ROM 182 is coupled to tone register (TR) 184, where the sequence of samples necessary to generate each required tone is stored.
The interconnections of MC 186, MP 188, multiplier 190 and product registers 190 and 194, which are comprised in multiplier 78; ADR 196, AGR 198, adder 200 and summer 202, which are comprised in adder 84; ADR 204, AGR 206, adder 208 and summer 210, which are comprised in adder 86; and ADR 212, AGR 214, adder 216 and summer 218, which are comprised in the accumulator, are also shown.
Adder 84 is coupled to and provides spectrum analyzer filter outputs to voicing decision circuitry 95. Voicing decision circuitry 95 comprises scratch-pad registers 225 and 226, where data is held temporarily during the voicing computations, and flip-flop circuitry 227, where the voicing decisions are made. The decision as to presence of voiced or unvoiced sounds, based on energy in the spectrum analyzer filters, is made in the following manner:
First, by use of adder 84 and registers 225 and 226, a summation is made of the outputs from analyzer channels 1 through 5. These registers act as scratch pads in which data can be held temporarily for later reinsertion into the adder. The sum of the five lowest frequency channels, designated TOT 5, is transmitted to memory 88. The summation process continues with the remaining 11 channels, so that the sum of all 16 channels is also formed. This quantity, designated TOT 16, is also transmitted to memory 88. TOT 5 is then compared in logic circuitry 227 with a constant designated KZ, which is permanently wired in by connecting inputs of voicing logic and flip-flop circuitry 227 to appropriate constant voltages. If TOT 5 is greater than or equal to KZ, voicing logic circuit 227 recognizes that fact as a partial requirement for a voiced condition. TOT 5 is also multiplied in multiplier 78 by a second permanently stored constant, designated KY, and the product is compared with TOT 16. If TOT 16 is less than the product of KY and TOT 5, and if the condition TOT 5 greater than or equal to KZ has already been fulfilled, voicing logic circuit 227 produces a 1 indicating a voiced frame. TOT 16 is also compared with a third constant designated KX. If TOT 16 is greater than or equal to KX, the frame of data is also treated as voiced. If neither criterion for voicing is fulfilled, the voicing logic produces a 0. When a frame is unvoiced the pitch extractor output is forced to an all-zero condition. When a frame is voiced, the pitch extractor output is gated into the vocoder bit stream and stored in vocoder data processor 96.
Since an unvoiced sound is produced by turbulent air passing through a constriction of the mouth or throat, a large amount of high frequency noise will be present. Therefore, in the preferred embodiment of the invention, the test adopted to determine presence of voiced or unvoiced sounds makes use of the presence of a large amount of energy in the high portion of the frequency spectrum during unvoiced sounds.
The physical significance of the parameters KX, KY and KZ is as follows:
KX is a high-threshold parameter, KZ is a low-threshold parameter, and KY is a constant of proportionality. In the preferred embodiment of the invention, each vocoder amplitude parameter can have an integer value between 0 and 127 (six bits). KX is set at 150, KY at 1.9 and KZ at 20. These values were chosen empirically by examining different values of TOT 5 and TOT 16 in simulation work. TOT 5 represents the energy in the low-frequency portion of the speech spectrum, and TOT 16 represents the total energy in the speech spectrum. If TOT 16 is greater than or equal to KX, the total speech energy is high, indicating presence of voiced sounds. If TOT 16 is less than the product of KY and TOT 5, low-frequency energy constitutes a significant portion of the total speech energy. In addition, if TOT 5 is greater than or equal to KZ, the low-frequency energy content of the speech wave exceeds at least a minimum amount. The processor then will determine the presence of voiced sounds whenever the speech energy is very high (TOT 16 ≥ KX) or when the speech energy is of medium amount and is concentrated in the low-frequency region (TOT 5 ≥ KZ and TOT 16 < KY × TOT 5). An unvoiced condition will occur whenever the total speech energy is either very low (TOT 5 < KZ and TOT 16 < KX) or is at medium strength concentrated in the high-frequency region (TOT 5 ≥ KZ and TOT 16 ≥ KY × TOT 5).
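The voicing rule described above can be sketched in a few lines of Python. This is only an illustration of the comparisons, not the wired logic of circuitry 227; the function name `voicing_decision` and the list-based input are assumptions made for the example, while the constants are the values stated in the text.

```python
KX = 150   # high threshold on total energy TOT 16 (value from the text)
KY = 1.9   # constant of proportionality (value from the text)
KZ = 20    # low threshold on low-band energy TOT 5 (value from the text)

def voicing_decision(channel_amplitudes):
    """Return 1 for a voiced frame and 0 for an unvoiced frame.

    channel_amplitudes: 16 analyzer outputs, channel 1 first
    (hypothetical input layout chosen for this sketch).
    """
    if len(channel_amplitudes) != 16:
        raise ValueError("expected 16 channel amplitudes")
    tot5 = sum(channel_amplitudes[:5])    # five lowest-frequency channels
    tot16 = sum(channel_amplitudes)       # all 16 channels
    very_high_energy = tot16 >= KX
    low_band_dominant = tot5 >= KZ and tot16 < KY * tot5
    return 1 if (very_high_energy or low_band_dominant) else 0
```

For instance, a frame with amplitude 10 in channels 1 through 5 and 2 elsewhere gives TOT 5 = 50 and TOT 16 = 72, which satisfies the second criterion and is treated as voiced.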
FIG. 4 also shows the circuitry used to encode the vocoder output data in the transmit mode and the circuitry to decode the vocoder input data in the receive mode.
Encoder unit 93, which comprises register 220, counter 222 and logic and comparator network 224, operates in conjunction with ROM 92 and vocoder data processor 96 to encode the vocoder output data into a 2,400 BPS or a 1,200 BPS standard format. The output of ROM 92 is a three-bit word representative of the amplitude parameter of each channel. The least significant bit for channel 16 is forced to assume a 1 value, making an effective two-bit description for that channel with the constant 1 acting as a synchronization bit. For 2,400 BPS operation, no further processing is carried on in encoder 93 and the three-bit words are inserted directly into vocoder data processor 96, where they become part of the vocoder data frame.
However, for operation at 1,200 BPS, delta coding is required to maintain compatibility with conventional vocoder equipment. Channels 1, 2, 3, and are processed as in the 2,400 BPS case, in that the three-bit codes are inserted directly into vocoder data processor 96. The codes for channels 3 and 10 are also inserted into counter 222, which is an up/down counter with preset capabilities and a round-off feature. This feature causes the counter to remain unchanged if it contains a minimum count and receives a step-down signal, or if it contains a maximum count and receives a step-up signal. ROM 92 is coupled to register 220, to which the three-bit word for each of the remaining channels (4 through 9 and 11 through 15) is transferred. Register 220 and counter 222 are coupled to logic and comparator network 224, where their contents are compared. Network 224 is also coupled to vocoder data processor 96. If the contents of register 220 are greater than the contents of counter 222, a "1" is gated to the vocoder data processor as the one-bit delta code for that channel. Counter 222 is then stepped up by 1, subject to round-off. If the contents of counter 222 are greater than the contents of register 220, a "0" is gated to vocoder data processor 96, and counter 222 is stepped down, subject to round-off. After all channels have been processed, a "1" is gated into the vocoder data processor as a synchronization bit.
Decoder unit 97, which comprises pitch ROM 229, input register 228, reference register 230, decode logic 231, and decode ROM 233, converts received vocoder data into a format suitable for vocoder synthesis. Vocoder data processor 96 is coupled to pitch ROM 229, where the six pitch bits, which are representative of pitch frequency, are converted to a six-bit word representative of pitch period. When the 2,400 BPS format is used, this is the only decoder function performed. When the 1,200 BPS delta-coded format is used, the remainder of the decoder circuitry functions to convert the delta-coded information into the standard three bits per channel format.
FIG. 12 Control Unit 52; Timing FIG. 12 shows a portion of control unit 52 of FIG. 1, and timing diagrams which illustrate basic system timing.
Crystal oscillator 518 provides the basic 5.94 MHz clock source from which all processor timing pulses are derived. Oscillator 518 is coupled to counter 520, which divides the 5.94 MHz frequency modulo 8. The outputs of the three stages of divider 520, designated φ1, φ3, and φ7, are used to control operation of all arithmetic units. They each provide outputs at 1.347 microsecond intervals, which is designated as the system word time.
This is the time between processor calculations and data shifts. Counter 520 is coupled to counter 522 which counts modulo 5 and thereby provides time slots for the execution of five complete consecutive operations in each arithmetic unit of the processor within a 6.734 microsecond interval designated as channel time. Counter 522 is coupled to counter 524 which counts modulo 18 to provide time slots for groups of 18 channel times. A complete cycle of counter 524 takes place every 0.1212 milliseconds and corresponds to the system sampling rate of 8.25 KHz. Thus, since there are 18 channel times during each sampling interval, computations for the 16 vocoder channels or the 16 modem channels can be performed consecutively with two additional channel times available for auxiliary functions. A complete processing cycle for vocoder analysis, and pitch extraction takes place during sampling intervals. Counter 526 which is coupled from counter 524 provides capability for counting each such processing cycle.
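The figures quoted for the timing chain can be checked with a short calculation. The sketch below simply divides the 5.94 MHz clock by 8, 5 and 18 as the counters do; the variable names and the helper `period_us` are illustrative and do not appear in the patent.

```python
from fractions import Fraction

CLOCK_HZ = Fraction(5_940_000)      # crystal oscillator 518

word_rate = CLOCK_HZ / 8            # counter 520, modulo 8
channel_rate = word_rate / 5        # counter 522, modulo 5
sample_rate = channel_rate / 18     # counter 524, modulo 18

def period_us(rate_hz):
    """Period in microseconds of a signal with the given rate in Hz."""
    return float(1_000_000 / rate_hz)

print(f"word time     {period_us(word_rate):.3f} us")
print(f"channel time  {period_us(channel_rate):.3f} us")
print(f"sample time   {period_us(sample_rate) / 1000:.4f} ms")
print(f"sampling rate {float(sample_rate):.0f} Hz")
```

Exact fractions are used so that the final rate comes out as exactly 8,250 Hz rather than a rounded floating-point value.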
Theory of Operation Mechanization of the algorithms used to implement the vocoder and modem functions will be explained with the aid of flow diagrams shown in FIGS. 5 through 10. The reference numerals shown in parentheses within the logic boxes of the flow diagrams refer to the particular apparatus of FIGS. 1 through 4 and 11 and 12 by which the particular logical operation is carried out. The unparenthesized reference numerals designate respective steps of the algorithm.
Autocorrelation Pitch Extractor The input signal, f(t), is multiplied by a stored replica of itself delayed by τ, f(t − τ). The product is time-integrated over the interval 0 to T, and the integral is averaged over T. The function R(τ) is evaluated for various values of τ. The value of τ which yields the largest value of R(τ) is taken to correspond to the fundamental pitch period of the speaker's voice. In the actual mechanization of this algorithm, the autocorrelation function of equation (1) is approximated by:
where T is the sampling interval (1/8,250 Hz) and m is chosen so that mT equals the maximum expected pitch period. In the preferred embodiment of the invention, m is set to the maximum value at the beginning of each voiced interval and thereafter adjusted to the period previously found for the speaker's voice.
FIG. 5 Flow Diagram of Algorithm for Computing Pitch Period FIG. 5 shows the flow diagram for this algorithm. The algorithm consists of two phases: computing the autocorrelation function as approximated by equation (2), and determining the value of τ for which the autocorrelation function peaks. To compute the autocorrelation function, the input speech wave, in step 230, is sampled at the system sampling rate (8,250 Hz) by A/D converter 44 and the samples are converted to digital form. Each pitch extraction interval (or frame) comprises 180 sample times. During the first half-frame (90 samples), each incoming sample is sequentially multiplied, step 234, by each of the preceding samples in the frame. The delay is obtained by circulating, step 232, the preceding samples in a 91-word input sample delay line (ISDL). The products which are obtained are accumulated, step 238, in a 90-word correlation accumulator delay line (CADL). During this process the two delay lines are recirculated synchronously. Because of the one-word difference in delay line lengths, a slippage between samples being correlated occurs at the rate of one sample per delay cycle. This allows the 90-word delay line to accumulate, step 236, in successive words the cumulative sums of autocorrelation products taken between samples separated by 1 to m. The additional word in the 91-word delay line also permits the insertion of each incoming sample into that delay line during the first half-frame. At the end of the first half-frame, the 90-word delay line contains sums of from one to 89 terms representing correlation products of samples separated by a delay time of from one to 89 sample times. During the second half-frame, multiplication continues and one word per sample time is transferred to correlation accumulators located in memory 62. Thus at the end of 180 sample times, the correlation accumulators each contain the sum of 90 correlation products representing pitch periods of from one to 90 sample time intervals.
In the peak picking phase, each autocorrelation sum is transferred, step 240, to a first comparison register. The contents of the first comparison register and a second comparison register, which is initially set to 0, step 246, are then compared 242 by subtraction. If the number in the first register is greater than the number in the second register, the contents of the first register are transferred into the second register for subsequent comparisons and a pitch count 250 corresponding to τ is also gated, step 248, into the pitch register. Thus when all correlation sums have been processed, the maximum value of all stored values of R(τ) will reside in the second comparison register. The corresponding value of τ is equal to the pitch period. Logic step 244 is provided to: insert the greater input from each comparison into second comparison register 246, to set the second comparison register 246 to 0 at the beginning of each frame, and to disable comparison 242 except during a prescribed interval optimized to minimize false autocorrelation peaks.
During the first voiced frame, the prescribed interval is selected from accumulated sums n = 27 to n = 90, corresponding to the upper pitch frequency of 305 Hz (8,250/27) and a minimum pitch of 92 Hz (8,250/90) respectively. During the remainder of the voiced interval the searched region is limited to within plus or minus 20 samples of the τ at which a peak was found in the last frame.
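The two phases (forming correlation sums, then peak picking over a restricted range of delays) can be sketched in Python as follows. This is a simplified software arrangement, not the delay-line mechanization: it assumes the whole 180-sample frame is available in a list and forms each sum from the 90 products of the second half-frame. The names `correlation_sum` and `pitch_period` are hypothetical.

```python
SAMPLE_RATE_HZ = 8250
FRAME_LEN = 180            # sample times per pitch extraction frame
MIN_LAG, MAX_LAG = 27, 90  # search range during the first voiced frame

def correlation_sum(frame, lag):
    """Sum of 90 products of samples separated by `lag` sample times."""
    return sum(frame[n] * frame[n - lag] for n in range(MAX_LAG, FRAME_LEN))

def pitch_period(frame, lo=MIN_LAG, hi=MAX_LAG):
    """Return the delay (in sample times) with the largest correlation
    sum, or None if no sum exceeds zero."""
    if len(frame) != FRAME_LEN:
        raise ValueError("expected a 180-sample frame")
    best_sum = 0       # plays the role of the second comparison register
    best_lag = None    # plays the role of the pitch register
    for lag in range(lo, hi + 1):
        r = correlation_sum(frame, lag)
        if r > best_sum:   # keep the greater value and its delay
            best_sum, best_lag = r, lag
    return best_lag
```

For a pulse train that repeats every 55 samples, `pitch_period` returns 55, which corresponds to 8,250/55 = 150 Hz. Passing `lo` and `hi` of 35 and 75 imitates the plus or minus 20 sample tracking window used after the first voiced frame.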
The range of measurable pitch periods can be changed to cover the pitch frequency range of 70 Hz to 300 Hz conventionally used in vocoders by adding an additional delay between A/D conversion 230 and one-word delay 232. If, for example, a 27-word delay were to be inserted, pitch periods corresponding to frequencies from 70 Hz to 300 Hz would be measurable.
Spectrum Analyzer A spectral analysis equivalent to that performed by a conventional channel vocoder analyzer is performed by using a computation of the discrete Fourier transform (DFT) of the speech wave.
The DFT is characterized by:
$A_r = \sum_{n=0}^{m-1} x(nT)\, e^{-j 2\pi r n / m}$ (3)
where $A_r$ = the rth Fourier coefficient,
$x(nT)$ = the sampled waveform to be analyzed,
$T$ = the time between sample points (1/8,250 Hz),
$n$ = the nth sample, and
$m$ = the number of sample points to be analyzed (i.e., $mT$ is the analysis frame time).
Analysis of a fixed number of unweighted samples (i.e., no adjustment of the amplitude of the samples) of the speech input is equivalent to analyzing the speech input as seen through a rectangular window in the time domain. The equivalent vocoder filter that would result would have a (sin x)/x shape (i.e., the Fourier transform of a rectangular time function). In order to reduce the spectral contamination between vocoder filters, a triangular window or weighting function, $w_v(nT)$, where v represents the vth channel of the analyzer, is applied to the samples. The frequency response of the resulting filter has a $(\sin x / x)^2$ shape:
$W_v(\Delta f_v) \propto \left[ \dfrac{\sin(\pi\, \Delta f_v\, m_v T / 2)}{\pi\, \Delta f_v\, m_v T / 2} \right]^2$ (4)
where $\Delta f_v$ is the frequency difference from the filter's center frequency. The magnitude of the function of equation (4) falls to within about 3 dB of its peak value when $\Delta f_v = 2/(\pi m_v T)$. Therefore the relationship between the bandwidth, $B_v$, of the vth filter and the length of the analyzer frame, $m_v T$, necessary to achieve this bandwidth is:
$B_v = 2\,\Delta f_v = 4/(\pi m_v T)$.
Table I shows the required frame times and number of samples necessary to simulate the 16 channels of the vocoder analyzer. The values chosen for $f_v$ and $B_v$ are those conventionally used in vocoder practice.
TABLE I Parameters for DFT Analyzer
|Channel no.||Center frequency f_v (Hz)||Bandwidth B_v (Hz)||Window length m_v T (msec)||Samples m_v (T = 0.1212 msec)|
Use of this algorithm for vocoder analysis permits direct achievement of filter banks having non-equal bandwidths (which conforms to conventional vocoder practice) without recourse to the combining of the outputs of many equal-bandwidth filters, as is commonly done in other systems.
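The relationship between a channel's bandwidth and its window length can be turned into a small calculation. The sketch below assumes the relation B = 4/(π m T) given in the text; the function names and the 150 Hz example bandwidth are illustrative only and are not taken from Table I.

```python
import math

SAMPLE_RATE_HZ = 8250.0   # 1/T

def window_samples(bandwidth_hz):
    """Number of samples m (rounded) for which B = 4 / (pi * m * T)."""
    return round(4.0 * SAMPLE_RATE_HZ / (math.pi * bandwidth_hz))

def bandwidth_hz(samples):
    """Bandwidth B = 4 / (pi * m * T) obtained with a window of m samples."""
    return 4.0 * SAMPLE_RATE_HZ / (math.pi * samples)

def window_length_ms(samples):
    """Window length m * T in milliseconds."""
    return 1000.0 * samples / SAMPLE_RATE_HZ
```

As an illustration, a hypothetical 150 Hz channel would need about 70 samples, a window of roughly 8.5 msec; narrower channels need proportionally longer windows, which is why each channel has its own frame length.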
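Thus by substituting in equation (3) the sampled waveform to be analyzed for the vth filter,
$x_v(nT) = w_v(nT)\, f(nT)$, (5)
where $f(nT)$ is the sampled speech wave, the required computation is:
$A_r = \sum_{n=0}^{m_v-1} w_v(nT)\, f(nT)\, e^{-j 2\pi r n / m_v}$, (6)
where r is the ratio of the vth channel center frequency, $f_v$, to the basic frequency spacing of the Fourier series, $1/m_v T$. Therefore by substituting $r = f_v m_v T$ in equation (6), the DFT coefficients can be represented by:
$A_v = \sum_{n=0}^{m_v-1} w_v(nT)\, f(nT)\, e^{-j 2\pi n f_v T}$. (7)
In order to avoid computation in complex arithmetic and thereby minimize equipment complexity, the relationship
$e^{j\theta} = \cos\theta + j \sin\theta$ (8)
is inserted into equation (7), to yield:
$|A_v| = \left[ \left( \sum_{n=0}^{m_v-1} f(nT)\, w_v(nT) \cos 2\pi n f_v T \right)^2 + \left( \sum_{n=0}^{m_v-1} f(nT)\, w_v(nT) \sin 2\pi n f_v T \right)^2 \right]^{1/2}$. (9)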
FIG. 6 shows the flow diagram for implementation of equation (9). In step 252, the input speech wave, f(t), is sampled and the samples are converted to digital form. In step 254, the train of samples representative of the speech wave, f(nT), is multiplied for each channel by the triangular weighting function, w(nT), and in steps 256 and 264 respectively the latter product is multiplied in one branch by the cosine coefficients and in another branch by the sine coefficients. In steps 258 and 266 respectively the results of steps 256 and 264 are then added by circulating, steps 260 and 268, the contents of the accumulators through the adders. After the required number of samples in the analysis frame has been processed, the contents of the accumulators are squared, 262 and 270, and added, 272, in pairs. The result represents the output of each of 16 vocoder analyzer channels, which is then encoded by encoder unit 93 into the standard 54-bit vocoder format.
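The per-channel flow of FIG. 6 (weight, multiply by cosine and sine, accumulate, square, add) can be sketched for a single channel as follows. This is a floating-point illustration under stated assumptions: the exact shape of the triangular weights and the names `triangular_window` and `channel_energy` are choices made for the example, and the sketch returns the sum of squares without any further scaling or encoding.

```python
import math

SAMPLE_RATE_HZ = 8250.0

def triangular_window(m):
    """Symmetric triangular weights of length m, peaking at mid-frame
    (one possible triangular shape, assumed for this sketch)."""
    half = (m - 1) / 2.0
    return [1.0 - abs(n - half) / (half + 1.0) for n in range(m)]

def channel_energy(samples, center_hz):
    """Weighted cosine and sine sums for one channel, squared and added."""
    weights = triangular_window(len(samples))
    cos_acc = 0.0
    sin_acc = 0.0
    for n, (x, w) in enumerate(zip(samples, weights)):
        angle = 2.0 * math.pi * center_hz * n / SAMPLE_RATE_HZ
        cos_acc += x * w * math.cos(angle)   # cosine branch accumulator
        sin_acc += x * w * math.sin(angle)   # sine branch accumulator
    return cos_acc ** 2 + sin_acc ** 2
```

Feeding a 550 Hz tone into the sketch gives a large output for a channel centered at 550 Hz and a very small one for a channel centered at 1,650 Hz, which is the filter-bank behavior the weighting is meant to produce.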
Modem Modulation Modem modulation consists of three processes:
a. tone synthesis, which is the generation of binary words respectively representative of the 16 modem carrier tones (sine and cosine functions for the DFT processes used in modem demodulation and vocoder analysis are generated in the same manner),
b. modulation, which consists of imparting the information-carrying four-phase DCPSK modulation to the tones, and
c. generation of a modem preamble.
One hundred and fifty samples representative of one cycle of a sinusoid are permanently stored in ROM 72. If these samples were continuously read out, in order, at the system sampling rate of 8,250 Hz, a 55 Hz sine or cosine wave would be synthesized. However, if at each sampling time the ROM address were incremented by p, instead of 1, a sinusoid of p times 55 Hz would be generated. By choosing p, any tone which is a multiple of 55 Hz can be synthesized.
FIG. 7 Tone Synthesis FIG. 7 is the flow diagram for tone synthesis.
The ROM 72 (FIG. 1) addresses necessary to generate the tones for each channel are called TONEX and are stored in memory 88. The channel address increments corresponding to p which are necessary to generate all the different frequencies are called channel numbers (CHNO) and are stored in CHNO ROM 68. In order to synthesize the tone, the ROM 68 address corresponding to the required CHNO is supplied to CHNO ROM 68 by the channel counter (counter 524 of FIG. 12, which counts modulo 18). The CHNO is then selected (step 274) and stored (step 276) in CHNO register 70 (FIG. 4). The CHNO is then added (step 278) to the current value of TONEX stored (step 286) in memory 88, and the sum is stored as the new TONEX. TONEX is then used to access (step 288) the cosine table ROM. Although access (steps 290 and 292) to both cosine and sine ROMs is shown in the flow diagram, from an apparatus standpoint these steps represent access to the same apparatus, viz., ROM 72. From the 150 sinusoid samples stored therein, either sine waves or cosine waves can be generated merely by choosing the correct order of readout of samples.
Since one cycle of a sinusoid is represented by up to 150 samples, the addition step is performed modulo 150. This is accomplished by performing two series additions (steps 278 and 282) and testing the sums (step 284). In the first addition (step 278), CHNO is added to the current TONEX value. The number 106 is then added (step 282) to that sum, to form a second sum. If the second sum does not exceed 255, the first sum, which has been temporarily stored (step 280), is loaded into memory (step 286) and becomes the new TONEX. If the second sum exceeds 255, the eight least significant bits of the second sum (which is of length nine bits) are loaded into memory (step 286) and become the new TONEX.
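The two-addition test works because 106 = 256 − 150: the second sum overflows eight bits exactly when the first sum has reached 150, and its low eight bits then equal the first sum minus 150. A minimal Python sketch of the address update and table readout follows; the floating-point table and the names `next_tonex` and `synthesize` are assumptions for illustration, not the stored binary samples of ROM 72.

```python
import math

ROM_SIZE = 150
# Stand-in for the cosine ROM: one cycle of a cosine in 150 samples.
COSINE_ROM = [math.cos(2 * math.pi * k / ROM_SIZE) for k in range(ROM_SIZE)]

def next_tonex(tonex, chno):
    """Add CHNO to TONEX modulo 150 using the add-106 overflow test."""
    first_sum = tonex + chno
    second_sum = first_sum + 106       # 106 = 256 - 150
    if second_sum > 255:               # ninth bit set: first sum >= 150
        return second_sum & 0xFF       # eight least significant bits
    return first_sum

def synthesize(chno, count, tonex=0):
    """Return `count` samples of the tone at chno * 55 Hz."""
    samples = []
    for _ in range(count):
        tonex = next_tonex(tonex, chno)
        samples.append(COSINE_ROM[tonex])
    return samples
```

With a channel number of 2 the address steps through the table two entries at a time, so the output repeats every 75 sample times, which at 8,250 samples per second is a 110 Hz tone.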
FIG. 8 Modulation FIG. 8 is the flow diagram for modulation. As can be seen, the sequence of operations is similar to that depicted in the flow diagram for tone synthesis (FIG. 7).
The ROM 72 (FIG. 1) addresses necessary to generate the tones for each channel are called MODX and are stored in memory 88. The channel address increments necessary to generate the required tone with the four phase changes are called modulation numbers (MODNO) and are stored in MODNO ROM 68. In order to synthesize a tone with the required phase shift, the ROM 68 address corresponding to the required MODNO is determined by combining the CHNO supplied by the channel counter (524 of FIG. 12) with the bit pair corresponding to that channel, which is supplied (step 298) by modem data processor 94. The MODNO is then selected (step 300) and stored (step 302) in MODNO register 70 (FIG. 4). The MODNO is then added modulo 150 (steps 304, 306, 308, 310 and 312) to the current value of MODX and the sum is stored (step 314) as the new MODX. MODX is then used to access (step 316) the cosine table ROM. In order to form the composite MODEM signal, the samples of each individual tone are summed (step 318) and the sums are accumulated (step 320). The digital composite is then converted (step 322) to analog form to form the analog modem composite.
The processor also has the capability for generating a MODEM preamble which can be transmitted for synchronization purposes prior to the transmission of data. In the preferred embodiment of the invention, the MODEM preamble comprises a 605 Hz Doppler tone and a synchronization tone at either 1,705 Hz or 2,915 Hz having 180° phase shifts. This preamble conforms to conventional modem practice. Since the tones required by the preamble are multiples of 55 Hz, the processing steps are similar to those shown in FIGS. 7 and 8, with the only change being the prevention of the unwanted data tones from being accumulated (FIG. 8, step 320).
FIG. 9 Demodulation
$A_r = \sum_{k=0}^{N-1} x_k\, e^{-j 2\pi r k / N}$ (10)
where $x_k$ is the kth sample of the composite and N is the number of sample points to be analyzed. Using equation (8), equation (10) can be transformed to:
$A_r = \sum_{k=0}^{N-1} x_k \left[ \cos\dfrac{2\pi r k}{N} - j \sin\dfrac{2\pi r k}{N} \right]$. (11)
Multiplying the numerators and denominators of the arguments of the trigonometric functions of equation (11) by T, the system sampling time, and substituting $x(nT)$ (samples of the continuous function, $x(t)$) for $x_k$ and $W_r$ for $(2\pi r)/NT$ yields:
$A_r = \sum_{n=0}^{N-1} x(nT) \left[ \cos W_r nT - j \sin W_r nT \right]$. (12)
By letting the first sample within a transform be represented by the Kth sample of the continuous function, equation (12) may be written:
$A_r = \sum_{n=K}^{K+74} x(nT) \left[ \cos W_r(nT) - j \sin W_r(nT) \right]$, (13)
where $x(nT)$ represents the nth sample of the composite analog signal. Separating equation (13) into real and imaginary parts yields:
$A_r(\text{real}) = \sum_{n=K}^{K+74} x(nT) \cos W_r(nT)$ (14)
$A_r(\text{imag}) = -\sum_{n=K}^{K+74} x(nT) \sin W_r(nT)$. (15)
Thus the real and imaginary parts of the coefficient at each tone frequency are obtained by multiplying samples of the composite by samples of the cosine and sine functions at that frequency and summing the products in accumulators for 75 samples.
This algorithm is illustrated in the left-hand portion of the flow diagram of FIG. 9. The received modem composite signal is sampled and the samples are converted to digital form (step 324). Each sample is multiplied by the corresponding sample of the appropriate cosine wave (step 326) and sine wave (step 338). The products are accumulated (steps 328 and 340) as per equations (14) and (15), and the real, CR, and imaginary, CI, parts of the current coefficients are stored (steps 330 and 342) in the current real accumulator (CR-ACC) and the current imaginary accumulator (CI-ACC) respectively. At the end of each modem frame the contents of the accumulators are transferred (steps 336 and 348) to the delayed real memory (DR-MEM) and delayed imaginary memory (DI-MEM) for differential phase calculation.
Calculation of differential phase makes use of the principle that the complex product of a first vector and the complex conjugate of a second vector yields a third vector whose magnitude is the product of the magnitudes of the first and second vectors and whose phase is equal to the difference in phase between the first and second vectors.
The calculation consists of computation of the vector product of the current frequency coefficient, $A_r = CR + jCI$, and the complex conjugate of the previously received coefficient, $A^*_{r-1} = DR - jDI$. The product thus obtained results in a differential phase vector, Δφ, where
$\Delta\phi = A_r A^*_{r-1} = (CR \cdot DR + CI \cdot DI) + j(CI \cdot DR - CR \cdot DI)$. (16)
The phase of this vector Δφ is equal to the difference in phase between $A_r$ and $A_{r-1}$.
The flow diagram for the differential phase calculation is shown in the right-hand portion of FIG. 9. The real part of Δφ is calculated by multiplying CR and DR (step 334) and CI and DI (step 346) and then adding (step 354). The imaginary part of Δφ is calculated by multiplying CR and DI (step 350) and CI and DR (step 352) and adding (step 360).
The remaining steps shown in FIG. 9 are used for diversity combining. Either in-band or out-band diversity may be used. In out-band diversity, two 2,400 BPS modem composites are received and combined. In in-band diversity, a 1,200 BPS transmission rate is used, with the 32 bits transmitted in each modem frame actually consisting of two identical sets of 16 bits each. The Δφ vector is computed for each channel as previously described. Then the real parts of duplicate channels are summed (step 356) and stored (step 358), and the imaginary parts of duplicate channels are summed (step 362) and stored (step 364). The most significant bits, which are the sign bits, of the real and imaginary parts of the Δφ vector are combined (step 366) to form a bit pair containing the four-phase information for that channel. This process is continued for each channel until the 16 bit pairs constituting a complete modem frame are available in the output register of the modem data processor 94.
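The differential phase product, the diversity sum and the sign-bit decision can be sketched together in Python. The function names are hypothetical, and the convention that a negative part yields a 1 bit is an assumption; the text only states that the sign bits of the real and imaginary parts form the bit pair.

```python
import math

def differential_phase(cr, ci, dr, di):
    """Real and imaginary parts of (CR + jCI) * (DR - jDI)."""
    real = cr * dr + ci * di
    imag = ci * dr - cr * di
    return real, imag

def combine_diversity(vectors):
    """Sum the real parts and the imaginary parts of duplicate channels."""
    return (sum(v[0] for v in vectors), sum(v[1] for v in vectors))

def bit_pair(real, imag):
    """Sign bits of the real and imaginary parts (assumed: 1 = negative)."""
    return (1 if real < 0 else 0, 1 if imag < 0 else 0)

# Example: the current coefficient leads the previous one by 135 degrees.
prev = (math.cos(math.radians(20)), math.sin(math.radians(20)))
cur = (math.cos(math.radians(155)), math.sin(math.radians(155)))
dphi = differential_phase(cur[0], cur[1], prev[0], prev[1])
print(bit_pair(*dphi))
```

Because only the signs are examined, the decision depends on the quadrant of the phase difference and not on the received amplitude, and summing duplicate channels before taking the signs lets the stronger copy dominate.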
FIG. 10 Voiced and Unvoiced Synthesis The method of voiced synthesis uses a time-domain version of the inverse Fourier transform which, like the DFT analyzer, produces the effect of a vocoder filter bank. Impulse synthesizer 46 forms a sinusoidal oscillation, for each of the 16 channels, at the channel's center frequency with a triangular window function imposed upon it. The effect, for each channel, is a sampled-data equivalent of the result of ringing, with an impulse, a bandpass filter having a triangular envelope characteristic. Each channel oscillation is multiplied by its corresponding amplitude parameter, which has been suitably filtered. All channels are then summed together to form the equivalent impulse response of a vocoder filter bank. New response waveforms are generated at intervals determined by the speaker's pitch frequency and added into the remaining portions of waveforms which have been generated but have not finished ringing.
Unvoiced synthesis is accomplished by generating 16 white-noise waveforms, low-pass filtering each one, and heterodyning each with a sine wave at the center frequency of a vocoder channel. The result is a spectral distribution in which a symmetrical noise distribution occurs around each channel center frequency, but the noise in each channel band is unrelated to any other band. As in the voiced case, each channel signal is modulated by a filtered amplitude parameter.
FIG. 10 is the flow diagram for voiced and unvoiced synthesis.
The initial processing steps describe the conversion of received pitch words into pitch pulses to be used for vocoder synthesis. The pitch bits received from vocoder data processor 96 are detected (step 368) to determine the presence of voiced or unvoiced sounds. If the pitch bits are all zeros, unvoiced sounds are determined to be present and pitch pulse generation is inhibited. If the pitch bits are not all zeros, voiced sounds are indicated and pitch pulses are generated in the following manner:
The six-bit pitch frequency code is converted (step 370) by a 1/x function into a number denoting the pitch period in terms of a number of sample times (i.e., n × 1/8.25 KHz). In steps 372 through 376 a linear interpolation filtering operation is performed on the number to produce a smooth pitch variation (with time). The pitch signal is then gated into a six-bit digital count-down circuit (step 380). Once during each sample time, the count-down circuit is decremented by one. When its contents equal zero, a pitch pulse is generated and transmitted (step 384) to impulse flip-flop 112 (FIG. 2). Generation of the pitch pulse also enables gating of the next pitch word into the count-down circuit (step 380). Steps 386, 388, and 390 illustrate the generation of the addresses necessary to read the channel impulse responses out of impulse ROM 116 (FIG. 2). Impulse response samples are read out (step 408)
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US3360610 *||May 7, 1964||Dec 26, 1967||Bell Telephone Labor Inc||Bandwidth compression utilizing magnitude and phase coded signals representative of the input signal|
|US3403227 *||Oct 22, 1965||Sep 24, 1968||Page Comm Engineers Inc||Adaptive digital vocoder|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US4310721 *||Jan 23, 1980||Jan 12, 1982||The United States Of America As Represented By The Secretary Of The Army||Half duplex integral vocoder modem system|
|US4468813 *||Dec 6, 1982||Aug 28, 1984||Motorola, Inc.||Digital voice storage system|
|US4495647 *||Dec 6, 1982||Jan 22, 1985||Motorola, Inc.||Digital voice storage mobile|
|US4620294 *||Sep 9, 1983||Oct 28, 1986||Cts Corporation||Digital signal processor modem|
|US4775951 *||Nov 24, 1986||Oct 4, 1988||Computer Basic Technology Research Association||Correlation function computing device|
|US4799144 *||Oct 12, 1984||Jan 17, 1989||Alcatel Usa, Corp.||Multi-function communication board for expanding the versatility of a computer|
|US5216747 *||Nov 21, 1991||Jun 1, 1993||Digital Voice Systems, Inc.||Voiced/unvoiced estimation of an acoustic signal|
|US5581656 *||Apr 6, 1993||Dec 3, 1996||Digital Voice Systems, Inc.||Methods for generating the voiced portion of speech signals|
|US5701390 *||Feb 22, 1995||Dec 23, 1997||Digital Voice Systems, Inc.||Synthesis of MBE-based coded speech using regenerated phase information|
|US5715365 *||Apr 4, 1994||Feb 3, 1998||Digital Voice Systems, Inc.||Estimation of excitation parameters|
|US5754974 *||Feb 22, 1995||May 19, 1998||Digital Voice Systems, Inc||Spectral magnitude representation for multi-band excitation speech coders|
|US5826222 *||Apr 14, 1997||Oct 20, 1998||Digital Voice Systems, Inc.||Estimation of excitation parameters|
|US5867575 *||Oct 20, 1995||Feb 2, 1999||Creative Technology Ltd.||Optimized interface between audio/modem board and sound board|
|US6014366 *||Apr 15, 1997||Jan 11, 2000||Nec Corporation||Variable-bandwidth frequency division multiplex communication system|
|US6131084 *||Mar 14, 1997||Oct 10, 2000||Digital Voice Systems, Inc.||Dual subframe quantization of spectral magnitudes|
|US6161089 *||Mar 14, 1997||Dec 12, 2000||Digital Voice Systems, Inc.||Multi-subframe quantization of spectral parameters|
|US6199037||Dec 4, 1997||Mar 6, 2001||Digital Voice Systems, Inc.||Joint quantization of speech subframe voicing metrics and fundamental frequencies|
|US6377916||Nov 29, 1999||Apr 23, 2002||Digital Voice Systems, Inc.||Multiband harmonic transform coder|
|WO1985001407A1 *||Sep 5, 1984||Mar 28, 1985||Cts Corporation||Digital signal processor modem|
|U.S. Classification||375/216, 324/76.47, 324/76.12, 324/76.55|
|International Classification||H04B1/66, H04L27/233, H04L27/20|
|Cooperative Classification||H04L27/2338, H04B1/66, H04L27/205|
|European Classification||H04L27/20D1B2, H04L27/233J, H04B1/66|