|Publication number||US3631520 A|
|Publication date||Dec 28, 1971|
|Filing date||Aug 19, 1968|
|Priority date||Aug 19, 1968|
|Also published as||DE1941336A1, DE1941336B2, DE1941336C3|
|Inventors||Bishnu S Atal|
|Original Assignee||Bell Telephone Labor Inc|
 Inventor: Bishnu S. Atal, Murray Hill, N.J.  Appl. No. 753,408  Filed Aug. 19, 1968  Patented Dec. 28, 1971  Assignee: Bell Telephone Laboratories, Incorporated, Murray Hill, N.J.
 PREDICTIVE CODING OF SPEECH SIGNALS 11 Claims, 7 Drawing Figs.
 U.S. Cl. 179/1 SA, 325/38 B, 325/41  Int. Cl. G10l 1/06  Field of Search 325/41, 42, 38.1; 179/15.55, 1 SA; 332/11 D
 References Cited UNITED STATES PATENTS
 2,732,424  1/1956  Oliver   179/15.55
 3,236,947  2/1966  Clapper  179/15.55
 3,305,854  2/1967  Witt     325/41
 3,026,375  3/1962  Graham   325/38.1
Primary Examiner: Kathleen H. Claffy  Assistant Examiner: Jon Bradford Leaheeny  Attorneys: R. J. Guenther and William L. Keefauver
ABSTRACT: Predictive coding of signals, i.e., the reduction of redundancy in a signal by subtracting from it that part which can be predicted from its past, is a well-known technique for reducing the channel capacity required to transmit a signal with specified fidelity. It has been widely applied to signals, such as television signals, which have regularly repeating intervals of information, but has not been satisfactorily applied to signals, such as speech, which exhibit characteristics that vary from speaker to speaker and from time to time for one speaker. According to this invention, an adaptive predictor is employed which is readjusted periodically to match the time-varying characteristics of a speech signal.
[Drawing sheets, FIGS. 1-7 (block diagrams). Labels recoverable from the scan: speech in, LPF (3 kHz), sampler (6 kHz), delay (60 samples), difference network, adaptive predictor (parameters b, K, α), predictor parameter computer 40, multiplexer, out.]
PREDICTIVE CODING OF SPEECH SIGNALS BACKGROUND OF THE INVENTION This invention relates to the efficient encoding of communication signals and to the reduction of the channel capacity required to transmit them. More particularly, it relates to the predictive coding of speech signals. It has for its object a reduction of redundancy in speech signals so that the signals may more economically be transmitted to a receiver station.
1. Field of the Invention The aim of efficient coding methods is to reduce the channel capacity required to transmit a signal with specified fidelity. To achieve this objective, it is often essential to reduce the redundancy of the transmitted signal. One well-known procedure for reducing the signal redundancy is known as predictive coding. In predictive coding, redundancy is reduced by subtracting from the signal that part which can be predicted from its past. For many signals, the first order entropy of the difference signal is much smaller than the first order entropy of the original signal; thus, the difference signal is better suited to encoding for transmission than the original signal. Predictive coding thus offers a practical way of coding signals efficiently without requiring large capacity storage facilities.
2. Discussion of the Prior Art One of the principal methods for efficiently encoding communication signals for transmission involves removing inherent signal redundancy through the use of prediction apparatus at both the transmitter and the receiver of a system. The current value of the signal is estimated at both locations by linear prediction based on the previously transmitted signals. The difference between this estimate and the true value of the signal is quantized, coded and transmitted to the receiver. At the receiver, the decoded difference signal is added to the predicted signal to reproduce the input speech signal. So long as a good prediction of the present signal value can be made, efficient coding may take place. However, speech is nonstationary so that a predictor with fixed coefficients fails to predict the value of a speech signal efficiently. For example, the speech signal is approximately periodic during voiced portions; thus, a good prediction of the present value can be based on the value of the signal exactly one period earlier. However, the period varies with time so that the predictor must change with the changing period of the input speech signal. Thus, since speech signal statistics are not constant, it is necessary that the prediction parameters be varied in accordance with the nature of the incoming signal to adapt the predictor to the needs of the signal.
SUMMARY OF THE INVENTION In accordance with the present invention, redundancy in speech signals is reduced by predicting the present value of the signal from its past and by subtracting the predicted value from the present value. To accommodate the constantly changing character of speech, a form of predictive coding is employed in which the predictor adapts to changing signal conditions. For speech signals, past signal intervals are selected for prediction that are comparable to individual pitch periods. The extent of the period and the magnitude of the signal within the period are, in accordance with the invention, periodically redefined. Preferably, the parameter signals controlling the predictor are changed every 5 milliseconds. Such an interval has been found to be satisfactory for accommodating the time-varying characteristics of the input speech signal. The predictor parameter values are selected to minimize the power in the difference signal averaged over 5-msec. intervals. As such, the predictor parameter values constitute slowly varying signals which can be transmitted efficiently.
To prevent errors that might be introduced in the system as a result of the prediction and subsequent encoding processes from being accumulated, it is in accordance with the invention to reconstruct speech samples at the transmitter and to perform the prediction operation on reconstructed speech signals and not on input speech signals.
The difference between the present value signal and the predicted value of the signal, if any, is eventually encoded and transmitted to a receiver station together with the slowly varying parameter signals which characterize the prediction.
Unlike previously proposed speech coding methods, the predictive coding system of this invention accurately reproduces a speech waveform rather than its spectrum. Listening tests show that there is only slight, often imperceptible, degradation in the quality of speech reproduced after transmission. In addition, experiments indicate that binary difference signals and predictor parameter signals prepared in accordance with the invention together can be transmitted at rates of less than 10 kilobits per second, or several times less than the rate required for ordinary pcm encoding with comparable speech quality.
Since the difference signal developed in accordance with the invention is the result of continuous efficient prediction, it has low or zero intelligibility. It may be used, therefore, together with the signals representative of the parameters of the adaptive predictor, which are themselves unintelligible, to provide secure transmission of speech signals. Only a recipient of the signals with the appropriate decoding apparatus can properly reconstruct the signals. Moreover, as a feature of the invention, the predictor parameter signals may themselves be suitably scrambled for transmission. Only if the appropriate key to scrambling is known can they be properly recovered. Similarly, the difference signal and the parameter signal may be transmitted via independent channels as opposed to being multiplexed for transmission as a composite signal.
DESCRIPTION OF THE DRAWINGS The invention will be fully apprehended from the following detailed description of a preferred illustrative embodiment thereof taken in connection with the appended drawings in which:
FIG. 1 is a block schematic diagram of a transmitter station which illustrates the principles of the invention;
FIG. 2 is a block schematic diagram of a receiver station constructed in accordance with the principles of the invention;
FIG. 3 is a block schematic diagram of an adaptive predictor suitable for use in the practice of the invention;
FIG. 4 is a block schematic diagram which illustrates the construction of a suitable predictor parameter computer;
FIG. 5 is a block schematic diagram of a suitable arrangement for computing the values of parameter α used for adjusting an adaptive predictor used in the practice of the invention;
FIG. 6 is a block schematic diagram of a transmitter station in accordance with the invention in which the difference signal and the several parameter signals are conveyed to a receiver station by separate transmission facilities, and
FIG. 7 is a block schematic diagram which illustrates the manner in which the parameter signals may be scrambled for transmission.
DETAILED DESCRIPTION A predictive coding system for speech signals in accordance with the invention, includes: a transmitter (FIG. 1) for converting an input speech signal to a low-bit rate digital signal for transmission to a receiver; a predictor parameter computer (FIG. 4) to calculate the parameters of an adaptive predictor (FIG. 3); and a receiver (FIG. 2) to synthesize a speech signal from the received digital signal.
Transmitter A block diagram of a transmitter which illustrates the principles of the predictive coding system of the invention is shown in FIG. 1. An input speech signal supplied at an input terminal is initially filtered in conventional low-pass filter 10 and sampled in sampling unit 11. In accordance with the well-known sampling theorem, the sampling rate is twice the cutoff frequency of the filter. A suitable sampling rate for speech signals is 6 kilohertz, so that low-pass filter 10 has a cutoff frequency of 3 kHz. Speech samples from sampler 11 are delayed by an interval of 60 samples (10 msec.) by delay line 12 and delivered to one terminal of differencing network 13, for example, a subtracting network. (Since sampler 11 converts the input speech signal to a sequence of brief samples, i.e., to digital form, it is appropriate to consider the operation of the circuit on a sample-by-sample basis.)
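The timing figures used throughout the description follow from the 6 kHz sampling rate. The worked arithmetic below is only a consistency check; the symbol names are illustrative and do not appear in the patent, and the last line assumes one bit per sample from a two-level quantizer.

$$T_{\text{delay}} = \frac{60\ \text{samples}}{6000\ \text{samples/s}} = 10\ \text{msec}, \qquad N_{\text{frame}} = 5\ \text{msec} \times 6000\ \text{samples/s} = 30\ \text{samples}$$

$$R_{\text{difference}} = 1\ \text{bit/sample} \times 6000\ \text{samples/s} = 6\ \text{kilobits/s}$$

On that assumption, a total rate below 10 kilobits per second leaves somewhat under 4 kilobits per second for the parameter and gain signals.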
A predicted value Z_N of the speech sample, obtained by predicting (in network 30) the present value of the signal on the basis, for example, of the value of past samples r_{N-1}, r_{N-2}, ..., is delivered to a second terminal of network 13. The difference δ_N between S_N and Z_N, if any, issuing from the network is next supplied to an adjustable gain amplifier 14 and altered in amplitude by a factor Q. The resultant signal is thereupon quantized to one of two levels, for example, in two-level quantizer 15. To provide for the construction of a predicted value of the signal, the signal developed by quantizer 15 is altered in amplitude by a factor 1/Q in amplifier 16 and supplied to one terminal of adder network 17. The predicted value Z_N is supplied to the other terminal of adder 17. The sum of these signals, designated r_N, forms the reconstructed signal. Note that at the Nth sampling instant, predictor 30 uses only earlier samples (N-1, N-2, ...) of the reconstructed signal. The current sample r_N of the reconstructed signal is formed after the difference signal δ_N is quantized and added to the predicted value Z_N. Adaptive predictor 30, which may be of the form illustrated in FIG. 3, is periodically adapted to accommodate changing signal conditions, for example, in accordance with parameter signals developed in computer 40. Details of a suitable computer are discussed hereinafter with reference to FIG. 4. Predictor parameter computer 40 operates on signal samples supplied directly from sampler 11, and hence on signals in advance of their interaction in the differential operation, since signal S_N is delayed in unit 12 for a time sufficient to allow computer 40 to complete its operations.
With the form of computer employed in the illustrative embodiment, it has been found that all computer operations can be completed in the time required for approximately 60 samples; signals supplied directly from sampler 11 thus are designated S_{N+60}. Parameter value signals, designated b, K and α, are thus developed to denote selected characteristics of the speech signal on the basis of intervals corresponding nominally to pitch periods of the signal. Parameter signal K represents the duration of a pitch period of the applied signal, and parameter b represents the relative amplitudes of corresponding signal values in contiguous pitch periods used in the prediction operation. Parameter signals α are amplitude factors related to the formant structure of the vocal-tract transmission function and to the spectral envelope of the vocal source. Similarly, computer 40 develops a signal, designated Q, representative of the gain of amplifiers 14 and 16 (and their counterpart at the receiver).
The binary signal at the output of quantizer 15, parameter signals for adjusting the predictor, and the signal Q, representative of the gain of amplifiers 14 and 16, thus constitute components of the transmitted signal. They may be combined for transmission to a distant station in any desired manner. For example, the binary signal at the output of quantizer 15 may be supplied directly to multiplexer 18 and the parameter value signals b, K and α, and the signal Q may also be supplied to multiplexer 18 for composite transmission to a receiver station. Alternatively, to achieve a degree of signal security, the several signals may be transmitted via independent channels as shown in FIG. 6. Moreover, additional security may be achieved by scrambling the parameter signals according to a known code prior to transmission. A suitable arrangement is illustrated in FIG. 7. Scramblers suitable for cryptically encoding signals are known to those in the art. It is obvious that scrambled parameter signals and the difference signal may be transmitted to a distant station in any desired fashion, for example, by multiplexing as illustrated in FIG. 1 or by transmission over diverse paths as shown in FIG. 6.
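The transmitter loop described above (difference, gain Q, two-level quantizer, gain 1/Q, adder) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patent's circuit: the function name `encode_frame`, the `predict` callable, and the choice of quantizer output levels +1 and -1 are all hypothetical.

```python
def encode_frame(samples, predict, q_gain):
    """Closed-loop two-level predictive encoder (illustrative sketch).

    samples : delayed input speech samples S_N
    predict : hypothetical callable; maps the list of earlier
              reconstructed samples to a predicted value Z_N
    q_gain  : gain factor Q applied ahead of the quantizer
    """
    history = []   # reconstructed samples r_N, the only input to the predictor
    bits = []
    for s in samples:
        z = predict(history)                      # Z_N uses earlier r values only
        delta = s - z                             # difference signal
        bit = 1 if q_gain * delta >= 0 else -1    # assumed levels: +1 / -1
        r = z + bit / q_gain                      # rescale by 1/Q and add Z_N
        bits.append(bit)
        history.append(r)
    return bits, history


# Usage with a trivial "previous sample" predictor (an assumption for the demo).
last_value = lambda h: h[-1] if h else 0.0
bits, recon = encode_frame([0.0, 1.0, 2.0, 1.0], last_value, q_gain=2.0)
```

Because the predictor sees only reconstructed samples, quantizing errors do not accumulate between transmitter and receiver, which is the point made in the summary.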
Receiver A block diagram illustrating the various functions performed by a receiver constructed in accordance with the invention is shown in FIG. 2. Demultiplexer 21 serves to separate the various components of the composite signal received at an input terminal, namely, the quantized difference signal, signals denoting predictor parameters, and a signal representative of the gain of the amplifiers used at the transmitter. The predictor parameters are supplied to adaptive predictor 30', which may be identical in all respects to adaptive predictor 30 at the transmitter. The signal representative of the gain Q is supplied to amplifier 22 and decoded difference signals are delivered to amplifier 22. After being adjusted in gain by the factor 1/Q, the difference signal is added to a predicted value Z'_N of the present value of the signal developed at the receiver, for example, in adder network 23. The reconstructed samples r'_N are delivered to adaptive predictor 30' and also supplied by way of low-pass filter 24 to an output terminal. Low-pass filter 24, which has a cutoff frequency of one-half the sampling rate, smooths the supplied samples to produce an output speech signal r'(t). If there are no digital channel transmission errors, evidently predicted values Z'_N are identical to values Z_N predicted at the transmitter, since predictor 30' is adjusted identically to its counterpart 30 at the transmitter. Hence, reconstructed sample r'_N is virtually identical to r_N at the transmitter. It is apparent that the error between the reconstructed speech sample r_N and the input speech sample S_N is identical to the difference between the output of amplifier 16 and the input of amplifier 14. Since, on the average, the power of samples δ_N is much smaller than the power of samples S_N, the quantizing noise power in the reconstructed speech signal is a very small fraction of the power in the input speech signal. Output signal r'(t) is thus an extremely close approximation to the signal supplied as an input to the transmitter.
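The receiver's reconstruction step can be sketched the same way. The names `decode_frame` and `predict` are hypothetical, and the received binary signal is assumed to take the values +1 and -1; the sketch omits demultiplexing and the output low-pass filter.

```python
def decode_frame(bits, predict, q_gain):
    """Rebuild reconstructed samples r'_N from the received binary signal.

    bits    : received two-level difference signal, assumed +1 / -1
    predict : hypothetical callable, identical to the transmitter's predictor
    q_gain  : received gain factor Q
    """
    history = []
    for bit in bits:
        z = predict(history)              # Z'_N from earlier r' values
        history.append(z + bit / q_gain)  # scale by 1/Q, add the prediction
    return history


# Usage: with the same predictor and gain, and no channel errors, the
# output equals the transmitter's reconstructed samples.
last_value = lambda h: h[-1] if h else 0.0
recon = decode_frame([1, 1, 1, -1], last_value, q_gain=2.0)
```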
Adaptive Predictor Two of the main causes of redundancy in speech are (1) quasi-periodicity during voiced segments and (2) lack of flatness of the short-time spectral envelopes. In accordance with the invention, redundancy due to the quasi-periodic nature of speech is reduced by a linear predictor consisting, for example, of a delay and a gain. The z-transform of the predictor is given by

$$P_1(z) = b\,z^{-K} \tag{1}$$

where z^{-K} represents a delay of K samples and b is an amplitude factor. For voiced speech, delay K corresponds nominally to a pitch period. The factor b compensates for possible unequal amplitudes of the speech signal during contiguous pitch periods. During the onset of voicing, b is frequently greater than unity; the reverse is the case at the end of a voiced segment. For unvoiced speech sounds, b is ordinarily close to zero.
Redundancy caused by the spectral envelope of speech is reduced, in accordance with the invention, by means of an eighth-order linear predictor. The z-transform of such a predictor is given by

$$P_2(z) = \sum_{m=1}^{8} \alpha_m z^{-m} \tag{2}$$

An eighth-order linear predictor has been found to substantially reduce redundancies due to three formants of the vocal-tract transmission function and the spectral envelope of the vocal source.
An adaptive predictor, which is suitable for speech signals, and is in accordance with these considerations, is illustrated in FIG. 3. It consists essentially of two separate linear predictor systems, which exhibit transfer characteristics in accordance with equations (1) and (2), and means for combining them.
Reconstructed signal samples r_N (delivered from adder network 17 of the transmitter and, correspondingly, from adder network 23 of the receiver) are delivered to storage unit 31. This unit is equipped to store a variable digital signal y_n for values of n = -120, -119, ..., -1, 0, +1, ..., +29. Thus, it has a storage capability of 150 digits. Of these, the last 30 digits are replaced every 5 msec. Every 5 msec., storage unit 31 is actuated, for example, by a pulse from clock 37, such that the signal in storage location y_{29} replaces the signal stored at location y_{-1}, the signal at y_{28} replaces the signal at y_{-2}, and so on. Thus, every 5 msec. a new group of samples is advanced into locations y_{-120}, ..., y_{-1} to constitute a stored sequence of past samples. The locations y_0, ..., y_{29} are vacated and made available to incoming reconstructed value signals r_N for the next 30 sample intervals.
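The frame-by-frame shift of storage unit 31 amounts to moving every stored value 30 locations toward the past. A minimal sketch, assuming the store is a Python list whose index 0 corresponds to n = -120 and whose last index corresponds to n = +29 (so 150 entries); the name `advance_frame` is hypothetical.

```python
def advance_frame(store, frame_len=30):
    """Shift the store by one frame: y[j] takes the old y[j + frame_len].

    The last `frame_len` locations (n = 0 .. +29) are vacated (None)
    so that incoming reconstructed samples can fill them.
    """
    return store[frame_len:] + [None] * frame_len


# Usage: label each location with its own index n to make the shift visible.
store = list(range(-120, 30))   # location n holds the value n
shifted = advance_frame(store)
# shifted[119] is location n = -1, which now holds what was at n = +29.
```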
During each 5 msec. interval, the values y_{n-K} held in storage unit 31 are delivered sequentially to arithmetic unit 32, which is equipped to compute a value c_1 in accordance with equation (1a) for each value of y_{n-K}. Equation (1a), viz,

$$c_1 = b\,y_{n-K} \tag{1a}$$

defines an input-output characteristic which corresponds to the form of equation (1) and specifies an output signal c_1 for each supplied value of y_{n-K}. The necessary factors, b and K, are supplied to arithmetic unit 32 from predictor parameters computer 40 (FIG. 1). The resulting signals are delivered both to arithmetic unit 33 and to arithmetic unit 34.
Arithmetic unit 33 is programmed to develop values of u_n in accordance with the relation

$$u_n = r_n - c_1 \tag{3}$$

The momentary value of r_n is supplied to arithmetic unit 33 from the input to adaptive predictor 30. Evidently, arithmetic unit 33 comprises a simple subtractor network.
Similarly, storage unit 35 provides digital storage facility for a variable u_n for values of n = -8, ..., -1, 0, ..., +29. Unit 35 may be a shift register or the like. It is reset every 5 msec. by a pulse, for example, from clock 37, to shift the signals stored in locations 22 through +29 into the first eight locations and to free the locations 0 through +29 for incoming signals. The vacant storage locations are filled progressively with values of signal u_n developed by arithmetic unit 33. During each 5 msec. interval, values of u_n stored in unit 35 are delivered to arithmetic unit 36 which is arranged to compute values of c_2 in accordance with equation (2a), as follows:

$$c_2 = \sum_{m=1}^{8} \alpha_m u_{n-m} \tag{2a}$$

Equation (2a) corresponds to the generalized relation of equation (2). In essence, arithmetic unit 36 is a cumulative multiplier network which sums the product of α_m and u_{n-m} for values of m=1 through m=8 for each value of u_n supplied from storage. The necessary amplitude factors α_m are supplied to unit 36 from predictor parameters computer 40 (FIG. 1). Computed values of c_2 are delivered to arithmetic unit 34 wherein they are arithmetically added to values of c_1 supplied by arithmetic unit 32 in accordance with equation (4) as follows:

$$Z_N = c_1 + c_2 \tag{4}$$

Evidently, arithmetic unit 34 may comprise an adder network.
The resulting values of Z_N constitute the predicted value of the incoming speech signal sample S_N and are delivered, as an output signal, to subtractor network 13 of the transmitter (FIG. 1) and, correspondingly, to adder network 23 of the receiver (FIG. 2). The above-described arithmetic operations are carried on sequentially for each value of n from 0 through 29. In the above discussion, the integer N indicates the count of the current sample of the input signal, i.e., from sampler 11 (FIG. 1), minus 60 samples to take account of the 10 msec. delay. The integer n indicates a corresponding count within each unit. Variables r and u are consecutively stored in storage units 31 and 35, respectively, in locations 0 through 29. Every 5 msec., both storage units are reset, as described above, and consecutive samples of r are again stored in locations 0 through 29 in storage unit 31, and consecutive samples of u are again stored in locations 0 through 29 in storage unit 35.
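The two-stage predictor of FIG. 3 can be sketched as a small class that keeps the two stores and forms the prediction as the sum of a pitch term and a short-term term. This is an illustrative reading of the description, with the short-term amplitude factors written `alphas`; the class and method names are hypothetical, and simple Python lists stand in for storage units 31 and 35.

```python
class AdaptivePredictor:
    """Sketch of a pitch predictor plus short-term predictor (assumed layout)."""

    def __init__(self, b, k, alphas, max_lag=120):
        if not 1 <= k <= max_lag:
            raise ValueError("pitch lag K must lie within the stored history")
        self.b, self.k, self.alphas = b, k, list(alphas)
        self.y = [0.0] * max_lag           # past reconstructed samples
        self.u = [0.0] * len(self.alphas)  # past values of u
        self.c1 = 0.0

    def predict(self):
        """Return Z = c1 + c2 using only earlier samples."""
        self.c1 = self.b * self.y[-self.k]                  # c1 = b * y[n-K]
        c2 = sum(a * self.u[-m]                             # sum of alpha_m * u[n-m]
                 for m, a in enumerate(self.alphas, start=1))
        return self.c1 + c2

    def push(self, r):
        """Store the new reconstructed sample r and the value u = r - c1."""
        self.u.append(r - self.c1)
        self.u.pop(0)
        self.y.append(r)
        self.y.pop(0)


# Usage: call predict() once per sample, then push() the reconstructed sample.
p = AdaptivePredictor(b=0.5, k=2, alphas=[1.0, 0.0], max_lag=4)
z0 = p.predict(); p.push(2.0)
z1 = p.predict(); p.push(4.0)
z2 = p.predict()
```

Calling `predict()` before `push()` mirrors the ordering in the text: the prediction for the current sample is formed before the current reconstructed sample exists.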
Predictor Parameters Computer Parameters for the adaptive predictors at the transmitter and receiver stations are calculated in special computation apparatus which may be of the form illustrated in FIG. 4. Such apparatus develops the predictor parameters necessary to adjust the predictor optimally despite the nonstationary, time-varying character of the input speech signals. Predictor parameters are recalculated every 5 msec. to ensure that prediction is efficient even when the speech characteristics are changing relatively fast.
Input speech samples S_{N+60} from sampler 11 (FIG. 1) are supplied to storage unit 41 which is equipped with sufficient storage capacity to accommodate an array w_n in a configuration identical to that described above. Incoming samples are thus stored in the array as w_{-120}, ..., w_{-1}, w_0, ..., w_{29}. The sample at location w_0 = S_M, that at location w_1 = S_{M+1}, and so on through w_{29} = S_{M+29}, where M indicates the sample number of the first sample of the current frame of samples, i.e., samples in a 5 msec. group. Storage unit 41 is reset every 5 msec., for example, by a pulse from clock 37 of FIG. 3 (connections not shown for simplicity) such that w_j = w_{j+30} for all values of j = -120, ..., -1. Accordingly, storage locations w_0, ..., w_{29} are vacated every 5 msec. and used to store the new samples incoming from sampler 11. The set of 30 newly installed samples constitutes a new frame of signals.
Signals from storage unit 41 are supplied in parallel to arithmetic unit 42 wherein computational values x_j, according to equation (5), are computed as follows:
[Equation (5) is not recoverable from the scan.]
Arithmetic unit 42 includes individual computational units, 42a, 42b, ..., 42m, which operate in parallel to compute x_j according to the equation for values of j = 15, ..., 120. A special purpose computer programmed according to the equation may be employed to evaluate these signal values or, alternatively, several individual arithmetic operations, e.g., multiplication, summation, rooting, and division, may be performed serially according to techniques well known in the art.
The computed array of values x_j is supplied in parallel to peak locating network 43 wherein the largest value of x is determined. Thus, peak locating network 43 finds the value of j such that x_j is the maximum of all values of x. Networks for picking the biggest from among a plurality of signals are well known in the art; a suitable one typically includes a progressively biased diode matrix. The index of the largest selected value of x is designated K and is supplied as one parameter necessary to adjust the adaptive predictors at the transmitter and receiver locations.
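The search for K can be illustrated as a peak pick over a correlation measure. Since equation (5) is not legible in this copy, the measure used below is an assumption: a cross-correlation between the current frame and the frame delayed by j samples, divided by the square root of the delayed frame's energy (consistent with the multiplication, summation, rooting and division mentioned above, but not confirmed by the source). The function name, the list layout, and the default lag range are likewise hypothetical.

```python
import math


def pitch_lag(w, frame_len=30, min_lag=15, max_lag=120):
    """Return the lag j that maximizes an assumed normalized correlation x_j.

    w holds past samples followed by the current frame; the current
    frame occupies the last `frame_len` entries of the list.
    """
    origin = len(w) - frame_len            # list index of w_0
    best_lag, best_x = min_lag, float("-inf")
    for j in range(min_lag, max_lag + 1):
        num = sum(w[origin + n] * w[origin + n - j] for n in range(frame_len))
        energy = sum(w[origin + n - j] ** 2 for n in range(frame_len))
        x = num / math.sqrt(energy) if energy > 0 else 0.0
        if x > best_x:                     # keep the first maximum found
            best_lag, best_x = j, x
    return best_lag


# Usage: one pulse in the current frame and one pulse 25 samples earlier.
w = [0.0] * 150
w[105] = 1.0
w[130] = 1.0
k = pitch_lag(w)
```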
Parameter K is also supplied to b computer 44 and to arithmetic unit 45. Computer 44 is also supplied with the signals w_n from storage unit 41. It computes b from these data according to equation (6) as follows:
[Equation (6) is not recoverable from the scan.]
Arithmetic unit 45 develops values of u_n according to equation (7):

$$u_n = w_n - b\,w_{n-K}, \qquad n = 0, \ldots, 29 \tag{7}$$

Values of signals in the array w_n are supplied to arithmetic unit 45 from storage network 41.
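Given K, the gain b and the values u_n can be sketched as follows. The formula for b is an assumption: it is the least-squares value that minimizes the power of w_n - b w_{n-K} over the frame, which matches the stated aim of minimizing difference-signal power but is not taken from the illegible equation (6). The function name and list layout are hypothetical.

```python
def pitch_gain_and_residual(w, k, frame_len=30):
    """Return (b, u) for one frame, with u_n = w_n - b * w_{n-k}.

    w holds past samples followed by the current frame; the current
    frame occupies the last `frame_len` entries of the list.
    b is the assumed least-squares gain sum(w_n * w_{n-k}) / sum(w_{n-k}^2).
    """
    origin = len(w) - frame_len
    current = [w[origin + n] for n in range(frame_len)]
    delayed = [w[origin + n - k] for n in range(frame_len)]
    energy = sum(d * d for d in delayed)
    b = sum(c * d for c, d in zip(current, delayed)) / energy if energy > 0 else 0.0
    u = [c - b * d for c, d in zip(current, delayed)]
    return b, u


# Usage: a frame that is exactly twice the frame 30 samples earlier.
w = [0.0] * 90 + [1.0] * 30 + [2.0] * 30
b, u = pitch_gain_and_residual(w, k=30)
```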
The various computations outlined above are carried out serially in the order stated. The suboperations, e.g., the computation of values of x in arithmetic unit 42, b in computer 44, and u_n in arithmetic unit 45, are carried out in parallel circuits within those units.
Every 5 msec. the array of signal values u_n is transferred into storage unit 46 to replace the previous arrays of signals in storage. Storage unit 46 thus stores an array of signal values u_{-8}, ..., u_{-1}, u_0, ..., u_{29}. Every 5 msec. the values u_{-8}, ..., u_{-1} are replaced by the values u_{22}, ..., u_{29}. The incoming samples are placed in the vacated storage locations u_0, ..., u_{29}. Thus, the signals u_0, ..., u_{29} are consecutively stored as they are received in storage unit 46.
Periodically, under the influence of clock signals, an array of signal values u_n is read out of storage unit 46 and transferred to arithmetic unit 47A. This unit comprises 36 arithmetic units designated f_{11}, f_{12}, ..., f_{88}, which operate in parallel. Each unit serves to compute one value of f according to equation (8) as follows:

$$f_{ij} = \sum_{n=0}^{29} u_{n-i}\,u_{n-j} \tag{8}$$

where index i varies from 1 to 8 and index j varies from i to 8. Values of g are evaluated in arithmetic unit 47B according to equation (9):

$$g_i = \sum_{n=0}^{29} u_n\,u_{n-i}, \qquad i = 1, \ldots, 8 \tag{9}$$

Arithmetic unit 47B preferably comprises an array of eight individual units operating in parallel to evaluate the several values of g. The resultant array, g_1, ..., g_8, designated G, is delivered every 5 msec. to computer 48.
Computer 48 is programmed to solve the matrix equation

$$F\alpha = G \tag{10}$$

to yield values of α. Although any special purpose computer may be programmed for this evaluation, one suitable arrangement is described below with reference to FIG. 5. Suffice it to say at this point that the output of computer 48 is an array α of signal values α_1, α_2, ..., α_8 which constitutes parameter values necessary for adjusting the adaptive predictors at the transmitter and receiver stations. These signals are thus applied directly to adaptive predictor 30 at the transmitter (FIG. 1), and to multiplexer 18 at the transmitter for delivery to the receiver and adaptive predictor 30'.
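Forming F and G from the stored values of u is a pair of nested sums. The sketch below assumes the products are summed over the 30 samples of the frame, with the `order` earlier values of u kept at the front of the list; the function name and layout are hypothetical. It builds the full symmetric matrix rather than only the 36 distinct entries.

```python
def normal_equations(u, order=8, frame_len=30):
    """Build F (order x order) and G (length order) from the u store.

    u holds u_{-order}, ..., u_{-1}, u_0, ..., u_{frame_len-1}, so the
    list index of u_0 is `order`.
    """
    o = order
    F = [[sum(u[o + n - i] * u[o + n - j] for n in range(frame_len))
          for j in range(1, order + 1)]
         for i in range(1, order + 1)]
    G = [sum(u[o + n] * u[o + n - i] for n in range(frame_len))
         for i in range(1, order + 1)]
    return F, G


# Usage with a tiny second-order example: u_{-2}=1, u_{-1}=2, u_0..u_2 = 3, 4, 5.
F, G = normal_equations([1, 2, 3, 4, 5], order=2, frame_len=3)
```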
Array α is also delivered to Q computer 49. Computer 49 constitutes an arithmetic unit arranged to evaluate values of Q according to the relation
[Equation (11) is not recoverable from the scan.]
Arithmetic units for obtaining products, summations, differentials, absolute values and so on, are well known to those skilled in the art. Values of Q thus evaluated are used both at the transmitter and at the receiver to set the gains of the several adjustable gain amplifiers used in the predictive networks. At the transmitter, values of signal Q are used to set the gains of amplifiers 14 and 16; at the receiver, to set the gain of amplifier 22.
Although the several individual processing steps required to evaluate the various intermediate parameter values take place sequentially in the apparatus of FIG. 4, it is evident that essentially instantaneous processing takes place in the various computational units within each frame interval. The various sets of parameter signals are advanced, one unit to the next, for example, in accordance with pulses from a clock (such as clock 37 in FIG. 3).
The various predictor parameters and gain factor Q are recalculated every 5 msec. These calculated values are held fixed for a duration of 5 msec., the period over which the predictor parameters have been optimized. Due to the 10 msec. delay of incoming signals at the transmitter, the predictor parameters computer calculates the parameters ahead of the time they are needed at the transmitter. The adaptive predictors are reset just before the arrival of the first speech sample of each frame at the transmitter.
Parameter α Computer Operations sufficient for evaluating α in accordance with equation (10) are described, for example, at pages -146 of Computational Methods of Linear Algebra by D. K. Faddeev and V. N. Faddeeva (English translation by R. C. Williams, published by W. H. Freeman & Co., San Francisco, 1963). Although conventional operations as described in the literature may be employed, one arrangement that has been found particularly suitable is illustrated in the block schematic diagram of FIG. 5.
In FIG. 5, the array of signals F, representative of values of f developed in arithmetic unit 47A, are supplied, respectively, to arithmetic units 51. The system of arithmetic units operates on the supplied values of f to produce an array of modified functions, designated h_{ij}, for values of i = 1, ..., 8, and for values of j = i, ..., 8. Values of h are individually stored in storage unit 52. Arithmetic unit 51a, for example, develops a value of h_{11} in accordance with equation (12), as follows:

$$h_{11} = \sqrt{f_{11}} \tag{12}$$

Apparently arithmetic unit 51a comprises a square rooting device. Values of h_{12}, ..., h_{18} are evaluated in arithmetic unit 51b in accordance with the relation shown in equation (13), viz,

$$h_{1j} = \frac{f_{1j}}{h_{11}}, \qquad j = 2, \ldots, 8 \tag{13}$$

Evidently, arithmetic unit 51b comprises a plurality of individual units for developing a quotient signal. The necessary value of h_{11} is delivered to arithmetic unit 51b from storage unit 52.
In like manner, values of h are sequentially derived in arithmetic units 51, progressing from left to right in the drawing, according to the relation:
[Equations (14) and (15) are not recoverable from the scan.]
Evidently, units shown in the drawing and designated 51c, 51e, 51g, make the evaluations according to equation (14), i.e., for values of h_{22}, h_{33}, h_{44}. The remaining evaluations, according to equation (15), are made in units 51d, 51f, and so on.
It is, of course, possible that values of h are at times zero. Hence, to avoid any ambiguity in evaluating functions according to equations (14) and (15), it is in accordance with the invention to prescribe an arbitrary rule to accommodate this situation. Any similar rule may, of course, be used. According to the selected rule an increment ε is added to each input for a detected input of zero. As a result, the input always assumes a finite value and the required division operation can take place. A small signal ε, derived for example from battery 53, is supplied to adders 54 in the input circuits of the requisite arithmetic units 51. The magnitude of ε is selected in accordance with the relative signal magnitudes accommodated by units 51, to be insignificant as far as signal evaluation is concerned, but sufficient to avoid the divide-by-zero ambiguity. If desired, switch 55 may be used to open the ε circuit except when zero signal is detected.
The computed values of h are supplied to arithmetic units 56 together with values of G (from arithmetic unit 47B) and functions p_i are developed as follows:

$$p_i = \frac{1}{h_{ii}}\left(g_i - \sum_{l=1}^{i-1} h_{li}\,p_l\right), \qquad 1 \le i \le 8 \tag{16}$$

The array of values of p, viz, p_1, ..., p_8, is stored in storage apparatus 57 and supplied as required to arithmetic units 58 wherein an array of signal values α_i is developed for values of i = 1, ..., 8 according to the equation
[This equation is not recoverable from the scan.]
The necessary values of h for this evaluation are supplied from storage unit 52. The resulting array of values α is delivered to storage apparatus 59. Periodically this array is delivered to arithmetic unit 49 for the evaluation of Q and to the adaptive predictors 30 and 30' (FIG. 3).
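The sequence of square roots, quotients, and substitutions described for FIG. 5 resembles the standard square-root (Cholesky) method for a symmetric system. The sketch below is the textbook method, offered as an assumption about what the illegible equations express rather than a transcription of them. The function name, the clamp on negative rounding residue, and the default size of the small increment `eps` are all hypothetical.

```python
import math


def solve_square_root(F, G, eps=1e-9):
    """Solve F a = G for symmetric F by the square-root (Cholesky) method.

    F is factored as H^T H with H upper triangular; a zero diagonal
    value is replaced by a small increment eps so division can proceed.
    """
    n = len(G)
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = F[i][j] - sum(h[k][i] * h[k][j] for k in range(i))
            if i == j:
                h[i][i] = math.sqrt(max(s, 0.0))   # clamp is an added safeguard
                if h[i][i] == 0.0:
                    h[i][i] = eps                  # avoid divide-by-zero
            else:
                h[i][j] = s / h[i][i]
    p = [0.0] * n
    for i in range(n):                             # forward pass: H^T p = G
        p[i] = (G[i] - sum(h[l][i] * p[l] for l in range(i))) / h[i][i]
    a = [0.0] * n
    for i in reversed(range(n)):                   # backward pass: H a = p
        a[i] = (p[i] - sum(h[i][j] * a[j] for j in range(i + 1, n))) / h[i][i]
    return a


# Usage on a small symmetric positive-definite system.
a = solve_square_root([[4.0, 2.0], [2.0, 5.0]], [8.0, 9.0])
```

Exploiting symmetry this way needs only the 36 distinct values of f for an eighth-order system, which agrees with the count of arithmetic units given for unit 47A.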
It is apparent that the apparatus described herein represents merely one suitable manner of carrying out the necessary operations to adaptively predict the values of a speech signal to promote efficient coding for transmission. Numerous alternative techniques may be employed for the evaluation; in fact, many of the operations may be programmed for evaluation by a special purpose computer. Moreover, the signals prepared for transmission may be combined in any desired fashion or, in the alternative, may be transmitted separately to achieve secure transmission of the speech signals.
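As an illustration of how such operations might be programmed, the following is a minimal sketch of a predict, subtract, and quantize loop together with the matching receiver loop. It is a simplification under stated assumptions: a fixed first-order predictor with a hypothetical coefficient `coeff` replaces the periodically readjusted adaptive predictor, a uniform quantizer with step `step` replaces the coder, and the transmitter is assumed to predict from the same reconstructed samples that the receiver forms. None of these names or values is taken from the specification.

```python
def quantize(value, step):
    """Uniform quantizer (an assumption; the coder is not specified here)."""
    return step * round(value / step)


def encode(samples, coeff, step):
    """Transmitter sketch: send the quantized difference between each
    sample and its predicted value."""
    differences = []
    previous = 0.0  # previously reconstructed sample
    for sample in samples:
        prediction = coeff * previous      # fixed predictor (assumption)
        diff = quantize(sample - prediction, step)
        differences.append(diff)
        previous = prediction + diff       # same value the receiver forms
    return differences


def decode(differences, coeff):
    """Receiver sketch: add each received difference to the value
    predicted from previously reconstructed samples."""
    reconstructed = []
    previous = 0.0
    for diff in differences:
        previous = coeff * previous + diff
        reconstructed.append(previous)
    return reconstructed
```

Because both loops predict from reconstructed samples, the error in each output sample of this sketch is just the quantizing error of the corresponding difference, which a uniform quantizer bounds at half a step.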
It is of interest that the quantizing noise appearing at the output of the receiver, as described in this invention, is essentially white in nature (flat spectrum). Frequently it is desirable that quantizing noise have a nonflat spectrum. For example, noise whose spectrum is weighted down at high frequencies may be subjectively less annoying. Any desired noise spectral characteristics can be obtained by employing a suitable preemphasis network before low-pass filter 10 in the transmitter and a deemphasis network after low-pass filter 24 at the receiver. A suitable preemphasis characteristic for speech signals is one which is flat up to about 500 Hz. and rises at about 1 db. per octave between 500 and 300 Hz. It is not necessary that a preemphasis network be used prior to low-pass filtering. It may, for example, be used just after the sampler 11. Similarly, the deemphasis network may be used just prior to low-pass filter 24 in the receiver.
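The complementary action of preemphasis and deemphasis can be sketched on sampled signals as follows. This is an illustration only: a first-order digital filter pair with a hypothetical coefficient `alpha` is assumed, and it does not reproduce the particular characteristic described above. It shows only that the deemphasis stage undoes the preemphasis stage for the signal, while any flat-spectrum noise introduced between the two stages passes through the deemphasis stage alone and is therefore weighted down at high frequencies.

```python
def preemphasize(samples, alpha=0.9):
    """First-order high-frequency boost: y[n] = x[n] - alpha * x[n-1].
    The filter form and alpha are assumptions for illustration."""
    output = []
    previous_input = 0.0
    for x in samples:
        output.append(x - alpha * previous_input)
        previous_input = x
    return output


def deemphasize(samples, alpha=0.9):
    """Inverse of preemphasize: x[n] = y[n] + alpha * x[n-1]."""
    output = []
    previous_output = 0.0
    for y in samples:
        previous_output = y + alpha * previous_output
        output.append(previous_output)
    return output
```

Applying `deemphasize` to the output of `preemphasize` with the same `alpha` returns the original samples to within rounding error.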
In all events, the above-described arrangements are merely illustrative of the application of the principles of the invention. Numerous arrangements may be devised by those skilled in the art without, however, departing from the spirit and scope of the invention.
What is claimed is:
1. Speech signal processing apparatus, which comprises:
means, adjusted in accordance with parameters representative of identifying characteristics of selected pitch periods of an applied speech signal, for predicting the present value of said speech signal on the basis of signals in selected past intervals thereof;
means for coding the differences between the predicted value and the present value of said signal for transmission;
means for analyzing selected pitch periods of said speech signal to develop a plurality of parameter signals which represent vocal tract transmission and source characteristics of said speech signal within said periods; and
means for periodically adjusting said predicting means in accordance with said parameter signals.
2. Speech signal processing apparatus as defined in claim 1, wherein,
said characteristics of said speech signal represented by said parameter signals include the extent of selected past pitch periods and the magnitude of signals within said pitch periods.
3. Speech signal processing apparatus as defined in claim 1, wherein new parameter signals are developed every 5 milliseconds.
4. Speech signal processing apparatus as defined in claim 1,
wherein said means for predicting the present value of said applied speech signal comprises,
a linear predictor characterized by a z-transform given by
where b is a factor representative of signal values during consecutive selected signal intervals, K is a number representative of the duration of consecutive pitch periods of said applied signal, a_m are amplitude factors representative of the short time spectral envelope of said speech signal, and N represents a selected number of said factors a_m.
5. A communication system for conveying the information content of a speech signal over a channel of relatively small capacity which comprises, in combination:
at a transmitter station;
means for reducing the redundancy in a speech signal by subtracting from it a predicted value of the signal derived from past pitch period intervals thereof selected in response to parameter signals developed from an analysis of selected pitch period intervals,
means for analyzing selected pitch periods of said speech signal to develop a plurality of parameter signals which denote selected time varying characteristics of said speech signal within said intervals,
means for periodically adjusting said predicting means in accordance with said parameter signals, and
means for transmitting both the difference between said predicted value and the present value of a speech signal and said parameter signals to a receiver station, and
at said receiver station;
means, adjusted in response to received parameter signals, for predicting the value of said speech signal in response to previously reconstructed speech signals, and
means for adding received difference signals to said predicted value signals to develop a replica of said speech signal.
6. A communication system as defined in claim 5 in further combination with,
means at said transmitter station for encoding said difference signal and said parameter signals for transmission as a composite signal, and
means at said receiver station for decoding said received signals to recover said difference signals and said parameter signals.
7. A communication system as defined in claim 5 wherein said difference signals and said parameter signals are transmitted to said receiver station via diverse transmission facilities.
8. A communication system as defined in claim 5 wherein said parameter signals are scrambled according to a prescribed code for transmission.
9. Apparatus for predicting the present value of a speech signal from its past which comprises:
means supplied with reconstructed samples of a predictively coded speech signal and with parameter signals which denote, respectively, the values during each of a selected number of consecutive intervals of said speech signal of the duration K of a pitch period of said speech signal, the relative amplitudes b of correlated signals in a number of said selected signal intervals, and amplitude factors a_m representative of the short time spectral envelope of said speech signal during said selected intervals for developing signal samples that closely represent the present value of said speech signal; and
means for periodically adjusting the values of b, K, and a in accordance with current speech signal values.
10. Apparatus for developing parameter signals for use in the predictive coding of speech signals, which comprises, in combination:
means for developing from past samples of an applied speech signal a first signal which denotes the duration of a pitch period of said applied speech signal;
means for developing from past samples of said applied signal a second signal which specifies the relative amplitudes of correlated signals in a number of selected consecutive intervals of said applied speech signal;
means for developing from past samples of said applied signal a set of signals which represents the short time spectral envelope of said applied signal during said selected signal intervals; and
means for periodically selecting a number of consecutive intervals of said applied speech signal to represent past samples of said applied speech signal.
11. Apparatus for reconstructing a speech signal from signals representative of the difference between the present value of said speech signal and a predicted value derived from past pitch period intervals thereof, which comprises,
means, adjusted in accordance with received parameter signals representative of vocal tract transmission and source characteristics of a speech signal, for predicting the value of said speech signal in response to previously reconstructed speech signals, and
means for adding said received difference signal to said predicted value signal to develop a replica of said speech signal.
|U.S. Classification||704/219, 375/250, 375/244|
|International Classification||H03M3/02, H04B14/06, G10L19/04, H04B1/66|
|Cooperative Classification||H03M3/022, G10L19/04, H05K999/99|
|European Classification||G10L19/04, H03M3/022|