US 4340781 A
Computation of the partial correlation coefficients (PARCOR K.sub.i) of a signal, using less cascaded hardware, is implemented by first deriving a sequence of auto-correlation coefficients (v.sub.j) which are then transformed into a sequence of K.sub.i using a single section digital filter plus recirculating circuitry for data iteration.
1. A speech analysing device comprising a correlator for obtaining an auto-correlation coefficient sequence of input speech signals, a computation portion for obtaining a partial auto-correlation coefficient sequence of said input speech signals, and a data circulation portion coupled to both said correlator and said computation portion to receive as its input said auto-correlation coefficient sequence and said partial auto-correlation coefficient sequence, wherein an output of said data circulation portion is coupled to an input of said computation portion so that output signals of said data circulation portion are employed as input signals to said computation portion for obtaining said partial auto-correlation sequence.
2. The speech analysing device as defined in claim 1 wherein said data circulation portion comprises;
two independent shift registers;
a digital filter using either one of two output signals from said shift registers and said partial auto-correlation coefficient sequence from said computation portion as its input signals; and
two switching circuits for selecting two output signals from said digital filter at predetermined timings, respectively;
said two output signals from said digital filter being fed back to the corresponding input of said shift registers, respectively;
said output signals of said two switching circuits being used as the input signals to said computation portion for obtaining said partial auto-correlation coefficient sequence.
3. The speech analysing device as defined in claim 1 wherein said data circulation portion comprises;
an addition circuit;
a first switching circuit for selecting either said auto-correlation coefficient sequence from said correlator or output signals from said addition circuit;
a first shift register for storing the output signal from said first switching circuit;
a second switching circuit for selecting either the output signal of said first shift register or that of said first switching circuit at a predetermined timing;
a second shift register for storing the output of said second switching circuit;
a third shift register for storing the output signal of said partial auto-correlation coefficient computation portion receiving the output signal of said first switching circuit as its input signal; and
a multiplier for multiplying a signal corresponding to the output of said third register and the output signal of said second register;
the output signal of said multiplier and that of said first register being applied as input signals to said addition circuit.
The procedures of the present invention to obtain the PARCOR coefficients in accordance with the method proposed by J. Le Roux are shown in FIG. 1.
First, the auto-correlation coefficients v.sub.0 -v.sub.p (where p is the order of the PARCOR coefficients to be determined) are calculated, and the initial condition is set in the following manner;
$$e_j^0 = e_{-j}^0 = v_j \quad (j = 0, 1, \ldots, p) \qquad (1)$$
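The PARCOR coefficients k.sub.1, k.sub.2, . . . , k.sub.p are then obtained sequentially by solving the recurrence equation (2);
$$k_i = \frac{e_i^{\,i-1}}{e_0^{\,i-1}}, \qquad e_j^{\,i} = e_j^{\,i-1} - k_i\, e_{i-j}^{\,i-1} \qquad (2)$$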
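By way of illustration only, the procedure of equations (1) and (2) may be sketched in software as follows. The sketch assumes that the recurrence takes the usual Le Roux form, k_i = e_i^(i-1) / e_0^(i-1) and e_j^i = e_j^(i-1) - k_i e_(i-j)^(i-1), which agrees with the worked values given later in the description; the function name `parcor_le_roux` is hypothetical and floating-point arithmetic stands in for the fixed-point hardware.

```python
def parcor_le_roux(v):
    """Return [k_1, ..., k_p] from auto-correlations v = [v_0, ..., v_p]."""
    p = len(v) - 1
    # Initial condition (1): e_j^0 = e_-j^0 = v_j, kept for j = -(p-1) .. p.
    e = {j: v[abs(j)] for j in range(-(p - 1), p + 1)}
    ks = []
    for i in range(1, p + 1):
        k = e[i] / e[0]          # k_i from the stage i-1 values
        ks.append(k)
        # Stage update; entries whose partner e_(i-j) is no longer
        # available are dropped, since they are never needed again.
        e = {j: e[j] - k * e[i - j] for j in e if (i - j) in e}
    return ks


# Auto-correlation of a first-order process: only k_1 is non-zero.
coefficients = parcor_le_roux([1.0, 0.5, 0.25, 0.125])
```

Only e_i and e_0 of each stage are read to form the next coefficient, which is why the number of valid entries may shrink by one per stage without affecting the result.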
The first embodiment of the present invention discloses a device for solving the abovementioned equation (2) to determine k.sub.i by repeated use of two shift registers and a one-stage lattice type digital filter. The second embodiment of the invention discloses a device for solving the same equation to determine k.sub.i by utilizing the delay of a shift register and the delay timing of a multiplier. Both of these embodiments make it possible to realize the algorithm proposed by J. Le Roux through an extremely simple hardware construction.
FIG. 2 shows a circuit diagram of the first embodiment of the speech analyzing device of the present invention, in which auto-correlation coefficient sequence SS (v.sub.0, v.sub.1, . . . , v.sub.p) is calculated by a known Auto-Correlator 11 from input speech signals IN to be analysed, and is applied to the data circulation portion 51.
A register R.sub.0 of a digital filter 16 included in the data circulation portion 51 is cleared and switches S.sub.1 and S.sub.2 are set to the side of "1" before the operation to compute the PARCOR coefficients is started in the data circulation portion 51 and in the PARCOR coefficient computation portion 52.
The auto-correlation coefficient sequence SS (v.sub.0, v.sub.1, . . . , v.sub.p) input to the data circulation portion 51 is stored in a shift register 6 (hereinafter called "A-Reg") and in a shift register 7 (hereinafter called "B-Reg") through multipliers 3-1 and 3-2 (the result of multiplication is 0 because the content of R.sub.0 is 0), adders 4-1 and 4-2 and a 1-data delay circuit 5.
The A-Reg and B-Reg may each have a data length (p words) corresponding to the order of the PARCOR coefficients to be determined.
For the sake of simplicity, the operation of FIG. 2 will be explained in detail in the case of p=10.
The switches S.sub.3 and S.sub.4 are turned on at the timing at which v.sub.1 enters A-Reg; at that timing v.sub.0, which is delayed by one data word by the delay circuit 5, enters the input of B-Reg.
Accordingly, since the outputs x and y of the switches S.sub.3 and S.sub.4 become
x=v.sub.1 =e.sub.1.sup.0 and y=v.sub.0 =e.sub.0.sup.0, respectively,
the outputs of adders 8 and 9 become
(x+y) and (x-y), respectively,
and they are sent to the PARCOR coefficient computation portion 52.
In the PARCOR coefficient computation portion 52, logarithmic values are read from a ROM 10 using (x+y) and (x-y) as the addresses. The read results 101 and 102 are subtracted in an adder circuit 103, and the output 11 becomes as follows;
$$\log|x+y| - \log|x-y| = \log\frac{1+k_1}{1-k_1} = 2\tanh^{-1}k_1, \qquad k_1 = \frac{x}{y}$$
Thus, twice the parameter tanh.sup.-1 k.sub.1, called the "log area ratio", is obtained.
It is known that the influence of quantization is smaller on the log area ratio than on the PARCOR coefficient k when each is quantized.
The abovementioned result is multiplied by 1/2 by a shifter 111 (a 1-bit shift suffices) to obtain tanh.sup.-1 k.sub.1, which is quantized by a digitizer 12 to obtain result 13. The result 13 is produced as output at an external terminal 130. Using this result as the address, a reverse conversion table of tanh.sup.-1 k.sub.1 written in a ROM 14 is read out to return the log area ratio to the PARCOR coefficient k.sub.1, which is fed back to the data circulation portion 51 and is then stored in the register R.sub.1.
Needless to say, k.sub.1 can also be obtained directly as k.sub.1 =x/y.
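The log area ratio path described above (two logarithm look-ups, a subtraction, a halving, a quantizer and an inverse table) may be sketched as follows. This is an illustrative model only: `math.log` applied to magnitudes stands in for the ROM 10 table, `math.tanh` stands in for the ROM 14 table, and the uniform quantizer step is an assumed parameter; none of these details are specified in the description.

```python
import math


def log_area_ratio(x, y):
    """Return tanh^-1(x / y) as half the difference of two logarithms."""
    # Adders 8 and 9 form (x + y) and (x - y); taking magnitudes before
    # the logarithm is an assumption about the table contents.
    difference = math.log(abs(x + y)) - math.log(abs(x - y))
    return 0.5 * difference      # the 1/2 corresponds to the 1-bit shift


def quantize(value, step):
    """Uniform quantizer standing in for the digitizer (step is assumed)."""
    return step * round(value / step)


def parcor_from_log_area_ratio(ratio):
    """Inverse conversion standing in for the reverse conversion table."""
    return math.tanh(ratio)


ratio = log_area_ratio(0.5, 1.0)                     # x = v_1, y = v_0
k1_fed_back = parcor_from_log_area_ratio(quantize(ratio, 1.0 / 64))
```

The value fed back to the circulation path is the coefficient recovered from the quantized ratio, not the unquantized x/y.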
The switches S.sub.3 and S.sub.4 are turned off at the timing at which v.sub.2 enters A-Reg. At the timing at which v.sub.0, v.sub.1, . . . , v.sub.10 have been stored in A-Reg and B-Reg, the switches S.sub.1 and S.sub.2 are connected to the "2" side, the switch S.sub.5 is turned on and the content of the register R.sub.1 is transferred to the register R.sub.0. At this time, the contents of A-Reg and B-Reg are such as shown in FIG. 3(a). Symbol * in the drawing represents meaningless data. Due to the delay circuit 5, data each deviated by one word from the corresponding data of A-Reg are stored in B-Reg.
Next, the data are fed out one word at a time from both A-Reg and B-Reg and are multiplied by the output of the register R.sub.0 in the multipliers 3-1 and 3-2. The results of multiplication are applied to the adders 4-1 and 4-2 to carry out the following equation (3) corresponding to the aforementioned equation (2);
$$e_j^1 = e_j^0 - k_1\, e_{-(j-1)}^0, \qquad e_{-(j-1)}^1 = e_{-(j-1)}^0 - k_1\, e_j^0 \qquad (3)$$
As a result, the contents of A-Reg and B-Reg become such as shown in FIG. 3(b).
During the process, at the timing at which e.sub.2.sup.0 is produced as output from A-Reg and e.sub.1.sup.0 is produced from B-Reg, the input to the switch S.sub.3 is as follows;
$$e_2^1 = e_2^0 - k_1\, e_1^0$$
Also, the input to the switch S.sub.4 becomes the signal
$$e_0^1 = e_0^0 - k_1\, e_1^0$$
which is one timing before (e.sub.1.sup.0 -k.sub.1 e.sub.2.sup.0) due to the delay circuit 5.
At this timing, the switches S.sub.3 and S.sub.4 are turned on to attain x=e.sub.2.sup.1 and y=e.sub.0.sup.1, and the PARCOR coefficient k.sub.2 can be obtained in the same way as k.sub.1. When e.sub.10.sup.1 is stored in A-Reg and e.sub.-8.sup.1 in B-Reg, the switch S.sub.5 is turned on whereby k.sub.2 is transferred to the register R.sub.0 to prepare for the operation to obtain k.sub.3.
In the same way, at the timing when e.sub.3.sup.1 is produced from A-Reg and e.sub.-1.sup.1 from B-Reg, the input of the switch S.sub.3 becomes e.sub.3.sup.2 and that of the switch S.sub.4 becomes e.sub.0.sup.2, which is one timing earlier than e.sub.-1.sup.2. At this timing the switches S.sub.3 and S.sub.4 are turned on to attain x=e.sub.3.sup.2 and y=e.sub.0.sup.2, and the PARCOR coefficient k.sub.3 can now be obtained.
The operation is continued, delaying the turn-on timing of the switches S.sub.3 and S.sub.4 by one data word on each circulation, until k.sub.10 (or k.sub.p, generally) is computed.
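The circulation through the two registers and the one-stage filter may be modelled in software as sketched below. This is a simplified behavioural model, not a description of the circuit: each register is represented by a list of p+1 words, NaN stands for the meaningless data marked * in FIG. 3, k is computed directly as x/y rather than through the log area ratio, and the function name `parcor_two_registers` is hypothetical.

```python
def parcor_two_registers(v):
    """Return [k_1, ..., k_p] by recirculating two register contents."""
    p = len(v) - 1
    nan = float("nan")                 # meaningless data (* in FIG. 3)
    # Loading pass: R0 is cleared, so the filter passes v_n unchanged.
    a_reg = list(v)                    # A-Reg word n holds v_n
    b_reg = [nan] + list(v[:-1])       # one-data delay: word n holds v_(n-1)
    ks = [a_reg[1] / b_reg[1]]         # S3, S4 sample x = v_1, y = v_0
    for i in range(1, p):
        k = ks[-1]                     # R0 now holds k_i
        new_a, new_b, delayed = [], [], nan
        for a, b in zip(a_reg, b_reg):     # words leave both registers together
            new_a.append(a - k * b)        # adder 4-1 output returns to A-Reg
            new_b.append(delayed)          # B-Reg gets the previous adder 4-2 output
            delayed = b - k * a            # adder 4-2 output enters the delay
        a_reg, b_reg = new_a, new_b
        # The sampling point moves one word later on every circulation.
        ks.append(a_reg[i + 1] / b_reg[i + 1])
    return ks


coefficients = parcor_two_registers([1.0, 0.5, 0.5])
```

In this model the meaningless words spread by one position per circulation, but the two words sampled for each coefficient are always valid, in line with the remark in the text that the required values remain correct.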
FIG. 4 illustrates signal changes of the output portions of A-Reg and B-Reg when the PARCOR coefficients k.sub.1, k.sub.2, . . . , k.sub.10 are sequentially obtained.
The abscissa represents the number of circulation times (i) of the circulation processing in which the data pass through the digital filter 16 of FIG. 2, the operation of the equation (2) is effected and its result is stored in the registers 6 and 7. At the same time, the timings, at which the digital filter 16 is repeatedly used and the coefficients k.sub.1, k.sub.2, . . . , k.sub.10 are obtained, are illustrated by an exploded view. The ordinate represents the number of transfer clocks when the data are transferred in A-Reg and B-Reg during each circulation processing.
To take an example of the step where i=3 and j=3 in FIG. 4, e.sub.3.sup.2 and e.sub.0.sup.2 on the left side of the column represent the signals that are output of the adder 4-1 and delay circuit 5 and appear at the output of A- and B-Regs through them in FIG. 2, while e.sub.3.sup.3 and e.sub.-0.sup.3 on the right side of the column are calculated as the output of the adders 4-1 and 4-2 of FIG. 2 in the following manner;
$$e_3^3 = e_3^2 - k_3\, e_0^2$$
$$e_{-0}^3 = e_0^2 - k_3\, e_3^2$$
The PARCOR coefficients k.sub.i (i=1, 2, 3, . . . ) are sequentially obtained using the result of the computation of the preceding steps as represented by arrows. If i>j, the data disappear one by one due to the delay circuit 5 whenever the data are repeatedly circulated and hence do not represent correct values. However, no problem occurs because e.sub.i.sup.i-1 and e.sub.0.sup.i-1 necessary for obtaining k.sub.i are correct values.
In the abovementioned operation, since the digitizer 12 is actuated before k.sub.i of the subsequent stage is obtained, the quantization error can be incorporated in the subsequent stage and compensated for in the stage of high order. Hence, the accuracy of analysis as a whole can be improved.
In the ordinary lattice method and modified lattice method, the circuit for obtaining tanh.sup.-1 k from x and y operates in the waveform domain. Hence, the circuit requires four adders, two squarers and two accumulators. By contrast, the present invention can be constructed in an extremely simple manner using only two adders 8 and 9.
In the foregoing description, two sets each of the multipliers 3-1, 3-2 and the adders 4-1, 4-2 are required to form the digital filter 16. However, it is possible to use a single multiplier and a single adder on a time-sharing basis.
FIG. 5 shows a circuit diagram of the second embodiment of the present invention.
In FIG. 5, the switches S.sub.6 and S.sub.8 are connected to the terminal 1 and the auto-correlation coefficient sequence SS (v.sub.0, v.sub.1, . . . , v.sub.p) is computed by the auto-correlator 11 from input speech signals IN to be analyzed in the same way as in FIG. 2.
The auto-correlation coefficient sequence SS is assumed to be produced in the sequence of the equation (4) or (5) by referring to equation (1);
v.sub.0, v.sub.1, . . . , v.sub.p-1, v.sub.1, v.sub.2, . . . , v.sub.p (4)
v.sub.1, v.sub.2, . . . , v.sub.p, v.sub.0, v.sub.1, . . . , v.sub.p-1 (5)
For the sake of simplicity, the case of the equation (5) will be discussed here. The case of the equation (4) can also be processed in the same way by changing the timings for the switches as will be described next.
From the equation (1), the equation (5) can be regarded as the following data sequence of 2p;
e.sub.1.sup.0, e.sub.2.sup.0, . . . e.sub.p.sup.0, e.sub.-0.sup.0, e.sub.-1.sup.0, e.sub.-2.sup.0, . . . e.sub.-(p-1).sup.0 (6)
The auto-correlation coefficient sequence SS expressed by the equation (6) is branched three ways and sent to the switch S.sub.7 in the PARCOR coefficient computation portion 52, to the switch S.sub.8 in the data circulation portion 51 and to a shift register 26 (consisting of 2p words). The switch S.sub.7 at the input portion of the PARCOR coefficient computation portion 52 is turned on at the timings at which e.sub.1.sup.0 and e.sub.-0.sup.0 appear. The logarithmic contents written in a ROM 10 are read out twice using e.sub.1.sup.0 and e.sub.-0.sup.0 as the addresses and the results are sequentially stored in registers 21 and 22. The difference between the read results is computed by an adder 23, and a ROM 14 storing the inverse logarithm is read using the result to obtain the PARCOR coefficient k.sub.1.
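That is to say,
$$k_1 = \log^{-1}\!\left(\log e_1^0 - \log e_{-0}^0\right) = \frac{e_1^0}{e_{-0}^0}$$
Generally, the switch S.sub.7 is turned on at the timings when e.sub.i.sup.i-1 and e.sub.0.sup.i-1 appear, and the PARCOR coefficient k.sub.i is obtained as
$$k_i = \log^{-1}\!\left(\log e_i^{\,i-1} - \log e_0^{\,i-1}\right) = \frac{e_i^{\,i-1}}{e_0^{\,i-1}}$$
This can be taken out from the output terminal 130.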
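The division by logarithm look-up, subtraction and inverse-logarithm look-up may be sketched as follows. The sketch is illustrative only: `math.log` and `math.exp` stand in for the contents of ROM 10 and ROM 14, and the separate handling of the sign is an assumption, since the description does not state how negative correlation values are addressed. The function name `divide_via_logs` is hypothetical.

```python
import math


def divide_via_logs(numerator, denominator):
    """Return numerator / denominator using log, subtract, antilog."""
    log_n = math.log(abs(numerator))       # first logarithm read
    log_d = math.log(abs(denominator))     # second logarithm read
    magnitude = math.exp(log_n - log_d)    # subtraction, then inverse logarithm
    # Assumed sign handling: restore the sign outside the tables.
    negative = (numerator < 0) != (denominator < 0)
    return -magnitude if negative else magnitude


k1 = divide_via_logs(0.5, 1.0)             # e_1^0 = 0.5, e_-0^0 = 1.0
```

A numerator of exactly zero has no logarithm and would need a special case in any practical table; the sketch does not handle it.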
In the PARCOR coefficient computation portion 52, on the other hand, the ROM 10 is read out twice and the difference is computed by the adder 23. Further, the ROM 14 is read once, thus yielding a 4-data delay (q=4).
The PARCOR coefficient 15 obtained in the PARCOR coefficient computation portion 52 is sent to the data circulation processing portion 51 and is first stored in the register R.sub.1. On the other hand, the data sequence of the equation (6) is sequentially stored in the shift register 26 from the side of the terminal 1 of the switch S.sub.6. When e.sub.1.sup.0, e.sub.2.sup.0, . . . , e.sub.p.sup.0 have been stored, the switch S.sub.8 is connected to the side of the terminal 2 and the subsequent data sequence e.sub.-0.sup.0, e.sub.-1.sup.0, . . . , e.sub.-(p-1).sup.0 is also stored in the register 28.
The switch S.sub.9 is turned on at a timing one data later than the timing of the appearance of e.sub.-0.sup.0 (generally, e.sub.-0.sup.i) and k.sub.1 stored in the register R.sub.1 is transferred to the register R.sub.0. Generally, whenever the processing to be described later makes one circulation, the timing may be delayed by one further data. This is because the first result of the data applied to the multiplier 29 is not used.
When k.sub.1 is obtained at the output of the register R.sub.0, the output of the register 28 is e.sub.-1.sup.0 which is next to e.sub.-0.sup.0. Accordingly, the output of the multiplier 29 is k.sub.1 e.sub.-1.sup.0. The delay by the multiplier 29 can be made r=l/2-1 where l is the data length of the shorter data of the two to be multiplied.
Accordingly, in order to adjust the timing so that e.sub.2.sup.0 is obtained at the output of the register 26 when k.sub.1 e.sub.-1.sup.0 is obtained at the output of the multiplier 29, the following relation may be satisfied;
where q is the delay of the register 28. The output of the adder 30 at this time is
$$e_2^1 = e_2^0 - k_1\, e_{-1}^0$$
and the result of the equation (2) can thus be obtained.
In the PARCOR analysis, the correlation data is usually 12 to 16-bit while the PARCOR coefficient is 3 to 12-bit. Hence, it is possible to obtain r=5 if l=12.
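The arithmetic behind these delay figures may be checked with the short sketch below. It assumes an even data length l and simply evaluates r = l/2 - 1 together with the difference between p - 1 and q + r, the two quantities that the description compares when discussing timing adjustment; the function names are hypothetical.

```python
def multiplier_delay(data_length):
    """Return r = l/2 - 1 for an even data length l (in data timings)."""
    if data_length % 2:
        raise ValueError("this sketch assumes an even data length l")
    return data_length // 2 - 1


def timing_margin(p, q, r):
    """Return (p - 1) - (q + r), the difference of the compared quantities."""
    return (p - 1) - (q + r)


r = multiplier_delay(12)            # shorter operand of 12 bits
margin = timing_margin(10, 4, r)    # p = 10 and q = 4 as in the example
```

With these example values the margin is zero, so neither portion has to be stopped to adjust the timing.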
At the timing when e.sub.2.sup.1 is obtained at the output of the adder 30, the switch S.sub.6 is connected to the terminal 2 and the switch S.sub.7 is turned on whereby log(e.sub.2.sup.1) is read out from the ROM 10 and stored in the register 21. Further, the switch S.sub.8 is connected to the terminal 1 and the switch S.sub.6 is kept connected to the terminal 2 until all the PARCOR coefficients are obtained. Accordingly, the output of the shift register 26 is applied to the register 28 through the delay circuit 27 for the one-data delay.
In the same way as e.sub.2.sup.1, the values e.sub.3.sup.1, e.sub.4.sup.1, . . . , e.sub.p.sup.1, e.sub.-0.sup.1, e.sub.-1.sup.1, . . . , e.sub.-(p-1).sup.1 are obtained at the output of the adder 30 in accordance with e.sub.j.sup.1 =e.sub.j.sup.0 -k.sub.1 e.sub.1-j.sup.0 of equation (2), and are sequentially stored in the shift register 26.
At the timing when e.sub.-0.sup.1 is obtained at the output of the adder 30, the switch S.sub.7 is turned on and log(e.sub.-0.sup.1) is set from the ROM 10 to the register 21. In the same way as k.sub.1, k.sub.2 is obtained q data later than the turn-on of the switch S.sub.7 and is then stored in the register R.sub.1. At this timing the switch S.sub.8 is connected to the terminal 2. At the timing, delayed by a further q data, when e.sub.-1.sup.1, which is one timing later than e.sub.-0.sup.1, appears at the output of the register 28, the switch S.sub.9 is turned on and k.sub.2 is transferred from the register R.sub.1 to R.sub.0. When the product with k.sub.2 is obtained at the output of the multiplier 29 at the timing delayed by r data, the output of the adder 30 becomes as follows, since the output of the shift register 26 is e.sub.2.sup.1.
There is thus obtained the result of the equation (2).
In the same way as e.sub.2.sup.2, the values e.sub.3.sup.2, e.sub.4.sup.2, . . . , e.sub.p.sup.2, e.sub.-0.sup.2, e.sub.-1.sup.2, . . . , e.sub.-(p-1).sup.2 are obtained at the output of the adder 30 in accordance with e.sub.j.sup.2 =e.sub.j.sup.1 -k.sub.2 e.sub.2-j.sup.1 of equation (2), and are sequentially stored in the shift register 26.
Thereafter, the operation is continued till k.sub.p is obtained by alternately changing over the switch S.sub.8 between the terminals 1 and 2 at every p timing so as to circulate the data p times.
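The data layout of this embodiment, a single sequence of 2p words updated once per circulation, may be modelled as sketched below. The sketch keeps only the ordering of the equation (6) and ignores the switch timings and the delays q and r; it assumes the recurrence e_j^i = e_j^(i-1) - k_i e_(i-j)^(i-1) with k_i = e_i^(i-1) / e_0^(i-1), uses NaN for values that are never needed again, and uses the hypothetical name `parcor_2p_sequence`.

```python
def parcor_2p_sequence(v):
    """Return [k_1, ..., k_p] using one list of 2p words per circulation."""
    p = len(v) - 1
    nan = float("nan")

    def position(j):
        # e_1 .. e_p occupy words 0 .. p-1; e_-0 .. e_-(p-1) occupy p .. 2p-1.
        return j - 1 if j >= 1 else p - j

    # Sequence (6) at stage 0: v_1 .. v_p followed by v_0 .. v_(p-1).
    seq = [v[t + 1] for t in range(p)] + [v[t] for t in range(p)]
    ks = []
    for i in range(1, p + 1):
        k = seq[position(i)] / seq[position(0)]
        ks.append(k)
        new_seq = []
        for t in range(2 * p):
            j = t + 1 if t < p else -(t - p)     # subscript of word t
            partner = i - j                       # subscript paired with e_j
            if -(p - 1) <= partner <= p:
                # One multiplication and one addition per word.
                new_seq.append(seq[t] - k * seq[position(partner)])
            else:
                new_seq.append(nan)               # partner is not stored
        seq = new_seq
    return ks


coefficients = parcor_2p_sequence([1.0, 0.5, 0.5])
```

Each circulation costs one multiplication and one addition per word, which reflects the single multiplier and single adder of this embodiment.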
In the case of this embodiment (p=10), the delay of the PARCOR coefficient computation portion 52 is 4. In order to apply k.sub.1 to the multiplier 29 at the practically necessary timing, it is convenient to make the register R.sub.0 the same as R.sub.1. This is because, under the condition p=10, k.sub.i would arrive at the multiplier 29 one clock later than the necessary timing if it had to pass through the two registers R.sub.0 and R.sub.1, taking one timing in each. If p>10, it is preferred to use the separate registers R.sub.0 and R.sub.1 so that neither k.sub.i obtained at the PARCOR coefficient computation portion 52 nor k.sub.i-1 which is being used at the multiplier 29 is erased.
If the PARCOR coefficient computation portion in this embodiment performs the operation in which k is first converted to tanh.sup.-1 k and tanh.sup.-1 k is quantized and is again returned to k in the same way as in the first embodiment, the delay q in the PARCOR coefficient computation portion becomes great. When q+r>p-1, the processing at the data circulation processing portion 51 may be stopped by the following timing.
in order to adjust the timing.
Generally, the total stop time is negligibly small in comparison with the time length of the speech to be analyzed. Hence, the abovementioned operation may be carried out without any practical problem.
On the contrary, when the multiplier 29 is reduced in size and r becomes smaller as expressed by the following relation;
the operation by the PARCOR coefficient computation portion 52 may be stopped by the following clock;
The foregoing explains the case where the auto-correlation coefficient sequence is given by the equation (5). When it is given by the equation (4), the turn-on timing of the switch S.sub.7 is so changed as to obtain predetermined data and the polarity of the input to the adder 23 is reversed.
FIG. 6 illustrates the flow of signals at the portions (a, b, c, d, e, f, g, h, k, k') of FIG. 5 at every timing (T).
This is the case where p=10, q=4 and r=5. The data are circulated over every T=0 to 19 and the switch S.sub.8 is alternately connected to the terminals 1 and 2 at every p=10 timings.
Values in parentheses represent the operation that is not necessary for the subsequent computation. By utilizing this characteristic, k.sub.i can be obtained and the processing can be made even if the first data (represented by *) appearing at k' as the input to the multiplier 29 is not in time for the timing of the operation of the equation (2).
As represented in the column h, the turn-on timings of the switch S.sub.7 are T=0 and T=10, for e.sub.1.sup.0 and e.sub.-0.sup.0 respectively, to obtain k.sub.1, giving a gap of 10 timings. For e.sub.2.sup.1 and e.sub.-0.sup.1 to obtain k.sub.2, however, they are T=1 and T=10 and the gap becomes 9 timings. Similarly, the gap between e.sub.i+1.sup.i and e.sub.0.sup.i to obtain k.sub.i+1 becomes smaller by one timing whenever the data make one circulation.
As explained in the foregoing paragraph, the present invention makes it possible to realize the algorithm proposed by J. Le Roux through an extremely simplified hardware construction.
FIG. 1 is a diagram showing the procedures for obtaining the PARCOR coefficients using the present invention in accordance with the algorithm of J. Le Roux;
FIG. 2 is a circuit diagram of an embodiment of the speech analysing device of the present invention for carrying out the procedures of FIG. 1;
FIG. 3 is a diagram showing an example of data array stored in the A and B registers of FIG. 2;
FIG. 4 is a diagram showing the change in signals appearing at the outputs of the A and B registers of FIG. 2 at every clock timing;
FIG. 5 is a diagram showing the circuit construction of the second embodiment of speech analysing device of the present invention; and
FIG. 6 is a diagram showing the flow of signals appearing at the principal portions of FIG. 5 at every clock timing.
(1) Field of the Invention
This invention relates to a speech analysing device, more particularly to an improvement in an analysing device using a "PARTIAL AUTO-CORRELATION COEFFICIENT." (Hereinafter, this coefficient will be called "PARCOR coefficient" for short and an analysing system using the coefficient, "PARCOR system.")
(2) Description of the Prior Art
About a decade has passed since Itakura and Saitoh devised the PARCOR system speech analysis (Itakura et al., REPORTS OF THE MEETING BY THE ACOUSTICAL SOCIETY OF JAPAN, 1976, October, p. 555). Since the content of this system is well known to those skilled in the art, the explanation of the system is omitted here.
As devices for determining the PARCOR coefficient k in this PARCOR system, there have so far been proposed a device which incorporates a mini-computer in the device to determine the coefficient k in accordance with the algorithm given by Itakura and Saitoh, a device which determines the coefficient by a lattice method using a lattice type filter and a correlator disclosed in the abovementioned report, and a device by a modified lattice method proposed by Kobayashi and Yamamoto (Yamamoto et al.; "OPERATION ACCURACY OF MODIFIED LATTICE TYPE PARCOR ANALYSING CIRCUIT," REPORTS OF THE MEETING BY THE ACOUSTICAL SOCIETY OF JAPAN, 1977, April, p. 257), and so forth.
The abovementioned lattice method and modified lattice method are suited for the adaptation to a device because they use simple algorithms. However, since the number of operational steps is large, a hardware construction having high processing capacity is required.
On the other hand, the method proposed by J. Le Roux (J. Le Roux, "A Fixed Point Computation of Partial Correlation Coefficients," IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1977, June, p. 257-259) has a characterizing feature in that the number of steps to be processed is small and the operation accuracy is high. To this date, however, no method has been developed to realize the abovementioned method using simple hardware capable of processing at a high rate.
In view of the abovementioned problem, the present invention is directed to provide a device which realizes the algorithm proposed by J. Le Roux using a simple hardware construction.
To accomplish this object, the present invention uses a hardware construction consisting of a data circulation portion cascade-connected to a PARCOR coefficient computation portion, and is characterized in that the PARCOR coefficients are computed sequentially by applying a sequence of auto-correlation coefficients of input speech signals to the data circulation portion while feeding back the output of the PARCOR coefficient computation portion to the data circulation portion, and repeating this process.