US RE38279 E1
Abstract
Representative vectors Z_{1i} and Z_{2j} are selected from codebooks CB1 and CB2, respectively, and multiplied by weighting coefficient vectors w_{1} and w_{2} of the same number of dimensions as those of the representative vectors, whereby weighted representative vectors Z_{1i}w_{1} and Z_{2j}w_{2} are generated. These weighted representative vectors are vector combined into a combined vector y_{ij}, and a combination of the representative vectors is selected by a control part in such a manner as to minimize the distance between the combined vector y_{ij} and an input vector X. The weighting coefficient vectors w_{1} and w_{2} each have a maximum component in a different dimension and are selected so that the sum of diagonal matrixes W_{1} and W_{2} using components of the weighting coefficient vectors as their diagonal elements becomes a constant multiple of the unit matrix.
Claims(41)
1. A method of encoding an input vector through the use of M codebooks each having a plurality of labeled representative vectors of the same number of dimensions, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) selecting one representative vector from each of said M codebooks;
(b) multiplying said representative vectors, each selected from one of said M codebooks, by M predetermined weighting coefficient vectors of the same number of dimensions as those of said representative vectors to generate M weighted representative vectors, said M weighting coefficient vectors having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
(c) adding all of said M weighted representative vectors to generate a combined representative vector;
(d) calculating the distance between said input vector and said combined representative vector;
(e) repeating steps (a), (b), (c) and (d) to search for and determine said combined representative vector which has the minimum distance between it and said input vector; and
(f) encoding and outputting labels attached to said representative vectors of said codebooks which provided said combined representative vector of said minimum distance.
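Steps (a) to (f) of claim 1 can be sketched as an exhaustive search over every combination of labels. This is a minimal illustration, not the patented implementation: the function name `encode`, the toy codebooks, the weighting values, and the choice of squared Euclidean distance are all assumptions made for the example.

```python
from itertools import product

def encode(x, codebooks, weights):
    """Return the label tuple whose weighted, summed vectors lie closest to x."""
    best_labels, best_dist = None, float("inf")
    # (a) choose one representative vector (by label) from each codebook
    for labels in product(*(range(len(cb)) for cb in codebooks)):
        # (b) weight each chosen vector component-wise, (c) add the results
        y = [0.0] * len(x)
        for cb, w, label in zip(codebooks, weights, labels):
            for d, (z, wd) in enumerate(zip(cb[label], w)):
                y[d] += z * wd
        # (d) distance to the input vector (squared Euclidean, an assumption)
        dist = sum((xd - yd) ** 2 for xd, yd in zip(x, y))
        # (e) keep the combination with the minimum distance
        if dist < best_dist:
            best_labels, best_dist = labels, dist
    # (f) the labels are the encoded output
    return best_labels, best_dist

# Hypothetical two-dimensional example with M = 2 codebooks.
CB1 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
CB2 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
W1 = [1.8, 0.2]  # maximum component in dimension 0
W2 = [0.2, 1.8]  # maximum component in dimension 1
labels, dist = encode([2.0, 0.5], [CB1, CB2], [W1, W2])
```

The search cost grows with the product of the codebook sizes, which is the cost the pre-selection recited in claims 2 and 4 is meant to reduce.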
2. A method of encoding an input vector through the use of M codebooks each having a plurality of labeled representative vectors of the same number of dimensions, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) multiplying representative vectors, each selected from one of said M codebooks, by M weighting coefficient vectors of the same number of dimensions as those of said representative vectors of said M codebooks to obtain M groups of weighted representative vectors, and determining M straight lines for approximating said M groups of weighted representative vectors, respectively, said M weighting coefficient vectors each having at least one maximum component in a different dimension;
(b) projecting said input vector on said M straight lines on an M-dimensional coordinate system and pre-selecting pluralities of weighted representative vectors adjacent or close to said projections, respectively, to form M subgroups;
(c) selecting weighted representative vectors from said M subgroups and adding them to obtain a combined representative vector;
(d) calculating the distance between said combined representative vector and said input vector;
(e) repeating said steps (c) and (d) for each combination of weighted representative vectors of said M subgroups to calculate said distance; and
(f) determining labels in said M codebooks which correspond to said weighted representative vectors of the combination that was minimum in said distance, and outputting said labels as encoded results of said input vector.
3. A method of encoding an input vector through the use of M codebooks each having a plurality of labeled representative vectors of the same number of dimensions, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) pregenerating M weighted codebooks each having weighted representative vectors obtained by multiplying representative vectors of said M codebooks by M predetermined weighting coefficient vectors of the same number of dimensions as those of said representative vectors, said M weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
(b) selecting one representative vector from each of said M weighted codebooks;
(c) adding all of said M weighted representative vectors to generate a combined representative vector;
(d) calculating the distance between said input vector and said combined representative vector;
(e) repeating steps (b), (c) and (d) to search for and determine said combined representative vector which has the minimum distance between it and said input vector; and
(f) encoding and outputting labels attached to representative vectors of said codebooks which provided said combined representative vectors of said minimum distance.
4. A method of encoding an input vector through the use of M codebooks each having a plurality of labeled representative vectors of the same number of dimensions, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) multiplying representative vectors of said M codebooks by M weighting coefficient vectors of the same number of dimensions as those of said representative vectors to obtain M groups of weighted representative vectors, said M weighting coefficient vectors each having at least one maximum component in a different dimension;
(b) determining M straight lines for approximating said M groups of weighted representative vectors, respectively;
(c) projecting said input vector on said M straight lines on an M-dimensional coordinate system and pre-selecting pluralities of weighted representative vectors adjacent or close to said projections, respectively, to form M subgroups;
(d) selecting weighted representative vectors from said M subgroups, respectively, and adding them to obtain a combined representative vector;
(e) calculating the distance between said combined representative vector and said input vector;
(f) repeating said steps (d) and (e) for each combination of weighted representative vectors of said M subgroups to calculate said distance; and
(g) determining labels in said M codebooks which correspond to said weighted representative vectors of the combination which was minimum in said distance, and outputting said labels as encoded results of said input vector.
5. The method of
2, 3, or 4 wherein said M weighting coefficient vectors are selected so that the sum of M diagonal matrixes using components of said weighting coefficient vectors as their diagonal elements becomes a constant multiple of the unit matrix.
6. The method of
7. The method of
8. The method of
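The condition recited in claim 5, that the sum of the M diagonal matrixes is a constant multiple of the unit matrix, is equivalent to requiring that the weighting coefficients sum to the same constant in every dimension. The check below is a sketch under that reading; the function names and the tolerance `tol` are assumptions, and the second function tests the "maximum component in a different dimension" condition of the independent claims.

```python
def sum_is_scaled_identity(weights, tol=1e-9):
    """True if the per-dimension sums of the weighting vectors are all equal."""
    dim = len(weights[0])
    sums = [sum(w[d] for w in weights) for d in range(dim)]
    return all(abs(s - sums[0]) <= tol for s in sums)

def maxima_in_distinct_dimensions(weights):
    """True if each weighting vector peaks in a dimension no other vector uses."""
    peaks = [max(range(len(w)), key=w.__getitem__) for w in weights]
    return len(set(peaks)) == len(weights)

# Hypothetical weighting vectors for M = 2.
ok = [[1.8, 0.2], [0.2, 1.8]]    # sums to 2.0 in both dimensions
bad = [[1.8, 0.2], [0.5, 1.8]]   # sums to 2.3 and 2.0
```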
9. A method of encoding the vector of an input acoustic signal through the use of M excitation source codebooks each having a plurality of labeled excitation vectors, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) calculating spectrum envelope parameters of said vector of said input acoustic signal, quantizing said spectrum envelope parameters and setting said quantized parameters as filter coefficients of a synthesis filter;
(b) selecting M excitation vectors from said M excitation source codebooks so that distortion of an acoustic signal synthesized by said synthesis filter, from said input acoustic signal, is minimized;
(c) selecting one gain vector from each of M gain codebooks each having a plurality of labeled M-dimensional gain vectors;
(d) multiplying said gain vectors selected from said M gain codebooks by M predetermined M-dimensional weighting coefficient vectors, respectively, to generate M weighted gain vectors, said M weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
(e) adding all of said M weighted gain vectors to obtain a combined gain vector, and defining first to M-th components of said combined gain vector as first to M-th gains;
(f) providing said first to M-th gains to said M determined excitation vectors, respectively;
(g) adding said M gain-provided excitation vectors and exciting said synthesis filter by said added output to generate a synthesized acoustic signal;
(h) calculating distortion of said synthesized acoustic signal from said input acoustic signal;
(i) repeating steps (c) to (h) for every combination of respective gain vectors of said M gain codebooks to search for and determine M gain vectors that minimize said distortion, and obtaining M gain labels corresponding to said M gain vectors; and
(j) outputting, as at least one part of encoded results of said input acoustic signal, said M gain labels obtained in said step (i) and the labels of said excitation codebooks obtained in said step (b).
10. A method of encoding the vector of an input acoustic signal through the use of M excitation source codebooks each having a plurality of labeled excitation vectors, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) calculating spectrum envelope parameters of said vector of said input acoustic signal, quantizing said spectrum envelope parameters and setting said quantized parameters as filter coefficients of a synthesis filter;
(b) selecting M excitation vectors from said M excitation source codebooks, respectively, so that distortion of synthesized speech by said synthesis filter, from said input acoustic signal, is minimized;
(c) providing first to M-th gains to said M determined excitation vectors, respectively, adding them to generate an excitation signal vector for excitation of said synthesis filter to generate said synthesized acoustic signal and, for each combination of said excitation vectors, searching for optimum values of said first to M-th gains which minimize said distortion of said synthesized acoustic signal from said input acoustic signal;
(d) multiplying respective gain vectors of M gain codebooks each having a plurality of labeled M-dimensional gain vectors by M predetermined M-dimensional weighting coefficient vectors for said M gain codebooks to obtain M groups of weighted gain vectors, and determining M straight lines for approximating said M groups of weighted gain vectors, said M weighting coefficient vectors each having at least one maximum component in a different dimension;
(e) projecting a vector, composed of said optimum first to M-th gains, on said M straight lines of an M-dimensional coordinate system, and pre-selecting from said M groups pluralities of weighted gain vectors adjacent or close to said projections to form M subgroups;
(f) selecting one weighted representative vector from each of said M subgroups, adding them to obtain an M-dimensional combined gain vector, and defining first to M-th gain components of said combined gain vector as first to M-th gains;
(g) multiplying said M excitation vectors, determined in step (b), by said first to M-th gain components of said combined gain vector, respectively, and adding them to generate an excitation signal vector;
(h) applying said excitation signal vector, generated in said step (g), to said synthesis filter to synthesize an acoustic signal and calculating distortion of said acoustic signal from said input acoustic signal;
(i) repeating said steps (f), (g) and (h) for every combination of weighted representative vectors of said M subgroups to calculate said distortion, and determining gain labels in said M codebooks which correspond to said combination of weighted gain vectors which was minimum in said distortion; and
(j) outputting labels of said M excitation vectors and said gain labels as at least one part of encoded results of said vector of said input acoustic vector.
11. A method of encoding the vector of an input acoustic signal through the use of M excitation source codebooks each having a plurality of labeled excitation vectors, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) calculating spectrum envelope parameters of said vector of said input acoustic signal, quantizing said spectrum envelope parameters and setting said quantized parameters as filter coefficients of a synthesis filter;
(b) selecting said excitation vectors one by one from said M excitation source codebooks so that distortion of a synthesized acoustic signal by said synthesis filter, from said input acoustic signal, is minimized;
(c) pregenerating M weighted gain codebooks each having M groups of labeled weighted gain vectors obtained by multiplying M-dimensional gain vectors of M gain codebooks by M predetermined M-dimensional weighting coefficient vectors, respectively, said M weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
(d) selecting one weighted gain vector from each of said M weighted codebooks;
(e) adding all of said M weighted gain vectors to generate a combined gain vector, and defining first to M-th components of said combined gain vector as first to M-th gains;
(f) providing said first to M-th gains to said M excitation vectors respectively;
(g) adding said M gain-provided excitation vectors, and exciting said synthesis filter by said added output to generate a synthesized acoustic signal;
(h) calculating distortion of said synthesized acoustic signal from said input acoustic signal;
(i) repeating said steps (d) to (h) for every combination of said weighted gain vectors of said M weighted codebooks to search for and determine M weighted gain vectors which minimize said distortion, and obtaining M labels corresponding to said M weighted gain vectors; and
(j) outputting, as at least one part of encoded results of said input acoustic signal, said M labels of said M weighted gain codebooks obtained in said step (i) and labels of said M excitation codebooks obtained in said step (b).
12. A method of encoding the vector of an input acoustic signal through the use of M excitation source codebooks each having a plurality of labeled excitation vectors, said M being an integer equal to or greater than 2, said method comprising the steps of:
(a) calculating spectrum envelope parameters of said vector of said input acoustic signal, quantizing said spectrum envelope parameters and setting said quantized parameters as filter coefficients of a synthesis filter;
(b) selecting said excitation vectors one by one from said M excitation source codebooks so that distortion of a synthesized acoustic signal by said synthesis filter, from said input acoustic signal, is minimized;
(c) providing first to M-th gains to said selected M excitation vectors, adding them to generate an excitation signal vector, and searching for and determining optimum values of said first to M-th gains such that distortion of a synthesized acoustic signal from said synthesis filter excited by said excitation signal vector, from said input acoustic signal, is minimized;
(d) pregenerating M weighted codebooks each having M groups of labeled gain vectors obtained by multiplying M-dimensional gain vectors of M gain codebooks by M predetermined M-dimensional weighting coefficient vectors, respectively, said M weighting coefficient vectors each having at least one maximum component in a different dimension, and determining M straight lines for approximating said M groups of weighted gain vectors, respectively;
(e) projecting a vector, composed of said optimum first to M-th gains, on said M straight lines on an M-dimensional coordinate system, and pre-selecting from said M groups pluralities of weighted gain vectors adjacent or close to said projections to form M subgroups;
(f) selecting one weighted gain vector from each of said M subgroups, adding them to obtain an M-dimensional combined gain vector, and defining first to M-th gain components of said combined gain vector as first to M-th gains;
(g) multiplying said M excitation vectors, determined in said step (b), by said first to M-th gain components of said combined gain vector, and adding them to generate an excitation signal vector;
(h) applying said excitation signal vector to said synthesis filter to synthesize an acoustic signal, and calculating its distortion from said input acoustic signal;
(i) repeating said steps (f), (g) and (h) for every combination of weighted representative vectors of said M subgroups to calculate said distortion, and searching for and determining gain labels in said M codebooks which correspond to weighted gain vectors of the combination which was minimum in said distortion; and
(j) outputting labels of said M determined excitation vectors and said determined gain labels as at least one part of encoded results of said vector of said input acoustic signal vector.
13. The method of
10, 11, or 12 wherein said M weighting coefficient vectors are selected so that the sum of M diagonal matrixes using components of said M weighting coefficient vectors as diagonal elements becomes a constant multiple of the unit matrix.
14. The method of
15. The method of
16. The method of
17. An encoder for encoding the vector of an input signal through the use of a plurality of codebooks, comprising:
M codebooks each having a plurality of labeled representative vectors;
multiplying means for multiplying representative vectors selected from said M codebooks by M predetermined but different weighting coefficient vectors to generate weighted representative vectors, said M being an integer equal to or greater than 2 and said M weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
a vector combining part for adding said M weighted vectors to generate a combined representative vector;
a distance calculating part for calculating the distance between said combined representative vector from said vector combining part and said input vector; and
a control part for operating said vector combining part and said distance calculating part while changing the selection of weighted representative vectors from said M weighted codebooks, for determining a combination of weighted representative vectors of said M codebooks which minimizes said distance, and for outputting their corresponding labels as encoded results of the vector of said input signal.
18. An encoder for encoding the vector of an input signal through the use of a plurality of codebooks, comprising:
M weighted representative codebooks each having M groups of weighted representative vectors generated by multiplying representative vectors of M groups by M different weighting coefficient vectors, respectively, said M being an integer equal to or greater than 2 and said weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
a vector combining part for adding M weighted vectors respectively selected from said M weighted codebooks to generate a combined representative vector;
a distance calculating part for calculating the distance between said combined representative vector from said vector combining part and the vector of said input signal; and
a control part for operating said vector combining part and said distance calculating part while changing the selection of weighted representative vectors from said M weighted codebooks, for determining a combination of weighted representative vectors of said M weighted codebooks which minimizes said distance, and for outputting their corresponding labels as encoded results of the vector of said input signal.
19. The encoder of
20. The encoder of
21. An encoder for encoding the vector of an input acoustic signal through the use of a plurality of codebooks, comprising:
M excitation source codebooks each having a plurality of excitation vectors, said M being an integer equal to or greater than 2;
first to M-th gain providing parts for multiplying said M excitation vectors from said M excitation source codebooks by first to M-th gains, respectively;
an adding part for adding said M gain-provided excitation vectors from said first to M-th gain providing parts to generate an excitation signal vector;
filter coefficient generating means which analyzes said input acoustic signal to obtain parameters representing its spectrum envelope and quantizes said parameters to generate filter coefficients;
a synthesis filter which has said filter coefficients set therein and is excited by said excitation signal vector to synthesize an acoustic signal;
distortion calculating means for calculating the difference between said input acoustic signal and said synthesized acoustic signal and for calculating from said difference the distortion of said synthesized acoustic signal from said input acoustic signal;
M gain codebooks each having a plurality of labeled M-dimensional gain vectors;
multiplying means for multiplying gain vectors respectively selected from said M gain codebooks by M predetermined M-dimensional weighting coefficient vectors to generate weighted gain vectors, said M weighting coefficient vectors each having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
a vector combining part for adding said M weighted gain vectors to generate an M-dimensional combined gain vector and for setting first to M-th components of said combined gain vector as first to M-th gains in said first to M-th gain providing parts, respectively; and
control means for controlling the selection of said M excitation vectors from said M excitation source codebooks, for determining a combination of said M excitation vectors which minimizes said distortion of said synthesized acoustic signal from said input acoustic signal, for calculating said distortion by calculating means for every combination of gain vectors respectively selected from said M gain codebooks to determine a combination of M gain vectors which minimizes said distortion, and for outputting labels of said M gain codebooks corresponding to said M determined gain vectors and labels corresponding to said M determined excitation vectors as at least one part of encoded results of said input acoustic signal.
22. An encoder for encoding the vector of an input acoustic signal through the use of a plurality of codebooks, comprising:
M excitation source codebooks each having a plurality of excitation vectors, said M being an integer equal to or greater than 2;
first to M-th gain providing parts for multiplying said M excitation vectors from said M excitation source codebooks by first to M-th gains, respectively;
an adding part for adding said M gain-provided excitation vectors from said first to M-th gain providing parts to generate an excitation signal vector;
filter coefficient generating means which analyzes said input acoustic signal to obtain parameters representing its spectrum envelope and quantizes said parameters to generate filter coefficients;
a synthesis filter which has said filter coefficients set therein and is excited by said excitation signal vector to synthesize an acoustic signal;
distortion calculating means which calculates the difference between said input acoustic signal and said synthesized acoustic signal and calculates from said difference the distortion of said synthesized acoustic signal from said input acoustic signal;
M weighted gain codebooks each having M groups of weighted gain vectors generated by multiplying M groups of M-dimensional gain vectors by M-dimensional weighting coefficient vectors, said M weighting coefficient vectors having at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector being different from at least one of the other components of said weighting coefficient vector;
a vector combining part which adds M weighted vectors respectively selected from said M weighted codebooks to generate a combined representative vector and sets first to M-th components of said combined gain vector as first to M-th gains in said first to M-th gain providing parts, respectively; and
control means which controls the selection of said M excitation vectors from said M excitation source codebooks, determines a combination of said M excitation vectors which minimizes said distortion of said synthesized acoustic signal from said input acoustic signal, obtains by calculating means said distortion for each combination of weighted gain vectors selected from said M weighted gain codebooks, determines a combination of M weighted gain vectors which minimizes said distortion, and outputs labels of said M weighted gain codebooks corresponding to said M determined weighted gain vectors and labels corresponding to said M determined excitation vectors as at least one part of encoded results of said input acoustic signal.
23. The encoder of
24. The encoder of
25. A decoder for decoding an inputted code by referring to a plurality of codebooks, comprising:
M codebooks each having a plurality of labeled representative vectors, said M being an integer equal to or greater than 2;
multiplying means which selects representative vectors corresponding to respective labels in said input code from the corresponding ones of said codebooks and multiplies said selected representative vectors by M weighting coefficient vectors predetermined for said M codebooks to generate M weighted representative vectors; and
a vector combining part which combines said M weighted representative vectors into a reconstructed vector;
wherein said M weighting coefficient vectors each has at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector is different from at least one of the other components of said weighting coefficient vector.
26. A decoder for decoding an inputted code by referring to a plurality of codebooks, comprising:
M weighted codebooks each having a plurality of labeled weighted representative vectors, said M being an integer equal to or greater than 2; and
a vector combining part which selects from said M weighted codebooks weighted representative vectors corresponding to M labels in said input code and combines them into a reconstructed vector;
wherein said M weighting coefficient vectors each have at least one maximum component in a different dimension and at least one of the components of each weighting coefficient vector is different from at least one of the other components of said weighting coefficient vector.
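The decoders of claims 25 and 26 reverse the encoding by table lookup and summation. The sketch below follows claim 25 (weights applied at decode time); the function name `decode` and the toy values are assumptions. A claim 26 style decoder would store the already weighted vectors and skip the multiplication.

```python
def decode(labels, codebooks, weights):
    """Rebuild a vector from one label per codebook."""
    dim = len(weights[0])
    y = [0.0] * dim
    for cb, w, label in zip(codebooks, weights, labels):
        # Look up the representative vector, weight it, and accumulate
        for d in range(dim):
            y[d] += cb[label][d] * w[d]
    return y

# Hypothetical codebooks and weighting vectors shared with the encoder.
CB1 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
CB2 = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
reconstructed = decode((1, 0), [CB1, CB2], [[1.8, 0.2], [0.2, 1.8]])
```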
27. The decoder of
28. A method of encoding an input vector through the use of M codebooks, each including a plurality of representative vectors, each representative vector having a corresponding label, said M being an integer equal to or greater than 2, and the method comprising the steps of:
searching a combination of M representative vectors, each representative vector of the M representative vectors being selected from a different one of the M codebooks, for a combined minimum vector of the M selected representative vectors which provides a minimum distance between the input vector and itself; and
outputting the corresponding labels for the M selected representative vectors of the combined minimum vector,
wherein the representative vectors of a respective codebook of the M codebooks have a distribution which is concentrated close to a respective dimensional axis of the representative vectors in the respective codebook, said respective dimensional axis being different for each of the M codebooks.
29. The method of
(a) selecting one representative vector from each of the M codebooks;
(b) adding all representative vectors selected in step (a) to generate a combined vector;
(c) calculating a distance between the input vector and the combined vector generated in step (b); and
(d) repeating steps (a), (b) and (c) to determine the combined minimum vector which provides the minimum distance between the input vector and itself.
30. A method of encoding an input vector through the use of M codebooks, each including a plurality of representative vectors, each representative vector having a corresponding label, said M being an integer equal to or greater than 2, and the method comprising the steps of:
projecting the input vector onto M straight lines, each straight line approximating a respective distribution of the representative vectors in a respective codebook of the M codebooks;
preselecting a predetermined number of representative vectors present around each projection of the input vector onto the M straight lines so as to form M sets of pre-selected representative vectors;
searching a combination of the preselected representative vectors, each preselected vector being selected from a different one of the M sets of preselected representative vectors, for a combined minimum vector which provides a minimum distance between the input vector and itself; and
outputting the corresponding labels for the preselected representative vectors of the combined minimum vector,
31. The method of
(a) selecting one representative vector from each of the M sets of preselected representative vectors;
(b) adding all representative vectors selected in step (a) to generate a combined vector;
(c) calculating a distance between the input vector and the combined vector generated in step (b); and
(d) repeating steps (a), (b) and (c) to determine the combined minimum vector which provides the minimum distance between the input vector and itself.
32. The method of claims 30 or 31, wherein the preselecting step selects the predetermined number of representative vectors in an increasing order of a difference between a dimensional value of the projected input vector at the respective dimensional axis and a dimensional value of a respective representative vector at the respective dimensional axis.
33. An encoder for encoding an input vector, comprising:
M codebooks, each including a plurality of representative vectors, each representative vector having a corresponding label, in which the representative vectors of a respective codebook of the M codebooks have a distribution which is concentrated close to a respective dimensional axis of the representative vectors in the respective codebook, said respective dimensional axis being different for each of the M codebooks, and said M being an integer equal to or greater than 2;
a determining mechanism configured to search a combination of M representative vectors, each representative vector of the M representative vectors being selected from a different one of the M codebooks, for a combined minimum vector of the M selected representative vectors which provides a minimum distance between the input vector and itself; and
a controlling mechanism configured to output the corresponding labels for the M selected representative vectors of the combined minimum vector.
34. The encoder of
a selecting mechanism configured to select one representative vector from each of the M codebooks;
an adding mechanism configured to add all representative vectors selected by the selecting mechanism to generate a combined vector; and
a distance calculating mechanism configured to calculate a distance between the input vector and the combined vector generated by the adding mechanism,
wherein the processes performed by the selecting, adding and distance calculating mechanisms are repeated to determine the combined minimum vector which provides the minimum distance between the input vector and itself.
35. An encoder for encoding an input vector through the use of M codebooks, each including a plurality of representative vectors, each representative vector having a corresponding label, said M being an integer equal to or greater than 2, the encoder comprising:
a projecting mechanism configured to project the input vector onto M straight lines, each M straight line approximating a respective distribution of the representative vectors in a respective codebook of the M codebooks;
a preselecting mechanism configured to preselect a predetermined number of representative vectors present around each projection of the input vector onto the M straight lines so as to form M sets of preselected representative vectors;
a searching mechanism configured to search a combination of the preselected representative vectors, each preselected representative vector being selected from a different one of the M sets of preselected representative vectors, for a combined minimum vector which provides a minimum distance between the input vector and itself; and
a controlling mechanism configured to output the corresponding labels for the preselected representative vectors of the combined minimum vector,
36. The encoder of
a selecting mechanism configured to select one representative vector from each of the M sets of preselected representative vectors;
an adding mechanism configured to add all representative vectors selected by the selecting mechanism to generate a combined vector; and
a calculating mechanism configured to calculate a distance between the input vector and the combined vector generated by the adding mechanism,
wherein the processes performed by the selecting, adding and calculating mechanisms are repeated to determine the combined minimum vector which provides the minimum distance between the input vector and itself.
37. The encoder of claims 35 or 36, wherein the preselecting mechanism selects the predetermined number of representative vectors in an increasing order of a difference between a dimensional value of the projected input vector at the respective dimensional axis and a dimensional value of a respective representative vector at the respective dimensional axis.
38. A method of decoding an input code through the use of M codebooks, each including a plurality of representative vectors, each representative vector having a corresponding label, said M being an integer equal to or greater than 2, and the method comprising the steps of:
selecting one representative vector from each of the M codebooks which correspond to labels in the input code; and
obtaining a reconstructed vector by combining the representative vectors selected in the selecting step,
39. A decoder for decoding an input code comprising:
M codebooks each including a plurality of representative vectors, each representative vector having a corresponding label, in which the representative vectors of a respective codebook of the M codebooks have a distribution which is concentrated close to a respective dimensional axis of the representative vectors in the respective codebook, said respective dimensional axis being different for each of the M codebooks, and said M being an integer equal to or greater than 2;
a selecting mechanism configured to select one representative vector from each of the M codebooks which correspond to labels in the input code; and
a vector combining mechanism configured to reconstruct a vector by combining the representative vectors selected by the selecting mechanism.
40. A speech coding method, comprising the steps of:
calculating spectrum envelope parameters of an inputted speech signal;
quantizing the spectrum envelope parameters;
setting the quantized spectrum envelope parameters as filter coefficients of a synthesis filter;
selecting a pitch-excitation vector from a pitch-excitation source codebook which stores pitch-excitation vectors respectively containing different pitch-period components;
providing the selected pitch-excitation vector with a first gain;
selecting a random-excitation vector from a random-excitation source codebook;
providing the random-excitation vector with a second gain;
adding the pitch-excitation vector provided with the first gain and the random-excitation vector provided with the second gain; and
reproducing a synthesized speech signal by driving said synthesis filter having the filter coefficients with the added signal,
wherein a gain combined vector including M components is calculated by combining representative vectors each selected from a different codebook of the M codebooks, each codebook including a plurality of representative vectors, each representative vector having a corresponding label, in which the representative vectors of a respective codebook of the M codebooks have a distribution which is concentrated close to a respective dimensional axis of the representative vectors in the respective codebook, said respective dimensional axis being different for each of the M codebooks, and said M being an integer equal to or greater than 2,
wherein the first and second gains are respectively provided by using each component of the gain combined vector, and
wherein the gain combined vector is selected to minimize the distortion of the synthesized speech signal with respect to the inputted speech signal.
41. A speech encoder, comprising:
a calculating mechanism configured to calculate spectrum envelope parameters of an inputted speech signal;
a quantizing mechanism configured to quantize the spectrum envelope parameters;
a synthesis filter having filter coefficients set as the quantized spectrum envelope parameters;
a selecting mechanism configured to select a pitch-excitation vector from a pitch-excitation source codebook which stores pitch-excitation vectors containing different pitch-period components, the selected pitch-excitation vector being provided with a first gain;
a selecting mechanism configured to select a random-excitation vector from a random-excitation source codebook, the random-excitation vector being provided with a second gain; and
an adding mechanism configured to add the pitch-excitation vector provided with the first gain and the random-excitation vector provided with the second gain,
wherein a synthesized speech signal is reproduced by driving said synthesis filter with the added signal,
wherein a gain combined vector including M components is calculated by combining representative vectors each selected from a different codebook of M codebooks, each codebook including a plurality of representative vectors, each representative vector having a corresponding label, in which the representative vectors of a respective codebook of the M codebooks have a distribution which is concentrated close to a respective dimensional axis of the representative vectors in the respective codebook, said respective dimensional axis being different for each of the M codebooks, and said M being an integer equal to or greater than 2,
wherein the synthesized representative vector is selected to minimize the distortion of the synthesized speech signal with respect to the inputted speech signal.
Description

The present invention relates to a vector coding method that is used to encode speech, images and various other pieces of information and that is particularly suited to encoding information transmitted over an error-prone channel such as a mobile radio channel. The method encodes an input vector through the use of a plurality of codebooks, each composed of plural representative vectors. The invention also pertains to a vector encoder using the above-mentioned vector coding method and a vector decoder for decoding codes encoded by the vector encoder.

Methods that have been proposed to transmit vectors over channels prone to channel errors set representative vectors in anticipation of possible channel errors and take into account the channel errors when labeling representative vectors. These methods are disclosed in Kumazawa, Kasahara and Namekawa, “A Communication of Vector Quantizers for Noisy Channels,” Transactions of the Institute of Electronics, Information and Communication Engineers of Japan, Vol. J67-B, No. 1, pp. 1-8, 1984, Zeger and Gersho, “Pseudo-Gray Coding,” IEEE Trans. on Comm., Vol. 38, No. 12, pp. 2147-2158, 1990, and other literature. These methods hold all representative vectors directly in one codebook, and hence require large storage capacity for storing the codebook.

As a method that does not need large storage capacity in transmitting vectors over channels prone to channel errors, it has been proposed to transmit vectors after quantizing them through the use of two structured codebooks. This is disclosed in Moriya, “Two-Channel Vector Quantizer Applied to Speech coding,” Transactions of the Institute of Electronics, Information and Communication Engineers of Japan, IT87-106, pp. 25-30, 1987 and other literature.
This method has two small-scale codebooks and uses two representative vectors in combination to reduce the storage capacity needed and transmits two labels indicative of the two representative vectors to lessen the influence of channel errors. This method will be described with reference to FIGS. 1A and 1B. The representative vectors of the codebooks are generated beforehand by learning, for instance. In an encoder depicted in FIG. 1A, one representative vector Z

In a decoder shown in FIG. 1B, the control part

The method described above in respect of FIGS. 1A and 1B reduces the storage capacity of the codebooks needed for storing the representative vectors and lessens the influence of channel errors by combining the vectors in the decoder through utilization of the two labels corresponding thereto. With this method, however, if an error arises in the labels during transmission over the channel, then distortion will occur in all elements of the received vector. According to the circumstances, the error will cause an abnormally large amount of distortion in the decoded output. Another problem of this method is that the amount of processing required is very large because it involves the calculation of the distance d(X,y

An object of the present invention is to provide a vector coding method that prevents an error in the input code to the decoder from causing serious distortion of its output.

Another object of the present invention is to provide a vector coding method that prevents an error in the input code to the decoder from causing serious distortion of its output and permits reduction of the amount of processing required.

Another object of the present invention is to provide a vector encoder that embodies the above-mentioned vector coding method.

Still another object of the present invention is to provide a vector decoder that decodes a vector encoded by the vector coding method that serves the above-mentioned objects.
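As a rough illustration of the conventional two-codebook scheme of FIGS. 1A and 1B, the following Python sketch searches every pair of representative vectors for the combination closest to the input vector. It assumes that the two representative vectors are combined by simple addition and that the distance d is the squared Euclidean distance; the function names are hypothetical and are not taken from the cited literature.

```python
import numpy as np


def encode_two_codebooks(x, cb1, cb2):
    """Exhaustively search all pairs (i, j) for the combined vector nearest to x.

    Assumption: vectors are combined by addition and compared by squared
    Euclidean distance.
    """
    # combined[i, j] holds cb1[i] + cb2[j] for every pair of labels.
    combined = cb1[:, None, :] + cb2[None, :, :]
    dist = np.sum((combined - x) ** 2, axis=-1)
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    return int(i), int(j), combined[i, j]


def decode_two_codebooks(i, j, cb1, cb2):
    """Rebuild the vector from the two received labels."""
    return cb1[i] + cb2[j]
```

With N1 and N2 vectors in the two codebooks, this search evaluates N1 x N2 distances, which is the processing load referred to above. Because each codebook vector contributes to every element of the sum, a wrong label i or j can shift all elements of the decoded vector.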
According to the vector coding method and the encoder of the first aspect of the present invention, representative vectors from respective codebooks are combined and the distance between the combined representative vector and the input vector is calculated; in this instance, the representative vectors to be combined are those multiplied by predetermined different weighting coefficient vectors, each of which is composed of the same number of components. At least one of the components in each weighting coefficient vector assumes a maximum value, and the positions of the maximum components in the respective weighting coefficient vectors differ with the codebooks. The multiplication of each representative vector by the weighting coefficient vector is done on the representative vector read out of each codebook, or weighted representative vectors respectively premultiplied by the weighting coefficient vectors are prestored in each codebook.

According to the vector coding method and the vector encoder of a second aspect of the present invention, in the first aspect, the distribution of the set of weighted representative vectors multiplied by the weighting coefficient vectors for each codebook is approximated with a straight line, then the input vector is projected on each straight line and a plurality of weighted representative vectors present around the projection are chosen for each codebook. The thus chosen weighted representative vectors of the respective codebooks are combined in pairs and that one of the combined vectors which has the minimum distance to the input vector is selected as the combined representative vector.

In a third aspect, the present invention relates to a decoder for codes encoded into vectors according to the first or second aspect.
The representative vectors read out of respective codebooks are multiplied by weighting coefficient vectors and the multiplied weighted representative vectors are combined into a reconstructed vector; the weighting coefficient vectors are selected in the same fashion as in the first aspect of the invention.

FIG. 1A is a block diagram showing the configuration of an encoder embodying a conventional vector coding method;
FIG. 1B is a block diagram showing a conventional decoder for use with the encoder of FIG. 1A;
FIG. 2A is a block diagram illustrating an example of the encoder embodying the present invention;
FIG. 2B is a block diagram illustrating an embodiment of the decoder according to the present invention;
FIG. 3A is a graph showing each representative vector z
FIG. 3B is a graph showing a vector obtained by multiplying each representative vector z
FIG. 3C is a graph showing each representative vector z
FIG. 3D is a graph showing a vector obtained by multiplying each representative vector z
FIG. 3E is a graph showing examples of a combined vector and an erroneous combined vector in the present invention;
FIG. 3F is a graph showing examples of a combined vector and an erroneous combined vector in the prior art;
FIG. 4 is a graph showing a set of weighted representative vectors and a straight line for approximation, for explaining the vector coding method of the present invention;
FIG. 5 is a block diagram illustrating an example of the encoder of the present invention applied to the CELP scheme;
FIG. 6A is a table showing a first codebook with weighted gain vectors stored therein;
FIG. 6B is a table showing a second codebook with weighted gain vectors stored therein;
FIG. 7 is a graph showing weighted gain vectors of FIGS. 6A and 6B on a coordinate system;
FIG. 8 is a block diagram illustrating the encoder of the present invention applied to the quantization of random excitation vectors;
FIG. 9 is a block diagram illustrating the encoder of the present invention applied to the quantization of pitch excitation vectors;
FIG. 10 is a block diagram illustrating the encoder of the present invention applied to the VSELP scheme;
FIG. 11 is a graph showing segmental SN ratio of reconstructed speech with respect to a channel error rate in the cases of using one gain codebook and two gain codebooks for vector encoding of gains g
FIG. 12 is a graph showing, as an equivalent Q value, the MOS of the reconstructed speech with respect to the channel error rate in the two cases of FIG. 11; and
FIG. 13 is a graph showing the segmental SN ratio of reconstructed speech with respect to the channel-error rate in the case of the present invention that uses two weighted gain codebooks for vector encoding of the gains g

In FIGS. 2A and 2B there are illustrated in block form an embodiment of the present invention, in which the parts corresponding to those in FIGS. 1A and 1B are identified by the same reference numerals. In the encoder of FIG. 2A, multipliers

the weighting coefficient vectors w

where K is a predetermined constant. Vectors w

With such a configuration as described above, for example, when L=2, the representative vectors z

Suppose, for example, that when it is judged at the transmitting side that the weighted combined vector y

In other words, in the example of FIG. 3E, when the weighted representative vector w

FIG. 2B illustrates in block form an embodiment of the decoder of the present invention, which is supplied with the labels i and j and the weighting coefficient vectors w

In the encoder of FIG.
2A, the combined vector y

Let it be assumed, for example, that the codebooks CB

The two-dimensional weighting coefficient vectors w

With this scheme, the set of weighted representative vectors (indicated by the crosses) of the codebook CB

The input vector X is projected on the approximating straight lines

Only the weighted representative vectors thus pre-selected from the codebooks CB

FIG. 5 illustrates an embodiment of the coding method of the present invention applied to speech coding of the CELP (Code-Excited Linear Prediction Coding) system. In the CELP system, as disclosed by M. R. Schroeder and B. S. Atal in “Code-Excited Linear Prediction (CELP): High-Quality Speech at Very Low Bit Rates”, Proc. ICASSP'85, pp. 937-940, 1985, for instance, pitch excitation vectors read out of a pitch excitation source codebook and random excitation vectors read out of a random excitation source codebook are respectively provided with gains and are combined in pairs. The combined vectors are each fed as an excitation signal to a synthesis filter to obtain synthesized speech; two vectors and two gains therefor are then determined which minimize distortion of the synthesized speech with respect to input speech, and labels of these vectors and labels of the gains are outputted as encoded results of the input speech, together with the filter coefficients of the synthesis filter.

By applying the vector coding method of the present invention to the encoding of the gains for the two vectors in the CELP system, it is possible to prevent the occurrence of gross distortion in the decoded or reconstructed speech by a channel error in the codes representing the gains.

The input speech signal X fed via an input terminal

After the selection of the pitch excitation vector and the random excitation vector as mentioned above, gains g
In a vector combining part
The first and second components g The pitch excitation vector g In the FIG. 5 embodiment, the distortion is calculated for every combination of gain vectors selected from the gain codebooks CB
which are the products of the gain vectors z As is the case with the FIG. 5 example, the gains of the gain providing parts
where t indicates a transposition. By partially differentiating the above equation by the gains g
Since the distortion D is a downward convex function with respect to each of the gains g
−g from Eqs. (7) and (8), respectively. From the following equation, the gains g Expanding Eq(12), the gains g
where k=1/({circumflex over (X)} In this case, predetermined numbers of weighted gain vectors (for example, four from Table I with respect to the point P A description will be given of still another method of pre-selecting candidates of vectors from the codebooks without using the aforementioned approximating straight lines L
Then, a predetermined number, for example, three, of gain vectors are pre-selected from the gain codebook CB
then, a predetermined number, for example, three, of gain vectors are pre-selected from the gain codebook CB
Then, i and j that minimize the value are provided as encoded outputs. This method also permits reduction of the computational complexity involved.

All pairs of thus pre-selected weighted gain vectors that are selected one by one from the first and second subgroups, respectively, are searched for a pair of weighted gain vectors that provide the minimum distortion from the input speech signal X, that is, an optimum pair of first and second gains g

At any rate, the multiplication of the weighting coefficient vectors w

In the FIG. 5 embodiment, the weighting coefficient vectors, the weighted gain vectors and the combined gain vector have all been described to be two-dimensional vectors with a view to providing gains to the excitation vectors read out from both of the pitch excitation source codebook

In the FIG. 5 embodiment, the random excitation source codebook

As is the case with the encoding of the random excitation vector, the present invention is also applicable to the encoding of the pitch excitation vector in the configuration of FIG.

The present invention can be applied to the quantization in the filter coefficient quantizing part shown in FIG. 5 by configuring the filter coefficient quantizing part

The vector coding method of the present invention is applicable to the VSELP system as well. FIG. 10 illustrates the principal parts of its embodiment. In this instance, the random excitation source codebook

As will be understood from the description given above with reference to FIG.
10, the random excitation source codebook

As described previously, the multipliers

Next, a description will be given of characteristics that are obtained in the cases of applying conventional techniques and the present invention to the vector coding of the gains g

(A) A first conventional technique substitutes the two gain codebooks CB

(B) A second conventional technique uses the two gain codebooks CB

(C) In the example of the present invention applied to the encoding of gain vectors in FIG. 5, the gain vectors read out of the gain codebooks CB

In FIG. 11 there are indicated, by the curves A

In FIG. 12 there are shown, by curves A

FIG. 13 shows, by the curve C

As described above, according to the present invention, in the case of encoding vectors through the use of a plurality (M) of codebooks each having L-dimensional vectors, L-dimensional weighting coefficient vectors w

Furthermore, according to the present invention, a plurality of weighted vectors of each codebook are pre-selected for input signal vectors and the distortion by encoding is calculated with respect to such pre-selected weighted vectors alone; this appreciably reduces the amount of calculations involved in the encoding and speeds up the encoding.

The present invention is also applicable to what is called a CELP or VSELP speech coding scheme, in which case the invention can be applied to the vector coding of spectrum envelope parameters, the vector coding of power and the vector coding of each codebook individually or simultaneously.

It will be apparent that many modifications and variations may be effected without departing from the scope of the novel concepts of the present invention.
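The weighting and pre-selection summarized above can be sketched in Python as follows. This is only an illustration under stated assumptions: M=2 codebooks of L=2 dimensions, squared Euclidean distance, weighting coefficient vectors (1.8, 0.2) and (0.2, 1.8) chosen solely as an example that satisfies W1+W2=K·I with K=2, and a principal-axis fit standing in for the approximating straight line. All names are hypothetical.

```python
import numpy as np

# Illustrative weights (an assumption): each has its maximum component in a
# different dimension.
W1 = np.array([1.8, 0.2])
W2 = np.array([0.2, 1.8])
K = 2.0
# The diagonal matrices built from the weights sum to K times the unit matrix.
assert np.allclose(np.diag(W1) + np.diag(W2), K * np.eye(2))


def fit_line(points):
    """Fit a straight line (mean point, unit direction) to a set of vectors."""
    mean = points.mean(axis=0)
    _, _, vh = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vh[0]


def preselect(x, wcb, axis, n):
    """Indices of the n weighted vectors nearest, along `axis`, to the
    projection of x onto the line approximating the weighted codebook."""
    mean, u = fit_line(wcb)
    proj = mean + np.dot(x - mean, u) * u
    order = np.argsort(np.abs(wcb[:, axis] - proj[axis]))
    return order[:n]


def encode(x, wcb1, wcb2, n=None):
    """Return labels (i, j) minimizing the distance between x and the sum of
    the weighted vectors. If n is given, only n pre-selected candidates per
    codebook are searched."""
    if n is None:
        idx1, idx2 = np.arange(len(wcb1)), np.arange(len(wcb2))
    else:
        idx1 = preselect(x, wcb1, int(np.argmax(W1)), n)
        idx2 = preselect(x, wcb2, int(np.argmax(W2)), n)
    combined = wcb1[idx1][:, None, :] + wcb2[idx2][None, :, :]
    dist = np.sum((combined - x) ** 2, axis=-1)
    a, b = np.unravel_index(np.argmin(dist), dist.shape)
    return int(idx1[a]), int(idx2[b])


def decode(i, j, wcb1, wcb2):
    """Reconstruct the vector from the two labels and the weighted codebooks."""
    return wcb1[i] + wcb2[j]
```

Here `wcb1` and `wcb2` stand for codebooks whose vectors have already been multiplied element by element by `W1` and `W2`, for example `cb1 * W1`. With codebook components on comparable scales, a wrong label j mainly disturbs the dimension in which `W2` is large, while its effect on the other dimension is scaled down by the small component of `W2`.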