|Publication number||US3681530 A|
|Publication date||Aug 1, 1972|
|Filing date||Jun 15, 1970|
|Priority date||Jun 15, 1970|
|Publication number||US 3681530 A, US 3681530A, US-A-3681530, US3681530 A, US3681530A|
|Inventors||Harold J Manley, Harry L Shaffer|
|Original Assignee||GTE Sylvania Inc|
United States Patent Manley et al.
[15] 3,681,530
[45] Aug. 1, 1972
METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE
Inventors: Harold J. Manley, Sudbury; Harry L. Shaffer, Lynnfield, both of Mass.
Assignee: GTE Sylvania Incorporated
Filed: June 15, 1970
Appl. No.: 46,128
U.S. Cl.: 179/1 SA
Int. Cl.: G10l 1/02, G10l 1/08
Field of Search: 179/15 A, 15.55 R; 324/77 C, 324/77 F
References Cited
UNITED STATES PATENTS
3,448,216 6/1969 Kelly 179/1 SA
3,566,035 2/1971 Noll 179/1 SA
3,344,349 9/1967 Schroeder 179/1 SA
3,403,227 9/1968 Malm 179/1 SA
3,330,910 7/1967 Flanagan 179/1 SA
3,493,684 2/1970 Kelly 179/1 SA
3,471,648 10/1969 Miller 179/1 SA
OTHER PUBLICATIONS
Noll, Short-Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A. 2/1964 pp. 296-302.
Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968 pp. 485-491.
Primary Examiner: Kathleen H. Claffy
Assistant Examiner: Jon Bradford Leaheey
Attorney: Norman J. O'Malley, Elmer J. Nealon and Robert T. Orner
ABSTRACT
A bandwidth compression system such as a digital vocoder including an analysis section employs a transducer to convert an input speech wave into an electrical signal which is then digitized by an analog-to-digital converter. The digitized signal is directed through a spectrum device where the magnitudes of the frequency spectrum of the input speech wave are obtained. These magnitudes are then directed to a logging circuit to obtain the logarithm of the frequency spectrum magnitudes of the input speech signal. The logged magnitudes of the frequency spectrum are then directed to a computer where the discrete Fourier transform of the logged spectrum magnitudes is obtained to form the Fourier transform of the logarithm of the frequency spectrum magnitude (FTLSM) of the input speech signal. An encoding unit selects and encodes certain ones of the FTLSM coefficients for transmission to a remote terminal for analysis. The encoded signals include pitch data and vocal tract impulse data, both of which are derived from the FTLSM signals.
The analysis section of a vocoder terminal employs a decoding device which decodes the received data and separates it into pitch data and vocal tract impulse ...
37 Claims, 19 Drawing Figures
METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE
BACKGROUND OF THE INVENTION
This invention relates to speech compression systems and in particular to digital vocoder systems.
It is well known that the vocal tract, consisting of throat, mouth, tongue, lips, teeth and nasal passages, forms a time-varying linear filter whose amplitude response versus frequency characteristic is responsible for practically all the information content in a speech signal. This filter is driven by energy sources, commonly known as "buzz" and "hiss" energy sources.
The term buzz is associated with the type of vocal source excitation function which exists when the vocal cords are oscillating at some quasi-periodic rate (called the pitch). Under this condition the chest cavity is supplying puffs of air to the vocal tract at the quasi-periodic rate at which the vocal cords are oscillating. The term hiss is associated with the type of vocal source excitation which exists when the vocal cords are not oscillating in a quasi-periodic manner but are always allowing air to pass through from the chest cavity and excite the vocal tract.
For voiced sounds, e.g., vowels, the excitation is from the buzz energy source. For unvoiced sounds, e.g., ss, sh, f and whispered speech, the excitation is from the hiss source. The information content is impressed upon the speech signal by the vocal tract acting essentially as a time-varying distributed-constant linear filter. Thus, to recreate speech which is both intelligible and natural sounding, it is necessary to use both the information describing the time-varying spectral shape and the information describing the buzz and hiss energy sources. The latter information generally takes the form of measurements of the fundamental frequency of the buzz source as a function of time (pitch extraction). Information as to whether the excitation is buzz or hiss is used by the speech compression system. Combinations of buzz and hiss excitation are used to generate some sounds, but speech compression systems do not generally try to detect the combined excitation. A decision is usually made as to whether to use buzz or hiss excitation for this combined excitation in a speech compression system of this type.
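The source-filter picture described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 80-sample pitch period, and the decaying resonance used as a stand-in for one vocal tract shape are all hypothetical and do not come from the patent.

```python
import numpy as np

def excitation(n_samples, voiced, pitch_period=80, seed=0):
    """Return a buzz (impulse train) or hiss (white noise) excitation."""
    if voiced:
        e = np.zeros(n_samples)
        e[::pitch_period] = 1.0          # one puff of air per pitch period
        return e
    rng = np.random.default_rng(seed)
    return rng.standard_normal(n_samples)

def vocal_tract(e, impulse_response):
    """Apply a fixed linear filter, a short-time stand-in for the vocal tract."""
    return np.convolve(e, impulse_response)[: len(e)]

# Hypothetical decaying resonance standing in for one vocal tract shape.
n = np.arange(64)
h = (0.95 ** n) * np.cos(2 * np.pi * 0.1 * n)

vowel_like = vocal_tract(excitation(800, voiced=True), h)
fricative_like = vocal_tract(excitation(800, voiced=False), h)
```

Because the assumed impulse response is shorter than the pitch period, the voiced output is simply the impulse response repeated once per period; the unvoiced output has the same spectral shaping but no periodicity.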
Speech compression systems using spectral analysis are generally called vocoders. In existing speech compression systems, the spectrum data are transmitted by digitally encoding the logarithm of about 16 voltage spectrum amplitudes which are derived from a filter bank spectrum analyzer. This method is known to be inefficient because of the high correlations among the various spectrum amplitudes. Various techniques are now used to remove these correlations and therefore reduce the required data rate for a given transmission fidelity. One approach which produces some improvement is the use of a delta pulse code modulation scheme in which only the decibel differences in level between adjacent frequency channels are transmitted. Another scheme is to form weighted sums of the logged, digitized spectrum amplitudes, the weighting being arranged so that cross-correlations of the speech wave against a waveform derived from the input speech are markedly reduced.
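As a rough illustration of the delta scheme mentioned above, the following Python sketch codes each channel level as a quantized difference from the previously reconstructed channel. The step size, the function names, and the example channel levels are assumptions made for the sketch, not values from the patent.

```python
import numpy as np

def delta_encode(levels_db, step_db=2.0):
    """Quantize each channel as a difference from the previous reconstruction."""
    levels_db = np.asarray(levels_db, dtype=float)
    codes = np.empty(len(levels_db), dtype=int)
    recon_prev = 0.0                       # the first channel is coded against 0 dB
    for i, level in enumerate(levels_db):
        codes[i] = int(np.round((level - recon_prev) / step_db))
        recon_prev += codes[i] * step_db   # track what the decoder will see
    return codes

def delta_decode(codes, step_db=2.0):
    """Rebuild channel levels by accumulating the transmitted differences."""
    return np.cumsum(codes) * step_db

# Hypothetical log-amplitude levels (dB) for eight adjacent channels.
channel_levels_db = np.array([60.0, 58.0, 55.0, 49.0, 50.0, 53.0, 47.0, 41.0])
codes = delta_encode(channel_levels_db)
decoded = delta_decode(codes)
```

Since adjacent channels are highly correlated, every code after the first stays small and needs fewer bits than an absolute level, while the reconstruction error per channel stays within half a quantizer step.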
Another type of vocoder is called the autocorrelation vocoder, which derives its name from the fact that in the first step of the analysis process the autocorrelation function of the speech input is measured in terms of orthonormal functions. Just as the power spectrum of the speech input varies with time (as a talker articulates various sounds), so does the autocorrelation function. There is a one-to-one correspondence between the power spectrum and the autocorrelation function of the speech signal so that measuring one is equivalent to measuring the other. Mathematically, the power spectrum and the autocorrelation function are Fourier transform pairs. Thus, autocorrelation is simply an alternative method of measuring the short-time energy spectrum of the speech signal. In an autocorrelation vocoder, the input signal is first applied to the inputs of a set of orthogonal filters. The filter output signals are multiplied by the input speech signal, and the product signal is then directed through low pass filters. The output signals from the low pass filters are the coefficients in an expansion of the power spectrum.
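The Fourier-pair relationship stated above can be checked numerically. The sketch below uses a random frame as a stand-in for speech samples and the circular (periodic) form of the autocorrelation, which is the form that matches the discrete Fourier transform exactly; both choices are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)      # stand-in for one frame of speech samples
N = len(x)

# Direct circular autocorrelation: r[k] = sum_n x[n] * x[(n + k) mod N]
r_direct = np.array([np.dot(x, np.roll(x, -k)) for k in range(N)])

# Same quantity obtained from the power spectrum: inverse DFT of |X(f)|^2
power_spectrum = np.abs(np.fft.fft(x)) ** 2
r_from_spectrum = np.fft.ifft(power_spectrum).real

print(np.allclose(r_direct, r_from_spectrum))
```

Going the other way, `np.fft.fft(r_direct)` recovers the power spectrum, which is the sense in which measuring one quantity is equivalent to measuring the other.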
The power spectrum $P(f)$ of a speech signal is the product of the power spectrum of a pitch excitation, $V(f)$, and the magnitude squared $|H(f)|^2$ of a vocal tract transfer function $H(f)$:

$$P(f) = |H(f)|^2 \, V(f) \qquad (1)$$

As stated above, the autocorrelation function is the Fourier transform of $P(f)$ and is composed of the convolution of the transforms of $|H(f)|^2$ and $V(f)$. Practically, this means that the autocorrelation function repeats itself at multiples of the pitch period, and it is necessary to represent the vocal tract out to fairly large delay values (near one-half of a pitch period) in order to represent the speech spectrum with any fidelity. The overlap of successive autocorrelation functions due to convolution properties raises some doubt as to the validity of the values of the autocorrelation function alone as a measure of the vocal tract shape. While the autocorrelation vocoder obtains nearly independent spectral measurements, it does not solve the problem caused by confounding the spectral envelope (vocal tract) data with the excitation spectrum data, which results in higher order transmitted coefficients. Furthermore, this type of vocoder is basically an analog device yielding an output consisting of voltage spectrum values which are subsequently digitized.
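One way to see why the autocorrelation repeats, as a sketch under idealizing assumptions not stated in the text: write $r_h(\tau)$ and $r_v(\tau)$ for the Fourier transforms of $|H(f)|^2$ and $V(f)$ (both symbols are introduced here only for illustration), and assume the buzz excitation is periodic with pitch period $T$, so that $r_v(\tau)$ is concentrated at multiples of $T$ with some weights $a_k$. Equation (1) and the convolution theorem then give

$$R(\tau) = (r_h * r_v)(\tau), \qquad r_v(\tau) \approx \sum_k a_k\,\delta(\tau - kT) \;\Rightarrow\; R(\tau) \approx \sum_k a_k\, r_h(\tau - kT).$$

Under these assumptions, the copies of $r_h$ centered at $0$ and at $T$ begin to overlap as soon as $r_h$ extends beyond about $T/2$, which is the overlap of successive autocorrelation functions described above.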
SUMMARY OF THE INVENTION
Briefly, a bandwidth compression system according to the present invention includes a means for generating an electrical signal representing the Fourier transform of the logarithm of the spectrum magnitudes (FTLSM) of an input signal having excitation and impulse response information included therein. A first detection means, coupled to the means for generating the FTLSM electrical signal, is operative to separate out a first predetermined portion of the FTLSM electrical signal to represent the excitation information of the input signal. A second detection means, also coupled to the means for generating the FTLSM electrical signal, is operative to separate out a second predetermined portion of the FTLSM electrical signal to represent the impulse response information of the input signal. The bandwidth required to pass the combined first and
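The FTLSM computation and the two "predetermined portions" can be sketched in Python as follows. This is a minimal illustration, not the patented apparatus: the Hanning window, the 1/N scaling, the choice of low-order terms for the impulse response and higher-order terms for the excitation, the cutoff of 30 samples, and the synthetic test frame are all assumptions introduced here.

```python
import numpy as np

def ftlsm(frame, eps=1e-12):
    """Fourier transform of the log spectrum magnitude of one real frame."""
    windowed = frame * np.hanning(len(frame))   # window choice is an assumption
    magnitude = np.abs(np.fft.fft(windowed))    # frequency spectrum magnitudes
    log_magnitude = np.log(magnitude + eps)     # logging step (eps avoids log(0))
    # log_magnitude is real and even for a real frame, so its DFT is real
    # up to rounding; the 1/N scaling is a convention chosen here.
    return np.fft.fft(log_magnitude).real / len(frame)

def split_ftlsm(coeffs, cutoff):
    """Assumed split: low-order terms -> impulse response, the rest -> excitation."""
    half = len(coeffs) // 2                     # upper half mirrors the lower half
    return coeffs[:cutoff], coeffs[cutoff:half]

# Synthetic voiced frame: impulse train (period 80 samples) through a
# hypothetical decaying resonance standing in for the vocal tract.
n = np.arange(64)
h = (0.95 ** n) * np.cos(2 * np.pi * 0.1 * n)
buzz = np.zeros(1024)
buzz[::80] = 1.0
frame = np.convolve(buzz, h)[:1024]

cutoff = 30                                     # hypothetical boundary, in samples
impulse_part, excitation_part = split_ftlsm(ftlsm(frame), cutoff)
estimated_period = cutoff + int(np.argmax(excitation_part))
print(estimated_period)
```

The design point this illustrates is that the logarithm turns the product in Equation (1) into a sum, so the slowly varying vocal tract term and the rapidly rippling excitation term land in different regions of the transform and can be picked off separately. In this sketch the largest high-order coefficient is expected to sit near the assumed pitch period.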
|Cited Patent||Filing date||Publication date||Applicant||Title|
|US3330910 *||May 6, 1964||Jul 11, 1967||Bell Telephone Labor Inc||Formant analysis and speech reconstruction|
|US3344349 *||Oct 7, 1963||Sep 26, 1967||Bell Telephone Labor Inc||Apparatus for analyzing the spectra of complex waves|
|US3403227 *||Oct 22, 1965||Sep 24, 1968||Page Comm Engineers Inc||Adaptive digital vocoder|
|US3448216 *||Aug 3, 1966||Jun 3, 1969||Bell Telephone Labor Inc||Vocoder system|
|US3471648 *||Jul 28, 1966||Oct 7, 1969||Bell Telephone Labor Inc||Vocoder utilizing companding to reduce background noise caused by quantizing errors|
|US3493684 *||Jun 15, 1966||Feb 3, 1970||Bell Telephone Labor Inc||Vocoder employing composite spectrum-channel and pitch analyzer|
|US3566035 *||Jul 17, 1969||Feb 23, 1971||Bell Telephone Labor Inc||Real time cepstrum analyzer|
|1||*||Noll, Short-Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A. 2/1964 pp. 296-302.|
|2||*||Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968 pp. 485-491.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US4076960 *||Oct 27, 1976||Feb 28, 1978||Texas Instruments Incorporated||CCD speech processor|
|US4184049 *||Aug 25, 1978||Jan 15, 1980||Bell Telephone Laboratories, Incorporated||Transform speech signal coding with pitch controlled adaptive quantizing|
|US4310721 *||Jan 23, 1980||Jan 12, 1982||The United States Of America As Represented By The Secretary Of The Army||Half duplex integral vocoder modem system|
|US4495620 *||Aug 5, 1982||Jan 22, 1985||At&T Bell Laboratories||Transmitting data on the phase of speech|
|US4914749 *||Oct 29, 1984||Apr 3, 1990||Nec Corporation||Method capable of extracting a value of a spectral envelope parameter with a reduced amount of operations and a device therefor|
|US4941178 *||May 9, 1989||Jul 10, 1990||Gte Laboratories Incorporated||Speech recognition using preclassification and spectral normalization|
|US5179626 *||Apr 8, 1988||Jan 12, 1993||At&T Bell Laboratories||Harmonic speech coding arrangement where a set of parameters for a continuous magnitude spectrum is determined by a speech analyzer and the parameters are used by a synthesizer to determine a spectrum which is used to determine sinusoids for synthesis|
|US5216748 *||Nov 4, 1991||Jun 1, 1993||Bull, S.A.||Integrated dynamic programming circuit|
|US5412589 *||Jun 16, 1992||May 2, 1995||University Of Michigan||System for detecting reduced interference time-frequency distribution|
|US5715363 *||May 18, 1995||Feb 3, 1998||Canon Kabushiki Kaisha||Method and apparatus for processing speech|
|US5809453 *||Jan 25, 1996||Sep 15, 1998||Dragon Systems Uk Limited||Methods and apparatus for detecting harmonic structure in a waveform|
|US6026348 *||Oct 14, 1997||Feb 15, 2000||Bently Nevada Corporation||Apparatus and method for compressing measurement data correlative to machine status|
|US6108621 *||Oct 7, 1997||Aug 22, 2000||Sony Corporation||Speech analysis method and speech encoding method and apparatus|
|US6507804||Feb 14, 2000||Jan 14, 2003||Bently Nevada Corporation||Apparatus and method for compressing measurement data correlative to machine status|
|US6725108||Jan 28, 1999||Apr 20, 2004||International Business Machines Corporation||System and method for interpretation and visualization of acoustic spectra, particularly to discover the pitch and timbre of musical sounds|
|US7333930 *||Mar 14, 2003||Feb 19, 2008||Agere Systems Inc.||Tonal analysis for perceptual audio coding using a compressed spectral representation|
|US7580534 *||Mar 3, 2005||Aug 25, 2009||Siemens Audiologische Technik Gmbh||Method and arrangement for transmitting signals to a hearing aid|
|US8065138 *||Aug 31, 2007||Nov 22, 2011||Japan Advanced Institute Of Science And Technology||Speech processing method and apparatus, storage medium, and speech system|
|US8364492 *||Jul 6, 2007||Jan 29, 2013||Nec Corporation||Apparatus, method and program for giving warning in connection with inputting of unvoiced speech|
|US8447605 *||Jun 3, 2005||May 21, 2013||Nintendo Co., Ltd.||Input voice command recognition processing apparatus|
|US8768638 *||Aug 26, 2005||Jul 1, 2014||Rohde & Schwarz Gmbh & Co. Kg||Method and device for performing spectrum analysis of a wanted signal or noise signal|
|US20040181393 *||Mar 14, 2003||Sep 16, 2004||Agere Systems, Inc.||Tonal analysis for perceptual audio coding using a compressed spectral representation|
|US20050201578 *||Mar 3, 2005||Sep 15, 2005||Siemens Audiologische Technik Gmbh||Method and arrangement for transmitting signals to a hearing aid|
|US20050273323 *||Jun 3, 2005||Dec 8, 2005||Nintendo Co., Ltd.||Command processing apparatus|
|US20070250558 *||Aug 26, 2005||Oct 25, 2007||Rohde & Schwarz Gmbh & Co. Kg||Method and Device for Performing Spectrum Analysis of a Wanted Signal or Noise Signal|
|US20080281588 *||Aug 31, 2007||Nov 13, 2008||Japan Advanced Institute Of Science And Technology||Speech processing method and apparatus, storage medium, and speech system|
|US20090254350 *||Jul 6, 2007||Oct 8, 2009||Nec Corporation||Apparatus, Method and Program for Giving Warning in Connection with inputting of unvoiced Speech|
|DE2934489A1 *||Aug 25, 1979||Mar 27, 1980||Western Electric Co||Schaltung und Verfahren zur Sprachsignalverarbeitung (Circuit and method for speech signal processing)|
|U.S. Classification||704/203, 704/224, 704/207|
|Cooperative Classification||H05K999/99, G10L19/02|