Publication number: US3681530 A
Publication type: Grant
Publication date: Aug 1, 1972
Filing date: Jun 15, 1970
Priority date: Jun 15, 1970
Also listed as: US 3681530 A, US 3681530A, US-A-3681530, US3681530A
Inventors: Harold J Manley, Harry L Shaffer
Original Assignee: GTE Sylvania Inc
Method and apparatus for signal bandwidth compression utilizing the Fourier transform of the logarithm of the frequency spectrum magnitude
US 3681530 A
Abstract
A bandwidth compression system such as a digital vocoder including an analysis section employs a transducer to convert an input speech wave into an electrical signal which is then digitized by an analog to digital converter. The digitized signal is directed through a spectrum device where the magnitudes of the frequency spectrum of the input speech wave are obtained. These magnitudes are then directed to a logging circuit to obtain the logarithm of the frequency spectrum magnitudes of the input speech signal. The logged magnitudes of the frequency spectrum are then directed to a computer where the discrete Fourier transform of the logged spectrum magnitudes is obtained to form the Fourier transform of the logarithm of the frequency spectrum magnitude (FTLSM) of the input speech signal. An encoding unit selects and encodes certain ones of the FTLSM coefficients for transmission to a remote terminal for analysis. The encoded signals include pitch data and vocal tract impulse data, both of which are derived from the FTLSM signals.
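The processing chain in the abstract (spectrum magnitudes, then logarithm, then discrete Fourier transform) can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented apparatus: the names `dft`, `ftlsm`, `split_ftlsm` and `synthetic_frame`, the `floor` guard, the 1/N scaling, and the `cutoff`, `min_lag` and `max_lag` parameters are all hypothetical, and the coefficient-selection rule actually used by the encoding unit is not given in this excerpt.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def ftlsm(frame, floor=1e-9):
    """Fourier transform of the log spectrum magnitude of one frame.

    Steps follow the abstract: spectrum magnitudes -> logarithm -> DFT.
    `floor` (an assumption) avoids log(0) for empty spectral bins.
    """
    magnitudes = [abs(c) for c in dft(frame)]
    logged = [math.log(max(m, floor)) for m in magnitudes]
    # The log magnitude spectrum of a real frame is real and even,
    # so its DFT is real; the 1/N scaling is an illustrative choice.
    return [c.real / len(frame) for c in dft(logged)]


def split_ftlsm(coeffs, cutoff, min_lag, max_lag):
    """Return (low-order coefficients, index of the largest high-order peak).

    Assumption: low-order coefficients stand in for the vocal tract data and
    the peak position stands in for the pitch period, in samples.
    """
    envelope = coeffs[:cutoff]
    pitch_lag = max(range(min_lag, max_lag), key=lambda q: coeffs[q])
    return envelope, pitch_lag


def synthetic_frame(n=128, period=16, decay=0.9, pole=0.6):
    """Hypothetical test signal: decaying pulses, each exciting a short filter."""
    frame = [0.0] * n
    for m, start in enumerate(range(0, n, period)):
        for j in range(period):
            frame[start + j] += (decay ** m) * (pole ** j)
    return frame


if __name__ == "__main__":
    envelope, lag = split_ftlsm(ftlsm(synthetic_frame()), 8, 8, 64)
    print("pitch lag (samples):", lag)
    print("low-order coefficients:", [round(c, 4) for c in envelope])
```

The naive DFT keeps the sketch dependency-free; a practical implementation would presumably use a fast transform.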
Description  (OCR text may contain errors)

United States Patent Manley et al.

[15] 3,681,530
[45] Aug. 1, 1972

[54] METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE
[72] Inventors: Harold J. Manley, Sudbury; Harry L. Shaffer, Lynnfield, both of Mass.

[73] Assignee: GTE Sylvania Incorporated
[22] Filed: June 15, 1970
[21] Appl. No.: 46,128

[52] U.S. Cl.: 179/1 SA
[51] Int. Cl.: G10l 1/02, G10l 1/08
[58] Field of Search: 179/15 A, 15.55 R; 324/77 C, 324/77 F
[56] References Cited

UNITED STATES PATENTS
3,448,216  6/1969  Kelly  179/1 SA
3,566,035  2/1971  Noll  179/1 SA
3,344,349  9/1967  Schroeder  179/1 SA
3,403,227  9/1968  Malm  179/1 SA
3,330,910  7/1967  Flanagan  179/1 SA
3,493,684  2/1970  Kelly  179/1 SA
3,471,648  10/1969  Miller  179/1 SA

OTHER PUBLICATIONS
Noll, Short-Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A. 2/1964 p. 296-302.

Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968 p. 485-491.

Primary Examiner: Kathleen H. Claffy
Assistant Examiner: Jon Bradford Leaheey
Attorney: Norman J. O'Malley, Elmer J. Nealon and Robert T. Orner

[57] ABSTRACT
A bandwidth compression system such as a digital vocoder including an analysis section employs a transducer to convert an input speech wave into an electrical signal which is then digitized by an analog to digital converter. The digitized signal is directed through a spectrum device where the magnitudes of the frequency spectrum of the input speech wave are obtained. These magnitudes are then directed to a logging circuit to obtain the logarithm of the frequency spectrum magnitudes of the input speech signal. The logged magnitudes of the frequency spectrum are then directed to a computer where the discrete Fourier transform of the logged spectrum magnitudes is obtained to form the Fourier transform of the logarithm of the frequency spectrum magnitude (FTLSM) of the input speech signal. An encoding unit selects and encodes certain ones of the FTLSM coefficients for transmission to a remote terminal for analysis. The encoded signals include pitch data and vocal tract impulse data, both of which are derived from the FTLSM signals.

The analysis section of a vocoder terminal employs a decoding device which decodes the received data and separates it into pitch data and vocal tract impulse [remainder illegible in OCR]

37 Claims, 19 Drawing Figures

[Drawing sheets, including sheets 10 and 15 of 18, Patented Aug. 1, 1972, 3,681,530; inventors Harold J. Manley and Harry L. Shaffer. Figure content is not recoverable from the OCR text.]

METHOD AND APPARATUS FOR SIGNAL BANDWIDTH COMPRESSION UTILIZING THE FOURIER TRANSFORM OF THE LOGARITHM OF THE FREQUENCY SPECTRUM MAGNITUDE

BACKGROUND OF THE INVENTION

This invention relates to speech compression systems and in particular to digital vocoder systems.

It is well-known that the vocal tract, consisting of throat, mouth, tongue, lips, teeth and nasal passages, forms a time varying linear filter in which the amplitude response versus frequency characteristic is responsible for practically all the information content in a speech signal. This filter is driven by energy sources, commonly known as "buzz" and "hiss" energy sources.

The term "buzz" is associated with the type of vocal source excitation function which exists when the vocal cords are oscillating at some quasi-periodic rate (called the pitch). Under this condition the chest cavity is supplying puffs of air to the vocal tract at the quasi-periodic rate at which the vocal cords are oscillating. The term "hiss" is associated with the type of vocal source excitation which exists when the vocal cords are not oscillating in a quasi-periodic manner but are always allowing air to pass through from the chest cavity and excite the vocal tract.

For voiced sounds, e.g., vowels, the excitation is from the buzz energy source. For unvoiced sounds, e.g., ss, sh, f and whispered speech, the excitation is from the hiss source. The information content is impressed upon the speech signal by the vocal tract acting essentially as a time varying distributed constant linear filter. Thus, to recreate speech which is both intelligible and natural sounding, it is necessary to use both the information describing the time varying spectral shape and the information describing the buzz and hiss energy sources. The latter information generally takes the form of measurements of the fundamental frequency of the buzz source as a function of time (pitch extraction). Information as to whether the excitation is buzz or hiss is used by the speech compression system. Combinations of buzz and hiss excitation are used to generate some sounds, but speech compression systems do not generally try to detect the combined excitation. A decision is usually made as to whether to use buzz or hiss excitation for this combined excitation in a speech compression system of this type.

Speech compression systems using spectral analysis are generally called vocoders. In existing speech compression systems, the spectrum data are transmitted by digitally encoding the logarithm of about 16 voltage spectrum amplitudes which are derived from a filter bank spectrum analyzer. This method is known to be inefficient because of the high correlations among the various spectrum amplitudes. Various techniques are now used to remove these correlations and therefore reduce the required data rate for a given transmission fidelity. One approach which produces some improvements is the use of a delta pulse code modulation scheme in which only the decibel differences in level between adjacent frequency channels are transmitted. Another scheme is to form weighted sums of the logged, digitized spectrum amplitudes, the weighting being arranged so that cross-correlations of the speech wave against a waveform derived from the input speech are markedly reduced.
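The delta scheme mentioned above can be illustrated with a minimal sketch. The function names `delta_encode` and `delta_decode` and the channel levels are hypothetical, and sending the first channel's absolute level as a reference is an assumption made here so that the receiver can rebuild the levels; the text itself only says that level differences between adjacent channels are transmitted.

```python
def delta_encode(levels_db):
    """Encode channel levels (dB) as a reference level plus adjacent differences.

    Assumption: the first channel level is sent as-is to anchor the decoder.
    """
    if not levels_db:
        return []
    deltas = [levels_db[0]]
    for prev, cur in zip(levels_db, levels_db[1:]):
        deltas.append(cur - prev)
    return deltas


def delta_decode(deltas):
    """Rebuild the channel levels by accumulating the differences."""
    levels = []
    total = 0
    for d in deltas:
        total += d
        levels.append(total)
    return levels


if __name__ == "__main__":
    # Hypothetical levels for six adjacent filter-bank channels, in dB.
    channel_levels = [60, 58, 55, 55, 50, 44]
    encoded = delta_encode(channel_levels)
    print(encoded)
    print(delta_decode(encoded) == channel_levels)
```

Because neighbouring channels are highly correlated, the differences tend to span a smaller range than the levels themselves, which is where the bit saving would come from.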

Another type of vocoder is called the autocorrelation vocoder, which derives its name from the fact that in the first step of the analysis process the autocorrelation function of the speech input is measured in terms of orthonormal functions. Just as the power spectrum of the speech input varies with time (as a talker articulates various sounds), so does the autocorrelation function. There is a one-to-one correspondence between the power spectrum and the autocorrelation function of the speech signal so that measuring one is equivalent to measuring the other. Mathematically, the power spectrum and the autocorrelation function are Fourier transform pairs. Thus, autocorrelation is simply an alternative method of measuring the short time energy spectrum of the speech signal. In an autocorrelation vocoder, the input signal is first applied to the inputs of a set of orthogonal filters. The filter output signals are multiplied by the input speech signal, and the product signal is then directed through low pass filters. The output signals from the low pass filter are the coefficients in an expansion of the power spectrum.
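The statement that the power spectrum and the autocorrelation function are Fourier transform pairs can be checked numerically. The sketch below is not a model of the autocorrelation vocoder's filter bank; it assumes a short real frame and a circular (wrap-around) autocorrelation so that the identity holds exactly for the discrete transform, and all function names and sample values are illustrative.

```python
import cmath


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def circular_autocorrelation(x):
    """r[lag] = sum over t of x[t] * x[(t + lag) mod N], for a real sequence."""
    n = len(x)
    return [sum(x[t] * x[(t + lag) % n] for t in range(n)) for lag in range(n)]


def power_spectrum(x):
    """Squared magnitude of the DFT of x."""
    return [abs(c) ** 2 for c in dft(x)]


if __name__ == "__main__":
    frame = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 1.5]  # arbitrary samples
    from_autocorrelation = dft(circular_autocorrelation(frame))
    direct = power_spectrum(frame)
    worst = max(abs(a - b) for a, b in zip(from_autocorrelation, direct))
    print("largest difference:", worst)
```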

The power spectrum P(f) of a speech signal is the product of the power spectrum of a pitch excitation, V(f), and the magnitude squared |H(f)|^2 of a vocal tract transfer function H(f):

$$P(f) = |H(f)|^2 \, V(f) \tag{1}$$

As stated above, the autocorrelation function is the Fourier transform of P(f) and is composed of the convolution of the transforms of |H(f)|^2 and V(f). Practically, this means that the autocorrelation function repeats itself at multiples of the pitch period, and it is necessary to represent the vocal tract out to fairly large delay values (near one-half of a pitch period) in order to represent the speech spectrum with any fidelity. The overlap of successive autocorrelation functions due to convolution properties raises some doubt as to the validity of the values of the autocorrelation function alone as a measure of the vocal tract shape. While the autocorrelation vocoder obtains nearly independent spectral measurements, it does not solve the problem caused by confounding the spectral envelope (vocal tract) data with the excitation spectrum data, which results in higher order transmitted coefficients. Furthermore, this type of vocoder is basically an analog device yielding an output consisting of voltage spectrum values which are subsequently digitized.
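A short worked step may help connect equation (1) to the logarithm named in the title. It is not spelled out in this excerpt, and the symbols r_h, r_v and the operator F are introduced here purely for illustration: r_h and r_v denote the Fourier transforms of |H(f)|^2 and V(f), and F denotes the Fourier transform.

```latex
% Autocorrelation domain: the product in (1) becomes a convolution,
% so vocal tract and excitation terms overlap.
R(\tau) = \mathcal{F}\{P(f)\} = (r_h * r_v)(\tau)

% Log-spectrum domain: the product in (1) becomes a sum.
\log P(f) = \log |H(f)|^2 + \log V(f) = 2\log |H(f)| + \log V(f)

% By linearity of the Fourier transform the two terms stay additive.
\mathcal{F}\{\log P(f)\} = \mathcal{F}\{\log |H(f)|^2\} + \mathcal{F}\{\log V(f)\}
```

Under these assumptions the envelope and excitation contributions add rather than convolve, which is presumably what allows separate portions of the transformed signal to be assigned to each.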

SUMMARY OF THE INVENTION

Briefly, a bandwidth compression system according to the present invention includes a means for generating an electrical signal representing the Fourier transform of the logarithm of the spectrum magnitudes (FTLSM) of an input signal having excitation and impulse response information included therein. A first detection means, coupled to the means for generating the FTLSM electrical signal, is operative to separate out a first predetermined portion of the FTLSM electrical signal to represent the excitation information of the input signal. A second detection means, also coupled to the means for generating the FTLSM electrical signal, is operative to separate out a second predetermined portion of the FTLSM electrical signal to represent the impulse response information of the input signal. The bandwidth required to pass the combined first and

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
US3330910 * | May 6, 1964 | Jul 11, 1967 | Bell Telephone Labor Inc | Formant analysis and speech reconstruction
US3344349 * | Oct 7, 1963 | Sep 26, 1967 | Bell Telephone Labor Inc | Apparatus for analyzing the spectra of complex waves
US3403227 * | Oct 22, 1965 | Sep 24, 1968 | Page Comm Engineers Inc | Adaptive digital vocoder
US3448216 * | Aug 3, 1966 | Jun 3, 1969 | Bell Telephone Labor Inc | Vocoder system
US3471648 * | Jul 28, 1966 | Oct 7, 1969 | Bell Telephone Labor Inc | Vocoder utilizing companding to reduce background noise caused by quantizing errors
US3493684 * | Jun 15, 1966 | Feb 3, 1970 | Bell Telephone Labor Inc | Vocoder employing composite spectrum-channel and pitch analyzer
US3566035 * | Jul 17, 1969 | Feb 23, 1971 | Bell Telephone Labor Inc | Real time cepstrum analyzer

Non-Patent Citations
Reference
1 * Noll, Short Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection, J.A.S.A. 2/1964 p. 296-302.
2 * Shively, A Digital Processor to Generate Spectra in Real Time, IEEE Trans. on Computers, 5/1968 p. 485-491.

Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US4076960 * | Oct 27, 1976 | Feb 28, 1978 | Texas Instruments Incorporated | CCD speech processor
US4184049 * | Aug 25, 1978 | Jan 15, 1980 | Bell Telephone Laboratories, Incorporated | Transform speech signal coding with pitch controlled adaptive quantizing
US4310721 * | Jan 23, 1980 | Jan 12, 1982 | The United States Of America As Represented By The Secretary Of The Army | Half duplex integral vocoder modem system
US4495620 * | Aug 5, 1982 | Jan 22, 1985 | At&T Bell Laboratories | Transmitting data on the phase of speech
US4914749 * | Oct 29, 1984 | Apr 3, 1990 | Nec Corporation | Method capable of extracting a value of a spectral envelope parameter with a reduced amount of operations and a device therefor
US4941178 * | May 9, 1989 | Jul 10, 1990 | Gte Laboratories Incorporated | Speech recognition using preclassification and spectral normalization
US5179626 * | Apr 8, 1988 | Jan 12, 1993 | At&T Bell Laboratories | Harmonic speech coding arrangement where a set of parameters for a continuous magnitude spectrum is determined by a speech analyzer and the parameters are used by a synthesizer to determine a spectrum which is used to determine senusoids for synthesis
US5216748 * | Nov 4, 1991 | Jun 1, 1993 | Bull, S.A. | Integrated dynamic programming circuit
US5412589 * | Jun 16, 1992 | May 2, 1995 | University Of Michigan | System for detecting reduced interference time-frequency distribution
US5715363 * | May 18, 1995 | Feb 3, 1998 | Canon Kabushika Kaisha | Method and apparatus for processing speech
US5809453 * | Jan 25, 1996 | Sep 15, 1998 | Dragon Systems Uk Limited | Methods and apparatus for detecting harmonic structure in a waveform
US6026348 * | Oct 14, 1997 | Feb 15, 2000 | Bently Nevada Corporation | Apparatus and method for compressing measurement data correlative to machine status
US6108621 * | Oct 7, 1997 | Aug 22, 2000 | Sony Corporation | Speech analysis method and speech encoding method and apparatus
US6507804 | Feb 14, 2000 | Jan 14, 2003 | Bently Nevada Corporation | Apparatus and method for compressing measurement data corelative to machine status
US6725108 | Jan 28, 1999 | Apr 20, 2004 | International Business Machines Corporation | System and method for interpretation and visualization of acoustic spectra, particularly to discover the pitch and timbre of musical sounds
US7333930 * | Mar 14, 2003 | Feb 19, 2008 | Agere Systems Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation
US7580534 * | Mar 3, 2005 | Aug 25, 2009 | Siemens Audiologische Technik Gmbh | Method and arrangement for transmitting signals to a hearing aid
US8065138 * | Aug 31, 2007 | Nov 22, 2011 | Japan Advanced Institute Of Science And Technology | Speech processing method and apparatus, storage medium, and speech system
US8364492 * | Jul 6, 2007 | Jan 29, 2013 | Nec Corporation | Apparatus, method and program for giving warning in connection with inputting of unvoiced speech
US8447605 * | | May 21, 2013 | Nintendo Co., Ltd. | Input voice command recognition processing apparatus
US8768638 * | Aug 26, 2005 | Jul 1, 2014 | Rohde & Schwarz Gmbh & Co. Kg | Method and device for performing spectrum analysis of a wanted signal or noise signal
US20040181393 * | Mar 14, 2003 | Sep 16, 2004 | Agere Systems, Inc. | Tonal analysis for perceptual audio coding using a compressed spectral representation
US20050201578 * | Mar 3, 2005 | Sep 15, 2005 | Siemens Audiologische Technik Gmbh | Method and arrangement for transmitting signals to a hearing aid
US20050273323 * | Jun 3, 2005 | Dec 8, 2005 | Nintendo Co., Ltd. | Command processing apparatus
US20070250558 * | Aug 26, 2005 | Oct 25, 2007 | Rohde & Schwarz Gmbh & Co. Kg | Method and Device for Performing Spectrum Analysis of a Wanted Signal or Noise Signal
US20080281588 * | Aug 31, 2007 | Nov 13, 2008 | Japan Advanced Institute Of Science And Technology | Speech processing method and apparatus, storage medium, and speech system
US20090254350 * | Jul 6, 2007 | Oct 8, 2009 | Nec Corporation | Apparatus, Method and Program for Giving Warning in Connection with inputting of unvoiced Speech
DE2934489A1 * | Aug 25, 1979 | Mar 27, 1980 | Western Electric Co | Schaltung und Verfahren zur Sprachsignalverarbeitung (Circuit and method for speech signal processing)

Classifications
U.S. Classification: 704/203, 704/224, 704/207
International Classification: G10L19/02
Cooperative Classification: H05K999/99, G10L19/02
European Classification: G10L19/02