Publication number: US3855416 A
Publication type: Grant
Publication date: Dec 17, 1974
Filing date: Dec 1, 1972
Priority date: Dec 1, 1972
Publication number: US 3855416 A, US 3855416A, US-A-3855416, US3855416 A, US3855416A
Inventors: Fuller F
Original Assignee: Fuller F
Method and apparatus for phonation analysis leading to valid truth/lie decisions by fundamental speech-energy weighted vibratto component assessment
US 3855416 A
Abstract
A method and apparatus for indicating emotional stress in speech by detecting the presence of vibratto or rapid modulation and weighting the vibratto with a detected peak value of a preselected frequency band.
Description

United States Patent
Fuller
Dec. 17, 1974

[54] METHOD AND APPARATUS FOR PHONATION ANALYSIS LEADING TO VALID TRUTH/LIE DECISIONS BY FUNDAMENTAL SPEECH-ENERGY WEIGHTED VIBRATTO COMPONENT ASSESSMENT
Inventor: Fred Fuller, 4450 Park Chevy Chase, 20014
[22] Filed: Dec. 1, 1972
[21] Appl. No.: 311,422
[52] U.S. Cl.: 179/1 SA, 179/1 SP
[51] Int. Cl.: G10l 1/04
[58] Field of Search: 179/1 SA, 1 SB, 1 US, 15.55 R, 179/15.55 T; 128/206; 35/21

[56] References Cited

UNITED STATES PATENTS
3,268,661 8/1966 Coulter 179/1 SA
3,346,694 10/1967 Brady 179/1 SA
3,592,969 7/1971 Yoshino 179/1 SA

FOREIGN PATENTS OR APPLICATIONS
1,113,225 5/1968 Great Britain 179/1 SA

OTHER PUBLICATIONS
Lieberman & Michaels, Some Aspects of Fundamental Frequency & Envelope Amplitude As Related to the Emotional Content of Speech, J.A.S.A. 7/1962, pp. 922-927.

Primary Examiner: David L. Stewart
Attorney, Agent, or Firm: Fidelman, Wolffe, Leitner

[57] ABSTRACT
A method and apparatus for indicating emotional stress in speech by detecting the presence of vibratto or rapid modulation and weighting the vibratto with a detected peak value of a preselected frequency band.

14 Claims, 12 Drawing Figures

[Front-page drawing: block diagram whose legible labels are amplifier and band-pass filter (150-300 Hz), rectifier, low-pass (smoothing) filter, analog multiplier, recorder, isolating amplifier, time and amplitude discriminator, and voltage comparator.]

METHOD AND APPARATUS FOR PHONATION ANALYSIS LEADING TO VALID TRUTH/LIE DECISIONS BY FUNDAMENTAL SPEECH-ENERGY WEIGHTED VIBRATTO COMPONENT ASSESSMENT

BACKGROUND OF THE INVENTION

The present invention relates generally to voice signal analysis systems and more specifically to a method and apparatus for detecting emotional stress within a voice pattern. The presence of an emotional state will be used to determine the truthfulness of a response to questions asked by a skilled interrogator. This invention must be understood and examined in the light of my copending applications, Ser. Nos. 311,391 and 311,392. These inventions provide parts of the technology of this particular invention, which is different from and is an extension of the technology of both copending applications.

DESCRIPTION OF THE PRIOR ART

It has long been known that the voice may be, and often is, used to convey the emotions of the speaker. The emotional state of the speaker produces readily observable variation of measurable parameters of the voice.

Speech is the acoustic energy response of: (a) the voluntary motions of the vocal cords and the vocal tract which consists of the throat, the nose, the mouth, the tongue, the lips and the pharynx, and (b) the resonances of the various openings and cavities of the human head. The primary source of speech energy is excess air under pressure, contained in the lungs. This air pressure is allowed to flow out of the mouth and nose under muscular control which produces modulation. This flow is controlled or modulated by the human speaker in a variety of ways.

The major source of modulation is the vibration of the vocal cords. This vibration produces the major component of the voiced speech sounds, such as those required when producing the vowel sounds in a normal manner. These voiced sounds, formed by the buzzing action of the vocal cords, contrast with the voiceless sounds such as the letter s or the letter f produced by the nose, tongue and lips. This action of voicing is known as phonation.

The basic buzz or pitch frequency, which establishes phonation, is different for men and women. The vocal cords of a typical adult male vibrate or buzz at a frequency of about 120 Hz, whereas for women this basic rate is approximately an octave higher, near 250 Hz. The basic pitch pulses of phonation contain many harmonics and overtones of the fundamental rate in both men and women.

The vocal cords are capable of a variety of shapes and motions. During the process of simple breathing, they are involuntarily held open and during phonation, they are brought together. As air is expelled from the lungs, at the onset of phonation, the vocal cords vibrate back and forth, alternately closing and opening. Current physiological authorities hold that the muscular tension and the effective mass of the cords are varied by learned muscular action. These changes strongly influence the oscillating or vibrating system.

Certain physiologists consider that phonation is established by or governed by two different structures in the pharynx, i.e., the vocal cord muscles and a mucous membrane called the conus elasticus. These two structures are acoustically coupled together at a mutual edge within the pharynx, and cooperate to produce two different modes of vibration.

In one mode, which seems to be an emotionally stable or non-stressful timbre of voice, the conus elasticus and the vocal cord muscle vibrate as a unit in synchronism. Phonation in this mode sounds soft or mellow and few overtones are present.

In the second mode, a pitch cycle begins with a subglottal closure of the conus elasticus. This membrane is forced upward toward the coupled edge of the vocal cord muscle in a wave-like fashion, by air pressure being expelled from the lungs. When the closure reaches the coupled edge, a small puff of air explosively occurs, giving rise to the open phase of vocal cord motion. After the explosive puff of air has been released, the subglottal closure is pulled shut by a suction which results from the aspiration of air through the glottis. Shortly after this, the vocal cord muscles also close. Thus in this mode, the two masses tend to vibrate in opposite phase. The result is a relatively long closed time, alternated with short sharp air pulses which may produce numerous overtones and harmonics.

The balance of the respiratory tract and the nasal and cranial cavities give rise to a variety of resonances, known as formants in the physiology of speech. The lowest frequency formant can be approximately identified with the pharyngeal cavity, resonating as a closed pipe. The second formant arises in the mouth cavity. The third formant is often considered related to the second resonance of the pharyngeal cavity. The modes of the higher order formants are too complex to be very simply identified. The frequencies of the various formants vary greatly with the production of the various voiced sounds.

As pointed out in my copending applications, the fine structure of the fundamental pitch frequency, as well as the relative peak energy at high and low frequency regions, appears to be an acoustic correlate of emotional content, transmitted through speech. Other parameters thought to be related to the emotional transmission of information include: Phonetic Content, Gross Changes in Fundamental Frequency, Relative Energy Levels in Various Frequency Bands, and the Speech Envelope Amplitude. These parameters all contribute to the conveyance of emotion or a stressful condition existing in the speaker.

Speech analysis and the equipment for accomplishing the same have been developed for a variety of loosely related purposes. One of the primary concerns is the transmission of speech with a high order of intelligibility and presence over a very reduced bandwidth. The applicability of this particular art becomes obvious in civil and military communications. Other fields in which speech analysis equipment is used are the voice operated printing or recording device, such as a typewriter, and systems, equipment and devices commanded and controlled by the spoken word or phrase. While these activities are interesting and valuable in themselves, they do not relate to the detection of emotional content of a speech wave nor to its use to determine the veracity of the speaker.

According to the present invention, the fine structure of the basic phonation may be assessed and quantified by measurement of the amount of rapid amplitude modulation on the speech signal envelope of a spoken word and weighted by the peak amplitude in a selected frequency band. This rapid variation of the speech signal is called vibratto for the purposes of this application.

This invention discloses a means whereby the measure of vibratto in the speech envelope of a person under interrogation may be meaningfully quantified in real time, so that a Truth/Lie decision can be made. Research into the vibratto component of the speech wave has conclusively demonstrated that the amount of vibratto correlates well with stress or emotional involvement which leads to the Truth/Lie decision.

There are many ways to detect and measure the amount of vibratto in the phonation of an emotionally involved person under interrogation. Frequency fluctuation in the basic pitch frequency could be quantified with the aid of a frequency discriminator, for example. In addition, variations in the time between successive pitch pulses could be obtained by conventional zero crossing analysis.
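The zero-crossing alternative mentioned above can be illustrated with a short sketch. It is not part of the patent: it assumes a digitally sampled signal, and the function names, the 8 kHz sample rate and the test tone are hypothetical choices made only for illustration.

```python
import math

def rising_zero_crossings(samples, sample_rate):
    """Times (s) at which the signal crosses zero going upward,
    located by linear interpolation between adjacent samples."""
    times = []
    for i in range(1, len(samples)):
        a, b = samples[i - 1], samples[i]
        if a < 0.0 <= b:
            frac = -a / (b - a)  # fraction of a sample past index i-1
            times.append((i - 1 + frac) / sample_rate)
    return times

def pitch_periods(samples, sample_rate):
    """Intervals between successive rising zero crossings."""
    t = rising_zero_crossings(samples, sample_rate)
    return [t2 - t1 for t1, t2 in zip(t, t[1:])]

def period_spread(periods):
    """Largest minus smallest period: a crude measure of period variation."""
    return max(periods) - min(periods) if periods else 0.0

# Hypothetical input: a steady 120 Hz sine sampled at 8 kHz.
rate = 8000
tone = [math.sin(2 * math.pi * 120 * n / rate + 0.1) for n in range(800)]
periods = pitch_periods(tone, rate)
```

For a perfectly steady tone every interval is close to 1/120 s and the spread is near zero; a voice whose pitch pulses vary in timing would show a larger spread.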

SUMMARY OF THE INVENTION

The present invention provides a means for determining the truth and veracity of a speaker's response under interrogation by quantification of the vibratto content of his answer and weighting such quantification with the peak amplitude in a selected frequency band of his speech. The vibratto is quantified by rectification, smoothing, time and amplitude discrimination and level detection to produce a series of pulses of uniform width and amplitude. These pulses represent the vibratto content. Simultaneously in a parallel circuit, the speaker's voice peak is detected by processing the signal through a band pass filter, rectifier, smoothing filter, and peak-detection-and-hold circuit. The vibratto pulse train is weighted by multiplying it by the detected peak amplitude in an analog manner. The resulting pulse train is a series of pulses whose timing is related to the vibratto content of the speech and whose amplitude is proportional to the peak energy level in the fundamental pitch region as selected or defined by the band pass filter in the peak detector circuit.

Though using an analog multiplier to weight the two functions is simple and straightforward, other means are available for producing the desired weighted function. For example, the two signals may be recorded in a two-track chart recorder and the value of the vibratto pulse could be multiplied by the peak voltage reading by hand. The vibratto measurement could also be quantified by digital counter with the resultant number being multiplied by the voltage reading of the peak energy circuit. Similarly, instead of using a DC chart recorder, the output from the analog multiplier may be further quantified by measuring the average value with a simple DC voltmeter or with an averaging and recording instrument.
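As a rough illustration of the two alternatives just described (counting pulses and multiplying the count by the peak reading, or averaging the multiplier output), the following sketch uses made-up numbers. The pulse train, the peak voltage and the function names are assumptions for illustration only and do not come from the patent.

```python
def count_times_peak(pulse_train, peak_voltage):
    """Count rising edges in a 0/1 pulse train and scale by the held peak."""
    count = sum(1 for prev, cur in zip([0] + pulse_train, pulse_train)
                if cur and not prev)
    return count * peak_voltage

def average_value(weighted_train):
    """Mean of the multiplier output, as an averaging DC meter would read it."""
    return sum(weighted_train) / len(weighted_train)

pulses = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]  # hypothetical comparator output
peak = 2.5                                # hypothetical held peak, in volts
score = count_times_peak(pulses, peak)    # three pulses times 2.5 V
mean_v = average_value([p * peak for p in pulses])
```

The two readings are not interchangeable: the count ignores pulse width, while the average grows with both the number and the width of the pulses.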

The present invention provides a visual real-time display of a weighted vibratto content of the speech of the subject from which a trained interrogator may derive the veracity thereof by comparison with a known truthful response.

OBJECTS OF THE INVENTION

It is an object of the present invention to provide a means for detecting a stressful or emotional condition in a human being who is speaking.

An additional object of this invention is to detect this emotional or stressful condition while the person who is speaking is under direct and skillful interrogation.

A further object of this invention is to provide means whereby a valid Truth/Lie decision can be rendered by direct observations of the data readout of a voice or speech analysis system.

A still further object of this invention is to detect the emotional or stressful condition by analysis of the vibratto content of speech weighted by the peak amplitude of a selected frequency band during the same phonation utterance.

Other objects, advantages and novel features of the present invention will become apparent from the following detailed description of the invention when considered in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an oscillograph of a male voice responding with the word "yes" in the English language, in answer to a direct question, at a bandwidth of 5 kHz.

FIG. 2 is an oscillograph of a male voice responding with the word "no" in the English language, in answer to a direct question, at a bandwidth of 5 kHz.

FIGS. 3a and 3b are oscillographs of a male voice responding "yes" in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively.

FIGS. 4a and 4b are oscillographs of a male voice responding "no" in the English language as measured in the 150-300 Hz and 600-1200 Hz frequency regions, respectively.

FIGS. 5a and 5b are graphs of a "yes" response with and without emotional stress, respectively.

FIGS. 6a and 6b are graphs of a "no" response with and without emotional stress, respectively.

FIG. 7 is a block diagram of the weighted vibratto signal processing circuit.

FIG. 8 is a detailed schematic of the weighted vibratto signal processing circuit.

DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 shows an oscillograph of a male voice responding with the word "yes" in the English language in answer to a direct question at a bandwidth of 5 kHz. The wave form contains two distinct sections, the first being for the "ye" sound and the second being for the unvoiced "s" sound. Since the first section of the "yes" signal wave form is a voiced sound produced primarily by the vocal cords and conus elasticus, this portion will be processed to detect emotional stress content or vibratto modulation. The male voice responding with the word "no" in the English language at a bandwidth of 5 kHz is shown in FIG. 2. This response has a single voiced section which will be analyzed by the present device to detect the presence of the vibratto or rapid modulation of the phonation constituent of the speech signal.

FIGS. 3 and 4 show oscillographs of the same male voice as in FIGS. 1 and 2 responding "yes" and "no", respectively, in the English language as measured in the 150-300 Hz frequency region. This spectral region contains a great deal of the fundamental energy of phonation. When this band of frequencies is rectified and smoothed, the peak amplitude of the phonation will make an ideal weighting function.

FIG. 5a is a drawn replica of a portion of the response "yes", delivered under emotional stress. The rapid modulation or vibratto pulses can be seen extending above and below the normal envelope. These additional excursions occur as the result of non-symmetric action between the vocal cords and the conus elasticus. The basic repetition period of this male voice is about 8.3 milliseconds.

FIG. 5b is a drawn replica of a portion of a male voice responding "yes", delivered under conditions of no emotional stress. The smooth regular features of the pitch pulses can be easily seen.

FIG. 6a is a drawn replica of a portion of the same male voice responding "no" under a condition of emotional stress. The vibratto modulations appear as distortions near the axis of averages, and as excessively high peaks in the positive direction. This non-regularity is the result of interaction in the pharynx between the vocal cords and the conus elasticus leading to an explosive type of formant excitation.

FIG. 6b is a drawn replica of a portion of the same male voice answering "no" to a non-stress question. The smoothness and regularity of the response can be readily seen.

Thus it is an object of the present invention to isolate the rapid modulation of the phonation constituent of the speech signal envelope in order to detect the presence of emotional stress in the speaker.

The components for achieving the stated objectives are shown in block diagram form in FIG. 7. As in the previous copending applications, a high fidelity acoustic transducer 2 is used to bring the transduced electrical energy of the phonation into the system. The electrical signal divides into two channels. The upper channel is concerned with the speech processing of the Fundamental Pitch Frequency, and the lower channel processes the Vibratto Component of the voice.

In the upper channel, an amplifier 4 serves to increase the energy of the signal from the transducer and to isolate the following circuits from the output impedance of the transducer. A band-pass filter 6 follows, which is selected or adjusted to accept only the fundamental pitch frequency of the voice. This is commonly found to be in the vicinity of 120 to 150 Hz for men and about 250 Hz for women, and thus a filter with a pass band of 150-300 Hz may be selected. The signal at the output of band-pass filter 6 is tied to the energy detection means, which comprises a rectifier 8 followed by a smoothing or low pass filter 10. By this pair of circuits, a voltage which is representative of the envelope of the phonation is produced. This envelope is then peak detected and held or stored in device 12 until all further processing has been accomplished. The circuit is then reset, either manually or automatically, as desired by the operator of the machine.
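The upper-channel chain described above (band-pass, rectify, smooth, peak hold) can be approximated digitally. The sketch below is only an illustration under stated assumptions: the patent describes analog circuits, so the second-order digital band-pass, the 20 ms smoothing time constant, the 8 kHz sample rate and all function names are stand-ins chosen here, not values from the patent.

```python
import math

def bandpass(samples, rate, low_hz=150.0, high_hz=300.0):
    """Second-order digital band-pass (a stand-in for band-pass filter 6)."""
    f0 = math.sqrt(low_hz * high_hz)        # geometric centre frequency
    q = f0 / (high_hz - low_hz)
    w0 = 2.0 * math.pi * f0 / rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1, y2, y1 = x1, x, y1, y
    return out

def half_wave_rectify(samples):
    """Pass one polarity only, like a single series diode (rectifier 8)."""
    return [s if s > 0.0 else 0.0 for s in samples]

def smooth(samples, rate, time_constant_s):
    """One-pole low-pass, analogous to an RC smoothing filter (filter 10)."""
    k = 1.0 - math.exp(-1.0 / (rate * time_constant_s))
    out, y = [], 0.0
    for s in samples:
        y += k * (s - y)
        out.append(y)
    return out

def fundamental_peak(samples, rate, time_constant_s=0.02):
    """Largest value reached by the smoothed in-band envelope (device 12)."""
    envelope = smooth(half_wave_rectify(bandpass(samples, rate)),
                      rate, time_constant_s)
    return max(envelope) if envelope else 0.0

# Hypothetical test tones: one inside the pass band, one well outside it.
rate = 8000
in_band = [math.sin(2 * math.pi * 200 * n / rate) for n in range(2000)]
out_of_band = [math.sin(2 * math.pi * 2000 * n / rate) for n in range(2000)]
peak_in = fundamental_peak(in_band, rate)
peak_out = fundamental_peak(out_of_band, rate)
```

With these assumed settings the 200 Hz tone yields a much larger held peak than the 2000 Hz tone, which is the selectivity the weighting function relies on.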

In the lower channel, the electrical signal from the transducer 2 is amplified and isolated by amplifier 14. The energy is again rectified by rectifier 16, isolated and amplified by amplifier 17 and smoothed by filter 18 to form an envelope of the signal. At this point the bandwidth of the signal in the lower channel is unrestricted. The smoothing filter 18 must have its characteristics adjusted accordingly. Following this filter, a stage of isolation amplification 20 is used to separate the processing of the smoothing filter from the time and amplitude discriminator 22. This latter circuit acts to compare the amplitude of the signal envelope with the time derivative of the signal envelope.

The output of the time and amplitude discriminator 22 is a zero based burst of pulses of a single polarity. These pulses will have a high proportion of the vibratto component. The pulses, which are of varying amplitude and width, are fed into a voltage comparator 24, a module of conventional design and commercially available. The comparator compares the incoming wavetrain with a DC voltage from the potentiometer 26. The only pulses that are allowed to emerge from the comparator are those that are larger in voltage value than the set voltage of the potentiometer. This potentiometer is set by the operator to a value which is somewhat above the baseline. It has been demonstrated experimentally that the Vibratto Component usually consists of high sharp pulses, so these are enhanced with a high setting of the potentiometer. The output of the comparator is a series of pulses, predominantly consisting of the Vibratto Component, all of constant amplitude, one for each pulse exceeding the preset level.

The detected and held envelope peak enters input port 27 of the analog multiplier 28, while the series of vibratto pulses enters input port 29. These two waveforms are multiplied together in the mathematical manner known as weighting. The result will be a series of pulses of increased amplitude at a period defined by the vibratto pulse train at input port 29 that will be recorded in the DC recorder 30 for observation and analysis.
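The comparator and multiplier steps described above reduce to a threshold test followed by a product. A minimal sketch, assuming sampled voltages; the sample values, the 0.5 V threshold and the 2.0 V held peak are invented for illustration, and the pulses here have constant amplitude but not the uniform width mentioned in the summary.

```python
def comparator(discriminator_out, threshold):
    """1.0 wherever the input exceeds the threshold, otherwise 0.0."""
    return [1.0 if v > threshold else 0.0 for v in discriminator_out]

def multiply(pulse_train, held_peak):
    """Weight each comparator pulse by the held envelope peak."""
    return [p * held_peak for p in pulse_train]

# Hypothetical discriminator output, in volts.
raw = [0.0, 0.2, 0.9, 0.3, 0.0, 0.1, 1.4, 1.1, 0.0]
pulses = comparator(raw, threshold=0.5)
weighted = multiply(pulses, held_peak=2.0)
```

The weighted train keeps the spacing of the comparator pulses, while its height is set entirely by the held peak, matching the description of the recorder input.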

A more detailed explanation of the preferred embodiment of the invention is shown schematically in FIG. 8.

The acoustic phonation of the subject under interrogation enters the system of instrumentation in one of two ways. It may, at the discretion of the operator, enter the system directly (in real time) from the transducer 50 or from tape recorder 54 and its transducer 52. These transducers are high fidelity microphone types of devices that reproduce the electrical analog of the acoustic signal, with a minimum of frequency and amplitude distortion. Switch means 56 is used to select either the microphone directly or the tape recorder output. As in FIG. 7, the upper channel in FIG. 8 pertains to the envelope processing and peak detection of a selected frequency band and the lower channel pertains to the envelope processing and vibratto quantification.

The switch means 56 is followed in the upper channel by an operational amplifier 58 with its gain determining resistors 60 and 62. The broad band speech signal from the transducer and switch enters the band pass filter 64 and the spectral region typically from 150-300 Hz is allowed to pass. This exact region may be changed, depending upon the voice of the subject involved. The region of 150-300 Hz is a good compromise and was employed in the preferred embodiment. FIGS. 3a and 4a show the waveforms of a male human voice responding with the words "yes" and "no" in English when the bandwidth is limited to 150-300 Hz.

Another stage of isolation, employing an operational amplifier 66, with its gain determining resistors 68 and 70, is used to isolate the band-pass filter from the converter of the dual polarity signals. In the preferred embodiment of the invention, the conversion of the dual polarity signals out of the isolation amplifier is performed by a simple solid state diode 72 acting as a rectifier. A reversed polarity diode would function nearly as well. In addition, full wave or bridge rectifiers could also be used with attendant increase in signal level. If diodes with a particular characteristic, such as Square Law, were employed, the instrument would operate upon a true measure of the power in the speech signal. A further operational amplifier 74 with its gain setting resistors 76 and 78 serves to isolate the rectifier circuit from the rest of the instrument. A smoothing filter 79 follows this amplifier.

The smoothing filter consists of a low-pass filter network that is comprised of a variable resistor 80 and a fixed capacitor 82. It can be seen that other types of low-pass filters could be employed here, to remove the higher frequency fluctuations of the rectified signal, thus rendering the output of the smoothing filter essentially that of the envelope of the speech wave, in the defined pass-band. The exact time constant of this filter is adjusted depending upon the pitch of the voice under assessment. The smoothed envelope is amplified and isolated by operational amplifier 84 with its gain determining resistors 86 and 88. This envelope of speech energy then passes into a peak detect and hold circuit 90 of conventional design. Such circuits can be readily fabricated from a variety of components and modules by those skilled in the art. It is preferred to use a single circuit to perform this task such as a single module named Infinite Sample Hold, manufactured by Hybrid Systems Corp. This particular module has the advantage of preventing decay of the peak value until the circuit is reset by a signal on input lead 94 from the control means 96 which may be a simple switch. The output peak value appears on lead 92 and is applied to one channel of the analog multiplier 98. There are many ways that two voltages could be multiplied together, thereby weighting or modulating one voltage with the other. This process could be performed manually, by obtaining the individual values of the voltages involved. It is preferred, however, to employ one of the modern analog computer modules that have become available such as Model 107C analog multiplier/divider, manufactured by Hybrid Systems Corp.
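The peak detect and hold behaviour described in this passage, where the stored peak does not decay until a reset signal arrives, can be sketched as a small class. This is an illustrative software analogue with hypothetical names and input values; it does not model the Hybrid Systems module itself.

```python
class PeakHold:
    """Tracks the largest value seen since the last reset, without decay."""

    def __init__(self):
        self.value = 0.0

    def update(self, sample):
        # Only a larger sample replaces the stored peak.
        if sample > self.value:
            self.value = sample
        return self.value

    def reset(self):
        # Corresponds to the operator's reset signal before a new question.
        self.value = 0.0

hold = PeakHold()
trace = [hold.update(v) for v in [0.1, 0.4, 0.3, 0.7, 0.2]]  # made-up envelope
held = hold.value
hold.reset()
```

After the five samples the stored value stays at the largest one seen, and the reset returns the circuit to its starting state for the next utterance.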

The lower channel that processes the speech or phonation energy to obtain the Vibratto components functions in the following manner. The electrical analog of the phonation enters the channel through an isolating operational amplifier 103 with its gain determining resistors 104 and 106. This amplifier is used to provide isolation and linear amplification of the signal with no frequency discrimination. This isolated and amplified signal is applied to a diode 108 where one polarity of the speech signal is allowed to pass into the following circuitry. A diode connected in the opposite polarity could be used nearly as well. A full wave rectification circuit or a bridge rectification circuit (not shown) could be used as well with a small additional complication of the circuit.

The electrical energy out of the diode, at the input of the following circuitry, is therefore predominantly and primarily of one polarity. Operational amplifier 110, with its gain determining resistances 112 and 114, is used to isolate the diode circuit from the follow-on circuitry. The follow-on circuitry consists of a smoothing filter 116 in the form of an R/C Integrator having a variable resistor 117 and a fixed capacitor 118. It can be seen, by those versed in the art, that a variety of different active and passive smoothing filters could be used to remove the high frequency energy of the phonation and to extract a signal which is representative of the envelope of the speech wave and yet retains the Vibratto Component. The R/C Integrator that is used in the present preferred embodiment functions quite well and is simple to employ. The time constant is variable to afford adjustment for different voices of various fundamental frequencies.
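For a simple R/C integrator of this kind, the time constant is the product RC and the corresponding low-pass corner frequency is 1/(2πRC), so turning the variable resistor moves the corner. The patent gives no component values; the 1 µF capacitor and 50 Hz target below are hypothetical numbers used only to show the arithmetic.

```python
import math

def time_constant_s(r_ohms, c_farads):
    """RC time constant in seconds."""
    return r_ohms * c_farads

def cutoff_hz(r_ohms, c_farads):
    """Corner (-3 dB) frequency of a single-pole RC low-pass."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def resistance_for_cutoff(target_hz, c_farads):
    """Resistor setting that places the corner at target_hz."""
    return 1.0 / (2.0 * math.pi * target_hz * c_farads)

C = 1e-6                              # assumed fixed capacitor, 1 microfarad
r = resistance_for_cutoff(50.0, C)    # about 3.18 kilohms
tau = time_constant_s(r, C)           # about 3.18 milliseconds
```

A corner set well below the pitch frequency removes the individual pitch pulses, while a corner set above the modulation rate of interest lets that modulation through; the right setting therefore depends on the voice, as the text notes.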

The R/C Integrator is followed by another operational amplifier 120 with its gain determining resistors 122 and 124. This operational amplifier isolates the processing of the R/C Integrator from the subsequent circuit action. Following the isolation amplifier 120 is a special discriminator circuit that processes the time derivative of the incoming speech wave and the amplitude of the speech envelope at the same time. The special differentiator circuit involves a fixed capacitor 126 and a variable resistor 128. These two components perform the time differentiation function. The potentiometer 130 provides a measure of the undifferentiated signal envelope which is used to null out residual envelope energy. This component, connected as it is, performs the envelope amplitude discrimination function. An operational amplifier 132, with its gain and performance determining resistances 134, 136 and 138, accepts the time derivative signal and the amplitude discrimination signal and provides effective base line restoration for most types of phonation.
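One possible reading of this discriminator, offered only as an assumption, is that the output is a scaled time derivative of the envelope minus a fraction of the envelope itself, clipped at zero to give single-polarity pulses on a zero baseline. The gain, the nulling fraction, the sample rate and the test envelopes below are all hypothetical.

```python
def time_amplitude_discriminator(envelope, rate, deriv_gain_s, null_fraction):
    """Scaled derivative minus a fraction of the envelope, clipped at zero.

    deriv_gain_s plays the role of the differentiator's RC product and
    null_fraction the role of the nulling potentiometer (assumed mapping).
    """
    out = []
    prev = envelope[0] if envelope else 0.0
    for e in envelope:
        derivative = (e - prev) * rate          # volts per second
        v = deriv_gain_s * derivative - null_fraction * e
        out.append(v if v > 0.0 else 0.0)       # single polarity, zero baseline
        prev = e
    return out

rate = 1000                                     # assumed samples per second
smooth_env = [1.0] * 100                        # steady envelope, no vibratto
spiky_env = list(smooth_env)
spiky_env[50] = 1.5                             # one sharp excursion
quiet = time_amplitude_discriminator(smooth_env, rate, 0.005, 0.2)
burst = time_amplitude_discriminator(spiky_env, rate, 0.005, 0.2)
```

Under this reading a steady envelope produces no output at all, while a sharp upward excursion produces a single pulse at the moment it occurs.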

Base line restoration can be accomplished in a variety of other ways as well, such as by various types of clamping and DC restoration circuits. Irrespective of the circuit used, the output of the amplifier 132 becomes a series of pulses with a defined base line that contains the Vibratto Component and thus comprises the variable fine structure of the phonation. These pulses are of varying amplitude and width.

The output pulses from the time and amplitude discriminator are applied to a comparator circuit 140 which determines the level of statistically significant pulses. This level is determined by a knowledgeable operator of the equipment, and is controlled by adjustment of potentiometer 142. This control is shown to function at either a positive or a negative voltage level. When the polarity of the diode 108 is selected, the comparator voltage level must be adjusted to the polarity that will select either excess positive or excess negative peaks. The potentiometer 142 may also be set at "0" volts, at which time the circuit becomes a conventional zero-crossing detector. It has been found that the statistical significance of the Truth/Lie Decision process will improve if a level of voltage off the "0" baseline is selected for the comparator switching level. This comparator may be a simple diode circuit or it may be a Schmitt trigger circuit, each with suitable voltage supplies, passive and active components. However, for simplicity and economy, a differential voltage comparator such as the Motorola MC1710 was used. The output of voltage comparator 140 is a series of equiamplitude pulses, one for each input above the selected level. This pulse train enters the analog multiplier 98, where it is weighted by the output voltage level of the peak detect and hold module 90. The output pulse train thus has an amplitude varied by the peak detected signal and the spacing of the vibratto pulse train.

The algebraic analog product of these two voltages emerges from the multiplier module on lead 100 and enters a DC chart recorder 146. The responses of each utterance of phonation from the subject under interrogation are thereby recorded for comparison and analysis. The chart recorder is also under the control (on cable 148) of the control means 96. By this means, prior to the act of asking a particular question, the operator/interrogator erases or resets the sample and hold voltage level from module 90 and primes the chart recorder. Immediately after the question is asked, the circuits are all enabled by the control means so that the phonation of the subject may be processed by the instrumentation.

Although the invention has been described and illustrated in detail, it is to be clearly understood that the same is by way of illustration and example only and is not to be taken by way of limitation, the spirit and scope of the invention being limited only by the terms of the appended claims.

What is claimed is:

1. A method for detecting emotional stress in the utterance of an individual comprising:

converting said utterance to an electrical signal;

selecting a frequency band of said electrical signals;

detecting the peak amplitude of said selected frequency band;

simultaneously with said frequency selecting and peak detecting, smoothing said electrical signals to form an envelope, detecting rapid aperiodic amplitude modulation on said envelope, and selecting detected modulation exceeding a preselected amplitude;

weighting said selected modulation with said detected peak amplitude; and

displaying said weighted signal which is indicative of emotional stress.

2. A method as in claim 1 including rectifying and smoothing said selected frequency band before detecting

3. A method as in claim 1 wherein detecting rapid modulation includes time and amplitude discriminating said smoothed signal.

4. A method as in claim 3 wherein said smoothing comprises integrating and said time and amplitude discriminating includes differentiating and D.C. base line restoration.

5. A method as in claim 1 including holding said detected peak amplitude for the duration of said utterance and resetting said held peak amplitude upon termination of said utterance.

6. A device for indicating emotional stress from the utterances of a human comprising:

means for converting said utterances into electrical signals;

means connected to said converting means for passing a frequency band of said electrical signals;

means connected to said band-passing means for detecting the peak amplitude of said passed electrical signals;

means connected to said converting means for shaping said electrical signals into an envelope;

means connected to said shaping means for detecting rapid aperiodic amplitude modulation on said envelope;

means connected to said peak detecting means and said modulation detecting means for weighting said detected modulation with said detected peak amplitude; and

means connected to said weighting means for indicating said weighted signal.

7. A device as in claim 6 including rectifying means and smoothing means connected between said band-passing means and said peak detecting means.

8. A device as in claim 7 wherein said band-passing means has a band of to 300 Hz.

9. A device as in claim 7 wherein said peak detecting means also hold said detected peak until reset.

10. A device as in claim 6 including means connected between said modulation detecting means and said weighting means for comparing said detected modulation to a selected level.

11. A device as in claim 10 wherein said modulation detecting means includes differentiating means producing a series of varying amplitude pulses and said comparing means producing a series of uniform amplitude pulses for each varying amplitude pulse above said selected level.

12. A device as in claim 11 wherein said weighting means produces a series of pulses whose spacing is that of said comparing means pulses and whose amplitude is proportional to said detected peak.

13. A device as in claim 10 wherein said shaping means includes rectifying means and integrating means.

14. A device as in claim 6 wherein said detecting means include differentiation means and baseline restoration means.

Patent Citations
Cited Patent    Filing date     Publication date    Applicant                         Title
US3268661 *     Apr 9, 1962     Aug 23, 1966        Melpar Inc                        System for determining consonant formant loci
US3346694 *     Jun 2, 1965     Oct 10, 1967        Bell Telephone Labor Inc          Speech level measuring apparatus
US3592969 *     Jul 22, 1969    Jul 13, 1971        Matsushita Electric Ind Co Ltd    Speech analyzing apparatus
GB1113225A *                                                                          Title not available
Non-Patent Citations
Reference
1 * Lieberman & Michaels, Some Aspects of Fundamental Frequency & Envelope Amplitude As Related to the Emotional Content of Speech, J.A.S.A. 7/1962, pp. 922-927.
Classifications
U.S. Classification: 704/272, 704/270, 704/E17.2
International Classification: G10L17/00
Cooperative Classification: G10L17/26
European Classification: G10L17/26