Publication number: US3916105 A
Publication type: Grant
Publication date: Oct 28, 1975
Filing date: Feb 28, 1974
Priority date: Dec 4, 1972
Inventors: William R. McCray
Original Assignee: IBM
Pitch peak detection using linear prediction

United States Patent [19]
McCray
[11] 3,916,105
Oct. 28, 1975

PITCH PEAK DETECTION USING LINEAR PREDICTION

[75] Inventor: William R. McCray, Lexington, Ky.
[73] Assignee: International Business Machines Corporation, Armonk, N.Y.
[22] Filed: Feb. 28, 1974
[21] Appl. No.: 446,847

Related U.S. Application Data
[63] Continuation-in-part of Ser. No. 312,063, Dec. 4, 1972, abandoned.

[52] U.S. Cl.: 179/1 SD
[51] Int. Cl.: G01L 1/04
[58] Field of Search: 179/1 SA, 1 SD, 1 SC

[56] References Cited
UNITED STATES PATENTS
3,624,302 11/1971 Atal 179/1 SA
3,631,520 12/1971 Atal 179/1 SA

Primary Examiner: Kathleen H. Claffy
Assistant Examiner: E. S. Kemeny
Attorney, Agent, or Firm: D. Kendell Cooper

[57] ABSTRACT
The application of linear prediction techniques to speech analysis is well covered by the papers referred to below. This case describes a technique to determine the presence or absence of voicing in a digitized speech signal and to locate the glottal impulse positions in that signal when voicing is present.

20 Claims, 4 Drawing Figures

[Sheet 1 of 3: FIG. 1 (prediction interval analyzer and prediction weight generator) and FIG. 2 (block diagram of the system)]

[Sheet 2 of 3: FIG. 3 (flow chart). Legend: P_L = period of lowest acceptable pitch; Pk = error value at a peak of the error signal; L = location of Pk; p = length of previous pitch period; b, c = constants]

[Sheet 3 of 3: FIG. 4 (detailed system)]

PITCH PEAK DETECTION USING LINEAR PREDICTION

This is a continuation-in-part of application Ser. No. 312,063, filed Dec. 4, 1972, now abandoned.

REFERENCES OF INTEREST

B. S. Atal and M. R. Schroeder, Adaptive Predictive Coding of Speech Signals, Bell System Technical Journal, 49, 1973-1986 (1970).

B. S. Atal, Characterization of Speech Signals by Linear Prediction of the Speech Wave, Proc. IEEE Symposium on Feature Extraction and Selection in Pattern Recognition, Argonne, Ill. (Oct. 1970), pp. 202-209.

B. S. Atal and Suzanne L. Hanauer, Speech Analysis and Synthesis by Linear Prediction of the Speech Wave, Journal of the Acoustical Society of America, Vol. 50, Number 2 (Part 2), pp. 637-655 (1971).

U.S. Pat. No. 3,631,520, Predictive Coding of Speech Signals, B. S. Atal.

U.S. Pat. No. 3,624,302, Speech Analysis and Synthesis by the Use of the Linear Prediction of a Speech Wave, B. S. Atal.

BACKGROUND OF THE INVENTION AND PRIOR ART

SUMMARY OF THE INVENTION

A region of consistently-voiced speech is characterized by having pitch periods of approximately equal length. Thus, such a region may be discovered by locating a pattern of regularly spaced, large prediction errors, and within such a region it is only necessary to compare the length of the next pitch period with the length of the previous pitch period to determine if consistent voicing has ceased or continues.

OBJECTS

Accordingly, the prime object of the present invention is to provide a speech analysis system based on linear prediction and having improved efficiency.

The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of the preferred embodiment of the invention as illustrated in the accompanying drawings.

DRAWINGS

In the drawings:

FIG. 1 associates the prediction weight generator, previously taught by Atal, with a prediction interval analyzer that is significant in practicing the present invention.

FIG. 2 is a block diagram of a system incorporating the speech analysis techniques of the present invention.

FIG. 3 is a flow chart related to the system of FIG. 2.

FIG. 4 is a detailed representation of the system.

DETAILED DESCRIPTION

FIG. 1 is a simplified diagram of a system incorporating the inventive techniques taught herein. Sampled speech is considered to be available on line 1 for input to a prediction interval analyzer 2 and a prediction weight generator 3. Data and control are symbolized by line 4, with analyzed speech output on line 5.

Block 2 of FIG. 1 is particularly expanded upon in FIG. 2. As indicated, consistently-voiced speech is characterized by having pitch periods of approximately equal length. In accordance with the present technique, the length of a succeeding pitch period is compared with the length of the previous pitch period to determine if consistent voicing has ceased or continues.

For the sake of consistency, the blocks in the flow chart of FIG. 3 are designated with letters in parentheses (A) through (S), and where possible, corresponding letters in parentheses are incorporated in the blocks of FIG. 2. Thus, the hardware represented by block 7, FIG. 2, represents the decision block (A) in FIG. 3.

Other blocks shown in FIG. 2 include an error signal generator 8, a speech storage register 9, a next pitch detector 10, a prediction weight and error signal generator 11, a pitch pattern detector 12, an error storage register 13, and a voiced-unvoiced memory 14. The various blocks in the flow chart of FIG. 3 are designated 20-37.

Considering FIG. 2 first, the status of the voiced-unvoiced memory 14 is checked by block 7 to determine the character of the previous voice segment. If the segment is voiced, then a decision is made to generate an error signal by generator circuit 8. Errors are stored in the error storage register 13. An input from generator 3 is provided at terminal 15 indicative of the weights calculated for the previous pitch period. An error signal is generated for a predetermined bp seconds. This is stored in register 13 and serves as an input by line 16 to block 10. Block 10, related to blocks 22 and 23 in FIG. 3, determines two maxima and makes a decision, as will be discussed in connection with block 23, FIG. 3, as to whether voicing has ended or a change in pitch occurred. If this has not occurred, a set of predictor weights is calculated and an output record written as determined by a control signal on line 17.

If the end of voicing or a change in pitch has occurred, then the routine proceeds by control on line 18 to block 11. It is noted that if an unvoiced segment was determined by block 7, then a control signal so indicates on line 19 directly to block 11 for processing of the speech segment.

In any case, additional determinations are made by the pitch pattern detector block 12 corresponding to blocks 26-37, FIG. 3. This primarily has to do with the detection of a speech pattern and the control of memory 14 to a voiced or unvoiced state. Various output situations are represented by control on line 6 which indicates an interval of speech weighted in order to be written as an output record.

FLOW CHART OF FIG. 3

As indicated, FIG. 3 illustrates a flow chart for carrying out the present invention. A decision is made at block 20 as to whether the previous segment was voiced or unvoiced. If voiced, the routine proceeds to block 21, if not, it proceeds to block 25.

BLOCK 21

Using the predictor weights calculated for the previous pitch period and beginning at the end of that period, the speech waveform is predicted and the error signal generated for bp seconds, where p is the previous pitch period length and b determines the partial period to be examined beyond the next expected pitch period ending. Obviously, b must be between 1.0 and 2.0, so that the expected time of the next error signal peak is included in the interval, but so that the second succeeding peak is excluded.

BLOCK 22

The peaks (local maxima) of the error signal are scanned out to bp seconds and two maxima are obtained: the maximum peak within a small region around p seconds (the in-region peak) and the maximum peak outside this small region (the out-of-region peak).

BLOCK 23

If the in-region peak does not exceed the out-of-region peak by a significant amount (that is, the in-region peak is less than c times the out-of-region peak, where c is a constant greater than 1.0), either voicing has ended or a significant change in pitch has occurred. In either case, the region of consistent voicing has ended, and the procedure must be abandoned. Block 25 is executed next.

Otherwise, the location of the in-region peak is taken as the end of the pitch period. A set of predictor weights is calculated over the period between the two pitch period endings, and an output record written, block 24. The process is then repeated from block 20.
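The continuation test of blocks 21 through 23 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `voicing_continues`, the `half_width` parameter that stands in for the "small region" around p, and the default values of b and c are all hypothetical, times are measured in samples rather than seconds, and every sample of the error magnitude is treated as a candidate peak rather than only the local maxima.

```python
def voicing_continues(error, p, b=1.5, c=2.0, half_width=2):
    """One pass of the continuation test (blocks 21-23), as a sketch.

    error      -- prediction-error magnitudes; index 0 is the end of the
                  previous pitch period
    p          -- previous pitch period length, in samples
    b, c       -- constants with 1.0 < b < 2.0 and c > 1.0 (defaults assumed)
    half_width -- assumed half-width, in samples, of the region around p

    Returns (True, location) if consistent voicing continues, where
    location is the new pitch period ending, else (False, None).
    """
    if not (1.0 < b < 2.0) or c <= 1.0:
        raise ValueError("need 1.0 < b < 2.0 and c > 1.0")
    span = min(len(error), int(b * p))          # examine bp samples
    lo = max(0, p - half_width)
    hi = min(span, p + half_width + 1)
    if lo >= hi:                                # region around p not covered
        return False, None
    # Largest error inside the region around p, and its location.
    loc_in = max(range(lo, hi), key=lambda i: error[i])
    pk_in = error[loc_in]
    # Largest error anywhere else in the examined span.
    pk_out = max((error[i] for i in range(span) if i < lo or i >= hi),
                 default=0.0)
    if pk_in < c * pk_out:                      # block 23: voicing ended
        return False, None
    return True, loc_in
```

With a flat error signal that has one large value near sample p, the function reports that voicing continues and returns that sample as the period ending; a second comparably large value elsewhere in the span makes it report that consistent voicing has ended.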

When a consistently-voiced region of speech occurs, it contains a significant number of pitch periods (more than three). This fact is utilized in discovering the beginning of such a region. The error signal is scanned for a sufficient time in an attempt to discover four error peaks with nearly constant spacing between them. The following steps are taken. In this discussion, P_L and P_H are the periods of the lowest and highest pitches of interest.

BLOCK 25

Predictor weights are calculated, the speech waveform is predicted, and the error signal is generated over 4P_L, where P_L is the period of the lowest pitch of interest.

BLOCK 26

The peaks of the error signal are scanned beginning P into the region of the waveform being analyzed. The first four peaks encountered are collected.

BLOCKS 30 AND 33

If the first collected peak is found beyond P_L, consistent voicing has not been found in this region of speech. A set of predictor weights is calculated over a period equal to P_L/2 and an output record written, at block 28, after setting memory 14 to an unvoiced state at block 31. The process is then repeated from block 20.

Otherwise, the four collected peaks are analyzed to determine if a pitch pattern exists, block 33. If the periods between adjacent peaks are approximately equal and each is not less than P_H (the period of the highest pitch of interest), such a pattern has been found. The collected peaks are assumed to be pitch period endings, and a region of consistent voicing has been found beginning at the first peak. A set of predictor weights is calculated up to the location of the first peak, and an output record written, at block 37, after setting memory 14 to a voiced state at block 34. Block 20 is executed next.
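The pattern test applied to four collected peaks can be illustrated as follows. This is a hedged sketch rather than the method as claimed: the function name `is_pitch_pattern`, the parameter names `p_low` and `p_high` (the periods of the lowest and highest pitches of interest), and the relative tolerance `tol` used for "approximately equal" are assumptions. The upper bound on each interval is taken from the FIG. 4 description rather than from the flow chart text.

```python
def is_pitch_pattern(times, p_low, p_high, tol=0.15):
    """Decide whether four peak times form a pitch pattern (sketch).

    times  -- four peak locations in chronological order, in samples
    p_low  -- period of the lowest pitch of interest (longest period)
    p_high -- period of the highest pitch of interest (shortest period)
    tol    -- assumed relative tolerance for "approximately equal"
    """
    if len(times) != 4:
        raise ValueError("exactly four peak times are expected")
    # Block 30: the first peak must not lie beyond the longest period.
    if times[0] > p_low:
        return False
    # Three intervals between adjacent peaks.
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    # Each interval must be a plausible pitch period.
    if any(d < p_high or d > p_low for d in intervals):
        return False
    # Approximate equality: the spread is small relative to the shortest.
    return max(intervals) - min(intervals) <= tol * min(intervals)
```

For example, peak times of 30, 110, 190 and 271 samples with `p_low=160` and `p_high=20` give intervals of 80, 80 and 81 samples and pass the test, while 30, 110, 150 and 271 do not.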

BLOCK 36

If a pitch pattern is not found, the smallest of the four collected peaks is discarded, block 36.

BLOCK 35

The error signal is scanned from the location of the most recently found peak to find the next error peak.

BLOCK 32

If the end of predicted speech (4P_L) is found prior to the next error peak, this region of speech does not contain consistent voicing. Blocks 31 and 28 are executed next.

BLOCK 29

If a peak is found prior to the end of predicted speech, it is compared to the value of the peak discarded in block 36. If the new one is not larger, it is rejected and block 35 is repeated.

BLOCK 27

If the new peak is larger than the one discarded in block 36, it is taken as the new fourth peak, block 27. The pitch pattern recognition process is then repeated from block 30.
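The search loop formed by blocks 26 through 36 can be sketched as below. The sketch assumes the error peaks inside the predicted stretch of speech have already been extracted as chronologically ordered `(time, value)` pairs; the names `find_voicing_onset` and `evenly_spaced`, and the simple spacing test used as a stand-in for the pattern check of block 33, are hypothetical and chosen only for illustration.

```python
def evenly_spaced(times, tol=2):
    """Assumed stand-in for block 33: adjacent gaps differ by at most tol."""
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return max(gaps) - min(gaps) <= tol


def find_voicing_onset(peaks, p_low, pattern_ok=evenly_spaced):
    """Search for the start of consistent voicing (sketch of blocks 26-36).

    peaks      -- chronologically ordered (time, value) error peaks found
                  within the predicted stretch of speech
    p_low      -- period of the lowest pitch of interest, in samples
    pattern_ok -- callable deciding whether four peak times form a pattern

    Returns the time of the first peak of the pattern, or None when the
    region holds no consistent voicing.
    """
    if len(peaks) < 4:
        return None                       # block 32: ran out of speech
    collected = list(peaks[:4])           # block 26: first four peaks
    next_index = 4
    while True:
        times = [t for t, _ in collected]
        if times[0] > p_low:              # block 30: first peak too late
            return None
        if pattern_ok(times):             # block 33: pattern found
            return times[0]
        smallest = min(collected, key=lambda pk: pk[1])
        collected.remove(smallest)        # block 36: discard smallest peak
        rejected_value = smallest[1]
        # Blocks 35 and 29: skip peaks not larger than the rejected one.
        while next_index < len(peaks) and peaks[next_index][1] <= rejected_value:
            next_index += 1
        if next_index == len(peaks):      # block 32: end of predicted speech
            return None
        collected.append(peaks[next_index])   # block 27: new fourth peak
        next_index += 1
```

Because the collected peaks stay in chronological order and each pass either returns or consumes at least one further peak, the loop always terminates.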

DETAILED SYSTEM, FIG. 4

A detailed implementation of the system is illustrated in FIG. 4.

Control Network 50 controls the timing and sequence of operation of the other portions of the system. Its outputs to the other blocks are represented by cable 51 to avoid undue complication of the diagram.

Sampled speech is inputted on line 52 and is stored in Speech storage 65 until its processing by the system is complete.

Voiced-Unvoiced storage 53 indicates whether the previously analyzed segment of speech was voiced or unvoiced. To begin a cycle of operation, block 50 determines from block 53 by line 54 the status of the previous segment.

If the previous segment was voiced, Control Network 50 obtains the length of that segment p from Segment Length Storage 56 on line 57. This is multiplied by a constant factor b (between 1.0 and 2.0) to determine the length of speech bp to be evaluated in deciding whether voicing continues.

The prediction weights of the previous segment are moved from storage block 60 to Adaptive Predictor 61 on line 62. This predictor, as described by Atal, uses these weights and the speech samples from Speech storage 65 on line 66 to predict subsequent speech samples.

Predictor 61 will operate on an interval of speech of length bp producing predicted speech samples which are conducted to Subtraction Network 68 by line 69. Here the original speech samples from line 66 are subtracted from the predicted samples to produce the prediction error samples on line 70.
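The roles of Adaptive Predictor 61 and Subtraction Network 68 can be illustrated with a short sketch. It assumes the usual linear-prediction form, in which each sample is predicted as a weighted sum of the preceding actual samples; the function name `prediction_error` and its parameters are hypothetical, and the calculation of the weights themselves (described by Atal) is not shown.

```python
def prediction_error(speech, weights, start, length):
    """Prediction-error samples over one interval (illustrative sketch).

    speech  -- stored speech samples
    weights -- predictor weights a_1..a_M; sample n is predicted as
               a_1*speech[n-1] + ... + a_M*speech[n-M] (assumed form)
    start   -- index of the first sample to predict
    length  -- number of samples to predict, e.g. int(b * p)

    Following the text, the original sample is subtracted from the
    predicted sample.
    """
    order = len(weights)
    if start < order:
        raise ValueError("not enough preceding samples for the predictor")
    stop = min(len(speech), start + length)
    errors = []
    for n in range(start, stop):
        predicted = sum(w * speech[n - k]
                        for k, w in enumerate(weights, start=1))
        errors.append(predicted - speech[n])
    return errors
```

A signal that the predictor models exactly yields zero error; a sample that departs from the model, as at a glottal impulse, produces a large error at that position.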

Line 70 carries the error samples to two peak Pickers 72 and 73. Picker 72 is controlled to scan the error samples only within a small interval around p. Picker 73 is on both before and after Picker 72, scanning the error samples throughout bp (the time of the speech predicted by block 61) except when Picker 72 is on. Thus, Picker 72 selects the largest error sample within a small interval around p and Picker 73 selects the largest sample outside this interval.

Consistent voicing is assumed to continue if the error peak found by Picker 72 is significantly greater than the one found by Picker 73. To determine this, the output of Picker 73 is transferred to Multiplication Network 76 by line 77 where it is multiplied by a constant greater than one (1). The result of this multiplication is presented to Comparison Network 80 on line 81 where it is compared to the peak found by Picker 72 on line 82.

Control Network 50 determines via line 84 the results of the comparison in block 80. If the output of Picker 72 is greater, then the exact location of the error peak in the speech interval is stored in Segment Length storage 56 via line 86. Prediction Weight Generator 88 (as described by Atal) uses the length of the speech segment from Segment Length storage 56 on line 89 and the speech samples from Speech storage 65 on line 90 to analyze the speech segment and transfers the results via line 91 to Storage block 60. The output gate 93 is opened to allow the contents of storage blocks 53, 56 and 60 to be outputted on line 95.

If, however, the result of the comparison showed that the output of Multiplication Network 76 was greater, then Control Network 50 would set the Voiced-Unvoiced storage 53 to unvoiced. Subsequent operation will then be identical to that which would have occurred had the previous segment of speech been unvoiced when the control cycle was initiated.

In the unvoiced case, Prediction Weight Generator 88 is controlled to calculate a set of weights over a portion of speech representing a time of at least 4P_L, where P_L is the pitch period of the lowest pitch frequency of interest. The calculated weights are stored in block 60 and used in Adaptive Predictor 61 to predict speech. Predictor 61 and Subtraction Network 68 then operate to produce the prediction error signal.

Rejected Peak Value Storage 100 is set initially to zero. The error signal enters a three-stage shift register 101, which presents to a Comparison Network 103 the most recent three error values via lines 104, 105 and 106, while storage block 100 presents its present value via line 107. Each stage of register 101 is capable of storing enough bits to represent the full value of the error signal. When Comparison Network 103 detects that the value on line 105 is greater than the other three, then a local maximum of the error signal has been found. This maximum value is transferred to Peak Value Storage 110 via line 111 while the time location of the maximum is transferred to Peak Time Storage 112 by line 113.
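The action of the three-stage shift register and Comparison Network 103 can be mimicked in software as follows. This is a sketch under assumptions: the generator name `local_maxima` is hypothetical, the error signal is taken as a sequence of magnitudes, and the value on line 105 is assumed to be the middle stage of the register.

```python
from collections import deque


def local_maxima(error, rejected=0.0):
    """Yield (time, value) for each local maximum of the error signal.

    error    -- sequence of error magnitudes
    rejected -- value held in the rejected-peak storage (initially zero);
                a maximum must exceed it to be reported

    A deque of length three plays the part of the shift register: a peak
    is reported when the middle stage exceeds both neighbours and the
    rejected value.
    """
    stages = deque(maxlen=3)
    for n, value in enumerate(error):
        stages.append(value)              # shift the newest sample in
        if len(stages) == 3:
            oldest, middle, newest = stages
            if middle > oldest and middle > newest and middle > rejected:
                yield n - 1, middle       # the middle sample arrived at n - 1
```

Raising `rejected` after a peak has been discarded suppresses every later maximum that does not exceed the discarded value.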

When four such maxima have been placed in storage blocks 110 and 112, the stored times are gated onto lines 115, 116, 117 and 118. Comparison Network 120 compares the time of the first peak with P_L. If the time on line 115 does not exceed P_L, then Control Network 50 is signalled on line 121 to continue the checks. Subtractor Networks 123, 124 and 125 produce the intervals between adjacent peaks, which are presented to Comparison Network 127 by lines 128, 129, and 130. Network 127 checks these three values for approximate equality and verifies that each is less than P_L and greater than P_H, the pitch period of the highest pitch frequency of interest.

If all the requirements are met, Control Network 50 is signalled on line 131 that voicing has been found beginning at the time on line 115. The time on line 115 is placed in Segment Length Storage 56 (connection not shown) and generator 88 calculates a set of weights over this segment of unvoiced speech. The weights are stored in storage 60 and Output Gate 93 is opened to output a record. Subsequently, the Voiced-Unvoiced Storage 53 is changed to voiced, the pitch interval on line 128 is placed in the Segment Length Storage 56 and a new cycle is initiated.

If, however, one or more of the requirements for voicing are absent, the peak values are gated to Comparison Network 132 via lines 135, 136, 137 and 138. Network 132 determines the least of the four, transfers this value by line 140 to Rejected Peak Value Storage 100, and signals Control Network 50 by line 141 which value is the least. Network 50 causes the least peak and its time to be removed from storage units 110 and 112, and the remaining peaks and times to be moved in storage to maintain chronological order and to leave position four vacant for a new error maximum. Shift Register 101 and Comparison Network 103 operate to locate error signal maxima as before, but now, since Storage block 100 has a non-zero value stored in it, the maximum selected by Network 103 must exceed the value of the rejected peak in block 100. A selected maximum and its time will be gated into blocks 110 and 112 and the aforementioned tests performed.

This process continues until voicing is found, or until one or more limits occur. These limits are:

1. that the location of the first peak in storage (110, 112) on line 115 exceeds P_L;

2. that one or more of the peak intervals on lines 128, 129 and 130 exceeds P_L; or

3. that Adaptive Predictor 61 has predicted a portion of speech of 4P_L.

When one of these limits is exceeded, the process is discontinued. Control Network 50 places a fixed length (10 milliseconds) in Segment Length Storage 56, causes the Prediction Weight Generator 88 to generate weights over this period, which are stored in block 60, and opens Output Gate 93 to output a record. A new cycle of operation is then initiated.

While the invention has been particularly shown and described with respect to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made without [...]

[...] where P is the period of the lowest acceptable pitch, said error signal representing the difference between actual speech samples and the corresponding predicted values;

3. analyzing error peaks of said error signal to detect a pitch pattern comprising a predetermined minimum number of substantially equally spaced pitch periods indicative of consistent voicing.

2. The method of claim 1, further comprising:

4. when consistent voicing is detected, providing an output representation of the related interval.

3. The method of claim 1, further comprising:

4. when an unvoiced interval is detected, providing an output representation of said unvoiced interval.

4. The method of claim 1 wherein said predetermined time interval is four (4) P and said minimum number of peaks is four, designated Pk, Pk,

5. The method of claim 1, further comprising:

5. determining the continuation of consistently voiced speech by comparing the length of a next occurring pitch period in a voiced interval with the length of a previous pitch period.

6. The method of claim 5, further comprising:

6. storing an indication of the occurrence of a voiced interval;

7. analyzing prediction weights for a preceding speech interval in relation to a current speech interval to develop an error signal prediction for bp seconds, where b is a constant representative of a partial pitch period to be examined beyond the next expected pitch period ending and where p is the length of the previous pitch period;

8. detecting occurrence of the next pitch period by extracting two local maxima, respectively representative of the maximum peak within and the maximum peak outside of a small region around p seconds; and

9. determining the status of voicing by comparing the within-region maximum with c times the outside-region maximum, where c is a constant greater than 1.0.

7. The method of claim 6, further comprising:

10. providing a signal indicative of the continuation of consistent voicing when the within-region maximum equals or exceeds c times the outside-region maximum.

8. The method of claim 7, further comprising:

11. outputting the current voiced speech interval.

9. The method of claim 6, further comprising:

10. providing a signal indicative of the discontinuance of consistent voicing when the within-region maximum does not equal or exceed c times the outside-region maximum.

10. The method of claim 9, further comprising:

11. proceeding with steps (1)-(3) to detect the next voiced interval.

11. The method of claim 1, further comprising the following steps between steps (2) and (3):

2a. determining if the first peak of said predetermined minimum number is prior to P, where P is the lowest pitch of interest; and

2b. if not prior, storing an indication that the speech signal interval is unvoiced and is not consistent voicing; and

2c. if prior, proceeding with step (3).

12. The method of claim 11, further comprising the following steps after step (3):

3a. determining and discarding the smallest peak from among said predetermined minimum number of peaks;

3b. scanning said error signal from the most recently formed peak to the next error peak;

3c. if end of predicted P seconds occurs prior to the next error peak, outputting a record for P/2 seconds;

3d. if the next error peak is formed prior to P seconds, comparing its value to the value of the peak discarded in step (3a);

3e. if the next error peak is larger than the discarded peak, establishing the next error peak as new last peak of said minimum number and repeating steps 2a-2c; and

3f. if the next error peak is smaller than the discarded peak, repeating steps 3b-3d.

13. Apparatus for determining the presence or absence of consistent voicing in speech signals characterized by voiced intervals of substantially equally spaced voice pitch periods and unvoiced intervals of irregular unequally spaced unvoiced periods, comprising:

1. means for predicting speech values based on a weighted sum of a number of preceding samples of said speech signals;

2. means for generating an error signal having error peaks for a predetermined selected time interval P seconds where P is the period of the lowest acceptable pitch, said error signal representing the difference between actual speech samples and the corresponding predicted values; and

3. means for analyzing error peaks of said error signal to detect a pitch pattern comprising a predetermined minimum number of substantially equally spaced pitch periods indicative of consistent voicing.

14. The apparatus of claim 13, further comprising:

4. means operable when consistent voicing is detected for providing an output representation of the related voiced interval.

15. The apparatus of claim 13, further comprising:

4. means operable when an unvoiced interval is detected for providing an output representation of said unvoiced interval.

16. The apparatus of claim 13, further comprising:

5. means for determining the continuation of consistently voiced speech by comparing the length of a next occurring pitch period in a voiced interval with the length of a previous pitch period.

17. The apparatus of claim 16, further comprising:

6. means for storing an indication of the occurrence of a voiced interval;

7. means for analyzing prediction weights for a preceding speech interval in relation to a current speech interval to develop an error signal prediction for bp seconds, where b is a constant representative of a partial pitch period to be examined beyond the next expected pitch period ending and where p is the length of the previous pitch period;

8. means for detecting occurrence of the next pitch period by extracting two local maxima, respectively representative of the maximum peak within and the maximum peak outside of a small region around p seconds; and

9. means for determining the status of voicing by comparing the within-region maximum with c times the outside-region maximum, where c is a constant greater than 1.0.

18. The apparatus of claim 17, further comprising:

10. means for providing a signal indicative of the continuation of consistent voicing when the within-region maximum equals or exceeds c times the outside-region maximum.

19. The apparatus of claim 18, further comprising:

11. gating means for providing prediction weights, voiced/unvoiced status, and interval lengths of speech intervals following calculations.

20. The apparatus of claim 18, further comprising:

10. means for providing a signal indicative of the discontinuance of consistent voicing when the within-region maximum does not equal or exceed c times the outside-region maximum.

Patent Citations
Cited Patent | Filing date | Publication date | Applicant | Title
US3624302 * | Oct 29, 1969 | Nov 30, 1971 | Bell Telephone Labor Inc | Speech analysis and synthesis by the use of the linear prediction of a speech wave
US3631520 * | Aug 19, 1968 | Dec 28, 1971 | Bell Telephone Labor Inc | Predictive coding of speech signals
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US4519094 * | Aug 26, 1982 | May 21, 1985 | At&T Bell Laboratories | LPC Word recognizer utilizing energy features
US4710959 * | Apr 29, 1982 | Dec 1, 1987 | Massachusetts Institute Of Technology | Voice encoder and synthesizer
US4783807 * | Aug 27, 1984 | Nov 8, 1988 | John Marley | System and method for sound recognition with feature selection synchronized to voice pitch
US4802221 * | Jul 21, 1986 | Jan 31, 1989 | Ncr Corporation | Digital system and method for compressing speech signals for storage and transmission
US4879748 * | Aug 28, 1985 | Nov 7, 1989 | American Telephone And Telegraph Company | Parallel processing pitch detector
US4890328 * | Aug 28, 1985 | Dec 26, 1989 | American Telephone And Telegraph Company | Voice synthesis utilizing multi-level filter excitation
US4912764 * | Aug 28, 1985 | Mar 27, 1990 | American Telephone And Telegraph Company, At&T Bell Laboratories | Digital speech coder with different excitation types
US4924508 * | Feb 12, 1988 | May 8, 1990 | International Business Machines | Pitch detection for use in a predictive speech coder
US7124075 | May 7, 2002 | Oct 17, 2006 | Dmitry Edward Terez | Methods and apparatus for pitch determination
US8694308 * | Nov 26, 2008 | Apr 8, 2014 | Nec Corporation | System, method and program for voice detection
US20100268532 * | Nov 26, 2008 | Oct 21, 2010 | Takayuki Arakawa | System, method and program for voice detection
EP0280827A1 * | Mar 5, 1987 | Sep 7, 1988 | International Business Machines Corporation | Pitch detection process and speech coder using said process
WO1984001049A1 * | Aug 1, 1983 | Mar 15, 1984 | Western Electric Co | Lpc word recognizer utilizing energy features
WO1987001498A1 * | Jul 25, 1986 | Mar 12, 1987 | American Telephone & Telegraph | A parallel processing pitch detector
WO1987001500A1 * | Jul 24, 1986 | Mar 12, 1987 | American Telephone & Telegraph | Voice synthesis utilizing multi-level filter excitation
WO1988000754A1 * | Jun 25, 1987 | Jan 28, 1988 | Ncr Co | Method and system for compressing speech signal data
WO2003038805A1 * | Oct 16, 2002 | May 8, 2003 | Dmitry Edward Terez | Methods and apparatus for pitch determination
WO2003038806A1 * | Oct 23, 2002 | May 8, 2003 | Dmitry Edward Terez | Methods and apparatus for pitch determination
Classifications
U.S. Classification: 704/219, 704/E11.7, 704/214
International Classification: G10L25/90, G10L25/93
Cooperative Classification: H05K999/99, G10L25/93
European Classification: G10L25/93