CA2147772A1 - Method of and Apparatus for Signal Recognition that Compensates for Mismatching - Google Patents
Method of and Apparatus for Signal Recognition that Compensates for Mismatching
Info
- Publication number
- CA2147772A1
- Authority
- CA
- Canada
- Prior art keywords
- signal
- stored representations
- transform
- mismatching
- compensates
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/226—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
- G10L2015/228—Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
Abstract
Disclosed is a method for drastically reducing the average error rate for signals under mismatched conditions. The method takes a signal (e.g., a speech signal) and a set of stored representations (e.g., stored representations of keywords) and performs at least one transformation that results in the signal more closely emulating the stored representations.
This is accomplished by using one of three techniques. First, one may transform the signal so that it is better approximated by (e.g., is closer to) one of the stored representations. Second, one may transform the set of stored representations so that one of the stored representations better approximates the signal. Third, one may transform both the signal and the set of stored representations.
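The three techniques in the abstract can be illustrated with a minimal sketch. The abstract does not say what form the transformation takes, so everything specific here is an assumption made for illustration: the mismatch is modeled as a known additive offset (`BIAS`), signals and stored representations are plain feature vectors, and recognition is a nearest-template search by Euclidean distance. The function names, the keyword labels, and all numbers are hypothetical and are not taken from the patent.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(signal, templates):
    """Return the label of the stored representation closest to the signal."""
    return min(templates, key=lambda label: distance(signal, templates[label]))


def shift(vec, bias, scale):
    """Add scale * bias to a vector, element by element."""
    return [v + scale * b for v, b in zip(vec, bias)]


def transform_signal(signal, bias):
    """Technique 1: move the signal toward the stored representations."""
    return shift(signal, bias, -1.0)


def transform_templates(templates, bias):
    """Technique 2: move the stored representations toward the signal."""
    return {label: shift(vec, bias, 1.0) for label, vec in templates.items()}


def transform_both(signal, templates, bias):
    """Technique 3: split the correction between signal and representations.

    The even 50/50 split is an arbitrary choice for this sketch.
    """
    half_signal = shift(signal, bias, -0.5)
    half_templates = {label: shift(vec, bias, 0.5) for label, vec in templates.items()}
    return half_signal, half_templates


# Hypothetical stored representations of two keywords.
TEMPLATES = {"yes": [1.0, 0.0], "no": [0.0, 1.0]}
# Assumed mismatch: an additive offset introduced between training and testing.
BIAS = [-0.6, 0.6]
# An utterance of "yes" (clean value about [0.9, 0.1]) observed through the mismatch.
OBSERVED = [0.3, 0.7]

baseline = recognize(OBSERVED, TEMPLATES)
via_signal = recognize(transform_signal(OBSERVED, BIAS), TEMPLATES)
via_templates = recognize(OBSERVED, transform_templates(TEMPLATES, BIAS))
both_signal, both_templates = transform_both(OBSERVED, TEMPLATES, BIAS)
via_both = recognize(both_signal, both_templates)

print(baseline, via_signal, via_templates, via_both)
```

With these made-up numbers the uncompensated match selects "no", while each of the three compensated matches selects "yes". In practice the offset would have to be estimated rather than assumed known; that estimation step is outside the scope of this sketch.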
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/263,284 US5727124A (en) | 1994-06-21 | 1994-06-21 | Method of and apparatus for signal recognition that compensates for mismatching |
US263,284 | 1994-06-21 |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2147772A1 (en) | 1995-12-22 |
CA2147772C (en) | 2000-06-13 |
Family
ID=23001125
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002147772A Expired - Fee Related CA2147772C (en) | 1994-06-21 | 1995-04-25 | Method of and apparatus for signal recognition that compensates for mismatching |
Country Status (5)
Country | Link |
---|---|
US (1) | US5727124A (en) |
EP (1) | EP0689194B1 (en) |
JP (1) | JP3457431B2 (en) |
CA (1) | CA2147772C (en) |
DE (1) | DE69524994T2 (en) |
Families Citing this family (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10257583A (en) | 1997-03-06 | 1998-09-25 | Asahi Chem Ind Co Ltd | Voice processing unit and its voice processing method |
US5995927A (en) * | 1997-03-14 | 1999-11-30 | Lucent Technologies Inc. | Method for performing stochastic matching for use in speaker verification |
US6076057A (en) * | 1997-05-21 | 2000-06-13 | At&T Corp | Unsupervised HMM adaptation based on speech-silence discrimination |
US5960397A (en) * | 1997-05-27 | 1999-09-28 | At&T Corp | System and method of recognizing an acoustic environment to adapt a set of based recognition models to the current acoustic environment for subsequent speech recognition |
US5924065A (en) * | 1997-06-16 | 1999-07-13 | Digital Equipment Corporation | Environmently compensated speech processing |
US6151573A (en) * | 1997-09-17 | 2000-11-21 | Texas Instruments Incorporated | Source normalization training for HMM modeling of speech |
US6202047B1 (en) * | 1998-03-30 | 2001-03-13 | At&T Corp. | Method and apparatus for speech recognition using second order statistics and linear estimation of cepstral coefficients |
US6381571B1 (en) * | 1998-05-01 | 2002-04-30 | Texas Instruments Incorporated | Sequential determination of utterance log-spectral mean by maximum a posteriori probability estimation |
US6980952B1 (en) * | 1998-08-15 | 2005-12-27 | Texas Instruments Incorporated | Source normalization training for HMM modeling of speech |
TW418383B (en) * | 1998-09-23 | 2001-01-11 | Ind Tech Res Inst | Telephone voice recognition system and method and the channel effect compensation device using the same |
US6658385B1 (en) * | 1999-03-12 | 2003-12-02 | Texas Instruments Incorporated | Method for transforming HMMs for speaker-independent recognition in a noisy environment |
EP1116219B1 (en) | 1999-07-01 | 2005-03-16 | Koninklijke Philips Electronics N.V. | Robust speech processing from noisy speech models |
US7149690B2 (en) | 1999-09-09 | 2006-12-12 | Lucent Technologies Inc. | Method and apparatus for interactive language instruction |
US6539351B1 (en) * | 2000-02-04 | 2003-03-25 | International Business Machines Corporation | High dimensional acoustic modeling via mixtures of compound gaussians with linear transforms |
US20020049586A1 (en) * | 2000-09-11 | 2002-04-25 | Kousuke Nishio | Audio encoder, audio decoder, and broadcasting system |
US7400712B2 (en) * | 2001-01-18 | 2008-07-15 | Lucent Technologies Inc. | Network provided information using text-to-speech and speech recognition and text or speech activated network control sequences for complimentary feature access |
US7966177B2 (en) * | 2001-08-13 | 2011-06-21 | Hans Geiger | Method and device for recognising a phonetic sound sequence or character sequence |
US6778957B2 (en) * | 2001-08-21 | 2004-08-17 | International Business Machines Corporation | Method and apparatus for handset detection |
US7139704B2 (en) * | 2001-11-30 | 2006-11-21 | Intel Corporation | Method and apparatus to perform speech recognition over a voice channel |
US7139336B2 (en) * | 2002-04-05 | 2006-11-21 | Nokia Corporation | Method and system for channel estimation using iterative estimation and detection |
US7139703B2 (en) * | 2002-04-05 | 2006-11-21 | Microsoft Corporation | Method of iterative noise estimation in a recursive framework |
USH2172H1 (en) | 2002-07-02 | 2006-09-05 | The United States Of America As Represented By The Secretary Of The Air Force | Pitch-synchronous speech processing |
FR2848715B1 (en) * | 2002-12-11 | 2005-02-18 | France Telecom | METHOD AND SYSTEM FOR MULTI-REFERENCE CORRECTION OF SPECTRAL VOICE DEFORMATIONS INTRODUCED BY A COMMUNICATION NETWORK |
US7165026B2 (en) * | 2003-03-31 | 2007-01-16 | Microsoft Corporation | Method of noise estimation using incremental bayes learning |
US7729908B2 (en) * | 2005-03-04 | 2010-06-01 | Panasonic Corporation | Joint signal and model based noise matching noise robustness method for automatic speech recognition |
US7680656B2 (en) * | 2005-06-28 | 2010-03-16 | Microsoft Corporation | Multi-sensory speech enhancement using a speech-state model |
US20070033027A1 (en) * | 2005-08-03 | 2007-02-08 | Texas Instruments, Incorporated | Systems and methods employing stochastic bias compensation and bayesian joint additive/convolutive compensation in automatic speech recognition |
US7877255B2 (en) * | 2006-03-31 | 2011-01-25 | Voice Signal Technologies, Inc. | Speech recognition using channel verification |
US7630950B2 (en) * | 2006-08-18 | 2009-12-08 | International Business Machines Corporation | System and method for learning models from scarce and skewed training data |
JP5089295B2 (en) * | 2007-08-31 | 2012-12-05 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Speech processing system, method and program |
JP2010020273A (en) * | 2007-12-07 | 2010-01-28 | Sony Corp | Information processing device, information processing method and computer program |
US8615397B2 (en) * | 2008-04-04 | 2013-12-24 | Intuit Inc. | Identifying audio content using distorted target patterns |
KR101238731B1 (en) * | 2008-04-18 | 2013-03-06 | 돌비 레버러토리즈 라이쎈싱 코오포레이션 | Method and apparatus for maintaining speech audibility in multi-channel audio with minimal impact on surround experience |
US8239195B2 (en) * | 2008-09-23 | 2012-08-07 | Microsoft Corporation | Adapting a compressed model for use in speech recognition |
US8214215B2 (en) * | 2008-09-24 | 2012-07-03 | Microsoft Corporation | Phase sensitive model adaptation for noisy speech recognition |
US8306819B2 (en) * | 2009-03-09 | 2012-11-06 | Microsoft Corporation | Enhanced automatic speech recognition using mapping between unsupervised and supervised speech model parameters trained on same acoustic training data |
JP5150542B2 (en) * | 2009-03-26 | 2013-02-20 | 株式会社東芝 | Pattern recognition apparatus, pattern recognition method, and program |
US11465640B2 (en) * | 2010-06-07 | 2022-10-11 | Affectiva, Inc. | Directed control transfer for autonomous vehicles |
US9118669B2 (en) | 2010-09-30 | 2015-08-25 | Alcatel Lucent | Method and apparatus for voice signature authentication |
DE102012206313A1 (en) * | 2012-04-17 | 2013-10-17 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Device for recognizing unusual acoustic event in audio recording, has detection device detecting acoustic event based on error vectors, which describe deviation of test vectors from approximated test vectors |
JP5982297B2 (en) * | 2013-02-18 | 2016-08-31 | 日本電信電話株式会社 | Speech recognition device, acoustic model learning device, method and program thereof |
US10152298B1 (en) * | 2015-06-29 | 2018-12-11 | Amazon Technologies, Inc. | Confidence estimation based on frequency |
US10304440B1 (en) * | 2015-07-10 | 2019-05-28 | Amazon Technologies, Inc. | Keyword spotting using multi-task configuration |
CN105448299B (en) * | 2015-11-17 | 2019-04-05 | 中山大学 | A method of identifying digital audio AAC format codec |
CN109087630B (en) * | 2018-08-29 | 2020-09-15 | 深圳追一科技有限公司 | Method and related device for speech recognition |
US11532313B2 (en) * | 2020-08-27 | 2022-12-20 | Google Llc | Selectively storing, with multiple user accounts and/or to a shared assistant device: speech recognition biasing, NLU biasing, and/or other data |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS58115497A (en) * | 1981-12-28 | 1983-07-09 | シャープ株式会社 | Voice recognition system |
JPH01102599A (en) * | 1987-10-12 | 1989-04-20 | Internatl Business Mach Corp <Ibm> | Voice recognition |
FR2681715B1 (en) * | 1991-09-25 | 1994-02-11 | Matra Communication | PROCESS FOR PROCESSING SPEECH IN THE PRESENCE OF ACOUSTIC NOISE: NON-LINEAR SPECTRAL SUBTRACTION PROCESS. |
US5222146A (en) * | 1991-10-23 | 1993-06-22 | International Business Machines Corporation | Speech recognition apparatus having a speech coder outputting acoustic prototype ranks |
JPH0772840B2 (en) * | 1992-09-29 | 1995-08-02 | 日本アイ・ビー・エム株式会社 | Speech model configuration method, speech recognition method, speech recognition device, and speech model training method |
-
1994
- 1994-06-21 US US08/263,284 patent/US5727124A/en not_active Expired - Lifetime
-
1995
- 1995-04-25 CA CA002147772A patent/CA2147772C/en not_active Expired - Fee Related
- 1995-06-14 EP EP95304089A patent/EP0689194B1/en not_active Expired - Lifetime
- 1995-06-14 DE DE69524994T patent/DE69524994T2/en not_active Expired - Lifetime
- 1995-06-21 JP JP17687295A patent/JP3457431B2/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
EP0689194A1 (en) | 1995-12-27 |
US5727124A (en) | 1998-03-10 |
JPH0850499A (en) | 1996-02-20 |
CA2147772C (en) | 2000-06-13 |
DE69524994T2 (en) | 2002-08-29 |
EP0689194B1 (en) | 2002-01-16 |
DE69524994D1 (en) | 2002-02-21 |
JP3457431B2 (en) | 2003-10-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CA2147772A1 (en) | Method of and Apparatus for Signal Recognition that Compensates for Mismatching | |
CA2372676A1 (en) | Voice-operated services | |
CA2196553A1 (en) | Analysis of Audio Quality | |
CA2210887A1 (en) | Method and apparatus for speech recognition adapted to an individual speaker | |
EP0285275A3 (en) | Audio pre-processing methods and apparatus | |
CA2090159A1 (en) | Method and apparatus for coding audio signals based on perceptual model | |
CA2333137A1 (en) | Multiple waveform software radio | |
CA2150926A1 (en) | Transmission system implementing different coding principles | |
AU8633191A (en) | Method and apparatus for speech recognition | |
AU2519197A (en) | Video noise reducer | |
CA2247006A1 (en) | Speech processing | |
AU2427095A (en) | Method of training neural networks used for speech recognition | |
CA2100753A1 (en) | Clock rate matching in independent networks | |
AU6248594A (en) | Voice recognition dialing system | |
AU7304798A (en) | Voice recognition for information system access and transaction process ing | |
CA2476248A1 (en) | System and method for reducing delay in a speech coding system | |
DE3275779D1 (en) | Recognition of speech or speech-like sounds | |
WO2001018790A8 (en) | Method and apparatus in a telecommunications system | |
AU8047594A (en) | Speech recognition using bio-signals | |
AU8805991A (en) | Method and system for conducting offshore well operations | |
AU8800291A (en) | Method and system for speech recognition without noise interference | |
CA2154881A1 (en) | A system and method for compression and decompression of audio signals | |
CA2209216A1 (en) | Process and device for reducing the nitrate content of water | |
CA2175264A1 (en) | Improved codebook searching techniques | |
AU8442391A (en) | Corynebacteria integron, method of transformation of corynebacteria by said integron, and corynebacteria obtained | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
MKLA | Lapsed |