CA2104933A1 - Keyword/Non-Keyword Classification in Isolated Word Speech Recognition - Google Patents
Keyword/Non-Keyword Classification in Isolated Word Speech Recognition
- Publication number
- CA2104933A1
- Authority
- CA
- Canada
- Prior art keywords
- keyword
- hmm
- scores
- stage
- gpd
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L2015/088—Word spotting
Abstract
A two-pass classification system and method that post-processes HMM scores with additional confidence scores to derive a value that may be compared against a threshold, on which a keyword versus non-keyword determination may be based. The first stage comprises Generalized Probabilistic Descent (GPD) analysis, which uses feature vectors of the spoken words and the HMM segmentation information (developed by the HMM detector during processing) as inputs to develop a first set of confidence scores through a linear combination (a weighted sum) of the feature vectors of the speech. The second stage comprises a linear discrimination method that combines the HMM scores and the confidence scores from the GPD stage with a weighted sum to derive a second confidence score. The output of the second stage may then be compared to a predetermined threshold to determine whether the spoken word or words include a keyword.
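The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the weight values, the assumption that HMM segmentation has already been used to reduce the utterance to one fixed-length feature vector, and the choice of "score at or above threshold means keyword" are all hypothetical. In the patent the weights would come from GPD and linear-discriminant training, which is not shown here.

```python
from typing import List, Sequence


def weighted_sum(weights: Sequence[float], values: Sequence[float]) -> float:
    """Linear combination of values with the given weights."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * v for w, v in zip(weights, values))


def first_stage_confidences(
    features: Sequence[float],
    gpd_weights: Sequence[Sequence[float]],
) -> List[float]:
    """Stage 1 (assumed form): one confidence score per weight vector,
    each a weighted sum of the utterance feature vector."""
    return [weighted_sum(w, features) for w in gpd_weights]


def second_stage_score(
    hmm_scores: Sequence[float],
    confidences: Sequence[float],
    combiner_weights: Sequence[float],
) -> float:
    """Stage 2 (assumed form): weighted sum over the HMM scores
    followed by the stage-1 confidence scores."""
    return weighted_sum(combiner_weights, list(hmm_scores) + list(confidences))


def is_keyword(score: float, threshold: float) -> bool:
    """Threshold decision; the comparison direction is an assumption."""
    return score >= threshold


if __name__ == "__main__":
    # Illustrative numbers only; real weights would be learned from data.
    features = [1.0, 2.0]
    gpd_weights = [[0.5, 0.25], [1.0, -1.0]]
    hmm_scores = [2.0]
    combiner_weights = [1.0, 0.5, 0.5]

    confidences = first_stage_confidences(features, gpd_weights)
    score = second_stage_score(hmm_scores, confidences, combiner_weights)
    print(confidences, score, is_keyword(score, threshold=1.5))
```

Keeping both stages as plain weighted sums mirrors the abstract's wording; the threshold is left as a caller-supplied parameter because the abstract only says it is predetermined.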
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US989,299 | 1992-12-11 | ||
US07/989,299 US5440662A (en) | 1992-12-11 | 1992-12-11 | Keyword/non-keyword classification in isolated word speech recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2104933A1 (en) | 1994-06-12 |
CA2104933C (en) | 1997-09-23 |
Family
ID=25534980
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002104933A Expired - Fee Related CA2104933C (en) | 1992-12-11 | 1993-08-26 | Keyword/non-keyword classification in isolated word speech recognition |
Country Status (6)
Country | Link |
---|---|
US (1) | US5440662A (en) |
EP (1) | EP0601778B1 (en) |
AT (1) | ATE177555T1 (en) |
CA (1) | CA2104933C (en) |
DE (1) | DE69323841T2 (en) |
TW (1) | TW229302B (en) |
Families Citing this family (57)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2015410C (en) * | 1989-05-17 | 1996-04-02 | Chin H. Lee | Speech recognition employing key word modeling and non-key word modeling |
US5509104A (en) * | 1989-05-17 | 1996-04-16 | At&T Corp. | Speech recognition employing key word modeling and non-key word modeling |
US5613037A (en) * | 1993-12-21 | 1997-03-18 | Lucent Technologies Inc. | Rejection of non-digit strings for connected digit speech recognition |
JP3311460B2 (en) * | 1994-01-28 | 2002-08-05 | 富士通株式会社 | Voice recognition device |
JP2690027B2 (en) * | 1994-10-05 | 1997-12-10 | 株式会社エイ・ティ・アール音声翻訳通信研究所 | Pattern recognition method and apparatus |
US5710864A (en) * | 1994-12-29 | 1998-01-20 | Lucent Technologies Inc. | Systems, methods and articles of manufacture for improving recognition confidence in hypothesized keywords |
US5812972A (en) * | 1994-12-30 | 1998-09-22 | Lucent Technologies Inc. | Adaptive decision directed speech recognition bias equalization method and apparatus |
US5617509A (en) * | 1995-03-29 | 1997-04-01 | Motorola, Inc. | Method, apparatus, and radio optimizing Hidden Markov Model speech recognition |
US5675706A (en) * | 1995-03-31 | 1997-10-07 | Lucent Technologies Inc. | Vocabulary independent discriminative utterance verification for non-keyword rejection in subword based speech recognition |
US5842194A (en) * | 1995-07-28 | 1998-11-24 | Mitsubishi Denki Kabushiki Kaisha | Method of recognizing images of faces or general images using fuzzy combination of multiple resolutions |
US5717826A (en) * | 1995-08-11 | 1998-02-10 | Lucent Technologies Inc. | Utterance verification using word based minimum verification error training for recognizing a keyboard string |
US5737489A (en) * | 1995-09-15 | 1998-04-07 | Lucent Technologies Inc. | Discriminative utterance verification for connected digits recognition |
JP3459712B2 (en) * | 1995-11-01 | 2003-10-27 | キヤノン株式会社 | Speech recognition method and device and computer control device |
US5842165A (en) * | 1996-02-29 | 1998-11-24 | Nynex Science & Technology, Inc. | Methods and apparatus for generating and using garbage models for speaker dependent speech recognition purposes |
DE69613556T2 (en) * | 1996-04-01 | 2001-10-04 | Hewlett Packard Co | Keyword recognition |
US6470315B1 (en) * | 1996-09-11 | 2002-10-22 | Texas Instruments Incorporated | Enrollment and modeling method and apparatus for robust speaker dependent speech models |
DE19640587A1 (en) * | 1996-10-01 | 1998-04-02 | Siemens Ag | Method for calculating a threshold value for the speech recognition of a keyword |
FI973093A (en) * | 1997-07-23 | 1999-01-24 | Nokia Mobile Phones Ltd | A method for controlling a teleservice and a terminal |
CN1125433C (en) * | 1997-09-18 | 2003-10-22 | 西门子公司 | Method for recognising a keywood in speech |
US6038535A (en) * | 1998-03-23 | 2000-03-14 | Motorola, Inc. | Speech classifier and method using delay elements |
JP4199927B2 (en) * | 1998-05-15 | 2008-12-24 | シーメンス アクチエンゲゼルシヤフト | Method and apparatus for recognizing at least one keyword in spoken language by a calculator |
US6138095A (en) * | 1998-09-03 | 2000-10-24 | Lucent Technologies Inc. | Speech recognition |
US6560582B1 (en) * | 2000-01-05 | 2003-05-06 | The United States Of America As Represented By The Secretary Of The Navy | Dynamic memory processor |
US7263484B1 (en) * | 2000-03-04 | 2007-08-28 | Georgia Tech Research Corporation | Phonetic searching |
GB2364814A (en) * | 2000-07-12 | 2002-02-06 | Canon Kk | Speech recognition |
DE10034235C1 (en) * | 2000-07-14 | 2001-08-09 | Siemens Ag | Speech recognition method and speech recognizer |
US20030055640A1 (en) * | 2001-05-01 | 2003-03-20 | Ramot University Authority For Applied Research & Industrial Development Ltd. | System and method for parameter estimation for pattern recognition |
US6768815B2 (en) | 2001-05-10 | 2004-07-27 | The United States Of America As Represented By The Secretary Of The Navy | Color sensor |
FR2825496B1 (en) * | 2001-06-01 | 2003-08-15 | Synomia | METHOD AND SYSTEM FOR BROAD SYNTAXIC ANALYSIS OF CORPUSES, ESPECIALLY SPECIALIZED CORPUSES |
KR100449912B1 (en) * | 2002-02-20 | 2004-09-22 | 대한민국 | Apparatus and method for detecting topic in speech recognition system |
US7219059B2 (en) * | 2002-07-03 | 2007-05-15 | Lucent Technologies Inc. | Automatic pronunciation scoring for language learning |
US7539086B2 (en) * | 2002-10-23 | 2009-05-26 | J2 Global Communications, Inc. | System and method for the secure, real-time, high accuracy conversion of general-quality speech into text |
US7711560B2 (en) * | 2003-02-19 | 2010-05-04 | Panasonic Corporation | Speech recognition device and speech recognition method |
US7650282B1 (en) * | 2003-07-23 | 2010-01-19 | Nexidia Inc. | Word spotting score normalization |
US8229744B2 (en) * | 2003-08-26 | 2012-07-24 | Nuance Communications, Inc. | Class detection scheme and time mediated averaging of class dependent models |
CN1300763C (en) * | 2004-09-29 | 2007-02-14 | 上海交通大学 | Automatic sound identifying treating method for embedded sound identifying system |
CN1296887C (en) * | 2004-09-29 | 2007-01-24 | 上海交通大学 | Training method for embedded automatic sound identification system |
US20060149544A1 (en) * | 2005-01-05 | 2006-07-06 | At&T Corp. | Error prediction in spoken dialog systems |
US20060245641A1 (en) * | 2005-04-29 | 2006-11-02 | Microsoft Corporation | Extracting data from semi-structured information utilizing a discriminative context free grammar |
KR100679051B1 (en) * | 2005-12-14 | 2007-02-05 | 삼성전자주식회사 | Apparatus and method for speech recognition using a plurality of confidence score estimation algorithms |
US8509563B2 (en) * | 2006-02-02 | 2013-08-13 | Microsoft Corporation | Generation of documents from images |
US9245526B2 (en) * | 2006-04-25 | 2016-01-26 | General Motors Llc | Dynamic clustering of nametags in an automated speech recognition system |
US20090204399A1 (en) * | 2006-05-17 | 2009-08-13 | Nec Corporation | Speech data summarizing and reproducing apparatus, speech data summarizing and reproducing method, and speech data summarizing and reproducing program |
US7970614B2 (en) * | 2007-05-08 | 2011-06-28 | Nuance Communications, Inc. | Continuous adaptation in detection systems via self-tuning from target population subsets |
DE102007033472A1 (en) * | 2007-07-18 | 2009-01-29 | Siemens Ag | Method for speech recognition |
US20090237209A1 (en) * | 2008-03-20 | 2009-09-24 | Brian William Seal | Communicating keychain |
US20090254933A1 (en) * | 2008-03-27 | 2009-10-08 | Vishwa Nath Gupta | Media detection using acoustic recognition |
US9020816B2 (en) * | 2008-08-14 | 2015-04-28 | 21Ct, Inc. | Hidden markov model for speech processing with training method |
WO2013192535A1 (en) * | 2012-06-22 | 2013-12-27 | Johnson Controls Technology Company | Multi-pass vehicle voice recognition systems and methods |
DE102013000897B4 (en) * | 2013-01-18 | 2023-07-06 | Volkswagen Aktiengesellschaft | Method and device for voice recognition in a motor vehicle using garbage grammars |
CN104078050A (en) * | 2013-03-26 | 2014-10-01 | 杜比实验室特许公司 | Device and method for audio classification and audio processing |
US9390708B1 (en) * | 2013-05-28 | 2016-07-12 | Amazon Technologies, Inc. | Low latency and memory efficient keywork spotting |
US9183830B2 (en) * | 2013-11-01 | 2015-11-10 | Google Inc. | Method and system for non-parametric voice conversion |
US9177549B2 (en) * | 2013-11-01 | 2015-11-03 | Google Inc. | Method and system for cross-lingual voice conversion |
CN108615526B (en) * | 2018-05-08 | 2020-07-07 | 腾讯科技(深圳)有限公司 | Method, device, terminal and storage medium for detecting keywords in voice signal |
DE102020128497A1 (en) | 2020-10-29 | 2022-05-05 | Bayerische Motoren Werke Aktiengesellschaft | Computer-implemented method and system for dialog-supported remote diagnosis of a defect in a technical component and/or system of a vehicle and training method |
WO2023183663A1 (en) * | 2022-03-19 | 2023-09-28 | Qualcomm Incorporated | Dummy prototypical networks for few-shot open-set keyword spotting |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4713777A (en) * | 1984-05-27 | 1987-12-15 | Exxon Research And Engineering Company | Speech recognition method having noise immunity |
GB2240203A (en) * | 1990-01-18 | 1991-07-24 | Apple Computer | Automated speech recognition system |
US5303299A (en) * | 1990-05-15 | 1994-04-12 | Vcs Industries, Inc. | Method for continuous recognition of alphanumeric strings spoken over a telephone network |
US5125022A (en) * | 1990-05-15 | 1992-06-23 | Vcs Industries, Inc. | Method for recognizing alphanumeric strings spoken over a telephone network |
US5127043A (en) * | 1990-05-15 | 1992-06-30 | Vcs Industries, Inc. | Simultaneous speaker-independent voice recognition and verification over a telephone network |
JP3050934B2 (en) * | 1991-03-22 | 2000-06-12 | 株式会社東芝 | Voice recognition method |
US5199077A (en) * | 1991-09-19 | 1993-03-30 | Xerox Corporation | Wordspotting for voice editing and indexing |
1992
- 1992-12-11: US application US07/989,299, patent US5440662A (en), status: Expired - Lifetime
1993
- 1993-08-26: TW application TW082106923A, patent TW229302B (zh), status: active
- 1993-08-26: CA application CA002104933A, patent CA2104933C (en), status: Expired - Fee Related
- 1993-12-01: DE application DE69323841T, patent DE69323841T2 (en), status: Expired - Fee Related
- 1993-12-01: AT application AT93309587T, patent ATE177555T1 (en), status: IP Right Cessation
- 1993-12-01: EP application EP93309587A, patent EP0601778B1 (en), status: Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
EP0601778B1 (en) | 1999-03-10 |
US5440662A (en) | 1995-08-08 |
DE69323841D1 (en) | 1999-04-15 |
TW229302B (en) | 1994-09-01 |
DE69323841T2 (en) | 1999-08-19 |
ATE177555T1 (en) | 1999-03-15 |
EP0601778A1 (en) | 1994-06-15 |
CA2104933C (en) | 1997-09-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2104933A1 (en) | Keyword/Non-Keyword Classification in Isolated Word Speech Recognition | |
Hazen et al. | A comparison and combination of methods for OOV word detection and word confidence scoring | |
US5822728A (en) | Multistage word recognizer based on reliably detected phoneme similarity regions | |
US6738745B1 (en) | Methods and apparatus for identifying a non-target language in a speech recognition system | |
US6816836B2 (en) | Method and apparatus for audio-visual speech detection and recognition | |
US5797123A (en) | Method of key-phase detection and verification for flexible speech understanding | |
US5805771A (en) | Automatic language identification method and system | |
US5613037A (en) | Rejection of non-digit strings for connected digit speech recognition | |
Muthusamy et al. | A comparison of approaches to automatic language identification using telephone speech. | |
EP0582989A2 (en) | A recognition system, particularly for recognising people | |
Berkling et al. | Analysis of phoneme-based features for language identification | |
EP0237934A1 (en) | Speech recognition system | |
Kirchhoff et al. | Dynamic classifier combination in hybrid speech recognition systems using utterance-level confidence values | |
US6577997B1 (en) | System and method of noise-dependent classification | |
US6438519B1 (en) | Apparatus and method for rejecting out-of-class inputs for pattern classification | |
US5799274A (en) | Speech recognition system and method for properly recognizing a compound word composed of a plurality of words | |
JP3444108B2 (en) | Voice recognition device | |
Guillevic et al. | Robust semantic confidence scoring. | |
Duchateau et al. | Class definition in discriminant feature analysis | |
Raymond et al. | Automatic learning of interpretation strategies for spoken dialogue systems | |
Pellegrino et al. | Comparison of two phonetic approaches to language identification | |
JP3128251B2 (en) | Voice recognition device | |
Leung et al. | Phonetic classification and recognition using the multi-layer perceptron | |
Farrell | Discriminatory measures for speaker recognition | |
Song | New Postprocessing Methods for Rejectin Out-of-Vocabulary Words |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
MKLA | Lapsed |