CA2303362A1 - Speech reference enrollment method - Google Patents

Speech reference enrollment method

Info

Publication number
CA2303362A1
Authority
CA
Canada
Prior art keywords
utterance
similarity
enrollment method
speech reference
requesting
Prior art date
Legal status
Granted
Application number
CA002303362A
Other languages
French (fr)
Other versions
CA2303362C (en)
Inventor
Robert Wesley Bossemeyer Jr.
Current Assignee
AT&T Teleholdings Inc
Original Assignee
Ameritech Corporation
Robert Wesley Bossemeyer Jr.
Priority date
Filing date
Publication date
Application filed by Ameritech Corporation, Robert Wesley Bossemeyer Jr.
Publication of CA2303362A1
Application granted
Publication of CA2303362C
Anticipated expiration
Expired - Fee Related

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/38 Graded-service arrangements, i.e. some subscribers prevented from establishing certain connections
    • H04M3/382 Graded-service arrangements, i.e. some subscribers prevented from establishing certain connections using authorisation codes or passwords
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/065 Adaptation
    • G10L15/07 Adaptation to the speaker
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0635 Training updating or merging of old and new templates; Mean values; Weighting
    • G10L2015/0636 Threshold criteria for the updating
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0638 Interactive procedures
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/487 Arrangements for providing information services, e.g. recorded voice services or time announcements
    • H04M3/493 Interactive information services, e.g. directory enquiries; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals

Abstract

A speech reference enrollment method involves the following steps: (a) requesting a user speak a vocabulary word; (b) detecting a first utterance (354); (c) requesting the user speak the vocabulary word; (d) detecting a second utterance (358); (e) determining a first similarity between the first utterance and the second utterance (362); (f) when the first similarity is less than a predetermined similarity, requesting the user speak the vocabulary word; (g) detecting a third utterance (366); (h) determining a second similarity between the first utterance and the third utterance (370); and (i) when the second similarity is greater than or equal to the predetermined similarity, creating a reference (364).
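
The control flow of steps (a) through (i) can be illustrated with a short Python sketch. It is not the patented implementation: the abstract does not say how utterances are represented, how similarity is measured, or how the reference is built, so the feature-vector representation, `cosine_similarity`, and `make_reference` below are hypothetical stand-ins. The `capture` callable is likewise an assumed placeholder for "request the word and detect the utterance". What happens when the third utterance also fails the comparison is not covered by the abstract; the sketch simply returns `None` in that case.

```python
import math
from typing import Callable, List, Optional, Sequence

Features = Sequence[float]  # assumed representation of a detected utterance


def cosine_similarity(a: Features, b: Features) -> float:
    """Stand-in similarity measure (an assumption, not taken from the patent)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def make_reference(a: Features, b: Features) -> List[float]:
    """Hypothetical reference builder: element-wise mean of two utterances."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def enroll(capture: Callable[[], Features], threshold: float) -> Optional[List[float]]:
    """Follow the abstract's flow; `capture` prompts the user and returns features."""
    first = capture()    # steps (a)-(b): request the word, detect first utterance
    second = capture()   # steps (c)-(d): request again, detect second utterance

    # Step (e): compare first and second; a sufficient match yields a reference.
    if cosine_similarity(first, second) >= threshold:
        return make_reference(first, second)

    third = capture()    # steps (f)-(g): request again, detect third utterance

    # Steps (h)-(i): compare first and third; create a reference on a match.
    if cosine_similarity(first, third) >= threshold:
        return make_reference(first, third)

    # Outcome not described in the abstract; returning None is an assumption.
    return None
```

For example, passing `capture=lambda: next(utterances)` with `utterances = iter([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]])` and `threshold=0.95` exercises the third-utterance path: the first pair fails the comparison, and the first and third utterances then pass it.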
CA002303362A 1997-09-17 1998-08-17 Speech reference enrollment method Expired - Fee Related CA2303362C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/932,078 1997-09-09
US08/932,078 US6012027A (en) 1997-05-27 1997-09-17 Criteria for usable repetitions of an utterance during speech reference enrollment
PCT/US1998/017095 WO1999013456A1 (en) 1997-09-09 1998-08-17 Speech reference enrollment method

Publications (2)

Publication Number Publication Date
CA2303362A1 (en) 1999-03-18
CA2303362C (en) 2005-07-12

Family

ID=25461735

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002303362A Expired - Fee Related CA2303362C (en) 1997-09-17 1998-08-17 Speech reference enrollment method

Country Status (8)

Country Link
US (4) US6012027A (en)
EP (1) EP1019904B1 (en)
CN (1) CN1278944A (en)
AT (1) ATE261173T1 (en)
AU (1) AU9106898A (en)
CA (1) CA2303362C (en)
DE (1) DE69822179T2 (en)
WO (1) WO1999013456A1 (en)

Families Citing this family (50)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6012027A (en) * 1997-05-27 2000-01-04 Ameritech Corporation Criteria for usable repetitions of an utterance during speech reference enrollment
US6076055A (en) * 1997-05-27 2000-06-13 Ameritech Speaker verification method
US7630895B2 (en) * 2000-01-21 2009-12-08 At&T Intellectual Property I, L.P. Speaker verification method
GB9822931D0 (en) 1998-10-20 1998-12-16 Canon Kk Speech processing apparatus and method
US6760699B1 (en) * 2000-04-24 2004-07-06 Lucent Technologies Inc. Soft feature decoding in a distributed automatic speech recognition system for use over wireless channels
EP1161098B1 (en) * 2000-04-27 2011-06-22 Nippon Telegraph And Telephone Corporation Signal detection method and apparatus
DE10032756A1 (en) * 2000-07-05 2002-03-28 Deutsche Telekom Ag telecommunications network
GB2365145A (en) * 2000-07-26 2002-02-13 Canon Kk Voice control of a machine
GB2365189A (en) * 2000-07-26 2002-02-13 Canon Kk Voice-controlled machine
US7451085B2 (en) * 2000-10-13 2008-11-11 At&T Intellectual Property Ii, L.P. System and method for providing a compensated speech recognition model for speech recognition
EP1205906B1 (en) * 2000-11-07 2003-05-07 Telefonaktiebolaget L M Ericsson (Publ) Reference templates adaptation for speech recognition
US7545849B1 (en) 2003-03-28 2009-06-09 Google Inc. Signal spectrum spreading and combining system and method
US8374218B2 (en) * 2000-12-05 2013-02-12 Google Inc. Combining signals with a shuffled-hadamard function
US8385470B2 (en) * 2000-12-05 2013-02-26 Google Inc. Coding a signal with a shuffled-Hadamard function
US6829289B1 (en) * 2000-12-05 2004-12-07 Gossett And Gunter, Inc. Application of a pseudo-randomly shuffled hadamard function in a wireless CDMA system
US6982945B1 (en) 2001-01-26 2006-01-03 Google, Inc. Baseband direct sequence spread spectrum transceiver
US7295982B1 (en) 2001-11-19 2007-11-13 At&T Corp. System and method for automatic verification of the understandability of speech
US7453921B1 (en) * 2001-12-11 2008-11-18 Google Inc. LPC filter for removing periodic and quasi-periodic interference from spread spectrum signals
GB2388947A (en) * 2002-05-22 2003-11-26 Domain Dynamics Ltd Method of voice authentication
US7352833B2 (en) * 2002-11-18 2008-04-01 Google Inc. Method and system for temporal autocorrelation filtering
US7366352B2 (en) * 2003-03-20 2008-04-29 International Business Machines Corporation Method and apparatus for performing fast closest match in pattern recognition
US7227994B2 (en) * 2003-03-20 2007-06-05 International Business Machines Corporation Method and apparatus for imbedded pattern recognition using dual alternating pointers
US7299177B2 (en) * 2003-05-30 2007-11-20 American Express Travel Related Services Company, Inc. Speaker recognition in a multi-speaker environment and comparison of several voice prints to many
US7398207B2 (en) 2003-08-25 2008-07-08 Time Warner Interactive Video Group, Inc. Methods and systems for determining audio loudness levels in programming
KR100554442B1 (en) * 2003-10-06 2006-02-22 주식회사 팬택앤큐리텔 Mobile Communication Terminal with Voice Recognition function, Phoneme Modeling Method and Voice Recognition Method for the same
US8050918B2 (en) * 2003-12-11 2011-11-01 Nuance Communications, Inc. Quality evaluation tool for dynamic voice portals
US8874438B2 (en) 2004-03-12 2014-10-28 Siemens Aktiengesellschaft User and vocabulary-adaptive determination of confidence and rejecting thresholds
US20050288930A1 (en) * 2004-06-09 2005-12-29 Vaastek, Inc. Computer voice recognition apparatus and method
US8014496B2 (en) 2004-07-28 2011-09-06 Verizon Business Global Llc Systems and methods for providing network-based voice authentication
KR100705563B1 (en) * 2004-12-07 2007-04-10 삼성전자주식회사 Speech Recognition System capable of Controlling Automatically Inputting Level and Speech Recognition Method using the same
US7805300B2 (en) * 2005-03-21 2010-09-28 At&T Intellectual Property Ii, L.P. Apparatus and method for analysis of language model changes
US20060287867A1 (en) * 2005-06-17 2006-12-21 Cheng Yan M Method and apparatus for generating a voice tag
US7788101B2 (en) * 2005-10-31 2010-08-31 Hitachi, Ltd. Adaptation method for inter-person biometrics variability
EP2006836A4 (en) * 2006-03-24 2010-05-05 Pioneer Corp Speaker model registration device and method in speaker recognition system and computer program
US8234120B2 (en) * 2006-07-26 2012-07-31 Nuance Communications, Inc. Performing a safety analysis for user-defined voice commands to ensure that the voice commands do not cause speech recognition ambiguities
US20080082320A1 (en) * 2006-09-29 2008-04-03 Nokia Corporation Apparatus, method and computer program product for advanced voice conversion
US8355913B2 (en) * 2006-11-03 2013-01-15 Nokia Corporation Speech recognition with adjustable timeout period
JP4305509B2 (en) * 2006-12-26 2009-07-29 ヤマハ株式会社 Voice processing apparatus and program
CN101689366B (en) * 2007-07-02 2011-12-07 三菱电机株式会社 Voice recognizing apparatus
DE112009004357B4 (en) 2009-01-30 2019-06-13 Mitsubishi Electric Corp. Voice recognition system
DE102010041467A1 (en) * 2010-09-27 2012-03-29 Siemens Aktiengesellschaft Multi-pole electrical switching device
US8706499B2 (en) * 2011-08-16 2014-04-22 Facebook, Inc. Periodic ambient waveform analysis for enhanced social functions
US8949958B1 (en) * 2011-08-25 2015-02-03 Amazon Technologies, Inc. Authentication using media fingerprinting
US9299110B2 (en) * 2011-10-19 2016-03-29 Facebook, Inc. Periodic ambient waveform analysis for dynamic device configuration
US8818810B2 (en) * 2011-12-29 2014-08-26 Robert Bosch Gmbh Speaker verification in a health monitoring system
US8862476B2 (en) * 2012-11-16 2014-10-14 Zanavox Voice-activated signal generator
US9418651B2 (en) 2013-07-31 2016-08-16 Google Technology Holdings LLC Method and apparatus for mitigating false accepts of trigger phrases
WO2016015687A1 (en) * 2014-07-31 2016-02-04 腾讯科技(深圳)有限公司 Voiceprint verification method and device
US9741342B2 (en) 2014-11-26 2017-08-22 Panasonic Intellectual Property Corporation Of America Method and apparatus for recognizing speech by lip reading
WO2020111880A1 (en) * 2018-11-30 2020-06-04 Samsung Electronics Co., Ltd. User authentication method and apparatus

Family Cites Families (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3816722A (en) * 1970-09-29 1974-06-11 Nippon Electric Co Computer for calculating the similarity between patterns and pattern recognition system comprising the similarity computer
US3909532A (en) * 1974-03-29 1975-09-30 Bell Telephone Labor Inc Apparatus and method for determining the beginning and the end of a speech utterance
JPS5876899A (en) * 1981-10-31 1983-05-10 株式会社東芝 Voice segment detector
JPS59178587A (en) * 1983-03-30 1984-10-09 Nec Corp Speaker confirming system
US4618984A (en) * 1983-06-08 1986-10-21 International Business Machines Corporation Adaptive automatic discrete utterance recognition
US4696039A (en) * 1983-10-13 1987-09-22 Texas Instruments Incorporated Speech analysis/synthesis system with silence suppression
US4773093A (en) * 1984-12-31 1988-09-20 Itt Defense Communications Text-independent speaker recognition system and method based on acoustic segment matching
US4630305A (en) * 1985-07-01 1986-12-16 Motorola, Inc. Automatic gain selector for a noise suppression system
US4797929A (en) * 1986-01-03 1989-01-10 Motorola, Inc. Word recognition in a speech recognition system using data reduced word templates
CA1311059C (en) * 1986-03-25 1992-12-01 Bruce Allen Dautrich Speaker-trained speech recognizer having the capability of detecting confusingly similar vocabulary words
GB8613327D0 (en) * 1986-06-02 1986-07-09 British Telecomm Speech processor
US4910784A (en) * 1987-07-30 1990-03-20 Texas Instruments Incorporated Low cost speech recognition system and method
US5072418A (en) * 1989-05-04 1991-12-10 Texas Instruments Incorporated Series maxium/minimum function computing devices, systems and methods
US4937870A (en) * 1988-11-14 1990-06-26 American Telephone And Telegraph Company Speech recognition arrangement
GB2237135A (en) * 1989-10-16 1991-04-24 Logica Uk Ltd Speaker recognition
JPH03155599A (en) * 1989-11-13 1991-07-03 Nec Corp Speech recognition device
US5274695A (en) * 1991-01-11 1993-12-28 U.S. Sprint Communications Company Limited Partnership System for verifying the identity of a caller in a telecommunications network
US5293452A (en) * 1991-07-01 1994-03-08 Texas Instruments Incorporated Voice log-in using spoken name input
US5265191A (en) * 1991-09-17 1993-11-23 At&T Bell Laboratories Technique for voice-based security systems
US5222146A (en) * 1991-10-23 1993-06-22 International Business Machines Corporation Speech recognition apparatus having a speech coder outputting acoustic prototype ranks
CA2080862C (en) * 1991-12-19 1996-11-26 Priya Jakatdar Recognizer for recognizing voice messages in pulse code modulated format
US5452397A (en) * 1992-12-11 1995-09-19 Texas Instruments Incorporated Method and system for preventing entry of confusingly similar phases in a voice recognition system vocabulary list
US5717738A (en) * 1993-01-11 1998-02-10 Texas Instruments Incorporated Method and device for generating user defined spoken speed dial directories
US5698834A (en) * 1993-03-16 1997-12-16 Worthington Data Solutions Voice prompt with voice recognition for portable data collection terminal
US5430827A (en) * 1993-04-23 1995-07-04 At&T Corp. Password verification system
US5664058A (en) * 1993-05-12 1997-09-02 Nynex Science & Technology Method of training a speaker-dependent speech recognizer with automated supervision of training sufficiency
CA2126380C (en) * 1993-07-22 1998-07-07 Wu Chou Minimum error rate training of combined string models
AU7802194A (en) * 1993-09-30 1995-04-18 Apple Computer, Inc. Continuous reference adaptation in a pattern recognition system
US6471420B1 (en) * 1994-05-13 2002-10-29 Matsushita Electric Industrial Co., Ltd. Voice selection apparatus voice response apparatus, and game apparatus using word tables from which selected words are output as voice selections
US5870705A (en) * 1994-10-21 1999-02-09 Microsoft Corporation Method of setting input levels in a voice recognition system
JP3180655B2 (en) * 1995-06-19 2001-06-25 日本電信電話株式会社 Word speech recognition method by pattern matching and apparatus for implementing the method
US5845240A (en) * 1996-07-24 1998-12-01 Fielder; Mark Selective recall and preservation of continuously recorded data
MY119374A (en) * 1995-09-12 2005-05-31 Texas Instruments Inc Method and system for enrolling addresses in a speech recognition database
US5774841A (en) * 1995-09-20 1998-06-30 The United States Of America As Represented By The Adminstrator Of The National Aeronautics And Space Administration Real-time reconfigurable adaptive speech recognition command and control apparatus and method
US5937381A (en) * 1996-04-10 1999-08-10 Itt Defense, Inc. System for voice verification of telephone transactions
US5742694A (en) * 1996-07-12 1998-04-21 Eatwell; Graham P. Noise reduction filter
US5832429A (en) * 1996-09-11 1998-11-03 Texas Instruments Incorporated Method and system for enrolling addresses in a speech recognition database
US6212498B1 (en) * 1997-03-28 2001-04-03 Dragon Systems, Inc. Enrollment in speech recognition
US6012027A (en) * 1997-05-27 2000-01-04 Ameritech Corporation Criteria for usable repetitions of an utterance during speech reference enrollment
US20030091180A1 (en) * 1998-12-23 2003-05-15 Patrik Sorqvist Adaptive signal gain controller, system, and method
US6651040B1 (en) * 2000-05-31 2003-11-18 International Business Machines Corporation Method for dynamic adjustment of audio input gain in a speech system

Also Published As

Publication number Publication date
US20080015858A1 (en) 2008-01-17
DE69822179D1 (en) 2004-04-08
US6012027A (en) 2000-01-04
EP1019904A1 (en) 2000-07-19
CN1278944A (en) 2001-01-03
EP1019904A4 (en) 2000-07-19
DE69822179T2 (en) 2004-07-29
US6249760B1 (en) 2001-06-19
CA2303362C (en) 2005-07-12
US7319956B2 (en) 2008-01-15
EP1019904B1 (en) 2004-03-03
WO1999013456A1 (en) 1999-03-18
US20050036589A1 (en) 2005-02-17
AU9106898A (en) 1999-03-29
ATE261173T1 (en) 2004-03-15

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed