CA2203132A1 - Method and apparatus for adapting the language model's size in a speech recognition system

Method and apparatus for adapting the language model's size in a speech recognition system

Info

Publication number
CA2203132A1
CA2203132A1 CA002203132A CA2203132A CA2203132A1 CA 2203132 A1 CA2203132 A1 CA 2203132A1 CA 002203132 A CA002203132 A CA 002203132A CA 2203132 A CA2203132 A CA 2203132A CA 2203132 A1 CA2203132 A1 CA 2203132A1
Authority
CA
Canada
Prior art keywords
language model
speech recognition
adapting
available
trigrams
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA002203132A
Other languages
French (fr)
Other versions
CA2203132C (en)
Inventor
Upali Bandara
Siegfried Kunzmann
Karlheinz Mohr
Burn L. Lewis
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
Upali Bandara
Siegfried Kunzmann
Karlheinz Mohr
Burn L. Lewis
International Business Machines Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Upali Bandara, Siegfried Kunzmann, Karlheinz Mohr, Burn L. Lewis, International Business Machines Corporation
Publication of CA2203132A1
Application granted
Publication of CA2203132C
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models
    • G10L15/19 Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
    • G10L15/197 Probabilistic grammars, e.g. word n-grams
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/183 Speech classification or search using natural language modelling using context dependencies, e.g. language models

Abstract

Disclosed are a method and an apparatus for adapting, and in particular reducing, the size of a language model comprising word n-grams in a speech recognition system.
The invention provides a mechanism for discarding those n-grams for which the acoustic part of the system requires less support from the language model in order to recognize correctly.
The proposed method is suitable for identifying, at build time of the system, those trigrams in a language model that can be discarded. Also provided is a further automatic classification scheme for words, which allows the language model to be compressed while retaining accuracy. Moreover, the method makes efficient use of sparsely available text corpora, because even singleton trigrams are used when they are helpful. No additional software tools need to be developed, because the main tool, fast match scoring, is a module readily available in known recognizers. The method is further improved by classifying words according to the common text in which they occur, insofar as they are acoustically distinguishable from one another.
The invention makes it possible to offer speech recognition on low-cost personal computers (PCs), and even on portable computers such as laptops.
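
The pruning idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented procedure: the abstract does not give the algorithm, so the grouping by two-word history, the `threshold` parameter, and the function names are all assumptions made for illustration. In particular, `acoustic_distance` is a hypothetical stand-in that uses normalized edit distance over spellings, whereas the abstract names fast match scoring from the recognizer as the actual tool. The sketch keeps a trigram only when its predicted word has an acoustically close competitor after the same history, which is where the acoustic part would need language model support.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # substitute or match
        prev = cur
    return prev[-1]


def acoustic_distance(w1, w2):
    """Hypothetical proxy for an acoustic distance, in [0, 1].

    Assumption: a real system would derive this from fast match scores,
    not from spellings.
    """
    longest = max(len(w1), len(w2))
    return edit_distance(w1, w2) / longest if longest else 0.0


def prune_trigrams(trigram_counts, threshold):
    """Keep trigrams whose predicted word is confusable with another word
    predicted after the same two-word history (assumed pruning criterion)."""
    by_history = defaultdict(list)
    for (w1, w2, w3) in trigram_counts:
        by_history[(w1, w2)].append(w3)

    kept = {}
    for (w1, w2), words in by_history.items():
        for w3 in words:
            others = [w for w in words if w != w3]
            # With no competitor in this history, treat the word as distinct.
            nearest = min((acoustic_distance(w3, w) for w in others), default=1.0)
            if nearest < threshold:
                kept[(w1, w2, w3)] = trigram_counts[(w1, w2, w3)]
    return kept


# Toy counts (illustrative only): "to" and "two" are close, "coffee" is not.
trigrams = {
    ("i", "want", "to"): 50,
    ("i", "want", "two"): 1,
    ("i", "want", "coffee"): 7,
}
kept = prune_trigrams(trigrams, threshold=0.5)
print(sorted(kept))
```

In this toy run the singleton trigram ending in "two" survives despite its count of 1, because it competes acoustically with "to", while the more frequent trigram ending in "coffee" is dropped. The decision depends on confusability rather than on frequency, which is in the spirit of the abstract's remark that singleton trigrams are used when they are helpful.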

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP1995/004337 WO1997017694A1 (en) 1995-11-04 1995-11-04 Method and apparatus for adapting the language model's size in a speech recognition system

Publications (2)

Publication Number Publication Date
CA2203132A1 (en) 1997-05-05
CA2203132C (en) 2004-11-16

Family

ID=8166119

Family Applications (1)

Application Number Priority Date Filing Date Title
CA002203132A (Expired - Fee Related, CA2203132C (en)) 1995-11-04 1995-11-04 Method and apparatus for adapting the language model's size in a speech recognition system

Country Status (6)

Country Link
US (1) US5899973A (en)
EP (1) EP0801786B1 (en)
JP (1) JP3126985B2 (en)
CA (1) CA2203132C (en)
DE (1) DE69517705T2 (en)
WO (1) WO1997017694A1 (en)

Also Published As

Publication number Publication date
EP0801786A1 (en) 1997-10-22
JPH10501078A (en) 1998-01-27
CA2203132C (en) 2004-11-16
DE69517705T2 (en) 2000-11-23
EP0801786B1 (en) 2000-06-28
US5899973A (en) 1999-05-04
WO1997017694A1 (en) 1997-05-15
JP3126985B2 (en) 2001-01-22
DE69517705D1 (en) 2000-08-03

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed