CA2060591A1 - Speaker-independent label coding apparatus

Speaker-independent label coding apparatus

Info

Publication number
CA2060591A1
Authority
CA
Canada
Prior art keywords
prototypes
feature
feature vector
class
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CA2060591A
Other languages
French (fr)
Other versions
CA2060591C (en)
Inventor
Lalit R. Bahl
Michael A. Picheny
David Nahamoo
Peter V. De Souza
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Publication of CA2060591A1
Application granted
Publication of CA2060591C
Anticipated expiration
Expired - Fee Related

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3082 Vector coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032 Quantisation or dequantisation of spectral components
    • G10L19/038 Vector quantisation, e.g. TwinVQ audio

Abstract

The present invention relates to speech recognition, and in particular to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To this end, the technique groups the feature vectors in a space into different prototypes, at least two of which together represent a class of sound. Each prototype may in turn have a number of subclasses or partitions. Each prototype and each of its subclasses may be assigned respective identifying values.
To identify an incoming speech feature vector, at least one of its feature values is compared with the corresponding values of the prototypes or of their subclasses. The class of sound whose group of prototypes, or at least one of whose prototypes, has the combined value that most closely matches the feature value of the feature vector is deemed to be the class corresponding to that feature vector. The feature vector is then labeled with the identifier associated with that class.
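
The labeling step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how prototype values are defined or combined, so the following are assumptions made only for illustration: each prototype is modeled as a diagonal Gaussian with a mean, a variance and a prior weight, and the "combined value" of a class is taken to be the sum of its prototypes' weighted densities. The names Prototype, prototype_score and label, the class identifiers "AA" and "IY", and all numeric values are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Prototype:
    # Assumed identifying values of a prototype (not specified in the abstract).
    mean: Sequence[float]
    var: Sequence[float]
    weight: float


def prototype_score(x: Sequence[float], p: Prototype) -> float:
    """Weighted diagonal-Gaussian density of feature vector x under prototype p."""
    log_density = 0.0
    for xi, m, v in zip(x, p.mean, p.var):
        log_density += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
    return p.weight * math.exp(log_density)


def label(x: Sequence[float], classes: Dict[str, List[Prototype]]) -> str:
    """Return the identifier of the class whose prototypes' summed score is highest."""
    return max(
        classes,
        key=lambda name: sum(prototype_score(x, p) for p in classes[name]),
    )


# Two hypothetical sound classes, each represented by two prototypes.
classes = {
    "AA": [
        Prototype([0.0, 0.0], [1.0, 1.0], 0.5),
        Prototype([1.0, 1.0], [1.0, 1.0], 0.5),
    ],
    "IY": [
        Prototype([5.0, 5.0], [1.0, 1.0], 0.5),
        Prototype([6.0, 4.0], [1.0, 1.0], 0.5),
    ],
}

print(label([0.4, 0.7], classes))  # prints "AA"
```

Because the scores of all prototypes in a class are summed before classes are compared, a feature vector that falls between two prototypes of the same class can still be assigned to that class even when no single prototype is the closest match overall.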
CA002060591A 1991-03-22 1992-02-04 Speaker-independent label coding apparatus Expired - Fee Related CA2060591C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US07/673,810 1991-03-22
US07/673,810 US5182773A (en) 1991-03-22 1991-03-22 Speaker-independent label coding apparatus

Publications (2)

Publication Number Publication Date
CA2060591A1 (en) 1992-09-23
CA2060591C (en) 1996-08-13

Family

ID=24704198

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002060591A Expired - Fee Related CA2060591C (en) 1991-03-22 1992-02-04 Speaker-independent label coding apparatus

Country Status (5)

Country Link
US (1) US5182773A (en)
EP (1) EP0504485B1 (en)
JP (1) JP3337233B2 (en)
CA (1) CA2060591C (en)
DE (1) DE69129015T2 (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276766A (en) * 1991-07-16 1994-01-04 International Business Machines Corporation Fast algorithm for deriving acoustic prototypes for automatic speech recognition
US5280562A (en) * 1991-10-03 1994-01-18 International Business Machines Corporation Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
US5222146A (en) * 1991-10-23 1993-06-22 International Business Machines Corporation Speech recognition apparatus having a speech coder outputting acoustic prototype ranks
US5497447A (en) * 1993-03-08 1996-03-05 International Business Machines Corporation Speech coding apparatus having acoustic prototype vectors generated by tying to elementary models and clustering around reference vectors
CA2115210C (en) * 1993-04-21 1997-09-23 Joseph C. Andreshak Interactive computer system recognizing spoken commands
US5465317A (en) * 1993-05-18 1995-11-07 International Business Machines Corporation Speech recognition system with improved rejection of words and sounds not in the system vocabulary
US5544277A (en) * 1993-07-28 1996-08-06 International Business Machines Corporation Speech coding apparatus and method for generating acoustic feature vector component values by combining values of the same features for multiple time intervals
US5522011A (en) * 1993-09-27 1996-05-28 International Business Machines Corporation Speech coding apparatus and method using classification rules
US5539860A (en) * 1993-12-22 1996-07-23 At&T Corp. Speech recognition using bio-signals
US5539861A (en) * 1993-12-22 1996-07-23 At&T Corp. Speech recognition using bio-signals
JP3321976B2 (en) * 1994-04-01 2002-09-09 富士通株式会社 Signal processing device and signal processing method
DE4417557A1 (en) * 1994-05-19 1995-11-23 Daimler Benz Aerospace Ag Method for recognizing speech signals and arrangement for carrying out the method
FR2738383B1 (en) * 1995-09-05 1997-10-03 Thomson Csf Method for vector quantization of low bit rate vocoders
US5963903A (en) * 1996-06-28 1999-10-05 Microsoft Corporation Method and system for dynamically adjusted training for speech recognition
US6151575A (en) * 1996-10-28 2000-11-21 Dragon Systems, Inc. Rapid adaptation of speech models
US6224636B1 (en) 1997-02-28 2001-05-01 Dragon Systems, Inc. Speech recognition using nonparametric speech models
US6023673A (en) * 1997-06-04 2000-02-08 International Business Machines Corporation Hierarchical labeler in a speech recognition system
US5946653A (en) * 1997-10-01 1999-08-31 Motorola, Inc. Speaker independent speech recognition system and method
EP1126438B1 (en) * 1998-09-09 2008-07-16 Asahi Kasei Kabushiki Kaisha Speech recognizer and speech recognition method
US6418409B1 (en) * 1999-10-26 2002-07-09 Persay Inc. Error derived scores for detection systems
US6978274B1 (en) * 2001-08-31 2005-12-20 Attenex Corporation System and method for dynamically evaluating latent concepts in unstructured documents
US7016839B2 (en) * 2002-01-31 2006-03-21 International Business Machines Corporation MVDR based feature extraction for speech recognition
US20040117186A1 (en) * 2002-12-13 2004-06-17 Bhiksha Ramakrishnan Multi-channel transcription-based speaker separation
US7571097B2 (en) * 2003-03-13 2009-08-04 Microsoft Corporation Method for training of subspace coded gaussian models
KR101190875B1 (en) * 2004-01-30 2012-10-15 프랑스 뗄레콤 Dimensional vector and variable resolution quantization
US7191175B2 (en) 2004-02-13 2007-03-13 Attenex Corporation System and method for arranging concept clusters in thematic neighborhood relationships in a two-dimensional visual display space
KR100629997B1 (en) * 2004-02-26 2006-09-27 엘지전자 주식회사 encoding method of audio signal
US8713018B2 (en) 2009-07-28 2014-04-29 Fti Consulting, Inc. System and method for displaying relationships between electronically stored information to provide classification suggestions via inclusion
CA3026879A1 (en) 2009-08-24 2011-03-10 Nuix North America, Inc. Generating a reference set for use during document review
WO2017210618A1 (en) 2016-06-02 2017-12-07 Fti Consulting, Inc. Analyzing clusters of coded documents
US11341185B1 (en) * 2018-06-19 2022-05-24 Amazon Technologies, Inc. Systems and methods for content-based indexing of videos at web-scale

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4032711A (en) * 1975-12-31 1977-06-28 Bell Telephone Laboratories, Incorporated Speaker recognition arrangement
JPS5722295A (en) * 1980-07-15 1982-02-05 Nippon Electric Co Speaker recognizing system
JPH0792673B2 (en) * 1984-10-02 1995-10-09 株式会社東芝 Recognition dictionary learning method
US4773093A (en) * 1984-12-31 1988-09-20 Itt Defense Communications Text-independent speaker recognition system and method based on acoustic segment matching
CA1232686A (en) * 1985-01-30 1988-02-09 Northern Telecom Limited Speech recognition
US4819271A (en) * 1985-05-29 1989-04-04 International Business Machines Corporation Constructing Markov model word baseforms from multiple utterances by concatenating model sequences for word segments
US4748670A (en) * 1985-05-29 1988-05-31 International Business Machines Corporation Apparatus and method for determining a likely word sequence from labels generated by an acoustic processor
CA1261472A (en) * 1985-09-26 1989-09-26 Yoshinao Shiraki Reference speech pattern generating method
JPH0792737B2 (en) * 1986-01-29 1995-10-09 株式会社ピーエフユー Video signal display controller
JPS62231993A (en) * 1986-03-25 1987-10-12 インタ−ナシヨナル ビジネス マシ−ンズ コ−ポレ−シヨン Voice recognition
US4847906A (en) * 1986-03-28 1989-07-11 American Telephone And Telegraph Company, At&T Bell Laboratories Linear predictive speech coding arrangement
US4837831A (en) * 1986-10-15 1989-06-06 Dragon Systems, Inc. Method for creating and using multiple-word sound models in speech recognition
US4805219A (en) * 1987-04-03 1989-02-14 Dragon Systems, Inc. Method for speech recognition
US4926488A (en) * 1987-07-09 1990-05-15 International Business Machines Corporation Normalization of speech by adaptive labelling
US4817156A (en) * 1987-08-10 1989-03-28 International Business Machines Corporation Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker
CA1337217C (en) * 1987-08-28 1995-10-03 Daniel Kenneth Freeman Speech coding
JPH0219480A (en) * 1988-07-06 1990-01-23 Nippon Yakin Kogyo Co Ltd Mechanical plating method
JP2739950B2 (en) * 1988-03-31 1998-04-15 株式会社東芝 Pattern recognition device
JPH0636156B2 (en) * 1989-03-13 1994-05-11 インターナショナル・ビジネス・マシーンズ・コーポレーション Voice recognizer
JPH0332228A (en) * 1989-06-29 1991-02-12 Fujitsu Ltd Gain-shape vector quantization system

Also Published As

Publication number Publication date
EP0504485A3 (en) 1993-05-26
JP3337233B2 (en) 2002-10-21
EP0504485B1 (en) 1998-03-04
EP0504485A2 (en) 1992-09-23
CA2060591C (en) 1996-08-13
JPH0535299A (en) 1993-02-12
US5182773A (en) 1993-01-26
DE69129015T2 (en) 1998-10-29
DE69129015D1 (en) 1998-04-09

Similar Documents

Publication Publication Date Title
CA2060591A1 (en) Speaker-independent label coding apparatus
CA2013247A1 (en) Vector quantizer search arrangement
CA2379817A1 (en) Method and system for organizing data
TW360859B (en) Vector quantization method and speech encoding method and apparatus
CA2159557A1 (en) Coding apparatus having adaptive coding at different bit rates and pitch emphasis
DE3264367D1 (en) Speech recognition system and method
CA2158847A1 (en) A Method and Apparatus for Speaker Recognition
EP0645755A1 (en) Speech coding apparatus and method using classification rules
SE0201109D0 (en) Vector quantization method and apparatus
JPS6478299A (en) Voice recognition method and apparatus
WO1997010588A1 (en) Speech recognition system
JPH0449717B2 (en)
CA2072721C (en) Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
Bahl et al. Context dependent vector quantization for continuous speech recognition
CA2108460A1 (en) Vocabulary memory allocation for adaptive data compression of frame-multiplexed traffic
CA2115185A1 (en) Device for Encoding Speech Spectrum Parameters with a Smallest Possible Number of Bits
CA2228109A1 (en) Speaker recognition system capable of accurately selecting inhibiting reference patterns by using small amount of calculation
EP0318858A3 (en) Connected word recognition system including neural networks arranged along a signal time axis
CA2209570A1 (en) Adaptive transform coding system, adaptive transform decoding system and adaptive transform coding/decoding system
EP0395076A3 (en) Speech coding apparatus
Booth et al. Enhancements to DTW and VQ decision algorithms for speaker recognition
FI105248B (en) A method for transmitting digitized, block-coded audio signals using scaling factors
US10498358B2 (en) Data encoder and data encoding method
CA2054849A1 (en) Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits
JPH04248722A (en) Data coding method

Legal Events

Date Code Title Description
EEER Examination request
MKLA Lapsed