CA2060591A1 - Speaker-independent label coding apparatus
Speaker-independent label coding apparatus
- Publication number
- CA2060591A1
- Authority
- CA
- Canada
- Prior art keywords
- prototypes
- feature
- feature vector
- class
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3082—Vector coding
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
- G10L19/038—Vector quantisation, e.g. TwinVQ audio
Abstract
The present invention relates to speech recognition, and particularly to a new type of vector quantizer and a new vector quantization technique in which the error rate of associating a sound with an incoming speech signal is drastically reduced. To achieve this end, the technique of the present invention groups the feature vectors in a space into different prototypes, at least two of which represent a class of sound. Each of the prototypes may in turn have a number of subclasses or partitions. Each of the prototypes and their subclasses may be assigned respective identifying values.
To identify an incoming speech feature vector, at least one of its feature values is compared with the different values of the respective prototypes, or of the subclasses of the prototypes. The class of sound whose group of prototypes (or at least one of whose prototypes) has a combined value that most closely matches the feature value of the feature vector is deemed to be the class corresponding to the feature vector. The feature vector is then labeled with the identifier associated with that class.
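The labeling step described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the class labels, the two-dimensional prototype coordinates, the scoring function (the exponential of the negative squared distance), and the choice to combine the prototypes of a class by summing their scores.

```python
import math

# Hypothetical prototype table: class label -> list of prototype centroids.
# At least two prototypes represent each class of sound. The labels and
# coordinates are invented for this example.
PROTOTYPES = {
    "AA": [(1.0, 1.0), (1.5, 0.5)],
    "IY": [(4.0, 4.0), (5.0, 3.5)],
}


def prototype_score(vector, centroid):
    """Closeness of a feature vector to one prototype.

    Assumed scoring rule: exp(-squared Euclidean distance).
    """
    dist2 = sum((v - c) ** 2 for v, c in zip(vector, centroid))
    return math.exp(-dist2)


def label_vector(vector, prototypes=PROTOTYPES):
    """Return the label of the class whose prototypes best match the vector.

    The combined value of a class is assumed to be the sum of the scores
    of its prototypes; the class with the largest combined value wins.
    """
    combined = {
        label: sum(prototype_score(vector, c) for c in centroids)
        for label, centroids in prototypes.items()
    }
    return max(combined, key=combined.get)
```

With these made-up prototypes, `label_vector((1.2, 0.8))` selects the class `"AA"`, because both of that class's prototypes lie close to the vector and so its combined score dominates. Summing over a class's prototypes, rather than picking the single nearest prototype, is one possible reading of the "combined value" in the abstract.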
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US07/673,810 | 1991-03-22 | | |
US07/673,810 (US5182773A (en)) | 1991-03-22 | 1991-03-22 | Speaker-independent label coding apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2060591A1 (en) | 1992-09-23 |
CA2060591C (en) | 1996-08-13 |
Family
ID=24704198
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA002060591A (CA2060591C (en), Expired - Fee Related) | 1991-03-22 | 1992-02-04 | Speaker-independent label coding apparatus |
Country Status (5)
Country | Link |
---|---|
US (1) | US5182773A (en) |
EP (1) | EP0504485B1 (en) |
JP (1) | JP3337233B2 (en) |
CA (1) | CA2060591C (en) |
DE (1) | DE69129015T2 (en) |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5276766A (en) * | 1991-07-16 | 1994-01-04 | International Business Machines Corporation | Fast algorithm for deriving acoustic prototypes for automatic speech recognition |
US5280562A (en) * | 1991-10-03 | 1994-01-18 | International Business Machines Corporation | Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer |
US5222146A (en) * | 1991-10-23 | 1993-06-22 | International Business Machines Corporation | Speech recognition apparatus having a speech coder outputting acoustic prototype ranks |
US5497447A (en) * | 1993-03-08 | 1996-03-05 | International Business Machines Corporation | Speech coding apparatus having acoustic prototype vectors generated by tying to elementary models and clustering around reference vectors |
CA2115210C (en) * | 1993-04-21 | 1997-09-23 | Joseph C. Andreshak | Interactive computer system recognizing spoken commands |
US5465317A (en) * | 1993-05-18 | 1995-11-07 | International Business Machines Corporation | Speech recognition system with improved rejection of words and sounds not in the system vocabulary |
US5544277A (en) * | 1993-07-28 | 1996-08-06 | International Business Machines Corporation | Speech coding apparatus and method for generating acoustic feature vector component values by combining values of the same features for multiple time intervals |
US5522011A (en) * | 1993-09-27 | 1996-05-28 | International Business Machines Corporation | Speech coding apparatus and method using classification rules |
US5539860A (en) * | 1993-12-22 | 1996-07-23 | At&T Corp. | Speech recognition using bio-signals |
US5539861A (en) * | 1993-12-22 | 1996-07-23 | At&T Corp. | Speech recognition using bio-signals |
JP3321976B2 (en) * | 1994-04-01 | 2002-09-09 | 富士通株式会社 | Signal processing device and signal processing method |
DE4417557A1 (en) * | 1994-05-19 | 1995-11-23 | Daimler Benz Aerospace Ag | Method for recognizing speech signals and arrangement for carrying out the method |
FR2738383B1 (en) * | 1995-09-05 | 1997-10-03 | Thomson Csf | METHOD FOR VECTOR QUANTIFICATION OF LOW FLOW VOCODERS |
US5963903A (en) * | 1996-06-28 | 1999-10-05 | Microsoft Corporation | Method and system for dynamically adjusted training for speech recognition |
US6151575A (en) * | 1996-10-28 | 2000-11-21 | Dragon Systems, Inc. | Rapid adaptation of speech models |
US6224636B1 (en) | 1997-02-28 | 2001-05-01 | Dragon Systems, Inc. | Speech recognition using nonparametric speech models |
US6023673A (en) * | 1997-06-04 | 2000-02-08 | International Business Machines Corporation | Hierarchical labeler in a speech recognition system |
US5946653A (en) * | 1997-10-01 | 1999-08-31 | Motorola, Inc. | Speaker independent speech recognition system and method |
EP1126438B1 (en) * | 1998-09-09 | 2008-07-16 | Asahi Kasei Kabushiki Kaisha | Speech recognizer and speech recognition method |
US6418409B1 (en) * | 1999-10-26 | 2002-07-09 | Persay Inc. | Error derived scores for detection systems |
US6978274B1 (en) * | 2001-08-31 | 2005-12-20 | Attenex Corporation | System and method for dynamically evaluating latent concepts in unstructured documents |
US7016839B2 (en) * | 2002-01-31 | 2006-03-21 | International Business Machines Corporation | MVDR based feature extraction for speech recognition |
US20040117186A1 (en) * | 2002-12-13 | 2004-06-17 | Bhiksha Ramakrishnan | Multi-channel transcription-based speaker separation |
US7571097B2 (en) * | 2003-03-13 | 2009-08-04 | Microsoft Corporation | Method for training of subspace coded gaussian models |
KR101190875B1 (en) * | 2004-01-30 | 2012-10-15 | 프랑스 뗄레콤 | Dimensional vector and variable resolution quantization |
US7191175B2 (en) | 2004-02-13 | 2007-03-13 | Attenex Corporation | System and method for arranging concept clusters in thematic neighborhood relationships in a two-dimensional visual display space |
KR100629997B1 (en) * | 2004-02-26 | 2006-09-27 | 엘지전자 주식회사 | encoding method of audio signal |
US8713018B2 (en) | 2009-07-28 | 2014-04-29 | Fti Consulting, Inc. | System and method for displaying relationships between electronically stored information to provide classification suggestions via inclusion |
CA3026879A1 (en) | 2009-08-24 | 2011-03-10 | Nuix North America, Inc. | Generating a reference set for use during document review |
WO2017210618A1 (en) | 2016-06-02 | 2017-12-07 | Fti Consulting, Inc. | Analyzing clusters of coded documents |
US11341185B1 (en) * | 2018-06-19 | 2022-05-24 | Amazon Technologies, Inc. | Systems and methods for content-based indexing of videos at web-scale |
Family Cites Families (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4032711A (en) * | 1975-12-31 | 1977-06-28 | Bell Telephone Laboratories, Incorporated | Speaker recognition arrangement |
JPS5722295A (en) * | 1980-07-15 | 1982-02-05 | Nippon Electric Co | Speaker recognizing system |
JPH0792673B2 (en) * | 1984-10-02 | 1995-10-09 | 株式会社東芝 | Recognition dictionary learning method |
US4773093A (en) * | 1984-12-31 | 1988-09-20 | Itt Defense Communications | Text-independent speaker recognition system and method based on acoustic segment matching |
CA1232686A (en) * | 1985-01-30 | 1988-02-09 | Northern Telecom Limited | Speech recognition |
US4819271A (en) * | 1985-05-29 | 1989-04-04 | International Business Machines Corporation | Constructing Markov model word baseforms from multiple utterances by concatenating model sequences for word segments |
US4748670A (en) * | 1985-05-29 | 1988-05-31 | International Business Machines Corporation | Apparatus and method for determining a likely word sequence from labels generated by an acoustic processor |
CA1261472A (en) * | 1985-09-26 | 1989-09-26 | Yoshinao Shiraki | Reference speech pattern generating method |
JPH0792737B2 (en) * | 1986-01-29 | 1995-10-09 | 株式会社ピーエフユー | Video signal display controller |
JPS62231993A (en) * | 1986-03-25 | 1987-10-12 | インタ−ナシヨナル ビジネス マシ−ンズ コ−ポレ−シヨン | Voice recognition |
US4847906A (en) * | 1986-03-28 | 1989-07-11 | American Telephone And Telegraph Company, At&T Bell Laboratories | Linear predictive speech coding arrangement |
US4837831A (en) * | 1986-10-15 | 1989-06-06 | Dragon Systems, Inc. | Method for creating and using multiple-word sound models in speech recognition |
US4805219A (en) * | 1987-04-03 | 1989-02-14 | Dragon Systems, Inc. | Method for speech recognition |
US4926488A (en) * | 1987-07-09 | 1990-05-15 | International Business Machines Corporation | Normalization of speech by adaptive labelling |
US4817156A (en) * | 1987-08-10 | 1989-03-28 | International Business Machines Corporation | Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker |
CA1337217C (en) * | 1987-08-28 | 1995-10-03 | Daniel Kenneth Freeman | Speech coding |
JPH0219480A (en) * | 1988-07-06 | 1990-01-23 | Nippon Yakin Kogyo Co Ltd | Mechanical plating method |
JP2739950B2 (en) * | 1988-03-31 | 1998-04-15 | 株式会社東芝 | Pattern recognition device |
JPH0636156B2 (en) * | 1989-03-13 | 1994-05-11 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Voice recognizer |
JPH0332228A (en) * | 1989-06-29 | 1991-02-12 | Fujitsu Ltd | Gain-shape vector quantization system |
1991
- 1991-03-22: US application US07/673,810, patent US5182773A (en), not active, Expired - Lifetime
- 1991-12-10: EP application EP91121180A, patent EP0504485B1 (en), not active, Expired - Lifetime
- 1991-12-10: DE application DE69129015T, patent DE69129015T2 (en), not active, Expired - Fee Related
1992
- 1992-01-10: JP application JP00326892A, patent JP3337233B2 (en), not active, Expired - Fee Related
- 1992-02-04: CA application CA002060591A, patent CA2060591C (en), not active, Expired - Fee Related
Also Published As
Publication number | Publication date |
---|---|
EP0504485A3 (en) | 1993-05-26 |
JP3337233B2 (en) | 2002-10-21 |
EP0504485B1 (en) | 1998-03-04 |
EP0504485A2 (en) | 1992-09-23 |
CA2060591C (en) | 1996-08-13 |
JPH0535299A (en) | 1993-02-12 |
US5182773A (en) | 1993-01-26 |
DE69129015T2 (en) | 1998-10-29 |
DE69129015D1 (en) | 1998-04-09 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CA2060591A1 (en) | Speaker-independent label coding apparatus | |
CA2013247A1 (en) | Vector quantizer search arrangement | |
CA2379817A1 (en) | Method and system for organizing data | |
TW360859B (en) | Vector quantization method and speech encoding method and apparatus | |
CA2159557A1 (en) | Coding apparatus having adaptive coding at different bit rates and pitch emphasis | |
DE3264367D1 (en) | Speech recognition system and method | |
CA2158847A1 (en) | A Method and Apparatus for Speaker Recognition | |
EP0645755A1 (en) | Speech coding apparatus and method using classification rules | |
SE0201109D0 (en) | Vector quantization method and apparatus | |
JPS6478299A (en) | Voice recognition method and apparatus | |
WO1997010588A1 (en) | Speech recognition system | |
JPH0449717B2 (en) | | |
CA2072721C (en) | Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer | |
Bahl et al. | Context dependent vector quantization for continuous speech recognition | |
CA2108460A1 (en) | Vocabulary memory allocation for adaptive data compression of frame-multiplexed traffic | |
CA2115185A1 (en) | Device for Encoding Speech Spectrum Parameters with a Smallest Possible Number of Bits | |
CA2228109A1 (en) | Speaker recognition system capable of accurately selecting inhibiting reference patterns by using small amount of calculation | |
EP0318858A3 (en) | Connected word recognition system including neural networks arranged along a signal time axis | |
CA2209570A1 (en) | Adaptive transform coding system, adaptive transform decoding system and adaptive transform coding/decoding system | |
EP0395076A3 (en) | Speech coding apparatus | |
Booth et al. | Enhancements to DTW and VQ decision algorithms for speaker recognition | |
FI105248B (en) | A method for transmitting digitized, block-coded audio signals using scaling factors | |
US10498358B2 (en) | Data encoder and data encoding method | |
CA2054849A1 (en) | Speech parameter encoding method capable of transmitting a spectrum parameter at a reduced number of bits | |
JPH04248722A (en) | Data coding method | |
Legal Events
Code | Title |
---|---|
EEER | Examination request |
MKLA | Lapsed |