DE602005014189D1 - System and method for identifying semantic intent from acoustic information - Google Patents
- Publication number
- DE602005014189D1
- Authority
- DE
- Germany
- Prior art keywords
- acoustics
- order
- cluster
- acoustic information
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/183—Speech classification or search using natural language modelling using context dependencies, e.g. language models
- G10L15/19—Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G10L15/1815—Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/009,630 US7634406B2 (en) | 2004-12-10 | 2004-12-10 | System and method for identifying semantic intent from acoustic information |
Publications (1)
Publication Number | Publication Date |
---|---|
DE602005014189D1 (de) | 2009-06-10 |
Family
ID=36021832
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE602005014189T Active DE602005014189D1 (de) | 2004-12-10 | 2005-11-22 | System and method for identifying semantic intent from acoustic information |
Country Status (5)
Country | Link |
---|---|
US (1) | US7634406B2 (de) |
EP (1) | EP1669980B1 (de) |
JP (1) | JP4974510B2 (de) |
AT (1) | ATE430359T1 (de) |
DE (1) | DE602005014189D1 (de) |
Families Citing this family (50)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060277525A1 (en) * | 2005-06-06 | 2006-12-07 | Microsoft Corporation | Lexical, grammatical, and semantic inference mechanisms |
US8700404B1 (en) * | 2005-08-27 | 2014-04-15 | At&T Intellectual Property Ii, L.P. | System and method for using semantic and syntactic graphs for utterance classification |
US7752152B2 (en) * | 2006-03-17 | 2010-07-06 | Microsoft Corporation | Using predictive user models for language modeling on a personal device with user behavior models based on statistical modeling |
US8032375B2 (en) * | 2006-03-17 | 2011-10-04 | Microsoft Corporation | Using generic predictive models for slot values in language modeling |
JP4734155B2 (ja) * | 2006-03-24 | 2011-07-27 | 株式会社東芝 | Speech recognition apparatus, speech recognition method, and speech recognition program |
US7930183B2 (en) * | 2006-03-29 | 2011-04-19 | Microsoft Corporation | Automatic identification of dialog timing problems for an interactive speech dialog application using speech log data indicative of cases of barge-in and timing problems |
US7689420B2 (en) * | 2006-04-06 | 2010-03-30 | Microsoft Corporation | Personalizing a context-free grammar using a dictation language model |
US20070239453A1 (en) * | 2006-04-06 | 2007-10-11 | Microsoft Corporation | Augmenting context-free grammars with back-off grammars for processing out-of-grammar utterances |
US7707027B2 (en) * | 2006-04-13 | 2010-04-27 | Nuance Communications, Inc. | Identification and rejection of meaningless input during natural language classification |
US20080091423A1 (en) * | 2006-10-13 | 2008-04-17 | Shourya Roy | Generation of domain models from noisy transcriptions |
US8108205B2 (en) * | 2006-12-01 | 2012-01-31 | Microsoft Corporation | Leveraging back-off grammars for authoring context-free grammars |
US8712757B2 (en) * | 2007-01-10 | 2014-04-29 | Nuance Communications, Inc. | Methods and apparatus for monitoring communication through identification of priority-ranked keywords |
GB2453366B (en) * | 2007-10-04 | 2011-04-06 | Toshiba Res Europ Ltd | Automatic speech recognition method and apparatus |
US8660844B2 (en) * | 2007-10-24 | 2014-02-25 | At&T Intellectual Property I, L.P. | System and method of evaluating user simulations in a spoken dialog system with a diversion metric |
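JP2010224194A (ja) * | 2009-03-23 | 2010-10-07 | Sony Corp | Speech recognition apparatus and speech recognition method, language model generation apparatus and language model generation method, and computer program |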
US20130219333A1 (en) * | 2009-06-12 | 2013-08-22 | Adobe Systems Incorporated | Extensible Framework for Facilitating Interaction with Devices |
KR101615262B1 (ko) * | 2009-08-12 | 2016-04-26 | 삼성전자주식회사 | Method and apparatus for encoding and decoding multi-channel audio using semantic information |
US8457968B2 (en) * | 2009-12-08 | 2013-06-04 | At&T Intellectual Property I, L.P. | System and method for efficient tracking of multiple dialog states with incremental recombination |
US9378202B2 (en) * | 2010-03-26 | 2016-06-28 | Virtuoz Sa | Semantic clustering |
US8694304B2 (en) | 2010-03-26 | 2014-04-08 | Virtuoz Sa | Semantic clustering and user interfaces |
US8880399B2 (en) * | 2010-09-27 | 2014-11-04 | Rosetta Stone, Ltd. | Utterance verification and pronunciation scoring by lattice transduction |
US9524291B2 (en) | 2010-10-06 | 2016-12-20 | Virtuoz Sa | Visual display of semantic information |
US8688453B1 (en) * | 2011-02-28 | 2014-04-01 | Nuance Communications, Inc. | Intent mining via analysis of utterances |
US8798995B1 (en) | 2011-09-23 | 2014-08-05 | Amazon Technologies, Inc. | Key word determinations from voice data |
US9214157B2 (en) * | 2011-12-06 | 2015-12-15 | At&T Intellectual Property I, L.P. | System and method for machine-mediated human-human conversation |
US9082403B2 (en) * | 2011-12-15 | 2015-07-14 | Microsoft Technology Licensing, Llc | Spoken utterance classification training for a speech recognition system |
US8983840B2 (en) * | 2012-06-19 | 2015-03-17 | International Business Machines Corporation | Intent discovery in audio or text-based conversation |
US9158760B2 (en) * | 2012-12-21 | 2015-10-13 | The Nielsen Company (Us), Llc | Audio decoding with supplemental semantic audio recognition and report generation |
US9195649B2 (en) | 2012-12-21 | 2015-11-24 | The Nielsen Company (Us), Llc | Audio processing techniques for semantic audio recognition and report generation |
US9183849B2 (en) | 2012-12-21 | 2015-11-10 | The Nielsen Company (Us), Llc | Audio matching with semantic audio recognition and report generation |
US9047268B2 (en) * | 2013-01-31 | 2015-06-02 | Google Inc. | Character and word level language models for out-of-vocabulary text input |
US9454240B2 (en) | 2013-02-05 | 2016-09-27 | Google Inc. | Gesture keyboard input of non-dictionary character strings |
US10354677B2 (en) * | 2013-02-28 | 2019-07-16 | Nuance Communications, Inc. | System and method for identification of intent segment(s) in caller-agent conversations |
US9626960B2 (en) * | 2013-04-25 | 2017-04-18 | Nuance Communications, Inc. | Systems and methods for providing metadata-dependent language models |
US8756499B1 (en) * | 2013-04-29 | 2014-06-17 | Google Inc. | Gesture keyboard input of non-dictionary character strings using substitute scoring |
US9733894B2 (en) | 2013-07-02 | 2017-08-15 | 24/7 Customer, Inc. | Method and apparatus for facilitating voice user interface design |
US9842586B2 (en) | 2014-07-09 | 2017-12-12 | Genesys Telecommunications Laboratories, Inc. | System and method for semantically exploring concepts |
US9953646B2 (en) | 2014-09-02 | 2018-04-24 | Belleau Technologies | Method and system for dynamic speech recognition and tracking of prewritten script |
US9858923B2 (en) * | 2015-09-24 | 2018-01-02 | Intel Corporation | Dynamic adaptation of language models and semantic tracking for automatic speech recognition |
KR102429260B1 (ko) * | 2015-10-12 | 2022-08-05 | 삼성전자주식회사 | Apparatus and method for processing control commands based on a voice agent, and agent device |
WO2018009231A1 (en) | 2016-07-08 | 2018-01-11 | Asapp, Inc. | Automatically responding to a request of a user |
US10083451B2 (en) | 2016-07-08 | 2018-09-25 | Asapp, Inc. | Using semantic processing for customer support |
JP6886651B2 (ja) * | 2016-12-08 | 2021-06-16 | 株式会社国際電気通信基礎技術研究所 | Action command generation system, response system, and action command generation method |
US10216832B2 (en) * | 2016-12-19 | 2019-02-26 | Interactions Llc | Underspecification of intents in a natural language processing system |
US10665228B2 (en) * | 2018-05-23 | 2020-05-26 | Bank of America Corporation | Quantum technology for use with extracting intents from linguistics |
US10477028B1 (en) | 2018-07-13 | 2019-11-12 | Bank Of America Corporation | System for processing voice responses using a natural language processing engine |
US11315256B2 (en) * | 2018-12-06 | 2022-04-26 | Microsoft Technology Licensing, Llc | Detecting motion in video using motion vectors |
CN109657186A (zh) * | 2018-12-27 | 2019-04-19 | 广州势必可赢网络科技有限公司 | People counting method, system, and related apparatus |
US11587551B2 (en) | 2020-04-07 | 2023-02-21 | International Business Machines Corporation | Leveraging unpaired text data for training end-to-end spoken language understanding systems |
US20230419979A1 (en) * | 2022-06-28 | 2023-12-28 | Samsung Electronics Co., Ltd. | Online speaker diarization using local and global clustering |
Family Cites Families (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0494573A1 (de) * | 1991-01-08 | 1992-07-15 | International Business Machines Corporation | Method for automatically disambiguating synonym links in an electronic dictionary for a natural language processing system |
NZ248751A (en) * | 1994-03-23 | 1997-11-24 | Ryan John Kevin | Text analysis and coding |
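JP3745403B2 (ja) * | 1994-04-12 | 2006-02-15 | ゼロックス コーポレイション | Method for clustering audio data segments |
JP3453456B2 (ja) * | 1995-06-19 | 2003-10-06 | キヤノン株式会社 | Method and apparatus for designing a state-sharing model, and speech recognition method and apparatus using the state-sharing model |
JP3627299B2 (ja) * | 1995-07-19 | 2005-03-09 | ソニー株式会社 | Speech recognition method and apparatus |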
US5835893A (en) * | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
US5806030A (en) * | 1996-05-06 | 1998-09-08 | Matsushita Electric Ind Co Ltd | Low complexity, high accuracy clustering method for speech recognizer |
US6601055B1 (en) * | 1996-12-27 | 2003-07-29 | Linda M. Roberts | Explanation generation system for a diagnosis support tool employing an inference system |
US5860063A (en) | 1997-07-11 | 1999-01-12 | At&T Corp | Automated meaningful phrase clustering |
US6449612B1 (en) * | 1998-03-17 | 2002-09-10 | Microsoft Corporation | Varying cluster number in a scalable clustering system for use with large databases |
US20030154072A1 (en) * | 1998-03-31 | 2003-08-14 | Scansoft, Inc., A Delaware Corporation | Call analysis |
US6725195B2 (en) * | 1998-08-25 | 2004-04-20 | Sri International | Method and apparatus for probabilistic recognition using small number of state clusters |
US6393460B1 (en) * | 1998-08-28 | 2002-05-21 | International Business Machines Corporation | Method and system for informing users of subjects of discussion in on-line chats |
WO2000025299A1 (de) * | 1998-10-27 | 2000-05-04 | Siemens Aktiengesellschaft | Method and arrangement for class formation for a language model based on linguistic classes |
US6317707B1 (en) * | 1998-12-07 | 2001-11-13 | At&T Corp. | Automatic clustering of tokens from a corpus for grammar acquisition |
US6665681B1 (en) * | 1999-04-09 | 2003-12-16 | Entrieva, Inc. | System and method for generating a taxonomy from a plurality of documents |
AU5451800A (en) * | 1999-05-28 | 2000-12-18 | Sehda, Inc. | Phrase-based dialogue modeling with particular application to creating recognition grammars for voice-controlled user interfaces |
US6526379B1 (en) * | 1999-11-29 | 2003-02-25 | Matsushita Electric Industrial Co., Ltd. | Discriminative clustering methods for automatic speech recognition |
GB0000735D0 (en) | 2000-01-13 | 2000-03-08 | Eyretel Ltd | System and method for analysing communication streams |
US6751621B1 (en) * | 2000-01-27 | 2004-06-15 | Manning & Napier Information Services, Llc. | Construction of trainable semantic vectors and clustering, classification, and searching using trainable semantic vectors |
WO2002029783A1 (en) * | 2000-09-30 | 2002-04-11 | Intel Corporation | Method and system for using rule-based knowledge to build a class-based domain specific statistical language model |
ATE239966T1 (de) * | 2000-11-07 | 2003-05-15 | Ericsson Telefon Ab L M | Use of reference data for speech recognition |
US6937983B2 (en) * | 2000-12-20 | 2005-08-30 | International Business Machines Corporation | Method and system for semantic speech recognition |
JP2002358095A (ja) * | 2001-03-30 | 2002-12-13 | Sony Corp | Speech processing apparatus and speech processing method, and program and recording medium |
EP1380153B1 (de) | 2001-04-19 | 2009-09-02 | BRITISH TELECOMMUNICATIONS public limited company | Voice response system |
US7031909B2 (en) * | 2002-03-12 | 2006-04-18 | Verity, Inc. | Method and system for naming a cluster of words and phrases |
US7085771B2 (en) * | 2002-05-17 | 2006-08-01 | Verity, Inc | System and method for automatically discovering a hierarchy of concepts from a corpus of documents |
US7107207B2 (en) * | 2002-06-19 | 2006-09-12 | Microsoft Corporation | Training machine learning by sequential conditional generalized iterative scaling |
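JP2004198597A (ja) * | 2002-12-17 | 2004-07-15 | Advanced Telecommunication Research Institute International | Computer program for operating a computer as a speech recognition apparatus and sentence classification apparatus, computer program for operating a computer to implement a method of creating a hierarchical language model, and storage medium |
JP4392581B2 (ja) * | 2003-02-20 | 2010-01-06 | ソニー株式会社 | Language processing apparatus and language processing method, and program and recording medium |
JP4828091B2 (ja) * | 2003-03-05 | 2011-11-30 | ヒューレット・パッカード・カンパニー | Clustering method, program, and apparatus |
JP4223841B2 (ja) * | 2003-03-17 | 2009-02-12 | 富士通株式会社 | Spoken dialogue system and method |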
US7103553B2 (en) * | 2003-06-04 | 2006-09-05 | Matsushita Electric Industrial Co., Ltd. | Assistive call center interface |
JP4191021B2 (ja) * | 2003-12-01 | 2008-12-03 | 株式会社国際電気通信基礎技術研究所 | Training apparatus for a domain verifier, domain verification apparatus for input data, and computer program |
- 2004-12-10: US, application US11/009,630, publication US7634406B2 (en), not active (Expired - Fee Related)
- 2005-11-08: JP, application JP2005323894A, publication JP4974510B2 (ja), not active (Expired - Fee Related)
- 2005-11-22: AT, application AT05111074T, publication ATE430359T1 (de), not active (IP Right Cessation)
- 2005-11-22: DE, application DE602005014189T, publication DE602005014189D1 (de), active
- 2005-11-22: EP, application EP05111074A, publication EP1669980B1 (de), not active (Not-in-force)
Also Published As
Publication number | Publication date |
---|---|
JP4974510B2 (ja) | 2012-07-11 |
EP1669980A2 (de) | 2006-06-14 |
EP1669980B1 (de) | 2009-04-29 |
US7634406B2 (en) | 2009-12-15 |
US20060129397A1 (en) | 2006-06-15 |
EP1669980A3 (de) | 2007-11-28 |
JP2006171710A (ja) | 2006-06-29 |
ATE430359T1 (de) | 2009-05-15 |
Similar Documents
Publication | Title |
---|---|
DE602005014189D1 (de) | System and method for identifying semantic intent from acoustic information |
Johnsrude et al. | Swinging at a cocktail party: Voice familiarity aids speech perception in the presence of a competing voice | |
Masataka | Music, evolution and language | |
CN104123115B (zh) | Audio information processing method and electronic device |
CN105975569A (zh) | Speech processing method and terminal |
ATE440334T1 (de) | System for voice-controlled selection of an audio file and method therefor |
CN102831891A (zh) | Speech data processing method and system |
CN105096941A (zh) | Speech recognition method and apparatus |
WO2007056344A3 (en) | Techniques for model optimization for statistical pattern recognition |
CN107210036A (zh) | Meeting word cloud |
CN106067310A (zh) | Recording data processing method and apparatus |
ATE407411T1 (de) | Method for providing account information and system for transcribing dictated text |
CN109754788A (zh) | Voice control method, apparatus, device, and storage medium |
CN104867494A (zh) | Method and system for naming and classifying recording files |
Guerreiro et al. | Scanning for digital content: How blind and sighted people perceive concurrent speech | |
Esposito et al. | Judgment of infant cry: the roles of acoustic characteristics and sociodemographic characteristics | |
DE602006009078D1 (de) | Method and device for editing voice messages |
ATE362271T1 (de) | Method and system for improving accuracy in a speech recognition system |
Warner | Reduction | |
DE60214850D1 (de) | Pattern processing system specific to a user group |
WO2007134049A3 (en) | Multichannel content personalization system and method | |
US10522135B2 (en) | System and method for segmenting audio files for transcription | |
Smith | Futures of hearing pasts | |
Bangerter et al. | Managing third-party interruptions in conversations: Effects of duration and conversational role | |
Dufour et al. | Access to talker-specific representations is dependent on word frequency |
Legal Events
Code | Title |
---|---|
8364 | No opposition during term of opposition |