CN1138386A - Distributed voice recognition system - Google Patents

Publication number
CN1138386A
Authority
CN
China
Prior art keywords
word
unit
voice
voice recognition
lpc
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN94194566A
Other languages
English (en)
Other versions
CN1119794C (zh
Inventor
保罗·E·雅克布斯
张成生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qualcomm Inc
Original Assignee
Qualcomm Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qualcomm Inc filed Critical Qualcomm Inc
Publication of CN1138386A publication Critical patent/CN1138386A/zh
Application granted granted Critical
Publication of CN1119794C publication Critical patent/CN1119794C/zh
Anticipated expiration legal-status Critical
Expired - Lifetime legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis

Abstract

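A voice recognition system having a feature extraction apparatus (22) located in a remote station (40). The feature extraction apparatus (22) extracts features from an input speech frame and provides the extracted features to a central processing station (42). In the central processing station (42), the features are provided to a word decoder (48), which determines the syntax of the input speech frame.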

Description

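Distributed voice recognition system
Field of the Invention
The present invention relates to speech signal processing. More particularly, it relates to a novel method and apparatus for realizing a distributed implementation of a standard voice recognition system.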
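Background Art
Voice recognition is one of the most important techniques for endowing a machine with simulated intelligence so that it can recognize a user or user-voiced commands and facilitate human-machine interaction. It is also a key technique for understanding human speech. Systems that employ techniques to recover a linguistic message from an acoustic speech signal are called voice recognizers (VR). A voice recognizer is composed of an acoustic processor and a word decoder. The acoustic processor extracts from the incoming raw speech the sequence of information-bearing features (vectors) that the VR needs. The word decoder decodes this sequence of features (vectors) to yield a meaningful and desired output format, such as the sequence of linguistic words corresponding to the input utterance. To improve the performance of a given system, training is required to equip the system with valid parameters. In other words, the system needs to learn before it can function optimally.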
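The acoustic processor is the front-end speech analysis subsystem of a voice recognizer. In response to an input speech signal, it provides an appropriate representation to characterize the time-varying speech signal. It should discard irrelevant information such as background noise, channel distortion, speaker characteristics and manner of speaking. Efficient acoustic features give the voice recognizer higher acoustic discrimination power. The most useful characteristic is the short-time spectral envelope. The two most commonly used spectral analysis techniques for characterizing the short-time spectral envelope are linear predictive coding (LPC) and filter-bank based spectral analysis. However, as discussed in L. R. Rabiner and R. W. Schafer, "Digital Processing of Speech Signals" (Prentice Hall, 1978), it is readily shown that LPC not only provides a good approximation to the vocal tract spectral envelope but is also considerably less expensive computationally than the filter-bank model in all-digital implementations. Experience has also shown that the performance of LPC-based voice recognizers is comparable to, or better than, that of filter-bank based recognizers; see L. R. Rabiner and B. H. Juang, "Fundamentals of Speech Recognition" (Prentice Hall, 1993).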
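Referring to Figure 1, in an LPC-based acoustic processor the input speech is provided to a microphone (not shown) and converted to an analog electrical signal. An A/D converter (not shown) then digitizes this signal. To spectrally flatten the digitized speech signal and make it less susceptible to finite-precision effects in subsequent signal processing, the signal is passed through pre-emphasis filter 2. The pre-emphasis filtered speech is then provided to segmentation unit 4, where it is segmented, or blocked, into frames that may or may not overlap in time. The frames of speech data are then provided to windowing unit 6, where the DC component of each frame is removed and a digital windowing operation is applied to each frame to lessen the blocking effects caused by discontinuities at frame boundaries. The window function most commonly used in LPC analysis is the Hamming window w(n), defined as:
$$w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1 \qquad (1)$$
The windowed speech is provided to LPC analysis unit 8. In LPC analysis unit 8, autocorrelation functions are calculated from the windowed samples, and the corresponding LPC parameters are obtained directly from the autocorrelation functions.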
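The front-end chain described above (pre-emphasis, framing, DC removal, Hamming windowing, autocorrelation, LPC analysis) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pre-emphasis coefficient 0.95, the function names, and the use of the Levinson-Durbin recursion to solve for the LPC parameters are assumptions, since the text only says the parameters are obtained from the autocorrelation functions. The sign convention is $A(z) = 1 - \sum_i a_i z^{-i}$.

```python
import math

def pre_emphasize(x, alpha=0.95):
    # y[n] = x[n] - alpha * x[n-1]; alpha = 0.95 is an assumed, typical value.
    # Assumes x is non-empty.
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def split_frames(x, size, step):
    # step < size gives overlapping frames, step == size non-overlapping ones.
    return [x[i:i + size] for i in range(0, len(x) - size + 1, step)]

def hamming(N):
    # Equation (1); assumes N >= 2.
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]

def window_frame(frame):
    # Remove the frame's DC component, then apply the Hamming window.
    mean = sum(frame) / len(frame)
    return [(s - mean) * w for s, w in zip(frame, hamming(len(frame)))]

def autocorrelation(frame, order):
    # r[k] = sum_n frame[n] * frame[n + k], for k = 0..order
    N = len(frame)
    return [sum(frame[n] * frame[n + k] for n in range(N - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    # Solves the autocorrelation normal equations for a_1..a_P.
    # Assumes r[0] > 0 (the frame is not all zeros).
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= (1.0 - k * k)
    return a[1:], err                # ([a_1..a_P], prediction error energy)
```

A typical call sequence would be `pre_emphasize`, then `split_frames`, then `window_frame`, `autocorrelation` and `levinson_durbin` on each frame.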
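Generally speaking, the word decoder translates the acoustic feature sequence produced by the acoustic processor into an estimate of the speaker's original word string. This is accomplished in two steps: acoustic pattern matching and language modeling. Language modeling can be avoided in isolated word recognition applications. The LPC parameters from LPC analysis unit 8 are provided to acoustic pattern matching unit 10, which detects and classifies possible acoustic patterns such as phonemes, syllables and words. The candidate patterns are provided to language modeling unit 12, which models the rules of syntactic constraints that determine which word sequences are grammatically well formed and meaningful. Syntactic information can be a valuable guide to voice recognition when acoustic information alone is ambiguous. Based on language modeling, the VR sequentially interprets the acoustic feature matching results and provides the estimated word string.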
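Both the acoustic pattern matching and the language modeling in the word decoder require a mathematical model, either deterministic or stochastic, to describe the speaker's phonological and acoustic-phonetic variations. The performance of a speech recognition system is directly related to the quality of these two models. Among the various classes of models for acoustic pattern matching, template-based dynamic time warping (DTW) and stochastic hidden Markov modeling (HMM) are the two most commonly used. However, it is known that a DTW-based approach can be viewed as a special case of an HMM-based one, which is a parametric, doubly stochastic model. HMM systems are currently the most successful speech recognition algorithms. The doubly stochastic property of the HMM provides better flexibility in absorbing the acoustic as well as the temporal variations associated with speech signals, which usually results in improved recognition accuracy. As for the language model, a stochastic model called the k-gram language model has been applied successfully in practical large-vocabulary voice recognition systems; see F. Jelinek, "The Development of an Experimental Discrete Dictation Recognizer", Proceedings of the IEEE, vol. 73, pp. 1616-1624, 1985. For small vocabularies, a deterministic grammar has been formulated as a finite state network (FSN) in an airline reservation and information system; see L. R. Rabiner and S. Z. Levinson, "A Speaker-Independent, Syntax-Directed, Connected Word Recognition System Based on Hidden Markov Model and Level Building", IEEE Transactions on ASSP, vol. 33, no. 3, June 1985.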
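To make the template-based DTW approach concrete, the sketch below aligns an input feature sequence against stored templates and picks the closest one. It is a textbook formulation under stated assumptions: scalar features, absolute difference as the local distance, and the symmetric step pattern (match, insertion, deletion) are illustrative choices that the patent does not specify.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    # D[i][j] = cost of the best alignment of a[:i] with b[:j]
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # a[i-1] repeats against b
                                 D[i][j - 1],      # b[j-1] repeats against a
                                 D[i - 1][j - 1])  # both sequences advance
    return D[n][m]

def classify(features, templates):
    # templates: hypothetical dict mapping a word label to a reference sequence
    return min(templates, key=lambda word: dtw_distance(features, templates[word]))
```

Because the warping path may repeat frames, an utterance spoken more slowly than its template can still match it at zero cost.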
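Statistically, to minimize the probability of recognition error, the voice recognition problem can be formalized as follows: given the acoustic evidence observation O, the recognition operation is to find the most likely word string W* such that
$$W^{*} = \arg\max_{W} P(W \mid O) \qquad (1)$$
where the maximization is over all possible word strings W. By Bayes' rule, the posterior probability in the above equation can be rewritten as
$$P(W \mid O) = \frac{P(W)\,P(O \mid W)}{P(O)} \qquad (2)$$
Since P(O) is irrelevant to recognition, the word string estimate can alternatively be obtained as
$$W^{*} = \arg\max_{W} P(W)\,P(O \mid W) \qquad (3)$$
Here P(W) is the a priori probability that the word string W will be uttered, and P(O|W) is the probability that the acoustic evidence O will be observed given that the speaker uttered the word sequence W. P(O|W) is determined by acoustic pattern matching, while the a priori probability P(W) is defined by the language model used.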
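The decision rule in Equation (3) can be illustrated with a toy example in which the language model prior overrides a slightly better acoustic score. The candidate strings and all probability values below are made up for illustration; sums of logarithms are used instead of products, a common practice to avoid numerical underflow.

```python
import math

def best_word_string(candidates):
    # candidates: dict mapping a word string W to the pair (P(W), P(O|W)).
    # Both probabilities are assumed to be strictly positive.
    return max(
        candidates,
        key=lambda w: math.log(candidates[w][0]) + math.log(candidates[w][1]),
    )

# Hypothetical scores: "call Rome" matches the acoustics slightly better,
# but "call home" is the more probable word string a priori.
scores = {
    "call home": (0.6, 0.3),   # P(W) * P(O|W) = 0.18
    "call Rome": (0.4, 0.4),   # P(W) * P(O|W) = 0.16
}
```

With these numbers `best_word_string(scores)` selects "call home", whereas ranking by P(O|W) alone would select "call Rome".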
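In connected word recognition, if the vocabulary is small (fewer than 100 words), a deterministic grammar can be used to rigidly govern which words can logically follow other words to form legal sentences in the language. The deterministic grammar can be incorporated implicitly in the acoustic matching algorithm to constrain the search space of potential words and dramatically reduce the computation. However, when the vocabulary is medium (greater than 100 but less than 1000) or large (greater than 1000), stochastic language modeling can be used to obtain the probability of a word sequence $W = (w_1, w_2, \ldots, w_n)$. By simple probability theory, the prior probability P(W) can be decomposed as
$$P(W) = P(w_1, w_2, \ldots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_1, w_2, \ldots, w_{i-1}) \qquad (4)$$
where $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ is the probability that $w_i$ will be spoken given that the word sequence $(w_1, w_2, \ldots, w_{i-1})$ precedes it. The choice of $w_i$ depends on the entire past history of input words. For a vocabulary of size V, $V^i$ values are required to specify $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ completely. Even for a medium vocabulary, this requires a formidable number of samples to train the language model. An inaccurate estimate of $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ due to insufficient training data will depreciate the results of the original acoustic matching.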
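A practical solution to the above problem is to assume that $w_i$ depends only on the (k-1) preceding words $w_{i-1}, w_{i-2}, \ldots, w_{i-k+1}$. A stochastic language model can then be completely described in terms of $P(w_i \mid w_{i-1}, w_{i-2}, \ldots, w_{i-k+1})$, from which the k-gram language model is derived. Since most word strings will never occur in the language if k > 3, the unigram (k=1), bigram (k=2) and trigram (k=3) are the most powerful stochastic language models that take grammar into consideration statistically. A language model contains both syntactic and semantic information that is valuable for recognition, but these probabilities must be trained from a large collection of speech data. When the available training data are relatively limited, so that a given k-gram may never occur in the data, $P(w_i \mid w_{i-2}, w_{i-1})$ can be estimated directly from the bigram probability $P(w_i \mid w_{i-1})$. Details of this process can be found in F. Jelinek, "The Development of an Experimental Discrete Dictation Recognizer", Proceedings of the IEEE, vol. 73, pp. 1616-1624, 1985. In connected word recognition, the whole-word model is used as the basic speech unit, while in continuous voice recognition, subword units such as phonemes, syllables or demisyllables may be used as the basic speech unit. The word decoder is modified accordingly.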
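The fallback from trigram to bigram estimates mentioned above can be sketched with plain relative-frequency counts. This is only an illustration under simplifying assumptions: the toy corpus, the `<s>` start-of-sentence padding symbol and the function names are hypothetical, and no discounting or renormalization is applied, so the fallback values are not a properly normalized back-off distribution.

```python
from collections import Counter

def train(sentences):
    # Count bigrams and trigrams together with their context counts.
    ctx1, ctx2, bi, tri = Counter(), Counter(), Counter(), Counter()
    for s in sentences:
        w = ["<s>", "<s>"] + s.split()   # "<s>" pads the sentence start
        for i in range(2, len(w)):
            ctx1[w[i - 1]] += 1
            bi[(w[i - 1], w[i])] += 1
            ctx2[(w[i - 2], w[i - 1])] += 1
            tri[(w[i - 2], w[i - 1], w[i])] += 1
    return ctx1, ctx2, bi, tri

def bigram_prob(model, prev, word):
    ctx1, _, bi, _ = model
    return bi[(prev, word)] / ctx1[prev] if ctx1[prev] else 0.0

def trigram_prob(model, prev2, prev1, word):
    _, ctx2, _, tri = model
    if tri[(prev2, prev1, word)] > 0:
        return tri[(prev2, prev1, word)] / ctx2[(prev2, prev1)]
    # The trigram never occurred in training: use the bigram estimate instead.
    return bigram_prob(model, prev1, word)

model = train(["call home now", "call the office", "dial home now"])
```

For scale, a complete trigram table over a vocabulary of V = 1000 words would have up to $V^3 = 10^9$ entries, which is why sparse counts and fallbacks of this kind are needed.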
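Conventional voice recognition systems integrate the acoustic processor and the word decoder without taking into account their separability, the limitations of application systems (such as power consumption and memory availability) and the characteristics of communication channels. This motivates the interest in devising a distributed voice recognition system in which these two components are appropriately separated.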
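Summary of the Invention
The present invention is a novel and improved distributed voice recognition system in which (i) the front-end acoustic processor can be LPC based or filter-bank based; (ii) the acoustic pattern matching in the word decoder can be based on hidden Markov models (HMM), dynamic time warping (DTW) or even neural networks (NN); and (iii) for connected or continuous word recognition, the language model can be based on deterministic or stochastic grammars. The present invention differs from the conventional voice recognizer in that it improves system performance by appropriately separating the feature extraction and word decoding components. As the embodiments below describe, if LPC-based features such as cepstrum coefficients are to be sent over a communication channel, a transformation between LPC and LSP can be used to reduce the effect of noise on the feature sequence.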
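Brief Description of the Drawings
The features, objects and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding parts throughout, and wherein:
Figure 1 is a block diagram of a conventional voice recognition system;
Figure 2 is a block diagram of an embodiment of the present invention in a wireless communication environment;
Figure 3 is a general block diagram of the present invention;
Figure 4 is a block diagram of an embodiment of the transform unit and inverse transform unit of the present invention; and
Figure 5 is a block diagram of a preferred embodiment of the present invention comprising a local word detector and a remote word detector.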
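Best Mode for Carrying Out the Invention
In a standard voice recognizer, whether in recognition or in training, most of the computational complexity is concentrated in the word decoder subsystem. When implementing a voice recognizer with a distributed system architecture, it is usually desirable to place the word decoding task in the subsystem that can appropriately absorb the computational load. The acoustic processor, however, should reside as close to the speech source as possible, to reduce the effects of quantization errors introduced by signal processing and/or channel-induced errors.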
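Figure 2 shows an embodiment of the present invention. In this embodiment the environment is a wireless communication system comprising a portable cellular telephone or personal communication device 40 and a central communication center 42, referred to as a cell base station. In this embodiment a distributed VR system is presented. In the distributed VR, the acoustic processor, or feature extraction unit 22, resides in personal communication device 40, and word decoder 48 resides in the central communication center. If, instead of distributed VR, VR were implemented solely in the portable cellular phone, it would be highly infeasible even for medium-vocabulary connected word recognition because of the high computational cost. If, on the other hand, VR resided only at the base station, accuracy could be decreased dramatically by the degradation of the speech signal associated with the speech codec and by channel effects. Evidently, the proposed distributed system design has three advantages. The first is a reduction in the cost of the cellular telephone, because the word decoder hardware no longer resides in telephone 40. The second is a reduced drain on the battery (not shown) of portable telephone 40, which would otherwise result from performing the computationally intensive word decoder operation locally. The third is the expected improvement in recognition accuracy, in addition to the flexibility and extendibility of the distributed system.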
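Speech is provided to microphone 20, which converts the speech signal into an electrical signal that is provided to feature extraction unit 22. The signal from microphone 20 may be analog or digital. If the signal is analog, an analog-to-digital converter (not shown) is needed between microphone 20 and feature extraction unit 22. Feature extraction unit 22 extracts the relevant characteristics of the input speech that will be used to decode the linguistic interpretation of the input speech. One example of a characteristic that can be used to estimate speech is the frequency characteristics of an input speech frame, which are frequently provided as the linear predictive coding parameters of the input frame of speech. The extracted speech features are then provided to transmitter 24, which codes, modulates and amplifies the extracted feature signal and provides the modulated features through duplexer 26 to antenna 28, where the speech features are transmitted to the cellular base station or central communication center 42. Any of the digital coding, modulation and transmission schemes known in the art may be employed.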
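At central communication center 42, the transmitted features are received at antenna 44 and provided to receiver 46. Receiver 46 demodulates and decodes the received features and provides them to word decoder 48. Word decoder 48 determines from the speech features a linguistic estimate of the speech and provides an action signal to transmitter 50. Transmitter 50 amplifies, modulates and codes the action signal and provides the amplified signal to antenna 52, which transmits the estimated words or a command signal to portable phone 40. Transmitter 50 may likewise employ known digital coding, modulation or transmission techniques.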
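At portable phone 40, the estimated words or command signal are received at antenna 28, which provides the received signal through duplexer 26 to receiver 30. Receiver 30 demodulates and decodes the signal and then provides the command signal or estimated words to control unit 38. In response to the received command signal or estimated words, control unit 38 provides the intended response (for example, dialing a phone number or providing information to the display screen of the portable phone).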
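The system shown in Figure 2 can also be used in a slightly different way: the information sent back from central communication center 42 need not be an interpretation of the transmitted speech; it may instead be a response to the decoded message sent by the portable phone. For example, one may inquire about messages on a remote answering machine (not shown) coupled to central communication center 42 via a communication network, in which case the signal transmitted from central communication center 42 to portable telephone 40 may be the messages from the answering machine. A second control unit 49 may be collocated in the central communication center.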
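The significance of placing feature extraction unit 22 in portable phone 40 instead of at central communication center 42 is as follows. If, as opposed to distributed VR, the acoustic processor is placed at central communication center 42, a low-bandwidth digital radio channel may require a vocoder (at the first subsystem), which limits the resolution of the feature vectors because of quantization distortion. By placing the acoustic processor in the portable or cellular phone, however, the entire channel bandwidth can be dedicated to feature transmission. Usually, the extracted acoustic feature vectors require less bandwidth for transmission than the speech signal does. Since recognition accuracy is highly dependent on degradation of the input speech signal, feature extraction unit 22 should be placed as close to the user as possible, so that it extracts feature vectors from microphone speech rather than from vocoded telephone speech, which may additionally be corrupted in transmission.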
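In real applications, voice recognizers are designed to operate under ambient conditions such as background noise. It is therefore important to consider the problem of voice recognition in the presence of noise. It has been shown that if the training of the vocabulary (reference patterns) is performed in exactly (or approximately) the same environment as the test condition, voice recognizers not only provide good performance even in very noisy environments but also substantially reduce the degradation in recognition accuracy caused by noise. Mismatch between training and test conditions accounts for one of the major degradation factors in recognition performance. Since, as noted earlier, acoustic features require less bandwidth for transmission than speech signals, it can be assumed that acoustic features traverse communication channels more reliably than speech signals, and the proposed distributed voice recognition system is therefore advantageous in providing matched conditions. If a voice recognizer is implemented remotely, the matched conditions can be badly broken by channel variations such as the fading encountered in wireless communications. Implementing VR locally can avoid these effects, provided the huge training computations can be absorbed locally; unfortunately, in many applications this is not possible. Evidently, a distributed voice recognition implementation can avoid the mismatch induced by channel complexity and compensate for the shortcomings of centralized implementations.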
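Referring to Figure 3, digital speech samples are provided to feature extraction unit 51, which provides the features over communication channel 56 to word estimation unit 62, where an estimated word string is determined. The speech signal is provided to acoustic processor 52, which determines potential features for each speech frame. Since the word decoder requires an acoustic feature sequence as input for both recognition and training tasks, these acoustic features must be transmitted across communication channel 56. However, not all potential features used in typical voice recognition systems are suitable for transmission through noisy channels. In some cases, transform unit 54 is required to facilitate source encoding and to reduce the effects of channel noise. One example of LPC-based acoustic features widely used in voice recognizers is the cepstrum coefficients $\{c_i\}$. They can be obtained directly from the LPC coefficients $\{a_i\}$ as follows:
$$c_m = a_m + \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k\, a_{m-k}, \quad m = 1, \ldots, P \qquad (5)$$
$$c_m = \sum_{k=1}^{m-1} \left(\frac{k}{m}\right) c_k\, a_{m-k}, \quad m = P+1, \ldots, Q \qquad (6)$$
where P is the order of the LPC filter used and Q is the size of the cepstrum feature vector. Since cepstrum feature vectors change rapidly, it is not easy to compress a sequence of frames of cepstrum coefficients. However, there exists a transformation between LPCs and line spectrum pair (LSP) frequencies, which change slowly and can be encoded efficiently by a differential (delta) pulse code modulation (DPCM) scheme. Since the cepstrum coefficients can be derived directly from the LPC coefficients, transform unit 54 transforms the LPCs into LSPs, which are then encoded for transmission over communication channel 56. At remote word estimation unit 62, inverse transform unit 60 inverts the transformed potential features to provide acoustic features to word decoder 64, which in response provides an estimated word string.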
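The LPC-to-cepstrum recursion of Equations (5) and (6) can be written as a short function. This is a sketch under one stated assumption: coefficients $a_j$ with $j > P$ are treated as zero, which is how the sum in Equation (6) is normally evaluated. The function name and argument layout are illustrative.

```python
def lpc_to_cepstrum(a, Q):
    # a = [a_1, ..., a_P]; returns [c_1, ..., c_Q].
    P = len(a)
    c = [0.0] * (Q + 1)                       # c[0] is unused
    for m in range(1, Q + 1):
        acc = a[m - 1] if m <= P else 0.0     # the a_m term exists only for m <= P
        for k in range(1, m):
            if m - k <= P:                    # assumption: a_j = 0 for j > P
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]
```

A quick sanity check is a one-pole model with $a_1 = 0.5$: the cepstrum of $1/(1 - a_1 z^{-1})$ is $c_m = a_1^m / m$, and the recursion reproduces 0.5, 0.125, 0.0416..., 0.015625.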
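An embodiment of transform unit 54 is shown in Figure 4 as transform subsystem 70. In Figure 4, the LPC coefficients from acoustic processor 52 are provided to LPC-to-LSP transformation unit 72. Within LPC-to-LSP unit 72 the LSP coefficients can be determined as follows. For P-th order LPC coefficients, the corresponding LSP frequencies are obtained as the P roots between 0 and π of the following equations (written here for P = 10):
$$P(\omega) = \cos 5\omega + p_1 \cos 4\omega + \cdots + p_4 \cos\omega + \frac{p_5}{2} \qquad (7)$$
$$Q(\omega) = \cos 5\omega + q_1 \cos 4\omega + \cdots + q_4 \cos\omega + \frac{q_5}{2} \qquad (8)$$
where $p_i$ and $q_i$ can be computed recursively as follows:
$$p_0 = q_0 = 1 \qquad (9)$$
$$p_i = -a_i - a_{P+1-i} - p_{i-1}, \quad 1 \le i \le P/2 \qquad (10)$$
$$q_i = -a_i + a_{P+1-i} + q_{i-1}, \quad 1 \le i \le P/2 \qquad (11)$$
The LSP frequencies are provided to DPCM unit 74, where they are encoded for transmission over communication channel 76.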
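Why slowly varying LSP frequencies suit DPCM can be seen in a small sketch: only the quantized frame-to-frame difference of each frequency is transmitted. The patent names DPCM but gives no predictor or quantizer design, so the first-order predictor (previous reconstructed frame), the uniform step size and the function names below are all assumptions made for illustration.

```python
def dpcm_encode(frames, step):
    # frames: list of LSP vectors, one per speech frame (all the same length).
    # The encoder tracks the decoder's reconstruction so errors do not build up.
    prev = [0.0] * len(frames[0])
    codes = []
    for f in frames:
        q = [round((x - p) / step) for x, p in zip(f, prev)]   # integer codes
        prev = [p + k * step for p, k in zip(prev, q)]
        codes.append(q)
    return codes

def dpcm_decode(codes, step):
    prev = [0.0] * len(codes[0])
    out = []
    for q in codes:
        prev = [p + k * step for p, k in zip(prev, q)]
        out.append(prev)
    return out

# Hypothetical LSP tracks (radians) for three consecutive frames.
lsp_frames = [[0.30, 0.62], [0.31, 0.65], [0.33, 0.66]]
codes = dpcm_encode(lsp_frames, step=0.01)
decoded = dpcm_decode(codes, step=0.01)
```

After the first frame the codes are small integers, which is what makes them cheap to transmit, and each reconstructed value stays within half a quantizer step of the original.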
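At inverse transform unit 78, the signal received from the channel is passed through inverse DPCM unit 80, which recovers the LSP frequencies of the speech signal, and then through LSP-to-LPC unit 82. LSP-to-LPC unit 82 performs the inverse process of LPC-to-LSP unit 72, converting the LSP frequencies back to LPC coefficients for use in deriving the cepstrum coefficients. LSP-to-LPC unit 82 performs the conversion as follows:
$$P(z) = (1 + z^{-1}) \prod_{i=1}^{P/2} \left(1 - 2\cos(\omega_{2i-1})\, z^{-1} + z^{-2}\right) \qquad (12)$$
$$Q(z) = (1 - z^{-1}) \prod_{i=1}^{P/2} \left(1 - 2\cos(\omega_{2i})\, z^{-1} + z^{-2}\right) \qquad (13)$$
$$A(z) = 1 - \sum_{i=1}^{P} a_i z^{-i} = \frac{P(z) + Q(z)}{2} \qquad (14)$$
The LPC coefficients are then provided to LPC-to-cepstrum unit 84, which provides the cepstrum coefficients to word decoder 64 in accordance with equations 5 and 6.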
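The LSP-to-LPC conversion of Equations (12) to (14) amounts to multiplying out second-order factors and averaging the two resulting polynomials. The sketch below does this with a plain polynomial product; the function names are illustrative, and it assumes an even order P with LSP frequencies given in ascending order, so that odd-numbered frequencies belong to P(z) and even-numbered ones to Q(z).

```python
import math

def poly_mul(x, y):
    # Multiply two polynomials in z^-1 given as coefficient lists.
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

def lsp_to_lpc(w):
    # w = [w_1, ..., w_P], ascending, each in (0, pi), P even.
    # Returns [a_1, ..., a_P] with A(z) = 1 - sum_i a_i z^-i.
    P = len(w)
    p = [1.0, 1.0]     # (1 + z^-1), Equation (12)
    q = [1.0, -1.0]    # (1 - z^-1), Equation (13)
    for i in range(0, P, 2):
        p = poly_mul(p, [1.0, -2.0 * math.cos(w[i]), 1.0])       # odd-indexed LSP
        q = poly_mul(q, [1.0, -2.0 * math.cos(w[i + 1]), 1.0])   # even-indexed LSP
    A = [(pc + qc) / 2.0 for pc, qc in zip(p, q)]                # Equation (14)
    return [-A[i] for i in range(1, P + 1)]
```

For a second-order example with $a_1 = 1.0$ and $a_2 = -0.5$, the LSP frequencies are $\arccos(0.75)$ and $\arccos(0.25)$, and `lsp_to_lpc` recovers the original coefficients.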
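Since the word decoder relies solely on the acoustic feature sequence, which can be prone to noise if transmitted directly through the communication channel, the potential acoustic feature sequence is derived and transformed in subsystem 51, as shown in Figure 3, into an alternative representation that facilitates transmission. The acoustic feature sequence used in the word decoder is obtained afterwards through the inverse transformation. Hence, in a distributed implementation of VR, the feature sequence sent over the air (the channel) can differ from the one actually used in the word decoder. It is anticipated that the output of transform unit 70 can be further encoded by any error protection scheme known in the art.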
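Figure 5 shows an improved embodiment of the present invention. In wireless communication applications, users may not wish to occupy the communication channel for a small number of simple but commonly used voice commands, partly because of expensive channel access. This can be achieved by further distributing the word decoding function between handset 100 and base station 110: voice recognition with a relatively small vocabulary is implemented locally at the handset, while a second voice recognition system with a larger vocabulary resides remotely at the base station. Both share the same acoustic processor at the handset. The vocabulary table in the local word decoder contains the most widely used words or word strings. The vocabulary table in the remote word decoder, on the other hand, contains regular words or word strings. With this infrastructure, as shown in Figure 5, the average time the channel is busy can be reduced and the average recognition accuracy increased.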
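Moreover, two groups of voice commands are available: special voiced commands, which correspond to the commands recognizable by the local VR, and regular voiced commands, which correspond to those not recognized by the local VR. Whenever a special voiced command is issued, the actual acoustic features are extracted for the local word decoder and the voice recognition function is performed locally without accessing the communication channel. When a regular voiced command is issued, transformed acoustic feature vectors are transmitted through the channel and word decoding is done remotely at the base station.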
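Since the acoustic features need not be transformed or coded for any special voiced command, and the vocabulary of the local VR is small, the required computation is much less than that of the remote one (the computation associated with searching for the correct word string over the possible vocabulary is proportional to the vocabulary size). Additionally, since the acoustic features are fed directly to the local VR without potential corruption in the channel, the local voice recognizer may be modeled with a simplified version of the HMM (for example, with fewer states or fewer mixture components for the state output probabilities) compared with the remote VR. This makes a local VR implementation, albeit with limited vocabulary, possible at the handset (subsystem 1), where the computational load is limited. It is envisioned that the distributed VR structure can also be used in target applications other than wireless communication systems.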
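Referring to Figure 5, a speech signal is provided to acoustic processor 102, which extracts features, for example LPC-based feature parameters, from the speech signal. These features are then provided to local word decoder 106, which searches its small vocabulary to recognize the input speech signal. If it fails to decode the input word string and determines that the remote VR should decode it, it signals transform unit 104, which prepares the features for transmission. The transformed features are then transmitted over communication channel 108 to remote word decoder 110. The transformed features are received at inverse transform unit 112, which performs the inverse operation of transform unit 104 and provides the acoustic features to remote word decoder unit 114, which in response provides the estimated remote word string.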
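The local-first, remote-fallback flow of Figure 5 can be summarized in a few lines of control logic. Everything below is a hypothetical stand-in: the decoders are toy lookup tables keyed by feature tuples, and the transform simply scales the features, whereas the patent's embodiment uses HMM-style decoders and the LPC-to-LSP/DPCM transform.

```python
def recognize(features, local_decode, transform, remote_decode):
    # Returns (estimated word string, whether the channel was used).
    words = local_decode(features)
    if words is not None:
        return words, False              # special command: handled on the handset
    # Regular command: transform the features and send them over the channel.
    return remote_decode(transform(features)), True

# Toy stand-ins for the components of Figure 5 (all hypothetical).
LOCAL_VOCAB = {(1, 2): "redial"}
REMOTE_VOCAB = {(3, 4): "call the office"}

def local_decode(f):
    return LOCAL_VOCAB.get(tuple(f))     # None signals a local decoding failure

def transform(f):
    return [2 * v for v in f]            # placeholder for transform unit 104

def remote_decode(t):
    f = tuple(v // 2 for v in t)         # placeholder inverse transform (unit 112)
    return REMOTE_VOCAB.get(f, "<unknown>")
```

Under these assumptions, `recognize([1, 2], ...)` is answered locally without touching the channel, while `recognize([3, 4], ...)` falls through to the remote decoder.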
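The previous description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.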

Claims (3)

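1. A voice recognition system, characterized by comprising:
feature extraction means, located at a remote station, for receiving a frame of speech samples, extracting a set of speech features from said frame of speech samples in accordance with a predetermined feature extraction format, and providing said set of speech features; and
word decoder means, located at a central processing station, for receiving said set of speech features and determining a syntax in accordance with a predetermined decoding format.
2. The system of claim 1, characterized in that said set of features is linear predictive coding parameters.
3. The system of claim 1, characterized in that the system further comprises a local word detector, collocated at said remote station, for determining a syntax in accordance with a predetermined small-vocabulary decoding format.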
CN94194566A 1993-12-22 1994-12-20 Distributed voice recognition system Expired - Lifetime CN1119794C (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17324793A 1993-12-22 1993-12-22
US08/173,247 1993-12-22

Publications (2)

Publication Number Publication Date
CN1138386A (zh) 1996-12-18
CN1119794C (zh) 2003-08-27

Family

ID=22631169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN94194566A Expired - Lifetime CN1119794C (zh) 1993-12-22 1994-12-20 Distributed voice recognition system

Country Status (17)

Country Link
US (1) US5956683A (zh)
EP (3) EP0736211B1 (zh)
JP (1) JP3661874B2 (zh)
KR (1) KR100316077B1 (zh)
CN (1) CN1119794C (zh)
AT (1) ATE261172T1 (zh)
AU (1) AU692820B2 (zh)
BR (1) BR9408413A (zh)
CA (1) CA2179759C (zh)
DE (1) DE69433593T2 (zh)
FI (2) FI118909B (zh)
HK (1) HK1011109A1 (zh)
IL (1) IL112057A0 (zh)
MY (1) MY116482A (zh)
TW (1) TW318239B (zh)
WO (1) WO1995017746A1 (zh)
ZA (1) ZA948426B (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1327405C (zh) * 2001-02-02 2007-07-18 摩托罗拉公司 分布式语音识别系统中语音识别的方法和设备
CN103531197A (zh) * 2013-10-11 2014-01-22 安徽科大讯飞信息科技股份有限公司 一种对用户语音识别结果反馈的命令词识别自适应优化方法
CN110970031A (zh) * 2019-12-16 2020-04-07 苏州思必驰信息科技有限公司 语音识别系统及方法

Families Citing this family (280)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6594628B1 (en) * 1995-09-21 2003-07-15 Qualcomm, Incorporated Distributed voice recognition system
US9063719B1 (en) * 1995-10-02 2015-06-23 People Innovate For Economy Foundation, Inc. Table format programming
US5774858A (en) * 1995-10-23 1998-06-30 Taubkin; Vladimir L. Speech analysis method of protecting a vehicle from unauthorized accessing and controlling
US8209184B1 (en) 1997-04-14 2012-06-26 At&T Intellectual Property Ii, L.P. System and method of providing generated speech via a network
FI972723A0 (fi) * 1997-06-24 1997-06-24 Nokia Mobile Phones Ltd Mobila kommunikationsanordningar
CA2219008C (en) * 1997-10-21 2002-11-19 Bell Canada A method and apparatus for improving the utility of speech recognition
JP3055514B2 (ja) * 1997-12-05 2000-06-26 日本電気株式会社 電話回線用音声認識装置
US6208959B1 (en) 1997-12-15 2001-03-27 Telefonaktibolaget Lm Ericsson (Publ) Mapping of digital data symbols onto one or more formant frequencies for transmission over a coded voice channel
US6003004A (en) 1998-01-08 1999-12-14 Advanced Recognition Technologies, Inc. Speech recognition method and system using compressed speech data
US6614885B2 (en) * 1998-08-14 2003-09-02 Intervoice Limited Partnership System and method for operating a highly distributed interactive voice response system
US6260010B1 (en) * 1998-08-24 2001-07-10 Conexant Systems, Inc. Speech encoder using gain normalization that combines open and closed loop gains
US7003463B1 (en) * 1998-10-02 2006-02-21 International Business Machines Corporation System and method for providing network coordinated conversational services
US6389389B1 (en) 1998-10-13 2002-05-14 Motorola, Inc. Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters
GB2342828A (en) * 1998-10-13 2000-04-19 Nokia Mobile Phones Ltd Speech parameter compression; distributed speech recognition
US6185535B1 (en) 1998-10-16 2001-02-06 Telefonaktiebolaget Lm Ericsson (Publ) Voice control of a user interface to service applications
GB2343778B (en) * 1998-11-13 2003-03-05 Motorola Ltd Processing received data in a distributed speech recognition process
GB2343777B (en) * 1998-11-13 2003-07-02 Motorola Ltd Mitigating errors in a distributed speech recognition process
US6490621B1 (en) * 1998-11-20 2002-12-03 Orchestream Americas Corporation Calculation of resource availability using degradation factors
US6336090B1 (en) 1998-11-30 2002-01-01 Lucent Technologies Inc. Automatic speech/speaker recognition over digital wireless channels
KR100667522B1 (ko) * 1998-12-18 2007-05-17 주식회사 현대오토넷 Lpc 계수를 이용한 이동통신 단말기 음성인식 방법
US6411926B1 (en) 1999-02-08 2002-06-25 Qualcomm Incorporated Distributed voice recognition system
WO2000049601A1 (en) * 1999-02-19 2000-08-24 Custom Speech Usa, Inc. Automated transcription system and method using two speech converting instances and computer-assisted correction
DE19910236A1 (de) * 1999-03-09 2000-09-21 Philips Corp Intellectual Pty Verfahren zur Spracherkennung
DE60015531T2 (de) * 1999-03-26 2005-03-24 Scansoft, Inc., Peabody Client-server spracherkennungssystem
WO2000058946A1 (en) * 1999-03-26 2000-10-05 Koninklijke Philips Electronics N.V. Client-server speech recognition
US20050261907A1 (en) 1999-04-12 2005-11-24 Ben Franklin Patent Holding Llc Voice integration platform
US20050091057A1 (en) * 1999-04-12 2005-04-28 General Magic, Inc. Voice application development methodology
US6408272B1 (en) * 1999-04-12 2002-06-18 General Magic, Inc. Distributed voice user interface
US6290646B1 (en) 1999-04-16 2001-09-18 Cardiocom Apparatus and method for monitoring and communicating wellness parameters of ambulatory patients
US8419650B2 (en) 1999-04-16 2013-04-16 Cariocom, LLC Downloadable datasets for a patient monitoring system
US6292781B1 (en) * 1999-05-28 2001-09-18 Motorola Method and apparatus for facilitating distributed speech processing in a communication system
US6363349B1 (en) * 1999-05-28 2002-03-26 Motorola, Inc. Method and apparatus for performing distributed speech processing in a communication system
DE19930407A1 (de) * 1999-06-09 2000-12-14 Philips Corp Intellectual Pty Verfahren zur sprachbasierten Navigation in einem Kommunikationsnetzwerk und zur Implementierung einer Spracheingabemöglichkeit in private Informationseinheiten
KR20010019786A (ko) * 1999-08-30 2001-03-15 윤종용 이동통신 시스템에서 음성인식 및 문자표시 장치 및 방법
JP3969908B2 (ja) 1999-09-14 2007-09-05 キヤノン株式会社 音声入力端末器、音声認識装置、音声通信システム及び音声通信方法
US7194752B1 (en) 1999-10-19 2007-03-20 Iceberg Industries, Llc Method and apparatus for automatically recognizing input audio and/or video streams
US7689416B1 (en) * 1999-09-29 2010-03-30 Poirier Darrell A System for transferring personalize matter from one computer to another
US6963759B1 (en) * 1999-10-05 2005-11-08 Fastmobile, Inc. Speech recognition technique based on local interrupt detection
US6912496B1 (en) * 1999-10-26 2005-06-28 Silicon Automation Systems Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics
FI19992350A (fi) 1999-10-29 2001-04-30 Nokia Mobile Phones Ltd Parannettu puheentunnistus
EP1098297A1 (en) * 1999-11-02 2001-05-09 BRITISH TELECOMMUNICATIONS public limited company Speech recognition
US6725190B1 (en) * 1999-11-02 2004-04-20 International Business Machines Corporation Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope
US6665640B1 (en) 1999-11-12 2003-12-16 Phoenix Solutions, Inc. Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries
US9076448B2 (en) 1999-11-12 2015-07-07 Nuance Communications, Inc. Distributed real time speech recognition system
US7050977B1 (en) 1999-11-12 2006-05-23 Phoenix Solutions, Inc. Speech-enabled server for internet website and method
US7392185B2 (en) 1999-11-12 2008-06-24 Phoenix Solutions, Inc. Speech based learning/training system using semantic decoding
US7725307B2 (en) 1999-11-12 2010-05-25 Phoenix Solutions, Inc. Query engine for processing voice based queries including semantic decoding
US6615172B1 (en) 1999-11-12 2003-09-02 Phoenix Solutions, Inc. Intelligent query engine for processing voice based queries
US6633846B1 (en) 1999-11-12 2003-10-14 Phoenix Solutions, Inc. Distributed realtime speech recognition system
US6675027B1 (en) * 1999-11-22 2004-01-06 Microsoft Corp Personal mobile computing device having antenna microphone for improved speech recognition
WO2001039177A2 (en) * 1999-11-22 2001-05-31 Microsoft Corporation Distributed speech recognition for mobile communication devices
US6532446B1 (en) 1999-11-24 2003-03-11 Openwave Systems Inc. Server based speech recognition user interface for wireless devices
US6424945B1 (en) * 1999-12-15 2002-07-23 Nokia Corporation Voice packet data network browsing for mobile terminals system and method using a dual-mode wireless connection
DE10003529A1 (de) * 2000-01-27 2001-08-16 Siemens Ag Verfahren und Vorrichtung zum Erstellen einer Textdatei mittels Spracherkennung
US7505921B1 (en) 2000-03-03 2009-03-17 Finali Corporation System and method for optimizing a product configuration
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
CN1315721A (zh) * 2000-03-23 2001-10-03 韦尔博泰克有限公司 客户服务器语音信息传送系统与方法
US6760699B1 (en) * 2000-04-24 2004-07-06 Lucent Technologies Inc. Soft feature decoding in a distributed automatic speech recognition system for use over wireless channels
US6502070B1 (en) * 2000-04-28 2002-12-31 Nortel Networks Limited Method and apparatus for normalizing channel specific speech feature elements
US6785653B1 (en) * 2000-05-01 2004-08-31 Nuance Communications Distributed voice web architecture and associated components and methods
JP3728177B2 (ja) 2000-05-24 2005-12-21 キヤノン株式会社 音声処理システム、装置、方法及び記憶媒体
ATE358316T1 (de) * 2000-06-08 2007-04-15 Nokia Corp Verfahren und system für adaptive verteilte spracherkennung
EP1304682A1 (en) * 2000-07-05 2003-04-23 Alcatel Distributed speech recognition system
CN1404603A (zh) * 2000-09-07 2003-03-19 皇家菲利浦电子有限公司 语音控制及加载的用户控制信息
US6823306B2 (en) 2000-11-30 2004-11-23 Telesector Resources Group, Inc. Methods and apparatus for generating, updating and distributing speech recognition models
US6915262B2 (en) 2000-11-30 2005-07-05 Telesector Resources Group, Inc. Methods and apparatus for performing speech recognition and using speech recognition results
US8135589B1 (en) 2000-11-30 2012-03-13 Google Inc. Performing speech recognition over a network and using speech recognition results
EP1215659A1 (en) * 2000-12-14 2002-06-19 Nokia Corporation Locally distibuted speech recognition system and method of its operation
US20020091515A1 (en) * 2001-01-05 2002-07-11 Harinath Garudadri System and method for voice recognition in a distributed voice recognition system
US20030004720A1 (en) * 2001-01-30 2003-01-02 Harinath Garudadri System and method for computing and transmitting parameters in a distributed voice recognition system
US7024359B2 (en) * 2001-01-31 2006-04-04 Qualcomm Incorporated Distributed voice recognition system using acoustic feature vector modification
FR2820872B1 (fr) * 2001-02-13 2003-05-16 Thomson Multimedia Sa Procede, module, dispositif et serveur de reconnaissance vocale
US7406421B2 (en) * 2001-10-26 2008-07-29 Intellisist Inc. Systems and methods for reviewing informational content in a vehicle
US8175886B2 (en) 2001-03-29 2012-05-08 Intellisist, Inc. Determination of signal-processing approach based on signal destination characteristics
US6487494B2 (en) * 2001-03-29 2002-11-26 Wingcast, Llc System and method for reducing the amount of repetitive data sent by a server to a client for vehicle navigation
USRE46109E1 (en) 2001-03-29 2016-08-16 Lg Electronics Inc. Vehicle navigation system and method
US20020143611A1 (en) * 2001-03-29 2002-10-03 Gilad Odinak Vehicle parking validation system and method
US6885735B2 (en) * 2001-03-29 2005-04-26 Intellisist, Llc System and method for transmitting voice input from a remote location over a wireless data channel
US7941313B2 (en) * 2001-05-17 2011-05-10 Qualcomm Incorporated System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system
US7203643B2 (en) * 2001-06-14 2007-04-10 Qualcomm Incorporated Method and apparatus for transmitting speech activity in distributed voice recognition systems
US7366673B2 (en) * 2001-06-15 2008-04-29 International Business Machines Corporation Selective enablement of speech recognition grammars
US20020198716A1 (en) * 2001-06-25 2002-12-26 Kurt Zimmerman System and method of improved communication
KR100777551B1 (ko) * 2001-06-29 2007-11-16 주식회사 케이티 채널용량에 따른 가변 구성이 가능한 음성인식 시스템 및그 방법
DE10228408B4 (de) 2001-07-10 2021-09-30 Sew-Eurodrive Gmbh & Co Kg Bussystem, umfassend mindestens einen Bus und Busteilnehmer und Verfahren zur Sprachsteuerung
ATE310302T1 (de) * 2001-09-28 2005-12-15 Cit Alcatel Kommunikationsvorrichtung und verfahren zum senden und empfangen von sprachsignalen unter kombination eines spracherkennungsmodules mit einer kodiereinheit
US7139704B2 (en) * 2001-11-30 2006-11-21 Intel Corporation Method and apparatus to perform speech recognition over a voice channel
GB2383459B (en) * 2001-12-20 2005-05-18 Hewlett Packard Co Speech recognition system and method
US7013275B2 (en) * 2001-12-28 2006-03-14 Sri International Method and apparatus for providing a dynamic speech-driven control and remote service access system
US6898567B2 (en) * 2001-12-29 2005-05-24 Motorola, Inc. Method and apparatus for multi-level distributed speech recognition
US20030154080A1 (en) * 2002-02-14 2003-08-14 Godsey Sandra L. Method and apparatus for modification of audio input to a data processing system
US8249880B2 (en) * 2002-02-14 2012-08-21 Intellisist, Inc. Real-time display of system instructions
US7099825B1 (en) 2002-03-15 2006-08-29 Sprint Communications Company L.P. User mobility in a voice recognition environment
US7089178B2 (en) * 2002-04-30 2006-08-08 Qualcomm Inc. Multistream network feature processing for a distributed speech recognition system
AU2003248523A1 (en) 2002-05-16 2003-12-02 Intellisist, Llc System and method for dynamically configuring wireless network geographic coverage or service levels
US20030233233A1 (en) * 2002-06-13 2003-12-18 Industrial Technology Research Institute Speech recognition involving a neural network
US6834265B2 (en) 2002-12-13 2004-12-21 Motorola, Inc. Method and apparatus for selective speech recognition
US7197331B2 (en) * 2002-12-30 2007-03-27 Motorola, Inc. Method and apparatus for selective distributed speech recognition
US7076428B2 (en) * 2002-12-30 2006-07-11 Motorola, Inc. Method and apparatus for selective distributed speech recognition
KR100956941B1 (ko) * 2003-06-27 2010-05-11 주식회사 케이티 네트워크 상황에 따른 선택적 음성인식 장치 및 그 방법
CA2537977A1 (en) * 2003-09-05 2005-03-17 Stephen D. Grody Methods and apparatus for providing services using speech recognition
US7283850B2 (en) 2004-10-12 2007-10-16 Microsoft Corporation Method and apparatus for multi-sensory speech enhancement on a mobile device
US8024194B2 (en) * 2004-12-08 2011-09-20 Nuance Communications, Inc. Dynamic switching between local and remote speech rendering
US7680656B2 (en) 2005-06-28 2010-03-16 Microsoft Corporation Multi-sensory speech enhancement using a speech-state model
US7406303B2 (en) 2005-07-05 2008-07-29 Microsoft Corporation Multi-sensory speech enhancement using synthesized sensor signal
US8677377B2 (en) 2005-09-08 2014-03-18 Apple Inc. Method and apparatus for building an intelligent automated assistant
US7970613B2 (en) 2005-11-12 2011-06-28 Sony Computer Entertainment Inc. Method and system for Gaussian probability data bit reduction and computation
US7930178B2 (en) 2005-12-23 2011-04-19 Microsoft Corporation Speech modeling and enhancement based on magnitude-normalized spectra
US20070162282A1 (en) * 2006-01-09 2007-07-12 Gilad Odinak System and method for performing distributed speech recognition
DE102006002604A1 (de) * 2006-01-13 2007-07-19 Deutsche Telekom Ag Verfahren und System zur Durchführung einer Datentelekommunikation
US7778831B2 (en) * 2006-02-21 2010-08-17 Sony Computer Entertainment Inc. Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch
US8010358B2 (en) * 2006-02-21 2011-08-30 Sony Computer Entertainment Inc. Voice recognition with parallel gender and age normalization
US7599861B2 (en) 2006-03-02 2009-10-06 Convergys Customer Management Group, Inc. System and method for closed loop decisionmaking in an automated care system
US8644396B2 (en) 2006-04-18 2014-02-04 Qualcomm Incorporated Waveform encoding for wireless applications
US8406794B2 (en) 2006-04-26 2013-03-26 Qualcomm Incorporated Methods and apparatuses of initiating communication in wireless networks
KR101124785B1 (ko) * 2006-04-26 2012-03-23 콸콤 인코포레이티드 디바이스 기능의 동적 분배 및 자원 관리
US8289159B2 (en) 2006-04-26 2012-10-16 Qualcomm Incorporated Wireless localization apparatus and method
US7809663B1 (en) 2006-05-22 2010-10-05 Convergys Cmg Utah, Inc. System and method for supporting the utilization of machine language
US8379830B1 (en) 2006-05-22 2013-02-19 Convergys Customer Management Delaware Llc System and method for automated customer service with contingent live interaction
KR100794140B1 (ko) 2006-06-30 2008-01-10 주식회사 케이티 분산 음성 인식 단말기에서 음성 부호화기의 전처리를공유해 잡음에 견고한 음성 특징 벡터를 추출하는 장치 및그 방법
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
WO2008064137A2 (en) * 2006-11-17 2008-05-29 Rao Ashwin P Predictive speech-to-text input
US7904298B2 (en) * 2006-11-17 2011-03-08 Rao Ashwin P Predictive speech-to-text input
JP4658022B2 (ja) * 2006-11-20 2011-03-23 株式会社リコー 音声認識システム
US9830912B2 (en) 2006-11-30 2017-11-28 Ashwin P Rao Speak and touch auto correction interface
US20080154870A1 (en) * 2006-12-26 2008-06-26 Voice Signal Technologies, Inc. Collection and use of side information in voice-mediated mobile search
US20080153465A1 (en) * 2006-12-26 2008-06-26 Voice Signal Technologies, Inc. Voice search-enabled mobile device
US20080154612A1 (en) * 2006-12-26 2008-06-26 Voice Signal Technologies, Inc. Local storage and use of search results for voice-enabled mobile communications devices
US20080154608A1 (en) * 2006-12-26 2008-06-26 Voice Signal Technologies, Inc. On a mobile device tracking use of search results delivered to the mobile device
EP1976255B1 (en) 2007-03-29 2015-03-18 Intellisist, Inc. Call center with distributed speech recognition
US8977255B2 (en) 2007-04-03 2015-03-10 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
JP5139747B2 (ja) * 2007-08-17 2013-02-06 株式会社ユニバーサルエンターテインメント 電話端末装置及びこれを用いた音声認識システム
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US9020816B2 (en) * 2008-08-14 2015-04-28 21Ct, Inc. Hidden markov model for speech processing with training method
US7933777B2 (en) * 2008-08-29 2011-04-26 Multimodal Technologies, Inc. Hybrid speech recognition
US9922640B2 (en) 2008-10-17 2018-03-20 Ashwin P Rao System and method for multimodal utterance detection
US9390167B2 (en) 2010-07-29 2016-07-12 Soundhound, Inc. System and methods for continuous audio matching
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US8442833B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Speech processing with source location estimation using signals from two or more microphones
US8788256B2 (en) 2009-02-17 2014-07-22 Sony Computer Entertainment Inc. Multiple language voice recognition
US8442829B2 (en) * 2009-02-17 2013-05-14 Sony Computer Entertainment Inc. Automatic computation streaming partition for voice recognition on multiple processors with limited memory
US10255566B2 (en) 2011-06-03 2019-04-09 Apple Inc. Generating and processing task items that represent tasks to perform
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9431006B2 (en) 2009-07-02 2016-08-30 Apple Inc. Methods and apparatuses for automatic speech recognition
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
DE202011111062U1 (de) 2010-01-25 2019-02-19 Newvaluexchange Ltd. Apparatus and system for a digital conversation management platform
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
TWI420510B (zh) * 2010-05-28 2013-12-21 Ind Tech Res Inst Speech recognition system and method with adjustable memory usage
US9047371B2 (en) 2010-07-29 2015-06-02 Soundhound, Inc. System and method for matching a query against a broadcast stream
US9484018B2 (en) * 2010-11-23 2016-11-01 At&T Intellectual Property I, L.P. System and method for building and evaluating automatic speech recognition via an application programmer interface
US10049669B2 (en) 2011-01-07 2018-08-14 Nuance Communications, Inc. Configurable speech recognition system using multiple recognizers
US9674328B2 (en) * 2011-02-22 2017-06-06 Speak With Me, Inc. Hybridized client-server speech recognition
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9035163B1 (en) 2011-05-10 2015-05-19 Soundhound, Inc. System and method for targeting content based on identified audio and multimedia
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US8946812B2 (en) 2011-07-21 2015-02-03 Semiconductor Energy Laboratory Co., Ltd. Semiconductor device and manufacturing method thereof
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US8972263B2 (en) 2011-11-18 2015-03-03 Soundhound, Inc. System and method for performing dual mode speech recognition
US9043821B2 (en) 2012-02-07 2015-05-26 Turner Broadcasting System, Inc. Method and system for linking content on a connected television screen with a browser
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9153235B2 (en) 2012-04-09 2015-10-06 Sony Computer Entertainment Inc. Text dependent speaker recognition with long-term feature based on functional data analysis
US9685160B2 (en) * 2012-04-16 2017-06-20 Htc Corporation Method for offering suggestion during conversation, electronic device using the same, and non-transitory storage medium
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
US10354650B2 (en) 2012-06-26 2019-07-16 Google Llc Recognizing speech with mixed speech recognition models to generate transcriptions
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US10957310B1 (en) 2012-07-23 2021-03-23 Soundhound, Inc. Integrated programming framework for speech and text understanding with meaning parsing
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
WO2014055076A1 (en) 2012-10-04 2014-04-10 Nuance Communications, Inc. Improved hybrid controller for asr
US9691377B2 (en) 2013-07-23 2017-06-27 Google Technology Holdings LLC Method and device for voice recognition training
US9570076B2 (en) 2012-10-30 2017-02-14 Google Technology Holdings LLC Method and system for voice recognition employing multiple voice-recognition techniques
US9395234B2 (en) 2012-12-05 2016-07-19 Cardiocom, Llc Stabilizing base for scale
US9154841B2 (en) 2012-12-28 2015-10-06 Turner Broadcasting System, Inc. Method and system for detecting and resolving conflicts in an automatic content recognition based system
US9275638B2 (en) 2013-03-12 2016-03-01 Google Technology Holdings LLC Method and apparatus for training a voice recognition model database
US9542947B2 (en) 2013-03-12 2017-01-10 Google Technology Holdings LLC Method and apparatus including parallel processes for voice recognition
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
AU2014233517B2 (en) 2013-03-15 2017-05-25 Apple Inc. Training an at least partial voice command system
US9058805B2 (en) 2013-05-13 2015-06-16 Google Inc. Multiple recognizer speech recognition
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197335A1 (en) 2013-06-08 2014-12-11 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
EP3937002A1 (en) 2013-06-09 2022-01-12 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
AU2014278595B2 (en) 2013-06-13 2017-04-06 Apple Inc. System and method for emergency calls initiated by voice command
US9548047B2 (en) 2013-07-31 2017-01-17 Google Technology Holdings LLC Method and apparatus for evaluating trigger phrase enrollment
US9507849B2 (en) 2013-11-28 2016-11-29 Soundhound, Inc. Method for combining a query and a communication command in a natural language computer system
US9292488B2 (en) 2014-02-01 2016-03-22 Soundhound, Inc. Method for embedding voice mail in a spoken utterance using a natural language processing computer system
EP3107447B1 (en) * 2014-02-19 2022-12-07 Institut National De La Recherche Scientifique (INRS) Method and system for evaluating a noise level of a biosignal
US11295730B1 (en) 2014-02-27 2022-04-05 Soundhound, Inc. Using phonetic variants in a local context to improve natural language understanding
CN103915092B (zh) * 2014-04-01 2019-01-25 Baidu Online Network Technology (Beijing) Co., Ltd. Speech recognition method and device
US9564123B1 (en) 2014-05-12 2017-02-07 Soundhound, Inc. Method and system for building an integrated user profile
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
AU2015266863B2 (en) 2014-05-30 2018-03-15 Apple Inc. Multi-command single utterance input method
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
WO2015199653A1 (en) * 2014-06-24 2015-12-30 Nuance Communications, Inc. Methods and apparatus for joint stochastic and deterministic dictation formatting
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US9715498B2 (en) 2015-08-31 2017-07-25 Microsoft Technology Licensing, Llc Distributed server system for language understanding
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
EP3375201B1 (en) 2015-11-12 2020-06-24 Telefonaktiebolaget LM Ericsson (publ) Server, wireless device, methods and computer programs
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
DK179588B1 (en) 2016-06-09 2019-02-22 Apple Inc. Intelligent automated assistant in a home environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc. Intelligent task discovery
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc. Application integration with a digital assistant
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc. Data driven natural language event detection and classification
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc. Intelligent device arbitration and control
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10701438B2 (en) 2016-12-31 2020-06-30 Turner Broadcasting System, Inc. Automatic content recognition and verification in a broadcast chain
US10971157B2 (en) 2017-01-11 2021-04-06 Nuance Communications, Inc. Methods and apparatus for hybrid speech recognition processing
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. Synchronization and task delegation of a digital assistant
DK201770431A1 (en) 2017-05-15 2018-12-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10410635B2 (en) 2017-06-09 2019-09-10 Soundhound, Inc. Dual mode speech recognition
US10756795B2 (en) 2018-12-18 2020-08-25 XCOM Labs, Inc. User equipment with cellular link and peer-to-peer link
US11063645B2 (en) 2018-12-18 2021-07-13 XCOM Labs, Inc. Methods of wirelessly communicating with a group of devices
US11330649B2 (en) 2019-01-25 2022-05-10 XCOM Labs, Inc. Methods and systems of multi-link peer-to-peer communications
US10756767B1 (en) 2019-02-05 2020-08-25 XCOM Labs, Inc. User equipment for wirelessly communicating cellular signal with another user equipment
US11398238B2 (en) * 2019-06-07 2022-07-26 Lg Electronics Inc. Speech recognition method in edge computing device
US20210104233A1 (en) * 2019-10-03 2021-04-08 Ez-Ai Corp. Interactive voice feedback system and method thereof
US11586964B2 (en) * 2020-01-30 2023-02-21 Dell Products L.P. Device component management using deep learning techniques

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US108354A (en) * 1870-10-18 Improvement in combined eaves-troughs and lightning-rods
US4567606A (en) 1982-11-03 1986-01-28 International Telephone And Telegraph Corporation Data processing apparatus and method for use in speech recognition
FR2571191B1 (fr) * 1984-10-02 1986-12-26 Renault Radiotelephone system, in particular for a motor vehicle
JPS61105671A (ja) * 1984-10-29 1986-05-23 Hitachi Ltd Natural language processing device
US4991217A (en) * 1984-11-30 1991-02-05 Ibm Corporation Dual processor speech recognition system with dedicated data acquisition bus
DE3519915A1 (de) * 1985-06-04 1986-12-04 Telefonbau Und Normalzeit Gmbh, 6000 Frankfurt Method for speech recognition at terminals of telecommunication systems, in particular telephone systems
JPS6269297A (ja) * 1985-09-24 1987-03-30 NEC Corporation Speaker verification terminal
US5231670A (en) * 1987-06-01 1993-07-27 Kurzweil Applied Intelligence, Inc. Voice controlled system and method for generating text from a voice controlled input
US5321840A (en) * 1988-05-05 1994-06-14 Transaction Technology, Inc. Distributed-intelligence computer system including remotely reconfigurable, telephone-type user terminal
US5040212A (en) * 1988-06-30 1991-08-13 Motorola, Inc. Methods and apparatus for programming devices to recognize voice commands
US5325524A (en) * 1989-04-06 1994-06-28 Digital Equipment Corporation Locating mobile objects in a distributed computer system
US5012518A (en) * 1989-07-26 1991-04-30 Itt Corporation Low-bit-rate speech coder using LPC data reduction processing
US5146538A (en) * 1989-08-31 1992-09-08 Motorola, Inc. Communication system and method with voice steering
US5045082A (en) * 1990-01-10 1991-09-03 Alza Corporation Long-term delivery device including loading dose
US5280585A (en) * 1990-09-28 1994-01-18 Hewlett-Packard Company Device sharing system using PCL macros
WO1993001664A1 (en) * 1991-07-08 1993-01-21 Motorola, Inc. Remote voice control system
DE4126882A1 (de) * 1991-08-14 1993-02-18 Philips Patentverwaltung Arrangement for speech transmission
US5305420A (en) * 1991-09-25 1994-04-19 Nippon Hoso Kyokai Method and apparatus for hearing assistance with speech speed control function

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
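CN1327405C (zh) * 2001-02-02 2007-07-18 Motorola, Inc. Method and apparatus for speech recognition in a distributed speech recognition system
CN103531197A (zh) * 2013-10-11 2014-01-22 Anhui USTC iFlytek Information Technology Co., Ltd. Adaptive optimization method for command word recognition based on feedback on user speech recognition results
CN110970031A (zh) * 2019-12-16 2020-04-07 Suzhou AISpeech Information Technology Co., Ltd. Speech recognition system and method
CN110970031B (zh) * 2019-12-16 2022-06-24 AISpeech Co., Ltd. Speech recognition system and method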

Also Published As

Publication number Publication date
FI962572A (fi) 1996-08-20
JPH09507105A (ja) 1997-07-15
FI20070933A (fi) 2007-12-03
US5956683A (en) 1999-09-21
KR100316077B1 (ko) 2002-02-28
CA2179759A1 (en) 1995-06-29
WO1995017746A1 (en) 1995-06-29
EP0736211B1 (en) 2004-03-03
MY116482A (en) 2004-02-28
AU1375395A (en) 1995-07-10
FI118909B (fi) 2008-04-30
FI962572A0 (fi) 1996-06-20
IL112057A0 (en) 1995-03-15
DE69433593D1 (de) 2004-04-08
CN1119794C (zh) 2003-08-27
TW318239B (zh) 1997-10-21
ATE261172T1 (de) 2004-03-15
AU692820B2 (en) 1998-06-18
JP3661874B2 (ja) 2005-06-22
EP1381029A1 (en) 2004-01-14
EP0736211A1 (en) 1996-10-09
CA2179759C (en) 2005-11-15
DE69433593T2 (de) 2005-02-03
HK1011109A1 (en) 1999-07-02
BR9408413A (pt) 1997-08-05
ZA948426B (en) 1995-06-30
EP1942487A1 (en) 2008-07-09

Similar Documents

Publication Publication Date Title
CN1119794C (zh) Distributed voice recognition system
US6594628B1 (en) Distributed voice recognition system
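CN1327405C (zh) Method and apparatus for speech recognition in a distributed speech recognition system
CN1168070C (zh) Distributed speech recognition system
KR100923896B1 (ko) Method and apparatus for transmitting voice activity in a distributed speech recognition system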
Digalakis et al. Quantization of cepstral parameters for speech recognition over the world wide web
CN1306472C (zh) System and method for transmitting speech activity in a distributed speech recognition system
CN1121681C (zh) Language processing
CA2610269A1 (en) Method of adapting a neural network of an automatic speech recognition device
CN100527224C (zh) System and method for efficient storage of speech recognition models
US11763801B2 (en) Method and system for outputting target audio, readable storage medium, and electronic device
Touazi et al. An experimental framework for Arabic digits speech recognition in noisy environments
Rabiner et al. Historical Perspective of the Field of ASR/NLU
JP3531342B2 (ja) Speech processing device and speech processing method
CN111199747A (zh) Artificial intelligence communication system and communication method
Sakka et al. Using geometric spectral subtraction approach for feature extraction for DSR front-end Arabic system
KR100701253B1 (ko) Speech coding method and apparatus for server-based speech recognition in a mobile communication environment
Shanthamallappa et al. Robust Automatic Speech Recognition Using Wavelet-Based Adaptive Wavelet Thresholding: A Review
Hokking et al. A hybrid of fractal code descriptor and harmonic pattern generator for improving speech recognition of different sampling rates
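CN117041430A (zh) Method and device for improving outbound call quality and robustness of an intelligent coordinated outbound call system
CN117351943A (zh) Audio processing method, apparatus, device and storage medium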
Gerazov et al. Overview of Feature Selection for Automatic Speech Recognition
Gazi et al. Wavelet-based, speaker-independent isolated Hindi digit recognition
Mwangi Speaker independent isolated word recognition
Christiansen Word recognition in continuous speech using linear prediction analysis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: San Diego, California, USA

Patentee after: Qualcomm Inc.

Address before: San Diego, California, USA

Patentee before: Qualcomm Inc.

C17 Cessation of patent right
CX01 Expiry of patent term

Expiration termination date: 20141220

Granted publication date: 20030827