CN1262762A - Computer operating system with speech recognition function - Google Patents

Computer operating system with speech recognition function

Info

Publication number
CN1262762A
Authority
CN
China
Prior art keywords
application program
voice command
application
operating system
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN99800423A
Other languages
English (en)
Other versions
CN1109328C (zh)
Inventor
Sreeram Balakrishnan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Motorola Solutions Inc
Original Assignee
Motorola Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Motorola Inc
Publication of CN1262762A
Application granted
Publication of CN1109328C
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/226 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics
    • G10L2015/228 Procedures used during a speech recognition process, e.g. man-machine dialogue using non-speech characteristics of application context

Abstract

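A computer operating system (30) having a speech recognition tool (36) and first and second application programs (32 and 34). The operating system (30) receives, from the first and second application programs, information relating to control words for controlling those application programs. This information is illustrated as vocabulary A (74) and vocabulary B (76). The speech recognition tool (36) is arranged to accept the information from the first and second application programs and to compare it with digitized speech received at an input (60). A selected one of the application programs (32 and 34) is controlled according to the digitized speech received.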

Description

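Computer operating system with speech recognition function
The present invention relates to a computer operating system with speech recognition.
With the proliferation of computers and computing devices throughout society, more and more attention is turning to data entry methods that replace the traditional keyboard. Many computer programs perform speech recognition. Most of them are "applications", that is, programs whose functionality and operation are tailored to a dedicated purpose. Dictation is a well-known example: a dictation application accepts a user's speech as input and, in response, causes text to be entered into a document in a manner similar to a word processor. Another example of a speech recognition application is a control program for an item of equipment, for example one that dials telephone numbers on a hands-free radiotelephone. In such an application, the user speaks the digits to be dialed and then speaks, for example, the command "send", causing the cellular radiotelephone to dial the spoken number. These are examples of dedicated speech recognition applications.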
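The article "Augmenting a Window System with Speech Input" by C. Schmandt, M. S. Ackerman and D. Hindus, in Computer, Vol. 23, No. 8, pages 50-60, August 1990, describes a speech recognition application for controlling window navigation tasks. The application, named "X Speak", is a speech interface to the X Window System in which words are associated with windows. Speaking a window's name moves that window to the front of the computer screen and moves the cursor into it. The X Speak application takes over some of the functions normally assigned to the mouse. Various commands are described, for example "create" to start an application, "recall" to reposition a window at the top of the window stack, and "hide" to reposition a window at the bottom of the window stack. There are also commands for resizing and repositioning windows, and so on. The authors of the article acknowledge that any ergonomic efficiency gained from using the window navigation application is limited or non-existent.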
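A disadvantage of existing speech recognition applications is their lack of flexibility. In a typical speech recognition application, a vocabulary of recognizable words is associated with the application. The recognizer attempts to recognize words from within that vocabulary. Techniques can also be provided for recognizing words that are not in the vocabulary. The vocabulary can be expanded or replaced to tailor the performance of the recognition operation to the user.
In the case of the X Speak application, which is a tool associated with an operating system, there is a dedicated set of commands that can be recognized, and this set of available commands is pre-programmed into the application. This lacks flexibility and is poorly suited to modern multi-application personal computers and similar equipment, in which new applications are frequently loaded and many applications may be run in succession.
There is a desire for a better speech recognition interface, capable of at least partially replacing both the keyboard for data and command entry and the mouse for screen navigation.
Greater flexibility for application developers who wish to speech-enable their applications is provided by a speech application programming interface (SAPI) from Microsoft Corporation, which permits a general-purpose speech search engine to recognize the commands of different applications. However, no provision is made for directing speech to any application other than the current application, or for handling multiple speech-enabled applications. Nor is any provision made for recognizing commands for an application that has not yet been activated or run for the first time.
There is a desire for a speech interface that can direct speech to multiple applications, or to applications that are newly installed and have not yet been operated.
A further problem is that speech may contain operating system commands (e.g. "minimize window", "close window"), application-directed voice commands (e.g. "begin dictation") and application-directed content (e.g. "memo to Mr. Jones"). The most appropriate destination for the speech needs to be determined, and this cannot readily be done without performing recognition. The recognition is preferably tailored to the potential task to which the speech may be directed. For this, a vocabulary and a language model (or its equivalent) specific to that task are needed.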
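FIG. 1 shows a typical personal computer, with icons of multiple applications displayed on its screen, suitable for implementing the invention.
FIG. 2 is a block diagram of a set of computer programs in accordance with the invention.
FIG. 3 is a process flow diagram illustrating an alternative embodiment of the invention.
FIG. 4 is a process flow diagram of an embodiment of the invention that is an alternative to the embodiment of FIG. 3.
FIG. 5 is a hardware block diagram further illustrating the embodiment of FIG. 4.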
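An architecture is now described in which multiple applications with speech recognition capability each have the opportunity to indicate to the operating system whether a given speech input is useful to that individual application. This has the advantage that an application is better placed than the operating system to decide whether certain speech is relevant to it. A better decision can therefore be made as to whether the speech should be directed to the operating system or to a particular application.
The computer 10 is shown as having a screen 12, on which various icons and windows are displayed. The computer is shown as having an icon 14 representing a hard drive located in the computer 10, a first application icon 16 representing a first application stored in the computer 10, and a second application icon 18 representing a second application stored in the computer 10. The first application is named "App A" and the second application is named "App B". Also shown on the screen 12 are a first window 20 and a second window 22. The first window 20 is "in focus", meaning that it is in the foreground and is the currently active window. The second window 22 is "out of focus", meaning that it is in the background. For purposes of description, the example considered is one in which the first window 20 is an input/output user interface window for App A and the second window 22 is an input/output user interface window for App B.
In typical prior art operation, the window 20 is created by moving a mouse to icon 16 and double-clicking on it. Similarly, the second window 22 is created by moving the mouse to icon 18 and double-clicking on that icon. Double-clicking on an icon causes the application it represents to be invoked, or "launched". Thereafter, a single click on the icon or on its corresponding window brings that window into focus. There can, of course, be multiple windows corresponding to a single application.
FIG. 1 shows a microphone 40 that can receive voice commands as voice input. A voice command may cause an operation to be performed in App A or App B. Voice input may cause text or other information to be entered into a file created by App A or App B. The present invention addresses the problem of deciding to which application a voice command or voice input should be directed.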
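Referring now to FIG. 2, a set of computer programs is shown, comprising an operating system 30, first and second applications 32 and 34, and a phoneme network generator 36. This set of computer programs is loaded into a microprocessor 38 located in the computer 10. A microphone 40 is connected to the microprocessor 38 via an analog-to-digital converter 39. Also connected to the microprocessor 38 is a memory 42, which may be random access memory but is preferably read-only memory. Stored in the memory 42 are first and second vocabularies 44 and 46 and first and second language models 48 and 50, associated with the first and second applications 32 and 34 respectively.
The phoneme network generator 36 is preferably a context-dependent (CD) phoneme network generator, as described in the co-pending patent application of Sreeram Balakrishnan et al., filed on November 26, 1997, entitled "Context Dependent Phoneme Networks For Encoding Speech Information", assigned to the same assignee as the present application and incorporated herein by reference. Alternatively, the phoneme network generator 36 may be context-independent.
Turning to the details of the programs implemented in the microprocessor 38, the phoneme network generator 36 comprises an input 60 connected to the analog-to-digital converter 39. A feature extraction program module 62 is connected to the input 60, and a decoder 64 (preferably a Viterbi decoder) is connected to the feature extraction program module 62. The decoder 64 has an output connected to the first and second applications 32 and 34 respectively via an application programming interface (API) 35 of the operating system 30.
The first application 32 has a search applet 33, which is able to call up the first vocabulary 44 and the first language model 48. The second application 34 likewise has a search applet 37, which is able to call up the second vocabulary 46 and the second language model 50. Each vocabulary contains the words or characters that may need to be recognized by the associated application. Each language model contains words, or combinations of words, letters or characters, that are commonly used in that application. There may be (and indeed generally will be) words or characters common to both applications.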
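The example considered below is one in which App A is "Netscape" and App B is WisdomPen™. The first vocabulary 44 contains several commands useful for a browser, such as "start", "open", "go back", "search", "home" and "Netscape". The second vocabulary 46 contains several commands useful for a handwriting recognizer, such as "start", "open", "next character", "phrase", "edit" and "WisdomPen". Note that the word combination "start Netscape" is unique to the first vocabulary 44, while "start WisdomPen" is unique to the second vocabulary 46. Even if these word combinations were not unique, the first language model 48 gives the word combination "start Netscape" a high probability score and gives "start WisdomPen" a low probability score. Conversely, the second language model 50 gives "start WisdomPen" a high probability score and gives "start Netscape" a low probability score.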
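The way two application-specific language models can rank the same word combination differently might be sketched as follows. This is an illustrative sketch only: the bigram tables, the probability values and the back-off floor are assumptions chosen for the example and do not come from the patent.

```python
import math

# Hypothetical bigram probabilities for each application's language model.
LANGUAGE_MODEL_A = {("start", "Netscape"): 0.6, ("go", "back"): 0.3}
LANGUAGE_MODEL_B = {("start", "WisdomPen"): 0.6, ("next", "character"): 0.3}
FLOOR = 1e-4  # assumed probability for word pairs a model has never seen


def log_score(words, language_model, floor=FLOOR):
    """Sum the log bigram probabilities of a word sequence under one model."""
    return sum(
        math.log(language_model.get(pair, floor))
        for pair in zip(words, words[1:])
    )


phrase = ["start", "Netscape"]
score_a = log_score(phrase, LANGUAGE_MODEL_A)  # high: the pair is in model A
score_b = log_score(phrase, LANGUAGE_MODEL_B)  # low: model B falls back to FLOOR
```

Under these assumed values, model A scores "start Netscape" well above model B, and the reverse holds for "start WisdomPen".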
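In operation, the user speaks a command into the microphone 40, the speech is digitized in the A/D converter 39, and features are extracted in the feature extraction program module 62. The decoder 64 generates a CD phoneme network from the extracted features and outputs (on the API 35) data that represents, for a given phoneme, the identities of the preceding and following phonemes, the identity of the current phoneme, the duration of the current phoneme and the score of the current phoneme (or the scores of a small number of alternative phonemes). The CD network so generated is delivered simultaneously to (or is picked up by) the search applets 33 and 37 of the applications 32 and 34. Using the data in the vocabularies 44 and 46 and the language models 48 and 50, each of the applets 33 and 37 provides the result of its recognition operation, including at least a "score" or "confidence level" for the recognition. The applications 32 and 34 deliver these scores, via API outputs 71 and 72 respectively, to an arbitrator 70 in the operating system 30. The arbitrator 70 decides which of the multiple applications is to act on the input speech.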
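The data flow described above, in which one phoneme network is handed to every application's search applet and each returns a score, might be sketched as below. The field names, the `make_applet` helper and its hit-ratio scoring are hypothetical simplifications for illustration; a real search applet would search the network against its vocabulary and language model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhonemeEntry:
    """One entry of the phoneme network, with the fields listed in the text."""
    previous: str            # identity of the preceding phoneme
    current: str             # identity of the current phoneme
    following: str           # identity of the following phoneme
    duration_ms: int         # duration of the current phoneme (assumed unit)
    score: float             # score of the current phoneme
    alternatives: tuple = () # optional (phoneme, score) pairs


def make_applet(known_phonemes):
    """Build a toy search applet: the fraction of entries it knows is its score."""
    def applet(network):
        if not network:
            return 0.0
        hits = sum(1 for entry in network if entry.current in known_phonemes)
        return hits / len(network)
    return applet


def collect_scores(network, search_applets):
    """Deliver the same network to every applet and gather the scores."""
    return {name: applet(network) for name, applet in search_applets.items()}


network = [
    PhonemeEntry("sil", "g", "ow", 80, 0.9),
    PhonemeEntry("g", "ow", "sil", 120, 0.8),
]
scores = collect_scores(
    network,
    {"App A": make_applet({"g", "ow"}), "App B": make_applet({"g"})},
)
```

The resulting dictionary of per-application scores is what an arbitrator would then compare.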
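In its simplest form, the arbitrator 70 determines that the application providing the highest recognition score is to receive the input speech and act on it.
Thus, for example, if the input speech is "go back" or "start Netscape", the arbitrator 70 informs application A, via a control output 73, that it is to act on the input speech, and informs application B, via a control output 74, that it is to ignore the input speech. On the other hand, if the input speech is "start WisdomPen" or "edit", it is application B that is permitted to act on the input. The action taken in response to the input may include starting the application, bringing the application's window to the foreground, changing a mode within the application, entering data into a file within the application, or other operations.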
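The simplest form of arbitration, where the highest recognition score wins, can be sketched in a few lines. The function name and the score values are illustrative assumptions.

```python
def arbitrate(scores):
    """Return the application with the highest recognition score, or None."""
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical scores reported by the two applications for "go back".
winner = arbitrate({"App A": 0.91, "App B": 0.12})
```

The winning application would be told to act on the input speech and the other told to ignore it.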
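In a more advanced form, the arbitrator 70 is context-dependent; for example, it gives an in-focus application higher priority than an out-of-focus application. A higher confidence threshold must be exceeded in recognition before input speech is directed to an out-of-focus application, whereas a lower confidence threshold allows input speech to be directed to the current application. In addition, the arbitrator 70 distinguishes between commands and input data, or between different types of commands. For example, a set of basic commands ("start", "open", "quit"), when accompanied by a high-confidence identifier (e.g. "Netscape" or "WisdomPen"), is directed by the arbitrator 70 to an out-of-focus application more readily (that is, at a lower confidence) than other commands or input. To this end, the set of basic commands can be stored in the arbitrator 70.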
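A context-dependent arbitrator of this kind might look like the sketch below. The three threshold values are assumptions, and the check for a basic command is simplified to the first word of the recognized text; the accompanying high-confidence identifier mentioned in the text is not modeled separately.

```python
from dataclasses import dataclass

IN_FOCUS_THRESHOLD = 0.5        # assumed: lower bar for the in-focus application
OUT_OF_FOCUS_THRESHOLD = 0.8    # assumed: higher bar for out-of-focus applications
BASIC_COMMAND_THRESHOLD = 0.6   # assumed: relaxed bar for basic commands
BASIC_COMMANDS = {"start", "open", "quit"}  # set stored in the arbitrator


@dataclass
class RecognitionResult:
    app: str
    text: str
    confidence: float
    in_focus: bool


def required_confidence(result):
    """Pick the threshold a result must reach before its app may act."""
    if result.in_focus:
        return IN_FOCUS_THRESHOLD
    words = result.text.split()
    if words and words[0] in BASIC_COMMANDS:
        return BASIC_COMMAND_THRESHOLD
    return OUT_OF_FOCUS_THRESHOLD


def arbitrate(results):
    """Return the app with the best confidence among those over threshold."""
    eligible = [r for r in results if r.confidence >= required_confidence(r)]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.confidence).app
```

With these assumed thresholds, an out-of-focus "edit" at confidence 0.7 loses to an in-focus "search" at 0.55, while an out-of-focus "start WisdomPen" at 0.7 wins.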
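In this way, each of the applications 32 and 34 (and any further applications) submits a request to the operating system 30. The operating system 30 arbitrates according to the input speech and permits one application, in preference to the others, to receive the input speech and act on it.
Thus, a processor 38 has been described comprising: an input 60 for receiving voice commands; an operating system 30 having an arbitrator 70; and one, two or more applications 32, 34 under the control of the operating system. The applications are modularly independent of the operating system. Each application has a speech recognizer 33, 37 for recognizing voice commands and for providing to the arbitrator 70 recognition information 71, 72 indicating the ability of the speech recognizer to recognize the voice command.
The recognition information preferably depends on the application's vocabulary 44 and language model 48.
A method of operating a computer has also been described, comprising: inputting a voice command; comparing the input voice command with representations of voice commands associated with the application, to provide a recognition result; providing the recognition result to the operating system; and selectively permitting the application to act on the voice command, depending on the recognition result.
The method preferably further comprises generating, in association with the operating system, a phoneme network representing the voice command and delivering the phoneme network to the application. The phoneme network is preferably a context-dependent phoneme network.
In the preferred embodiment, the input voice command is compared with representations of voice commands associated with the first and second applications to provide first and second recognition results, and the step of selectively permitting comprises permitting one of the first and second applications to act in preference to the other according to those results, as a function of the current mode of operation of the first and second applications.
In addition, a data storage medium has been described comprising instructions and data which, when loaded into a general-purpose processor, cause the processor to comprise: an input for receiving digitized speech; a phoneme generator for generating phoneme identification information; an interface for interfacing with first and second applications; and an arbitrator. The interface is arranged to deliver the phoneme identification information to the first and second applications and to deliver recognition information from the first and second applications to the arbitrator. The arbitrator is arranged to arbitrate between the first and second applications according to the recognition information and to control a selected one of the first and second applications according to the digitized speech received.
The first and second applications have been described as preferably comprising first and second vocabularies respectively and first and second language models respectively, such that a given digitized voice command results in different recognition information from the first and second applications.
Also described is a method of installing an application in a computer, comprising: registering the application with the operating system; extracting from the application selected commands and their phonetic representations; and delivering those commands and phonetic representations to a vocabulary of a search engine.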
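Referring now to FIG. 3, an arrangement is illustrated that allows input voice commands specific to an application to be recognized before that application has been run for the first time.
The operating system 30 and the first and second applications 32 and 34 are illustrated. The applications are connected to a search engine 104 via a speech application programming interface 102. The applications 32 and 34 have vocabularies 110 and 120 respectively (equivalent to, or forming part of, the vocabularies 44 and 46 of FIG. 2). In this example, vocabulary 110 contains the word "Netscape" and vocabulary 120 contains the word "Huibi", which is the Chinese name of "WisdomPen".
The search engine 104 has an active vocabulary 106 that includes many common commands, such as "start", "open" and "quit". It also contains the word "Netscape", which either is pre-programmed into the search engine vocabulary 106 (being a commonly used word) or has been supplied to the vocabulary 106 from vocabulary 110 during earlier operation of the application 32. By contrast, the word "Huibi" is not a very common word, so it is unlikely to be present in the vocabulary 106 through normal operation.
An installation program 130 is used to install application B. On installation, the program 130 registers application B with the operating system 30 and, at the same time, extracts words and their phoneme network equivalents from vocabulary 120 and inserts these words and phoneme network equivalents into the search engine vocabulary 106.
Thus, when the command "start Huibi" is spoken for the first time, it is recognized by the search engine 104. The search engine 104 is then able to cause the operating system 30 to launch the application 34.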
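The install-time step, in which an application's words are copied into the search engine's vocabulary so that the application can be started by voice before it has ever run, might be sketched as follows. The class names, the method names and the phonetic strings are hypothetical placeholders for illustration.

```python
class SearchEngine:
    """Toy search engine holding an active vocabulary (word -> phonetic form)."""

    def __init__(self, vocabulary):
        self.vocabulary = dict(vocabulary)

    def recognizes(self, command):
        """A command is recognizable only if every word is in the vocabulary."""
        return all(word in self.vocabulary for word in command.split())


class OperatingSystem:
    """Toy operating system that only tracks registered applications."""

    def __init__(self):
        self.registered = set()

    def register(self, app_name):
        self.registered.add(app_name)


def install(app_name, app_vocabulary, operating_system, engine):
    """Register the app and copy its words into the engine's vocabulary."""
    operating_system.register(app_name)
    engine.vocabulary.update(app_vocabulary)


# Placeholder phonetic strings; real entries would be phoneme networks.
engine = SearchEngine({"start": "s t aa r t", "open": "ow p ax n",
                       "quit": "k w ih t", "Netscape": "n eh t s k ey p"})
operating_system = OperatingSystem()
before = engine.recognizes("start Huibi")
install("App B", {"Huibi": "hh w ey b iy"}, operating_system, engine)
after = engine.recognizes("start Huibi")
```

Before installation the engine cannot match "start Huibi"; after installation it can, although App B itself has never been run.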
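The applications 32 and 34 are modularly independent of each other. This has the significant advantage that any one of these applications can be removed or replaced without affecting the operation of the system. Further applications can also be added. Each time an application is added (or replaced), the new application has its own operating system vocabulary, which is delivered to the operating system at installation so that the operating system can search for commands relevant to that application. This is a significant improvement over prior arrangements: the system can be expanded or upgraded without updating the operating system itself with new voice commands and the like, and without training the operating system or expanding any vocabulary associated with the speech recognizer 36. Either of the first and second applications can be removed without degrading the performance of the other.
A further embodiment of the invention, which is an alternative to the embodiment of FIG. 3 and is complementary to, or an enhancement of, the embodiment of FIG. 2, is illustrated in FIGS. 4 and 5.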
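Referring now to FIG. 4, an alternative software arrangement is shown, in which a first recognizer 200 (referred to as recognizer A) and a second recognizer 202 (referred to as recognizer B) are provided. These recognizers are applets that are launched by the operating system 30 when the microprocessor 38 starts up. The recognizers are independent of applications A and B, but are used by applications A and B in the manner described below. Attached to recognizer A is a small vocabulary 210 of basic words, similar to the vocabulary 110 of FIG. 3. Attached to recognizer B is a second vocabulary 220 of basic words, similar to the vocabulary 120 of FIG. 3.
In operation, the operating system 30 causes recognizers A and B to be launched when the microprocessor 38 starts up. This means that recognizers A and B are transferred from non-volatile memory, such as a hard disk, into the random access memory of the computer 10. Applications A and B need not be launched at this time. As speech is input at the microphone 40, a phonetic representation of it (for example a triphone network representation) is provided by the operating system 30 to recognizers A and B. The recognizers continuously search the phonetic representation of the received input speech for the keywords stored as phonetic representations in the vocabularies 210 and 220 respectively. Recognizers A and B continuously provide scores to the operating system 30, indicating the degree of correlation between the input speech and the keywords stored in their respective vocabularies 210 and 220. The arbitrator 70 in the operating system 30 (see FIG. 2) compares these scores and decides whether one of applications A and B should be launched. Thus, for example, if the words "open Netscape" are received at the microphone 40, recognizer A delivers a high score for that command and recognizer B delivers a lower score. The score delivered by recognizer A for the command "open Netscape" will be high enough (for example, above a predetermined threshold) for the arbitrator 70 to determine that the words "open Netscape" are a command to launch the application 32 (and not, for example, input text intended for some other application).
Once this determination has been made, the operating system 30 launches application A. Launching or initializing application A causes that application to be transferred from non-volatile memory into random access memory, whereupon application A becomes fully functional and begins to use the processing power of the microprocessor 38. Meanwhile, application B continues to reside only in non-volatile memory and uses no processing power.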
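Once launched, application A continues to use recognizer A (via link 202) to perform recognition on content that is delivered from the microphone 40 and is intended for application A. At this point, application A can also substantially supplement the vocabulary being used by the recognizer and can introduce language modeling features, none of which is required before application A is launched. In other words, while recognizer A is searching for keywords it requires minimal language modeling capability, whereas once application A has been launched the language modeling capability used by recognizer A is significantly increased.
Similarly, when recognizer B delivers to the operating system 30 a high score for a command or set of words stored in the vocabulary 220, the operating system 30 launches application B.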
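The keyword-spotting arrangement described above, with small recognizers resident from start-up and full applications loaded only when a score crosses a threshold, might be sketched as follows. The word-overlap (Jaccard) score, the threshold value and all names are assumptions standing in for a real phonetic keyword search.

```python
LAUNCH_THRESHOLD = 0.8  # assumed predetermined threshold


class KeywordRecognizer:
    """Small resident recognizer with a short list of keyword phrases."""

    def __init__(self, app_name, keywords):
        self.app_name = app_name
        self.keywords = [set(phrase.lower().split()) for phrase in keywords]

    def score(self, utterance):
        """Best word overlap between the utterance and any keyword phrase."""
        words = set(utterance.lower().split())
        best = 0.0
        for keyword in self.keywords:
            union = words | keyword
            if union:
                best = max(best, len(words & keyword) / len(union))
        return best


def maybe_launch(utterance, recognizers, running):
    """Launch the best-scoring application if its score reaches the threshold."""
    scores = {r.app_name: r.score(utterance) for r in recognizers}
    if not scores:
        return None
    best = max(scores, key=scores.get)
    if scores[best] >= LAUNCH_THRESHOLD:
        running.add(best)  # stands in for loading the app into RAM
        return best
    return None


recognizers = [
    KeywordRecognizer("App A", ["open Netscape", "quit Netscape"]),
    KeywordRecognizer("App B", ["open Huibi", "quit Huibi"]),
]
running = set()
launched = maybe_launch("open Netscape", recognizers, running)
```

Here only App A is launched for "open Netscape"; App B stays unloaded, and speech that matches no keyword launches nothing.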
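In an entirely analogous manner, recognizers A and B can deliver to the operating system a high score for a command such as "quit Netscape" or "Huibi". Once the operating system 30 recognizes such a command, it can cause the corresponding application A or B to be closed.
Referring now to FIG. 5, applications A and B and recognizers A and B are illustrated in their start-up mode in the computer 10. FIG. 5 shows a hard disk drive or other non-volatile memory 300 and a random access memory 310, both connected to the processor 38. Residing in the non-volatile memory 300 are the applications 32 and 34 and the small recognizers 200 and 202. Also in the non-volatile memory 300 is a start-up program 320.
When the processor 38 starts up, it looks by default in the non-volatile memory 300 for its start-up program and finds the start-up program 320. The start-up program 320 causes the recognizers 200 and 202 to be loaded, through the processor 38, into the random access memory 310. The recognizers resident in the random access memory 310 are shown as recognizers 200' and 202'. The applications 32 and 34 have not yet been loaded into the random access memory 310 at this time and occupy no valuable memory space. Likewise, they use none of the processing power of the processor 38. Applications A and B are much larger than their respective recognizers 200 and 202, so it is acceptable to load the recognizers into random access memory without loading the complete applications. When one of the recognizers 200' and 202' provides a high score for a command relevant to that recognizer's application, the processor 38 causes that application to be transferred from the non-volatile memory 300 into the random access memory 310.
The preferred embodiments above have been given by way of example only, and those of ordinary skill in the art may modify their details without departing from the spirit and scope of the invention.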

Claims (13)

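1. A processor comprising:
an input device for receiving a voice command;
an operating system having an arbitrator; and
at least one application under the control of the operating system and modularly independent of the operating system, the application comprising a speech recognizer for recognizing the voice command and for providing to the arbitrator recognition information indicating the ability of the speech recognizer to recognize the voice command.
2. The processor of claim 1, wherein the at least one application comprises a vocabulary and a language model, and wherein the recognition information depends on the vocabulary and the language model.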
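3. A method of operating a computer having an operating system and at least one application, comprising:
inputting a voice command;
comparing the input voice command with representations of voice commands associated with the application, to provide a recognition result;
providing the recognition result to the operating system; and
selectively permitting the application to act on the voice command, depending on the recognition result.
4. The method of claim 3, further comprising generating, in association with the operating system, a phoneme network representing the voice command and delivering the phoneme network to the application.
5. The method of claim 4, wherein the phoneme network is a context-dependent phoneme network.
6. The method of claim 3, wherein the input voice command is compared simultaneously with representations of voice commands associated with first and second applications, to provide first and second recognition results.
7. The method of claim 6, wherein the step of selectively permitting comprises permitting one of the first and second applications to act in preference to the other of the first and second applications, according to the results.
8. The method of claim 7, wherein the step of selectively permitting comprises arbitrating between the first and second applications as a function of a current mode of operation of the first and second applications.
9. The method of claim 6, further comprising generating, in association with the operating system, a phoneme network representing the voice command and delivering the phoneme network to the first and second applications.
10. The method of claim 9, wherein the phoneme network is a context-dependent phoneme network.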
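11. A method of installing an application in a computer, comprising:
registering the application with an operating system;
extracting from the application selected commands and their phonetic representations; and
delivering the commands and phonetic representations to a vocabulary of a search engine.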
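12. A method of operating a computer, comprising:
invoking a first recognizer associated with a first application that has not yet been invoked;
invoking a second recognizer associated with a second application that has not yet been invoked;
receiving a voice command;
performing recognition of the voice command in the first and second recognizers, to provide first and second recognition results;
arbitrating between the first and second recognition results; and
invoking one of the first and second applications according to the result of the arbitrating step.
13. The method of claim 12, wherein, when the first application is invoked, the first recognizer performs speech recognition for the first application, and when the second application is invoked, the second recognizer performs speech recognition for the second application.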
CN99800423A 1998-04-01 1999-03-11 Method of operating a computer Expired - Fee Related CN1109328C (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/053,432 1998-04-01
US09/053,432 US6233559B1 (en) 1998-04-01 1998-04-01 Speech control of multiple applications using applets

Publications (2)

Publication Number Publication Date
CN1262762A (zh) 2000-08-09
CN1109328C (zh) 2003-05-21

Family

ID=21984190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN99800423A Expired - Fee Related CN1109328C (zh) 1998-04-01 1999-03-11 Method of operating a computer

Country Status (8)

Country Link
US (1) US6233559B1 (zh)
EP (1) EP0986809B1 (zh)
CN (1) CN1109328C (zh)
AU (1) AU3080599A (zh)
CA (1) CA2289836A1 (zh)
DE (1) DE69910928T2 (zh)
HK (1) HK1029208A1 (zh)
WO (1) WO1999050831A1 (zh)

US8731939B1 (en) * 2010-08-06 2014-05-20 Google Inc. Routing queries based on carrier phrase registration
USD643400S1 (en) 2010-08-19 2011-08-16 Vocollect Healthcare Systems, Inc. Body-worn mobile device
USD643013S1 (en) 2010-08-20 2011-08-09 Vocollect Healthcare Systems, Inc. Body-worn mobile device
US8719006B2 (en) 2010-08-27 2014-05-06 Apple Inc. Combined statistical and rule-based part-of-speech tagging for text-to-speech synthesis
US9600135B2 (en) 2010-09-10 2017-03-21 Vocollect, Inc. Multimodal user notification system to assist in data capture
US8719014B2 (en) 2010-09-27 2014-05-06 Apple Inc. Electronic device with text error correction based on voice recognition data
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10515147B2 (en) 2010-12-22 2019-12-24 Apple Inc. Using statistical language models for contextual lookup
US9171551B2 (en) * 2011-01-14 2015-10-27 GM Global Technology Operations LLC Unified microphone pre-processing system and method
US8781836B2 (en) 2011-02-22 2014-07-15 Apple Inc. Hearing assistance system for providing consistent human speech
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US8914290B2 (en) 2011-05-20 2014-12-16 Vocollect, Inc. Systems and methods for dynamically improving user intelligibility of synthesized speech in a work environment
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US20120310642A1 (en) 2011-06-03 2012-12-06 Apple Inc. Automatically creating a mapping between text data and audio data
WO2012169679A1 (ko) * 2011-06-10 2012-12-13 LG Electronics Inc. Display apparatus, method for controlling display apparatus, and voice recognition system for display apparatus
US8812294B2 (en) 2011-06-21 2014-08-19 Apple Inc. Translating phrases from one language into another using an order-based set of declarative rules
US8706472B2 (en) 2011-08-11 2014-04-22 Apple Inc. Method for disambiguating multiple readings in language conversion
WO2013022135A1 (en) * 2011-08-11 2013-02-14 Lg Electronics Inc. Electronic device and method of controlling the same
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
US8762156B2 (en) 2011-09-28 2014-06-24 Apple Inc. Speech recognition repair using contextual information
US8924219B1 (en) * 2011-09-30 2014-12-30 Google Inc. Multi hotword robust continuous voice command detection in mobile devices
US9183835B2 (en) * 2011-10-18 2015-11-10 GM Global Technology Operations LLC Speech-based user interface for a mobile device
US9159322B2 (en) * 2011-10-18 2015-10-13 GM Global Technology Operations LLC Services identification and initiation for a speech-based interface to a mobile device
US20130212478A1 (en) * 2012-02-15 2013-08-15 Tvg, Llc Audio navigation of an electronic interface
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9280610B2 (en) 2012-05-14 2016-03-08 Apple Inc. Crowd sourcing information to fulfill user requests
US8775442B2 (en) 2012-05-15 2014-07-08 Apple Inc. Semantic search using a single-source semantic model
US10417037B2 (en) 2012-05-15 2019-09-17 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US9721563B2 (en) 2012-06-08 2017-08-01 Apple Inc. Name recognition system
WO2013185109A2 (en) 2012-06-08 2013-12-12 Apple Inc. Systems and methods for recognizing textual identifiers within a plurality of words
US9734839B1 (en) * 2012-06-20 2017-08-15 Amazon Technologies, Inc. Routing natural language commands to the appropriate applications
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9715879B2 (en) * 2012-07-02 2017-07-25 Salesforce.Com, Inc. Computer implemented methods and apparatus for selectively interacting with a server to build a local database for speech recognition at a device
US9026448B2 (en) 2012-08-16 2015-05-05 Nuance Communications, Inc. User interface for entertainment systems
US9497515B2 (en) 2012-08-16 2016-11-15 Nuance Communications, Inc. User interface for entertainment systems
US8799959B2 (en) 2012-08-16 2014-08-05 Hoi L. Young User interface for entertainment systems
US9106957B2 (en) * 2012-08-16 2015-08-11 Nuance Communications, Inc. Method and apparatus for searching data sources for entertainment systems
US9031848B2 (en) 2012-08-16 2015-05-12 Nuance Communications, Inc. User interface for searching a bundled service content data source
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9547647B2 (en) 2012-09-19 2017-01-17 Apple Inc. Voice-based media searching
US8935167B2 (en) 2012-09-25 2015-01-13 Apple Inc. Exemplar-based latent perceptual modeling for automatic speech recognition
US9570076B2 (en) * 2012-10-30 2017-02-14 Google Technology Holdings LLC Method and system for voice recognition employing multiple voice-recognition techniques
KR102516577B1 (ko) 2013-02-07 2023-04-03 Apple Inc. Voice trigger for a digital assistant
US10642574B2 (en) 2013-03-14 2020-05-05 Apple Inc. Device, method, and graphical user interface for outputting captions
US10572476B2 (en) 2013-03-14 2020-02-25 Apple Inc. Refining a search based on schedule items
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9733821B2 (en) 2013-03-14 2017-08-15 Apple Inc. Voice control to diagnose inadvertent activation of accessibility features
US9977779B2 (en) 2013-03-14 2018-05-22 Apple Inc. Automatic supplementation of word correction dictionaries
US10652394B2 (en) 2013-03-14 2020-05-12 Apple Inc. System and method for processing voicemail
US10748529B1 (en) 2013-03-15 2020-08-18 Apple Inc. Voice activated device for use with a voice-based digital assistant
US11151899B2 (en) 2013-03-15 2021-10-19 Apple Inc. User training by intelligent digital assistant
CN112230878A (zh) 2013-03-15 2021-01-15 Apple Inc. Context-sensitive handling of interruptions
US9978395B2 (en) 2013-03-15 2018-05-22 Vocollect, Inc. Method and system for mitigating delay in receiving audio stream during production of sound from audio stream
WO2014144949A2 (en) 2013-03-15 2014-09-18 Apple Inc. Training an at least partial voice command system
US20140282272A1 (en) * 2013-03-15 2014-09-18 Qualcomm Incorporated Interactive Inputs for a Background Task
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
US9431008B2 (en) * 2013-05-29 2016-08-30 Nuance Communications, Inc. Multiple parallel dialogs in smart phone applications
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
WO2014197334A2 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
WO2014197335A1 (en) 2013-06-08 2014-12-11 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
EP3008641A1 (en) 2013-06-09 2016-04-20 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
CN105265005B (zh) 2013-06-13 2019-09-17 Apple Inc. System and method for emergency calls initiated by voice command
US9997160B2 (en) * 2013-07-01 2018-06-12 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and methods for dynamic download of embedded voice components
US10186262B2 (en) 2013-07-31 2019-01-22 Microsoft Technology Licensing, Llc System with multiple simultaneous speech recognizers
WO2015020942A1 (en) 2013-08-06 2015-02-12 Apple Inc. Auto-activating smart responses based on activities from remote devices
US10296160B2 (en) 2013-12-06 2019-05-21 Apple Inc. Method for extracting salient dialog usage from live data
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
RU2654789C2 (ru) 2014-05-30 2018-05-22 Yandex LLC Method (variants) and electronic device (variants) for processing a user's spoken request
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
EP3149728B1 (en) 2014-05-30 2019-01-16 Apple Inc. Multi-command single utterance input method
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US9898459B2 (en) 2014-09-16 2018-02-20 Voicebox Technologies Corporation Integration of domain information into state transitions of a finite state transducer for natural language processing
EP3195145A4 (en) 2014-09-16 2018-01-24 VoiceBox Technologies Corporation Voice commerce
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
CN107003999B (zh) 2014-10-15 2020-08-21 Voicebox Technologies System and method for providing follow-up responses to a user's prior natural language inputs
US10614799B2 (en) 2014-11-26 2020-04-07 Voicebox Technologies Corporation System and method of providing intent predictions for an utterance prior to a system detection of an end of the utterance
US10431214B2 (en) 2014-11-26 2019-10-01 Voicebox Technologies Corporation System and method of determining a domain and/or an action related to a natural language input
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US20160225369A1 (en) * 2015-01-30 2016-08-04 Google Technology Holdings LLC Dynamic inference of voice command for software operation from user manipulation of electronic device
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US9578173B2 (en) 2015-06-05 2017-02-21 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
WO2017132660A1 (en) * 2016-01-29 2017-08-03 Liquid Analytics, Inc. Systems and methods for dynamic prediction of workflows
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10049670B2 (en) 2016-06-06 2018-08-14 Google Llc Providing voice action discoverability example for trigger term
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10049663B2 (en) 2016-06-08 2018-08-14 Apple, Inc. Intelligent automated assistant for media exploration
DK179309B1 (en) 2016-06-09 2018-04-23 Apple Inc Intelligent automated assistant in a home environment
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10586535B2 (en) 2016-06-10 2020-03-10 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
DK179415B1 (en) 2016-06-11 2018-06-14 Apple Inc Intelligent device arbitration and control
DK201670540A1 (en) 2016-06-11 2018-01-08 Apple Inc Application integration with a digital assistant
DK179049B1 (en) 2016-06-11 2017-09-18 Apple Inc Data driven natural language event detection and classification
DK179343B1 (en) 2016-06-11 2018-05-14 Apple Inc Intelligent task discovery
US20180025731A1 (en) * 2016-07-21 2018-01-25 Andrew Lovitt Cascading Specialized Recognition Engines Based on a Recognition Policy
US10714121B2 (en) 2016-07-27 2020-07-14 Vocollect, Inc. Distinguishing user speech from background speech in speech-dense environments
US10331784B2 (en) 2016-07-29 2019-06-25 Voicebox Technologies Corporation System and method of disambiguating natural language processing requests
US10261752B2 (en) 2016-08-02 2019-04-16 Google Llc Component libraries for voice interaction services
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
DK201770439A1 (en) 2017-05-11 2018-12-13 Apple Inc. Offline personal assistant
DK179496B1 (en) 2017-05-12 2019-01-15 Apple Inc. User-specific acoustic models
DK179745B1 (en) 2017-05-12 2019-05-01 Apple Inc. Synchronization and task delegation of a digital assistant
DK201770431A1 (en) 2017-05-15 2018-12-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
DK201770432A1 (en) 2017-05-15 2018-12-21 Apple Inc. Hierarchical belief states for digital assistants
DK179560B1 (en) 2017-05-16 2019-02-18 Apple Inc. Far-field extension for digital assistant services
US11423215B2 (en) 2018-12-13 2022-08-23 Zebra Technologies Corporation Method and apparatus for providing multimodal input data to client applications
US20220284894A1 (en) * 2021-03-02 2022-09-08 Samsung Electronics Co., Ltd. Electronic device for processing user utterance and operation method therefor

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2818362B2 (ja) * 1992-09-21 1998-10-30 International Business Machines Corporation System and method for context switching in a speech recognition apparatus
EP0607615B1 (en) * 1992-12-28 1999-09-15 Kabushiki Kaisha Toshiba Speech recognition interface system suitable for window systems and speech mail systems
US5390279A (en) * 1992-12-31 1995-02-14 Apple Computer, Inc. Partitioning speech rules by context for speech recognition
US5384892A (en) * 1992-12-31 1995-01-24 Apple Computer, Inc. Dynamic language model for speech recognition
US5613036A (en) * 1992-12-31 1997-03-18 Apple Computer, Inc. Dynamic categories for a speech recognition system
US5689617A (en) * 1995-03-14 1997-11-18 Apple Computer, Inc. Speech recognition system which returns recognition results as a reconstructed language model with attached data values
US5677991A (en) * 1995-06-30 1997-10-14 Kurzweil Applied Intelligence, Inc. Speech recognition system using arbitration between continuous speech and isolated word modules
US5799279A (en) * 1995-11-13 1998-08-25 Dragon Systems, Inc. Continuous speech recognition of text and commands
US5819220A (en) * 1996-09-30 1998-10-06 Hewlett-Packard Company Web triggered word set boosting for speech interfaces to the world wide web
US5960399A (en) * 1996-12-24 1999-09-28 Gte Internetworking Incorporated Client/server speech processor/recognizer

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
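CN101238437B (zh) * 2005-08-11 2013-03-06 Koninklijke Philips Electronics N.V. Method of driving an interactive system and user interface system
CN105723449A (zh) * 2013-11-06 2016-06-29 Systran International Co., Ltd. Speech content analysis system based on keyword extraction from recorded voice data, indexing method using the system, and speech content analysis method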
US10304441B2 (en) 2013-11-06 2019-05-28 Systran International Co., Ltd. System for grasping keyword extraction based speech content on recorded voice data, indexing method using the system, and method for grasping speech content
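CN110825340A (zh) * 2014-07-25 2020-02-21 Google LLC Providing pre-computed hotword models
CN110825340B (zh) * 2014-07-25 2024-03-29 Google LLC Providing pre-computed hotword models
CN107077380A (zh) * 2014-09-30 2017-08-18 Nuance Communications, Inc. Method and apparatus for module arbitration
CN112073777A (zh) * 2019-06-10 2020-12-11 Hisense Visual Technology Co., Ltd. Voice interaction method and display device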

Also Published As

Publication number Publication date
DE69910928T2 (de) 2004-05-06
AU3080599A (en) 1999-10-18
WO1999050831A1 (en) 1999-10-07
CA2289836A1 (en) 1999-10-07
HK1029208A1 (en) 2001-03-23
DE69910928D1 (de) 2003-10-09
CN1109328C (zh) 2003-05-21
EP0986809A1 (en) 2000-03-22
EP0986809B1 (en) 2003-09-03
US6233559B1 (en) 2001-05-15

Similar Documents

Publication Publication Date Title
CN1109328C (zh) Method of operating a computer
US5937380A (en) Keypad-assisted speech recognition for text or command input to concurrently-running computer application
CN1145141C (zh) Method and apparatus for improving speech recognition accuracy
EP1094445B1 (en) Command versus dictation mode errors correction in speech recognition
US5799279A (en) Continuous speech recognition of text and commands
EP0773532B1 (en) Continuous speech recognition
US7010490B2 (en) Method, system, and apparatus for limiting available selections in a speech recognition system
US7146323B2 (en) Method and system for gathering information by voice input
US8571862B2 (en) Multimodal interface for input of text
US6366882B1 (en) Apparatus for converting speech to text
CN1150452C (zh) Speech recognition correction method and apparatus
AU2005229636B2 (en) Generic spelling mnemonics
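CN1144176C (zh) Automatic transcription system and method using two speech conversions and computer-assisted correction
CN1717717A (zh) Architecture for a speech input method editor for handheld portable devices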
EP0854418A2 (en) Speech recognition interface
US6876967B2 (en) Speech complementing apparatus, method and recording medium
CN1394331A (zh) Speech recognition method with a replace command
GB2323694A (en) Adaptation in speech to text conversion
EP1687807A2 (en) Topic specific models for text formatting and speech recognition
JP2003022089A (ja) Voice spelling in an audio-only interface
WO2007010176A1 (en) System, method, and technique for identifying a spoken utterance as a member of a list of known items allowing for variations in the form of the utterance
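CN1889171A (zh) Speech recognition method and system for recognizing characters/character strings
CN111800657B (zh) Multilingual recognition system for smart TVs based on Gaia AI voice control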
US7054813B2 (en) Automatic generation of efficient grammar for heading selection
BARTIŠIŪTĖ et al. VOICE-BASED COMPUTER DIALOGUES FOR MOBILE USERS AND IMPAIRED PEOPLE

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee