US20040176946A1 - Pronunciation symbols based on the orthographic lexicon of a language - Google Patents


Info

Publication number
US20040176946A1
Authority
US
United States
Prior art keywords
word
symbols
words
characters
written
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/685,566
Inventor
Jayadev Billa
Francis Kubala
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Raytheon BBN Technologies Corp
Original Assignee
BBNT Solutions LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Assigned to BBNT SOLUTIONS LLC reassignment BBNT SOLUTIONS LLC ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KUBALA, FRANCIS, BILLA, JAYADEV
Application filed by BBNT Solutions LLC filed Critical BBNT Solutions LLC
Priority to US10/685,566
Assigned to FLEET NATIONAL BANK, AS AGENT reassignment FLEET NATIONAL BANK, AS AGENT PATENT & TRADEMARK SECURITY AGREEMENT Assignors: BBNT SOLUTIONS LLC
Publication of US20040176946A1
Assigned to BBNT SOLUTIONS LLC reassignment BBNT SOLUTIONS LLC CORRECTIVE DOCUMENT PREVIOUSLY RECORDED AT REEL 014612 FRAME 0832. (ASSIGNMENT OF ASSIGNOR'S INTEREST) Assignors: KUBALA, FRANCIS G., BILLA, JAYADEV
Assigned to BBN TECHNOLOGIES CORP. reassignment BBN TECHNOLOGIES CORP. MERGER (SEE DOCUMENT FOR DETAILS). Assignors: BBNT SOLUTIONS LLC
Assigned to BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO BBNT SOLUTIONS LLC) reassignment BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO BBNT SOLUTIONS LLC) RELEASE OF SECURITY INTEREST Assignors: BANK OF AMERICA, N.A. (SUCCESSOR BY MERGER TO FLEET NATIONAL BANK)

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/28 - Constructional details of speech recognition systems
    • G10L15/32 - Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems

Definitions

  • the present invention relates generally to speech recognition and, more particularly, to the creation of system dictionaries for speech recognition systems.
  • Speech has not traditionally been valued as an archival information source. As effective as the spoken word is for communicating, archiving spoken segments in a useful and easily retrievable manner has long been a difficult proposition. Although the act of recording audio is not difficult, automatically transcribing and indexing speech in an intelligent and useful manner can be difficult.
  • HMMs Hidden Markov Models
  • FIG. 1 is a diagram illustrating a conventional speech recognition system.
  • a content transcription component 102 receives an input audio stream.
  • the content transcription component 102 converts speech in the input audio stream into text based on language and acoustic model(s) 101 .
  • the model(s) 101 are pre-trained based on a training audio stream that is expected to be similar to the run-time version of the input audio stream.
  • FIG. 2 is a diagram illustrating training of models 101 in additional detail.
  • models 101 receive the input audio stream 210 and a corresponding transcription 211 of the input audio stream.
  • the transcription may have been meticulously generated by a human based on the input audio stream 210 .
  • Transcription 211 may be converted into a stream of phonemes 213 by system dictionary 212 .
  • System dictionary 212 includes information regarding the relationship between the written orthographic representation of a word and the phonemes that correspond to the word.
  • a phoneme is generally defined as the smallest acoustic event that distinguishes one word from another.
  • models 101 learn associations between the audio stream 210 and the phoneme stream 213 .
  • models 101 may then generate phonemes for run-time audio stream 110 , including boundary indications between phonemes that correspond to different words.
  • Content transcription component 102 may use a phoneme dictionary to convert the generated phonemes into a conventional written transcription. In this manner, the run-time transcription is generated.
  • One disadvantage of the speech recognition system described above is that the system requires a phoneme-based system dictionary 212 to train models 101 .
  • When a user of the system wishes to add new words to the system, the user must update system dictionary 212 to include the new words and the phonemes corresponding to the new words. Generating the correct phonemes for any given word, however, is not a trivial task. In fact, this job is generally performed by a person with specialized training in this area (i.e., a speech expert). This can be a significant problem for speech recognition systems that are deployed in the field. If the user of the system is not a speech expert, adding words to the system can be a difficult proposition.
  • Systems and methods consistent with the present invention include speech recognition systems that use a system dictionary that discards the linguistic origin of phonemes and instead uses a pronunciation model based on the normal orthographic written form of the word. Entries in the system dictionary, consistent with the present invention, may be created in an automated manner.
  • One aspect consistent with the invention is directed to a method for specifying a pronunciation of a word.
  • the method includes receiving a written version of the word defined by a series of characters and separating the written version of the word into the series of characters.
  • the method further includes generating symbols that define a pronunciation of the word based solely on the series of characters.
  • a second aspect consistent with the invention is directed to a speech recognition system.
  • the system includes speech recognition models that convert audio containing speech into a transcription of the speech.
  • a system dictionary trains the speech recognition models by providing symbols that define pronunciations of words to the speech recognition models.
  • a dictionary creation component generates the symbols for the system dictionary, where the symbols are based on written characters of the words.
  • a third aspect consistent with the invention is directed to a method.
  • the method includes configuring a dictionary creation component to generate symbols that represent pronunciations of words in a target language. The symbols are generated based solely on written representations of the words and configuration is performed based on the target language.
  • the method also includes providing the dictionary creation component with written words and receiving the symbols from the dictionary creation component.
  • FIG. 1 is a diagram illustrating a conventional speech recognition system
  • FIG. 2 is a diagram illustrating, in additional detail, training of models in the speech recognition system of FIG. 1;
  • FIG. 3 is a diagram illustrating an exemplary system in which concepts consistent with the invention may be implemented
  • FIG. 4 is a diagram illustrating training of speech recognition models consistent with the present invention
  • FIG. 5 is a flow chart illustrating operation of a dictionary creation component consistent with an aspect of the invention
  • FIG. 6 is a flow chart illustrating operation of a dictionary creation component consistent with another aspect of the invention.
  • FIG. 7 is a flow chart illustrating operation of a dictionary creation component consistent with yet another aspect of the invention.
  • Systems and methods consistent with the present invention create system dictionary entries that automatically define word pronunciations.
  • the specification of the pronunciation of a word is based on the normal orthographic written version of the word.
  • systems and methods consistent with the present invention specify a pronunciation for the word. These pronunciations can be effectively used by speech recognition systems.
  • FIG. 3 is a diagram illustrating an exemplary system 300 in which concepts consistent with the invention may be implemented.
  • System 300 includes a computing device 301 that has a computer-readable medium 309 , such as random access memory, coupled to a processor 308 .
  • Computing device 301 may also include a number of additional external or internal devices.
  • An external input device 320 and an external output device 321 are shown in FIG. 3.
  • the input devices 320 may include, without limitation, a mouse, a CD-ROM, or a keyboard.
  • the output devices may include, without limitation, a display or an audio output device, such as a speaker.
  • computing device 301 may be any type of computing platform, and may be connected to a network 302 .
  • Computing device 301 is exemplary only. Concepts consistent with the present invention can be implemented on any computing device, whether or not connected to a network.
  • Processor 308 executes program instructions stored in memory 309 .
  • Processor 308 can be any of a number of well-known computer processors, such as processors from Intel Corporation, of Santa Clara, Calif.
  • Memory 309 may contain application programs and data.
  • memory 309 may include a system dictionary 315 and a dictionary creation component 316 .
  • System dictionary 315 may be used in training models for speech recognition in a manner similar to system dictionary 212 (FIG. 2). Entries in system dictionary 315 may be generated automatically by dictionary creation component 316 based on the written version of the word. This is in contrast to a conventional system dictionary, such as system dictionary 212 , in which each entry is defined as a series of phonemes derived from a human expert.
  • FIG. 4 is a diagram illustrating training of speech recognition models 401 consistent with the present invention.
  • Models 401 may be implemented in a manner similar to models 101 .
  • Models 401 may be trained based on an input audio stream 410 and an input symbol stream 413 .
  • Symbol stream 413 may include a phoneme-like representation of the words in audio stream 410 from system dictionary 315 .
  • System dictionary 315 defines the written version of words as a sequence of symbols that relate to the pronunciation of the words. The symbols may be created automatically by dictionary creation component 316 . From the viewpoint of models 401 , the symbols in system dictionary 315 are treated as if they were phonemes. In actuality, however, the symbols are not phonemes and do not need to be defined by a speech expert.
  • Models 401 may be based on HMMs.
  • Models 401 may include acoustic models and language models.
  • the acoustic models may describe the time-varying evolution of feature vectors for each symbol in symbol stream 413 .
  • the acoustic models may employ continuous HMMs to model each of the symbols in various phonetic contexts.
  • the language models may include n-gram language models, where the probability of each word is a function of the previous word (for a bi-gram language model) and the previous two words (for a tri-gram language model).
  • the higher the order of the language model, the higher the recognition accuracy, at the cost of slower recognition speeds.
  • FIG. 5 is a flow chart illustrating operation of dictionary creation component 316 consistent with an aspect of the invention.
  • the acts shown in FIG. 5 are particularly appropriate for languages in which pronunciations are “regular” in the sense that each written character tends to correspond to a sound.
  • Arabic, Italian, and Spanish are examples of regular languages.
  • dictionary creation component 316 receives the written version of the words that are to be entered into system dictionary 315 (Act 501 ).
  • the written version of the words may, for example, be manually entered by a user or the words may be obtained through an automated process.
  • the automated process may include scanning documents on a network, such as a web-crawling program that scans documents on the Internet.
  • dictionary creation component 316 separates the received word into its constituent characters and writes a corresponding entry to system dictionary 315 (Acts 502 and 503 ).
  • the Spanish word “ducha” (shower) would be processed by system dictionary creation component 316 as five sequential symbols, such as the symbols D-U-C-H-A.
  • the Spanish word “esponja” would correspond to seven sequential symbols, such as the symbols E-S-P-O-N-J-A.
  • FIG. 6 is a flow chart illustrating operation of dictionary creation component 316 consistent with another aspect of the invention.
  • Some languages, such as English, are not regular in the sense that the written characters, depending on the context of the character within its surrounding characters, may correspond to more than one sound.
  • Dictionary creation component 316 begins by receiving the written version of the words that are to be entered into system dictionary 315 (Act 601 ). This act is identical to Act 501 of FIG. 5.
  • Symbols for system dictionary 315 are based on the written characters or on groupings of the written characters.
  • Dictionary creation component 316 segments the input word into symbols that may represent a single written character or a grouping of characters (Act 602 ). These symbols are then entered into system dictionary 315 (Act 603 ).
  • the determination of which character groupings are considered to be a single symbol may be determined through a statistical analysis of the written words of the language.
  • the statistical analysis includes looking at character groupings of two characters (pairs) and three characters within a standard dictionary. The most frequently occurring two and three character groupings within the dictionary are determined to correspond to single symbols.
  • the frequency threshold for when a grouping is considered to be a “most frequently occurring” grouping may be manually determined by a speech expert based on the observed effectiveness of models 401 when trained using various thresholds.
  • FIG. 7 is a flow chart illustrating operation of dictionary creation component 316 consistent with yet another aspect of the invention.
  • the method of FIG. 7 begins when the written version of a word is input to dictionary creation component 316 (Act 701 ).
  • Dictionary creation component 316 determines to which of a number of predetermined word classes the word belongs (Act 702 ).
  • the word classes may be predefined by a speech expert or may be predefined based on a statistical analysis of the lexicon. For example, words whose origins derive from Old English words may be classified in an “Old English” classification. As another example, words with a certain suffix or prefix may be placed into another classification.
  • Dictionary creation component 316 converts each word into a series of pronunciation symbols based on the word classification. Each classification may be assigned to one of a number of conversion methods. For example, as shown in FIG. 7, depending on the classification, the word may be converted into symbols in which each symbol directly corresponds to a character of the word (Acts 703 and 704 , identical to Acts 502 and 503 ). Alternatively, depending on the classification, dictionary creation component 316 may segment the input word into symbols that may represent a single written character or a grouping of characters (Acts 705 and 706 , identical to Acts 602 and 603 ).
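The class-based dispatch described above (Acts 702-706) can be sketched as a table mapping each word class to a conversion method. This is an illustrative sketch only: the two class labels, the converter names, and the toy grouping clusters are assumptions, not part of the patent.

```python
def per_character(word):
    """One symbol per written character (Acts 703/704)."""
    return list(word.upper())

def with_groupings(word):
    """Character-grouping segmentation (Acts 705/706); a toy version
    that only knows the 'GHT' and 'ING' clusters as single symbols."""
    for cluster in ("GHT", "ING"):
        if word.upper().endswith(cluster):
            head = word[:-len(cluster)]
            return list(head.upper()) + [cluster]
    return list(word.upper())

# Each predetermined class is assigned to one conversion method.
CONVERTERS = {"regular": per_character, "irregular": with_groupings}

def convert(word, word_class):
    """Dispatch a word to the conversion method for its class."""
    return CONVERTERS[word_class](word)

print(convert("ducha", "regular"))      # ['D', 'U', 'C', 'H', 'A']
print(convert("wrought", "irregular"))  # ['W', 'R', 'O', 'U', 'GHT']
```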
  • dictionary creation component 316 converts the normal orthographic written representation of a word into a sequence of symbols that relate to the pronunciation of the word.
  • the symbols may be used to train conventional models for speech recognition.
  • dictionary creation component 316 may operate according to a number of conversion techniques, such as those shown in FIGS. 5-7.
  • a speech expert may initially configure dictionary creation component 316 for each particular language. Once configured, dictionary creation component 316 may automatically generate the symbols of a word for system dictionary 315 based on only the normal written representation of the word. Accordingly, users that are not trained speech experts can easily update system dictionary 315 .
  • the software may more generally be implemented as any type of logic.
  • This logic may include hardware, such as an application specific integrated circuit a field programmable gate array, software, or a combination of hardware and software.

Abstract

A dictionary creation component [316] converts the normal orthographic written representation of a word into a sequence of symbols that relate to the pronunciation of the word. The symbols may be used to train conventional models for speech recognition. The symbols are not phonemes and do not need to be defined by a speech expert. The symbols are created automatically by the dictionary creation component based on the written representation of the word.

Description

    RELATED APPLICATIONS
  • This application claims priority under 35 U.S.C. § 119 based on U.S. Provisional Application No. 60/419,214, filed Oct. 17, 2002, the disclosure of which is incorporated herein by reference.[0001]
  • GOVERNMENT CONTRACT
  • [0002] The U.S. Government has a paid-up license in this invention and the right in limited circumstances to require the patent owner to license others on reasonable terms as provided for by the terms of Contract No. N66001-00-C-8008 awarded by DARPA.
  • BACKGROUND OF THE INVENTION
  • A. Field of the Invention [0003]
  • The present invention relates generally to speech recognition and, more particularly, to the creation of system dictionaries for speech recognition systems. [0004]
  • B. Description of Related Art [0005]
  • Speech has not traditionally been valued as an archival information source. As effective as the spoken word is for communicating, archiving spoken segments in a useful and easily retrievable manner has long been a difficult proposition. Although the act of recording audio is not difficult, automatically transcribing and indexing speech in an intelligent and useful manner can be difficult. [0006]
  • Automatic transcription systems are generally based on language and acoustic models. The models are trained on a speech signal and on a corresponding signal based on a transcription of the speech. The model will “learn” how the speech signal corresponds to the transcription. Conventional models are frequently implemented based on Hidden Markov Models (HMMs). [0007]
  • FIG. 1 is a diagram illustrating a conventional speech recognition system. A [0008] content transcription component 102 receives an input audio stream. The content transcription component 102 converts speech in the input audio stream into text based on language and acoustic model(s) 101. The model(s) 101 are pre-trained based on a training audio stream that is expected to be similar to the run-time version of the input audio stream.
  • FIG. 2 is a diagram illustrating training of [0009] models 101 in additional detail. When training, models 101 receive the input audio stream 210 and a corresponding transcription 211 of the input audio stream. The transcription may have been meticulously generated by a human based on the input audio stream 210. Transcription 211 may be converted into a stream of phonemes 213 by system dictionary 212. System dictionary 212 includes information regarding the relationship between the written orthographic representation of a word and the phonemes that correspond to the word. A phoneme is generally defined as the smallest acoustic event that distinguishes one word from another.
  • During training, [0010] models 101 learn associations between the audio stream 210 and the phoneme stream 213. During run-time operation, models 101 may then generate phonemes for run-time audio stream 110, including boundary indications between phonemes that correspond to different words. Content transcription component 102 may use a phoneme dictionary to convert the generated phonemes into a conventional written transcription. In this manner, the run-time transcription is generated.
  • One disadvantage of the speech recognition system described above is that the system requires a phoneme-based [0011] system dictionary 212 to train models 101. When a user of the system wishes to add new words to the system, the user must update system dictionary 212 to include the new words and the phonemes corresponding to the new words. Generating the correct phonemes for any given word, however, is not a trivial task. In fact, this job is generally performed by a person with specialized training in this area (i.e., a speech expert). This can be a significant problem for speech recognition systems that are deployed in the field. If the user of the system is not a speech expert, adding words to the system can be a difficult proposition.
  • Requiring a phoneme dictionary can also make it difficult to extend the speech recognition system to additional languages. In particular, for each new language, speech expert(s) must undertake the work-intensive task of generating a new phoneme dictionary for the language. [0012]
  • Accordingly, it would be desirable to simplify the operation of speech recognition systems such that the systems are not dependent on manually created phoneme dictionaries. [0013]
  • SUMMARY OF THE INVENTION
  • Systems and methods consistent with the present invention include speech recognition systems that use a system dictionary that discards the linguistic origin of phonemes and instead uses a pronunciation model based on the normal orthographic written form of the word. Entries in the system dictionary, consistent with the present invention, may be created in an automated manner. [0014]
  • One aspect consistent with the invention is directed to a method for specifying a pronunciation of a word. The method includes receiving a written version of the word defined by a series of characters and separating the written version of the word into the series of characters. The method further includes generating symbols that define a pronunciation of the word based solely on the series of characters. [0015]
  • A second aspect consistent with the invention is directed to a speech recognition system. The system includes speech recognition models that convert audio containing speech into a transcription of the speech. A system dictionary trains the speech recognition models by providing symbols that define pronunciations of words to the speech recognition models. A dictionary creation component generates the symbols for the system dictionary, where the symbols are based on written characters of the words. [0016]
  • A third aspect consistent with the invention is directed to a method. The method includes configuring a dictionary creation component to generate symbols that represent pronunciations of words in a target language. The symbols are generated based solely on written representations of the words and configuration is performed based on the target language. The method also includes providing the dictionary creation component with written words and receiving the symbols from the dictionary creation component.[0017]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate the invention and, together with the description, explain the invention. In the drawings, [0018]
  • FIG. 1 is a diagram illustrating a conventional speech recognition system; [0019]
  • FIG. 2 is a diagram illustrating, in additional detail, training of models in the speech recognition system of FIG. 1; [0020]
  • FIG. 3 is a diagram illustrating an exemplary system in which concepts consistent with the invention may be implemented; [0021]
  • FIG. 4 is a diagram illustrating training of speech recognition models consistent with the present invention; [0022]
  • FIG. 5 is a flow chart illustrating operation of a dictionary creation component consistent with an aspect of the invention; [0023]
  • FIG. 6 is a flow chart illustrating operation of a dictionary creation component consistent with another aspect of the invention; and [0024]
  • FIG. 7 is a flow chart illustrating operation of a dictionary creation component consistent with yet another aspect of the invention.[0025]
  • DETAILED DESCRIPTION
  • The following detailed description of the invention refers to the accompanying drawings. The same reference numbers may be used in different drawings to identify the same or similar elements. Also, the following detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims and equivalents of the claim limitations. [0026]
  • Systems and methods consistent with the present invention create system dictionary entries that automatically define word pronunciations. The specification of the pronunciation of a word is based on the normal orthographic written version of the word. Thus, based on only the written version of a word, systems and methods consistent with the present invention specify a pronunciation for the word. These pronunciations can be effectively used by speech recognition systems. [0027]
  • SYSTEM OVERVIEW
  • Speech recognition, as described herein, may be performed by one or more processing devices or networks of processing devices. FIG. 3 is a diagram illustrating an [0028] exemplary system 300 in which concepts consistent with the invention may be implemented. System 300 includes a computing device 301 that has a computer-readable medium 309, such as random access memory, coupled to a processor 308. Computing device 301 may also include a number of additional external or internal devices. An external input device 320 and an external output device 321 are shown in FIG. 3. The input devices 320 may include, without limitation, a mouse, a CD-ROM, or a keyboard. The output devices may include, without limitation, a display or an audio output device, such as a speaker.
  • In general, [0029] computing device 301 may be any type of computing platform, and may be connected to a network 302. Computing device 301 is exemplary only. Concepts consistent with the present invention can be implemented on any computing device, whether or not connected to a network.
  • [0030] Processor 308 executes program instructions stored in memory 309. Processor 308 can be any of a number of well-known computer processors, such as processors from Intel Corporation, of Santa Clara, Calif.
  • [0031] Memory 309 may contain application programs and data. In particular, memory 309 may include a system dictionary 315 and a dictionary creation component 316. System dictionary 315 may be used in training models for speech recognition in a manner similar to system dictionary 212 (FIG. 2). Entries in system dictionary 315 may be generated automatically by dictionary creation component 316 based on the written version of the word. This is in contrast to a conventional system dictionary, such as system dictionary 212, in which each entry is defined as a series of phonemes derived from a human expert.
  • System Operation
  • FIG. 4 is a diagram illustrating training of speech recognition models [0032] 401 consistent with the present invention. Models 401 may be implemented in a manner similar to models 101. Models 401 may be trained based on an input audio stream 410 and an input symbol stream 413. Symbol stream 413 may include a phoneme-like representation of the words in audio stream 410 from system dictionary 315. System dictionary 315 defines the written version of words as a sequence of symbols that relate to the pronunciation of the words. The symbols may be created automatically by dictionary creation component 316. From the viewpoint of models 401, the symbols in system dictionary 315 are treated as if they were phonemes. In actuality, however, the symbols are not phonemes and do not need to be defined by a speech expert.
  • Models [0033] 401 may be based on HMMs. Models 401 may include acoustic models and language models. The acoustic models may describe the time-varying evolution of feature vectors for each symbol in symbol stream 413. The acoustic models may employ continuous HMMs to model each of the symbols in various phonetic contexts.
  • The language models may include n-gram language models, where the probability of each word is a function of the previous word (for a bi-gram language model) and the previous two words (for a tri-gram language model). Typically, the higher the order of the language model, the higher the recognition accuracy at the cost of slower recognition speeds. [0034]
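The n-gram probability described above can be illustrated with a minimal maximum-likelihood bi-gram estimate. This is a sketch only; a deployed recognizer would add smoothing for unseen word pairs, which the text does not detail, and the toy training sentences are assumptions for illustration.

```python
from collections import Counter

def train_bigram_lm(sentences):
    """Estimate P(word | previous word) by maximum likelihood,
    the bi-gram case of the n-gram language model described above."""
    unigrams = Counter()  # counts of context words
    bigrams = Counter()   # counts of (previous word, word) pairs
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens[:-1])
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    return lambda prev, word: (
        bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    )

prob = train_bigram_lm(["the dog barks", "the dog sleeps", "a dog barks"])
print(prob("the", "dog"))    # "the" is followed by "dog" in 2 of 2 cases -> 1.0
print(prob("dog", "barks"))  # "dog" is followed by "barks" in 2 of 3 cases
```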
  • Dictionary Creation Component
  • FIG. 5 is a flow chart illustrating operation of [0035] dictionary creation component 316 consistent with an aspect of the invention. The acts shown in FIG. 5 are particularly appropriate for languages in which pronunciations are “regular” in the sense that each written character tends to correspond to a sound. Arabic, Italian, and Spanish are examples of regular languages.
  • To begin, [0036] dictionary creation component 316 receives the written version of the words that are to be entered into system dictionary 315 (Act 501). The written version of the words may, for example, be manually entered by a user or the words may be obtained through an automated process. The automated process may include scanning documents on a network, such as a web-crawling program that scans documents on the Internet.
  • Symbols for [0037] system dictionary 315 are based directly on the written characters. Thus, in this implementation, dictionary creation component 316 separates the received word into its constituent characters and writes a corresponding entry to system dictionary 315 (Acts 502 and 503). For example, the Spanish word “ducha” (shower) would be processed by system dictionary creation component 316 as five sequential symbols, such as the symbols D-U-C-H-A. Similarly, the Spanish word “esponja” would correspond to seven sequential symbols, such as the symbols E-S-P-O-N-J-A.
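The per-character conversion of Acts 502 and 503 can be sketched as follows. Upper-cased characters are used as symbol names purely for illustration; the patent does not prescribe a naming scheme.

```python
def make_dictionary_entries(words):
    """Split each written word into one pronunciation symbol per
    character, as done for orthographically regular languages."""
    return {word: [ch.upper() for ch in word] for word in words}

entries = make_dictionary_entries(["ducha", "esponja"])
print(entries["ducha"])    # ['D', 'U', 'C', 'H', 'A']
print(entries["esponja"])  # ['E', 'S', 'P', 'O', 'N', 'J', 'A']
```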
  • FIG. 6 is a flow chart illustrating operation of [0038] dictionary creation component 316 consistent with another aspect of the invention. Some languages, such as English, are not regular in the sense that the written characters, depending on the context of the character within its surrounding characters, may correspond to more than one sound.
  • [0039] Dictionary creation component 316 begins by receiving the written version of the words that are to be entered into system dictionary 315 (Act 601). This act is identical to Act 501 of FIG. 5.
  • Symbols for [0040] system dictionary 315 are based on the written characters or on groupings of the written characters. Dictionary creation component 316 segments the input word into symbols that may represent a single written character or a grouping of characters (Act 602). These symbols are then entered into system dictionary 315 (Act 603).
  • As an example of a character grouping, consider the English word “wrought.” This word may be processed by [0041] dictionary creation component 316 as five sequential symbols, such as the symbols W-R-O-U-GHT. The characters “W”, “R”, “O”, and “U” each correspond to an individual symbol, while the three characters “GHT” together correspond to a single symbol. As another example, consider the English word “tying.” This word may be segmented into the three symbols T-Y-ING, where the three characters “ING” are considered to be a single symbol.
  • Which character groupings are treated as a single symbol may be determined through a statistical analysis of the written words of the language. In one implementation, the statistical analysis includes examining character groupings of two characters (pairs) and three characters (triples) within a standard dictionary. The most frequently occurring two- and three-character groupings within the dictionary are determined to correspond to single symbols. The frequency threshold at which a grouping is considered a “most frequently occurring” grouping may be determined manually by a speech expert based on the observed effectiveness of models [0042] 401 when trained using various thresholds.
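The grouping analysis and segmentation described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the patented method: the threshold, the greedy longest-match strategy, and the example groupings are all hypothetical choices.

```python
from collections import Counter

def frequent_groupings(words, threshold):
    """Count all 2- and 3-character groupings across a word list and
    keep those occurring at least `threshold` times (the manually
    tuned cutoff the text describes)."""
    counts = Counter()
    for w in words:
        for n in (2, 3):
            for i in range(len(w) - n + 1):
                counts[w[i:i + n]] += 1
    return {g for g, c in counts.items() if c >= threshold}

def segment(word, groupings):
    """Greedily segment a word into symbols, preferring the longest
    known grouping at each position, else a single character."""
    symbols, i = [], 0
    while i < len(word):
        for n in (3, 2, 1):
            chunk = word[i:i + n]
            if n == 1 or chunk in groupings:
                symbols.append(chunk.upper())
                i += n
                break
    return symbols

groups = {"ght", "ing"}  # e.g. groupings learned from a standard dictionary
print(segment("wrought", groups))  # ['W', 'R', 'O', 'U', 'GHT']
print(segment("tying", groups))    # ['T', 'Y', 'ING']
```

A production system would tune the threshold against recognition accuracy, as the text notes, rather than fixing the groupings by hand.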
  • FIG. 7 is a flow chart illustrating operation of [0043] dictionary creation component 316 consistent with yet another aspect of the invention. As with the operation of dictionary creation component 316 pursuant to FIGS. 5 and 6, the method of FIG. 7 begins when the written version of a word is input to dictionary creation component 316 (Act 701). Dictionary creation component 316 then determines to which of a number of predetermined word classes the word belongs (Act 702). The word classes may be predefined by a speech expert or may be predefined based on a statistical analysis of the lexicon. For example, words whose origins derive from Old English may be classified in an “Old English” classification. As another example, words with a certain suffix or prefix may be placed into another classification.
  • [0044] Dictionary creation component 316 converts each word into a series of pronunciation symbols based on the word classification. Each classification may be assigned to one of a number of conversion methods. For example, as shown in FIG. 7, depending on the classification, the word may be converted into symbols in which each symbol directly corresponds to a character of the word (Acts 703 and 704, identical to Acts 502 and 503). Alternatively, depending on the classification, dictionary creation component 316 may segment the input word into symbols that may represent a single written character or a grouping of characters (Acts 705 and 706, identical to Acts 602 and 603).
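The class-based dispatch of FIG. 7 can be sketched as below. The classification rule and grouping set are hypothetical stand-ins for the expert-defined or statistically derived classes the text describes:

```python
GROUPINGS = {"ght", "ing"}  # hypothetical learned character groupings

def classify(word):
    """Hypothetical rule: words containing a known grouping are routed
    to the grouping-based converter; everything else is per-character."""
    return "grouped" if any(g in word for g in GROUPINGS) else "per-char"

def to_symbols(word):
    """Dispatch to a conversion method based on the word's class."""
    if classify(word) == "per-char":
        # per-character conversion (the Acts 703/704 path)
        return [ch.upper() for ch in word]
    # grouping-based conversion (the Acts 705/706 path):
    # greedy longest-match segmentation over known groupings
    symbols, i = [], 0
    while i < len(word):
        for n in (3, 2, 1):
            if n == 1 or word[i:i + n] in GROUPINGS:
                symbols.append(word[i:i + n].upper())
                i += n
                break
    return symbols

print(to_symbols("ducha"))    # ['D', 'U', 'C', 'H', 'A']
print(to_symbols("wrought"))  # ['W', 'R', 'O', 'U', 'GHT']
```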
  • Conclusion
  • As described above, [0045] dictionary creation component 316 converts the normal orthographic written representation of a word into a sequence of symbols that relate to the pronunciation of the word. The symbols may be used to train conventional models for speech recognition. Depending on the language, dictionary creation component 316 may operate according to a number of conversion techniques, such as those shown in FIGS. 5-7. A speech expert may initially configure dictionary creation component 316 for each particular language. Once configured, dictionary creation component 316 may automatically generate the symbols of a word for system dictionary 315 based on only the normal written representation of the word. Accordingly, users that are not trained speech experts can easily update system dictionary 315.
  • The foregoing description of preferred embodiments of the invention provides illustration and description, but is not intended to be exhaustive or to limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practice of the invention. For example, while series of acts have been presented with respect to FIGS. 5-7, the order of the acts may be different in other implementations consistent with the present invention. Additionally, non-dependent acts may be implemented in parallel. [0046]
  • Certain portions of the invention have been described as software that performs one or more functions. The software may more generally be implemented as any type of logic. This logic may include hardware, such as an application-specific integrated circuit or a field-programmable gate array; software; or a combination of hardware and software. [0047]
  • No element, act, or instruction used in the description of the present application should be construed as critical or essential to the invention unless explicitly described as such. Also, as used herein, the article “a” is intended to include one or more items. Where only one item is intended, the term “one” or similar language is used. [0048]
  • The scope of the invention is defined by the claims and their equivalents. [0049]

Claims (30)

What is claimed:
1. A method for specifying a pronunciation of a word comprising:
receiving a written version of the word defined by a series of characters;
separating the written version of the word into the series of characters; and
generating symbols that define a pronunciation of the word based solely on the series of characters.
2. The method of claim 1, wherein receiving a written version of the word includes:
receiving the written version of the word from a user.
3. The method of claim 1, wherein receiving a written version of the word includes:
receiving the written version of the word from a program that automatically scans a network.
4. The method of claim 1, wherein the generated symbols have a one-to-one correspondence with the series of characters.
5. The method of claim 1, wherein the generated symbols correspond to predetermined character groupings from the series of characters.
6. The method of claim 5, wherein the predetermined character groupings are determined based on a statistical analysis of a language.
7. The method of claim 6, wherein the statistical analysis is based on frequency of occurrence of the words in the language.
8. The method of claim 1, further comprising:
classifying the word into one of a predetermined plurality of classifications; and
generating the symbols based on the classification of the word.
9. The method of claim 8, wherein the classifications are based on word affixes.
10. A speech recognition system comprising:
speech recognition models configured to convert audio containing speech into a transcription of the speech;
a system dictionary used to train the speech recognition models by providing symbols that define pronunciations of words to the speech recognition models; and
a dictionary creation component configured to generate the symbols for the system dictionary, the symbols being based on written characters of the words.
11. The system of claim 10, wherein the dictionary creation component receives the words from a program that automatically scans a network for the words.
12. The system of claim 10, wherein the generated symbols have a one-to-one correspondence with a sequence of the written characters of the words.
13. The system of claim 10, wherein the generated symbols correspond to predetermined character groupings in a sequence of the written characters of the words.
14. The system of claim 13, wherein the predetermined character groupings are determined based on a statistical analysis of a language.
15. The system of claim 14, wherein the statistical analysis is based on frequency of occurrence of the words in the language.
16. The system of claim 10, wherein the dictionary creation component classifies each of the words into one of a predetermined plurality of classifications and generates the symbols based on the classifications.
17. A method comprising:
configuring a dictionary creation component to generate symbols that represent pronunciations of words in a target language, the symbols being generated based solely on written representations of the words and the configuring being performed based on the target language;
providing the dictionary creation component with written words; and
receiving the symbols that represent pronunciations of the written words from the dictionary creation component.
18. The method of claim 17, wherein the generated symbols have a one-to-one correspondence with a series of characters that define the written representations of the words.
19. The method of claim 17, wherein the generated symbols correspond to predetermined character groupings from a series of characters that define the written representations of the words.
20. The method of claim 19, wherein the predetermined character groupings are determined based on a statistical analysis of the target language.
21. The method of claim 20, wherein the statistical analysis is based on frequency of occurrence of the words in the target language.
22. The method of claim 17, further comprising:
classifying the words into one of a predetermined plurality of classifications; and
generating the symbols based on the classifications of the words.
23. The method of claim 22, wherein the classifications are based on word affixes.
24. A device comprising:
means for receiving a written version of a word defined by a series of characters;
means for separating the written version of the word into the series of characters; and
means for generating symbols that define a pronunciation of the word based on the series of characters.
25. The device of claim 24, wherein the generated symbols have a one-to-one correspondence with the series of characters.
26. The device of claim 24, wherein the generated symbols correspond to predetermined character groupings from the series of characters.
27. The device of claim 26, wherein the predetermined character groupings are determined based on a statistical analysis of a language.
28. The device of claim 27, wherein the statistical analysis is based on frequency of occurrence of the words in the language.
29. The device of claim 24, further comprising:
means for classifying the word into one of a predetermined plurality of classifications; and
means for generating the symbols based on the classification of the word.
30. A computer-readable medium containing programming instructions for execution by a processor, the computer-readable medium comprising:
instructions for receiving a written version of a word defined by a series of characters;
instructions for separating the written version of the word into the series of characters; and
instructions for generating symbols that define a pronunciation of the word based solely on the series of characters.
US10/685,566 2002-10-17 2003-10-16 Pronunciation symbols based on the orthographic lexicon of a language Abandoned US20040176946A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/685,566 US20040176946A1 (en) 2002-10-17 2003-10-16 Pronunciation symbols based on the orthographic lexicon of a language

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US41921402P 2002-10-17 2002-10-17
US10/685,566 US20040176946A1 (en) 2002-10-17 2003-10-16 Pronunciation symbols based on the orthographic lexicon of a language

Publications (1)

Publication Number Publication Date
US20040176946A1 true US20040176946A1 (en) 2004-09-09

Family

ID=32110223

Family Applications (9)

Application Number Title Priority Date Filing Date
US10/685,479 Abandoned US20040163034A1 (en) 2002-10-17 2003-10-16 Systems and methods for labeling clusters of documents
US10/685,478 Abandoned US20040083104A1 (en) 2002-10-17 2003-10-16 Systems and methods for providing interactive speaker identification training
US10/685,445 Abandoned US20040138894A1 (en) 2002-10-17 2003-10-16 Speech transcription tool for efficient speech transcription
US10/685,410 Expired - Fee Related US7389229B2 (en) 2002-10-17 2003-10-16 Unified clustering tree
US10/685,585 Active 2026-01-10 US7424427B2 (en) 2002-10-17 2003-10-16 Systems and methods for classifying audio into broad phoneme classes
US10/685,586 Abandoned US20040204939A1 (en) 2002-10-17 2003-10-16 Systems and methods for speaker change detection
US10/685,403 Abandoned US20040083090A1 (en) 2002-10-17 2003-10-16 Manager for integrating language technology components
US10/685,565 Active - Reinstated 2026-04-05 US7292977B2 (en) 2002-10-17 2003-10-16 Systems and methods for providing online fast speaker adaptation in speech recognition
US10/685,566 Abandoned US20040176946A1 (en) 2002-10-17 2003-10-16 Pronunciation symbols based on the orthographic lexicon of a language



Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7272558B1 (en) 2006-12-01 2007-09-18 Coveo Solutions Inc. Speech recognition training method for audio and video file indexing on a search engine
US20100250240A1 (en) * 2009-03-30 2010-09-30 Adacel Systems, Inc. System and method for training an acoustic model with reduced feature space variation

US11735189B2 (en) 2018-01-23 2023-08-22 Cirrus Logic, Inc. Speaker identification
US11475899B2 (en) 2018-01-23 2022-10-18 Cirrus Logic, Inc. Speaker identification
CN109300486B (en) * 2018-07-30 2021-06-25 四川大学 PICGTFs and SSMC enhanced cleft palate speech pharynx fricative automatic identification method
US10692490B2 (en) 2018-07-31 2020-06-23 Cirrus Logic, Inc. Detection of replay attack
JP7007617B2 (en) * 2018-08-15 2022-01-24 日本電信電話株式会社 End-of-speech judgment device, end-of-speech judgment method and program
US10915614B2 (en) 2018-08-31 2021-02-09 Cirrus Logic, Inc. Biometric authentication
US11037574B2 (en) 2018-09-05 2021-06-15 Cirrus Logic, Inc. Speaker recognition and speaker change detection
US11116451B2 (en) 2019-07-03 2021-09-14 Bardy Diagnostics, Inc. Subcutaneous P-wave centric insertable cardiac monitor with energy harvesting capabilities
US11096579B2 (en) 2019-07-03 2021-08-24 Bardy Diagnostics, Inc. System and method for remote ECG data streaming in real-time
US11696681B2 (en) 2019-07-03 2023-07-11 Bardy Diagnostics, Inc. Configurable hardware platform for physiological monitoring of a living body
US11410642B2 (en) * 2019-08-16 2022-08-09 Soundhound, Inc. Method and system using phoneme embedding
US11354920B2 (en) 2019-10-12 2022-06-07 International Business Machines Corporation Updating and implementing a document from an audio proceeding
US11404049B2 (en) * 2019-12-09 2022-08-02 Microsoft Technology Licensing, Llc Interactive augmentation and integration of real-time speech-to-text
US11862168B1 (en) * 2020-03-30 2024-01-02 Amazon Technologies, Inc. Speaker disambiguation and transcription from multiple audio feeds
US11373657B2 (en) * 2020-05-01 2022-06-28 Raytheon Applied Signal Technology, Inc. System and method for speaker identification in audio data
US11875791B2 (en) * 2020-05-21 2024-01-16 Orcam Technologies Ltd. Systems and methods for emphasizing a user's name
US11315545B2 (en) 2020-07-09 2022-04-26 Raytheon Applied Signal Technology, Inc. System and method for language identification in audio data
CN113284508B (en) * 2021-07-21 2021-11-09 中国科学院自动化研究所 Hierarchical differentiation based generated audio detection system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6024571A (en) * 1996-04-25 2000-02-15 Renegar; Janet Elaine Foreign language communication system/device and learning aid
US6714911B2 (en) * 2001-01-25 2004-03-30 Harcourt Assessment, Inc. Speech transcription and analysis system and method
US6999918B2 (en) * 2002-09-20 2006-02-14 Motorola, Inc. Method and apparatus to facilitate correlating symbols to sounds

Family Cites Families (123)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPQ131399A0 (en) * 1999-06-30 1999-07-22 Silverbrook Research Pty Ltd A method and apparatus (NPAGE02)
US4908866A (en) * 1985-02-04 1990-03-13 Eric Goldwasser Speech transcribing system
JPH0693221B2 (en) 1985-06-12 1994-11-16 株式会社日立製作所 Voice input device
US4879648A (en) * 1986-09-19 1989-11-07 Nancy P. Cochran Search system which continuously displays search terms during scrolling and selections of individually displayed data sets
US4908868A (en) * 1989-02-21 1990-03-13 Mctaggart James E Phase polarity test instrument and method
US6978277B2 (en) * 1989-10-26 2005-12-20 Encyclopaedia Britannica, Inc. Multimedia search system
US5418716A (en) * 1990-07-26 1995-05-23 Nec Corporation System for recognizing sentence patterns and a system for recognizing sentence patterns and grammatical cases
US5404295A (en) * 1990-08-16 1995-04-04 Katz; Boris Method and apparatus for utilizing annotations to facilitate computer retrieval of database material
US5317732A (en) * 1991-04-26 1994-05-31 Commodore Electronics Limited System for relocating a multimedia presentation on a different platform by extracting a resource map in order to remap and relocate resources
US5875108A (en) * 1991-12-23 1999-02-23 Hoffberg; Steven M. Ergonomic man-machine interface incorporating adaptive pattern recognition based control system
US5544257A (en) * 1992-01-08 1996-08-06 International Business Machines Corporation Continuous parameter hidden Markov model approach to automatic handwriting recognition
JP2524472B2 (en) * 1992-09-21 1996-08-14 インターナショナル・ビジネス・マシーンズ・コーポレイション How to train a telephone line based speech recognition system
CA2108536C (en) * 1992-11-24 2000-04-04 Oscar Ernesto Agazzi Text recognition using two-dimensional stochastic models
US5689641A (en) * 1993-10-01 1997-11-18 Vicor, Inc. Multimedia collaboration system arrangement for routing compressed AV signal through a participant site without decompressing the AV signal
JP3185505B2 (en) 1993-12-24 2001-07-11 株式会社日立製作所 Meeting record creation support device
GB2285895A (en) 1994-01-19 1995-07-26 Ibm Audio conferencing system which generates a set of minutes
JPH07319917A (en) * 1994-05-24 1995-12-08 Fuji Xerox Co Ltd Document data base managing device and document data base system
US5613032A (en) * 1994-09-02 1997-03-18 Bell Communications Research, Inc. System and method for recording, playing back and searching multimedia events wherein video, audio and text can be searched and retrieved
WO1996010799A1 (en) 1994-09-30 1996-04-11 Motorola Inc. Method and system for extracting features from handwritten text
US5831615A (en) * 1994-09-30 1998-11-03 Intel Corporation Method and apparatus for redrawing transparent windows
US5777614A (en) * 1994-10-14 1998-07-07 Hitachi, Ltd. Editing support system including an interactive interface
US5835667A (en) * 1994-10-14 1998-11-10 Carnegie Mellon University Method and apparatus for creating a searchable digital video library and a system and method of using such a library
US5614940A (en) * 1994-10-21 1997-03-25 Intel Corporation Method and apparatus for providing broadcast information with indexing
US6029195A (en) * 1994-11-29 2000-02-22 Herz; Frederick S. M. System for customized electronic identification of desirable objects
US5729656A (en) 1994-11-30 1998-03-17 International Business Machines Corporation Reduction of search space in speech recognition using phone boundaries and phone ranking
US5638487A (en) * 1994-12-30 1997-06-10 Purespeech, Inc. Automatic speech recognition
US5715367A (en) * 1995-01-23 1998-02-03 Dragon Systems, Inc. Apparatuses and methods for developing and using models for speech recognition
US5684924A (en) 1995-05-19 1997-11-04 Kurzweil Applied Intelligence, Inc. User adaptable speech recognition system
US5559875A (en) * 1995-07-31 1996-09-24 Latitude Communications Method and apparatus for recording and retrieval of audio conferences
US6151598A (en) * 1995-08-14 2000-11-21 Shaw; Venson M. Digital dictionary with a communication system for the creating, updating, editing, storing, maintaining, referencing, and managing the digital dictionary
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
US6026388A (en) * 1995-08-16 2000-02-15 Textwise, Llc User interface and other enhancements for natural language information retrieval system and method
US6006221A (en) 1995-08-16 1999-12-21 Syracuse University Multilingual document retrieval system and method using semantic vector matching
US20020002562A1 (en) 1995-11-03 2002-01-03 Thomas P. Moran Computer controlled display system using a graphical replay device to control playback of temporal data representing collaborative activities
US5960447A (en) * 1995-11-13 1999-09-28 Holt; Douglas Word tagging and editing system for speech recognition
JPH09269931A (en) * 1996-01-30 1997-10-14 Canon Inc Cooperative work environment constructing system, its method and medium
US6067517A (en) * 1996-02-02 2000-05-23 International Business Machines Corporation Transcription of speech data with segments from acoustically dissimilar environments
DE69712277T2 (en) * 1996-02-27 2002-12-19 Koninkl Philips Electronics Nv METHOD AND DEVICE FOR AUTOMATIC VOICE SEGMENTATION IN PHONEMIC UNITS
US5862259A (en) 1996-03-27 1999-01-19 Caere Corporation Pattern recognition employing arbitrary segmentation and compound probabilistic evaluation
US5778187A (en) * 1996-05-09 1998-07-07 Netcast Communications Corp. Multicasting method and apparatus
US5996022A (en) * 1996-06-03 1999-11-30 Webtv Networks, Inc. Transcoding data in a proxy computer prior to transmitting the audio data to a client
US5806032A (en) * 1996-06-14 1998-09-08 Lucent Technologies Inc. Compilation of weighted finite-state transducers from decision trees
US6169789B1 (en) * 1996-12-16 2001-01-02 Sanjay K. Rao Intelligent keyboard system
US5897614A (en) * 1996-12-20 1999-04-27 International Business Machines Corporation Method and apparatus for sibilant classification in a speech recognition system
US6732183B1 (en) * 1996-12-31 2004-05-04 Broadware Technologies, Inc. Video and audio streaming for multiple users
US6185531B1 (en) * 1997-01-09 2001-02-06 Gte Internetworking Incorporated Topic indexing method
US6088669A (en) 1997-01-28 2000-07-11 International Business Machines, Corporation Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling
JP2991287B2 (en) * 1997-01-28 1999-12-20 日本電気株式会社 Suppression standard pattern selection type speaker recognition device
US6029124A (en) * 1997-02-21 2000-02-22 Dragon Systems, Inc. Sequential, nonparametric speech recognition and speaker identification
US6463444B1 (en) * 1997-08-14 2002-10-08 Virage, Inc. Video cataloger system with extensibility
US6567980B1 (en) * 1997-08-14 2003-05-20 Virage, Inc. Video cataloger system with hyperlinked output
US6360234B2 (en) * 1997-08-14 2002-03-19 Virage, Inc. Video cataloger system with synchronized encoders
US6052657A (en) * 1997-09-09 2000-04-18 Dragon Systems, Inc. Text segmentation and identification of topic using language models
US6317716B1 (en) 1997-09-19 2001-11-13 Massachusetts Institute Of Technology Automatic cueing of speech
JP2001511991A (en) 1997-10-01 2001-08-14 エイ・ティ・アンド・ティ・コーポレーション Method and apparatus for storing and retrieving label interval data for multimedia records
US6961954B1 (en) * 1997-10-27 2005-11-01 The Mitre Corporation Automated segmentation, information extraction, summarization, and presentation of broadcast news
US6064963A (en) * 1997-12-17 2000-05-16 Opus Telecom, L.L.C. Automatic key word or phrase speech recognition for the corrections industry
JP4183311B2 (en) * 1997-12-22 2008-11-19 株式会社リコー Document annotation method, annotation device, and recording medium
US5970473A (en) * 1997-12-31 1999-10-19 At&T Corp. Video communication device providing in-home catalog services
SE511584C2 (en) * 1998-01-15 1999-10-25 Ericsson Telefon Ab L M information Routing
US6327343B1 (en) 1998-01-16 2001-12-04 International Business Machines Corporation System and methods for automatic call and data transfer processing
JP3181548B2 (en) * 1998-02-03 2001-07-03 富士通株式会社 Information retrieval apparatus and information retrieval method
US6073096A (en) * 1998-02-04 2000-06-06 International Business Machines Corporation Speaker adaptation system and method based on class-specific pre-clustering training speakers
US7257528B1 (en) * 1998-02-13 2007-08-14 Zi Corporation Of Canada, Inc. Method and apparatus for Chinese character text input
US6381640B1 (en) * 1998-09-11 2002-04-30 Genesys Telecommunications Laboratories, Inc. Method and apparatus for automated personalization and presentation of workload assignments to agents within a multimedia communication center
US6112172A (en) * 1998-03-31 2000-08-29 Dragon Systems, Inc. Interactive searching
CN1159662C (en) * 1998-05-13 2004-07-28 国际商业机器公司 Automatic punctuating for continuous speech recognition
US6076053A (en) * 1998-05-21 2000-06-13 Lucent Technologies Inc. Methods and apparatus for discriminative training and adaptation of pronunciation networks
US6243680B1 (en) * 1998-06-15 2001-06-05 Nortel Networks Limited Method and apparatus for obtaining a transcription of phrases through text and spoken utterances
US6067514A (en) * 1998-06-23 2000-05-23 International Business Machines Corporation Method for automatically punctuating a speech utterance in a continuous speech recognition system
US6246983B1 (en) * 1998-08-05 2001-06-12 Matsushita Electric Corporation Of America Text-to-speech e-mail reader with multi-modal reply processor
US6373985B1 (en) * 1998-08-12 2002-04-16 Lucent Technologies, Inc. E-mail signature block analysis
US6360237B1 (en) * 1998-10-05 2002-03-19 Lernout & Hauspie Speech Products N.V. Method and system for performing text edits during audio recording playback
US6161087A (en) * 1998-10-05 2000-12-12 Lernout & Hauspie Speech Products N.V. Speech-recognition-assisted selective suppression of silent and filled speech pauses during playback of an audio recording
US6347295B1 (en) * 1998-10-26 2002-02-12 Compaq Computer Corporation Computer method and apparatus for grapheme-to-phoneme rule-set-generation
US6332139B1 (en) * 1998-11-09 2001-12-18 Mega Chips Corporation Information communication system
JP3252282B2 (en) * 1998-12-17 2002-02-04 松下電器産業株式会社 Method and apparatus for searching scene
US6654735B1 (en) * 1999-01-08 2003-11-25 International Business Machines Corporation Outbound information analysis for generating user interest profiles and improving user productivity
US6253179B1 (en) * 1999-01-29 2001-06-26 International Business Machines Corporation Method and apparatus for multi-environment speaker verification
DE19912405A1 (en) * 1999-03-19 2000-09-21 Philips Corp Intellectual Pty Determination of a regression class tree structure for speech recognizers
CN1148965C (en) 1999-03-30 2004-05-05 提维股份有限公司 Data storage management and scheduling system
US6345252B1 (en) * 1999-04-09 2002-02-05 International Business Machines Corporation Methods and apparatus for retrieving audio information using content and speaker information
US6434520B1 (en) * 1999-04-16 2002-08-13 International Business Machines Corporation System and method for indexing and querying audio archives
US6711585B1 (en) * 1999-06-15 2004-03-23 Kanisa Inc. System and method for implementing a knowledge management system
US6219640B1 (en) * 1999-08-06 2001-04-17 International Business Machines Corporation Methods and apparatus for audio-visual speaker recognition and utterance verification
IES990800A2 (en) 1999-08-20 2000-09-06 Digitake Software Systems Ltd An audio processing system
JP3232289B2 (en) * 1999-08-30 2001-11-26 インターナショナル・ビジネス・マシーンズ・コーポレーション Symbol insertion device and method
US6480826B2 (en) * 1999-08-31 2002-11-12 Accenture Llp System and method for a telephonic emotion detection that provides operator feedback
US6711541B1 (en) * 1999-09-07 2004-03-23 Matsushita Electric Industrial Co., Ltd. Technique for developing discriminative sound units for speech recognition and allophone modeling
US6624826B1 (en) * 1999-09-28 2003-09-23 Ricoh Co., Ltd. Method and apparatus for generating visual representations for audio documents
US6571208B1 (en) * 1999-11-29 2003-05-27 Matsushita Electric Industrial Co., Ltd. Context-dependent acoustic models for medium and large vocabulary speech recognition with eigenvoice training
JP2003518266A (en) * 1999-12-20 2003-06-03 コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ Speech reproduction for text editing of speech recognition system
WO2001063597A1 (en) 2000-02-25 2001-08-30 Koninklijke Philips Electronics N.V. Speech recognition device with reference transformation means
US7197694B2 (en) * 2000-03-21 2007-03-27 Oki Electric Industry Co., Ltd. Image display system, image registration terminal device and image reading terminal device used in the image display system
US7120575B2 (en) * 2000-04-08 2006-10-10 International Business Machines Corporation Method and system for the automatic segmentation of an audio stream into semantic or syntactic units
CN1193605C (en) * 2000-04-21 2005-03-16 松下电器产业株式会社 Data reproduction apparatus
US6505153B1 (en) * 2000-05-22 2003-01-07 Compaq Information Technologies Group, L.P. Efficient method for producing off-line closed captions
US6748356B1 (en) * 2000-06-07 2004-06-08 International Business Machines Corporation Methods and apparatus for identifying unknown speakers using a hierarchical tree structure
JP2002008389A (en) * 2000-06-20 2002-01-11 Mitsubishi Electric Corp Semiconductor memory
US7047192B2 (en) * 2000-06-28 2006-05-16 Poirier Darrell A Simultaneous multi-user real-time speech recognition system
US6931376B2 (en) 2000-07-20 2005-08-16 Microsoft Corporation Speech-related event notification system
EP1176493A3 (en) 2000-07-28 2002-07-10 Jan Pathuel Method and system of securing data and systems
WO2002010981A2 (en) 2000-07-28 2002-02-07 Easyask, Inc. Distributed search system and method
WO2002019147A1 (en) * 2000-08-28 2002-03-07 Emotion, Inc. Method and apparatus for digital media management, retrieval, and collaboration
US6604110B1 (en) * 2000-08-31 2003-08-05 Ascential Software, Inc. Automated software code generation from a metadata-based repository
US6647383B1 (en) * 2000-09-01 2003-11-11 Lucent Technologies Inc. System and method for providing interactive dialogue and iterative search functions to find information
AU2000276394A1 (en) 2000-09-30 2002-04-15 Intel Corporation Method and system for generating and searching an optimal maximum likelihood decision tree for hidden markov model (hmm) based speech recognition
WO2002029614A1 (en) 2000-09-30 2002-04-11 Intel Corporation Method and system to scale down a decision tree-based hidden markov model (hmm) for speech recognition
US6934756B2 (en) * 2000-11-01 2005-08-23 International Business Machines Corporation Conversational networking via transport, coding and control conversational protocols
US20050060162A1 (en) * 2000-11-10 2005-03-17 Farhad Mohit Systems and methods for automatic identification and hyperlinking of words or other data items and for information retrieval using hyperlinked words or data items
US7221663B2 (en) 2001-12-31 2007-05-22 Polycom, Inc. Method and apparatus for wideband conferencing
SG98440A1 (en) * 2001-01-16 2003-09-19 Reuters Ltd Method and apparatus for a financial database structure
US20020133477A1 (en) * 2001-03-05 2002-09-19 Glenn Abel Method for profile-based notice and broadcast of multimedia content
ATE335195T1 (en) * 2001-05-10 2006-08-15 Koninkl Philips Electronics Nv BACKGROUND LEARNING OF SPEAKER VOICES
US6973428B2 (en) 2001-05-24 2005-12-06 International Business Machines Corporation System and method for searching, analyzing and displaying text transcripts of speech after imperfect speech recognition
US6778979B2 (en) * 2001-08-13 2004-08-17 Xerox Corporation System for automatically generating queries
US6748350B2 (en) * 2001-09-27 2004-06-08 Intel Corporation Method to compensate for stress between heat spreader and thermal interface material
US6708148B2 (en) * 2001-10-12 2004-03-16 Koninklijke Philips Electronics N.V. Correction device to mark parts of a recognized text
US20030093580A1 (en) * 2001-11-09 2003-05-15 Koninklijke Philips Electronics N.V. Method and system for information alerts
US7165024B2 (en) * 2002-02-22 2007-01-16 Nec Laboratories America, Inc. Inferring hierarchical descriptions of a set of documents
US7668816B2 (en) * 2002-06-11 2010-02-23 Microsoft Corporation Dynamically updated quick searches and strategies
US7131117B2 (en) 2002-09-04 2006-10-31 Sbc Properties, L.P. Method and system for automating the analysis of word frequencies
US7580838B2 (en) 2002-11-22 2009-08-25 Scansoft, Inc. Automatic insertion of non-verbalized punctuation

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7272558B1 (en) 2006-12-01 2007-09-18 Coveo Solutions Inc. Speech recognition training method for audio and video file indexing on a search engine
US20100250240A1 (en) * 2009-03-30 2010-09-30 Adacel Systems, Inc. System and method for training an acoustic model with reduced feature space variation
US8301446B2 (en) * 2009-03-30 2012-10-30 Adacel Systems, Inc. System and method for training an acoustic model with reduced feature space variation

Also Published As

Publication number Publication date
US20040083104A1 (en) 2004-04-29
US7424427B2 (en) 2008-09-09
US20040083090A1 (en) 2004-04-29
US7389229B2 (en) 2008-06-17
US20050038649A1 (en) 2005-02-17
US20040163034A1 (en) 2004-08-19
US20040204939A1 (en) 2004-10-14
US20040230432A1 (en) 2004-11-18
US20040172250A1 (en) 2004-09-02
US20040138894A1 (en) 2004-07-15
US7292977B2 (en) 2007-11-06

Similar Documents

Publication Publication Date Title
US20040176946A1 (en) Pronunciation symbols based on the orthographic lexicon of a language
US7421387B2 (en) Dynamic N-best algorithm to reduce recognition errors
US6910012B2 (en) Method and system for speech recognition using phonetically similar word alternatives
US6327566B1 (en) Method and apparatus for correcting misinterpreted voice commands in a speech recognition system
US6694296B1 (en) Method and apparatus for the recognition of spelled spoken words
US7139698B1 (en) System and method for generating morphemes
US20040024585A1 (en) Linguistic segmentation of speech
US7062436B1 (en) Word-specific acoustic models in a speech recognition system
US6553342B1 (en) Tone based speech recognition
WO2007097176A1 (en) Speech recognition dictionary making supporting system, speech recognition dictionary making supporting method, and speech recognition dictionary making supporting program
JP2002287787A (en) Disambiguation language model
Kirchhoff et al. Novel speech recognition models for Arabic
JP4499389B2 (en) Method and apparatus for generating decision tree questions for speech processing
US20050187767A1 (en) Dynamic N-best algorithm to reduce speech recognition errors
US7302381B2 (en) Specifying arbitrary words in rule-based grammars
US6963832B2 (en) Meaning token dictionary for automatic speech recognition
Szarvas et al. Finite-state transducer based modeling of morphosyntax with applications to Hungarian LVCSR
Toma et al. MaRePhoR—An open access machine-readable phonetic dictionary for Romanian
US7206738B2 (en) Hybrid baseform generation
Jeon et al. Automatic generation of Korean pronunciation variants by multistage applications of phonological rules.
Nair et al. Rule-based grapheme to phoneme converter for malayalam
US6772116B2 (en) Method of decoding telegraphic speech
KR20050101695A (en) A system for statistical speech recognition using recognition results, and method thereof
Sazhok et al. Punctuation Restoration for Ukrainian Broadcast Speech Recognition System based on Bidirectional Recurrent Neural Network and Word Embeddings.
Azim et al. Using Character-Level Sequence-to-Sequence Model for Word Level Text Generation to Enhance Arabic Speech Recognition

Legal Events

Date Code Title Description
AS Assignment

Owner name: BBNT SOLUTIONS LLC, TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BILLA, JAYADEV;KUBALA, FRANCIS;REEL/FRAME:014612/0832;SIGNING DATES FROM 20031003 TO 20031007

AS Assignment

Owner name: FLEET NATIONAL BANK, AS AGENT, MASSACHUSETTS

Free format text: PATENT & TRADEMARK SECURITY AGREEMENT;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:014624/0196

Effective date: 20040326

AS Assignment

Owner name: BBNT SOLUTIONS LLC, MASSACHUSETTS

Free format text: CORRECTIVE DOCUMENT PREVIOUSLY RECORDED AT REEL 014612 FRAME 0832. (ASSIGNMENT OF ASSIGNOR'S INTEREST);ASSIGNORS:BILLA, JAYADEV;KUBALA, FRANCIS G.;REEL/FRAME:015847/0061;SIGNING DATES FROM 20031003 TO 20031007

AS Assignment

Owner name: BBN TECHNOLOGIES CORP., MASSACHUSETTS

Free format text: MERGER;ASSIGNOR:BBNT SOLUTIONS LLC;REEL/FRAME:017274/0318

Effective date: 20060103

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: BBN TECHNOLOGIES CORP. (AS SUCCESSOR BY MERGER TO

Free format text: RELEASE OF SECURITY INTEREST;ASSIGNOR:BANK OF AMERICA, N.A. (SUCCESSOR BY MERGER TO FLEET NATIONAL BANK);REEL/FRAME:023427/0436

Effective date: 20091026