CA2119397C - Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation - Google Patents
Info
- Publication number
- CA2119397C
- Authority
- CA
- Canada
- Prior art keywords
- prosodic
- text
- salience
- major
- indicia
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Abstract
Improved automated synthesis of human audible speech from text is disclosed. Performance enhancement of the underlying text comprehensibility is obtained through prosodic treatment of the synthesized material, improved speaking rate treatment, and improved methods of spelling words or terms for the system user. Prosodic shaping of text sequences appropriate for the discourse in large groupings of text segments, with prosodic boundaries developed to indicate conceptual units within the text groupings, is implemented in a preferred embodiment.
Claims (22)
1. An automated system for synthesizing human audible speech from machine-readable representation of text wherein the system employs a synthesis device which has been designed for use with unrestricted text, said system including a prosody indicia generating means for automatically providing indicia of the text prosody to the synthesis device, said indicia being interpretable and executable by that device, and assigned on the basis of predetermined characteristics of restricted text, and wherein the prosody indicia are generated by identifying major prosodic groupings by utilizing major demarcation features to define the beginning and end of the major prosodic groupings.
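As a rough illustration of claim 1's first step, the sketch below treats the fields of a restricted name-and-address record as major prosodic groupings, with explicit begin/end boundary tokens standing in for the "major demarcation features." The field names, boundary tokens, and record shape are all illustrative assumptions, not the patent's actual implementation.

```python
# Hypothetical sketch: splitting a restricted-text listing record into
# major prosodic groupings. Field boundaries stand in for the claim's
# "major demarcation features"; all names here are illustrative.

def major_groupings(record: dict) -> list:
    """Treat each populated field of a name-and-address record as one
    major prosodic grouping, bracketed with begin/end boundary markers
    so a synthesizer could place boundary prosody (e.g., pitch reset,
    final lengthening) at each edge."""
    groupings = []
    for field in ("name", "address", "telephone"):
        value = record.get(field)
        if value:
            groupings.append(["<BEGIN>", value, "<END>"])
    return groupings

record = {"name": "J. Smith", "address": "22 Main St, Ottawa", "telephone": "555-0142"}
print(major_groupings(record))
```

Each grouping is kept as a separate unit so that later passes (subgrouping, salience assignment) can operate within it.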
2. The synthesis system of Claim 1 wherein the indicia are generated by prosody rules associated with predetermined discourse constraints particular to the context of the synthesis of the text.
3. The system of Claim 2 wherein the restricted text consists of name and address information.
4. The system of Claim 3 wherein name and address information is arranged into fields containing respectively names and addresses associated with a telephone number or numbers.
5. The system of Claim 1, 2, 3 or 4 wherein the prosody indicia are further generated by:
a) identifying prosodic subgroupings within the major prosodic groupings according to prosodic rules for analyzing the text for predetermined textual markers indicative of prosodically isolatible subgroupings not delineated by the major demarcations dividing the prosodic major groupings,
b) within the prosodic subgroupings, identifying prosodically separable subgroup components, and
c) generating prosody indicia which include salience signifiers utilizable by the synthesis device to vary the salience of segments of the synthesized speech such that
(i) the salience signifiers within the prosodic subgroupings are first generated in accordance with predetermined salience placement rules solely relating to the components themselves,
(ii) thereafter the first generated salience signifiers are modified to increase the salience at the start of the prosodic subgroup and further signify the salience at the end of the prosodic subgroup, and
(iii) the salience signifiers are subsequently further modified to further increase the salience of the beginning of the major prosodic grouping and further signify the salience of the end of the major prosodic grouping.
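The three ordered salience passes of claim 5(c) can be sketched as follows. The integer salience levels, the baseline rule, and the increments are illustrative assumptions; the claim only fixes the ordering (component-local rules first, then subgroup edges, then major-grouping edges).

```python
# Hypothetical three-pass salience assignment following the claim's order:
# (i) component-local rules, (ii) subgroup-edge adjustment, (iii) major-
# grouping-edge adjustment. Levels and increments are illustrative.

def assign_salience(major_grouping: list) -> list:
    """major_grouping is a list of prosodic subgroups, each a list of
    component tokens. Returns a parallel structure of integer salience
    levels (higher = more prominent)."""
    # Pass (i): each component gets a baseline salience from rules that
    # look only at the component itself (a constant placeholder here).
    levels = [[1 for _ in subgroup] for subgroup in major_grouping]
    # Pass (ii): raise salience at the start of each subgroup and
    # signify its end.
    for sub in levels:
        sub[0] += 1
        sub[-1] += 1
    # Pass (iii): further raise the beginning of the whole major
    # grouping and signify its end.
    levels[0][0] += 1
    levels[-1][-1] += 1
    return levels

print(assign_salience([["twenty", "two"], ["Main", "Street"]]))
```

Claims 10, 16, 17, and 21 note that these salience levels would in practice be realized as indicia of pitch.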
6. The system of Claim 5 wherein the subgroup components are identified by:
a) identifying textual indicators which mark divisions of text groupings around them,
b) utilizing the textual indicators to separate the text within the prosodic subgrouping into units of nominal text which do not include the said predetermined textual markers, and
c) within the units of nominal text, identifying other indicators of textual groupings that are not predetermined textual markers of divisions, identifying nouns, and identifying qualifiers of nouns.
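A minimal sketch of claim 6's component identification: split a subgroup on division markers, then label qualifiers and nouns inside each nominal unit. The marker set and qualifier list are small illustrative assumptions (a real system would use larger lexicons and part-of-speech analysis).

```python
# Hypothetical sketch of claim 6: division markers separate a subgroup
# into units of nominal text; within each unit, qualifiers of nouns are
# distinguished from the nouns themselves. Word lists are illustrative.

DIVISION_MARKERS = {",", "&", "and"}                 # textual indicators of divisions
QUALIFIERS = {"north", "south", "east", "west", "old", "new"}

def nominal_units(tokens: list) -> list:
    """Separate tokens into units of nominal text containing no
    division markers."""
    units, current = [], []
    for tok in tokens:
        if tok.lower() in DIVISION_MARKERS:
            if current:
                units.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        units.append(current)
    return units

def tag_unit(unit: list) -> list:
    """Within a nominal unit, label qualifiers; everything else is
    treated as a noun in this simplified sketch."""
    return [(tok, "QUAL" if tok.lower() in QUALIFIERS else "NOUN") for tok in unit]

units = nominal_units(["North", "Main", "Street", "and", "Elm", "Road"])
print([tag_unit(u) for u in units])
```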
7. In the speech synthesis system of Claims 1, 2, 3, 4, or 6, said system having means for allowing users to obtain repeats of text segments and having means for adjusting a rate of annunciation of the synthesized segments of text by:
a) changing the rate of annunciation of a text segment after a first number of successive repeats of that segment for a first user, and
b) decreasing the rate of annunciation of a further text segment for a subsequent number of successive repeats of that further text segment for the first user, and increasing the rate of annunciation if no repeats are requested by that user, and
c) adjusting the initial annunciation rate for subsequent users in response to the number of consecutive prior users for whom the rate of annunciation of text had been altered.
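The rate-adjustment policy of claim 7 can be sketched as a small stateful class: slow down after repeated replays within a user's session, speed up when no repeats are requested, and lower the starting rate for later users after consecutive prior users needed slowing. All thresholds, step sizes, and the cap are illustrative assumptions.

```python
# Hypothetical sketch of the claim-7 annunciation-rate policy. The
# threshold, step size, and cap on cross-user adjustment are illustrative.

class RatePolicy:
    def __init__(self, base_rate=1.0, step=0.1, repeat_threshold=2):
        self.base_rate = base_rate            # nominal rate multiplier
        self.step = step
        self.repeat_threshold = repeat_threshold
        self.consecutive_slowed_users = 0     # state for claim 7(c)

    def start_user(self) -> float:
        """Initial rate for a new user, lowered when consecutive prior
        users required slowing (claim 7(c)); capped at three steps."""
        return self.base_rate - self.step * min(self.consecutive_slowed_users, 3)

    def after_segment(self, rate: float, repeats: int) -> float:
        """Per-segment adjustment within a session (claim 7(a)-(b))."""
        if repeats >= self.repeat_threshold:
            return rate - self.step           # user is struggling: slow down
        if repeats == 0:
            return rate + self.step           # no repeats requested: speed up
        return rate

    def end_user(self, was_slowed: bool):
        """Track how many users in a row needed a slower rate."""
        self.consecutive_slowed_users = (
            self.consecutive_slowed_users + 1 if was_slowed else 0
        )

policy = RatePolicy()
r = policy.start_user()                  # first user starts at the base rate
r = policy.after_segment(r, repeats=2)   # two repeats: rate is reduced
policy.end_user(was_slowed=True)
print(round(policy.start_user(), 2))     # next user begins one step slower
```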
8. An automated synthesis system wherein human audible speech is synthesized from text by a synthesis device in accordance with indicia of text prosody derived from rules relating to the underlying discourse context of the synthesis, said prosody indicia including features generated by:
a) identifying major prosodic groupings by utilizing major demarcation features to define the beginning and end of the major prosodic groupings;
b) identifying prosodic subgroupings within the major prosodic groupings according to prosodic rules for analyzing the text for predetermined textual markers indicative of prosodically isolatible subgroupings not delineated by the major demarcations dividing the prosodic major groupings,
c) within the prosodic subgroupings, identifying prosodically separable subgroup components, and
d) generating prosody indicia which include salience signifiers utilizable by the synthesis device to vary the salience of segments of the synthesized speech such that
(i) the salience signifiers within the prosodic subgroupings are first generated in accordance with predetermined salience placement rules solely relating to the components themselves,
(ii) modifying the first generated salience signifiers to increase the salience at the start of the prosodic subgroup and further signify the salience at the end of the prosodic subgroup, and
(iii) further modifying the salience signifiers to further increase the salience of the beginning of the major prosodic grouping and further signify the salience of the end of the major prosodic grouping.
9. The system of Claim 8 wherein the subgroup components are isolated by:
a) identifying textual indicators which mark relations of text groupings around them,
b) utilizing the textual indicators to separate the text within the prosodic subgrouping into units of nominal text which do not include the said predetermined textual markers, and
c) within the units of nominal text, identifying relational words that are not predetermined textual markers, nouns, or qualifiers of nouns.
10. The system of Claims 8 or 9 wherein the salience signifiers are indicia of pitch.
11. An automated system for synthesizing human audible speech from machine readable representation of restricted text having predetermined characteristics wherein the system employs a synthesis device which has been designed for use with unrestricted text, having a prosody indicia generator means for providing indicia of the text prosody to the synthesis device, said indicia being interpretable and executable by that device, and assigned on the basis of predetermined discourse constraints particular to the context of the synthesis of the text, and wherein the prosody indicia are generated by identifying major prosodic groupings by utilizing major demarcation features to define the beginning and end of the major prosodic groupings.
12. The system of Claim 11 wherein the restricted text consists of name and address information.
13. The system of Claim 12 wherein the name and address information is arranged into fields containing respectively names and addresses associated with a telephone number or numbers.
14. The system of Claim 11, 12, or 13 wherein the prosody indicia are further generated according to the following method:
a) identifying prosodic subgroupings within the major prosodic groupings according to prosodic rules for analyzing the text for predetermined textual markers indicative of prosodically isolatible subgroupings not delineated by the major demarcations dividing the prosodic major groupings,
b) within the prosodic subgroupings, identifying prosodically separable subgroup components, and
c) generating prosody indicia which include salience signifiers utilizable by the synthesis device to vary the salience of segments of the synthesized speech such that
(i) the salience signifiers within the prosodic subgroupings are first generated in accordance with predetermined salience placement rules solely relating to the components themselves,
(ii) the first generated salience signifiers are modified to increase the salience at the start of the prosodic subgroup and further signify the salience at the end of the prosodic subgroup, and
(iii) the salience signifiers are further modified to further increase the salience of the beginning of the major prosodic grouping and further signify the salience of the end of the major prosodic grouping.
15. The system of Claim 14 wherein the subgroup components are identified by:
a) identifying textual indicators which mark divisions of text groupings around them,
b) utilizing the textual indicators to separate the text within the prosodic subgrouping into units of nominal text which do not include the said predetermined textual markers, and
c) within the units of nominal text, identifying other indicators of textual groupings that are not predetermined textual markers of divisions, identifying nouns, and identifying qualifiers of nouns.
16. The system of Claim 14 wherein the salience signifiers are indicia of pitch.
17. The system of Claim 15 wherein the salience signifiers are indicia of pitch.
18. The system of Claims 1 or 11 wherein prosodic subgroupings are identified within the major prosodic groupings according to prosodic rules for analyzing the text for predetermined textual markers indicative of prosodically isolatible subgroupings not delineated by the major demarcations dividing the prosodic major groupings.
19. The system of Claim 18 wherein within the prosodic subgroupings, prosodically separable subgroup components are identified by:
a) identifying textual indicators which mark divisions of text groupings around them,
b) utilizing the textual indicators to separate the text within the prosodic subgrouping into units of nominal text which do not include the said predetermined textual markers, and
c) within the units of nominal text, identifying other indicators of textual groupings that are not predetermined textual markers of divisions, identifying nouns, and identifying qualifiers of nouns.
20. The system of Claim 19 wherein salience signifiers utilizable by the synthesis device to vary the salience of segments of the synthesized speech are generated such that
(i) the salience signifiers within the prosodic subgrouping are first generated in accordance with predetermined salience placement rules solely relating to the components themselves,
(ii) thereafter the first generated salience signifiers are modified to increase the salience at the start of the prosodic subgroup and further signify the salience at the end of the prosodic subgroup, and
(iii) the salience signifiers are further modified to further increase the salience of the major prosodic grouping and further signify the salience at the end of the major prosodic grouping.
21. The system of Claim 20 wherein the salience signifiers are indicia of pitch.
22. An automated system for synthesizing human audible speech from machine-readable representation of text wherein the system employs a synthesis device which has been designed for use with unrestricted text, said system including a prosody indicia generating means for automatically providing indicia of the text prosody to the synthesis device, said indicia being interpretable and executable by that device, and assigned on the basis of predetermined characteristics of restricted text, and wherein the indicia are generated by prosody rules associated with predetermined discourse constraints particular to the context of the synthesis of the text.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CA2594071A CA2594071C (en) | 1993-03-19 | 1994-03-18 | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
CA2594073A CA2594073C (en) | 1993-03-19 | 1994-03-18 | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
US10/253,820 US20040057548A1 (en) | 1994-03-18 | 2002-09-25 | Quasi-synchronous multi-stage event synchronization apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US3352893A | 1993-03-19 | 1993-03-19 | |
US08/033,528 | 1993-03-19 |
Related Child Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA2594071A Division CA2594071C (en) | 1993-03-19 | 1994-03-18 | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
CA2594073A Division CA2594073C (en) | 1993-03-19 | 1994-03-18 | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2119397A1 CA2119397A1 (en) | 1994-09-20 |
CA2119397C true CA2119397C (en) | 2007-10-02 |
Family
ID=21870928
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002119397A Expired - Lifetime CA2119397C (en) | 1993-03-19 | 1994-03-18 | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation |
Country Status (2)
Country | Link |
---|---|
US (6) | US5652828A (en) |
CA (1) | CA2119397C (en) |
Families Citing this family (319)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR950704772A (en) * | 1993-10-15 | 1995-11-20 | 데이비드 엠. 로젠블랫 | A method for training a system, the resulting apparatus, and method of use |
KR0153380B1 (en) * | 1995-10-28 | 1998-11-16 | 김광호 | Apparatus and method for guiding voice information of telephone switch |
DE69722277T2 (en) * | 1996-01-31 | 2004-04-01 | Canon K.K. | Billing device and an information distribution system using the billing device |
US5943648A (en) * | 1996-04-25 | 1999-08-24 | Lernout & Hauspie Speech Products N.V. | Speech signal distribution system providing supplemental parameter associated data |
US5832433A (en) * | 1996-06-24 | 1998-11-03 | Nynex Science And Technology, Inc. | Speech synthesis method for operator assistance telecommunications calls comprising a plurality of text-to-speech (TTS) devices |
JPH10153998A (en) * | 1996-09-24 | 1998-06-09 | Nippon Telegr & Teleph Corp <Ntt> | Auxiliary information utilizing type voice synthesizing method, recording medium recording procedure performing this method, and device performing this method |
US6961700B2 (en) * | 1996-09-24 | 2005-11-01 | Allvoice Computing Plc | Method and apparatus for processing the output of a speech recognition engine |
US6006187A (en) * | 1996-10-01 | 1999-12-21 | Lucent Technologies Inc. | Computer prosody user interface |
US5950162A (en) * | 1996-10-30 | 1999-09-07 | Motorola, Inc. | Method, device and system for generating segment durations in a text-to-speech system |
US6498921B1 (en) | 1999-09-01 | 2002-12-24 | Chi Fai Ho | Method and system to answer a natural-language question |
US5836771A (en) * | 1996-12-02 | 1998-11-17 | Ho; Chi Fai | Learning method and system based on questioning |
US5875427A (en) * | 1996-12-04 | 1999-02-23 | Justsystem Corp. | Voice-generating/document making apparatus voice-generating/document making method and computer-readable medium for storing therein a program having a computer execute voice-generating/document making sequence |
US6108630A (en) * | 1997-12-23 | 2000-08-22 | Nortel Networks Corporation | Text-to-speech driven annunciation of caller identification |
KR100236974B1 (en) | 1996-12-13 | 2000-02-01 | 정선종 | Sync. system between motion picture and text/voice converter |
US5915237A (en) * | 1996-12-13 | 1999-06-22 | Intel Corporation | Representing speech using MIDI |
JPH10260692A (en) * | 1997-03-18 | 1998-09-29 | Toshiba Corp | Method and system for recognition synthesis encoding and decoding of speech |
US6092044A (en) * | 1997-03-28 | 2000-07-18 | Dragon Systems, Inc. | Pronunciation generation in speech recognition |
KR100240637B1 (en) * | 1997-05-08 | 2000-01-15 | 정선종 | Syntax for tts input data to synchronize with multimedia |
JPH10319947A (en) * | 1997-05-15 | 1998-12-04 | Kawai Musical Instr Mfg Co Ltd | Pitch extent controller |
US6226614B1 (en) | 1997-05-21 | 2001-05-01 | Nippon Telegraph And Telephone Corporation | Method and apparatus for editing/creating synthetic speech message and recording medium with the method recorded thereon |
BE1011892A3 (en) * | 1997-05-22 | 2000-02-01 | Motorola Inc | Method, device and system for generating voice synthesis parameters from information including express representation of intonation. |
JPH1138989A (en) * | 1997-07-14 | 1999-02-12 | Toshiba Corp | Device and method for voice synthesis |
JP3195279B2 (en) * | 1997-08-27 | 2001-08-06 | インターナショナル・ビジネス・マシーンズ・コーポレ−ション | Audio output system and method |
KR100238189B1 (en) * | 1997-10-16 | 2000-01-15 | 윤종용 | Multi-language tts device and method |
GB9723813D0 (en) * | 1997-11-11 | 1998-01-07 | Mitel Corp | Call routing based on caller's mood |
JP4267101B2 (en) * | 1997-11-17 | 2009-05-27 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Voice identification device, pronunciation correction device, and methods thereof |
JP2000163418A (en) * | 1997-12-26 | 2000-06-16 | Canon Inc | Processor and method for natural language processing and storage medium stored with program thereof |
JPH11265195A (en) * | 1998-01-14 | 1999-09-28 | Sony Corp | Information distribution system, information transmitter, information receiver and information distributing method |
EP1051701B1 (en) * | 1998-02-03 | 2002-11-06 | Siemens Aktiengesellschaft | Method for voice data transmission |
US6076060A (en) * | 1998-05-01 | 2000-06-13 | Compaq Computer Corporation | Computer method and apparatus for translating text to sound |
US6446040B1 (en) * | 1998-06-17 | 2002-09-03 | Yahoo! Inc. | Intelligent text-to-speech synthesis |
US6236967B1 (en) * | 1998-06-19 | 2001-05-22 | At&T Corp. | Tone and speech recognition in communications systems |
US6321226B1 (en) * | 1998-06-30 | 2001-11-20 | Microsoft Corporation | Flexible keyboard searching |
US6490563B2 (en) * | 1998-08-17 | 2002-12-03 | Microsoft Corporation | Proofreading with text to speech feedback |
US6338038B1 (en) * | 1998-09-02 | 2002-01-08 | International Business Machines Corp. | Variable speed audio playback in speech recognition proofreader |
US7272604B1 (en) * | 1999-09-03 | 2007-09-18 | Atle Hedloy | Method, system and computer readable medium for addressing handling from an operating system |
NO984066L (en) * | 1998-09-03 | 2000-03-06 | Arendi As | Computer function button |
DE19908137A1 (en) | 1998-10-16 | 2000-06-15 | Volkswagen Ag | Method and device for automatic control of at least one device by voice dialog |
US6188984B1 (en) * | 1998-11-17 | 2001-02-13 | Fonix Corporation | Method and system for syllable parsing |
US6260016B1 (en) | 1998-11-25 | 2001-07-10 | Matsushita Electric Industrial Co., Ltd. | Speech synthesis employing prosody templates |
US6208968B1 (en) | 1998-12-16 | 2001-03-27 | Compaq Computer Corporation | Computer method and apparatus for text-to-speech synthesizer dictionary reduction |
US6363342B2 (en) * | 1998-12-18 | 2002-03-26 | Matsushita Electric Industrial Co., Ltd. | System for developing word-pronunciation pairs |
US6400809B1 (en) * | 1999-01-29 | 2002-06-04 | Ameritech Corporation | Method and system for text-to-speech conversion of caller information |
US6185533B1 (en) | 1999-03-15 | 2001-02-06 | Matsushita Electric Industrial Co., Ltd. | Generation and synthesis of prosody templates |
CA2366952A1 (en) * | 1999-03-15 | 2000-09-21 | British Telecommunications Public Limited Company | Speech synthesis |
US6178402B1 (en) | 1999-04-29 | 2001-01-23 | Motorola, Inc. | Method, apparatus and system for generating acoustic parameters in a text-to-speech system using a neural network |
US6321196B1 (en) * | 1999-07-02 | 2001-11-20 | International Business Machines Corporation | Phonetic spelling for speech recognition |
US7013300B1 (en) | 1999-08-03 | 2006-03-14 | Taylor David C | Locating, filtering, matching macro-context from indexed database for searching context where micro-context relevant to textual input by user |
US7219073B1 (en) * | 1999-08-03 | 2007-05-15 | Brandnamestores.Com | Method for extracting information utilizing a user-context-based search engine |
US6622121B1 (en) | 1999-08-20 | 2003-09-16 | International Business Machines Corporation | Testing speech recognition systems using test data generated by text-to-speech conversion |
GB2353887B (en) * | 1999-09-04 | 2003-09-24 | Ibm | Speech recognition system |
US6807574B1 (en) | 1999-10-22 | 2004-10-19 | Tellme Networks, Inc. | Method and apparatus for content personalization over a telephone interface |
US7941481B1 (en) | 1999-10-22 | 2011-05-10 | Tellme Networks, Inc. | Updating an electronic phonebook over electronic communication networks |
GB2357943B (en) * | 1999-12-30 | 2004-12-08 | Nokia Mobile Phones Ltd | User interface for text to speech conversion |
US6571240B1 (en) | 2000-02-02 | 2003-05-27 | Chi Fai Ho | Information processing for searching categorizing information in a document based on a categorization hierarchy and extracted phrases |
JP2001293247A (en) * | 2000-02-07 | 2001-10-23 | Sony Computer Entertainment Inc | Game control method |
US7010489B1 (en) * | 2000-03-09 | 2006-03-07 | International Business Mahcines Corporation | Method for guiding text-to-speech output timing using speech recognition markers |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
US6272464B1 (en) * | 2000-03-27 | 2001-08-07 | Lucent Technologies Inc. | Method and apparatus for assembling a prediction list of name pronunciation variations for use during speech recognition |
US6697781B1 (en) * | 2000-04-17 | 2004-02-24 | Adobe Systems Incorporated | Method and apparatus for generating speech from an electronic form |
US7062098B1 (en) | 2000-05-12 | 2006-06-13 | International Business Machines Corporation | Method and apparatus for the scaling down of data |
US6970179B1 (en) | 2000-05-12 | 2005-11-29 | International Business Machines Corporation | Method and apparatus for the scaling up of data |
US20020120451A1 (en) * | 2000-05-31 | 2002-08-29 | Yumiko Kato | Apparatus and method for providing information by speech |
DE10031008A1 (en) * | 2000-06-30 | 2002-01-10 | Nokia Mobile Phones Ltd | Procedure for assembling sentences for speech output |
US7143039B1 (en) | 2000-08-11 | 2006-11-28 | Tellme Networks, Inc. | Providing menu and other services for an information processing system using a telephone or other audio interface |
US7092928B1 (en) * | 2000-07-31 | 2006-08-15 | Quantum Leap Research, Inc. | Intelligent portal engine |
US7269557B1 (en) * | 2000-08-11 | 2007-09-11 | Tellme Networks, Inc. | Coarticulated concatenated speech |
US7406657B1 (en) * | 2000-09-22 | 2008-07-29 | International Business Machines Corporation | Audible presentation and verbal interaction of HTML-like form constructs |
US6978239B2 (en) * | 2000-12-04 | 2005-12-20 | Microsoft Corporation | Method and apparatus for speech synthesis without prosody modification |
US7263488B2 (en) * | 2000-12-04 | 2007-08-28 | Microsoft Corporation | Method and apparatus for identifying prosodic word boundaries |
US6845358B2 (en) * | 2001-01-05 | 2005-01-18 | Matsushita Electric Industrial Co., Ltd. | Prosody template matching for text-to-speech systems |
US6845356B1 (en) * | 2001-01-31 | 2005-01-18 | International Business Machines Corporation | Processing dual tone multi-frequency signals for use with a natural language understanding system |
US6876968B2 (en) * | 2001-03-08 | 2005-04-05 | Matsushita Electric Industrial Co., Ltd. | Run time synthesizer adaptation to improve intelligibility of synthesized speech |
US6915261B2 (en) * | 2001-03-16 | 2005-07-05 | Intel Corporation | Matching a synthetic disc jockey's voice characteristics to the sound characteristics of audio programs |
US7177810B2 (en) * | 2001-04-10 | 2007-02-13 | Sri International | Method and apparatus for performing prosody-based endpointing of a speech signal |
US7020663B2 (en) * | 2001-05-30 | 2006-03-28 | George M. Hay | System and method for the delivery of electronic books |
JP4680429B2 (en) * | 2001-06-26 | 2011-05-11 | Okiセミコンダクタ株式会社 | High speed reading control method in text-to-speech converter |
GB2378877B (en) * | 2001-08-14 | 2005-04-13 | Vox Generation Ltd | Prosodic boundary markup mechanism |
US7069221B2 (en) * | 2001-10-26 | 2006-06-27 | Speechworks International, Inc. | Non-target barge-in detection |
US20030101045A1 (en) * | 2001-11-29 | 2003-05-29 | Peter Moffatt | Method and apparatus for playing recordings of spoken alphanumeric characters |
JP2003186490A (en) * | 2001-12-21 | 2003-07-04 | Nissan Motor Co Ltd | Text voice read-aloud device and information providing system |
US20040030554A1 (en) * | 2002-01-09 | 2004-02-12 | Samya Boxberger-Oberoi | System and method for providing locale-specific interpretation of text data |
US7177814B2 (en) * | 2002-02-07 | 2007-02-13 | Sap Aktiengesellschaft | Dynamic grammar for voice-enabled applications |
JP4150198B2 (en) * | 2002-03-15 | 2008-09-17 | ソニー株式会社 | Speech synthesis method, speech synthesis apparatus, program and recording medium, and robot apparatus |
KR100446627B1 (en) * | 2002-03-29 | 2004-09-04 | 삼성전자주식회사 | Apparatus for providing information using voice dialogue interface and method thereof |
US7136818B1 (en) | 2002-05-16 | 2006-11-14 | At&T Corp. | System and method of providing conversational visual prosody for talking heads |
US7076430B1 (en) | 2002-05-16 | 2006-07-11 | At&T Corp. | System and method of providing conversational visual prosody for talking heads |
US7305340B1 (en) * | 2002-06-05 | 2007-12-04 | At&T Corp. | System and method for configuring voice synthesis |
US7143037B1 (en) * | 2002-06-12 | 2006-11-28 | Cisco Technology, Inc. | Spelling words using an arbitrary phonetic alphabet |
US7386449B2 (en) | 2002-12-11 | 2008-06-10 | Voice Enabling Systems Technology Inc. | Knowledge-based flexible natural speech dialogue system |
US7324944B2 (en) * | 2002-12-12 | 2008-01-29 | Brigham Young University, Technology Transfer Office | Systems and methods for dynamically analyzing temporality in speech |
US8285537B2 (en) * | 2003-01-31 | 2012-10-09 | Comverse, Inc. | Recognition of proper nouns using native-language pronunciation |
US7496498B2 (en) * | 2003-03-24 | 2009-02-24 | Microsoft Corporation | Front-end architecture for a multi-lingual text-to-speech system |
US7313523B1 (en) * | 2003-05-14 | 2007-12-25 | Apple Inc. | Method and apparatus for assigning word prominence to new or previous information in speech synthesis |
US20050027523A1 (en) * | 2003-07-31 | 2005-02-03 | Prakairut Tarlton | Spoken language system |
JP3984207B2 (en) * | 2003-09-04 | 2007-10-03 | 株式会社東芝 | Speech recognition evaluation apparatus, speech recognition evaluation method, and speech recognition evaluation program |
US8886538B2 (en) * | 2003-09-26 | 2014-11-11 | Nuance Communications, Inc. | Systems and methods for text-to-speech synthesis using spoken example |
US8103505B1 (en) | 2003-11-19 | 2012-01-24 | Apple Inc. | Method and apparatus for speech synthesis using paralinguistic variation |
US7349836B2 (en) * | 2003-12-12 | 2008-03-25 | International Business Machines Corporation | Method and process to generate real time input/output in a voice XML run-time simulation environment |
US8583439B1 (en) * | 2004-01-12 | 2013-11-12 | Verizon Services Corp. | Enhanced interface for use with speech recognition |
AU2005207606B2 (en) * | 2004-01-16 | 2010-11-11 | Nuance Communications, Inc. | Corpus-based speech synthesis based on segment recombination |
WO2005076258A1 (en) * | 2004-02-03 | 2005-08-18 | Matsushita Electric Industrial Co., Ltd. | User adaptive type device and control method thereof |
US7542903B2 (en) * | 2004-02-18 | 2009-06-02 | Fuji Xerox Co., Ltd. | Systems and methods for determining predictive models of discourse functions |
US20050187772A1 (en) * | 2004-02-25 | 2005-08-25 | Fuji Xerox Co., Ltd. | Systems and methods for synthesizing speech using discourse function level prosodic features |
US20050234724A1 (en) * | 2004-04-15 | 2005-10-20 | Andrew Aaron | System and method for improving text-to-speech software intelligibility through the detection of uncommon words and phrases |
KR100590553B1 (en) * | 2004-05-21 | 2006-06-19 | 삼성전자주식회사 | Method and apparatus for generating dialog prosody structure and speech synthesis method and system employing the same |
US7788098B2 (en) * | 2004-08-02 | 2010-08-31 | Nokia Corporation | Predicting tone pattern information for textual information used in telecommunication systems |
US7580837B2 (en) | 2004-08-12 | 2009-08-25 | At&T Intellectual Property I, L.P. | System and method for targeted tuning module of a speech recognition system |
US20080154601A1 (en) * | 2004-09-29 | 2008-06-26 | Microsoft Corporation | Method and system for providing menu and other services for an information processing system using a telephone or other audio interface |
US7242751B2 (en) | 2004-12-06 | 2007-07-10 | Sbc Knowledge Ventures, L.P. | System and method for speech recognition-enabled automatic call routing |
US7751551B2 (en) | 2005-01-10 | 2010-07-06 | At&T Intellectual Property I, L.P. | System and method for speech-enabled call routing |
US7627096B2 (en) * | 2005-01-14 | 2009-12-01 | At&T Intellectual Property I, L.P. | System and method for independently recognizing and selecting actions and objects in a speech recognition system |
US7792264B2 (en) * | 2005-03-23 | 2010-09-07 | Alcatel-Lucent Usa Inc. | Ring tone selected by calling party of third party played to called party |
JP4570509B2 (en) * | 2005-04-22 | 2010-10-27 | 富士通株式会社 | Reading generation device, reading generation method, and computer program |
US20060245641A1 (en) * | 2005-04-29 | 2006-11-02 | Microsoft Corporation | Extracting data from semi-structured information utilizing a discriminative context free grammar |
US20080294433A1 (en) * | 2005-05-27 | 2008-11-27 | Minerva Yeung | Automatic Text-Speech Mapping Tool |
US7657020B2 (en) | 2005-06-03 | 2010-02-02 | At&T Intellectual Property I, Lp | Call routing system and method of using the same |
JP2007024960A (en) * | 2005-07-12 | 2007-02-01 | Internatl Business Mach Corp <Ibm> | System, program and control method |
US8429167B2 (en) | 2005-08-08 | 2013-04-23 | Google Inc. | User-context-based search engine |
US8027876B2 (en) | 2005-08-08 | 2011-09-27 | Yoogli, Inc. | Online advertising valuation apparatus and method |
US8977636B2 (en) * | 2005-08-19 | 2015-03-10 | International Business Machines Corporation | Synthesizing aggregate data of disparate data types into data of a uniform data type |
US20070055526A1 (en) * | 2005-08-25 | 2007-03-08 | International Business Machines Corporation | Method, apparatus and computer program product providing prosodic-categorical enhancement to phrase-spliced text-to-speech synthesis |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
TWI277947B (en) * | 2005-09-14 | 2007-04-01 | Delta Electronics Inc | Interactive speech correcting method |
CN1945693B (en) * | 2005-10-09 | 2010-10-13 | 株式会社东芝 | Method and device for training a prosodic statistical model, prosodic segmentation, and speech synthesis |
US20070094270A1 (en) * | 2005-10-21 | 2007-04-26 | Callminer, Inc. | Method and apparatus for the processing of heterogeneous units of work |
US8694319B2 (en) * | 2005-11-03 | 2014-04-08 | International Business Machines Corporation | Dynamic prosody adjustment for voice-rendering synthesized data |
US20070162430A1 (en) * | 2005-12-30 | 2007-07-12 | Katja Bader | Context display of search results |
JP4822847B2 (en) * | 2006-01-10 | 2011-11-24 | アルパイン株式会社 | Audio conversion processor |
US8509563B2 (en) | 2006-02-02 | 2013-08-13 | Microsoft Corporation | Generation of documents from images |
US9135339B2 (en) * | 2006-02-13 | 2015-09-15 | International Business Machines Corporation | Invoking an audio hyperlink |
US8036894B2 (en) * | 2006-02-16 | 2011-10-11 | Apple Inc. | Multi-unit approach to text-to-speech synthesis |
US20090319273A1 (en) * | 2006-06-30 | 2009-12-24 | Nec Corporation | Audio content generation system, information exchanging system, program, audio content generating method, and information exchanging method |
US8280734B2 (en) | 2006-08-16 | 2012-10-02 | Nuance Communications, Inc. | Systems and arrangements for titling audio recordings comprising a lingual translation of the title |
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US8027837B2 (en) * | 2006-09-15 | 2011-09-27 | Apple Inc. | Using non-speech sounds during text-to-speech synthesis |
US9318100B2 (en) | 2007-01-03 | 2016-04-19 | International Business Machines Corporation | Supplementing audio recorded in a media file |
US8380519B2 (en) * | 2007-01-25 | 2013-02-19 | Eliza Corporation | Systems and techniques for producing spoken voice prompts with dialog-context-optimized speech parameters |
US8626731B2 (en) * | 2007-02-01 | 2014-01-07 | The Invention Science Fund I, Llc | Component information and auxiliary information related to information management |
US8055648B2 (en) * | 2007-02-01 | 2011-11-08 | The Invention Science Fund I, Llc | Managing information related to communication |
JP4672686B2 (en) * | 2007-02-16 | 2011-04-20 | 株式会社デンソー | Voice recognition device and navigation device |
US8719027B2 (en) * | 2007-02-28 | 2014-05-06 | Microsoft Corporation | Name synthesis |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
US7895041B2 (en) * | 2007-04-27 | 2011-02-22 | Dickson Craig B | Text to speech interactive voice response system |
US20080282153A1 (en) * | 2007-05-09 | 2008-11-13 | Sony Ericsson Mobile Communications Ab | Text-content features |
JP5029168B2 (en) * | 2007-06-25 | 2012-09-19 | 富士通株式会社 | Apparatus, program and method for reading aloud |
JP5029167B2 (en) * | 2007-06-25 | 2012-09-19 | 富士通株式会社 | Apparatus, program and method for reading aloud |
JP4973337B2 (en) * | 2007-06-28 | 2012-07-11 | 富士通株式会社 | Apparatus, program and method for reading aloud |
US20090083027A1 (en) * | 2007-08-16 | 2009-03-26 | Hollingsworth William A | Automatic text skimming using lexical chains |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
WO2009101837A1 (en) * | 2008-02-13 | 2009-08-20 | Nec Corporation | Mark insertion device and mark insertion method |
US20090209341A1 (en) * | 2008-02-14 | 2009-08-20 | Aruze Gaming America, Inc. | Gaming Apparatus Capable of Conversation with Player and Control Method Thereof |
JP4968147B2 (en) * | 2008-03-31 | 2012-07-04 | 富士通株式会社 | Communication terminal, audio output adjustment method of communication terminal |
EP2107553B1 (en) * | 2008-03-31 | 2011-05-18 | Harman Becker Automotive Systems GmbH | Method for determining barge-in |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
EP2148325B1 (en) * | 2008-07-22 | 2014-10-01 | Nuance Communications, Inc. | Method for determining the presence of a wanted signal component |
US10127231B2 (en) * | 2008-07-22 | 2018-11-13 | At&T Intellectual Property I, L.P. | System and method for rich media annotation |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US20100057465A1 (en) * | 2008-09-03 | 2010-03-04 | David Michael Kirsch | Variable text-to-speech for automotive application |
US8219899B2 (en) | 2008-09-22 | 2012-07-10 | International Business Machines Corporation | Verbal description method and system |
WO2010067118A1 (en) | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
US8799268B2 (en) * | 2008-12-17 | 2014-08-05 | International Business Machines Corporation | Consolidating tags |
US8494857B2 (en) | 2009-01-06 | 2013-07-23 | Regents Of The University Of Minnesota | Automatic measurement of speech fluency |
US8498866B2 (en) * | 2009-01-15 | 2013-07-30 | K-Nfb Reading Technology, Inc. | Systems and methods for multiple language document narration |
US8719004B2 (en) * | 2009-03-19 | 2014-05-06 | Ditech Networks, Inc. | Systems and methods for punctuating voicemail transcriptions |
JP5269668B2 (en) * | 2009-03-25 | 2013-08-21 | 株式会社東芝 | Speech synthesis apparatus, program, and method |
US20100299621A1 (en) * | 2009-05-20 | 2010-11-25 | Making Everlasting Memories, L.L.C. | System and Method for Extracting a Plurality of Images from a Single Scan |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US20120311585A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Organizing task items that represent tasks to perform |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
GB0922608D0 (en) | 2009-12-23 | 2010-02-10 | Vratskides Alexios | Message optimization |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
WO2011089450A2 (en) | 2010-01-25 | 2011-07-28 | Andrew Peter Nelson Jerram | Apparatuses, methods and systems for a digital conversation management platform |
US8447610B2 (en) | 2010-02-12 | 2013-05-21 | Nuance Communications, Inc. | Method and apparatus for generating synthetic speech with contrastive stress |
US8571870B2 (en) * | 2010-02-12 | 2013-10-29 | Nuance Communications, Inc. | Method and apparatus for generating synthetic speech with contrastive stress |
US8949128B2 (en) * | 2010-02-12 | 2015-02-03 | Nuance Communications, Inc. | Method and apparatus for providing speech output for speech-enabled applications |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
CN102237081B (en) * | 2010-04-30 | 2013-04-24 | 国际商业机器公司 | Method and system for estimating rhythm of voice |
US9798653B1 (en) * | 2010-05-05 | 2017-10-24 | Nuance Communications, Inc. | Methods, apparatus and data structure for cross-language speech adaptation |
US20110313762A1 (en) * | 2010-06-20 | 2011-12-22 | International Business Machines Corporation | Speech output with confidence indication |
US8731939B1 (en) | 2010-08-06 | 2014-05-20 | Google Inc. | Routing queries based on carrier phrase registration |
US9792640B2 (en) | 2010-08-18 | 2017-10-17 | Jinni Media Ltd. | Generating and providing content recommendations to a group of users |
US8688435B2 (en) | 2010-09-22 | 2014-04-01 | Voice On The Go Inc. | Systems and methods for normalizing input media |
US10762293B2 (en) | 2010-12-22 | 2020-09-01 | Apple Inc. | Using parts-of-speech tagging and named entity recognition for spelling correction |
JP4996750B1 (en) * | 2011-01-31 | 2012-08-08 | 株式会社東芝 | Electronic apparatus |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US9092131B2 (en) * | 2011-12-13 | 2015-07-28 | Microsoft Technology Licensing, Llc | Highlighting of tappable web page elements |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
CN103295576A (en) * | 2012-03-02 | 2013-09-11 | 腾讯科技(深圳)有限公司 | Voice identification method and terminal of instant communication |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9418649B2 (en) * | 2012-03-06 | 2016-08-16 | Verizon Patent And Licensing Inc. | Method and apparatus for phonetic character conversion |
US9576593B2 (en) | 2012-03-15 | 2017-02-21 | Regents Of The University Of Minnesota | Automated verbal fluency assessment |
US9368104B2 (en) * | 2012-04-30 | 2016-06-14 | Src, Inc. | System and method for synthesizing human speech using multiple speakers and context |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US10395270B2 (en) | 2012-05-17 | 2019-08-27 | Persado Intellectual Property Limited | System and method for recommending a grammar for a message campaign used by a message optimization system |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
US9502050B2 (en) | 2012-06-10 | 2016-11-22 | Nuance Communications, Inc. | Noise dependent signal processing for in-car communication systems with multiple acoustic zones |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US9536528B2 (en) | 2012-07-03 | 2017-01-03 | Google Inc. | Determining hotword suitability |
CN104704560B (en) | 2012-09-04 | 2018-06-05 | 纽昂斯通讯公司 | Formant-dependent speech signal enhancement |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
JP5999839B2 (en) * | 2012-09-10 | 2016-09-28 | ルネサスエレクトロニクス株式会社 | Voice guidance system and electronic equipment |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US9064318B2 (en) | 2012-10-25 | 2015-06-23 | Adobe Systems Incorporated | Image matting and alpha value techniques |
US9613633B2 (en) | 2012-10-30 | 2017-04-04 | Nuance Communications, Inc. | Speech enhancement |
US9355649B2 (en) | 2012-11-13 | 2016-05-31 | Adobe Systems Incorporated | Sound alignment using timing information |
US10638221B2 (en) | 2012-11-13 | 2020-04-28 | Adobe Inc. | Time interval sound alignment |
US9201580B2 (en) | 2012-11-13 | 2015-12-01 | Adobe Systems Incorporated | Sound alignment user interface |
US9076205B2 (en) | 2012-11-19 | 2015-07-07 | Adobe Systems Incorporated | Edge direction and curve based image de-blurring |
US10249321B2 (en) * | 2012-11-20 | 2019-04-02 | Adobe Inc. | Sound rate modification |
US9451304B2 (en) | 2012-11-29 | 2016-09-20 | Adobe Systems Incorporated | Sound feature priority alignment |
US9135710B2 (en) | 2012-11-30 | 2015-09-15 | Adobe Systems Incorporated | Depth map stereo correspondence techniques |
US10455219B2 (en) | 2012-11-30 | 2019-10-22 | Adobe Inc. | Stereo correspondence and depth sensors |
US9208547B2 (en) | 2012-12-19 | 2015-12-08 | Adobe Systems Incorporated | Stereo correspondence smoothness tool |
US10249052B2 (en) | 2012-12-19 | 2019-04-02 | Adobe Systems Incorporated | Stereo correspondence model fitting |
US9214026B2 (en) | 2012-12-20 | 2015-12-15 | Adobe Systems Incorporated | Belief propagation and affinity measures |
TWI573129B (en) * | 2013-02-05 | 2017-03-01 | 國立交通大學 | Streaming encoder, prosody information encoding device, prosody-analyzing device, and device and method for speech-synthesizing |
KR102516577B1 (en) | 2013-02-07 | 2023-04-03 | 애플 인크. | Voice trigger for a digital assistant |
US9123335B2 (en) * | 2013-02-20 | 2015-09-01 | Jinni Media Limited | System apparatus circuit method and associated computer executable code for natural language understanding and semantic content discovery |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
WO2014144949A2 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | Training an at least partial voice command system |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
WO2014148190A1 (en) * | 2013-03-19 | 2014-09-25 | Necソリューションイノベータ株式会社 | Note-taking assistance system, information delivery device, terminal, note-taking assistance method, and computer-readable recording medium |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
EP3008641A1 (en) | 2013-06-09 | 2016-04-20 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
CN105265005B (en) | 2013-06-13 | 2019-09-17 | 苹果公司 | System and method for emergency calls initiated by voice command |
WO2015020942A1 (en) | 2013-08-06 | 2015-02-12 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
WO2015105994A1 (en) | 2014-01-08 | 2015-07-16 | Callminer, Inc. | Real-time conversational analytics facility |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
EP3149728B1 (en) | 2014-05-30 | 2019-01-16 | Apple Inc. | Multi-command single utterance input method |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9606986B2 (en) | 2014-09-29 | 2017-03-28 | Apple Inc. | Integrated word N-gram and class M-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US9472196B1 (en) | 2015-04-22 | 2016-10-18 | Google Inc. | Developer voice actions system |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10504137B1 (en) | 2015-10-08 | 2019-12-10 | Persado Intellectual Property Limited | System, method, and computer program product for monitoring and responding to the performance of an ad |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10832283B1 (en) | 2015-12-09 | 2020-11-10 | Persado Intellectual Property Limited | System, method, and computer program for providing an instance of a promotional message to a user based on a predicted emotional response corresponding to user characteristics |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US9740751B1 (en) | 2016-02-18 | 2017-08-22 | Google Inc. | Application keywords |
US9922648B2 (en) | 2016-03-01 | 2018-03-20 | Google Llc | Developer voice actions system |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
US9691384B1 (en) | 2016-08-19 | 2017-06-27 | Google Inc. | Voice action biasing system |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10586079B2 (en) * | 2016-12-23 | 2020-03-10 | Soundhound, Inc. | Parametric adaptation of voice synthesis |
WO2018175892A1 (en) * | 2017-03-23 | 2018-09-27 | D&M Holdings, Inc. | System providing expressive and emotive text-to-speech |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | Synchronization and task delegation of a digital assistant |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US11443646B2 (en) | 2017-12-22 | 2022-09-13 | Fathom Technologies, LLC | E-Reader interface system with audio and highlighting synchronization for digital books |
US10671251B2 (en) | 2017-12-22 | 2020-06-02 | Arbordale Publishing, LLC | Interactive eReader interface generation based on synchronization of textual and audial descriptors |
WO2020230924A1 (en) * | 2019-05-15 | 2020-11-19 | LG Electronics Inc. | Speech synthesis apparatus using artificial intelligence, operation method of speech synthesis apparatus, and computer-readable recording medium |
CN112309368A (en) * | 2020-11-23 | 2021-02-02 | Beijing Youzhuju Network Technology Co., Ltd. | Prosody prediction method, device, equipment and storage medium |
CN112820289A (en) * | 2020-12-31 | 2021-05-18 | Guangdong Midea Kitchen Appliances Manufacturing Co., Ltd. | Voice playing method, voice playing system, electric appliance and readable storage medium |
Family Cites Families (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3704345A (en) * | 1971-03-19 | 1972-11-28 | Bell Telephone Labor Inc | Conversion of printed text into synthetic speech |
US4685135A (en) * | 1981-03-05 | 1987-08-04 | Texas Instruments Incorporated | Text-to-speech synthesis system |
NL8200726A (en) * | 1982-02-24 | 1983-09-16 | Philips Nv | DEVICE FOR GENERATING THE AUDITIVE INFORMATION FROM A COLLECTION OF CHARACTERS. |
US4470150A (en) * | 1982-03-18 | 1984-09-04 | Federal Screw Works | Voice synthesizer with automatic pitch and speech rate modulation |
US4624012A (en) * | 1982-05-06 | 1986-11-18 | Texas Instruments Incorporated | Method and apparatus for converting voice characteristics of synthesized speech |
FR2553555B1 (en) * | 1983-10-14 | 1986-04-11 | Texas Instruments France | SPEECH CODING METHOD AND DEVICE FOR IMPLEMENTING IT |
US4797930A (en) * | 1983-11-03 | 1989-01-10 | Texas Instruments Incorporated | constructed syllable pitch patterns from phonological linguistic unit string data |
US4802223A (en) * | 1983-11-03 | 1989-01-31 | Texas Instruments Incorporated | Low data rate speech encoding employing syllable pitch patterns |
US4695962A (en) * | 1983-11-03 | 1987-09-22 | Texas Instruments Incorporated | Speaking apparatus having differing speech modes for word and phrase synthesis |
US4692941A (en) * | 1984-04-10 | 1987-09-08 | First Byte | Real-time text-to-speech conversion system |
US4783811A (en) * | 1984-12-27 | 1988-11-08 | Texas Instruments Incorporated | Method and apparatus for determining syllable boundaries |
US4831654A (en) * | 1985-09-09 | 1989-05-16 | Wang Laboratories, Inc. | Apparatus for making and editing dictionary entries in a text to speech conversion system |
US4829580A (en) * | 1986-03-26 | 1989-05-09 | American Telephone and Telegraph Company, AT&T Bell Laboratories | Text analysis system with letter sequence recognition and speech stress assignment arrangement |
US4884972A (en) * | 1986-11-26 | 1989-12-05 | Bright Star Technology, Inc. | Speech synchronized animation |
JPS63285598A (en) * | 1987-05-18 | 1988-11-22 | KDD Corporation | Phoneme connection type parameter rule synthesization system |
GB2207027B (en) * | 1987-07-15 | 1992-01-08 | Matsushita Electric Works Ltd | Voice encoding and composing system |
JP2623586B2 (en) * | 1987-07-31 | 1997-06-25 | 国際電信電話株式会社 | Pitch control method in speech synthesis |
US4908867A (en) * | 1987-11-19 | 1990-03-13 | British Telecommunications Public Limited Company | Speech synthesis |
DE68913669T2 (en) * | 1988-11-23 | 1994-07-21 | Digital Equipment Corp | Pronunciation of names by a synthesizer. |
US4979216A (en) * | 1989-02-17 | 1990-12-18 | Malsheen Bathsheba J | Text to speech synthesis system and method using context dependent vowel allophones |
JPH031200A (en) * | 1989-05-29 | 1991-01-07 | Nec Corp | Regulation type voice synthesizing device |
US5212731A (en) * | 1990-09-17 | 1993-05-18 | Matsushita Electric Industrial Co. Ltd. | Apparatus for providing sentence-final accents in synthesized american english speech |
KR940002854B1 (en) * | 1991-11-06 | 1994-04-04 | Korea Telecommunication Authority | Sound synthesizing system |
DE69232112T2 (en) * | 1991-11-12 | 2002-03-14 | Fujitsu Ltd | Speech synthesis device |
EP0543329B1 (en) * | 1991-11-18 | 2002-02-06 | Kabushiki Kaisha Toshiba | Speech dialogue system for facilitating human-computer interaction |
US5475796A (en) * | 1991-12-20 | 1995-12-12 | Nec Corporation | Pitch pattern generation apparatus |
JP3083640B2 (en) * | 1992-05-28 | 2000-09-04 | 株式会社東芝 | Voice synthesis method and apparatus |
US5384893A (en) * | 1992-09-23 | 1995-01-24 | Emerson & Stern Associates, Inc. | Method and apparatus for speech synthesis based on prosodic analysis |
US5636325A (en) * | 1992-11-13 | 1997-06-03 | International Business Machines Corporation | Speech synthesis and analysis of dialects |
US5642466A (en) * | 1993-01-21 | 1997-06-24 | Apple Computer, Inc. | Intonation adjustment in text-to-speech systems |
- 1994
  - 1994-03-18 CA CA002119397A patent/CA2119397C/en not_active Expired - Lifetime
- 1996
  - 1996-03-01 US US08/641,480 patent/US5652828A/en not_active Expired - Lifetime
- 1997
  - 1997-01-29 US US08/790,579 patent/US5751906A/en not_active Expired - Lifetime
  - 1997-01-29 US US08/790,580 patent/US5749071A/en not_active Expired - Lifetime
  - 1997-01-29 US US08/790,581 patent/US5732395A/en not_active Expired - Lifetime
  - 1997-01-29 US US08/790,578 patent/US5832435A/en not_active Expired - Lifetime
  - 1997-03-14 US US08/818,705 patent/US5890117A/en not_active Expired - Lifetime
Also Published As
Publication number | Publication date |
---|---|
US5652828A (en) | 1997-07-29 |
US5832435A (en) | 1998-11-03 |
CA2119397A1 (en) | 1994-09-20 |
US5751906A (en) | 1998-05-12 |
US5749071A (en) | 1998-05-05 |
US5890117A (en) | 1999-03-30 |
US5732395A (en) | 1998-03-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2119397C (en) | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation | |
Bendor-Samuel | Niger-Congo, Gur | |
AU675591B2 (en) | Speech synthesis | |
CA2221762C (en) | Ideal phonetic unit duration adjustment for text-to-speech system | |
EP0831460A3 (en) | Speech synthesis method utilizing auxiliary information | |
Frankish | Intonation and auditory grouping in immediate serial recall | |
EP0848372A3 (en) | Speech synthesizing system and redundancy-reduced waveform database therefor | |
EP0059880A3 (en) | Text-to-speech synthesis system | |
EP0376501A3 (en) | Speech recognition system | |
CA2076495A1 (en) | Dynamic routing-administration | |
EP0953970A3 (en) | Method and apparatus using decision trees to generate and score multiple pronunciations for a spelled word | |
JPS57158900A (en) | Text voice synthesizer | |
EP0749109A3 (en) | Speech recognition for tonal languages | |
TW314227U (en) | Language-information providing apparatus | |
Browman | Rules for demisyllable synthesis using LINGUA, a language interpreter | |
CA2294027A1 (en) | Method and apparatus for audibly indicating when a predetermined location has been encountered in stored data | |
Chenoweth et al. | Comparative‐Generative Models of a New Guinea Melodic Structure 1 | |
Baumann et al. | On the prosody of German telephone numbers | |
Ao | Non-uniqueness condition and the segmentation of the Chinese syllable | |
CA2594071A1 (en) | Improved automated voice synthesis employing enhanced prosodic treatment of text, spelling of text and rate of annunciation | |
Wagner | The role of prosody in laryngeal neutralization | |
Gates | Forging an American poetry from speech rhythms: Williams after Whitman | |
Gustafson | Transcribing names with foreign origin in the ONOMASTICA project | |
Belhoula | A concept for the synthesis of names | |
Ramsey | The invention of the alphabet and the history of the Korean language |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| EEER | Examination request | |
| MKEX | Expiry | Effective date: 20140318 |