EP1273003A1 - Verfahren und vorrichtung zum bestimmen prosodischer markierungen - Google Patents
Verfahren und Vorrichtung zum Bestimmen prosodischer Markierungen
- Publication number
- EP1273003A1 (application number EP01940136A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- neural network
- prosodic
- autoassociators
- input
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
- G10L13/10—Prosody rules derived from text; Stress or intonation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Definitions
- the present invention relates to a method for determining prosodic markings and an apparatus for implementing the method.
- for speech synthesis systems, an essential step is the preparation and structuring of the text for the subsequent generation of the prosody. A two-stage approach is used to generate prosodic parameters for speech synthesis systems: in the first stage, prosodic markers or prosodic markings are generated, which are then converted into physical parameters in the second stage.
- phrase boundaries and pitch accents can serve as prosodic markings.
- Phrases are understood to mean groups of words that are generally spoken together within a text, that is, without any interruptions in speech. There are pauses in speech only at the respective ends of the phrases, the phrase boundaries. By inserting such pauses at the phrase boundaries of the synthesized speech, its intelligibility and naturalness are significantly increased.
- in stage 1 of such a two-stage approach, both the stable prediction or determination of phrase boundaries and that of accents pose problems.
- prosodic markings are determined by a neural network on the basis of linguistic categories.
- depending on the respective language of a text, subdivisions of the words into different linguistic categories are known. For the German language, for example, 14 categories are provided for this invention; for the English language, for example, 23 categories. Knowing these categories, a neural network is trained so that it can recognize structures and thus predict or determine a prosodic marker on the basis of groups of, for example, 3 to 15 consecutive words.
- for a method according to the invention, a two-stage approach is chosen which comprises the detection of the properties of each prosodic marker by neural autoassociators and the evaluation, in a neural classifier, of the detailed output information of each autoassociator, known as the error vector.
- the use of neural networks makes it possible to precisely predict phrase boundaries when generating prosodic parameters for speech synthesis systems.
- the neural network according to the invention is robust against a small amount of training material (sparse training material).
- neural networks allow time-saving and cost-saving training methods and a flexible application of a method according to the invention and a corresponding device in any language. Little additional information and little expert knowledge are required to initialize such a system in a particular language.
- the neural network according to the invention is therefore well suited to synthesize texts from several languages using a multilingual TTS system. Since the neural networks according to the invention can be trained without expert knowledge, they can be initialized more cost-effectively than known methods for determining phrase boundaries.
- the two-stage structure comprises a plurality of autoassociators, each of which is trained on a phrasing level for all linguistic classes to be evaluated.
- the subnetworks of the neural network formed in this way are class-specific.
- the training material is generally statistically asymmetrical, i.e. there are many words without phrase boundaries, but only a few with phrase boundaries. In contrast to the state of the art
- FIG. 1 schematically shows a neural network according to the invention
- FIG. 3 shows an example of an output with ternary evaluation of the phrasing using a text example
- FIG. 5 schematically shows an autoassociator during training (left) and during operation (right);
- FIG. 6 schematically shows a block diagram of the neural network according to FIG. 4 with the mathematical relationships
- FIG. 7 schematically shows an extended autoassociator
- FIG. 8 shows a computer system for executing the method according to the invention in a block diagram.
- in FIG. 1, a neural network unit 1 according to the invention with an input 2, an intermediate layer 3 and an output 4 for determining prosodic markers is shown schematically.
- the input 2 is constructed from nine input groups 5 for carrying out a 'part-of-speech' (POS) sequence examination.
- each input group 5 comprises, in adaptation to the German language, 14 neurons 6, which are not all shown in FIG. 1 for reasons of clarity. There is thus one neuron 6 for each of the linguistic categories.
- the linguistic categories are divided, for example, as follows:
- the output 4 is formed by a neuron with a continuous characteristic, which means that the output values can assume all values of a certain number range, for example all real numbers between 0 and 1.
- nine input groups 5 are provided for entering the categories of the individual words.
- the category of the word for which it is to be determined whether there is a phrase boundary or no phrase boundary at the end of the word is applied to the middle input group 5a.
- the categories of the forerunners of the word to be examined are applied to the four input groups 5b on the left side of input group 5a, and those of the successors to the input groups 5c arranged on the right side.
- Forerunners are all words that are arranged in the context immediately before the word to be examined.
- Successors are all words that are arranged in the context immediately following the word to be examined. In this way, a context of at most nine words is evaluated with the inventive neural network 1 according to FIG. 1.
- the category of the word to be examined is applied to the input group 5a; that is to say, the value +1 is applied to the neuron 6 which corresponds to the category of the word, and the value -1 to the other neurons 6 of the input group 5a.
- the categories of the four words preceding or following the word to be examined are correspondingly applied to the input groups 5b and 5c. If there are no corresponding forerunners or successors, e.g. at the beginning and at the end of a text, the value 0 is applied to the neurons 6 of the corresponding input groups 5b, 5c.
- Another input group 5d is provided for entering the previous phrase boundaries.
- the last nine phrase boundaries can be entered at this input group 5d.
- a corresponding subdivision of the linguistic categories of the English language comprises 23 categories, so that the dimension of the input space is 216.
- the input data form an input vector x with the dimension m.
- the neural network according to the invention is trained with a training file which comprises a text and the information on the phrase boundaries of the text. These phrase boundaries can contain purely binary values, that is to say only information as to whether there is a phrase boundary or whether there is no phrase boundary. If the neural network is trained with such a training file, the output at output 4 is binary.
- the output 4 generates continuous output values per se, which, however, are assigned to discrete values by means of a threshold value decision.
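- such a threshold value decision can be sketched as follows (a minimal illustration; the threshold 0.5 is an assumed value, as the text does not specify it):

```python
def to_binary(output_value, threshold=0.5):
    """Map the continuous value of output 4 to a discrete phrase-boundary
    decision; the threshold 0.5 is an assumption for illustration."""
    return 1 if output_value >= threshold else 0

# continuous network outputs -> discrete phrase-boundary decisions
decisions = [to_binary(v) for v in (0.12, 0.81, 0.49, 0.5)]
```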
- FIG. 2 shows an example sentence which has a phrase boundary after the terms "word" and "phrase boundary". There is no phrase boundary after the other words in this example sentence.
- the output contains not only binary values, but also multi-level values; that is to say, information about the strength of the phrase boundary is taken into account.
- the neural network is to be trained with a training file which comprises multi-level information on the phrase boundaries.
- the gradation can comprise any number of stages from two upwards, so that a quasi-continuous output can be achieved.
- FIG. 3 shows an example sentence with a three-stage evaluation, with the output values 0 for no phrase boundary, 1 for a primary phrase boundary and 2 for a secondary phrase boundary. There is a secondary phrase boundary after the term "secondary" and a primary phrase boundary after the terms "phrase boundary" and "required".
- FIG. 4 shows a preferred embodiment of the neural network according to the invention.
- This neural network in turn comprises an input 2, which is shown only schematically as an element in FIG. 4, but is constructed in exactly the same way as input 2 from FIG. 1.
- the intermediate layer 3 consists of a plurality of autoassociators 7 (AA1, AA2, AA3), each of which represents a model for a predetermined phrasing level.
- the autoassociators 7 are subnetworks that are trained to detect a specific phrasing level.
- the output of the autoassociators 7 is connected to a classifier 8.
- the classifier 8 is a further neural subnetwork, which also includes the output already described with reference to FIG. 1.
- the exemplary embodiment shown in FIG. 4 comprises three autoassociators, each of which can detect a specific phrasing strength, so that this exemplary embodiment is suitable for detecting two different phrasing strengths and the absence of a phrase boundary.
- Each autoassociator is trained with the data of the class it represents. This means that each autoassociator is trained with the data belonging to the phrasing strength it represents.
- the autoassociators map the m-dimensional input vector x to an n-dimensional vector z, where n < m applies.
- the vector z is mapped to an output vector x '.
- the mappings are made using matrices W1 ∈ R^(n×m) and W2 ∈ R^(m×n).
- the entire mapping performed by the autoassociators can be represented by the following formula: x' = W2 · tanh(W1 · x).
- the autoassociators are trained in such a way that their output vectors x' correspond as exactly as possible to the input vectors x (FIG. 5, left side). As a result, the information of the m-dimensional input vector x is compressed into the n-dimensional vector z. It is assumed that no information is lost and that the model captures the properties of the class.
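- the compression and reconstruction performed by an autoassociator can be sketched with NumPy as follows (a minimal illustration with random, untrained weights; in operation W1 and W2 are learned so that x' ≈ x for inputs of the autoassociator's class):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 126, 20                            # input dimension m, compressed dimension n, n < m

W1 = rng.standard_normal((n, m)) * 0.1    # R^(n x m): compresses x to z
W2 = rng.standard_normal((m, n)) * 0.1    # R^(m x n): reconstructs x' from z

def autoassociate(x):
    """x' = W2 . tanh(W1 . x); training drives x' toward x for in-class inputs."""
    z = np.tanh(W1 @ x)                   # n-dimensional compressed representation
    return W2 @ z                         # m-dimensional reconstruction

x = rng.standard_normal(m)
x_rec = autoassociate(x)
e_rec = x - x_rec                         # error vector: distance of x' from x
```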
- the compression ratio m:n of the individual autoassociators can be different.
- This error vector e_rec is a "distance measure" which corresponds to the distance of the vector x' from the input vector x and is therefore inversely related to the probability that the phrase boundary assigned to the respective autoassociator is present.
- the complete neural network comprising the autoassociators and the classifier is shown schematically in FIG. 6. It shows autoassociators 7 for k classes.
- the elements p_i of the output vector p are calculated according to the following formula: p_i = (x − A_i(x))^T · diag(w_1^i, ..., w_m^i) · (x − A_i(x)), where A_i(x) = W_2^i · tanh(W_1^i · x), tanh is performed as an element-wise operation, and diag(w_1^i, ..., w_m^i) ∈ R^(m×m) is a diagonal matrix with the elements (w_1^i, ..., w_m^i).
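- under the reading that each element p_i is a weighted squared reconstruction error, the computation can be sketched as follows (a minimal illustration; `class_score` is a hypothetical name, and the toy autoassociators are stand-ins for trained models):

```python
import numpy as np

def class_score(x, A_i, w_diag):
    """p_i = (x - A_i(x))^T . diag(w_diag) . (x - A_i(x)): a weighted squared
    reconstruction error of autoassociator i."""
    e = x - A_i(x)
    return float(e @ (w_diag * e))        # diag(w) . e computed element-wise

# Toy illustration: an 'identity' autoassociator reconstructs x perfectly (score 0),
# while a degenerate one that always returns zero leaves the full error.
x = np.array([0.5, -1.0, 2.0])
w = np.array([1.0, 1.0, 1.0])
p_perfect = class_score(x, lambda v: v, w)
p_bad = class_score(x, lambda v: np.zeros_like(v), w)   # ||x||^2 = 5.25
```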
- the individual elements p_i of the output vector p indicate the probability with which a phrase boundary was detected by the autoassociator i.
- if the output vector p has more than two elements p_i, it is expedient to evaluate the output vector p in such a way that the phrase boundary whose probability p_i is greatest in comparison to the other probabilities of the output vector p is taken to be present.
- if a phrase boundary is determined whose probability p_i lies in the range around 0.5, e.g. in the range from 0.4 to 0.6, it is expedient to carry out a further routine with which the presence of the phrase boundary is checked.
- This further routine can be based on a rule-driven as well as a data-driven approach.
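- the check in the uncertainty band can be sketched as follows (a minimal illustration; `check_by_rules` is a hypothetical placeholder for the further rule- or data-driven routine):

```python
def decide_boundary(p, check_by_rules, low=0.4, high=0.6):
    """Accept the network's decision outside the uncertain band around 0.5;
    inside the band, defer to a further rule- or data-driven routine."""
    if low <= p <= high:
        return check_by_rules()
    return p > 0.5

d1 = decide_boundary(0.9, check_by_rules=lambda: False)   # confident: boundary present
d2 = decide_boundary(0.45, check_by_rules=lambda: False)  # uncertain: routine decides
```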
- when training with a training file that includes appropriate phrasing information, the individual autoassociators 7 are trained in a first training phase on their predetermined phrasing strength. As indicated above, the input vectors x corresponding to the phrase boundary assigned to the respective autoassociator are applied to the individual autoassociators 7, in this case on both the input and the output side.
- in a second training phase, the weighting elements of the autoassociators 7 are fixed and the classifier 8 is trained.
- the error vectors e rec of the autoassociators are applied to the input side of the classifier 8 and the vectors containing the values for the different phrase boundaries are applied to the output side.
- in this way, the classifier learns to determine the output vectors p from the error vectors.
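- the two training phases can be sketched schematically as follows (a minimal illustration; `train_autoassociator` and `train_classifier` are hypothetical stand-ins for any gradient-based training procedure):

```python
def two_phase_training(class_data, boundary_targets,
                       train_autoassociator, train_classifier):
    # Phase 1: one autoassociator per phrase-boundary class, trained only on
    # the data of its class (input and target are both x).
    autoassociators = [train_autoassociator(xs) for xs in class_data]

    # Phase 2: the autoassociator weights are left untouched ("fixed"); only
    # the classifier adapts, learning to map error vectors to the targets p.
    def error_vectors(x):
        return [x - A(x) for A in autoassociators]

    classifier = train_classifier(error_vectors, boundary_targets)
    return autoassociators, classifier

# Toy usage with trivial stand-ins for the training procedures:
aas, clf = two_phase_training(
    class_data=[[1.0], [2.0], [3.0]],
    boundary_targets=None,
    train_autoassociator=lambda xs: (lambda x: 0.0),  # "trained" AA reconstructs 0
    train_classifier=lambda ev, t: ev,                # "classifier" = raw error map
)
errors = clf(5.0)   # error of x=5.0 w.r.t. each of the three autoassociators
```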
- the architecture of a neural network described above, with several models each trained for a specific class (here: the autoassociators) and a higher-level classifier, makes it possible to reliably map an input vector with a very large dimension to an output vector with a smaller dimension, or to a scalar.
- This network architecture can also be used advantageously in other applications in which elements of different classes have to be treated. For example, it may be expedient to also use this network architecture in speech recognition to detect word and/or sentence boundaries. The input data must be adjusted accordingly.
- the classifier 8 shown in FIG. 6 has weighting matrices GW, each of which is assigned to an autoassociator 7.
- the weighting matrix GW assigned to the i-th autoassociator 7 has weighting factors w_i in the i-th row; the remaining elements of the matrix are zero.
- the number of weighting factors corresponds to the dimension of the input vector, with each weighting element being related to one component of the input vector. If a weighting element has a larger value than the other weighting elements of the matrix, this means that the corresponding component of the input vector is of great importance for determining the phrase boundary that is detected by the autoassociator to which the corresponding weighting matrix GW is assigned.
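- the interpretation of the weighting factors can be sketched as follows (a minimal illustration with invented values):

```python
import numpy as np

# The non-zero row of the weighting matrix GW assigned to one autoassociator;
# the values are invented for illustration.
w_row = np.array([0.1, 0.05, 0.9, 0.2])

# The largest weighting factor marks the input-vector component that matters
# most for the phrase boundary detected by this autoassociator.
most_important_component = int(np.argmax(w_row))
```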
- extended autoassociators are used (FIG. 7), which allow better detection of nonlinearities.
- extended autoassociators do the following:
- a neural network according to the invention has been trained with a predetermined English text. The same text was used to train an HMM recognizer.
- the performance criteria determined in operation were the percentage of correctly recognized phrase boundaries (B-corr), the overall correctly assessed words, regardless of whether a phrase boundary follows or not (total), and the incorrectly recognized words without a phrase boundary (NB-ncorr).
- B-corr percentage of correctly recognized phrase boundaries
- NB-ncorr incorrectly recognized words without phrase boundary
- the results shown in the table show that the neural networks according to the invention provide approximately the same results as the HMM recognizer with regard to the correctly recognized phrase boundaries and the overall correctly recognized words.
- the neural networks according to the invention are significantly better than the HMM recognizer with regard to phrase boundaries incorrectly detected in places where there is in fact no phrase boundary. This type of error is particularly serious in text-to-speech conversion, since these errors produce an incorrect emphasis that is immediately apparent to the listener.
- one of the neural networks according to the invention was trained with a fraction of the basic text used in the above experiments (5%, 10%, 30%, 50%). The following results were achieved:
- the exemplary embodiment described above has k autoassociators.
- the neural networks described above are implemented as computer programs that run independently on a computer to convert the linguistic categories of a text into its prosodic markers. They therefore constitute an automatically executable method.
- the computer program can also be stored on an electronically readable data carrier and can thus be transferred to another computer system.
- a computer system suitable for using the method according to the invention is shown in FIG. 8.
- the computer system 9 has an internal bus 10 which is connected to a memory area 11, a central processor unit 12 and an interface 13.
- the interface 13 establishes a data connection to further computer systems via a data line 14.
- an acoustic output unit 15, a graphic output unit 16 and an input unit 17 are also connected to the internal bus.
- the acoustic output unit 15 is equipped with a loudspeaker 18, the graphic output unit 16 with a screen 19, and the input unit 17 with a keyboard 20. Text can be transmitted to the computer system 9 via the data line 14 and the interface 13 and is stored in the memory area 11.
- the memory area 11 is subdivided into a plurality of areas in which texts, audio files, application programs for carrying out the method according to the invention and further application and auxiliary programs are stored.
- the texts saved as a text file are analyzed with predetermined program packages and the respective linguistic categories of the words are determined.
- the prosodic markers are then determined from the linguistic categories using the method according to the invention. These prosodic markers are in turn entered in a further program package which uses the prosodic markers to generate audio files, which are transmitted via the internal bus 10 to the acoustic output unit 15 and are output by the latter at the loudspeaker 18 as speech.
- the method can also be used to evaluate an unknown text with regard to a prediction of stresses, e.g. according to the internationally standardized ToBI labels (tones and break indices), and/or of the melody. These adjustments have to be made depending on the respective language of the text to be processed, since prosody is always language-specific.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE10018134A DE10018134A1 (de) | 2000-04-12 | 2000-04-12 | Verfahren und Vorrichtung zum Bestimmen prosodischer Markierungen |
DE10018134 | 2000-04-12 | ||
PCT/DE2001/001394 WO2001078063A1 (de) | 2000-04-12 | 2001-04-09 | Verfahren und vorrichtung zum bestimmen prosodischer markierungen |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1273003A1 true EP1273003A1 (de) | 2003-01-08 |
EP1273003B1 EP1273003B1 (de) | 2005-12-07 |
Family
ID=7638473
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP01940136A Expired - Lifetime EP1273003B1 (de) | 2000-04-12 | 2001-04-09 | Verfahren und vorrichtung zum bestimmen prosodischer markierungen |
Country Status (4)
Country | Link |
---|---|
US (1) | US7409340B2 (de) |
EP (1) | EP1273003B1 (de) |
DE (2) | DE10018134A1 (de) |
WO (1) | WO2001078063A1 (de) |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10207875A1 (de) * | 2002-02-19 | 2003-08-28 | Deutsche Telekom Ag | Parametergesteuerte Sprachsynthese |
US20060293890A1 (en) * | 2005-06-28 | 2006-12-28 | Avaya Technology Corp. | Speech recognition assisted autocompletion of composite characters |
US20070055526A1 (en) * | 2005-08-25 | 2007-03-08 | International Business Machines Corporation | Method, apparatus and computer program product providing prosodic-categorical enhancement to phrase-spliced text-to-speech synthesis |
US7860705B2 (en) * | 2006-09-01 | 2010-12-28 | International Business Machines Corporation | Methods and apparatus for context adaptation of speech-to-speech translation systems |
JP4213755B2 (ja) * | 2007-03-28 | 2009-01-21 | 株式会社東芝 | 音声翻訳装置、方法およびプログラム |
WO2011007627A1 (ja) * | 2009-07-17 | 2011-01-20 | 日本電気株式会社 | 音声処理装置および方法ならびに記憶媒体 |
TWI573129B (zh) * | 2013-02-05 | 2017-03-01 | 國立交通大學 | 編碼串流產生裝置、韻律訊息編碼裝置、韻律結構分析裝置與語音合成之裝置及方法 |
US9195656B2 (en) | 2013-12-30 | 2015-11-24 | Google Inc. | Multilingual prosody generation |
CN105374350B (zh) * | 2015-09-29 | 2017-05-17 | 百度在线网络技术(北京)有限公司 | 语音标注方法及装置 |
US20180018973A1 (en) | 2016-07-15 | 2018-01-18 | Google Inc. | Speaker verification |
EP3822863B1 (de) * | 2016-09-06 | 2022-11-02 | DeepMind Technologies Limited | Erzeugung von audio mit neuronalen netzwerken |
US11080591B2 (en) | 2016-09-06 | 2021-08-03 | Deepmind Technologies Limited | Processing sequences using convolutional neural networks |
CN109844773B (zh) | 2016-09-06 | 2023-08-01 | 渊慧科技有限公司 | 使用卷积神经网络处理序列 |
CN110023963B (zh) | 2016-10-26 | 2023-05-30 | 渊慧科技有限公司 | 使用神经网络处理文本序列 |
KR102071582B1 (ko) * | 2017-05-16 | 2020-01-30 | 삼성전자주식회사 | 딥 뉴럴 네트워크(Deep Neural Network)를 이용하여 문장이 속하는 클래스(class)를 분류하는 방법 및 장치 |
CN109492223B (zh) * | 2018-11-06 | 2020-08-04 | 北京邮电大学 | 一种基于神经网络推理的中文缺失代词补全方法 |
CN111354333B (zh) * | 2018-12-21 | 2023-11-10 | 中国科学院声学研究所 | 一种基于自注意力的汉语韵律层级预测方法及系统 |
CN111508522A (zh) * | 2019-01-30 | 2020-08-07 | 沪江教育科技(上海)股份有限公司 | 一种语句分析处理方法及系统 |
US11610136B2 (en) * | 2019-05-20 | 2023-03-21 | Kyndryl, Inc. | Predicting the disaster recovery invocation response time |
KR20210099988A (ko) * | 2020-02-05 | 2021-08-13 | 삼성전자주식회사 | 뉴럴 네트워크의 메타 학습 방법 및 장치와 뉴럴 네트워크의 클래스 벡터 학습 방법 및 장치 |
CN112786023A (zh) * | 2020-12-23 | 2021-05-11 | 竹间智能科技(上海)有限公司 | 标记模型构建方法及语音播报系统 |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2764343B2 (ja) * | 1990-09-07 | 1998-06-11 | 富士通株式会社 | 節/句境界抽出方式 |
WO1995002879A1 (en) * | 1993-07-13 | 1995-01-26 | Theodore Austin Bordeaux | Multi-language speech recognition system |
CN1057625C (zh) * | 1994-04-28 | 2000-10-18 | 摩托罗拉公司 | 使用神经网络变换文本为声频信号的方法 |
JP3536996B2 (ja) * | 1994-09-13 | 2004-06-14 | ソニー株式会社 | パラメータ変換方法及び音声合成方法 |
US5950162A (en) * | 1996-10-30 | 1999-09-07 | Motorola, Inc. | Method, device and system for generating segment durations in a text-to-speech system |
BE1011892A3 (fr) * | 1997-05-22 | 2000-02-01 | Motorola Inc | Methode, dispositif et systeme pour generer des parametres de synthese vocale a partir d'informations comprenant une representation explicite de l'intonation. |
US6134528A (en) * | 1997-06-13 | 2000-10-17 | Motorola, Inc. | Method device and article of manufacture for neural-network based generation of postlexical pronunciations from lexical pronunciations |
-
2000
- 2000-04-12 DE DE10018134A patent/DE10018134A1/de not_active Ceased
-
2001
- 2001-04-09 DE DE50108314T patent/DE50108314D1/de not_active Expired - Lifetime
- 2001-04-09 EP EP01940136A patent/EP1273003B1/de not_active Expired - Lifetime
- 2001-04-09 WO PCT/DE2001/001394 patent/WO2001078063A1/de active IP Right Grant
-
2003
- 2003-01-27 US US10/257,312 patent/US7409340B2/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See references of WO0178063A1 * |
Also Published As
Publication number | Publication date |
---|---|
WO2001078063A1 (de) | 2001-10-18 |
DE50108314D1 (de) | 2006-01-12 |
US20030149558A1 (en) | 2003-08-07 |
US7409340B2 (en) | 2008-08-05 |
EP1273003B1 (de) | 2005-12-07 |
DE10018134A1 (de) | 2001-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1273003A1 (de) | Verfahren und vorrichtung zum bestimmen prosodischer markierungen | |
DE69908047T2 (de) | Verfahren und System zur automatischen Bestimmung von phonetischen Transkriptionen in Verbindung mit buchstabierten Wörtern | |
DE60111329T2 (de) | Anpassung des phonetischen Kontextes zur Verbesserung der Spracherkennung | |
DE602004012909T2 (de) | Verfahren und Vorrichtung zur Modellierung eines Spracherkennungssystems und zur Schätzung einer Wort-Fehlerrate basierend auf einem Text | |
DE3337353C2 (de) | Sprachanalysator auf der Grundlage eines verborgenen Markov-Modells | |
DE60216069T2 (de) | Sprache-zu-sprache erzeugungssystem und verfahren | |
DE60126564T2 (de) | Verfahren und Anordnung zur Sprachsysnthese | |
DE60020434T2 (de) | Erzeugung und Synthese von Prosodie-Mustern | |
DE19825205C2 (de) | Verfahren, Vorrichtung und Erzeugnis zum Generieren von postlexikalischen Aussprachen aus lexikalischen Aussprachen mit einem neuronalen Netz | |
DE602005002706T2 (de) | Verfahren und System für die Umsetzung von Text-zu-Sprache | |
DE3416238C2 (de) | Extremschmalband-Übertragungssystem und Verfahren für eine Übertragung von Nachrichten | |
DE69818161T2 (de) | Automatisierte Gruppierung von sinnvollen Sätzen | |
DE60203705T2 (de) | Umschreibung und anzeige eines eingegebenen sprachsignals | |
DE60222249T2 (de) | Spracherkennungsystem mittels impliziter sprecheradaption | |
EP1466317B1 (de) | Betriebsverfahren eines automatischen spracherkenners zur sprecherunabhängigen spracherkennung von worten aus verschiedenen sprachen und automatischer spracherkenner | |
DE60004420T2 (de) | Erkennung von Bereichen überlappender Elemente für ein konkatenatives Sprachsynthesesystem | |
DE10306599B4 (de) | Benutzeroberfläche, System und Verfahren zur automatischen Benennung von phonischen Symbolen für Sprachsignale zum Korrigieren von Aussprache | |
DE60313706T2 (de) | Spracherkennungs- und -antwortsystem, Spracherkennungs- und -antwortprogramm und zugehöriges Aufzeichnungsmedium | |
DE19942178C1 (de) | Verfahren zum Aufbereiten einer Datenbank für die automatische Sprachverarbeitung | |
DE4310190A1 (de) | Sprecher-Verifizierungssystem unter Verwendung der Abstandsmessung nächster Nachbarn | |
EP0987683A2 (de) | Spracherkennungsverfahren mit Konfidenzmassbewertung | |
DE69727046T2 (de) | Verfahren, vorrichtung und system zur erzeugung von segmentzeitspannen in einem text-zu-sprache system | |
DE112006000322T5 (de) | Audioerkennungssystem zur Erzeugung von Antwort-Audio unter Verwendung extrahierter Audiodaten | |
EP1058235B1 (de) | Wiedergabeverfahren für sprachgesteuerte Systeme mit text-basierter Sprachsynthese | |
EP1282897B1 (de) | Verfahren zum erzeugen einer sprachdatenbank für einen zielwortschatz zum trainieren eines spracherkennungssystems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20021002 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE TR |
|
RBV | Designated contracting states (corrected) |
Designated state(s): DE FR GB IT |
|
17Q | First examination report despatched |
Effective date: 20040728 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
GRAC | Information related to communication of intention to grant a patent modified |
Free format text: ORIGINAL CODE: EPIDOSCIGR1 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): DE FR GB IT |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REF | Corresponds to: |
Ref document number: 50108314 Country of ref document: DE Date of ref document: 20060112 Kind code of ref document: P |
|
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) |
Effective date: 20060118 |
|
ET | Fr: translation filed | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20060908 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20110427 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20110419 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: IT Payment date: 20110422 Year of fee payment: 11 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20110620 Year of fee payment: 11 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20120409 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20121228 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20120409 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R119 Ref document number: 50108314 Country of ref document: DE Effective date: 20121101 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20120430 Ref country code: IT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20120409 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20121101 |