United States Patent [19]
Tresp et al.
[11] Patent Number: 5,806,053
[45] Date of Patent:
METHOD FOR TRAINING A NEURAL NETWORK WITH THE
NON-DETERMINISTIC BEHAVIOR OF A TECHNICAL SYSTEM
 Inventors: Volker Tresp; Reimar Hofmann, both of Munich, Germany
 Assignee: Siemens Aktiengesellschaft, Munich, Germany
 Appl. No.: 705,834
 Filed: Aug. 30, 1996
 Foreign Application Priority Data
Aug. 30, 1995 [DE] Germany 195 31 967.2
Int. Cl.6 G06F 15/18
U.S. Cl. 706/23; 706/22
 Field of Search 395/22, 23, 24,
 References Cited
U.S. PATENT DOCUMENTS
5,159,660 10/1992 Lu et al 395/22
5,396,415 3/1995 Konar et al 364/162
5,600,753 2/1997 Iso 395/2.09
5,649,064 7/1997 Jorgensen et al 395/22
FOREIGN PATENT DOCUMENTS
41 38 053 5/1993 Germany .
OTHER PUBLICATIONS
Xianzhong Cui and Kang G. Shin, "Direct Control and Coordination Using Neural Networks," IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, No. 3, May/Jun. 1993, pp. 686-697.
Shynk et al., "A Stochastic Training Model for Perceptron Algorithms," IJCNN-91-Seattle, vol. 1, pp. 779-784, Jul. 1991.
Primary Examiner—Allen R. MacDonald
Assistant Examiner—Jagdish Patel
Attorney, Agent, or Firm—Hill & Simpson
ABSTRACT
In a method for training a neural network with the non-deterministic behavior of a technical system, weightings for the neurons of the neural network are set during the training using a cost function. The cost function evaluates a beneficial system behavior of the technical system to be modeled, and thereby intensifies or increases the weighting settings which contribute to the beneficial system behavior, and attenuates or minimizes weightings which produce a non-beneficial behavior. Arbitrary or random disturbances are generated by disturbing the manipulated variable with noise having a known noise distribution, these random disturbances significantly facilitating the mathematical processing of the weightings which are set, because the terms required for that purpose are simplified. The correct weighting setting for the neural network is thus found on the basis of a statistical method and the application of a cost function to the values emitted by the technical system or its model.
7 Claims, 1 Drawing Sheet
METHOD FOR TRAINING A NEURAL NETWORK WITH THE
NON-DETERMINISTIC BEHAVIOR OF A TECHNICAL SYSTEM
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention is directed to a method for neural modeling of dynamic processes, with the goal of training the neural network to be able to control processes having a high proportion of stochastic events.
Description of the Prior Art
Neural networks are being introduced into a large variety of technical fields. Neural networks prove especially suitable wherever it is important to derive decisions from complex technical relationships and from inadequate information. For forming one or more output quantities, one or more input quantities, for example, are supplied to the neural network. To this end, such a network is first trained for the specific application, is subsequently generalized, and is then validated with data differing from the training data. Neural networks prove especially suitable for many applications since they can be universally trained.
A problem that often arises in conjunction with the use of neural networks, however, is that the input data for the training or during operation of the network are often not complete. This situation, as well as the fact that the measured values which are supplied to the neural network for constructing a time series are often imprecise or noise-infested, can cause degraded training results of the networks. In the case of processes having a high proportion of stochastic events, a particular problem is that the training data have random character, and heretofore no suitable method has existed for training neural networks with the behavior of such systems.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a training method for improving the learning process of a neural network during training thereof, which is capable of training the neural network to the behavior of a technical system having a high proportion of stochastic events.
This object is achieved in accordance with the principles of the present invention in a method for training a neural network using training data having weightings which are set during the training using a cost function. The cost function evaluates a beneficial system behavior of the technical system to be modeled, and thereby intensifies or increases the weighting settings which contribute to the beneficial system behavior, and attenuates or minimizes weightings which produce a non-beneficial behavior. Arbitrary or random disturbances are generated by disturbing the manipulated variable with noise having a known noise distribution, these random disturbances significantly facilitating the mathematical processing of the weightings which are set, because the terms required for that purpose are simplified. The correct weighting setting for the neural network is thus found on the basis of a statistical method and the application of a cost function to the values emitted by the technical system or its model.
Neural networks can be advantageously trained with the behavior of technical systems that have substantially completely stochastic behavior with this method, because the inventive method makes use of statistical methods for evaluating the input data when training the neural network. The
manipulated variable data are varied for this purpose using noise having a known statistical distribution for generating a new controlled variable of the technical system. By frequent repetition of this procedure, and an evaluation of the controlled variable of the technical system on the basis of the cost function, weightings which achieve an improvement in the behavior of the technical system relative to a desired reference behavior are more heavily weighted using the cost function, so that an optimum weighting setting of the neural network can be achieved. Known methods for training neural networks can be employed for setting the weightings with reference to error gradients.
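The procedure described above can be sketched in miniature. In the following illustration, every name and the toy system are assumptions made for the example, not taken from the patent: a single weighting w produces the manipulated variable, the disturbed manipulated variable drives a stand-in technical system, and the cost of the resulting controlled variable reinforces beneficial weightings through a score-function gradient estimate.

```python
import random

random.seed(0)

def system(u):
    """Stand-in technical system (or its model): maps the manipulated
    variable u to the controlled variable. Purely illustrative."""
    return 0.5 * u

def cost(y, y_ref=1.0):
    """Cost function evaluating the system behavior against a reference."""
    return (y - y_ref) ** 2

def train(w=0.0, x=1.0, sigma=0.2, lr=0.05, steps=300, n_rollouts=64):
    """Repeatedly disturb the manipulated variable with noise of known
    (Gaussian) distribution, score the controlled variable with the cost
    function, and adjust the weighting toward disturbances that lowered
    the cost."""
    for _ in range(steps):
        u = w * x                              # nominal manipulated variable
        grad = 0.0
        for _ in range(n_rollouts):
            eps = random.gauss(0.0, sigma)     # disturbance, known distribution
            c = cost(system(u + eps))          # evaluate the disturbed behavior
            # Score-function estimate of the error gradient: the factor
            # eps / sigma**2 is all the noise model contributes.
            grad += c * (eps / sigma ** 2) * x
        w -= lr * grad / n_rollouts
    return w
```

With this toy system the optimum weighting is w = 2 (so that y = 0.5 · w · x matches y_ref = 1), and the loop settles near it without ever differentiating the system itself, which is the point of the statistical scheme.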
The number of time series to be registered for training the neural network can be varied, thereby providing the operator with the possibility of influencing the precision of the setting of the weightings of the neural network dependent on the calculating time or calculating capacity available to the operator.
A number of time series can be acquired by modeling, or by employing the real technical system itself, and their averages can then be employed for training the neural network, since a better statistical basis for the accuracy of the training values is thereby achieved.
A Gaussian distribution can be employed as the known noise distribution for varying the manipulated variable when training the neural network, since the error gradient for training the neural network can thus be calculated in a particularly simple manner.
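The simplification afforded by the Gaussian choice can be made explicit. The following is a standard score-function identity stated for illustration, not quoted from the patent: for a disturbance added to the manipulated variable u, the derivative of the Gaussian log-density with respect to u collapses to a linear term, so the error gradient needs only the observed cost and the known disturbance.

```latex
% Disturbance \epsilon \sim \mathcal{N}(0,\sigma^2) added to the manipulated variable u;
% C denotes the cost of the resulting controlled variable.
\nabla_u \, \mathbb{E}_{\epsilon}\!\left[ C(u+\epsilon) \right]
  \;=\; \mathbb{E}_{\epsilon}\!\left[ C(u+\epsilon)\,\frac{\epsilon}{\sigma^2} \right],
\qquad \text{since} \qquad
\frac{\partial}{\partial u}\,\log p(u+\epsilon \mid u) \;=\; \frac{\epsilon}{\sigma^2}.
```

Each weighting update therefore reduces to averaging the measured cost times a simple linear function of the known noise, which is why the Gaussian case is described as particularly simple.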
A number of time series can be simulated and measured, since conclusions about the behavior of the manipulated variable of the technical system thus can be obtained under various conditions, and the statistics of the time series are thereby improved. An advantage of the inventive method is that not only can the manipulated variable be superimposed with noise having a known distribution, but also the controlled variable can be superimposed with noise having a known distribution, without degrading the learning behavior of the neural network.
The inventive method operates equally as well with the technical system itself, or using a model of the technical system. For simplicity, therefore, as used herein the term "technical system" will be understood as meaning the technical system itself or a model of the technical system.
DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a time series and a system behavior in accordance with the inventive method.
FIG. 2 shows a neural network that is being trained in accordance with the inventive method.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIG. 1 shows a time series of measured values that, for example, can be supplied to a neural network. The explanation associated with FIG. 1 illustrates the basic mathematical principles underlying the inventive method. These measured values are acquired in chronological succession, for example from a technical system, and are referenced y_t through y_{t-6} according to their chronological succession. For example, it is assumed in FIG. 1 that the value y_{t-2} is missing. The relevant values in the Markov blanket, as neighboring values of this missing measured value, are y_{t-4}, y_{t-3}, y_{t-1} and y_t. Such a missing measured value in a time series can arise, for example, because the measuring instrument for registering the values did not function at the point in time in question, or because, in order to train the neural network better, it seems beneficial to supply the neural network with a further value between individual measured values, a value that consequently is yet to be identified, i.e. that is still to be generated according to the inventive method.
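The membership of such a Markov blanket can be enumerated mechanically for an order-N autoregressive time series. The helper below is a sketch; the function name and the offset convention (0 denotes y_t, k denotes y_{t-k}) are assumptions made for the illustration.

```python
def markov_blanket(k, order):
    """Offsets (from t) of the minimum Markov blanket of the missing value
    y_{t-k} in an order-`order` autoregressive time series: the direct
    predecessors, the direct successors, and all direct predecessors of
    the direct successors."""
    parents = {k + i for i in range(1, order + 1)}                 # direct predecessors
    children = {k - i for i in range(1, order + 1) if k - i >= 0}  # direct successors
    coparents = {c + i for c in children for i in range(1, order + 1)}
    return sorted((parents | children | coparents) - {k})
```

For the situation of FIG. 1 (missing value y_{t-2}, consistent with an order-2 model) this yields the offsets 0, 1, 3 and 4, i.e. y_t, y_{t-1}, y_{t-3} and y_{t-4}, matching the neighboring values named above.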
FIG. 1 shows the time series in conjunction with a neural network NNW. It may be seen that y represents a time-dependent variable that represents the system behavior SY of a technical system. As may be seen, the values y_t through y_{t-6} correspond to measured values that are taken from the system behavior SY. The dashed arrows at the respective points in time symbolize that these measured values are to be supplied to the neural network NNW during operation or during training.
As is likewise shown in FIG. 1, the questionable measured value M for the point in time t-2 is not present. The probability density for this measured value M is indicated. For example, the probability density can be back-calculated according to the inventive method from a predetermined, known error distribution density of the remaining measured values. What is thereby particularly exploited is that the missing measured value must be located between two known measured values, and its error is thus also limited by the errors of the neighboring values and the errors of the remaining measured values of the time series. The underlying time series can be described as follows:
y_t = f(y_{t-1}, ..., y_{t-N}) + e_t

wherein the function f is either "known to" the neural network, such as being stored therein or in a memory accessible by the neural network, or is adequately modeled by a neural network. The contribution e_t denotes an additive, uncorrelated error with the chronological average 0. This error, and this is essential for the inventive method, comprises a known or predetermined probability density P_e(e) and typically symbolizes the unmodeled dynamics of the time series. For example, a future value is to be predicted for such a time series that is to be completed according to the inventive method. It should be noted that future values are to be understood as being relative to the time position selected at the moment. This means that for the point in time t-5, the value y_{t-4} constitutes a future value. Under these conditions, the conditional probability density can be described as follows for a value of the time series to be predicted:

P(y_t | y_{t-1}, ..., y_{t-N}) = P_e(y_t - f(y_{t-1}, ..., y_{t-N}))
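For this additive-noise model, the conditional density is simply the known noise density evaluated at the prediction residual. A minimal Python sketch follows, assuming a Gaussian P_e for concreteness; the patent requires only that the noise distribution be known, and the function names here are illustrative.

```python
import math

def conditional_density(y_next, past, f, sigma=1.0):
    """P(y_t | y_{t-1}, ..., y_{t-N}) for the model y_t = f(past) + e_t
    with e_t ~ N(0, sigma^2): the Gaussian density of the residual."""
    residual = y_next - f(past)
    return math.exp(-0.5 * (residual / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
```

A perfect prediction (residual 0) attains the density peak 1/(sigma * sqrt(2 * pi)); the worse the prediction, the lower the assigned density.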
For the prediction, the known measured values are employed as completely as possible. Based on the assumptions that were made above, the entire probability density of the time series can be described as follows:

P(y_t | M_{t-1}) ∝ ∫ ∏_τ P_e(y_τ - f(y_{τ-1}, ..., y_{τ-N})) dy_{t-k}
wherein M_{t-1} stands for all measurements up to the point in time t-1. The above equation is the basic equation for the prediction with missing data. It should be particularly noted that the unknown y_{t-k} is dependent not only on the values of the time series before the point in time t-k, but also on the measurements following t-k. The reason for this is that the variables neighboring y_{t-k} form a minimum Markov blanket of y_{t-k}. This minimum Markov blanket is composed of the direct predecessors and the direct successors of a variable, and of all direct predecessors of the direct successors. In the example under consideration in FIG. 1, the direct successors are y_{t-1} and y_t. The direct predecessors are: