US 7496546 B2 Abstract This invention provides an interconnecting neural network system capable of freely taking a network form for inputting a plurality of input vectors, and of facilitating additional training of an artificial neural network structure. The artificial neural network structure is constructed by interconnecting mutually related RBF elements via weights. Each RBF element outputs an excitation strength according to a similarity between an input vector and a centroid vector based on a radial basis function when the RBF element is excited by the input vector applied from an outside, and outputs a pseudo excitation strength obtained based on the excitation strength output from another RBF element when the RBF element is excited in a chain reaction to the excitation of that connected RBF element.
Claims (19)

1. An interconnecting neural network system comprising:
a neural network unit that includes a plurality of neurons, each of the neurons embodying a weight holding unit, a pointer unit, a duration variable holding unit and an activation time holding unit, each of the neurons outputting an excitation strength according to a similarity between input vectors and centroid vectors based on a kernel function; and
a network control unit that constructs an artificial neural network structure by interconnecting the neurons relating to each other among the neurons in the neural network unit via respective weights,
wherein each of the neurons in the neural network unit outputs an excitation strength according to the similarity between the input vectors and the centroid vectors based on the kernel function when each neuron is excited by the input vector applied from an outside, and outputs a pseudo excitation strength obtained based on the sum of excitation strength outputs from the other neurons when each neuron is excited in a chain reaction to excitation of the other neuron connected to each neuron,
wherein each of the neurons in the neural network unit has a plurality of modalities different from one another, the plurality of modalities of the neurons including auditory modality and visual modality so that a plurality of different input vectors of auditory modality and visual modality are handled simultaneously and independently by the neurons to perform auditory and visual recognition concurrently.
2. The interconnecting neural network system according to
3. The interconnecting neural network system according to
4. The interconnecting neural network system according to
5. The interconnecting neural network system according to
6. The interconnecting neural network system according to
7. The interconnecting neural network system according to
8. A computer-implemented method of constructing an interconnecting neural network structure, the method comprising the steps of:
preparing an artificial neural network structure including a plurality of neurons, each of the neurons embodying a weight holding unit, a pointer unit, a duration variable holding unit and an activation time holding unit, each of the neurons outputting an excitation strength according to a similarity between input vectors and centroid vectors based on a kernel function, the neurons relating to each other interconnected in the artificial neural network structure via respective weights; and
training the weight that connects the neurons to each other, based on the excitation strength of each neuron,
wherein each of the neurons in the artificial neural network structure has a plurality of modalities different from one another, the plurality of modalities of the neurons including auditory modality and visual modality so that a plurality of different input vectors of auditory modality and visual modality are handled simultaneously and independently by the neurons to perform auditory and visual recognition concurrently.
9. The method according to
10. The method according to
11. The method according to
12. A computer readable recording medium storing an interconnecting neural network structure construction program that allows a computer to execute the method according to
13. A computer-implemented method of constructing a self-organizing neural network structure including a plurality of neurons, each of the neurons embodying a weight holding unit, a pointer unit, a duration variable holding unit and an activation time holding unit, each of the neurons outputting an excitation strength according to a similarity between input vectors and centroid vectors based on a kernel function, the neurons relating to each other being autonomously connected in the self-organizing neural network structure based on the input vector, the method comprising:
a first step of adding a neuron, which has input vectors as centroid vectors for a kernel function, into the self-organizing neural network structure as a new neuron based on input vectors that are input first from an outside; and
a second step of repeating the following processings (a) to (c), each of the processings being based on an input vector that is an n-th input vector from the outside, where n is an integer equal to or greater than 2:
(a) the processing of calculating excitation strengths of all the neurons in the self-organizing neural network structure based on the n-th input vector input from the outside;
(b) the processing of adding a neuron, which has the n-th input vector as a centroid vector for a kernel function, into the self-organizing neural network structure as a new neuron in case that it is determined by the processing (a) that there is no neuron excited such that the excitation strength thereof exceeds a predetermined threshold, among one or a plurality of neurons in the self-organizing neural network structure; and
(c) the processing of performing both of or one of formation of a weight that connects the neurons, and training of the formed weight based on the excitation strengths of the neurons in the self-organizing neural network structure;
wherein each of the neurons in the self-organizing neural network structure has a plurality of modalities different from one another, the plurality of modalities of the neurons including auditory modality and visual modality so that a plurality of different input vectors of auditory modality and visual modality are handled simultaneously and independently by the neurons to perform auditory and visual recognition concurrently.
14. The method according to
15. The method according to
16. The method according to
17. A computer readable recording medium storing an interconnecting neural network structure construction program that allows a computer to execute the method according to
18. An interconnecting neural network system comprising:
a plurality of intermediate layer neurons, each of the intermediate layer neurons embodying a weight holding unit, a pointer unit, a duration variable holding unit and an activation time holding unit, each of the intermediate layer neurons outputting an excitation strength according to a similarity between input vectors and centroid vectors based on a kernel function, and each of the intermediate layer neurons using centroid data in a matrix form in light of time series changes as the centroid vector; and
an output layer neuron connected to each of the intermediate layer neurons and outputting a change in the excitation strength output from each intermediate layer neuron at time series,
wherein each of the neurons has a plurality of modalities different from one another, the plurality of modalities of the neurons including auditory modality and visual modality so that a plurality of different input vectors of auditory modality and visual modality are handled simultaneously and independently by the neurons to perform auditory and visual recognition concurrently.
19. The interconnecting neural network system according to
Description

This Nonprovisional application claims priority under 35 U.S.C. § 119(a) on Patent Application No(s). 2003-080940 filed in Japan on Mar. 24, 2003, the entire contents of which are hereby incorporated by reference.

1. Field of the Invention

The present invention relates to an artificial neural network structure. More specifically, the present invention relates to an interconnecting neural network system, an interconnecting neural network structure construction method, and a self-organizing neural network structure construction method having a novel network form that is excellent in structural flexibility and ease of training, as well as construction programs therefor.

2. Related Art

As a conventional artificial neural network structure, an artificial neural network structure that has a fixed network form, such as a layered network that inputs a single input vector, and that adjusts network parameters such as weight vectors is normally known. As a method of adjusting the network parameters, a back-propagation method for iteratively updating network parameters is widely used (D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning internal representations by error propagation," In D. E. Rumelhart and J. L. McClelland (Eds.), "Parallel Distributed Processing: Explorations in the Microstructure of Cognition," Vol. 1, Chapter 8, Cambridge, Mass.: MIT Press, 1986). However, for the conventional artificial neural network structure, an iterative training scheme for iteratively updating network parameters, such as the back-propagation method, is employed as the method of adjusting network parameters.
Due to this, the conventional artificial neural network structure has the following disadvantages: (1) it takes considerable time to update network parameters before the connection between input and output is established; (2) a solution obtained as a result of updating network parameters tends to be a local minimum and it is difficult to obtain a correct solution; and (3) it is difficult to realize a robust additional training method. Furthermore, the conventional artificial neural network structure has disadvantages in that the structure is inferior in network configuration flexibility and no practical, effective method capable of handling a plurality of input vectors is established yet. As a conventional method of handling a plurality of input vectors in the artificial neural network structure, a modular approach for modularizing various neural networks (or agents) and integrating the neural network (or agent) modules is proposed (S. Haykin, “Neural Networks: A Comprehensive Foundation,” Macmillan College Publishing Co. Inc., N.Y., 1994). Even if such an approach is used, an artificial neural network structure having a fixed network form based on an iterative training scheme is used for every network module similarly to the conventional artificial neural network structure. Therefore, the approach is faced with the substantial disadvantages stated above, as well. In this background, the inventor of the present invention proposed a novel neural network structure to which an RBF neural network structure (see S. Haykin, “Neural Networks: A Comprehensive Foundation,” Macmillan College Publishing Co. Inc., N.Y., 1994) for outputting an excitation strength according to a similarity between an input vector and a centroid vector based on a radial basis function (“RBF”) is applied (see Japanese Patent Application No. 2001-291235 (i.e., Japanese Patent Laid-open Publication No. 2003-99756)). 
The inventor of the present invention also proposed a method of realizing a function of a storage chain by linking RBF elements included in the RBF neural network to one another (see Japanese Patent Application No. 2002-100223). The present invention provides a further development of the methods proposed in the Japanese Patent Application Nos. 2001-291235 and 2002-100223. It is an object of the present invention to provide an interconnecting neural network system, an interconnecting neural network system construction method, a self-organizing neural network structure construction method, and construction programs therefor, capable of freely taking a network form for inputting a plurality of input vectors, and of facilitating additional training of an artificial neural network structure. According to a first aspect of the present invention, there is provided an interconnecting neural network system comprising: a neural network unit that includes a plurality of neurons, each of the neurons outputting an excitation strength according to a similarity between an input vector and a centroid vector based on a kernel function; and a network control unit that constructs an artificial neural network structure by interconnecting neurons relating to each other among the neurons in the neural network unit via a weight, wherein each of the neurons in the neural network unit outputs an excitation strength according to a similarity between an input vector and a centroid vector based on a kernel function when each neuron is excited by the input vector applied from an outside, and outputs a pseudo excitation strength obtained based on an excitation strength output from the other neuron when each neuron is excited in a chain reaction to excitation of the other neuron connected to each neuron.
According to the first aspect of the invention, it is preferable that each neuron in the neural network unit outputs the pseudo excitation strength and also outputs its own centroid vector when each neuron is excited in a chain reaction to the excitation of the other neuron connected to it. According to the first aspect of the invention, it is also preferable that the network control unit interconnects the neurons relating to each other among the neurons in the neural network unit, based on an order of the neurons added or excited at time series in association with a plurality of input vectors applied to the neural network unit from the outside. According to the first aspect of the invention, it is also preferable that the network control unit trains the weight that connects the neurons to each other, based on the excitation strength of each neuron in the neural network unit. According to the first aspect of the invention, it is also preferable that the network control unit removes each neuron at a predetermined timing determined based on the excitation strength of that neuron in the neural network unit. According to the first aspect of the invention, it is further preferable that each neuron in the neural network unit is an intermediate layer neuron using, as the centroid vector, centroid data in a matrix form in light of time series changes, and that each intermediate layer neuron is connected to an output layer neuron that outputs a change in the excitation strength output from each intermediate layer neuron at time series.
According to a second aspect of the present invention, there is provided a method of constructing an interconnecting neural network structure, the method comprising the steps of: preparing an artificial neural network structure including a plurality of neurons, each of the neurons outputting an excitation strength according to a similarity between an input vector and a centroid vector based on a kernel function, the neurons relating to each other being interconnected in the artificial neural network structure via a weight; and training the weight that connects the neurons to each other based on the excitation strength of each neuron. According to the second aspect of the invention, it is also preferable that, in the step of preparing the artificial neural network structure, the neurons relating to each other are interconnected in the artificial neural network structure based on an order of the neurons added or excited at time series in association with a plurality of input vectors applied from an outside. According to the second aspect of the invention, it is also preferable to further comprise a step of removing each neuron at a predetermined timing determined based on the excitation strength of that neuron.
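The preferred interconnection based on the order in which neurons are added or excited in time series can be sketched as follows. This is a simplified illustration only, not the patented procedure itself; the function name, the dictionary used as a weight container, and the default initial weight are all assumptions.

```python
def chain_connect(excitation_order, weights, initial_weight=0.5):
    """Connect each neuron to the neuron excited (or added) immediately
    after it, forming weights along the temporal order of excitation.

    excitation_order: sequence of neuron identifiers in the order they
    were added or excited; weights: dict mapping (src, dst) -> weight.
    Existing weights are left untouched."""
    for a, b in zip(excitation_order, excitation_order[1:]):
        weights.setdefault((a, b), initial_weight)
    return weights
```

Applying this repeatedly as new input vectors arrive yields chains of weighted connections that mirror the temporal association between inputs, which is the behavior the second aspect prefers.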
According to a third aspect of the present invention, there is provided a method of constructing a self-organizing neural network structure including a plurality of neurons, each of the neurons outputting an excitation strength according to a similarity between an input vector and a centroid vector based on a kernel function, the neurons relating to each other being autonomously connected in the self-organizing neural network structure based on the input vector, the method comprising: a first step of adding a neuron, which has an input vector as a centroid vector for a kernel function, into the self-organizing neural network structure as a new neuron based on an input vector that is input first from an outside; and a second step of repeating the following processings (a) to (c), each of the processings being based on an input vector that is an n-th input vector from the outside, where n is an integer equal to or greater than 2. According to the third aspect of the invention, it is also preferable that, in the second step, a processing (d) of removing a neuron determined to be unnecessary based on the excitation strengths of the neurons in the self-organizing neural network structure is further performed. According to the third aspect of the invention, it is also preferable that each of the neurons in the self-organizing neural network structure holds a class label relating to a final output, and that, in the processing (c) in the second step, only in case that the class labels held by the neurons in the self-organizing neural network structure are identical, both of or one of the formation of the weight that connects the neurons, and the training of the formed weight, is performed based on the excitation strengths of the neurons.
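Processing (c), the formation and training of a weight between excited neurons, together with the optional removal processing (d), can be illustrated with a minimal Hebb-like sketch. The update constants, the pruning threshold, and the function name are assumptions for illustration only; the patent's own update algorithms are detailed later in the description.

```python
def update_weight(w, fired_i, fired_j, eta=0.1, decay=0.95,
                  w_max=1.0, w_min=0.05):
    """One Hebb-like update step for the weight between two RBF elements:
    the weight grows toward w_max when both elements are excited together,
    decays otherwise, and is removed (returned as None) once it falls
    below a minimum threshold."""
    if fired_i and fired_j:                  # synchronous excitation
        w = min(w + eta * (w_max - w), w_max)
    else:                                    # asynchronous: decay
        w = w * decay
    return w if w >= w_min else None         # prune weak connection
```

Iterating this rule over successive input presentations strengthens connections between neurons that are repeatedly co-excited and eventually removes connections that are not reinforced.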
According to the third aspect of the invention, the neurons in the self-organizing neural network structure may have a single modality (e.g., an auditory modality or a visual modality), or the neurons in the self-organizing neural network structure may have a plurality of modalities different from one another (e.g., both the auditory and visual modalities). According to the first to third aspects of the invention, it is preferable that the kernel function employed in each neuron includes a radial basis function. According to a fourth aspect of the present invention, there is provided a computer readable recording medium storing an interconnecting neural network structure construction program that allows a computer to execute the method according to the second aspect or the third aspect. According to a fifth aspect of the present invention, there is provided an interconnecting neural network system comprising: a plurality of intermediate layer neurons, each of the intermediate layer neurons outputting an excitation strength according to a similarity between an input vector and a centroid vector based on a kernel function, and each of the intermediate layer neurons using centroid data in a matrix form in light of time series changes as the centroid vector; and an output layer neuron connected to each of the intermediate layer neurons and outputting a change in the excitation strength output from each intermediate layer neuron at time series. The kernel function according to the first to fifth aspects of the present invention stated above means a function for outputting a relationship between two vectors (see N. Cristianini and J. Shawe-Taylor, "An Introduction to Support Vector Machines," Cambridge Univ. Press, 2000). An arbitrary function can be used as the kernel function.
However, an RBF based on a Gaussian function that represents a distance metric for the correlation between two vectors, a function using a norm, a function using an inner product between two vectors, a function using the Epanechnikov quadratic or tri-cube kernel, or the like is normally and preferably used. According to the first to fourth aspects of the present invention, the artificial neural network structure is constructed by interconnecting the neurons relating to each other via the weight. Each neuron outputs the excitation strength according to the similarity between the input vector and the centroid vector based on the kernel function when the neuron is excited by the input vector applied from the outside. The neuron also outputs the pseudo excitation strength obtained based on the excitation strength output from the other neuron when the neuron is excited in a chain reaction to the excitation of the other neuron connected to the neuron. Therefore, one neuron can belong to a plurality of networks, and a plurality of neurons can be connected in an arbitrary network form. Accordingly, unlike the conventional fixed network form in which a single input vector is input, a plurality of input vectors can be handled freely, and configuration changes and the like can be made flexibly. In this case, since a neuron can belong to a plurality of networks having different modalities, it is possible to freely handle input vectors having a plurality of different modalities, such as the auditory modality and the visual modality, and to apply the system widely, not only to single-domain pattern recognition but also to multi-domain pattern recognition. According to the first to fourth aspects of the present invention, each weight that connects the neurons to each other is updated and thereby trained. This therefore facilitates additional training of the artificial neural network structure.
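As a concrete illustration of the Gaussian RBF kernel and of chain-reaction excitation, the following sketch computes an element's excitation strength from the distance between an input vector and its centroid, and a pseudo excitation strength as a weighted sum of the excitation strengths output by the connected elements. A fixed width sigma and Euclidean distance are assumed, and all names are illustrative.

```python
import math

def rbf_excitation(x, centroid, sigma=1.0):
    """Excitation strength of one RBF element: a Gaussian kernel of the
    squared Euclidean distance between input vector x and the centroid.
    Equals 1.0 when x coincides with the centroid and decays with distance."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, centroid))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))

def pseudo_excitation(weights, excitations):
    """Pseudo excitation strength of an element excited in a chain reaction,
    taken here as the weighted sum of the excitation strengths of the
    connected elements (weights[i] connects element i to this element)."""
    return sum(w * h for w, h in zip(weights, excitations))
```

In this sketch an element excited directly by an external input reports `rbf_excitation`, while an element excited only through its connections reports `pseudo_excitation`, so one element can participate in several networks at once.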
According to the first to fourth aspects of the present invention, each weight that connects the neurons to each other can be updated independently of the outputs of the neurons. Therefore, unlike conventional training algorithms such as the back-propagation method, only the weights stored in a distributed fashion for specifying the connection relationships between the neurons are iteratively updated, while data are stored locally in the neurons as the centroid vectors, without influencing the data stored in the respective neurons at all during the training of the weights. Accordingly, it is possible to realize data representations having different properties, i.e., a distribution property and a localization property. In addition, it is possible to construct a memory element that advantageously possesses both generalization performance and additional training performance. According to the first to fourth aspects of the present invention, a plurality of intermediate layer neurons using centroid data in a matrix form in light of changes at time series as centroid vectors may be provided, and the output layer neurons connected to the intermediate layer neurons may output changes in the excitation strengths output from the respective intermediate layer neurons at time series. It is thereby possible to facilitate constructing a recognition system, such as a database incremental search function, that narrows down final candidates with the passage of time. According to the fifth aspect of the present invention, a plurality of intermediate layer neurons using centroid data in a matrix form in light of changes at time series as centroid vectors are provided, and the output layer neurons connected to the intermediate layer neurons output changes in the excitation strengths output from the respective intermediate layer neurons at time series.
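A minimal sketch of how such an output layer neuron might behave is given below: it accumulates an intermediate layer element's excitation strengths over successive time steps and squashes the running total with a sigmoid, so that a candidate's output grows toward 1 as evidence accrues over time. The sigmoid form and the constant b are assumptions for illustration, not the patent's specific output equations.

```python
import math

def cumulative_output(excitation_series, b=1.0):
    """For one intermediate layer element, accumulate the excitation
    strengths it emits at successive time steps and pass the running sum
    through a sigmoid, yielding one output value per time step."""
    total, outputs = 0.0, []
    for h in excitation_series:
        total += h
        outputs.append(1.0 / (1.0 + math.exp(-b * total)))
    return outputs
```

With several such elements in parallel, the candidate whose cumulative output rises fastest can be kept while the others are discarded, which is the narrowing-down behavior described here.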
It is thereby possible to facilitate constructing the recognition system such as a database incremental search function, that narrows down final candidates with the passage of time. An embodiment of the present invention will be described hereinafter with reference to the drawings. <Overall Configuration> Referring first to As shown in The neural network unit The network control unit <Artificial Neural Network Structure> As shown in The four RBF elements <RBF Element> The RBF elements As shown in Among the constituent elements of the RBF element Specifically, the excitation strength h
On the other hand, the pseudo excitation strength h
Alternatively, the RBF main body unit The pointer unit The weight holding unit The duration variable holding unit The activation time holding unit As shown in
Alternatively, the outputs o <Network Form> (Interconnecting Network) In the artificial neural network structure shown in As for the interconnecting neural network system (Bidirectional Network) Specifically, it is assumed herein that the input vector
In this case, if an output is finally obtained from the RBF Out of Equations (6) and (7), in Equation (7), the output of the RBF When the RBF In this case, the signal flow is x (Tree-Like Network) (Lattice-Like Network) (Layered Network) If a three-layered neural network shown in The layered neural network shown in
Alternatively, the outputs o The function of the interconnecting neural network system <Constructing Artificial Neural Network Structure> An artificial neural network structure realizing dynamic pattern recognition is constructed first in the interconnecting neural network system Specifically, the artificial neural network structure is constructed according to the following steps 1 and 2. Step 1: If the number M of the RBF elements is smaller than an upper limit M Step 2: Otherwise, a minimum excitation strength (e.g., the excitation strength h At this moment, the RBF elements Specifically, if the RBF Further, while the RBF elements Specifically, as shown in <Training of Weights Between RBF Elements> In the process of constructing the artificial neural network structure in the neural network unit Specifically, it is assumed that the weight between the RBF elements (First Algorithm for Updating Weight) Specifically, the following algorithm (first algorithm) can be used. (1) If both excitation times ε
(2) Conversely, if one of the RBF (3) If the RBF element is not connected to the other RBF element in a period p (Second Algorithm for Updating Weight) The following second algorithm may be used in place of the first algorithm. The second algorithm is based on the following two conjectures (Conjecture 1 and Conjecture 2) drawn from one of Hebb's postulates, i.e., “when an axon of a neuron A is near enough to excite a neuron B and iteratively or persistently takes part in exciting it, some growth process or metabolic change takes place in one or both neurons A and B such that A's efficiency, as one of the neurons exciting B, is increased” through neuropsychological considerations. Conjecture 1: When a pair of RBF elements are excited repeatedly, a new weight is formed between these RBF elements. If this occurs periodically, the value of the weight is increased. (Note that in Conjecture 1, the adjacency relationship between the neurons A and B in the Hebb's postulate stated above is not considered for the following reasons. First, if an algorithm to be described later is actually implemented on hardware (e.g., a memory system of a robot), it is not always necessary to consider the positional relationship between the RBF elements. Second, the Hebb's postulate implies that the excitation of the neuron A may occur due to the excitation of the other neurons connected to the neuron A via synapses. This second reason leads to Conjecture 2. Conjecture 2: When one RBF element is excited and one of the weights is connected to the other RBF element, the excitation of the one RBF element is transferred to the other RBF element via the weight. However, the excitation strength transferred to the other RBF element depends on the value of the weight. The second algorithm for updating the weights between the RBF elements is given based on Conjectures 1 and 2 as follows. (1) If the weight w (2) If the subsequent excitation of a pair of RBF elements (e.g., RBF
(3) If the excitation of the RBF element (RBF In Equations (15-1) and (15-2), ξ, w The processings (1) and (2) agree with the following postulates (i) and (ii) that rephrase Hebb's postulate more specifically. (i) If two neurons on either side of a synapse are excited asynchronously, the strength of the synapse is selectively weakened or the synapse is removed itself. (ii) If two neurons on either side of a synapse are excited simultaneously (i.e., synchronously), the strength of the synapse is selectively increased. In the processing (1), a decaying factor ξ The postulates (i) and (ii) stated above can be extended and interpreted such that (a) the decay of the synapse always occurs in a short period of time though the amount of such decay is relatively small and (b) the decay of the synapse also occurs when the other neuron(s) is/are excited due to the excitation of one neuron. The postulate (a) is represented by the decay factor ξ Conjecture 3: When the RBF element (RBF w _{ij} I _{i} (15-3)In Equation (15-3), γ denotes the decay factor and I
In Equations (15-3) and (15-4), the indicator function I (Third Algorithm for Updating Weight) The second algorithm can be modified to the following third algorithm. This third algorithm is on the assumption of the use of RBF elements used in <CONSTRUCTING SELF-ORGANIZING NEURAL NETWORK STRUCTURE> to be described later (i.e., RBF elements Specifically, contents of the third algorithm are as follows. (1) If the weight w (2) If the subsequent excitation of a pair of RBF elements (e.g., RBF
(3) If the excitation of the RBF element (RBF In Equations (15-5) and (15-6), ξ, w <Determination of Duration of RBF Element> In the process of constructing the artificial neural network structure in the neural network unit Specifically, a variable φ In the variable φ
In the variable φ In Equations (16) and (17), the update period of the factor “a” is limited to a period between t+T <Outputs of Neural Network in Consideration of Delay Element> The outputs o Specifically, a first output form can be obtained by using a method of comprising the steps of: calculating excitation strengths of the respective RBF elements _{j} =[o _{j}(1), o _{j}(2), . . . , o _{j}(N)]^{T} (18)In Equation (18), o t)) (20)In Equations (19) and (20), i denotes indices of all RBF elements connected to the j A second output form can be obtained by using a method of outputting excitation strengths of the respective RBF elements In Equation (21), In Equation (21), f(·) may be, for example, a cumulative function in a sigmoidal form and may be given according to the following Equation (23) (where b is a positive constant).
If variations in the excitation strengths output from the respective RBF elements serving as intermediate layer neurons are output in time sequence in accordance with the first and second output forms, then each RBF element
In Equation (24), if first two rows (i.e., In the first and second output forms, the final outputs of the neural network are given asynchronously. Therefore, if the output of every RBF element If the artificial neural network structure in consideration of such delay elements is applied to a system, a recognition system such as a database incremental search function, that narrows down final candidates with the passage of time can be constructed. Specifically, this artificial neural network structure can be applied to the construction of, for example, a thinking mechanism for automatically composing or estimating a complete sentence or song from a first word sequence or a phrase in the song. <Constructing Self-Organizing Neural Network Structure> The process of appropriately updating the weights between the RBF elements in any one of the artificial neural network structures as shown in A method of constructing such a self-organizing neural network structure will now be described in detail. It is assumed herein that as each RBF element (RBF (Construction Phase of Self-Organizing Neural Network Structure) A construction phase (or training phase) of the self-organizing neural network structure will first be described. Step 1: As a first step (cnt=1), an RBF element, which has a first input vector Step 2: As a second step, processings in the following steps 2.1 to 2.3 are repeatedly performed from cnt=2 up to cnt={total number of input vectors}. Step 2.1: (i) Based on the input vector (ii) The excitation strength h (iii) All the RBF elements determined to be excited in (i) and (ii) above are marked. Step 2.2: If no RBF element (RBF Step 2.3: Weights w In the step 2.3, as described in relation to the first to third algorithms, a processing for removing the RBF element (RBF (Testing Phase of Self-Organizing Neural Network Structure) A testing phase of the self-organizing neural network structure will next be described. 
Step 1: (i) The input vector is applied, and it is determined whether each RBF element is excited by the input vector. (ii) The excitation strength h output from each excited RBF element is propagated via the weights, and it is determined whether the connected RBF elements are excited in a chain reaction. (iii) All the RBF elements determined to be excited in (i) and (ii) above are marked.

Step 2: (i) The RBF element having the maximum excitation strength h among the marked RBF elements is selected. (ii) Thereafter, if the object of constructing the self-organizing neural network structure is to perform some recognition processing, a result of the recognition is output simply by outputting the class label η held by the selected RBF element.

The construction (training) phase and the testing phase have been described while assuming that each RBF element holds its class label in advance; the class labels may instead be given afterward. Specifically, an algorithm for the latter case is as follows. (1) A new RBF element is formed in the self-organizing neural network structure (note that no weight generated from this new RBF element is present at this point). (2) An RBF element that represents a new category (class label) is added into the self-organizing neural network structure as a new RBF element some time after this point. (3) A new RBF element added thereafter is connected to the RBF element that represents the new category (class label) via a weight.

As can be understood, according to this embodiment, the artificial neural network structure is constructed by interconnecting the related RBF elements via weights, so that a network form for inputting a plurality of input vectors can be taken freely and the structure can easily be trained additionally. According to this embodiment, each weight that connects the RBF elements can be formed and updated autonomously. Further, the artificial neural network structure realized in this embodiment possesses the same properties as those of the general regression neural network (“GRNN”) and the probabilistic neural network (“PNN”), and exhibits the advantages of facilitating expansion and reduction of the network and of having fewer computationally unstable factors. In the embodiment, the instance of using RBF elements, each having the RBF as its kernel function, as the neurons in the artificial neural network structure has been described.
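The construction and testing phases above can be sketched as follows. This is a minimal illustration under assumed parameters: the radius σ, the excitation threshold θ, and the weight increment are placeholders, and the chain-reaction (pseudo) excitation is modeled here simply as weight × source strength added to a neighbor's direct strength; the patent does not fix these exact choices.

```python
import numpy as np

def rbf(x, c, sigma):
    """Excitation strength: Gaussian kernel of the distance to the centroid."""
    return np.exp(-np.linalg.norm(x - c) ** 2 / sigma ** 2)

class SelfOrganizingNet:
    """Sketch of the construction (training) and testing phases."""

    def __init__(self, sigma=1.0, theta=0.5, dw=0.1):
        self.sigma, self.theta, self.dw = sigma, theta, dw
        self.centroids, self.labels = [], []
        self.w = {}  # weights between pairs of RBF elements, keyed (i, j) with i < j

    def _marked(self, x):
        """Steps (i)-(iii): direct excitation, chain-reaction excitation, marking."""
        h = {i: rbf(x, c, self.sigma) for i, c in enumerate(self.centroids)}
        marked = {i for i, hi in h.items() if hi >= self.theta}
        # one pass of pseudo excitation propagated over existing weights (assumed rule)
        for (i, j), wij in self.w.items():
            if i in marked and h[j] + wij * h[i] >= self.theta:
                marked.add(j)
            if j in marked and h[i] + wij * h[j] >= self.theta:
                marked.add(i)
        return h, marked

    def train_one(self, x, label):
        h, marked = self._marked(x)
        if not marked:                         # Step 2.2: no element excited -> new element
            self.centroids.append(np.asarray(x, float))
            self.labels.append(label)
            return
        for i in marked:                       # Step 2.3: reinforce weights between marked pairs
            for j in marked:
                if i < j:
                    self.w[(i, j)] = self.w.get((i, j), 0.0) + self.dw

    def classify(self, x):
        """Testing phase: label of the maximally excited marked element (None if none fires)."""
        h, marked = self._marked(x)
        if not marked:
            return None
        return self.labels[max(marked, key=lambda i: h[i])]
```

The removal of redundant RBF elements (processing (3) of the second algorithm) is omitted here for brevity.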
As the “neurons in the artificial neural network structure” mentioned herein, arbitrary neurons each capable of outputting an excitation strength according to the similarity between the input vector and the centroid vector based on a kernel function can be used. As the “kernel function” mentioned herein, a function using a norm, a function using an inner product between two vectors, a function using the Epanechnikov quadratic or Tri-cube kernel, or the like can be used.

Specific examples according to the preferred embodiment stated above will be described.

Problem Setting

To see how the self-organizing neural network structure is actually constructed, let us consider solving an XOR problem by means of the self-organizing neural network structure, as a straightforward pattern recognition processing. As the respective RBF elements in the self-organizing neural network structure, the RBF elements described above are used. In accordance with the algorithms described in <CONSTRUCTING SELF-ORGANIZING NEURAL NETWORK STRUCTURE>, a self-organizing neural network structure capable of recognizing the four XOR patterns is constructed. Specific procedures are as follows.

- (1) cnt=1
An input vector x(1) is applied, and an RBF element (RBF_1) that has x(1) as its centroid vector c_1 is newly generated. The radius σ and the excitation threshold θ_k are set to predetermined values common to all the RBF elements.

- (2) cnt=2
An input vector x(2) is applied. At this time, the following equation is established.
h_1 = exp(−∥x(2) − c_1∥_2^2 / σ^2) = 0.449 (< θ_k)

Thus, since h_1 is smaller than the excitation threshold θ_k, a new RBF element (RBF_2) that has x(2) as its centroid vector c_2 is added into the network.
An input vector x(3) is applied. At this time, the following equations are established.
h_1 = exp(−∥x(3) − c_1∥_2^2 / σ^2) = 0.449 (< θ_k)
h_2 = exp(−∥x(3) − c_2∥_2^2 / σ^2) = 0.1979 (< θ_k)

Thus, since no RBF element is excited by the input vector x(3), a new RBF element (RBF_3) that has x(3) as its centroid vector c_3 is added into the network.
An input vector x(4) is applied. At this time, the following equations are established.
h_1 = exp(−∥x(4) − c_1∥_2^2 / σ^2) = 0.1979 (< θ_k)
h_2 = exp(−∥x(4) − c_2∥_2^2 / σ^2) = 0.449 (< θ_k)
h_3 = exp(−∥x(4) − c_3∥_2^2 / σ^2) = 0.449 (< θ_k)

Thus, since no RBF element is excited by the input vector x(4), a new RBF element (RBF_4) that has x(4) as its centroid vector c_4 is added into the network. As a result, the self-organizing neural network structure that includes the four RBF elements (RBF_1 to RBF_4) is constructed.

<Testing Phase of Self-Organizing Neural Network Structure>

Constructing the self-organizing neural network structure as stated above takes steps similar to those for the GRNN or the PNN. This is because the four neurons (i.e., RBF elements) are present in one neural network structure and the class labels η are held by the respective RBF elements. However, consider the situation in which another set of input vectors that represent the XOR patterns is applied. As can be understood from this observation, since the input data is locally stored in quite a small number of RBF elements, it is possible to realize a pattern classifier capable of appropriately performing data pruning (or data clustering) by appropriately adjusting the parameters relating to the respective RBF elements.

A parameter adjustment processing for the self-organizing neural network structure will next be discussed while referring to several simulation experiment examples.

(Single-Domain Pattern Recognition)

In order to see how the self-organizing neural network structure is constructed (self-organized) in a more realistic situation, a simulation experiment on single-domain pattern recognition (pattern recognition using several single-domain datasets extracted from public databases) is conducted. In the PROBLEM SETTING section described above, the connection between the RBF elements in the self-organizing neural network structure (the formation and training of weights between the RBF elements) has not been described.
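The XOR walkthrough above can be checked numerically. The patent does not state the radius σ explicitly; with the assumed value σ^2 = 1.25, the strength for one-bit-distant patterns comes out to about 0.449, matching the quoted figures (the two-bit-distant value comes out to about 0.202, close to the quoted 0.1979).

```python
import numpy as np

# The four XOR input patterns, presented in the order x(1)..x(4).
XOR_PATTERNS = [np.array(p, dtype=float)
                for p in [(0, 0), (0, 1), (1, 0), (1, 1)]]
SIGMA_SQ = 1.25   # assumed; chosen so exp(-1/sigma^2) ~ 0.449
THETA = 0.5       # assumed excitation threshold theta_k

def h(x, c):
    """Excitation strength of an RBF element with centroid c."""
    return np.exp(-np.linalg.norm(x - c) ** 2 / SIGMA_SQ)

# Feed the four patterns in order; none excites an existing element,
# so every pattern becomes a new centroid (RBF_1 .. RBF_4).
centroids = [XOR_PATTERNS[0]]
for x in XOR_PATTERNS[1:]:
    if all(h(x, c) < THETA for c in centroids):
        centroids.append(x)

print(len(centroids))                                   # 4 RBF elements
print(round(h(XOR_PATTERNS[1], XOR_PATTERNS[0]), 3))    # 0.449
```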
In the first simulation experiment example, the weights between the RBF elements are taken into consideration so as to see how the excitation between the RBF elements via the weights affects the performance of the self-organizing neural network structure. In addition, in the first simulation experiment example, the second algorithm is used as the algorithm for training the weights between the RBF elements. However, the processing (3) of the second algorithm (i.e., removal of an RBF element serving as a neuron) is not considered herein, so as to track the behavior of the self-organizing neural network structure more accurately.

(Parameter Setting)

In the first simulation experiment example, three datasets from different domains (SFS, OptDigit, and PenDigit), extracted from the “UCI Machine Learning Repository” databases at the University of California, are used. These three datasets are independent of one another, and the recognition processing is performed on each of them separately; their features are shown in Table 1 below. The SFS dataset is encoded in advance, so that pattern vectors for the recognition processing are readily given.
Parameters of the RBF elements in the self-organizing neural network structure are selected as summarized in Table 2 below. As shown in Table 2, a combination of parameters is selected so that the parameters are as equal as possible across the three datasets, in order to perform the simulations under conditions as similar as possible. In order to evaluate excitation for the respective RBF elements, it is determined whether the excitation strength h of each RBF element exceeds the excitation threshold.
(Simulation Result)

The simulation results are presented in the referenced figures, from which the behavior of the self-organizing neural network structure during construction can be tracked.

(Impact of Selection of σ Upon Performance of Self-Organizing Neural Network Structure)

It is empirically confirmed that, for the PNN or the GRNN, which are ordinary neural network structures, a unique setting of radii within the network gives a reasonable trade-off between generalization performance and computational complexity. Therefore, in the construction phase of the self-organizing neural network structure in the first simulation experiment example, a unique setting of the radius σ is likewise used. Nevertheless, how to select the radii σ remains an important factor for the performance of the self-organizing neural network structure.
The effect of the selection of σ on the performance is shown in the referenced figure.

(Generalization Performance of Self-Organizing Neural Network Structure)

In Table 4 below, the self-organizing neural network structure constructed using the parameters shown in Table 2 (that is, the self-organizing neural network structure into which all pattern vectors used for construction have completely been input) is compared in performance with a PNN whose centroids are calculated by the well-known MacQueen k-means clustering algorithm. In order to make the comparison between the two network structures as fair as possible, the number of neurons in the PNN responsible for the respective classes is set identical to the number of RBF elements (neurons) in the self-organizing neural network structure.
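The PNN baseline above places its per-class centroids with MacQueen's k-means. As a hedged sketch, the batch variant below approximates this (MacQueen's original algorithm updates centroids online; the iteration count and seed are assumed):

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Batch k-means sketch used to place PNN centroids for one class.

    data : (n, d) array of pattern vectors of a single class
    k    : number of centroids (set equal to the number of RBF elements
           of that class in the self-organizing structure, as in Table 4)
    """
    rng = np.random.default_rng(seed)
    # initialize centroids from k distinct data points
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest centroid
        dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(dists, axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            pts = data[labels == j]
            if len(pts):
                centroids[j] = pts.mean(axis=0)
    return centroids
```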
As shown in Table 4, for the three datasets, the overall generalization performance of the self-organizing neural network structure is substantially equal to or slightly better than that of the PNN. Nevertheless, differently from the ordinary neural network structures GRNN and PNN, the number of RBF elements (neurons) in the self-organizing neural network structure is automatically determined by an autonomous algorithm. The self-organizing neural network structure is dynamic as compared with the ordinary neural network structures GRNN and PNN in this respect.

(Varying Pattern Presentation Order)

For the self-organizing neural network structure, a normal “well-balanced” pattern vector input order, as a typical manner of constructing the pattern classifier stated above, is one in which the patterns of the respective classes are presented in turn.

(Simultaneous Dual-Domain Pattern Recognition)

In the first simulation experiment example, it is confirmed that, in the field of pattern recognition, the self-organizing neural network structure possesses a generalization performance equal to or slightly better than those of the ordinary neural network structures PNN and GRNN. However, this reveals only one of the features of the self-organizing neural network structure. Namely, the self-organizing neural network structure is also characterized by being applicable to the processing of multiple domains having a plurality of modalities. In the second simulation experiment example, therefore, another practical simulation on the pattern recognition of multiple domains (i.e., simultaneous pattern recognition of dual domains) is conducted in order to understand the latter feature of the self-organizing neural network structure. The self-organizing neural network structure constructed in the second simulation experiment example is obtained by integrating two partial self-organizing neural network structures.
Namely, this self-organizing neural network structure is designed to simulate a situation in which a specific voice input to a specific area (i.e., the auditory area) in the structure excites not only the auditory area but also the visual area in parallel, i.e., simultaneously, thereby realizing “simultaneous dual-domain pattern recognition.” This design implies that appropriate built-in feature extraction mechanisms for the respective modalities (the auditory modality and the visual modality) are provided in the system. This design is, therefore, somewhat relevant to an approach of modeling “association” between different modalities or, in a more general context, an approach of “concept formation.” The “approach” used herein is one for handling several perceptual methods simultaneously or integrally (in a data fusion fashion), and is realized by an integral representation method called “gestalt.”

(Parameter Setting)

In the second simulation experiment example, an SFS dataset (for digit voice recognition) and a PenDigit dataset (for digit character recognition) are used. These two datasets are employed to construct partial self-organizing neural network structures for the corresponding domain-specific data, respectively. Cross-domain weights (link weights, i.e., association links) that connect a predetermined number of RBF elements (neurons) in the two partial self-organizing neural network structures constructed using the two datasets are formed by the same method as the weight update algorithm stated above. Parameters for updating the weights so as to perform a dual-domain pattern recognition processing are summarized in the right columns of Table 2. In this example, the same parameter values as those for the ordinary weights (i.e., the weights within the partial self-organizing neural network structures, as summarized in the left columns of Table 2) are selected.
The decay factor ξ is applied to the cross-domain weights in the same manner as to the ordinary weights. Further, in modeling such a cross-domain processing, it is necessary to consider that the order of the input pattern vectors affects the formation of the association links. In the second simulation experiment example, therefore, pattern vectors are input alternately from the two pieces of training data, i.e., a pattern vector of one domain is followed by the corresponding pattern vector of the other domain.

(Simulation Result)

In the second and third columns of Table 5, the generalization performances in the dual-domain pattern recognition processing in the second simulation experiment example are summarized. In Table 5, “Sub-SOKM(i)→Sub-SOKM(j)” indicates the overall generalization performance obtained by the excitation, via the association links, of the RBF elements in the j-th partial self-organizing neural network structure when pattern vectors are input to the i-th partial structure.
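The alternating dual-domain presentation can be sketched as follows. The link-increment and decay rules here are assumptions: the patent names the decay factor ξ but the exact update equation is not given in this excerpt, so a simple multiplicative decay plus a fixed increment between co-excited elements is used.

```python
import numpy as np

def gauss(x, c, sigma=1.0):
    """Excitation strength of an RBF element with centroid c."""
    return np.exp(-np.linalg.norm(x - c) ** 2 / sigma ** 2)

def train_association(pairs, cents_a, cents_b, sigma=1.0, theta=0.5,
                      dw=0.1, xi=0.01):
    """Form cross-domain association links between two partial networks.

    pairs   : aligned (domain-A vector, domain-B vector) training pairs,
              presented alternately (e.g. auditory/SFS then visual/PenDigit)
    cents_a : centroids of partial network A
    cents_b : centroids of partial network B
    xi      : assumed decay factor applied to all links at each presentation
    """
    links = np.zeros((len(cents_a), len(cents_b)))
    for xa, xb in pairs:
        fired_a = [i for i, c in enumerate(cents_a) if gauss(xa, c, sigma) >= theta]
        fired_b = [j for j, c in enumerate(cents_b) if gauss(xb, c, sigma) >= theta]
        links *= (1.0 - xi)          # decay every link slightly
        for i in fired_a:            # reinforce links between co-excited elements
            for j in fired_b:
                links[i, j] += dw
    return links
```

Links between elements that repeatedly fire together grow despite the decay, while unused links fade, which is the intended association behavior.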
(Presenting Class Labels to Self-Organizing Neural Network Structure)

In the first and second simulation experiment examples, when a new RBF element is added into the self-organizing neural network structure, the class label η is given to the new RBF element at that time. Taking this into account, the third algorithm (the algorithm in light of the class label η) is considered.

(Constraints on Formation of Weights)

In the self-organizing neural network structure, the class labels can be given at any time depending on applications. In this example, a situation which is not so typical in practice, in which information on the class labels is known a priori, will be assumed, and how such a modification affects the performance of the self-organizing neural network structure will be considered.

(Simulation Result)