CIS 311: Neural Networks
Kohonen Networks
1. Unsupervised Learning
The training of neural networks is supervised when the training inputs are provided together with their corresponding target outputs, that is, each training input is accompanied by its desired output.
The training of neural networks is unsupervised when the target outputs are not provided with the inputs, that is, the training algorithm has to discover the output categories by itself as well as learn to recognize them.
The unsupervised mode of training is also called self-organized learning because there is no external teacher to preclassify the training examples. The network weight parameters in this case are trained with respect to a general, task-independent measure of the desired quality of the performance. The neural network learns to recognize clusters of similar examples: it automatically creates clusters during training and classifies the given examples into them.
Self-organizing neural networks usually consist of three layers: 1) an input layer that receives the data; 2) a competitive layer of neurons that compete with each other to determine to which cluster the given input belongs; and 3) an output layer which generates the result in a way suitable to the application. Unsupervised training of such networks is carried out with algorithms using competitive training rules. The competitive training rule implements a winner-takes-all strategy: it chooses the neuron with the greatest total input as the winner and turns it on, while all other neurons are switched off.
Self-organizing neural networks trained in unsupervised mode are applied typically to classification tasks.
2. Self-Organizing Maps
Self-organizing maps are a special class of artificial neural networks based on competitive unsupervised learning. These are networks whose neurons are arranged in a one- or two-dimensional lattice. During the competitive learning process the neurons are tuned selectively, that is, the training data select the winning neurons. Since the locations of the neurons are ordered with respect to each other, the lattice may be considered a kind of topographic map of the inputs. The locations of the neurons in the topographic map reflect the statistical features of the provided input data. The neurons in a self-organizing network transform the input signals into a corresponding place-coded data distribution.
This is a loose simulation of the organization of the cells in the brain, which are assumed to form topologically ordered maps that react to common sensory input signals. Different sensory inputs are mapped onto areas of the cerebral cortex in an ordered way. Our intention as computer engineers is therefore to design computational network devices that perform self-learning following the principle of topographic map formation in the brain. According to this principle, the location of a neuron in the lattice reflects a particular feature of the input space.

Figure 1. One-dimensional self-organizing map
Figure 1 shows a one-dimensional lattice of neurons, each fully connected to all inputs.
This is a feed-forward network that trains its synaptic weights adaptively after the arrival of each new input example. The training involves three main phases:
- competition: the neurons in the Kohonen layer compute a certain function, and thus generate outputs that are compared for selection of a winner;
- cooperation: the winner is taken as a basis for cooperation in the sense that it determines the topological neighborhood within which the example falls;
- adaptation: the neurons are adjusted to reflect the information in the provided training example by updating their weights so that the neuron output changes correspondingly.
2.1. Competition
The competition among the neurons is based on the outputs that they produce. The output of a neuron in the Kohonen layer of a self-organizing neural network measures how well its weight vector matches the given training example. It is computed as the inner product of the two vectors, which, for weight vectors normalized to unit length, is largest for the neuron whose weight vector is closest to the example in Euclidean distance:
$s = w\, x_e^T = \sum_{i=1}^{d} w_i x_{ei}$
where: $w = (w_1, w_2, \ldots, w_d)$ is the weight vector, and $x_e = (x_{e1}, x_{e2}, \ldots, x_{ed})$ is the e-th input vector.
The neuron with the largest value of the summation block is the winner, which indicates that the e-th input vector belongs to the cluster of this neuron. Determining the winning neuron therefore requires computing the outputs of all neurons.
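The competition step can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names `neuron_outputs` and `winner_index` and the weight values are made up for the example, and neuron indices are 0-based here, whereas the notes count neurons from 1.

```python
def neuron_outputs(weights, x):
    """Return the inner product s_m = sum_i w_mi * x_i for every neuron m."""
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]


def winner_index(outputs):
    """Return the 0-based index of the neuron with the largest output."""
    return max(range(len(outputs)), key=lambda m: outputs[m])


# Illustrative unit-length weight vectors for three neurons (made-up values).
weights = [(1.0, 0.0), (0.0, 1.0), (-0.6, -0.8)]
x = (-1.0, 0.0)

outputs = neuron_outputs(weights, x)
winner = winner_index(outputs)
print(outputs, winner)
```

Every output has to be evaluated before the maximum can be taken, which is why the cost of the competition step grows with the number of neurons.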
2.2. Cooperation
The winning neuron is the center of a neighborhood of topologically close, cooperating neurons. The neighborhood is defined by a function h( n, i ) of the lateral distance between a neuron n and the winner i. This function has to satisfy two requirements: it is symmetric about the winning neuron, and it decreases monotonically as the lateral distance increases. Such a function is the following Gaussian:
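$h(n, i) = \exp\left( -\dfrac{l_{n,i}^2}{2\sigma^2} \right)$
where: the indices $n$ and $i$ enumerate the neurons, $l_{n,i}$ is the lateral (topological) distance between neuron $n$ and the winner $i$, and $\sigma$ is the radius of influence. The distance $l_{n,i}$ is measured in the discrete output space, that is, between the positions of the two neurons in the lattice; for a one-dimensional lattice, $l_{n,i} = |n - i|$.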
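The Gaussian neighborhood can be evaluated as in the following sketch. It assumes a one-dimensional lattice and takes the lateral distance to be the difference of the neuron indices, in line with the statement that the distance is measured in the discrete output space. The function name `neighborhood` and the parameter name `sigma` (the radius of influence) are chosen for this illustration only.

```python
import math


def neighborhood(n, i, sigma):
    """Gaussian neighborhood h(n, i) = exp(-l^2 / (2 * sigma^2)).

    Assumption: on a one-dimensional lattice the lateral distance l
    is |n - i|, the difference of the two neuron indices.
    """
    lateral = abs(n - i)
    return math.exp(-(lateral ** 2) / (2.0 * sigma ** 2))


# Neighborhood values around the winner i = 2 on a lattice of five neurons.
values = [neighborhood(n, 2, sigma=1.0) for n in range(5)]
print(values)
```

The value is 1 at the winner itself and falls off symmetrically on both sides, so the two stated requirements hold by construction.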
2.3. Adaptation
The self-adaptation of the Kohonen network involves changing the synaptic weights in proportion to the difference between the input vector and the current weight vector. The effect of changing the weights is to move the weight vector toward the input vector. The unsupervised network learning involves adaptation of all neurons in the neighborhood of the winner, which changes the distribution of the weights so that adjacent neurons have similar weight vectors. The weights of each neuron in the neighborhood of the winner are updated according to the following training rule:
$w_n = w_n + \eta\, h(n, i)\, (x_e - w_n)$
where: $\eta$ is a learning rate constant, $i$ is the index of the winner, and $n$ enumerates the neurons in the neighborhood determined by the function $h(n, i)$.
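A minimal sketch of the training rule for a single neuron follows. The function name `update_weights` and the numeric values are illustrative assumptions; `eta` stands for the learning rate and `h` for the neighborhood value of the neuron being updated.

```python
def update_weights(w, x, eta, h):
    """Return w + eta * h * (x - w), applied component by component."""
    return [w_i + eta * h * (x_i - w_i) for w_i, x_i in zip(w, x)]


w = [0.0, 1.0]
x = [1.0, 0.0]

w_near = update_weights(w, x, eta=0.5, h=1.0)  # moves halfway toward x
w_far = update_weights(w, x, eta=0.5, h=0.0)   # zero neighborhood value: unchanged
print(w_near, w_far)
```

The product of the learning rate and the neighborhood value acts as a step size between 0 and 1, so a neuron far from the winner barely moves while the winner moves the most.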
3. SOM Training Algorithm
Unsupervised Learning Algorithm for Self-organizing maps
Initialization: Examples $\{ x_e \}_{e=1}^{N}$, Kohonen layer with $M$ neurons, initial weights $w$ (associated with each neuron) set to small random values, learning rate $\eta = 0.1$
Repeat
Draw a training example $x_e$ from the input data with a certain probability
- calculate the output of each neuron:
$s_m = \sum_{i=1}^{d} w_{mi} x_{ei}$, where $1 \le m \le M$ and $w_{mi}$ is the i-th weight of neuron $m$
- determine the winner index:
$i(x) = \arg\max_{m} s_m$
- isolate the neighborhood of the winner, a subset of the $M$ neurons indexed by $n$:
$h(n, i) = \exp\left( -l_{n,i}^2 / 2\sigma^2 \right)$, where $l_{n,i}$ is the lateral distance between neuron $n$ and the winner $i$ in the lattice
- update the weights:
$w_n = w_n + \eta\, h(n, i)\, (x_e - w_n)$
until the weight changes become smaller than a predefined threshold.
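The loop above might be sketched in Python as follows. Several details are assumptions made for this illustration: the lattice is one-dimensional with lateral distance |n - i|, examples are drawn uniformly at random, the learning rate and radius are held fixed, and a `max_steps` cap is added because random sampling does not guarantee that the stopping test is ever met. Weight normalization, which the inner-product competition relies on, is left out for brevity. The function name `train_som` and its parameters are hypothetical.

```python
import math
import random


def train_som(examples, num_neurons, eta=0.1, sigma=1.0,
              threshold=1e-4, max_steps=10000, seed=0):
    """Sketch of the SOM training loop on a one-dimensional lattice."""
    rng = random.Random(seed)
    dim = len(examples[0])
    # Small random initial weights, one weight vector per neuron.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for _ in range(num_neurons)]
    for _ in range(max_steps):
        x = rng.choice(examples)  # assumption: uniform sampling
        # Competition: inner-product output of every neuron, largest wins.
        outputs = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
        winner = max(range(num_neurons), key=lambda m: outputs[m])
        # Cooperation and adaptation: Gaussian-weighted move toward x.
        largest_change = 0.0
        for n, w in enumerate(weights):
            h = math.exp(-((n - winner) ** 2) / (2.0 * sigma ** 2))
            for k in range(dim):
                delta = eta * h * (x[k] - w[k])
                w[k] += delta
                largest_change = max(largest_change, abs(delta))
        if largest_change < threshold:
            break
    return weights


# Made-up two-dimensional examples forming two loose groups.
examples = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
trained = train_som(examples, num_neurons=4, max_steps=500)
print(trained)
```

Because each update is a step of size at most `eta` toward an example, the weights always stay inside the region spanned by the initial weights and the training data.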
The process of training a self-organizing neural network has two main phases: ordering and convergence. During the first, ordering phase the algorithm establishes the topological ordering of the neurons and their weight vectors. Starting from random initial weights, the neurons evolve toward a mesh in which each neuron is in its correct topological position. The learning rate parameter should initially be relatively large, say 0.1, and it may be decreased gradually toward 0.01. The neighborhood function should initially include all neurons, centered around the winning neuron, and it should gradually shrink by reducing its radius.
During the second, convergence phase the topological map is tuned further to achieve an accurate mapping of the provided input space into the feature space. During the convergence phase the learning rate parameter may be kept small, fixed at, say, 0.01, and it should always remain greater than zero. The neighborhood function during the convergence phase should contain only the closest neighbors of the winning neuron.
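One possible way to realize the two phases is sketched below. The notes only state the start and end values of the learning rate and that the radius shrinks; the exponential form of the decay, the function names `learning_rate` and `radius`, the parameter `ordering_steps`, and the final radius of 0.5 are all assumptions made for this illustration.

```python
def learning_rate(step, ordering_steps, eta_start=0.1, eta_end=0.01):
    """Decay eta from eta_start to eta_end during ordering, then hold it."""
    if step >= ordering_steps:
        return eta_end  # convergence phase: small, fixed, greater than zero
    fraction = step / ordering_steps
    return eta_start * (eta_end / eta_start) ** fraction


def radius(step, ordering_steps, sigma_start, sigma_end=0.5):
    """Shrink the neighborhood radius during ordering, then hold it."""
    if step >= ordering_steps:
        return sigma_end  # convergence phase: only the closest neighbors
    fraction = step / ordering_steps
    return sigma_start * (sigma_end / sigma_start) ** fraction


for step in (0, 500, 1000, 2000):
    print(step, learning_rate(step, 1000), radius(step, 1000, sigma_start=5.0))
```

A linear decay would serve equally well for a sketch; what matters is a wide neighborhood and larger steps early on, followed by fine adjustments with a narrow neighborhood.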
4. Properties of the Feature Map
The feature map is the function that maps the continuous input data space into the discrete output space.
The feature map, defined by the synaptic weight vectors, provides a good approximation to the input space.
The feature map, computed by the SOM algorithm, has a topological ordering because the spatial location of a neuron in the network lattice corresponds to a domain of input examples.
The feature map reflects variations in the statistics of the input distribution because the input space regions, from which example vectors are drawn with higher probability, are mapped onto larger output domains thus giving better resolution.
The self-organizing map is able to select a set of best features for approximating the unknown data distribution.
Example: Let a Kohonen-type self-organizing network with three neurons be given.
A two-dimensional example: x = (x1,x2) = (-1,0) is provided for training.
The initial weight vectors are: w1 = (1,0), w2 = (0,1) and w3 = (-0.707,-0.707).
The outputs of the three neurons are calculated as follows:
s1 = 1 * (-1) + 0 * 0 = -1
s2 = 0 * (-1) + 1 * 0 = 0
s3 = (-0.707) * (-1) + (-0.707) * 0 = 0.707
The winner index is i(x) = 3, because s3 is the largest of the three outputs.
Assuming learning rate $\eta = 0.3$ and neighborhood value h( n, i ) = 1.0, the third weight vector is updated as follows:
w3 = (-0.707,-0.707) + 0.3 * [ (-1,0) - (-0.707,-0.707) ]
= (-0.707,-0.707) + 0.3 * [ (-0.293,0.707) ]
= (-0.795,-0.495)
In order to continue training the network, the updated weight vector has to be normalized to unit length:
$c = 1.0 / \sqrt{w_{31}^2 + w_{32}^2} = 1.0 / \sqrt{(-0.795)^2 + (-0.495)^2} = 1.0 / 0.9365 = 1.0678$
$w'_3 = c \cdot w_3 = (1.0678 \cdot (-0.795),\ 1.0678 \cdot (-0.495)) = (-0.849, -0.529)$
This weight vector w'3 and the other unchanged weights w1 and w2 will be used with the next training example.
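The hand calculation above can be checked with a short script. It is only a sketch that reuses the numbers of the example; the variable names are chosen for illustration, and indices are 0-based in the code, so 1 is added when printing the winner.

```python
import math

x = (-1.0, 0.0)
weights = [(1.0, 0.0), (0.0, 1.0), (-0.707, -0.707)]
eta, h = 0.3, 1.0

# Competition: inner products and the winning neuron.
outputs = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
winner = max(range(len(weights)), key=lambda m: outputs[m])

# Adaptation of the winner, followed by normalization to unit length.
w = weights[winner]
updated = tuple(w_i + eta * h * (x_i - w_i) for w_i, x_i in zip(w, x))
norm = math.sqrt(sum(v * v for v in updated))
normalized = tuple(v / norm for v in updated)

print("winner:", winner + 1)
print("updated:", [round(v, 3) for v in updated])
print("normalized:", [round(v, 3) for v in normalized])
```

Rounded to three decimals, the script reproduces the winner index, the updated weight vector, and the normalized weight vector obtained by hand.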
Suggested Readings:
S. Haykin (1999). Neural Networks: A Comprehensive Foundation, Second Edition, Prentice-Hall, Inc., New Jersey, pp. 443-465.