CIS 311: Neural Networks

Introduction to Connectionist Learning

1. Neural Computation

This course introduces the theory and practice of neural computation. Neural computation has emerged as a reliable approach to searching for approximate solutions to imprecisely specified real-world problems.

Neurocomputing with artificial neural networks (ANNs) was inspired by biological neural networks. Biological neurons, believed to be the structural constituents of the brain, are much slower than silicon logic gates. Yet inference in biological neural networks is faster than inference on the fastest computer.

The brain compensates for its relatively slow operation by having an enormous number of massively interconnected neurons. A biological neural network is a highly parallel device characterized by robustness and fault tolerance. It is able to:
- learn by adapting its synaptic weights to changes in the environment;
- generalize from given known examples to unknown ones;
- handle imprecise, fuzzy, noisy and probabilistic information.

ANNs are an attempt to mimic these abilities. Neurocomputing is a paradigm that differs from a programmed instruction sequence in that information is stored in the synaptic connections.

Each neuron is an elementary processor with primitive operations like summing the weighted inputs coming to it and then amplifying or thresholding the sum.

A synchronous assembly of neurons can perform universal computations for suitably chosen weights. Such an assembly of neurons can perform the same computations as an ordinary digital computer.
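As a small illustration of this claim, the sketch below wires threshold units into logic gates, the building blocks of a digital computer. The weights and thresholds are hand-picked assumptions, the function names are hypothetical, and treating a sum equal to the threshold as firing is a choice made here rather than something the notes specify.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Hand-picked weights and thresholds (assumptions) turn one unit into a gate.
def and_gate(a, b):
    return threshold_unit([a, b], [1, 1], 2)

def or_gate(a, b):
    return threshold_unit([a, b], [1, 1], 1)

def not_gate(a):
    return threshold_unit([a], [-1], 0)

def xor_gate(a, b):
    # XOR needs more than one unit: (a OR b) AND NOT (a AND b).
    return and_gate(or_gate(a, b), not_gate(and_gate(a, b)))

# Print the truth table: a, b, AND, OR, XOR.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b), xor_gate(a, b))
```

Since AND, OR and NOT suffice to build any Boolean circuit, an assembly of such units can in principle reproduce the logic of a digital computer.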

A neural network is characterized by:
- network architecture (topology);
- network node properties;
- connections between the neurons (weights);
- updating (learning) rules for the weights and the states of the neurons.

ANNs are designed so that they possess many desirable characteristics not present in von Neumann or modern parallel computers. These include:
- distributed representation and computation;
- learning capacity;
- generalization ability;
- adaptability;
- fault tolerance;
- massive parallelism.

von Neumann Computer vs. Biological Neural System

              von Neumann computer        Biological neural system
processor     complex                     simple
              high-speed                  low-speed
              one (or a few)              a large number
memory        separate                    integrated in the neuron
              localized                   distributed
              noncontent addressable      content addressable
computing     centralized                 distributed
              sequential                  parallel
              stored programs             self-learning
reliability   vulnerable                  robust
expertise     numerical and symbolic      perceptual problems
              manipulations
environment   well-defined                poorly defined,
                                          unconstrained

2. Biological Neural Networks

Biological neural networks consist of neurons, which are special cells that process information.

A neuron is composed of a cell body (soma) and two types of outreaching tree-like branches: the dendrites (many) and the axon (one):
- the cell body has a nucleus that contains information about hereditary traits and plasma that holds the equipment for producing necessary material;
- the dendrites are protuberances that, with plenty of surface area, facilitate the connections with the axons of other neurons;
- the axon is a protuberance that delivers the neuron's output to connections with other neurons.

A neuron receives signals (impulses) from other neurons through its dendrites (receivers). A neuron transmits signals generated by its cell body along the axon (transmitter).

At the terminals of the dendrite branches are the synapses. A synapse is a functional unit between two neurons: an axon strand of one neuron and a dendrite branch of another.

A biological neuron does nothing unless the collective influence of all its inputs reaches a threshold level. Whenever that threshold level is reached, the neuron produces a full-strength output in the form of a narrow pulse that proceeds from the cell body down the axon.

When this happens the neuron is said to fire. Because the neuron either fires or does nothing it is said to be an all-or-none device. Stimulation at some synapses encourages neurons to fire. Stimulation at others discourages neurons from firing.

There is mounting evidence that learning takes place in the vicinity of synapses and has something to do with the degree to which synapses translate the pulse traveling down one neuron's axon into excitation or inhibition of the next neuron.

The frequency with which neurons send pulses varies from a few to several hundred hertz, which is a million times slower than the fastest switching speed in contemporary electronic circuits.

However, complex perceptual decisions such as face recognition are typically made by humans within a few hundred milliseconds. These decisions are made by networks of neurons whose operational speed is only a few milliseconds.

This implies that the computations cannot take more than about 100 serial stages; that is, the brain runs parallel programs that are about 100 steps long. The same timing considerations show that the amount of information sent from one neuron to another must be very small.
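The figure of about 100 stages follows from dividing the decision time by the time a single neuron needs. The sketch below assumes 300 ms and 3 ms, illustrative picks within "a few hundred milliseconds" and "a few milliseconds" rather than values given in the notes.

```python
# Assumed values, chosen only to illustrate the order of magnitude.
decision_time_ms = 300  # "a few hundred milliseconds" per perceptual decision
neuron_step_ms = 3      # "a few milliseconds" per neuron operation

# Upper bound on how many neurons can act one after another in that time.
max_serial_stages = decision_time_ms // neuron_step_ms
print(max_serial_stages)  # 100
```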

This implies that critical information is not transmitted directly, but captured and distributed in the connections, hence the name connectionist models, which is also used to describe ANNs.

3. Computational Model of Neuron

ANNs are built of simulated neurons.

The simulated neuron is viewed as a node connected to other nodes via links that correspond to axon-synapse-dendrite connections. Each link is associated with a weight. Like a synapse, that weight determines the nature and strength of one node's influence on another.

One node's influence on another is the product of the influencing neuron's output value times the connecting link's weight. Thus, a large positive weight corresponds to strong excitation, and a small negative weight corresponds to a weak inhibition.

Each node combines the separate influences received on its inputs into an overall influence using an activation function.

One simple activation function passes the sum of the input values through a threshold function to determine the node's output.

The output of each node is either 0 or 1, depending on whether the sum of the inputs is below or above the threshold value used by the threshold function.
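The computational model described in this section can be sketched in a few lines. The function name, the input values, the weights and the threshold are all illustrative assumptions; the notes leave the case where the sum equals the threshold open, and this sketch treats it as not firing.

```python
def node_output(inputs, weights, threshold):
    """Simulated neuron: weight each input, sum the influences, then threshold."""
    # Each influence is the sending node's output times the link's weight.
    influences = [x * w for x, w in zip(inputs, weights)]
    total = sum(influences)
    # Output 1 if the summed influence is above the threshold, otherwise 0.
    return 1 if total > threshold else 0

inputs = [1, 1, 1]
# Two excitatory links and one weakly inhibitory link: the node fires.
excited = node_output(inputs, [0.9, 0.4, -0.2], threshold=1.0)
# Same excitation, but a strongly inhibitory third link: the node stays silent.
inhibited = node_output(inputs, [0.9, 0.4, -0.8], threshold=1.0)
print(excited, inhibited)  # prints: 1 0
```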

4. Neural Network Architectures

ANNs can be viewed as weighted directed graphs in which artificial neurons are nodes and directed edges (with weights) are connections between neuron outputs and neuron inputs.

Based on the connection pattern (architecture), ANNs can be grouped into two categories:

- feed-forward networks, in which graphs have no loops. Generally speaking, feed-forward networks are static because they produce only one set of output values rather than a sequence of values from a given input;
- recurrent (feedback) networks, in which loops occur because of feedback connections. Recurrent networks are dynamic.
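The difference between the two categories can be sketched with threshold units. The function names, weights and network sizes below are hypothetical choices for illustration: the feed-forward pass returns one output for a given input, while the recurrent network feeds its outputs back as inputs and so produces a sequence of states.

```python
def step(x):
    """Threshold function with the threshold fixed at 0 (an assumption)."""
    return 1 if x > 0 else 0

def feed_forward(inputs, hidden_weights, output_weights):
    """One pass through a loop-free graph: inputs -> hidden layer -> one output."""
    hidden = [step(sum(x * w for x, w in zip(inputs, row))) for row in hidden_weights]
    return step(sum(h * w for h, w in zip(hidden, output_weights)))

def recurrent(state, weights, steps):
    """Feed the unit outputs back as inputs and return the sequence of states."""
    history = [state]
    for _ in range(steps):
        state = [step(sum(s * w for s, w in zip(state, row))) for row in weights]
        history.append(state)
    return history

# Static behaviour: a given input always yields the same single output.
hidden_weights = [[1, -1], [-1, 1]]
output_weights = [1, 1]
print(feed_forward([1, 0], hidden_weights, output_weights))

# Dynamic behaviour: two units that excite each other pass activity back and forth.
feedback_weights = [[0, 1], [1, 0]]
print(recurrent([1, 0], feedback_weights, steps=3))
```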

A taxonomy of feed-forward and recurrent network architectures:

- Feed-forward networks
  - Single-layer perceptron
  - Multilayer perceptron
  - Radial-basis function networks
  - Higher-order networks
  - Polynomial learning networks

- Recurrent networks
  - Competitive networks
  - Self-organizing maps
  - Hopfield networks
  - Adaptive-resonance theory models

5. Connectionist Learning Algorithms

Learning involves improvement in performance. The ability of ANNs to learn from examples makes them attractive and exciting.

In order to understand the learning process, we need a model of the computation according to which the network operates and we must know what information is available to the network. The model of the computation is inductive learning from available data examples.

Connectionist learning typically involves the manipulation of connection weights in a single network of units.

The aim of the learning is to reach a point where the network produces certain types of input/output behaviour. This normally involves systematically updating the weights on possible connections between units.

The network weights are updated by learning rules, which govern the learning process. A learning algorithm is the procedure in which learning rules are used to adjust the network weights.
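To make the distinction concrete, the sketch below uses the classic perceptron error-correction rule as the learning rule and a loop over the examples as the learning algorithm. The notes do not spell out a particular rule at this point, so the rule, the function names, the learning rate and the OR data set are all assumptions chosen for illustration.

```python
def predict(weights, bias, inputs):
    """Threshold unit: output 1 if the weighted sum plus bias is positive."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train_perceptron(examples, learning_rate=1, epochs=10):
    """Learning algorithm: apply the learning rule to every example, repeatedly."""
    weights = [0] * len(examples[0][0])
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            # Learning rule: move each weight in proportion to the output error.
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Examples of the logical OR function: (inputs, desired output).
or_examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_perceptron(or_examples)
print([predict(weights, bias, inputs) for inputs, _ in or_examples])
```

Each weight vector visited during training is one hypothesis; training stops changing the weights once every example is classified correctly.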

In the connectionist learning scenario, the learner is the weight updating procedure and the target representation is the network with a certain configuration of weights.

If the architecture of the network is fixed, the hypothesis space is the space of possible weight configurations, and a single hypothesis is a particular configuration of weights. If the architecture is not fixed, the hypothesis space is made up of all possible architecture/weight configuration combinations.

There are two learning paradigms:

- Supervised learning: when there is a tutor or a domain expert who gives the learner immediate feedback about the appropriateness of its behaviour.

The supervised learning task is:

Given: A set of examples and their associated outcomes;

Determine: A general description for each outcome that matches only its examples.

There are two learning approaches:
- Incremental learning: when the examples are processed one at a time;
- Batch learning: when a large set of examples is processed at once.
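The two approaches can be contrasted on a single linear unit with one weight. The sketch below assumes an error-correction (delta) update, a learning rate of 0.05 and a toy data set generated by target = 2 * x; none of these come from the notes.

```python
def incremental_pass(w, examples, lr):
    """Incremental learning: update the weight after every single example."""
    for x, target in examples:
        w += lr * (target - w * x) * x
    return w

def batch_pass(w, examples, lr):
    """Batch learning: accumulate the corrections, then update the weight once."""
    total = sum((target - w * x) * x for x, target in examples)
    return w + lr * total

# Toy examples of the relation target = 2 * x (an assumed data set).
examples = [(1, 2), (2, 4), (3, 6)]

w_incremental = 0.0
w_batch = 0.0
for _ in range(20):  # 20 passes over the data
    w_incremental = incremental_pass(w_incremental, examples, lr=0.05)
    w_batch = batch_pass(w_batch, examples, lr=0.05)

# Both weights approach 2, but along different sequences of intermediate values.
print(w_incremental, w_batch)
```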

Different network architectures require appropriate learning algorithms.

There are four basic types of learning algorithms depending on the rules: error-correction, Boltzmann, Hebbian, and competitive learning.

6. Applications of Artificial Neural Networks

Researchers are designing artificial neural networks to solve a variety of problems such as classification, regression, system identification, pattern recognition, data mining, time-series prediction, etc.

Neural Networks and their Applications

Paradigm      Learning Rule     Network Architecture         Learning Algorithm            Task
Supervised    Error-correction  Single-layer perceptron,     Perceptron learning,          Pattern classification,
                                multilayer perceptron        backpropagation               regression,
                                                                                           time-series modeling
              Boltzmann         Recurrent                    Boltzmann learning            Pattern classification
              Hebbian           Multilayer feed-forward      Linear discriminant analysis  Pattern classification,
                                                                                           data mining
              Competitive       Competitive                  Learning vector quantization  Data compression
Unsupervised  Error-correction  Multilayer feed-forward      Sammon's projection           Data analysis
              Hebbian           Feed-forward or competitive  Principal component analysis  Data compression,
                                                                                           data analysis
              Competitive       Self-organizing maps         Kohonen SOM                   Categorization,
                                                                                           data analysis

Suggested Readings:

Haykin, S. (1999). Neural Networks: A Comprehensive Foundation, Second Edition, Prentice-Hall, Inc., New Jersey.

Jain, A.K. and Mao, J. (1996). Artificial Neural Networks: A Tutorial, IEEE Computer, vol. 29, no. 3, pp. 31-44.