An Introduction to Neural Networks

Dr. Leslie Smith
Centre for Cognitive and Computational Neuroscience
Department of Computing and Mathematics
University of Stirling.
lss@cs.stir.ac.uk
last major update: 25 October 1996; last minor update: 22 April 1998.
This document is a roughly HTML-ised version of a talk given at the NSYN meeting in Edinburgh, Scotland, on 28 February 1996. Please email me comments, but remember that this is just the slides from an introductory talk!

Overview:

Why would anyone want a `new' sort of computer?

What is a neural network?

Some algorithms and architectures.

Where have they been applied?

What new applications are likely?

Some useful sources of information.

Why would anyone want a `new' sort of computer?

What are (everyday) computer systems good at... and not so good at?

Good at:
Fast arithmetic
Doing precisely what the programmer programs them to do

Not so good at:
Interacting with noisy data or data from the environment
Massive parallelism
Fault tolerance
Adapting to circumstances
Where can neural network systems help?  

What is a neural network?

Neural Networks are a different paradigm for computing: a neural network is a form of multiprocessor computer system, with many simple processing elements, a high degree of interconnection between them, simple scalar messages, and adaptive interaction between elements. A biological neuron may have as many as 10,000 different inputs, and may send its output (the presence or absence of a short-duration spike) to many other neurons. Neurons are wired up in a 3-dimensional pattern.

Real brains, however, are orders of magnitude more complex than any artificial neural network so far considered.

Example: A simple single unit adaptive network:

The network has 2 inputs, and one output. All are binary. The output is

1 if W0 * I0 + W1 * I1 + Wb > 0

0 if W0 * I0 + W1 * I1 + Wb <= 0

We want it to learn simple OR: output a 1 if either I0 or I1 is 1.
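For concreteness, here is a minimal sketch of such a unit in Python (not part of the original talk; the particular weight values are an illustrative assumption, chosen by hand):

    # A single threshold unit with two binary inputs I0, I1, weights
    # W0, W1 and a bias weight Wb, as defined above.

    def unit_output(i0, i1, w0, w1, wb):
        # Return 1 if W0*I0 + W1*I1 + Wb > 0, otherwise 0.
        return 1 if (w0 * i0 + w1 * i1 + wb) > 0 else 0

    # With W0 = W1 = 0.5 and Wb = -0.25 (hand-picked, illustrative
    # values) the unit already computes OR:
    for i0 in (0, 1):
        for i1 in (0, 1):
            print(i0, i1, "->", unit_output(i0, i1, 0.5, 0.5, -0.25))

The point of the learning rule in the next section is that such weights need not be chosen by hand: the network can find them itself.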

Algorithms and Architectures.

The simple Perceptron:

The network adapts as follows: change the weight by an amount proportional to the difference between the desired output and the actual output.

As an equation:

ΔWi = η * (D - Y) * Ii

where η is the learning rate, D is the desired output, and Y is the actual output.

This is called the Perceptron Learning Rule, and goes back to the early 1960s.

We expose the net to the patterns:

I0 I1 Desired output
0 0 0
0 1 1
1 0 1
1 1 1
We train the network on these examples, and record the weights after each epoch (one exposure to the complete set of patterns).
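A minimal sketch of this training procedure in Python (not from the original talk; the starting weights and the learning rate are arbitrary assumptions, so the exact epoch at which the weights settle will differ from the original example):

    # Perceptron Learning Rule, Delta Wi = eta * (D - Y) * Ii,
    # applied to the OR patterns above. The bias is treated as a
    # weight Wb on a constant input of 1.

    patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

    w0, w1, wb = 0.0, 0.0, 0.0   # arbitrary starting weights
    eta = 0.25                   # learning rate (arbitrary choice)

    for epoch in range(1, 11):
        for (i0, i1), desired in patterns:
            y = 1 if (w0 * i0 + w1 * i1 + wb) > 0 else 0
            w0 += eta * (desired - y) * i0
            w1 += eta * (desired - y) * i1
            wb += eta * (desired - y) * 1   # bias input is always 1
        print("epoch", epoch, ":", w0, w1, wb)

Once every pattern is classified correctly, (D-Y) is 0 for all of them and the printed weights stop changing.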

At this point (after 8 epochs) the network has finished learning. Since (D-Y)=0 for all patterns, the weights cease adapting.

Single perceptrons are limited in what they can learn:

If we have two inputs, the decision surface is a line: the boundary lies where W0 * I0 + W1 * I1 + Wb = 0, and solving for I1 gives

I1 = -(W0/W1) * I0 - (Wb/W1)

In general, single perceptrons implement a simple hyperplane decision surface.

This restricts the possible mappings available.
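As an illustration of this restriction, the same learning rule can be run on XOR (output 1 when exactly one input is 1), which no single line can separate. A short sketch (Python, same assumptions as the earlier sketches):

    # The Perceptron Learning Rule applied to XOR. Because the two
    # classes are not linearly separable, no single threshold unit can
    # get all four patterns right, and the error never falls to zero.

    patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    w0, w1, wb, eta = 0.0, 0.0, 0.0, 0.25
    for epoch in range(100):
        errors = 0
        for (i0, i1), desired in patterns:
            y = 1 if (w0 * i0 + w1 * i1 + wb) > 0 else 0
            if y != desired:
                errors += 1
                w0 += eta * (desired - y) * i0
                w1 += eta * (desired - y) * i1
                wb += eta * (desired - y)
    print("misclassified patterns after 100 epochs:", errors)

Networks with a layer of hidden units between input and output, described next, remove this limitation.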

Developments from the simple perceptron:

Back-Propagated Delta Rule Networks (BP) (sometimes known as multi-layer perceptrons (MLPs)) and Radial Basis Function Networks (RBF) are both well-known developments of the Delta rule for single layer networks (itself a development of the Perceptron Learning Rule). Both can learn arbitrary mappings or classifications. Further, the inputs (and outputs) can have real values.
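As a rough sketch of what such a network looks like in practice (Python with NumPy; the network size, learning rate, and iteration count are arbitrary assumptions, not taken from the talk), a small back-propagation network can learn XOR, which defeated the single perceptron above:

    import numpy as np

    # A tiny multi-layer perceptron (2 inputs, 4 hidden units, 1 output)
    # trained by back-propagating the delta rule, using sigmoid units.

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    D = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

    W1 = rng.normal(size=(2, 4))    # input -> hidden weights
    b1 = np.zeros((1, 4))
    W2 = rng.normal(size=(4, 1))    # hidden -> output weights
    b2 = np.zeros((1, 1))
    eta = 0.5                       # learning rate (arbitrary)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    for _ in range(20000):
        # forward pass
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        # backward pass: error times the derivative of the sigmoid
        d_out = (D - Y) * Y * (1 - Y)
        d_hid = (d_out @ W2.T) * H * (1 - H)
        # weight updates
        W2 += eta * H.T @ d_out
        b2 += eta * d_out.sum(axis=0, keepdims=True)
        W1 += eta * X.T @ d_hid
        b1 += eta * d_hid.sum(axis=0, keepdims=True)

    # Outputs should now be close to the XOR targets 0, 1, 1, 0.
    print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))

This is only a sketch: practical BP and RBF networks add refinements such as momentum, better weight initialisation, and different output nonlinearities.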
Back-Propagated Delta Rule Netwo