Simple Iteration Method to Solve Linear Equations

RBF NEURAL NETWORK

1.3.1 Radial basis function technique and RBF neural network

Radial basis functions were introduced by M.J.D. Powell to solve the multivariate function interpolation problem in 1987. In the field of neural networks, the RBF neural network was proposed by D.S. Broomhead and D. Lowe in 1988 for multivariate function interpolation and approximation problems (see [5]).

Below we will briefly present the technique of using radial basis functions to solve multivariate interpolation problems.

Radial basis function technique


Without loss of generality, suppose m = 1; the interpolation function then has the form

φ(x) = Σ_{k=1}^{N} w_k φ_k(x) + w_0   (1)

where φ_k is the kth radial basis function. Usually φ_k takes one of the following forms: the Gaussian form (2), the multiquadric form (3), or the inverse multiquadric form (4):

φ_k(x) = e^{−‖x − v_k‖² / σ_k²}   (2)

φ_k(x) = (‖x − v_k‖² + σ_k²)^{1/2}   (3)

φ_k(x) = (‖x − v_k‖² + σ_k²)^{−1/2}   (4)

In practice, φ_k is most often given in form (2), and this thesis only considers φ_k in form (2).



Note that here ‖·‖ denotes the Euclidean norm, ‖u‖ = √(Σ_i u_i²); v_k is the center of the radial basis function φ_k; σ_k is the radius, also known as the width parameter, of φ_k.

We see that, with the radial function form chosen above, the larger the distance between the input vector x and the center v_k, the smaller the value of the radial function. For each k, the value of the radius σ_k is used to control the influence domain of the radial function φ_k. Accordingly, if

‖x − v_k‖ ≥ 3σ_k

then the value φ_k(x) < e^{−9} is very small and can be neglected.



Figure 10: Illustration of the influence of the radius function


For example, in the figure above a large circle represents a radial basis function, which only affects points within its radius.
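To make the influence-radius remark concrete, here is a minimal Python check (a sketch, not part of the thesis; the center and width values are arbitrary examples) that a Gaussian radial basis function falls to e^{−9} once the input lies at distance 3σ from the center:

```python
import numpy as np

def gaussian_rbf(x, v, sigma):
    """Gaussian radial basis function: phi(x) = exp(-||x - v||^2 / sigma^2)."""
    return np.exp(-np.sum((x - v) ** 2) / sigma ** 2)

# example values (assumed for illustration): center v_k and width sigma_k
v = np.zeros(2)
sigma = 0.5
x = np.array([3 * sigma, 0.0])       # a point at distance exactly 3*sigma from v
value = gaussian_rbf(x, v, sigma)    # = exp(-9) ~ 1.2e-4, negligibly small
```

Points farther than 3σ contribute even less, which is why each basis function effectively acts only inside the circle drawn in the figure.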

Substituting formula (2) into (1), we obtain the mathematical representation of the radial basis function technique as follows:

φ(x^j) = Σ_{k=1}^{N} w_k φ_k(x^j) + w_0 = w_0 + Σ_{k=1}^{N} w_k e^{−‖x^j − v_k‖² / σ_k²} = y^j,  j = 1, …, N   (5)

A very advantageous feature of using radial basis functions to solve the multivariate interpolation problem is that, when considering the squared error

E = Σ_{i=1}^{n} (φ(x^i) − y^i)²

it has been proven that E has only one extremum. Therefore, finding the parameters (w_k, v_k, σ_k) of the radial basis functions so that E reaches its minimum is a very fast and effective approach.
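As a concrete illustration of conditions (5), the following sketch (assumed setup, not the thesis's code) fixes the centers at the data points, takes w_0 as the mean of the y values, and solves the resulting linear system for the weights w_k; the function and variable names are hypothetical:

```python
import numpy as np

def rbf_interpolate(X, y, sigma):
    """Exact Gaussian RBF interpolant with centers fixed at the data points:
    phi(x) = w0 + sum_k w_k * exp(-||x - x_k||^2 / sigma^2)."""
    w0 = y.mean()                                   # bias taken as the mean of y
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    Phi = np.exp(-d2 / sigma ** 2)                  # interpolation matrix
    w = np.linalg.solve(Phi, y - w0)                # weights from the interpolation conditions
    return w0, w

def rbf_eval(x, X, w0, w, sigma):
    """Evaluate the interpolant at a single point x."""
    d2 = np.sum((X - x) ** 2, axis=1)
    return w0 + np.dot(w, np.exp(-d2 / sigma ** 2))

# toy data: the interpolant must reproduce y exactly at the data points
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 0.0, -1.0])
w0, w = rbf_interpolate(X, y, sigma=1.0)
```

Because the Gaussian kernel matrix over distinct points is positive definite, the system always has a unique solution, which matches the single-extremum property of E noted above.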

1.3.2 RBF Neural Network Architecture

The RBF network is a type of feedforward artificial neural network consisting of three layers: n nodes of the input layer for the input vector x ∈ R^n, N hidden neurons (the value of the kth hidden neuron is the return value of the radial basis function φ_k), and m output neurons.


[Figure: input layer X (nodes x_i), a hidden layer of radial basis neurons φ_k, and output layer Y, connected by weights w_i and bias w_0.]

Figure 11: Architecture of RBF network

Of course, as mentioned above, without loss of generality, the content of this thesis only considers the case m = 1.

1.3.3 Training characteristics of RBF Neural Network


The advantages of RBF networks are short training time and very fast, simple setup. Nowadays RBF neural networks are used in many fields:

Image processing
Speech recognition
Digital signal processing
Radar targeting
Medical diagnosis
Pattern recognition
Process error detection
…


CHAPTER 2:

RBF NETWORK TRAINING ITERATIVE ALGORITHM


The contents of this chapter include:

2.1 Description of the two-phase HDH iterative algorithm for arbitrary training data

2.2 Description of the single-phase HDH iterative algorithm used for evenly spaced training data (details of these two algorithms can be found in [6])


2.1. TWO-PHASE ITERATIVE ALGORITHM FOR TRAINING RBF NETWORK

In this chapter, before presenting the iterative algorithm, I would like to present the theoretical basis used to build it.

2.1.1 Simple iteration method for solving a system of linear equations

Suppose we need to solve the system of equations

Ax = b

First we convert it to the equivalent system

x = Bx + d

in which B is a square matrix of order n satisfying

q = ‖B‖ = max{ Σ_{j=1}^{n} |b_{i,j}| : i = 1..n } < 1.

Simple iteration method: for an arbitrary initial vector x^0, the sequence of approximate solutions is constructed by the iterative formula

x^{k+1} = B x^k + d,  that is, x^{k+1}_i = Σ_{j=1}^{n} b_{i,j} x^k_j + d_i,

which satisfies lim_{k→∞} x^k = x*, where x* is the unique exact solution of the system, with the error estimate

‖x* − x^k‖ ≤ (q / (1 − q)) · max_j | x^k_j − x^{k−1}_j |.

Usually we choose x^0 = d; then we consider that the initial approximation is x^0 = 0 and x^1 = d is the next step.
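The method above can be sketched in a few lines of Python (an illustrative sketch with a made-up example matrix, not from the thesis), using the a posteriori error estimate as the stopping rule:

```python
import numpy as np

def simple_iteration(B, d, eps=1e-10, max_iter=1000):
    """Solve x = Bx + d by simple iteration x^{k+1} = B x^k + d.
    Convergence requires ||B|| = q < 1 (here: row-sum norm)."""
    q = np.linalg.norm(B, ord=np.inf)
    assert q < 1, "contraction condition ||B|| < 1 is required"
    x = d.copy()                       # x^0 = d, i.e. one step from x^0 = 0
    for _ in range(max_iter):
        x_new = B @ x + d
        # a posteriori estimate: ||x* - x^{k+1}|| <= q/(1-q) * max_j |x^{k+1}_j - x^k_j|
        if q / (1 - q) * np.max(np.abs(x_new - x)) <= eps:
            return x_new
        x = x_new
    return x

# hypothetical example: a small contraction matrix B and vector d
B = np.array([[0.0, 0.2], [0.3, 0.0]])
d = np.array([1.0, 2.0])
x_star = simple_iteration(B, d)        # solves (I - B) x = d
```

Here q = 0.3, so each sweep shrinks the remaining error by at least that factor.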

2.1.2 Two-phase iterative algorithm for training RBF network

 

Consider the training set {(x^k, y^k)}_{k=1}^{N}; x^k ∈ R^n, y^k ∈ R^m. Without loss of generality, here we consider the RBF network with one output neuron (m = 1); the mathematical representation of the RBF network is then:

φ(x^i) = Σ_{k=1}^{N} w_k φ_k(x^i) + w_0 = y^i,  i = 1, …, N   (1)

Consider the matrix Φ = (φ_{k,i})_{N×N}, where φ_{k,i} = φ_k(x^i) = e^{−‖x^i − x^k‖² / σ_k²}; note that here we choose the centers of the radial basis functions to be exactly the points of the input data set, v_k = x^k.

We denote by I the unit matrix of order N, and let

W = (w_1, …, w_N)^T,  Z = (z_1, …, z_N)^T,  with z_k = y^k − w_0,   (2)

be vectors in the N-dimensional space R^N, and put

Ψ = I − Φ = (ψ_{k,j})_{N×N}   (3)

so that

ψ_{k,j} = 0 when k = j;  ψ_{k,j} = −e^{−‖x^j − x^k‖² / σ_k²} when k ≠ j.   (4)

Then the system of equations (1) is equivalent to the system:

W = ΨW + Z   (5)

As stated in 1.3.1, with arbitrarily chosen parameters σ_k and w_0, system (1), and therefore system (5), always has a unique solution W. Hereafter, the value w_0 is chosen as the average of the y^k values:

w_0 = (1/N) Σ_{k=1}^{N} y^k   (6)

For each k ≤ N, we define the function q_k of σ_k as follows:

q_k = Σ_{j=1}^{N} |ψ_{k,j}|   (7)

The function q_k is monotonically increasing in σ_k, and for every positive number q < 1 there always exists a value σ_k such that q_k(σ_k) = q.
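This monotonicity is what makes the width search tractable. As a sketch (assumed code, not the procedure of [6], which brackets q_k between q·α and q), one can locate a σ_k with q_k(σ_k) ≈ q by simple bisection:

```python
import numpy as np

def qk(X, k, sigma):
    """q_k(sigma) = sum over j != k of exp(-||x^j - x^k||^2 / sigma^2),
    i.e. the sum of |psi_{k,j}| from formula (7); increasing in sigma."""
    d2 = np.sum((X - X[k]) ** 2, axis=1)
    return np.exp(-d2 / sigma ** 2).sum() - 1.0   # subtract the j == k term (= 1)

def find_sigma(X, k, q, tol=1e-8):
    """Find sigma_k with q_k(sigma_k) ~ q by bisection, using monotonicity."""
    lo, hi = 1e-6, 1.0
    while qk(X, k, hi) < q:          # grow the bracket until q_k(hi) >= q
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if qk(X, k, mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# hypothetical 1-D data set
X = np.array([[0.0], [1.0], [2.0]])
sigma0 = find_sigma(X, 0, q=0.5)
```

Since q_k runs from 0 (σ → 0) toward N − 1 (σ → ∞), any q < 1 is always reachable.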

2.1.3. Algorithm description

Given the error ε and positive constants q, α < 1, the algorithm consists of two phases to determine the parameters σ_k and W*. In the first phase, we determine each σ_k so that q_k ≤ q and is closest to q (that is, if σ_k were replaced by σ_k/α, then q_k would exceed q). The next phase finds the approximate solution W* of (5) by a simple iterative method. The algorithm is specified in Figure 12.

Procedure 2-phase algorithm to train RBF network
for k = 1 to N do
Determine σ_k so that q_k ≤ q, and if σ_k is replaced by σ_k/α then q_k > q; // Phase 1
Find W* by the simple iteration method (or the Seidel iteration method); // Phase 2
End

Figure 12: HDH algorithm for training RBF network

To find the solution W* of system (5), we perform the following iterative procedure:

1. Initialize W^0 = Z;
2. Calculate W^{k+1} = ΨW^k + Z;   (8)
3. If the termination condition is not satisfied, assign W^0 := W^1 and return to step 2.

For each N-dimensional vector u, we denote the norm ‖u‖* = Σ_{j=1}^{N} |u_j|. The termination condition can be chosen as one of the following expressions:

a) (q / (1 − q)) ‖W^1 − W^0‖* ≤ ε   (9)

b) t ≥ (ln ε + ln(1 − q) − ln ‖Z‖*) / ln q, where t is the number of iterations.   (10)
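Phase 2 can be sketched as follows (an illustrative sketch with a hypothetical small Ψ, not the thesis's code), iterating (8) and stopping with rule (9):

```python
import numpy as np

def train_phase2(Psi, Z, q, eps=1e-8):
    """Phase 2: solve W = Psi W + Z by the simple iteration (8),
    stopping with rule (9): q/(1-q) * ||W^1 - W^0||_* <= eps,
    where ||u||_* = sum_j |u_j| is the 1-norm."""
    W = Z.copy()                          # W^0 = Z
    while True:
        W_new = Psi @ W + Z               # iteration (8)
        if q / (1 - q) * np.sum(np.abs(W_new - W)) <= eps:
            return W_new
        W = W_new

# hypothetical Psi with zero diagonal (as in formula (4)) and q = ||Psi|| < 1
Psi = np.array([[0.0, -0.2], [-0.3, 0.0]])
Z = np.array([1.0, 2.0])
W_star = train_phase2(Psi, Z, q=0.3)      # approximates the solution of W = Psi W + Z
```

Alternatively, rule (10) fixes the iteration count in advance from q and ‖Z‖*, so no per-step test is needed.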

2.1.4. Comments

This algorithm has the advantages of very simple setup and very fast convergence, and the interpolation error can be adjusted to be arbitrarily small. However, because the network architecture is complex, overfitting often occurs on the training data set. To learn more about the HDH algorithm, see [6]; there, with experimental research results, the author showed that the computation speed and generality of the two-phase HDH iterative algorithm are much better than those of classic algorithms such as the gradient iteration method or the QTL algorithm. For brevity, and to distinguish it from the one-phase iterative algorithm presented next, we call this two-phase HDH iterative algorithm the HDH2 algorithm.


2.2 ONE-PHASE ITERATIVE ALGORITHM FOR TRAINING RBF NETWORKS WITH EQUALLY SPACED DATASETS

The two-phase iterative algorithm above has the characteristic that the training time of phase one occupies the majority. In the case of equally spaced training nodes, the two-phase iterative algorithm can omit this first phase, becoming a single-phase algorithm. This algorithm trains on equally spaced nodes and is often applied in computer graphics, pattern recognition, engineering problems, etc., and is the basis for solving the interpolation problem with noisy training data presented in the next chapter.

2.2.1 Representation of interpolation nodes

The interpolation points are equidistant and can be represented as x_{i1,i2,…,in} = (x_1, …, x_n), in which x_k = x_k^0 + i_k · h_k. Here k denotes the direction; h_k (k = 1, 2, …, n) is a constant representing the distance between two adjacent nodes along direction x_k; i_k takes values from 0 to n_k, with n_k + 1 being the number of nodes in each direction.
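A node set of this form can be generated as a Cartesian product of per-direction grids; a minimal sketch (the function name and example values are assumptions, not from the thesis):

```python
import itertools
import numpy as np

def grid_nodes(x0, h, counts):
    """Equally spaced interpolation nodes with components
    x_k = x0[k] + i_k * h[k], i_k = 0 .. counts[k] - 1
    (counts[k] = n_k + 1 nodes along direction k)."""
    axes = [x0[k] + h[k] * np.arange(counts[k]) for k in range(len(x0))]
    return np.array(list(itertools.product(*axes)))

# hypothetical 2-D grid: 3 nodes in direction 1, 2 nodes in direction 2
nodes = grid_nodes(x0=[0.0, 0.0], h=[0.5, 1.0], counts=[3, 2])
```

The total number of nodes is the product of the counts per direction, here 3 · 2 = 6.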

2.2.2. Algorithm description

Instead of the Euclidean norm, we consider a Mahalanobis norm ‖x‖² = x^T A x for a suitably chosen matrix A, and the parameter σ_k is chosen equal to h_k. Then expression (1) in section 2.1.2 can be rewritten in the form φ(x) = Σ w_{i1..in} φ_{i1..in}(x) + w_0, the matrix Ψ takes a correspondingly modified form, and q_{i1..in} can be rewritten accordingly. Applying some mathematical transformations (see [6] for details), the author proved a condition on the radius σ_{i1..in} that guarantees q_{i1..in} < q. We can therefore choose a single width for all radial basis functions that ensures the stopping condition always holds. Thus the initial phase of computing the width parameter for each radial function, which takes up most of the training time, is resolved immediately, and the two-phase iteration becomes a single-phase iterative algorithm for training on equidistant nodes.

2.2.3. Comments

According to the experimental results in [6], besides showing that the two-phase HDH iterative algorithm has good generality and much faster training time than other algorithms, the author also showed that the one-phase HDH iterative algorithm, with its large reduction in training time, has a great advantage in computation speed; moreover, its generality is even better than that of the two-phase HDH iterative algorithm.

This algorithm has the characteristic that, for a given value domain, the more densely spaced the nodes are, the better the generality.
