
Using a Radial Basis Function as kernel

Normally a Gaussian is used as the RBF; the figure below shows a two-dimensional version of such a kernel. From the kernel definition, the output of the kernel depends on the Euclidean distance between its two arguments (one of these is the support vector and the other is the test data point). The support vector is the centre of the RBF, and σ, the width of the Gaussian, determines the area of influence this support vector has over the data space.
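As a minimal sketch of the kernel described above, the following Python evaluates a Gaussian RBF between a support vector and a test point. The function name `rbf_kernel`, the sample points, and the 2σ² scaling in the exponent are assumptions made for illustration; the original kernel equation is not reproduced on this page, so its exact scaling may differ.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF: exp(-||x - y||^2 / (2 * sigma^2)).

    The 2 * sigma^2 scaling is an assumed (common) convention.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared Euclidean distance between the two input vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical support vector and test point, for illustration only.
support_vector = (0.0, 0.0)
test_point = (1.0, 1.0)

for sigma in (0.5, 1.0, 2.0):
    value = rbf_kernel(support_vector, test_point, sigma)
    print(f"sigma={sigma}: K={value:.4f}")
```

For a fixed distance, the kernel value grows with σ, which is one way to read "area of influence": a wider Gaussian still responds to points that a narrow one has effectively forgotten.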


Figure: The Radial Basis Function kernel

A larger value of σ gives a smoother decision surface and a more regular decision boundary, because an RBF with a large σ allows a support vector to have a strong influence over a larger area. The decision-surface and decision-boundary figures below illustrate this for two different σ values.

A larger σ also increases the α values (the Lagrange multipliers) of the classifier. When one support vector influences a larger area, the other support vectors in that area increase their α values to counter this influence, so all α values reach a balance at a larger magnitude. A larger σ also reduces the number of support vectors (see the table of results for different σ settings): since each support vector covers a larger space, fewer are needed to define the boundary.
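The notion of a support vector's influence can be illustrated with the standard SVM decision function, a weighted sum of kernel evaluations centred on the support vectors. The sketch below is not a trained classifier: the support vectors, labels, multipliers and bias are hypothetical values chosen by hand, and the multipliers are held fixed while σ changes, which a real training run would not do. The 2σ² kernel scaling is likewise an assumption.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF with an assumed 2 * sigma^2 scaling."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision_value(x, support_vectors, labels, alphas, bias, sigma):
    """Standard SVM decision function: sum_i alpha_i * y_i * K(sv_i, x) + b."""
    total = sum(
        alpha * label * rbf_kernel(sv, x, sigma)
        for sv, label, alpha in zip(support_vectors, labels, alphas)
    )
    return total + bias


# Hypothetical one-dimensional classifier: one support vector per class.
support_vectors = [(-1.0,), (1.0,)]
labels = [-1, 1]
alphas = [1.0, 1.0]  # hand-picked, not the result of training
bias = 0.0

far_point = (4.0,)
for sigma in (0.5, 2.0):
    value = decision_value(far_point, support_vectors, labels, alphas, bias, sigma)
    print(f"sigma={sigma}: f(4.0)={value:.6f}")
```

With the narrow kernel the decision value far from both support vectors is almost zero, while the wide kernel still gives a clearly positive value there, so the decision surface varies more slowly across the data space.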

Because the α values grow, the estimate of ||w|| also increases. The estimate of the VC dimension of the SVM depends on the L2 norm of w and on the radius R of the sphere that encompasses all the data. As σ increases, the value of R decreases, which balances the increase in ||w||.
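The two ingredients named above can be computed from kernel values alone. The sketch below uses the standard identity ||w||² = Σᵢ Σⱼ αᵢ αⱼ yᵢ yⱼ K(xᵢ, xⱼ), and brackets R rather than solving for the exact enclosing sphere: if D is the largest pairwise distance in feature space, then D/2 ≤ R ≤ D/√2 (the upper bound is Jung's theorem). The function names, the sample points, and the 2σ² kernel scaling are assumptions for illustration, and the bracket is a substitute for whatever expression for R the original equation used.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian RBF with an assumed 2 * sigma^2 scaling."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, sigma):
    """Kernel value for every pair of points."""
    return [[rbf_kernel(p, q, sigma) for q in points] for p in points]


def w_norm_squared(alphas, labels, gram):
    """||w||^2 = sum_ij alpha_i alpha_j y_i y_j K_ij (alphas come from training)."""
    n = len(alphas)
    return sum(
        alphas[i] * alphas[j] * labels[i] * labels[j] * gram[i][j]
        for i in range(n)
        for j in range(n)
    )


def radius_bounds(gram):
    """Return (lower, upper) bounds on the enclosing-sphere radius R.

    Feature-space distance: d_ij^2 = K_ii + K_jj - 2 * K_ij.
    """
    n = len(gram)
    diameter_sq = max(
        gram[i][i] + gram[j][j] - 2.0 * gram[i][j]
        for i in range(n)
        for j in range(n)
    )
    diameter = math.sqrt(max(diameter_sq, 0.0))
    return diameter / 2.0, diameter / math.sqrt(2.0)


# Hypothetical data points, for illustration only.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (2.0, 2.0)]
for sigma in (0.5, 1.0, 4.0):
    low, high = radius_bounds(gram_matrix(points, sigma))
    print(f"sigma={sigma}: {low:.4f} <= R <= {high:.4f}")
```

Both bounds shrink as σ grows, because every pairwise kernel value moves towards 1 and the mapped points crowd together. Demonstrating the matching growth in ||w|| would need multipliers from an actual training run, so `w_norm_squared` is given only as a helper.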

The experiments with different σ values show that the expected risk corresponds loosely to the accuracy of the classifier measured on test data (see the figures showing the expected error and the classifier accuracy for different σ values).

Table: The results for different σ settings

Figure: Decision surface for small σ

Figure: Decision surface for large σ

Figure: The values of the Lagrange multipliers for small σ

Figure: The values of the Lagrange multipliers for large σ

Figure: Decision boundary for small σ

Figure: Decision boundary for large σ

Figure: The expected error for different σ values

Figure: The accuracy of the classifier for different σ values


K.K. Chin
Thu Sep 10 11:05:30 BST 1998