Support-vector machine

Support-vector machine (SVM) is a supervised learning model with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier. SVMs are related to neural networks (an SVM with a sigmoid kernel function is equivalent to a two-layer perceptron) and have been applied in various fields, including bioinformatics, chemistry, and engineering, to solve complex problems.

Overview

An SVM model represents examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.

The data points are labeled as belonging to one of two categories, and the goal of the SVM algorithm is to find the hyperplane that best separates the categories. This hyperplane is the one with the largest distance to the nearest training data point of either class (the so-called geometric margin), since in general the larger the margin, the lower the generalization error of the classifier.
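In the linear case, the learned model consists of a weight vector w (normal to the hyperplane) and an offset b, and the predicted category of a new point x is determined by the side of the hyperplane on which it falls. A standard way to write this decision rule (the notation below is the usual textbook convention, not taken from this article):

```latex
f(\mathbf{x}) = \operatorname{sign}\left(\mathbf{w}^{\top}\mathbf{x} + b\right)
```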

Mathematical Formulation

The SVM algorithm seeks a hyperplane in an N-dimensional space (where N is the number of features) that cleanly separates the data points of the two classes. Many hyperplanes could separate the two classes; the optimal hyperplane is the one with the largest separation, or margin, between them. If such a hyperplane exists, it is known as the maximum-margin hyperplane, and the linear classifier it defines is known as a maximum-margin classifier.
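For concreteness, given linearly separable training examples (x_i, y_i) with labels y_i in {-1, +1}, the maximum-margin hyperplane is the solution of the following standard hard-margin optimization problem (the usual textbook formulation, stated here for illustration):

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i\left(\mathbf{w}^{\top}\mathbf{x}_i + b\right) \ge 1, \qquad i = 1, \dots, n
\end{aligned}
```

The width of the margin is 2/||w||, so minimizing ||w|| maximizes the margin; the training points whose constraints hold with equality are the support vectors that give the method its name.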

Kernel Trick

To solve non-linear classification problems, SVMs employ the kernel trick: data that is not linearly separable in its original space is implicitly mapped into a higher-dimensional space in which it becomes linearly separable. This is achieved through a kernel function, which computes the inner product between two points in the transformed space without computing the coordinates of the points in that space, thus avoiding the explicit mapping, which can be computationally expensive.
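As a concrete illustration, the degree-2 polynomial kernel K(x, z) = (x·z)² on 2-D inputs corresponds to an explicit map φ into a 3-D feature space, and the kernel returns exactly the inner product of the mapped points without ever constructing φ. A minimal sketch in Python (the names phi and poly_kernel are illustrative, not from this article):

```python
import numpy as np

def phi(x):
    """Explicit degree-2 feature map for a 2-D input:
    phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def poly_kernel(x, z):
    """Degree-2 polynomial kernel K(x, z) = (x . z)^2,
    computed entirely in the original 2-D space."""
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 4.0])

# The two quantities are identical: the kernel evaluates the inner
# product in the 3-D feature space without constructing phi.
print(np.dot(phi(x), phi(z)))   # 121.0
print(poly_kernel(x, z))        # 121.0
```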

Applications

SVMs are used in a variety of applications, including:

  • Text and hypertext categorization
  • Image classification
  • Handwriting recognition
  • Bioinformatics, such as protein classification and gene expression analysis

Advantages and Disadvantages

Advantages

  • Effective in high-dimensional spaces.
  • Still effective in cases where the number of dimensions exceeds the number of samples.
  • Versatile: different kernel functions can be specified for the decision function (see the sketch after this list).
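For example, in scikit-learn's SVC the kernel is selected with the kernel parameter; the snippet below (a minimal sketch using the library's bundled iris dataset) trains the same classifier with three different kernels:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0)

# The same estimator with three different kernel functions;
# only the decision function changes, not the training API.
for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel)
    clf.fit(X_train, y_train)
    print(kernel, clf.score(X_test, y_test))
```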

Disadvantages

  • If the number of features is much greater than the number of samples, the method is prone to over-fitting, so the kernel function and regularization term must be chosen carefully.
  • SVMs do not directly provide probability estimates, which are desirable in many applications; probabilities must be calibrated separately (see the sketch after this list).
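In practice, libraries work around this limitation. scikit-learn's SVC, for instance, can calibrate probability estimates with an internal cross-validation (Platt scaling) when constructed with probability=True, at extra training cost. A brief sketch:

```python
from sklearn import datasets
from sklearn.svm import SVC

X, y = datasets.load_iris(return_X_y=True)

# probability=True enables calibrated probability estimates via an
# internal cross-validation (Platt scaling); training is slower.
clf = SVC(kernel="rbf", probability=True).fit(X, y)
print(clf.predict_proba(X[:3]))  # one row of class probabilities per sample
```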
