Linear Support Vector Machines
Despite the name, linear machines can be learned with or without the use of kernels. If the problem is known to be non-linearly separable, you can still use any kernel that implements the ITransform interface to explicitly transform the input points into a kernel space and perform linear classification in this expanded space.
- Download a sample project containing most of the source code below.
In this example, we will create a linear SVM to model a binary AND function. The binary AND is an operation that takes two bit values and computes their logical conjunction: it returns zero whenever at least one of the inputs is zero, and one when both inputs are one. This is a linearly separable problem and, as such, should be learned perfectly by a linear support vector machine.
// Create a simple binary AND
// classification problem:
double[][] problem =
{
// a b a AND b
new double[] { 0, 0, 0 },
new double[] { 0, 1, 0 },
new double[] { 1, 0, 0 },
new double[] { 1, 1, 1 },
};
// Get the first two columns as the problem
// inputs, and the last column as the outputs
// input columns
double[][] inputs = problem.GetColumns(0, 1);
// output column
int[] outputs = problem.GetColumn(2).ToInt32();
// Plot the problem on screen
ScatterplotBox.Show("AND", inputs, outputs).Hold();
The AND problem is considered a linearly separable problem because we can easily draw a simple line that separates the red dots from the blue dots.
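To make this concrete, here is a minimal sketch (in Python rather than the C# used by the framework, and with a hand-picked hyperplane rather than a learned one) showing that the line a + b - 1.5 = 0 already separates the single positive AND point from the three negative ones:

```python
# The AND truth table as (a, b, a AND b) rows
points = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]

# Classify each point by which side of the line a + b - 1.5 = 0 it falls on
preds = [int(a + b - 1.5 > 0) for a, b, _ in points]
print(preds)  # [0, 0, 0, 1] -- matches the AND outputs exactly
```

Any such separating hyperplane proves the problem is linearly separable; the SVM below simply finds one automatically, with a maximal margin.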
// Create a L2-regularized L2-loss support vector classification algorithm
var teacher = new LinearDualCoordinateDescent()
{
Loss = Loss.L2,
Complexity = 1000,
Tolerance = 1e-5
};
// Use the algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);
// Compute the machine's answers for the learned inputs
bool[] answers = svm.Decide(inputs);
// Convert to Int32 so we can plot:
int[] zeroOneAnswers = answers.ToZeroOne();
// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers)
.Hold();
As expected, the SVM's answers perfectly match the outputs of the AND function: a linear SVM is able to learn this linearly separable problem without effort.
In this example, we will create a linear SVM to model a binary XOR function. The binary XOR is an operation that takes two bit values and computes their exclusive OR: it outputs one when the two values are different, and zero when they are the same. This is not a linearly separable problem, and as such, we can't expect it to be learned perfectly by a linear support vector machine. However, we will see that we can still use kernels to transform this problem from non-linearly separable to linearly separable, with a few extra steps.
// Create a simple binary XOR
// classification problem:
double[][] problem =
{
// a b a XOR b
new double[] { 0, 0, 0 },
new double[] { 0, 1, 1 },
new double[] { 1, 0, 1 },
new double[] { 1, 1, 0 },
};
// Get the first two columns as the problem
// inputs, and the last column as the outputs
// input columns
double[][] inputs = problem.GetColumns(0, 1);
// output column
int[] outputs = problem.GetColumn(2).ToInt32();
// Plot the problem on screen
ScatterplotBox.Show("XOR", inputs, outputs).Hold();
The XOR problem is considered a non-linearly separable problem because we cannot simply draw a line that separates the red dots from the blue dots.
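This can be illustrated with a small brute-force sketch (in Python, using an ad-hoc weight grid that is not part of the framework): over a whole grid of candidate hyperplanes sign(w1·a + w2·b + bias), none classifies more than 3 of the 4 XOR points correctly.

```python
import itertools
import numpy as np

X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 0]  # a XOR b

best = 0
# Try every linear threshold on a coarse grid of weights and biases
for w1, w2, bias in itertools.product(np.linspace(-2, 2, 41), repeat=3):
    preds = [int(w1 * a + w2 * b + bias > 0) for a, b in X]
    best = max(best, sum(p == t for p, t in zip(preds, y)))

print(best)  # 3 -- at most 3 of 4 points: no linear separator exists
```

The grid search is only a demonstration, not a proof, but the result agrees with the classical argument that no line can put (0,1) and (1,0) on one side while keeping (0,0) and (1,1) on the other.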
// Create a L2-regularized L2-loss support vector classification algorithm
var teacher = new LinearDualCoordinateDescent()
{
Loss = Loss.L2,
Complexity = 1000,
Tolerance = 1e-5
};
// Use the learning algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);
// Compute the machine's answers:
bool[] answers = svm.Decide(inputs);
// Convert to Int32 so we can plot:
int[] zeroOneAnswers = answers.ToZeroOne();
// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers).Hold();
As we can see, a linear SVM cannot learn the binary XOR function: the outputs the machine produces differ from what we expected. This happens because XOR is a non-linearly separable problem, and a linear support vector machine can only learn linearly separable problems. However, with the aid of kernels, we can attempt to transform this non-linearly separable problem into a linearly separable one. We will now see that, by using explicit kernel expansions instead of the kernel trick, we don't have to give up the advantages of a linear classification model to do so.
// Use an explicit kernel expansion to transform the
// non-linear classification problem into a linear one
//
// Create a quadratic kernel
Quadratic quadratic = new Quadratic(constant: 1);
// Project the inputs into a higher dimensionality space
double[][] expansion = quadratic.Transform(inputs);
At this point, we have transformed our initial 2-dimensional problem into a higher-dimensional classification problem. According to Cover's theorem, the probability that a problem is linearly separable increases when we project it into a higher-dimensional space through a non-linear transformation.
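As an illustration of what the expansion does (a Python sketch of the underlying math, not the framework's own implementation), the quadratic kernel k(x, z) = (x·z + c)² over 2-dimensional inputs corresponds to the explicit 6-dimensional feature map φ(x) = (x1², x2², √2·x1x2, √(2c)·x1, √(2c)·x2, c). In that expanded space, a hand-picked hyperplane separates XOR:

```python
import numpy as np

def quadratic_expansion(x, c=1.0):
    # Explicit feature map of the quadratic kernel k(x, z) = (x.z + c)^2
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2,
                     np.sqrt(2) * x1 * x2,
                     np.sqrt(2 * c) * x1, np.sqrt(2 * c) * x2,
                     c])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = [0, 1, 1, 0]  # a XOR b

# Sanity check: the map reproduces the kernel value exactly
for a in X:
    for b in X:
        assert np.isclose(quadratic_expansion(a) @ quadratic_expansion(b),
                          (a @ b + 1.0) ** 2)

# The hyperplane x1 + x2 - 2*x1*x2 - 0.5 = 0, written as a weight
# vector over the 6 expanded features, classifies XOR perfectly:
w = np.array([0, 0, -np.sqrt(2), 1 / np.sqrt(2), 1 / np.sqrt(2), 0])
preds = [int(quadratic_expansion(x) @ w - 0.5 > 0) for x in X]
print(preds)  # [0, 1, 1, 0] -- XOR is linearly separable here
```

The linear SVM trained on the expansion below finds such a hyperplane automatically; the point of the explicit expansion is that training and prediction remain purely linear operations.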
Now, all we have to do is to learn a linear SVM that can operate in this expanded higher-dimensionality space.
// Create the same learning algorithm in the expanded input space
teacher = new LinearDualCoordinateDescent()
{
Loss = Loss.L2,
Complexity = 1000,
Tolerance = 1e-5
};
// Use the learning algorithm to learn the machine
svm = teacher.Learn(expansion, outputs);
// Compute the machine's answers for the learned inputs
answers = svm.Decide(expansion);
// Convert to Int32 so we can plot:
zeroOneAnswers = answers.ToZeroOne();
// Plot the results
ScatterplotBox.Show("SVM's answer", inputs, zeroOneAnswers).Hold();
As we can see, after we apply a higher-dimensional transformation to our original input domain, the problem becomes linearly separable. As such, we can learn a linear machine that draws a hyperplane separating the blue dots from the red dots, even though they were not linearly separable in the original input space.
In this example, we will use the framework's Accord.IO module to create a sparse data reader that is able to learn data written in LIBSVM's sparse sample data format. The dataset chosen for this example is available at LibSVM's website.
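For reference, LIBSVM's sparse format stores one sample per line as `<label> <index>:<value> ...`, with 1-based feature indices and zero-valued entries omitted. A minimal Python sketch of a parser for one such line (a hypothetical helper for illustration, not the SparseReader API):

```python
def parse_libsvm_line(line):
    # "<label> <index>:<value> ..." -- indices are 1-based and may
    # be non-contiguous; absent indices are implicitly zero
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        idx, val = item.split(":")
        features[int(idx)] = float(val)
    return label, features

label, feats = parse_libsvm_line("2 1:0.5 3:1.25")
print(label, feats)  # 2.0 {1: 0.5, 3: 1.25}
```

The SparseReader used below handles this parsing (and the end-of-file bookkeeping) for us.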
// Create a new LibSVM sparse format data reader
// to read the Wisconsin's Breast Cancer dataset
//
var reader = new SparseReader("examples-sparse.txt");
// Read the sparse inputs and outputs from the file
var results = reader.ReadSparseToEnd();
Sparse<double>[] inputs = results.Item1;
double[] doubleOutputs = results.Item2;
// The dataset has output labels as 4 and 2. We have to convert them
// into negative and positive labels so they can be properly processed.
//
bool[] outputs = doubleOutputs.Apply(x => x == 2.0 ? false : true);
// Create a learning algorithm for Sparse data. The first generic argument
// of the learning algorithm below is the chosen kernel function, and the
// second is the type of inputs the machine should accept. Note that, using
// those interfaces, it is possible to define custom kernel functions that
// operate directly on double[], string[], graphs, trees or any object:
var teacher = new LinearDualCoordinateDescent<Linear, Sparse<double>>()
{
Loss = Loss.L2,
Complexity = 1000, // Create a hard-margin SVM
Tolerance = 1e-5
};
// Use the learning algorithm to learn the machine
var svm = teacher.Learn(inputs, outputs);
// Compute the machine's answers
bool[] answers = svm.Decide(inputs);
// Create a confusion matrix to show the machine's performance
var m = new ConfusionMatrix(predicted: answers, expected: outputs);
// Show it onscreen
DataGridBox.Show(new ConfusionMatrixView(m)).Hold();