"We will now switch to a Lagrangian formulation of the problem. There are two reasons
for doing this. The first is that the constraints (12) will be replaced by constraints on the
Lagrange multipliers themselves, which will be much easier to handle. The second is that
in this reformulation of the problem, the training data will only appear (in the actual training
and test algorithms) in the form of dot products between vectors. This is a crucial property
which will allow us to generalize the procedure to the nonlinear case (Section 4).
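
The dot-product property can be illustrated with a minimal sketch. It assumes the standard dual objective of the separable (hard-margin) problem, sum_i alpha_i - 1/2 sum_ij alpha_i alpha_j y_i y_j (x_i . x_j), which has not yet been derived at this point in the text. The helper names (`dot`, `gram_matrix`, `dual_objective`, `decision_value`) and the two-point toy data set are hypothetical, chosen only for illustration, and the multiplier values were worked out by hand for this toy case rather than produced by a solver.

```python
def dot(u, v):
    """Plain Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(xs):
    """All pairwise dot products x_i . x_j of the training vectors."""
    return [[dot(xi, xj) for xj in xs] for xi in xs]


def dual_objective(alpha, y, gram):
    """Assumed hard-margin dual objective; it reads the data only via `gram`."""
    n = len(alpha)
    quad = sum(alpha[i] * alpha[j] * y[i] * y[j] * gram[i][j]
               for i in range(n) for j in range(n))
    return sum(alpha) - 0.5 * quad


def decision_value(alpha, y, xs, b, x_new):
    """Test-time score: the new point also enters only through dot products."""
    return sum(a * yi * dot(xi, x_new) for a, yi, xi in zip(alpha, y, xs)) + b


# Toy data (illustrative only): one point per class, symmetric about the origin.
xs = [(1.0, 1.0), (-1.0, -1.0)]
y = [1, -1]
# Hand-derived multipliers for this toy problem; they satisfy sum_i alpha_i y_i = 0.
alpha = [0.25, 0.25]

gram = gram_matrix(xs)
print(dual_objective(alpha, y, gram))
print(decision_value(alpha, y, xs, 0.0, (1.0, 1.0)))
```

Because `dual_objective` never touches `xs` directly, replacing `dot` with another function of two vectors changes the Gram matrix but leaves the rest of the sketch untouched, which is the kind of substitution the nonlinear generalization relies on.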