"Finally, what happens if one uses a kernel which does not satisfy Mercer’s condition?
In general, there may exist data such that the Hessian is indefinite, and for which the
quadratic programming problem will have no solution (the dual objective function can
become arbitrarily large). However, even for kernels that do not satisfy Mercer’s condition,
one might still find that a given training set results in a positive semidefinite Hessian, in
which case the training will converge perfectly well. In this case, however, the geometrical
interpretation described above is lacking."
Burges (1998)
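
The point made in the quotation can be checked numerically on small training sets. The sketch below is illustrative only and is not taken from the quoted text: it assumes the usual SVM dual Hessian H_ij = y_i y_j K(x_i, x_j), and it uses a hyperbolic tangent kernel with hypothetical parameters `kappa` and `delta`, chosen here as an example of a kernel that is not positive semidefinite for every parameter setting. The helper names `tanh_kernel`, `dual_hessian`, and `min_eigenvalue` are likewise made up for this example.

```python
import numpy as np


def tanh_kernel(u, v, kappa=1.0, delta=1.0):
    # Assumed example kernel: K(u, v) = tanh(kappa * <u, v> - delta).
    # The parameter values are illustrative, not prescribed by the quoted text.
    return np.tanh(kappa * np.dot(u, v) - delta)


def dual_hessian(X, y, kernel):
    # Build H_ij = y_i * y_j * K(x_i, x_j) for a training set (X, y).
    n = len(X)
    K = np.array([[kernel(X[i], X[j]) for j in range(n)] for i in range(n)])
    return np.outer(y, y) * K


def min_eigenvalue(H):
    # H is symmetric, so eigvalsh applies; a negative minimum means that
    # H is not positive semidefinite.
    return float(np.linalg.eigvalsh(H).min())


y = np.array([1.0, -1.0])

# Training set on which the kernel yields an indefinite Hessian.
X_bad = np.array([[1.0, 0.0], [2.0, 0.0]])
lam_bad = min_eigenvalue(dual_hessian(X_bad, y, tanh_kernel))

# Training set on which the same kernel yields a positive definite Hessian.
X_ok = np.array([[2.0, 0.0], [0.0, 2.0]])
lam_ok = min_eigenvalue(dual_hessian(X_ok, y, tanh_kernel))

print("min eigenvalue, first set: ", lam_bad)
print("min eigenvalue, second set:", lam_ok)
```

With these particular points, the smallest eigenvalue is negative for the first set and positive for the second, so the same kernel gives an ill-posed problem on one data set and a well-behaved one on the other. Because multiplying by the labels only rescales rows and columns by ±1, the Hessian is positive semidefinite exactly when the kernel matrix itself is.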