string kernels for GP/SVM regression

Question

I want to solve a small regression problem where the inputs are variable-length strings from a small vocabulary. I'd like to use Gaussian Process regression with some kind of string kernel. (SVM regression also ok.)

I see from this page that Shogun supports many kinds of string kernels. Can someone please provide a high-level summary (with references to papers) of how they work?

I'd also like to see a worked example (in Python), since I've never used Shogun before. I found this post on Stack Overflow, but it dates from 2014, and it's not clear whether the interface it uses is still up to date.

Thanks, Kevin

I'm just about to finish the string kernels tutorial Jupyter notebook; I'll post the link here so you can see how to use string kernels in Shogun. - Viktor, Mar 14 '18 at 12:11

score 0 · Answer 1 · answered Apr 27 '18 at 10:04

The documentation pages of string kernel classes contain the information you are looking for. For example:

http://www.shogun-toolbox.org/api/latest/classshogun_1_1CPolyMatchStringKernel.html contains a high-level summary.
http://www.shogun-toolbox.org/api/latest/classshogun_1_1CSalzbergWordStringKernel.html refers to the paper.

Quite likely, not every class page contains both a summary and a paper reference; some may have only one of the two, or neither.
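
For intuition about what such a kernel computes, here is a minimal sketch in plain Python/NumPy. It is not Shogun's API. It assumes a k-spectrum kernel (the inner product of the two strings' k-mer count vectors, which works for variable-length strings) and a textbook zero-mean GP posterior. The function names, the default k=2, and the noise value are all made up for illustration.

```python
from collections import Counter

import numpy as np


def spectrum_features(s, k):
    """Count every contiguous substring of length k in s."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(a, b, k=2):
    """k-spectrum kernel: dot product of the two k-mer count vectors."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    return float(sum(count * fb[kmer] for kmer, count in fa.items()))


def gram_matrix(xs, ys, k=2):
    """Kernel matrix with one row per string in xs, one column per string in ys."""
    return np.array([[spectrum_kernel(x, y, k) for y in ys] for x in xs])


def gp_predict(train_x, train_y, test_x, k=2, noise=1e-2):
    """Posterior mean and variance of a zero-mean GP (noise = assumed variance)."""
    K = gram_matrix(train_x, train_x, k) + noise * np.eye(len(train_x))
    K_star = gram_matrix(test_x, train_x, k)
    L = np.linalg.cholesky(K)  # K = L @ L.T
    y = np.asarray(train_y, dtype=float)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    prior_var = np.array([spectrum_kernel(x, x, k) for x in test_x])
    var = prior_var - np.sum(v ** 2, axis=0)
    return mean, var
```

For example, `spectrum_kernel("abab", "ab")` is 2.0, because the 2-mer "ab" occurs twice in the first string and once in the second. A call such as `gp_predict(["abab", "bbbb", "abba"], [1.0, 0.0, 0.5], ["abbb"])` returns the predictive mean and variance for the new string. In practice the raw kernel is often normalized so that long strings do not dominate; that step is left out here.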
