How to combine multiple feature sets in bag of words

Question

I have text classification data where the prediction depends on two categories of text, 'descriptions' and 'components'. I can already do the classification in Python with scikit-learn using a bag of words built from 'descriptions' alone. Now I want predictions that use both categories in the bag of words, with a weight on each feature set, for example x = descriptions + 2 * components. How should I proceed?

You can concatenate feature sets, and you can put weights on them, too. — Has QUIT--Anony-Mousse, Sep 30 '15 at 17:45
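A minimal sketch of what the weighted concatenation in this comment could look like with scikit-learn and SciPy. The toy data, the classifier choice (LogisticRegression), the helper name predict and the weight value are all assumptions made for illustration; none of them come from the thread.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy data, invented purely for illustration.
descriptions = [
    "screen flickers when the laptop wakes up",
    "invoice total is wrong after discount",
    "display stays black after resume",
    "tax is missing on the invoice",
]
components = ["graphics driver", "billing module", "graphics driver", "billing module"]
labels = ["hardware", "finance", "hardware", "finance"]

# One bag-of-words vocabulary per text field.
desc_vec = CountVectorizer()
comp_vec = CountVectorizer()
X_desc = desc_vec.fit_transform(descriptions)
X_comp = comp_vec.fit_transform(components)

# x = descriptions + 2 * components: scale the component counts, then place
# the two blocks side by side so each field keeps its own columns.
COMPONENT_WEIGHT = 2.0  # hypothetical value; tune it by cross-validation
X = hstack([X_desc, COMPONENT_WEIGHT * X_comp]).tocsr()

clf = LogisticRegression().fit(X, labels)


def predict(new_descriptions, new_components):
    """Vectorize new rows with the fitted vocabularies and the same weight."""
    X_new = hstack([
        desc_vec.transform(new_descriptions),
        COMPONENT_WEIGHT * comp_vec.transform(new_components),
    ]).tocsr()
    return clf.predict(X_new)
```

Note that the weight only matters for models that are sensitive to feature scale, such as regularized linear models or distance-based classifiers.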

axiom · Accepted Answer · 2015-10-01T05:57:29.860

You can train individual classifiers for descriptions and components, and obtain a final score using score = w1 * descriptions + w2 * components, where each term is the score produced by the corresponding classifier.

The values of w1 and w2 should be chosen by cross-validation.
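The two-classifier approach could be sketched roughly as follows. This is only an illustration under several assumptions: the toy data is invented, LogisticRegression stands in for whatever classifier you use, the weights are constrained to w2 = 1 - w1, and accuracy on out-of-fold predictions is the selection criterion.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline

# Toy data, invented purely for illustration.
descriptions = [
    "screen flickers when the laptop wakes up",
    "invoice total is wrong after discount",
    "display stays black after resume",
    "tax is missing on the invoice",
    "screen is black when the laptop starts",
    "discount is missing on the invoice total",
    "display flickers after the laptop wakes up",
    "invoice tax is wrong",
]
components = ["graphics driver", "billing module"] * 4
labels = np.array(["hardware", "finance"] * 4)


def make_model():
    # Hypothetical helper: one bag-of-words pipeline per text field.
    return make_pipeline(CountVectorizer(), LogisticRegression())


# Out-of-fold class probabilities for each field (columns follow sorted labels).
p_desc = cross_val_predict(make_model(), descriptions, labels, cv=2, method="predict_proba")
p_comp = cross_val_predict(make_model(), components, labels, cv=2, method="predict_proba")
classes = np.unique(labels)

# Grid search over w1, with w2 = 1 - w1 (an assumption of this sketch).
best_w1, best_acc = None, -1.0
for w1 in np.linspace(0.0, 1.0, 11):
    score = w1 * p_desc + (1.0 - w1) * p_comp
    acc = np.mean(classes[score.argmax(axis=1)] == labels)
    if acc > best_acc:
        best_w1, best_acc = w1, acc

# Refit both classifiers on all the data and fuse their scores at prediction time.
desc_model = make_model().fit(descriptions, labels)
comp_model = make_model().fit(components, labels)


def fused_predict(new_descriptions, new_components):
    score = (best_w1 * desc_model.predict_proba(new_descriptions)
             + (1.0 - best_w1) * comp_model.predict_proba(new_components))
    return desc_model.classes_[score.argmax(axis=1)]
```

On real data you would use more folds and a held-out test set, since the weight is itself fitted to the cross-validation predictions.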

Alternatively, you can train a single multiclass classifier by combining the training dataset.

You will now have 4 classes:

Neither 'descriptions' nor 'components'
'descriptions' but not 'components'
'components' but not 'descriptions'
Both 'descriptions' and 'components'

And you can go ahead and train as usual.

edited Oct 01 '15 at 05:57

answered Sep 30 '15 at 06:50

axiom

Is there a way to combine the two categories in the bag of words model itself, instead of training classifiers separately? – javi_p Sep 30 '15 at 08:08
