I am very new to scikit-learn and have a use case that I am trying to solve with the scikit-learn Python library.
I have a CSV file like this:
Label, userId, message, user_like, user_dislike
1, 1, "this is good message", 4, 5
0, 1, "This is bad message", 3, 4
1, 2, "this is good message", 4, 5
0, 1, "This is bad again", 6, 7
How can I train a MultinomialNB classifier on the above data? My challenge is that it contains both text data (the messages) and numeric data.
I want to predict whether the message "this is new message" posted by userId 1 is spam or not (0 or 1). In other words, I want to fill in the label for this row:
?, 1, "this is new message", 3, 4
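To make the setup concrete, here is a rough sketch of one way the text and numeric columns might be combined. It is not a confirmed solution. It assumes pandas is available, inlines the sample rows instead of reading a real file, and makes three choices of my own: the message is turned into word counts with CountVectorizer, userId is treated as a category via OneHotEncoder, and the like/dislike counts are passed through unchanged (MultinomialNB needs non-negative features, which these are).

```python
import io

import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Same rows as the sample above, inlined so the snippet runs on its own.
CSV_TEXT = '''Label,userId,message,user_like,user_dislike
1,1,"this is good message",4,5
0,1,"This is bad message",3,4
1,2,"this is good message",4,5
0,1,"This is bad again",6,7
'''
df = pd.read_csv(io.StringIO(CSV_TEXT))

feature_columns = ["userId", "message", "user_like", "user_dislike"]

# Each column group gets its own transformer; the outputs are stacked side by side.
features = ColumnTransformer([
    # A single column name (not a list) gives CountVectorizer the 1-D text it expects.
    ("text", CountVectorizer(), "message"),
    # Assumption: userId is a category, not a quantity; unseen ids encode as all zeros.
    ("user", OneHotEncoder(handle_unknown="ignore"), ["userId"]),
    # Assumption: raw like/dislike counts are used as-is.
    ("counts", "passthrough", ["user_like", "user_dislike"]),
])

model = make_pipeline(features, MultinomialNB())
model.fit(df[feature_columns], df["Label"])

# The row whose label is unknown.
new_row = pd.DataFrame({
    "userId": [1],
    "message": ["this is new message"],
    "user_like": [3],
    "user_dislike": [4],
})
prediction = model.predict(new_row)
print(prediction)
```

With only four training rows the predicted label is not meaningful; the sketch only shows how the mixed columns could be wired together.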
Thanks