Hope all of you are healthy and well. I am new to the world of NLP, so I apologize in advance if my question sounds naive. I would like to build a predictive text-mining model on some labeled text data. I have four text columns that can be used as predictors, and the labeled column is my class variable. The following should give you a glimpse of the data set:
var1  var2  var3  var4  class_var
NA    text  text  NA    0
text  text  NA    text  1
text  NA    NA    text  1
NA    NA    NA    text  0
NA    text  text  text  1
As shown, some columns have no text in a given row (I put NA there), while other columns in the same row do have text.
That being said, my question is whether I should combine all the text columns into one.
If so, what would be an appropriate method for doing that, given the missing values?
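To make concrete what I mean by combining, here is a minimal sketch in plain Python. It is only an illustration of the idea, not a recommendation: the names `rows`, `TEXT_COLUMNS` and `combine_text` are made up for this example, `None` stands in for NA, and the word "text" is a placeholder for the real content.

```python
# Toy rows mirroring the table above; None stands in for NA.
rows = [
    {"var1": None,   "var2": "text", "var3": "text", "var4": None,   "class_var": 0},
    {"var1": "text", "var2": "text", "var3": None,   "var4": "text", "class_var": 1},
    {"var1": "text", "var2": None,   "var3": None,   "var4": "text", "class_var": 1},
    {"var1": None,   "var2": None,   "var3": None,   "var4": "text", "class_var": 0},
    {"var1": None,   "var2": "text", "var3": "text", "var4": "text", "class_var": 1},
]

# Hypothetical list of the predictor columns.
TEXT_COLUMNS = ["var1", "var2", "var3", "var4"]


def combine_text(row, columns=TEXT_COLUMNS, sep=" "):
    """Join the non-missing text fields of one row into a single string."""
    return sep.join(row[col] for col in columns if row[col] is not None)


# One combined document per row, plus the matching labels.
documents = [combine_text(row) for row in rows]
labels = [row["class_var"] for row in rows]
```

In this sketch the missing values are simply skipped, so every row ends up with one string regardless of how many columns were filled in. What it loses is the information about which column each piece of text came from, which is part of what I am unsure about.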
I truly appreciate your help.
Many thanks!