
I have been practicing data analysis in Python and am now looking to do the same in R, particularly sentiment analysis. In Python, after training a Naive Bayes classifier I could save it as a pickle and load it later to reuse or continue training it, but I am unsure how to do this in R. Below is what I have followed to train and test on a dataset using the e1071 library, after cleaning the data.

# Convert raw term counts to a binary "No"/"Yes" factor
convert_count <- function(x) {
  y <- ifelse(x > 0, 1, 0)
  y <- factor(y, levels = c(0, 1), labels = c("No", "Yes"))
  y
}

trainNB <- apply(dtm.train.nb, 2, convert_count)
testNB <- apply(dtm.test.nb, 2, convert_count)

system.time( classifier <- naiveBayes(trainNB, df.train$class, laplace = 1) )

system.time( pred <- predict(classifier, newdata=testNB) )
table("Predictions"= pred,  "Actual" = df.test$class )

Can anyone explain what the equivalent of Python's pickle would be in R? Another question I have: does using tm to clean the corpus and then building a document-term matrix achieve a bag-of-words representation?

Thanks

OasisTea

1 Answer


I've not used pickling in Python, but it sounds like you're just compressing and saving an object, right?

In that case, I would use write.fst() from the fst package. It serializes data frames to disk. You'll then call read.fst() when you want to access that object again.
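As a minimal sketch of that workflow (assuming the `df.train` data frame from the question; note that fst handles data frames specifically, so for an arbitrary R object such as the fitted naiveBayes classifier, base R's saveRDS()/readRDS() is the closer equivalent of pickle):

```r
library(fst)

# Persist a data frame with fst and reload it later
write.fst(df.train, "train.fst")
df.train <- read.fst("train.fst")

# For the trained model object itself, base R serialization works
# on any R object (no extra package needed):
saveRDS(classifier, "nb_classifier.rds")
classifier <- readRDS("nb_classifier.rds")
```

The reloaded classifier can be passed to predict() exactly as before.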

skhan8