I want to use tidymodels to build a workflow for an NLP problem. I have a basic flow built in the traditional way using the naivebayes package, which basically feeds a document-term matrix (counts of terms occurring in each document) to the multinomial_naive_bayes function.
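For context, a minimal sketch of the kind of call I mean. The toy matrix, term names, and labels here are made up for illustration; a real document-term matrix would come from a tokenisation step that is not shown.

```r
library(naivebayes)

# Hypothetical toy document-term matrix:
# rows are documents, columns are term counts.
dtm <- matrix(
  c(2, 0, 1,
    0, 3, 0,
    1, 0, 2,
    0, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("good", "bad", "fine"))
)

# Hypothetical class labels, one per document.
labels <- factor(c("pos", "neg", "pos", "neg"))

# Fit the multinomial model directly on the count matrix,
# with Laplace smoothing so unseen terms do not zero out a class.
fit <- multinomial_naive_bayes(x = dtm, y = labels, laplace = 1)

# Predict a class for each row of the (same) matrix.
pred <- predict(fit, newdata = dtm, type = "class")
pred
```

Note that the function takes a numeric matrix with named columns rather than a data frame and formula, which is part of why it does not obviously map onto the usual parsnip interface.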
While there is a parsnip interface for the naivebayes package, it only seems to work with the generic naive_bayes function. According to the naivebayes documentation, the multinomial model seems to be the only one that can't be accessed through the generic function:
Please note that the Multinomial Naive Bayes is not available through the naive_bayes function.
So... my 3 questions are:
- Is there a way to access the multinomial_naive_bayes function using parsnip?
- Is there a way to use the generic naive_bayes function with data in this format (counts of features)?
- What's the best alternative? I see parsnip also supports h2o and klaR but I'm not familiar with those packages.
I'm expecting the answers to Qs 1 & 2 are "no", but worth checking. Advice on Q3 would be welcome.