I'd like to understand whether it is possible, and valid, to train an MNB (multinomial naive Bayes) model with SGD. My application is text classification. In sklearn I found that there is no MNB option for SGD training and that the default is an SVM; however, NB is a linear model, isn't it?
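So if my likelihood parameters (with Laplace smoothing) can be estimated in the usual way as

$$\hat{\theta}_{c,w} = \frac{N_{c,w} + \alpha}{N_c + \alpha |V|}$$

where $N_{c,w}$ is the count of word $w$ in documents of class $c$, $N_c$ is the total word count in class $c$, $|V|$ is the vocabulary size and $\alpha = 1$, can I instead update these parameters with SGD by minimizing a cost function?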
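To make the estimate concrete, here is a minimal sketch of the count-based (closed-form) computation, assuming the standard additive-smoothing estimate (count of the word in the class + alpha) / (total word count in the class + alpha * vocabulary size). The function name `fit_mnb` and the toy documents are made up for illustration; this is not sklearn's implementation.

```python
from collections import Counter, defaultdict


def fit_mnb(docs, labels, alpha=1.0):
    """Estimate smoothed per-class word likelihoods directly from counts."""
    vocab = sorted({w for doc in docs for w in doc})
    counts = defaultdict(Counter)  # counts[c][w] = occurrences of w in class c
    for doc, c in zip(docs, labels):
        counts[c].update(doc)

    theta = {}
    for c, word_counts in counts.items():
        total = sum(word_counts.values())   # total word count in class c
        denom = total + alpha * len(vocab)  # smoothed normaliser
        theta[c] = {w: (word_counts[w] + alpha) / denom for w in vocab}
    return theta


# Hypothetical toy corpus: tokenised documents and their class labels.
docs = [["cheap", "pills", "cheap"], ["meeting", "today"], ["cheap", "meeting"]]
labels = ["spam", "ham", "ham"]
theta = fit_mnb(docs, labels)
```

Note that the sketch makes a single pass over the data and contains no loss function, gradient or learning rate; the likelihoods for each class sum to 1 over the vocabulary by construction.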
Please let me know if SGD is irrelevant here. Thanks in advance.
UPDATE: So I got the answer, and I hope I understood it correctly: MNB's parameters are updated from the word occurrences in the given input text (like tf-idf). But I still don't clearly understand why we can't use SGD for MNB training. An explicit explanation or a mathematical interpretation would help. Thanks.
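As I currently understand it, "updating by word occurrence" would look roughly like the sketch below: each new labelled document only increments counts, and the smoothed likelihood is recomputed from those counts on demand. The class `CountingMNB` and its methods are hypothetical names for illustration, and the smoothing is assumed to be additive with `alpha = 1`.

```python
from collections import Counter, defaultdict


class CountingMNB:
    """Keeps per-class word counts; likelihoods are derived from them."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # word_counts[label][word]
        self.vocab = set()

    def update(self, tokens, label):
        # "Training" on one document: add its word occurrences to the counts.
        self.word_counts[label].update(tokens)
        self.vocab.update(tokens)

    def likelihood(self, word, label):
        # Smoothed estimate of P(word | label) from the current counts.
        counts = self.word_counts[label]
        total = sum(counts.values())
        return (counts[word] + self.alpha) / (total + self.alpha * len(self.vocab))


model = CountingMNB()
model.update(["cheap", "pills"], "spam")
before = model.likelihood("cheap", "spam")
model.update(["cheap", "cheap"], "spam")
after = model.likelihood("cheap", "spam")
```

In this sketch the estimate for "cheap" rises after the second document simply because its count went up; nothing is being minimized step by step.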