For example, if I want to predict the next word in a sentence, I can use a bigram approach and compute the probability of each word occurring given the previous word in the corpus.
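To make the bigram idea concrete, here is a minimal sketch in Python. The toy corpus and the function names (`train_bigram`, `next_word_probs`) are my own assumptions for illustration. It uses plain maximum-likelihood counts with no smoothing.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each previous word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Estimate P(next | prev) as count(prev, next) / count(prev, *)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # previous word never seen with a follower
    return {word: c / total for word, c in followers.items()}

# Toy corpus, assumed purely for illustration.
corpus = "the cat sat on the mat the cat ran".split()
bigram_counts = train_bigram(corpus)
bigram_probs = next_word_probs(bigram_counts, "the")
bigram_best = max(bigram_probs, key=bigram_probs.get)
print(bigram_probs, bigram_best)
```

Once the counts are stored, prediction is a dictionary lookup followed by a normalisation.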
Alternatively, I could use a neural net to predict the next word. The training data consists of word pairs, where each pair contains the current word and the next word in the corpus. During training, the input is a vectorized representation of the current word, and the target output is a vectorized representation of the next word.
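As a rough sketch of the neural setup, assume the simplest possible network: a one-hot input, a single weight matrix, and a softmax output, trained with plain stochastic gradient descent on cross-entropy loss. The architecture, hyperparameters, toy corpus, and function names are all assumptions for illustration. A real model would typically add embeddings and hidden layers.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_pairs(tokens, epochs=300, lr=0.1, seed=0):
    """Fit a one-layer softmax model on (current word, next word) pairs."""
    vocab = sorted(set(tokens))
    index = {w: i for i, w in enumerate(vocab)}
    pairs = [(index[a], index[b]) for a, b in zip(tokens, tokens[1:])]
    rng = random.Random(seed)
    size = len(vocab)
    # weights[i][k] is the score of word k following word i.
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(size)] for _ in range(size)]
    for _ in range(epochs):
        for cur, nxt in pairs:
            # A one-hot input times the weight matrix just selects row `cur`.
            probs = softmax(weights[cur])
            for k in range(size):
                target = 1.0 if k == nxt else 0.0
                # Gradient of cross-entropy with respect to the logit.
                weights[cur][k] -= lr * (probs[k] - target)
    return vocab, index, weights

def predict_next(vocab, index, weights, word):
    """Return the most probable next word and the full distribution."""
    probs = softmax(weights[index[word]])
    best = max(range(len(vocab)), key=probs.__getitem__)
    return vocab[best], probs

# Toy corpus, assumed purely for illustration.
corpus = "the cat sat on the mat the cat ran".split()
vocab, index, weights = train_pairs(corpus)
nn_word, nn_probs = predict_next(vocab, index, weights, "the")
print(nn_word)
```

With no hidden layer and one-hot inputs, this model can only learn a smoothed version of the same conditional distribution that the bigram counts give directly. Any advantage of a neural net would have to come from shared representations, such as embeddings, or from a longer context.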
I expect the neural net to perform better, but I'm not sure why.
When is it better to use a neural net rather than a classical approach, in this case an n-gram model? Apologies if this question is ambiguous.
Maybe the answer is trial and error: train both models, then check which one runs faster and makes better predictions?
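A trial-and-error comparison could look roughly like the sketch below: score any predictor on held-out text by next-word accuracy and wall-clock time. The `evaluate` helper, the train/test split, and the bigram baseline are hypothetical names and data I made up for illustration. In practice, perplexity on a much larger held-out set would be a more informative metric.

```python
import time
from collections import Counter, defaultdict

def evaluate(predict, test_tokens):
    """Return (accuracy, seconds) for a predict(prev_word) -> next_word function."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    start = time.perf_counter()
    correct = sum(1 for prev, nxt in pairs if predict(prev) == nxt)
    elapsed = time.perf_counter() - start
    return correct / len(pairs), elapsed

# Baseline: a bigram predictor built from counts (toy data, assumed).
train_tokens = "the cat sat on the mat the cat sat".split()
counts = defaultdict(Counter)
for prev, nxt in zip(train_tokens, train_tokens[1:]):
    counts[prev][nxt] += 1

def bigram_predict(prev):
    """Most frequent follower of `prev`, or None if `prev` was never seen."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None

test_tokens = "the cat sat on the mat".split()
accuracy, seconds = evaluate(bigram_predict, test_tokens)
print(f"accuracy={accuracy:.2f} time={seconds:.6f}s")
```

Any other model can be compared on the same footing by wrapping it in a function with the same `predict(prev_word)` signature.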
My guess is that the neural net will be faster, since making a prediction is just a vector-matrix multiplication, whereas predicting with an n-gram model requires a probability calculation. I'm not sure this reasoning is correct, though.