I am doing text classification with scikit learn's logistic regression function (http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html). I am using grid search in order to choose a value for the C parameter. Do I need to do the same for max_iter parameter? why?
Both C and max_iter parameters have default values in Sklearn, which means they need to be tuned. But, from what I understand, early stopping and l1/l2 regularization are two desperate methods for avoiding overfitting and performing one of them is enough. Am I incorrect in assuming that tunning the value of max_iter is equivalent to early stopping?
To summarize, here are my main questions:
1- Does max_iter need tuning? why? (the documentation says it is only useful for certain solvers)
2- Is tuning the max_iter equivalent to early stopping?
3- Should we perform early stopping and L1/L2 regularization at the same time?