TfidfVectorizer provides an easy way to encode and transform texts into vectors.
My question is: how should I choose proper values for parameters such as min_df, max_features, smooth_idf, and sublinear_tf?
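For reference, here is a minimal sketch of the kind of setup I am asking about. The toy corpus and the specific parameter values are made up purely for illustration; they are not values I am claiming to be good choices.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, only for illustration.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
    "stock markets fell sharply today",
]

vectorizer = TfidfVectorizer(
    min_df=2,           # ignore terms that appear in fewer than 2 documents
    max_features=5,     # keep at most 5 terms, ranked by corpus term frequency
    smooth_idf=True,    # add 1 to document frequencies (avoids division by zero)
    sublinear_tf=True,  # replace tf with 1 + log(tf)
)
X = vectorizer.fit_transform(docs)

# Only terms that survive the min_df filter remain in the vocabulary.
vocab = sorted(vectorizer.vocabulary_)
print(vocab)
print(X.shape)
```

With such a small corpus, min_df=2 already removes most terms, so two of the documents end up as all-zero vectors. That is exactly the kind of side effect that makes me unsure how to pick these values.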
Update:
Maybe I should have put more detail into the question:
What if I am doing unsupervised clustering on a bunch of texts? I don't have any labels for the texts, and I don't know how many clusters there might be (which is actually what I am trying to figure out).
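To make the unsupervised setting concrete, here is a rough sketch of what I have in mind. The toy corpus, the use of KMeans, the candidate range of cluster counts, and scoring with the silhouette coefficient are all my own assumptions for illustration; I am not sure this is the right way to evaluate the vectorizer parameters, which is part of the question.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical unlabeled corpus, only for illustration.
docs = [
    "cats purr and cats sleep",
    "my cat chases the mouse",
    "the kitten and the cat play",
    "stocks fell as markets dropped",
    "the market rallied and stocks rose",
    "investors sold stocks today",
]

# Vectorizer settings here are placeholders, not recommendations.
X = TfidfVectorizer(sublinear_tf=True).fit_transform(docs)

# Try several cluster counts and score each one without using any labels.
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

The same loop could in principle be wrapped in an outer loop over min_df, max_features, and so on, but I don't know whether comparing silhouette scores across different vectorizer settings is meaningful.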