I have a Pandas DataFrame. I am trying to draw a sample DataFrame from it
with replacement, and I also want the sample to be stratified.
This lets me sample with replacement:
df_test = df.sample(n=100, replace=True, random_state=42, axis=0)
However, I am not sure how to stratify as well. Can I use the weights
parameter, and if so, how? The columns I want to stratify on are strings.
This lets me stratify:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=.50, stratify=Y, random_state=42)
However, train_test_split has no option to sample with replacement.
How can I both stratify and sample with replacement?
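
For illustration, one possible approach is sketched below. It assumes pandas 1.1 or later, where groupby objects have a sample method, and it takes "stratify" to mean that each group keeps roughly its original share of the rows. The helper name stratified_sample_with_replacement, the column names, and the toy data are all made up for this example. Because each group's size is rounded separately, the total number of rows can differ slightly from n.

```python
import pandas as pd


def stratified_sample_with_replacement(df, strata_cols, n, random_state=None):
    """Draw roughly n rows with replacement, keeping each stratum's share.

    Hypothetical helper: each group defined by strata_cols contributes
    round(frac * group_size) rows, so the total may be off from n by a
    few rows because of rounding.
    """
    frac = n / len(df)
    return df.groupby(strata_cols).sample(
        frac=frac, replace=True, random_state=random_state
    )


# Toy data (made up): one string column to stratify on, 60% "a" and 40% "b".
df = pd.DataFrame({
    "label": ["a"] * 60 + ["b"] * 40,
    "value": range(100),
})

df_test = stratified_sample_with_replacement(df, ["label"], n=50, random_state=42)
counts = df_test["label"].value_counts()
```

With this toy data the sampling fraction is 0.5, so the sample keeps the 60/40 split of the label column. Passing a list of several string columns as strata_cols stratifies on their combinations. The weights parameter of sample changes the probability of drawing individual rows, so on its own it does not fix the number of rows taken from each group.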