My question is closely related to "How to create a train_test_split based on a conditional in python",
but I am looking for a better solution.
I have a pandas DataFrame on which I would typically use the train_test_split
function:
X_train, X_test, y_train, y_test = train_test_split(data[xvars], data[yvar], train_size=0.98, random_state=42)
However, I would like to split on a column of my DataFrame called week:
rows where week < 51 should form the training set, and rows where week >= 51 the test set. How can I achieve this efficiently?
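To make the intended split concrete, here is a minimal sketch using a boolean mask on the week column. The toy data and the column names x1, x2 and y are assumptions for illustration only; in my real data, xvars and yvar are the same ones I pass to train_test_split.

```python
import pandas as pd

# Hypothetical toy data; only the `week` column name comes from my real DataFrame.
data = pd.DataFrame({
    "week": [49, 50, 51, 52, 50, 51],
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [10, 20, 30, 40, 50, 60],
    "y": [0, 1, 0, 1, 1, 0],
})
xvars = ["x1", "x2"]  # assumed feature columns
yvar = "y"            # assumed target column

# Boolean mask: True for rows that belong to the training set.
train_mask = data["week"] < 51

# Select rows with the mask and columns by name in a single .loc call.
X_train, y_train = data.loc[train_mask, xvars], data.loc[train_mask, yvar]
# The inverted mask picks every remaining row (week >= 51) for the test set.
X_test, y_test = data.loc[~train_mask, xvars], data.loc[~train_mask, yvar]
```

Note that with the inverted mask, any row whose week is missing (NaN) would land in the test set, because the comparison NaN < 51 evaluates to False.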
Thanks.