Split dataset into training and test by month

Question

I could not find an answer to this anywhere. I have three months of data and would like to use the first two months ('Jan-19', 'Feb-19') as the training set and the last month ('Mar-19') as the test set.

Previously I have done random sampling with simple code like this:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=109)

Before that, I assigned y as the label and X as the columns used for prediction. I'm not sure how to assign the training and test sets to the months I want.

Thank you

score 0 · Accepted Answer · answered Aug 02 '19 at 19:47

If your data is in a pandas DataFrame and X has a 'month' column, you can subset it with a boolean mask like this:

# True for rows that belong to the held-out month
is_test = X['month'] == 'Mar-19'

X_train, y_train = X[~is_test], y[~is_test]
X_test, y_test = X[is_test], y[is_test]
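For a fuller picture, here is a minimal, self-contained sketch of the same month-based split on toy data. The column names ('month', 'feature', 'label') and the helper `split_by_month` are hypothetical, made up for illustration; the sketch also assumes you want the month column dropped from the features before fitting a model.

```python
import pandas as pd

# Toy data; the column names 'month', 'feature', and 'label' are assumptions.
df = pd.DataFrame({
    "month":   ["Jan-19", "Jan-19", "Feb-19", "Feb-19", "Mar-19", "Mar-19"],
    "feature": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "label":   [0, 1, 0, 1, 0, 1],
})


def split_by_month(frame, test_months, month_col="month", label_col="label"):
    """Hypothetical helper: rows whose month is in test_months form the test set."""
    is_test = frame[month_col].isin(test_months)
    # Drop the month and label columns so only predictors remain in X.
    X = frame.drop(columns=[month_col, label_col])
    y = frame[label_col]
    return X[~is_test], X[is_test], y[~is_test], y[is_test]


X_train, X_test, y_train, y_test = split_by_month(df, ["Mar-19"])
```

Using `isin` rather than `==` lets you hold out more than one month without changing the helper.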

josephjscheidt

Happy to help! If this answer or any other one here solved your issue, please mark it as accepted. Thanks! – josephjscheidt Aug 04 '19 at 22:38

score 0 · Answer 2 · answered Aug 02 '19 at 19:49

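If your DataFrame is indexed by timestamps, you can try slicing it by date range and see if that helps.

# Assumes df has a sorted DatetimeIndex. Label slices include both endpoints,
# so the test set starts strictly after the boundary to avoid sharing a row.
dataset_train = df.loc['2004-02-12 11:02:39':'2004-02-13 23:52:39']
dataset_test = df.loc[df.index > '2004-02-13 23:52:39']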
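Applied to the month-based split in the question, date slicing might look like the sketch below. It assumes the frame has a sorted DatetimeIndex (the toy index and the 'value' column are made up for illustration) and relies on pandas partial-string indexing, where a label such as "2019-02" covers the whole month.

```python
import pandas as pd

# Toy frame with one row every 10 days from January to March 2019.
# The 'value' column is an assumption for illustration.
idx = pd.date_range("2019-01-01", "2019-03-31", freq="10D")
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Partial-string slicing: "2019-02" as an end label includes all of February.
train = df.loc["2019-01":"2019-02"]
# Everything from the start of March onward becomes the test set.
test = df.loc["2019-03":]
```

Because the two slices end and begin on different months, no row can fall into both sets.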

Vishwas
