I have a pandas DataFrame with two columns: "review" (text) and "sentiment" (1/0). I split it into training and test sets like this:
X_train = df.loc[0:25000, 'review'].values
y_train = df.loc[0:25000, 'sentiment'].values
X_test = df.loc[25000:, 'review'].values
y_test = df.loc[25000:, 'sentiment'].values
But after converting to NumPy arrays with the .values attribute, I get arrays of the following shapes:
print(df.shape) # (50000, 2)
print(X_train.shape) # (25001,)
print(y_train.shape) # (25001,)
print(X_test.shape) # (25000,)
print(y_test.shape) # (25000,)
So, as you can see, the training arrays have 25001 rows instead of the 25000 I expected, as if .values had added one extra row. This is really strange, and I can't find the error.
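To narrow it down, here is a minimal sketch on a small made-up frame. The name df_small and its contents are hypothetical, not my real data, and it assumes the default integer index. It compares the shape before and after .values, and adds a position-based .iloc slice for comparison:

```python
import pandas as pd

# Hypothetical toy frame with the default integer index 0..9
df_small = pd.DataFrame({
    "review": [f"text {i}" for i in range(10)],
    "sentiment": [i % 2 for i in range(10)],
})

# Same slicing pattern as df.loc[0:25000, 'review'] and df.loc[25000:, 'review']
train_slice = df_small.loc[0:5, "review"]
test_slice = df_small.loc[5:, "review"]

print(train_slice.shape)         # (6,)  shape before calling .values
print(train_slice.values.shape)  # (6,)  shape after calling .values
print(test_slice.values.shape)   # (5,)

# Position-based slice for comparison
iloc_slice = df_small.iloc[0:5]["review"]
print(iloc_slice.values.shape)   # (5,)
```

In this toy case the Series already has six elements before .values is called, and the row labelled 5 appears in both slices. Label-based .loc slices include both endpoints, while .iloc excludes the stop position, so if the same holds for my full frame, the extra row would come from the .loc[0:25000] slice and not from the conversion.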