I have a classification problem with two data sets of 200 and 50 points respectively. Of these, 40 data points are held out as the test set. I have chosen kNN as the classifier, using five nearest neighbors.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split

n_neighbors = 5
std = 5
h = .1  # step size in the mesh
# generate data
X0, y0 = make_blobs(n_samples=200, centers=2, n_features=2, cluster_std=std, random_state=42)
X1, y1 = make_blobs(n_samples=50, centers=2, n_features=2, cluster_std=std, random_state=42)
# split into training and test set
X0_train, X0_test, y0_train, y0_test = train_test_split(X0, y0, test_size=0.2, random_state=42)
X1_train, X1_test, y1_train, y1_test = train_test_split(X1, y1, test_size=0.2, random_state=42)
I have to enrich the data so that the training data for class 1 is copied 16 times, giving class 1 the same training size as class 0.
How can I copy the training data sixteen times? I do not understand what exactly copying means here.
Could anyone share a few lines of code to illustrate this?
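For reference, one possible reading of "copying" is plain oversampling by duplication: stacking the class 1 training rows on top of themselves a fixed number of times. The following is only a minimal sketch under that assumption. The random arrays are hypothetical stand-ins with the same shape as X1_train and y1_train from the split above (40 rows, 2 features), and n_copies is a name introduced here for illustration.

```python
import numpy as np

# Hypothetical stand-ins for X1_train / y1_train from the split above
# (40 rows, 2 features); swap in the real arrays when running this.
rng = np.random.default_rng(42)
X1_train = rng.normal(size=(40, 2))
y1_train = np.ones(40, dtype=int)

# Assumption: "copied 16 times" means 16 stacked copies of the training rows.
n_copies = 16

# np.tile repeats the whole array n_copies times along the row axis,
# leaving the feature axis untouched.
X1_train_rep = np.tile(X1_train, (n_copies, 1))
y1_train_rep = np.tile(y1_train, n_copies)

print(X1_train_rep.shape, y1_train_rep.shape)
```

Note that with test_size=0.2 the split above leaves 160 training points from the 200-point set and 40 from the 50-point set, so under this reading n_copies = 4 would make the two training sets the same size. The factor 16 is kept here only because it is the number given in the task.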