I have a single DataFrame on which I need to train the same model (LogisticRegression) multiple times.
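_list_scores = []
for i in range(df.shape[0]):
    # Train on the first i + 1 rows; every column except the last is a feature
    X_train = df.iloc[0:i+1, :-1]
    y_train = df.iloc[0:i+1, -1]  # label column as a 1-D Series
    model.fit(X_train, y_train)
    _list_scores.append(model.score(X_test, y_test))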
The logic is that the model is trained on a growing slice of the DataFrame, starting with the first row only and ending with all rows.
Loop 1 = train with 1 row and measure the score
Loop 2 = train with 2 rows and measure the score
Loop 3 = train with 3 rows and measure the score
...
Loop n = train with "n" rows and measure the score
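To make the loop easy to reproduce, here is a minimal self-contained sketch of the same logic. The data is synthetic (`make_classification`), and the names `score_prefix`, `df_all`, and `scores` are placeholders for illustration, not part of my real code. Note that LogisticRegression cannot fit when the training labels contain only one class, so the sketch records NaN for those prefixes instead of fitting.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption): feature columns first, label column last.
X, y = make_classification(n_samples=120, n_features=5, random_state=0)
df_all = pd.DataFrame(X, columns=[f"x{j}" for j in range(X.shape[1])])
df_all["target"] = y

# First 90 rows are the training pool, the remaining 30 are the fixed test set.
df = df_all.iloc[:90]
X_test = df_all.iloc[90:, :-1]
y_test = df_all.iloc[90:, -1]


def score_prefix(n_rows):
    """Fit on the first n_rows rows of df and return accuracy on the test set."""
    X_train = df.iloc[:n_rows, :-1]
    y_train = df.iloc[:n_rows, -1]
    if y_train.nunique() < 2:
        # A single-class prefix cannot be fitted, so record a missing score.
        return np.nan
    model = LogisticRegression(max_iter=1000)
    return model.fit(X_train, y_train).score(X_test, y_test)


# Loop 1 uses 1 row, loop 2 uses 2 rows, ..., loop n uses all n rows.
scores = [score_prefix(n) for n in range(1, df.shape[0] + 1)]
```

Each call builds a fresh model, so no iteration depends on the result of a previous one.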
I tried concurrent.futures and dask.delayed, but for some reason my plain loop is faster than either of them...
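Since each iteration only depends on its own slice of rows, the iterations are independent and could in principle be sent to separate workers. Below is a rough sketch of that structure, assuming joblib (which is not one of the libraries I tried) and synthetic data. The names `score_prefix` and `parallel_scores` are hypothetical, and this is an illustration of the idea rather than a tuned solution.

```python
import numpy as np
from joblib import Parallel, delayed
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def score_prefix(n_rows, X_pool, y_pool, X_test, y_test):
    """Fit on the first n_rows rows of the pool and return test accuracy."""
    y_train = y_pool[:n_rows]
    if np.unique(y_train).size < 2:
        # A single-class prefix cannot be fitted, so record a missing score.
        return np.nan
    model = LogisticRegression(max_iter=1000)
    return model.fit(X_pool[:n_rows], y_train).score(X_test, y_test)


def parallel_scores(n_jobs=2):
    """Score every prefix length, dispatching the fits to n_jobs workers."""
    # Synthetic stand-in data (assumption): 90 training rows, 30 test rows.
    X, y = make_classification(n_samples=120, n_features=5, random_state=0)
    X_pool, y_pool = X[:90], y[:90]
    X_test, y_test = X[90:], y[90:]
    # Parallel returns results in the same order as the tasks were submitted.
    return Parallel(n_jobs=n_jobs)(
        delayed(score_prefix)(n, X_pool, y_pool, X_test, y_test)
        for n in range(1, len(y_pool) + 1)
    )


if __name__ == "__main__":
    scores = parallel_scores(n_jobs=2)
    print(len(scores), scores[-1])
```

With fits this small, the cost of starting workers and shipping the data to them may well exceed the cost of the fit itself, which might be related to the slowdown I am seeing.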
Could someone please help me: how can I parallelize this?