I am training 2000 logistic regression classifiers using Keras. The inputs for each classifier are:

for training: vectors 8250×50, labels 8250
for validation: vectors 2750×50, labels 2750
for testing: vectors 3000×50, labels 3000

For every classifier I save the predictions and the scores (kappa score, accuracy, etc.).
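For reference, data of the same shapes can be generated synthetically like this (the variable names match the code below; the random values are just stand-ins for my real data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Feature matrices matching the shapes described above
mdsTrain = rng.standard_normal((8250, 50))   # training vectors
mdsVal   = rng.standard_normal((2750, 50))   # validation vectors
mdsTest  = rng.standard_normal((3000, 50))   # test vectors

# One binary label column per classifier (2000 classifiers)
pmiweights_Train = rng.integers(0, 2, size=(8250, 2000))
pmiweights_val   = rng.integers(0, 2, size=(2750, 2000))
pmiweights_Test  = rng.integers(0, 2, size=(3000, 2000))
```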
The code is very slow: it needs three hours to train the first 600 classifiers. I used the following code:
    from keras.models import Sequential
    from keras.layers import Dense
    from sklearn.metrics import cohen_kappa_score

    def lg_keras2(input_dim, output_dim, ep, X, y, Xv, yv, XT, yT, class_weight1):
        model = Sequential()
        model.add(Dense(output_dim, input_dim=input_dim, activation='sigmoid'))
        # model.summary()
        model.compile(optimizer='adam', loss='binary_crossentropy',
                      metrics=['accuracy', mcor, recall, f1])  # mcor, recall, f1 are my custom metric functions
        result = model.fit(X, y, epochs=ep, verbose=0, batch_size=128,
                           class_weight={0: class_weight1[0], 1: class_weight1[1]},
                           validation_data=(Xv, yv))
        test = model.evaluate(XT, yT, verbose=0)
        kappa_Score = cohen_kappa_score(yT, model.predict_classes(XT))
        return model, result, test, kappa_Score
After that, I train the 2000 classifiers as follows:
    from sklearn.utils import class_weight
    from sklearn.metrics import cohen_kappa_score

    directionsLGR = []
    scores = []
    predictions = []
    kappa_Score_all = []

    for i in range(2000):
        Class_weight = class_weight.compute_class_weight('balanced',
                                                         np.unique(pmiweights_Train[:, i]),
                                                         pmiweights_Train[:, i])
        # start_time = time.time()
        model, results, test, kappa = lg_keras2(50, 1, 30,
                                                mdsTrain, pmiweights_Train[:, i],
                                                mdsVal, pmiweights_val[:, i],
                                                mdsTest, pmiweights_Test[:, i],
                                                Class_weight)
        # print("--- %s seconds ---" % (time.time() - start_time))
        weights = np.array(model.get_weights())[0].flatten()
        directionsLGR.append(weights)
        predictions.append(model.predict_classes(mds))
        kappa_Score_all.append(kappa)
        scores.append(test)
Is there anything I can do to speed up this process? I would appreciate any suggestions.