
I've just built a brand new, powerful desktop PC in order to speed up my scikit-learn computations (specs here).

I run my code in a Jupyter Notebook, and I noticed that if I run the same computation on my old dying laptop and on my super-PC, the time difference is often small, although on some very demanding cells it can vary by roughly a factor of two between the two computers… But my new PC is supposed to be at least 5 times more powerful than my old laptop!

Demanding code example:

# Imports added so the cell runs on its own; X_train and y_train are assumed to be defined earlier in the notebook.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import f1_score

y_train_large = (y_train >= 7)
y_train_odd = (y_train % 2 == 1)
y_multilabel = np.c_[y_train_large, y_train_odd]
knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_multilabel)
y_train_knn_pred = cross_val_predict(knn_clf, X_train, y_multilabel, cv=3)
f1_score(y_multilabel, y_train_knn_pred, average="macro")

Also, when I check the CPU usage during a classifier training, for instance, it's very low on both computers (around 5% on the new one and 15–20% on the old one).
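(For what it's worth, here is roughly how such a CPU reading can be taken from Python itself rather than from Task Manager. This is only a sketch: psutil has to be installed separately, and run_and_measure_cpu is just a helper name used for illustration.)

import threading
import psutil

def run_and_measure_cpu(fn, interval=0.5):
    """Run fn() and return (its result, average system CPU % while it ran)."""
    samples = []
    done = threading.Event()

    def sampler():
        while not done.is_set():
            # cpu_percent(interval=...) blocks for `interval` seconds and
            # returns the system-wide utilisation over that window.
            samples.append(psutil.cpu_percent(interval=interval))

    t = threading.Thread(target=sampler)
    t.start()
    try:
        result = fn()
    finally:
        done.set()
        t.join()
    return result, sum(samples) / max(len(samples), 1)

# e.g. _, avg_cpu = run_and_measure_cpu(lambda: knn_clf.fit(X_train, y_multilabel))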

I realise that it may be a big noob question, but why is that? I read here that Jupyter Notebooks run on the host machine, not mine. How can I use my own hardware instead? I've probably been searching the wrong way, but I cannot find much information on the subject. What should I search for?

Thanks!

Time report for the code above with the small change of setting n_jobs=4 for cross_val_predict():
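For reference, the only line that changed compared to the code above is the cross_val_predict() call:

y_train_knn_pred = cross_val_predict(knn_clf, X_train, y_multilabel, cv=3, n_jobs=4)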

Computing time on the AMD Ryzen 9 3900X (12 cores, 32 GB RAM): approx. 12 min 45 s, average CPU usage 15%

Computing time on the Intel Core i7-4750HQ @ 2.00 GHz (16 GB RAM): approx. 19 min 50 s, average CPU usage 62%

VideoPac
  • You need to share your code. Is your code using multiple threads? How are you launching your Jupyter notebooks? – FlyingTeller Mar 02 '20 at 14:22
  • *Public* Jupyter notebook may be running on a host machine. However, if you are creating your own Jupyter notebooks and running them in the usual way, then you probably are using your own CPUs, especially since, as you put it, "some very demanding cells in can vary from simple to double between the two computers." – jjramsey Mar 02 '20 at 14:28
  • Edited: Try the same code with the following small change: `y_train_knn_pred = cross_val_predict(knn_clf, X_train, y_multilabel, cv=3, n_jobs=4)`. This will make sure 4 cores of your machine are used (not just 1). The actual answer is not as straightforward as that, but I'll write a detailed answer if this gives some improvement. I'm also not sure if a KNN classifier is the right model to test this. – D_Serg Mar 02 '20 at 14:42
  • On my old laptop I run the notebook with the command jupyter notebook in the terminal. For some reason this command is not working on the new one, so I launch the Anaconda Jupyter Notebook instead. If I'm already using my own CPUs, how to explain that I see only a very low CPU usage? How to explain that the AMD 3900X, which is supposed to be maybe 10 times more powerful than my laptop CPU, is barely faster on most tasks? – VideoPac Mar 02 '20 at 14:43
  • @D_Serg, I'm on it. It may take a while. – VideoPac Mar 02 '20 at 15:31
  • @D_Serg, report posted above ;) – VideoPac Mar 02 '20 at 16:38
  • @VideoPac you want even more speedup? :) Be careful with n_jobs btw, it may bloat up the memory if you're too aggressive with it. I'd monitor that too. – D_Serg Mar 02 '20 at 17:42
  • @D_Serg, yeah that's better of course, thanks! But I still feel ignorant… I want to understand exactly how it works: whether or not some of the computing occurs on the host machine like the other thread says, how far exactly I can go with n_jobs, whether it is the only parameter, and yes, I do want more speed! :) – VideoPac Mar 03 '20 at 02:18
  • Also, why only 5% CPU usage? – VideoPac Mar 03 '20 at 02:30
  • @VideoPac, I'm not a distributed computing expert so I can't give you great info about it. At a very high level, `n_jobs` means that if there are things that can be done in parallel (for example, the folds in cross-validation are technically independent of each other), two or more cores can work on them independently, halving the time needed in theory. Then the question is how many cores you have and how fast each core is. But keep in mind that some "overhead" cost is also associated with "managing" what each core does, so a higher `n_jobs` might turn out to be less efficient. – D_Serg Mar 03 '20 at 04:48
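To make that last comment a bit more concrete: scikit-learn's n_jobs is backed by joblib, and the idea of independent folds running side by side can be sketched with a toy example (the fold_work function below is made up purely for illustration and has nothing to do with the real cross-validation code):

import time
from joblib import Parallel, delayed

def fold_work(i):
    # Stand-in for one cross-validation fold: an independent chunk of work.
    time.sleep(1)
    return i

# With n_jobs=3 the three "folds" run at the same time, so the wall-clock time
# is roughly 1 s instead of 3 s, minus the overhead of spawning and managing
# the workers that D_Serg mentions.
print(Parallel(n_jobs=3)(delayed(fold_work)(i) for i in range(3)))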

1 Answer


OK, so for this particular piece of code, increasing the n_jobs parameter of cross_val_predict() to n_jobs=4 gives a good improvement, but it's still unclear to me:

  • How to proceed on other machine learning tasks?
  • Are there other parameters to tweak in order to get even better results than this?
  • How far can we go with n_jobs? How do we evaluate the best n_jobs for a given task, and how do we know when we've gone too far and the CPU is at risk? (A rough timing sketch follows below.)
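On that last point, a rough way to probe it is to time the same cross_val_predict() call for a few n_jobs values and see where the speedup levels off; the candidate values and the timing loop below are just my own guesses, not anything scikit-learn prescribes:

import time
from sklearn.model_selection import cross_val_predict

# Assumes knn_clf, X_train and y_multilabel are defined as in the question.
for n_jobs in (1, 2, 4, -1):  # -1 means "use all available cores"
    start = time.perf_counter()
    cross_val_predict(knn_clf, X_train, y_multilabel, cv=3, n_jobs=n_jobs)
    print(f"n_jobs={n_jobs}: {time.perf_counter() - start:.1f} s")

Note that with cv=3 there are only three folds to run in parallel, so values above 3 mainly matter if the estimator itself can also parallelise (KNeighborsClassifier has its own n_jobs parameter), and, as D_Serg warned, memory usage should be watched alongside the timings.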

Any expert on those matters is still welcome to answer :)

VideoPac