Your best choice is to tune the arguments.
`n_jobs=4`
This makes the computer run four train-test cycles simultaneously. Each Python job runs in a separate process, so the full dataset is copied for every worker. Try reducing `n_jobs` to 2 or 1 to save memory: `n_jobs=4` uses roughly four times the memory that `n_jobs=1` uses.
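For illustration, here's a minimal sketch of how the `n_jobs` setting enters a `cross_val_score` call; the synthetic dataset and the names `X`, `y`, and `clf` are assumptions standing in for your real code:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for your data (assumption -- substitute your own X and y).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# n_jobs=2: two folds run in parallel, so only two worker processes hold
# a copy of the data at once instead of four.
scores = cross_val_score(clf, X, y, cv=20, n_jobs=2)
print(scores.mean())
```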
`cv=20`
This splits the data into 20 folds and runs 20 train-test iterations, so each training set is 19/20 of the original data. You can quite safely reduce it to 10, although the accuracy estimate may become slightly noisier. It won't save much memory, but it does shorten the runtime.
`n_estimators=100`
Reducing this saves little memory, but it makes the algorithm run faster because the random forest contains fewer trees.
To sum up, I'd recommend reducing `n_jobs` to 2 to save memory (roughly a 2-fold increase in runtime). To compensate for the runtime, I'd suggest changing `cv` to 10 (roughly a 2-fold saving in runtime). If that doesn't help, change `n_jobs` to 1 and also reduce the number of estimators to 50 (about twice as fast again).
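Putting that together, a possible adjusted call could look like the sketch below (again with a synthetic dataset and assumed names standing in for your real data):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for your data (assumption -- substitute your own X and y).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# First attempt: halve the parallelism and the number of folds.
scores = cross_val_score(clf, X, y, cv=10, n_jobs=2)
print(scores.mean())

# If memory is still tight, fall back to:
#   clf = RandomForestClassifier(n_estimators=50, random_state=0)
#   scores = cross_val_score(clf, X, y, cv=10, n_jobs=1)
```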