I am trying to run scikit-learn's random forest algorithm on the MNIST handwritten digits dataset. During training, the program fails with a MemoryError. What should I do to fix this?
Hardware: Intel Core 2 Duo with 4 GB RAM
The shape of the dataset is (60000, 784). The complete error, as shown on the Linux terminal, is as follows:
> File "./reducer.py", line 53, in <module>
>   main()
> File "./reducer.py", line 38, in main
>   clf = clf.fit(data,labels) #training the algorithm
> File "/usr/lib/pymodules/python2.7/sklearn/ensemble/forest.py", line 202, in fit
>   for i in xrange(n_jobs))
> File "/usr/lib/pymodules/python2.7/joblib/parallel.py", line 409, in __call__
>   self.dispatch(function, args, kwargs)
> File "/usr/lib/pymodules/python2.7/joblib/parallel.py", line 295, in dispatch
>   job = ImmediateApply(func, args, kwargs)
> File "/usr/lib/pymodules/python2.7/joblib/parallel.py", line 101, in __init__
>   self.results = func(*args, **kwargs)
> File "/usr/lib/pymodules/python2.7/sklearn/ensemble/forest.py", line 73, in _parallel_build_trees
>   sample_mask=sample_mask, X_argsorted=X_argsorted)
> File "/usr/lib/pymodules/python2.7/sklearn/tree/tree.py", line 476, in fit
>   X_argsorted=X_argsorted)
> File "/usr/lib/pymodules/python2.7/sklearn/tree/tree.py", line 357, in _build_tree
>   np.argsort(X.T, axis=1).astype(np.int32).T)
> File "/usr/lib/python2.7/dist-packages/numpy/core/fromnumeric.py", line 680, in argsort
>   return argsort(axis, kind, order)
> MemoryError
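
For a sense of scale, the arrays involved in the failing `np.argsort(X.T, axis=1).astype(np.int32).T` call can be sized with a back-of-the-envelope calculation. This is only a rough sketch: it assumes the data is stored as float64 and that `argsort` returns 64-bit indices, neither of which is stated above, and the helper name `dense_bytes` is made up for illustration.

```python
# Shape reported in the question.
N_SAMPLES, N_FEATURES = 60000, 784

# Bytes per element for the dtypes considered here.
ITEMSIZE = {"float64": 8, "int64": 8, "int32": 4}


def dense_bytes(dtype):
    """Bytes needed for one dense (N_SAMPLES, N_FEATURES) array of `dtype`."""
    return N_SAMPLES * N_FEATURES * ITEMSIZE[dtype]


# Assumptions (not stated in the question): float64 input data and
# int64 indices from argsort, as on a typical 64-bit build.
estimates = [
    ("input data (float64, assumed)", dense_bytes("float64")),
    ("argsort result (int64, assumed)", dense_bytes("int64")),
    ("astype(np.int32) copy", dense_bytes("int32")),
]

for name, size in estimates:
    print("%-32s %6.1f MiB" % (name, size / 2.0 ** 20))

total = sum(size for _, size in estimates)
print("%-32s %6.1f MiB" % ("held at the same time (worst case)", total / 2.0 ** 20))
```

Under these assumptions the three arrays alone come to roughly 900 MiB, before counting any other copies of the data held by the script or by the forest itself, which is a large share of 4 GB of RAM.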