When I tried to verify that the GPU outperforms the CPU on matrix operations, I got unexpected results: the CPU performed better than the GPU in my experiment, which confuses me.
I ran matrix multiplication on the CPU and on the GPU respectively. The programming environment is MXNet with CUDA 10.1.
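In case it matters, here is a quick sanity check of the environment (I'm assuming mx.context.num_gpus() is a reasonable way to confirm the GPU is visible to MXNet):

import mxnet as mx

print(mx.__version__)         # installed MXNet version
print(mx.context.num_gpus())  # should print at least 1 if CUDA and the GPU are visible to MXNet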
With the GPU:
import mxnet as mx
from mxnet import nd
x = nd.random.normal(shape=(100000,100000),ctx=mx.gpu())
y = nd.random.normal(shape=(100000,100000),ctx=mx.gpu())
%timeit nd.dot(x,y)
50.8 µs ± 1.76 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
With the CPU:
x1 = nd.random.normal(shape=(100000,100000),ctx=mx.cpu())
y1 = nd.random.normal(shape=(100000,100000),ctx=mx.cpu())
%timeit nd.dot(x1,y1)
33.4 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Why is the CPU faster? My CPU is an Intel i5-6300HQ and my GPU is an Nvidia GTX 950M.
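One thing I am not sure about is whether %timeit is really measuring the multiplication itself, since MXNet may queue NDArray operations asynchronously. Below is a minimal sketch of a timing with explicit synchronization, assuming nd.waitall() blocks until all queued work has finished (the 2000x2000 size is just an arbitrary smaller example, not what I measured above):

import time
import mxnet as mx
from mxnet import nd

n = 2000  # arbitrary smaller size so the matrices comfortably fit in memory
a = nd.random.normal(shape=(n, n), ctx=mx.gpu())
b = nd.random.normal(shape=(n, n), ctx=mx.gpu())
nd.waitall()                     # make sure the random matrices are fully materialized first

start = time.time()
c = nd.dot(a, b)                 # this call only queues the multiplication
nd.waitall()                     # block until the queued computation has actually finished
print('gpu dot: %.6f s' % (time.time() - start))

Would timing it this way (or the original %timeit way) be the correct comparison?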