WordCount is a simple but inefficient example. Use it to validate that your cluster is working, but NEVER for performance tests.
Let me explain why.
WordCount parses each line of text and, for each word found, writes the record (WORD, 1) to the mapper output. As you can see, the full output of the mappers will be bigger than the input, and that larger mapper output becomes the input of the reducers. So you end up reading more than twice the amount of the input data, and writing the original input plus the counters to disk.
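For reference, here is what the classic WordCount mapper looks like (a minimal sketch based on the standard Hadoop example; the class name is illustrative):

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class TokenizerMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            // One (WORD, 1) record per word: the mapper output carries
            // every word of the input plus a counter for each occurrence,
            // which is why it is bigger than the input itself.
            context.write(word, ONE);
        }
    }
}
```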
On top of that, you need to transfer the mapper output to the reducers over the network (the shuffle phase). And if you are using only one reducer, that last step does essentially the same work as your sequential job.
The job can be optimized, for example by using a combiner (to pre-aggregate counts on the map side, shrinking the shuffle) and multiple reducers, as in the sketch below.
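A minimal driver sketch showing both optimizations (class names like `WordCountDriver` and `IntSumReducer` are illustrative, taken from the standard Hadoop examples):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCountDriver.class);

        job.setMapperClass(TokenizerMapper.class);
        // Summing counts is associative and commutative, so the reducer
        // can double as a combiner: partial sums are computed on the map
        // side, so far less data crosses the network in the shuffle.
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        // Multiple reducers spread the final aggregation across the
        // cluster instead of funneling everything through one task.
        job.setNumReduceTasks(4);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```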
Hadoop will be faster than a local sequential job when the amount of data is bigger than what your local resources (RAM, disk, CPU) can handle, and/or when the cost of initializing the containers and transferring data among them is amortized by the number of nodes working in parallel.