Well, if you mean Hadoop-style MapReduce, it is actually map-shuffle-reduce, where the shuffle is the reason map and reduce are separated. At a slightly higher level you can think about it in terms of data locality. Each key-value pair passed through map can generate zero or more key-value pairs. To be able to reduce these, you have to ensure that all values for a given key are available to a single reducer, hence the shuffle. What is important is that pairs emitted from a single input pair can be processed by different reducers.
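For illustration, here is a minimal single-machine sketch of that map-shuffle-reduce flow; `mapper`, `shuffle` and `reducer` are names made up for this example, not Hadoop's actual API:

```scala
object MapShuffleReduceSketch {
  // map: one input line can emit zero or more (word, 1) pairs
  def mapper(line: String): Seq[(String, Int)] =
    line.split("\\s+").filter(_.nonEmpty).map(word => (word, 1)).toSeq

  // shuffle: group all values for a given key so a single reducer sees them;
  // pairs emitted from the same input line may land in different groups
  // (i.e. on different reducers)
  def shuffle(pairs: Seq[(String, Int)]): Map[String, Seq[Int]] =
    pairs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2)) }

  // reduce: fold all values of one key into a single result
  def reducer(key: String, values: Seq[Int]): (String, Int) =
    (key, values.sum)

  def main(args: Array[String]): Unit = {
    val input    = Seq("to be or not to be", "", "that is the question")
    val mapped   = input.flatMap(mapper)         // the empty line emits nothing
    val shuffled = shuffle(mapped)
    val reduced  = shuffled.map { case (k, vs) => reducer(k, vs) }
    reduced.toSeq.sortBy(_._1).foreach(println)
  }
}
```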
It is possible to use patterns like map-side aggregation or combiners, but at the end of the day it is still (map)-reduce-shuffle-reduce.
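A rough sketch of what a combiner buys you, assuming some made-up partitions of word counts: each partition is pre-reduced locally so less data has to cross the shuffle, but the per-key reduce after the shuffle is still required:

```scala
object CombinerSketch {
  def main(args: Array[String]): Unit = {
    // two illustrative "partitions" of (word, 1) pairs
    val partitions: Seq[Seq[(String, Int)]] = Seq(
      Seq(("a", 1), ("b", 1), ("a", 1)),
      Seq(("b", 1), ("a", 1))
    )

    // combiner: reduce per key within each partition, before the shuffle
    val combined: Seq[(String, Int)] = partitions.flatMap { part =>
      part.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
    }

    // shuffle + final reduce: the same per-key combination, across partitions
    val result = combined.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }
    println(result) // Map(a -> 3, b -> 2)
  }
}
```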
Assuming data locality is not an issue, higher-order functions like map and reduce provide an elegant abstraction layer. Finally, it is a declarative API. A simple expression like xs.map(f1).reduce(f2) describes only the what, not the how. Depending on the language or context, these operations can be evaluated eagerly or lazily, fused together, or, in more complex scenarios, reordered and optimized in many different ways.
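As a small illustration of the eager/lazy point, in Scala the same declarative expression can be evaluated strictly (materializing an intermediate collection) or through a view or iterator (one fused pass over the data); the result is identical either way:

```scala
object EagerVsLazy {
  def main(args: Array[String]): Unit = {
    val xs = Vector(1, 2, 3, 4, 5)
    val f1 = (x: Int) => x * x
    val f2 = (a: Int, b: Int) => a + b

    val eager = xs.map(f1).reduce(f2)          // intermediate Vector materialized
    val lzy   = xs.view.map(f1).reduce(f2)     // elements transformed on demand
    val iter  = xs.iterator.map(f1).reduce(f2) // also a single fused pass

    println((eager, lzy, iter)) // (55,55,55)
  }
}
```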
Regarding your code: even if the signatures were correct, it wouldn't really reduce the number of passes over the data. Moreover, if you push map into the aggregation, the arguments passed to the aggregation function are no longer of the same type. That means either a sequential fold or much more complex merging logic.
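To make the type issue concrete (your code isn't reproduced here, so xs, f1 and f2 below are placeholders): once map (A => B) is pushed inside the aggregation, the accumulator is a B while the element is still an A, so a plain (B, B) => B reduce no longer fits. You either fold sequentially, or keep a separate merge step for partial results (which is what APIs like Spark's RDD.aggregate expose):

```scala
object PushMapIntoAggregation {
  def main(args: Array[String]): Unit = {
    val xs: Seq[String]        = Seq("1", "2", "3") // element type A = String
    val f1: String => Int      = _.toInt            // map: A => B
    val f2: (Int, Int) => Int  = _ + _              // reduce: (B, B) => B

    // two passes over the data: map, then reduce over values of one type
    val twoPass = xs.map(f1).reduce(f2)

    // one pass, but strictly sequential: the step function is (B, A) => B
    val folded = xs.foldLeft(0)((acc, a) => f2(acc, f1(a)))

    // one pass and parallelizable: needs BOTH a (B, A) => B step and a
    // (B, B) => B merge of partial results, i.e. the more complex logic
    val partials = xs.grouped(2).map(_.foldLeft(0)((acc, a) => f2(acc, f1(a))))
    val merged   = partials.reduce(f2)

    println((twoPass, folded, merged)) // (6,6,6)
  }
}
```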