I'll use Couchbase as an example of a distributed cache (http://www.couchbase.com/).
First question: How does a distributed cache coordinate data?
Answer: Usually the distributed cache is indeed many machines acting as one logical unit. So you might have five computers all running Couchbase, and they take care of data integrity and redundancy for you. In other words, if one machine dies, you can still get your data from the cluster. (And yes, the data is replicated to other nodes so copies are available in case of failures.)
Some clustered technologies put a routing process in front of the nodes in the cluster; with others you give the client multiple connection strings and it round-robins requests across the cluster. It just depends on the technology.
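As a rough illustration of the round-robin approach, here is a minimal sketch in Python. The node addresses and the per-request node picking are assumptions for illustration only; a real Couchbase client tracks the cluster topology for you.

```python
from itertools import cycle

# Hypothetical addresses for a five-node cache cluster (illustrative only,
# not a real Couchbase connection-string format).
NODES = ["cache1:11210", "cache2:11210", "cache3:11210",
         "cache4:11210", "cache5:11210"]

_node_iterator = cycle(NODES)

def next_node():
    """Pick the next node in round-robin order for the next request."""
    return next(_node_iterator)

# Each request goes to a different node; the cluster itself takes care of
# locating the data and its replicas internally.
for key in ["user:1", "user:2", "user:3"]:
    print(f"sending GET {key!r} to {next_node()}")
```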
Second question: Why use a cache since it all goes over the network?
Answer: Quite a few of the distributed cache technologies out there live solely in memory (RAM). They never have to go to disk for a lookup, so they are faster than a typical database.
Also, databases often have to do work to join data together from multiple tables, whereas a cache usually just stores data as key/value pairs. This means the cache never has to actually process anything; it just does straight lookups, which are cheap.
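To make the "straight lookup" point concrete, here is a minimal cache-aside sketch. The plain dict stands in for the distributed cache client, and fetch_user_with_orders_from_db is a hypothetical stand-in for a relational query that joins several tables; both are assumptions for illustration.

```python
import time

# A plain dict stands in for the distributed cache here; in production this
# would be a get/set call against the cache client.
cache = {}

def fetch_user_with_orders_from_db(user_id):
    """Hypothetical stand-in for a relational query that joins several tables."""
    time.sleep(0.05)  # simulate the cost of disk access and the join
    return {"id": user_id, "name": "Alice", "orders": [101, 102]}

def get_user_with_orders(user_id):
    key = f"user_with_orders:{user_id}"
    value = cache.get(key)  # straight key lookup: no joins, no disk
    if value is None:
        value = fetch_user_with_orders_from_db(user_id)
        cache[key] = value  # store the already-joined result for next time
    return value

get_user_with_orders(42)  # miss: hits the "database" once
get_user_with_orders(42)  # hit: answered from memory with a single lookup
```

The second call never touches the database; it is answered by a single key lookup on an already-joined result.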
Third question: Why a distributed cache over local caches?
Answer: When you start to scale, you will want a distributed cache.
First of all, the cache can grow quite large, and if it runs only in memory it will compete with your web server (or whatever else is on the box) for resources. Better to have machines dedicated to caching.
Secondly, the cache scales differently from the other technologies in your stack. You might need only four cache nodes for every ten web server nodes. Better to keep them separate.
Lastly, you want any client to be able to connect and get the most current data. Otherwise, if a user bounces from one web server to another in a web farm, each server's local copy of the cached data could be quite different (see the sketch below).
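Here is a minimal sketch of that last point, with plain dicts standing in for the per-server local caches and the shared cache (assumptions for illustration only):

```python
# Two web servers, each with its own in-process local cache (plain dicts
# as stand-ins).
server_a_cache = {}
server_b_cache = {}

# Server A handles a profile update and refreshes only its own local cache.
server_a_cache["user:42:name"] = "Alice (updated)"

# The user's next request lands on server B, whose local cache never saw
# the update, so it serves nothing (or a stale value).
print(server_b_cache.get("user:42:name"))  # None

# With one shared/distributed cache, both servers read and write the same
# store, so whichever server handles the request sees the same data.
shared_cache = {}
shared_cache["user:42:name"] = "Alice (updated)"  # written via server A
print(shared_cache.get("user:42:name"))           # read via server B
```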