I think one main point of difference is the context of the problem.
Although a problem could be solved with either pattern, the real concerns are:
1: "How much of the change brought about by the events depends on the general context?"
2: "How frequently are the listeners expected to change?"
The classical case for the mediator pattern illustrates this best: a complex UI with a lot of components, where updating each one depends in a complex way on the state of other similar components.
You can also solve this problem with the pub/sub pattern, wherein your components listen for events and contain the logic necessary to update themselves, while the context object (along with the event) carries all the necessary information. The advantage here is the proper encapsulation of the logic pertaining to a component within the component itself. The downside is that if such components are expected to change often, you have to replicate this logic fully in each new component you bring in.
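To make the pub/sub variant concrete, here is a minimal sketch in Python. All the names (EventBus, SubmitButton, WarningLabel, the "form_changed" topic and the keys of the context dict) are made up for illustration and do not come from any particular library. Note how each component carries its own update rule.

```python
from collections import defaultdict
from typing import Any, Callable, Dict


class EventBus:
    """Hypothetical minimal pub/sub hub: topics map to lists of handlers."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Dict[str, Any]], None]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, context: Dict[str, Any]) -> None:
        # The context object carries everything a listener needs to update.
        for handler in list(self._handlers[topic]):
            handler(context)


class SubmitButton:
    def __init__(self, bus: EventBus) -> None:
        self.enabled = False
        bus.subscribe("form_changed", self.on_form_changed)

    def on_form_changed(self, context: Dict[str, Any]) -> None:
        # The rule for enabling the button lives inside the component.
        self.enabled = bool(context["name"]) and bool(context["agreed"])


class WarningLabel:
    def __init__(self, bus: EventBus) -> None:
        self.text = ""
        bus.subscribe("form_changed", self.on_form_changed)

    def on_form_changed(self, context: Dict[str, Any]) -> None:
        # This component also owns its own rule.
        self.text = "" if context["agreed"] else "Please accept the terms"


bus = EventBus()
button = SubmitButton(bus)
label = WarningLabel(bus)
bus.publish("form_changed", {"name": "Ada", "agreed": False})
```

If SubmitButton were swapped for a different widget, the enabling rule would have to be rewritten inside the replacement, which is the replication cost mentioned above.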
To use a mediator is to introduce another layer and abstract further from the components. The components become thinner, as they only deal with representation (UI look and feel), and thus become very easy to change. The only problem I have with this approach is that the update logic now seems to spill outside the component, so any update to the system requires changing both the component and the mediator if the component's behavior is also to change.
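For comparison, a rough sketch of the mediator variant of the same kind of form, again in Python and again with hypothetical names (FormMediator, SubmitButton, WarningLabel). The components only hold display state, and the mediator owns every inter-dependency rule.

```python
class SubmitButton:
    """Thin component: representation only, no update rules."""

    def __init__(self) -> None:
        self.enabled = False


class WarningLabel:
    """Thin component: representation only, no update rules."""

    def __init__(self) -> None:
        self.text = ""


class FormMediator:
    """Hypothetical mediator that knows the components and all the rules."""

    def __init__(self, button: SubmitButton, label: WarningLabel) -> None:
        self.button = button
        self.label = label

    def form_changed(self, name: str, agreed: bool) -> None:
        # All inter-dependency logic is concentrated here.
        self.button.enabled = bool(name) and agreed
        self.label.text = "" if agreed else "Please accept the terms"


button = SubmitButton()
label = WarningLabel()
mediator = FormMediator(button, label)
mediator.form_changed(name="Ada", agreed=False)
```

Replacing SubmitButton with another widget is now trivial as long as it exposes `enabled`, but changing what the button should do means touching FormMediator as well.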
That, to me, is the major dilemma/trade-off we need to solve. Please correct me if I have got anything wrong.