It is important to keep in mind that observables are collections over time or, put another way, a series of events. Enumerables, by contrast, can be thought of as collections at a moment in time.
It makes sense to join over enumerables: all of the values are available to you at the moment you make the join.
It's different when using Rx - it's almost like you need to do some sort of time travel!
So, whenever you try to do a "join" in the Rx world you are saying something like "for some period of time please remember all of the values on observable A and match them with values that happen on observable B during that period."
The Join operator in Rx is specifically used to define custom periods of time and to observe events that occur within those periods.
The classic situation is that you have a stream of events for whenever an individual enters or exits a room and you want to know who was in the room when some event (say the light was turned on) occurs.
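To make the room example concrete, here is a minimal sketch in plain Python rather than Rx itself. It only models the idea of matching an event against the time windows that are open when it happens. The names `Stay`, `who_was_present`, and the sample data are hypothetical, and timestamps are assumed to be simple numbers.

```python
from typing import List, NamedTuple, Optional, Tuple


class Stay(NamedTuple):
    """One person's time window in the room (hypothetical model type)."""
    person: str
    entered: float
    exited: Optional[float]  # None means the person has not left yet


def who_was_present(stays: List[Stay], event_times: List[float]) -> List[Tuple[float, str]]:
    """Pair each event time with every person whose window covers it."""
    matches = []
    for t in event_times:
        for stay in stays:
            still_open = stay.exited is None or t < stay.exited
            if stay.entered <= t and still_open:
                matches.append((t, stay.person))
    return matches


# Assumed sample data: three stays and two "light turned on" events.
stays = [Stay("alice", 1, 5), Stay("bob", 3, None), Stay("carol", 6, 8)]
light_on_times = [4, 7]

result = who_was_present(stays, light_on_times)
print(result)
```

The important part is that each person is only remembered for the duration of their own window, which is roughly what the duration selectors of a windowed join let you express.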
In some ways your second query, the SelectMany query, is just a join that occurs over the lifetime of the two observables, so Rx has to remember all of the values to generate matches. It is effectively a pair of collections being built, with joins performed as values are added.
SelectMany performs well as long as the input sequences do not grow too large (moderately large inputs can still be fine) and they eventually terminate. Hot observables that never complete, such as event patterns from clicks, are a poor choice to run a SelectMany against.
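The cost of an unbounded join can be illustrated with a small plain-Python model. This is a sketch of the "pair of collections" description above, not of how Rx implements SelectMany internally. The class `UnboundedJoin` and its methods are hypothetical names introduced for illustration.

```python
class UnboundedJoin:
    """Remembers every value from both sides and pairs each new
    arrival with everything seen so far on the other side."""

    def __init__(self):
        self.left_seen = []
        self.right_seen = []

    def on_left(self, value):
        # Nothing is ever forgotten, so this list only grows.
        self.left_seen.append(value)
        return [(value, r) for r in self.right_seen]

    def on_right(self, value):
        self.right_seen.append(value)
        return [(l, value) for l in self.left_seen]


join = UnboundedJoin()
pairs = []
pairs += join.on_left("a1")
pairs += join.on_right("b1")
pairs += join.on_left("a2")
pairs += join.on_right("b2")

# Every value that ever arrived is still held in memory.
remembered = len(join.left_seen) + len(join.right_seen)
print(pairs, remembered)
```

In this model, memory use and the work done per new value both grow with the number of values seen, which is why a source that never terminates is a problem.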
So, if you have a specific time period to join against, use Join, but if you just want to join every value between two observables, use SelectMany.