The performance difference you're seeing is probably not a simple iOS/Android difference; it will depend on both how you use the API and how the driver implements glBufferSubData. Without seeing more code, or knowing what performance metrics you're gathering, it's hard to comment further.
What does "that rendering must drain from the pipeline before the data store can be updated" really mean?
The idea here is that whilst the OpenGL API gives the illusion that each command is executed to completion before the next one starts, in fact drawing is generally batched up and executed asynchronously in the background. The problem is that glBufferSubData potentially adds a synchronisation point: the driver may have to stall until all previous rendering that uses that buffer has completed before it can continue.
Consider the following example. In a good case, we might have something like this:
- glBufferSubData into buffer 1 with ABCDE
- Draw call using buffer 1
- glBufferSubData into buffer 2 with FGHIJ
- Draw call using buffer 2
- Swap buffers <----- Synchronisation point, the driver must wait for rendering to finish before swapping the buffers
However, if you're overwriting the same buffer, you will get this instead:
- glBufferSubData into buffer 1 with ABCDE
- Draw call using buffer 1
- glBufferSubData into buffer 1, overwriting with FGHIJ <----- Synchronisation point, as the driver must ensure that the buffer has finished being used by the first draw call before modifying the data
- Draw call using updated buffer 1
- Swap buffers <----- Synchronisation point, the driver must wait for rendering to finish before swapping the buffers
As you can see, you can potentially end up with a second synchronisation point. However, as mentioned before, this is somewhat driver specific. For example, some drivers might be able to detect the case where the section of the buffer you're updating isn't in use by the previous draw call, whilst others might not. Something of this nature is probably what's causing the performance difference you're seeing.
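To make the two sequences above concrete, here is a toy model in C that counts synchronisation points for a given order of buffer updates. It is only a sketch and not real driver or OpenGL code: the ToyDriver type and the toy_* and count_sync_points names are hypothetical, and it assumes the most conservative driver, one that stalls whenever the target buffer is referenced by any queued draw and that always waits at the swap.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BUFFERS 8 /* arbitrary limit for this toy model */

/* Hypothetical driver state: which buffers are still referenced by
 * queued (unfinished) draw calls, and how often we had to wait. */
typedef struct {
    bool in_flight[MAX_BUFFERS];
    int sync_points;
} ToyDriver;

/* Wait for all queued rendering to finish: one synchronisation point. */
static void toy_drain(ToyDriver *d) {
    for (int i = 0; i < MAX_BUFFERS; ++i) {
        d->in_flight[i] = false;
    }
    d->sync_points++;
}

/* Stands in for glBufferSubData: stall only if a queued draw still
 * reads from the buffer that is about to be overwritten. */
static void toy_buffer_sub_data(ToyDriver *d, int buf) {
    assert(buf >= 0 && buf < MAX_BUFFERS);
    if (d->in_flight[buf]) {
        toy_drain(d);
    }
}

/* Stands in for a draw call: it is queued, not executed immediately. */
static void toy_draw(ToyDriver *d, int buf) {
    assert(buf >= 0 && buf < MAX_BUFFERS);
    d->in_flight[buf] = true;
}

/* Stands in for the buffer swap: modelled as always waiting. */
static void toy_swap(ToyDriver *d) {
    toy_drain(d);
}

/* Run one frame: for each entry, update that buffer and then draw
 * from it; finish with a swap. Returns the number of sync points. */
int count_sync_points(const int *buffers, int n) {
    ToyDriver d = {0};
    for (int i = 0; i < n; ++i) {
        toy_buffer_sub_data(&d, buffers[i]);
        toy_draw(&d, buffers[i]);
    }
    toy_swap(&d);
    return d.sync_points;
}
```

Under these assumptions, passing the sequence {1, 2} (two different buffers) gives 1 synchronisation point, whilst {1, 1} (the same buffer twice) gives 2. A real driver may do better than this model, for instance by tracking which byte ranges are actually in use, which is exactly why the behaviour varies between platforms.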