The intuition you're looking for is that the algorithm relies on the probability of seeing the entire bit pattern at the beginning of the hash (k zeros, followed by a 1), not just the zeros.
The more difficult part is getting from there to estimating the cardinality at 2^(k+1). Unfortunately, the formal proof of this isn't straightforward. In fact, most of the original paper that introduced the method (Flajolet and Martin, "Probabilistic Counting Algorithms for Data Base Applications", http://algo.inria.fr/flajolet/Publications/FlMa85.pdf) is devoted to proving that the estimate computed with it is a good one. Subsequent papers (the LogLog and HyperLogLog papers) have similar proofs for their improved estimates.
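If it helps to see the intuition numerically, here is a rough sketch (not the estimator from the paper). It assumes a 32-bit hash taken from truncated SHA-256, and the helper names `hash32`, `leading_zeros`, `pattern_frequencies`, and `crude_estimate` are made up for illustration. For a uniform hash, the pattern "k zeros, followed by a 1" should show up with probability 1/2^(k+1), so among n distinct items you'd expect the largest k you see to be somewhere around log2(n).

```python
import hashlib


def hash32(item: str) -> int:
    """32-bit hash of item (truncated SHA-256; chosen only for illustration)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def leading_zeros(h: int, width: int = 32) -> int:
    """Number k of zero bits before the first 1 in the width-bit value h.

    Returns width when h == 0 (no 1 bit at all).
    """
    return width - h.bit_length()


def pattern_frequencies(n: int, max_k: int = 8) -> list[float]:
    """Observed fraction of n distinct items whose hash starts with
    exactly k zeros followed by a 1, for k = 0..max_k."""
    counts = [0] * (max_k + 1)
    for i in range(n):
        k = leading_zeros(hash32(f"item-{i}"))
        if k <= max_k:
            counts[k] += 1
    return [c / n for c in counts]


def crude_estimate(items) -> int:
    """Very noisy single-register guess: 2^(k+1) for the largest k seen.

    This skips the bias correction and averaging that the papers analyse,
    so it can easily be off by a factor of two or more.
    """
    max_k = max(leading_zeros(hash32(x)) for x in items)
    return 2 ** (max_k + 1)


if __name__ == "__main__":
    n = 20000
    for k, freq in enumerate(pattern_frequencies(n)):
        print(f"k={k}: observed {freq:.4f}, expected {2 ** -(k + 1):.4f}")
    print("crude estimate:", crude_estimate(f"item-{i}" for i in range(n)))
```

The observed frequencies should track 1/2^(k+1) closely, which is the easy half of the argument. The crude estimate jumps around a lot from one data set to another, and taming that variance is what the proofs in the papers are about.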
Hope that helps!