The factors involved are more complex than that.
It is true that increasing the number of indexes can potentially increase eventual-consistency lag, since more work is needed to apply them: indexes are written asynchronously, whereas the entity itself is always applied before the commit returns. Read/write patterns can also affect eventual consistency, for example by exceeding the guideline of 1 transaction per second per entity group. Other factors at the tail end include data center or replica outages and the subsequent recovery.
It is unlikely a benchmark will give you great visibility into these issues, especially since you cannot force the benchmark to hit particular replicas.
The general guidance is that eventual consistency is resolved within milliseconds on average, and occasionally within seconds. At the extreme tail end, depending on the factors above, this could extend to hours or more (for example, when a data center is just coming back online).
Where eventual consistency is unacceptable, use the strong consistency mechanisms (ancestor queries, lookups by key) if possible.
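As a rough illustration of the two strongly consistent read paths mentioned above, here is a minimal sketch assuming the `google-cloud-datastore` Python client. The kinds `TaskList` and `Task` and both function names are hypothetical; `client` is assumed to be a `datastore.Client` created elsewhere.

```python
# Sketch only. `client` is assumed to be a google.cloud.datastore.Client, e.g.:
#     from google.cloud import datastore
#     client = datastore.Client()
# The kinds "TaskList" and "Task" and these function names are hypothetical.


def fetch_tasks_strong(client, list_name):
    """Ancestor query: scoped to a single entity group."""
    parent_key = client.key("TaskList", list_name)
    query = client.query(kind="Task", ancestor=parent_key)
    return list(query.fetch())


def fetch_task_strong(client, list_name, task_id):
    """Lookup by key: reads the entity directly rather than through an index."""
    task_key = client.key("TaskList", list_name, "Task", task_id)
    return client.get(task_key)  # None if the entity does not exist
```

Note that putting many entities under one ancestor ties them to the same entity group, so the write-rate guideline mentioned earlier applies to the group as a whole.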