Average Case Analysis of Sequential Search with Geometric Probability Distribution

Question

I was kind of aware of getting the average running time in a uniform distribution. Say for example we have 6 array elements.

| 1/6  | 1/6   | 1/6  | 1/6   | 1/6  |  1/6   |

Above is the array with the uniform probability distribution of a search element being positioned in every subscript in the array.

So getting the average running time in a uniform distribution will be like the solution below:

T(n) = (1/6)*1 + (1/6)*2 + (1/6)*3 + (1/6)*4 + (1/6)*5 + (1/6)*6
     = (1/6) * ( 1 + 2 + 3 + 4 + 5 + 6 )
     =  3.5

or when in express in n terms:

T(n) = (1/n) * ((n(n+1))/2)
     = (n+1) / 2
     = ϴ(n)

But what about the average-case number of key comparisons in sequential search under a geometric probability distribution?

Example:

Prob(target X is in the jth position) = 1/(2^(j+1))
where j = 0, 1, 2,3,4,5,6,...


| 1/(2^(0+1))  | 1/(2^(1+1))  | 1/(2^(2+1)) | 1/(2^(3+1))  | 1/(2^(4+1))  |  1/(2^(5+1))   |

Then

T(j) = ((1/2)* 1) + ((1/4)* 2) + ((1/8)* 3) + ((1/16)* 4) + ((1/32)* 5) + ((1/64)* 6)
     =  .5 + .25(2) + .125(3) + .0625(4) + .03125(5) + .015625(6)
     =  .5 + .5 + .375 + .25 + .15625 + .09375

     =  1.875

I dont know how to express it in j terms:

 T(j) = ?

What is the upperbound O(j)? lowerbound Ω(j)? tightbound ϴ(j)?

Any help or ideas , will be very much appreciated.

UPDATE:

I think that the formula for getting the run time is below:

 T(j) = ((sum of geometric series) or (sum of geometric series / n)?)* (n((n+1))/2)

I just need first the formula for the sum of geometric series when each element in the series has the value 1/(2^(j+1)), where j = 0, 1, 2, ...up to n. Please help.

UPDATE:

I figured out that the formula for the sum og geometric series was 1- (1/2^n)

 T(n) = (1-1/(2^n))/n * (n(n+1)/2)
      = (n+1)/2 - (n+1)/(2^(n+1))

Now, my question is what is the Big Oh, Big Omega, Big Theta, little oh and little omega? Is it possible that an algorithm has no lower bound?

I mean if j is just another variable for the number of indexes, then just use the same approach as in your previous example. — SteamyThePunk, Aug 11 '18 at 18:56
For your specific example, if you're truncating at 6 you don't a valid probability distribution because it sums to 63/64 rather than 1. You need to weight your answer by 64/63. In general, Sum( x * p(x) ) is the definition of the expected value of X for a discrete distribution. — pjs, Aug 11 '18 at 18:58
@SteamyThePunk That won't work because the p(x)'s aren't constant, so they can't be extracted as a common factor. — pjs, Aug 11 '18 at 19:00
You just need to exchange for terms that are algebraically equivalent. 1/2 + 1/4 + 1/8 is the sum of 1/(2^(j+1)) for j = 0 to 3... this is zeno's paradox, and you can easily find that you can rephrase it in terms of n. P(n) = -1/(2^(n+1)) + 1 Now just replace the equivalent term, just as you did in the prior example. — SteamyThePunk, Aug 11 '18 at 19:56
Can you explain how you obtained the result `(n+1)/2 - (n+1)/(2^(n+1))`? Seems dubious to me, especially the way you just multiplied by `n(n+1)/2`. — meowgoesthedog, Aug 14 '18 at 10:03

score 4 · Answer 1 · answered Aug 13 '18 at 14:08

Your example is the following sum:

This can be converted to a geometric series using a simple trick. First, re-write the base as a parameter:

Where the vertical bar means "evaluated at [this value]". Recall the derivative of x^i:

The term within the brackets is now a geometric series; recall the standard result:

Observe that this converges to a constant as j increases.

SteamyThePunk · Answer 2 · 2018-08-11T17:27:58.007

Initially I thought to treat this as a continuous function and suggest calculus to calculate the integral curve of the probability curve, multiply the integral function by x (the number of key comparrisons up to that check). And then take the difference of the min and Max... But that is not necessary as you can't ask for indexes in-between indexes.

So you're going to use the integer approximation. Take the probability function and determine the probability for each index value. Calculate the sum of these probabilities each multiplied by the number of key comparrisons that took place.

1*p(0)+2*p(1)+3*p(2)= your answer.

Thank you for your answer. I have updated my post to clarify things. Please read it :) — alyssaeliyah, Aug 11 '18 at 17:58

Average Case Analysis of Sequential Search with Geometric Probability Distribution

2 Answers2