
I need help interpreting the formulas from the quicksort algorithm description in the CLRS book, page 157

My questions are overlaid on the page screenshot. The formula is about the quicksort algorithm.

  1. What is the logic behind E[X]? I mean, what is happening in quicksort that led the authors to write this formula?
  2. Given n=3, the result is 8, OK, but what does 8 mean semantically in quicksort?
Jan Dolejsi
Michael
  • The formula is computing the expected number of comparisons that occur in quicksort. Informally, the expected value is like an average or mean over all possible sequences. – President James K. Polk Nov 03 '19 at 19:47
  • Khan Academy has a good article about the expected value of quicksort: https://www.khanacademy.org/computing/computer-science/algorithms/quick-sort/a/analysis-of-quicksort – JohanC Nov 03 '19 at 19:49

1 Answer


In order to understand the analysis in the book you cited, you need to understand some concepts from probability theory.

We want to determine the average number of comparisons made by quicksort.

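We define some random variables. Let z_1, ..., z_n be the elements being sorted. For each pair i < j, X_ij = 1 if z_i and z_j are compared at some point during the run, and 0 otherwise. X is a random variable for the total number of comparisons made. Quicksort compares any given pair at most once, so X = sum over all pairs i < j of X_ij.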
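To make the indicator variables concrete, here is a minimal sketch (not the book's in-place PARTITION procedure) that runs a quicksort with uniformly random pivots and records which pairs get compared. The function name `quicksort_compared_pairs`, the list-based partitioning, and the requirement that all values be distinct are assumptions made for illustration only.

```python
import random


def quicksort_compared_pairs(values, rng):
    """Sort distinct `values` with uniformly random pivots.

    Returns (sorted_list, compared), where `compared` is the set of
    pairs (a, b) with a < b that were compared during the run.
    Assumes all values are distinct.
    """
    compared = set()

    def sort(items):
        if len(items) < 2:
            return items
        pivot = items[rng.randrange(len(items))]
        smaller, larger = [], []
        for x in items:
            if x == pivot:
                continue  # the pivot is not compared with itself
            # One comparison between x and the pivot happens here.
            compared.add((min(x, pivot), max(x, pivot)))
            (smaller if x < pivot else larger).append(x)
        return sort(smaller) + [pivot] + sort(larger)

    return sort(list(values)), compared


rng = random.Random(0)  # fixed seed only so the run is repeatable
result, compared = quicksort_compared_pairs([30, 10, 20], rng)

# X_ij for every pair: 1 if the pair was compared, 0 otherwise.
x_ij = {
    (a, b): int((a, b) in compared)
    for a in result
    for b in result
    if a < b
}
total_comparisons = sum(x_ij.values())  # this is X for this particular run
```

Each run gives one concrete value of every X_ij and therefore one value of X. With three elements, X is 2 when the middle element is chosen as the first pivot and 3 otherwise.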

E[X] is the expected value of X, i.e. the value "on average". Expectation is linear, so the expectation of a sum is equal to the sum of expectations. That's how the book concludes that the average number of comparisons is equal to the sum over i and j of E[X_ij]. Since X_ij is either 0 or 1, its expected value is the same as the probability of making a comparison between z_i and z_j.
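As a rough check of this argument, the sketch below computes the probability of each comparison exactly and then adds the probabilities up to get E[X]. It assumes the elements are distinct, that each element of a subarray is equally likely to be picked as pivot, and that elements are labelled by rank (z_i is the i-th smallest, as in the CLRS analysis). The function name `pair_probabilities` is made up for this example.

```python
from fractions import Fraction


def pair_probabilities(lo, hi):
    """Exact Pr[z_i and z_j are compared] for every pair of ranks in lo..hi."""
    size = hi - lo + 1
    probs = {}
    if size < 2:
        return probs
    weight = Fraction(1, size)  # each rank is equally likely to be the pivot
    for pivot in range(lo, hi + 1):
        # The pivot is compared with every other element of the subarray.
        for other in range(lo, hi + 1):
            if other != pivot:
                pair = (min(pivot, other), max(pivot, other))
                probs[pair] = probs.get(pair, 0) + weight
        # The two sides are then sorted recursively; their pairs are disjoint.
        for side in (pair_probabilities(lo, pivot - 1),
                     pair_probabilities(pivot + 1, hi)):
            for pair, p in side.items():
                probs[pair] = probs.get(pair, 0) + weight * p
    return probs


probs = pair_probabilities(1, 3)         # E[X_ij] for n = 3
expected_comparisons = sum(probs.values())  # E[X] by linearity of expectation
```

For n = 3 the probabilities come out as 1 for (z_1, z_2), 2/3 for (z_1, z_3), and 1 for (z_2, z_3), which matches the 2/(j - i + 1) term in the CLRS analysis, and their sum is E[X] = 8/3. That value is the number of comparisons you would see per run, averaged over many runs on three distinct elements.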

If you're still having trouble following this, you'll need to read up on probability, random variables, and expected values.

jdonland
  • Thanks, but I don't understand what this means: "We define some random variables. X_ij = 1 if z_i and z_j are compared, and 0 otherwise". – Michael Nov 03 '19 at 19:57
  • We're going to sort a list z_1, z_2, ..., z_n. So z_i and z_j are just any two arbitrary elements of the list. For example, if at some point the algorithm compares z_7 and z_23 to see which should occur first in the sorted list, then X_7,23 = 1. If not, then X_7,23 = 0. If we sum these X_ij up over all values of i and j, we get the total number of comparisons. – jdonland Nov 03 '19 at 20:07
  • "Since X_ij is either 0 or 1": why between 0 or 1? – Michael Nov 03 '19 at 20:07
  • Not between 0 or 1, exactly 0 or exactly 1. This is just a way of encoding a binary event (something that either occurs or does not occur) as a random variable. The whole point of this part of the analysis is just to show that if we know the probability with which any two elements of the list are compared, then we can determine the average number of comparisons performed. – jdonland Nov 03 '19 at 20:15
  • Yes, thanks a lot. I don't understand random variables and I need to read more about them! If I ask a question like this one on Stack Overflow, do you think I will be banned? – Michael Nov 03 '19 at 20:36
  • Try working through these course notes: https://ocw.mit.edu/courses/mathematics/18-440-probability-and-random-variables-spring-2014/lecture-notes/ – jdonland Nov 04 '19 at 00:26