I just came across this problem: http://uva.onlinejudge.org/index.php?option=com_onlinejudge&Itemid=8&category=24&page=show_problem&problem=3155 The problem asks for the average number of swaps performed by a Bubble Sort algorithm when the input is a random permutation of the first n natural numbers (1 to n in shuffled order).

So, I thought that:

  1. Max no of swaps possible=n(n-1)/2. (When they are in descending order.)
  2. Min no of swaps possible=0. (When they are in ascending order.)

So, the mode of this distribution is (0 + n(n-1)/2)/2 = n(n-1)/4. But this turned out to be the answer. I don't understand why the mode coincided with the mean.

Manohar

2 Answers


Since every ordering of the input is equally likely, the distribution of the number of swaps is symmetrical.

For a symmetrical distribution with a single peak, the mean, median and mode coincide, which is why the mean and mode are the same.
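The symmetry can be checked by brute force for small n. The following is a minimal Python sketch, not part of the original answer; the helper names `bubble_sort_swaps` and `swap_histogram` are made up for illustration. It runs a plain bubble sort on every permutation of 1..n and tallies how many permutations need each number of swaps.

```python
from collections import Counter
from itertools import permutations


def bubble_sort_swaps(values):
    """Return the number of adjacent swaps bubble sort performs on values."""
    a = list(values)
    swaps = 0
    # After each outer pass the largest remaining element sits at index `end`.
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
    return swaps


def swap_histogram(n):
    """Map each swap count to the number of permutations of 1..n needing it."""
    return Counter(bubble_sort_swaps(p) for p in permutations(range(1, n + 1)))


hist = swap_histogram(4)
max_swaps = 4 * 3 // 2
counts = [hist[k] for k in range(max_swaps + 1)]
print(counts)  # [1, 3, 5, 6, 5, 3, 1]
```

The list reads the same forwards and backwards, with its peak at 3 = 4*3/4 swaps. One way to see why this should hold in general (assuming, as the other answer argues, that the number of swaps equals the number of inversions): reversing a permutation turns every inverted pair into a non-inverted pair and vice versa, so a permutation needing k swaps is paired with one needing n(n-1)/2 - k swaps.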

WinningAddicted
  • That makes sense, but the number of swaps does not vary linearly. – Manohar Jul 26 '17 at 09:26
  • Thanks. For permutations of {1,2,3,4}: 0 swaps - 1 time, 1 swap - 3 times, 2 swaps - 5 times, 3 swaps - 6 times, 4 swaps - 5 times, 5 swaps - 3 times, 6 swaps - 1 time. Got it. The data is symmetric around the peak (3 swaps). The mean, median and mode coincide. – Manohar Jul 26 '17 at 09:33
  • I think it does not have to be linear. If it is symmetrical, it does the job. Think of it as a bell curve. The probability of 0 swaps needed and of the maximum number of swaps needed are the same. However, we expect the number of swaps to be n*(n-1)/4 on average, which is where the distribution graph peaks. – WinningAddicted Jul 26 '17 at 09:40
  • This is a very nice solution, but I don't think it's obvious how the symmetry in permutations transforms into a symmetry in number of swaps. Not all arbitrary functions of permutations get a symmetrical distribution when we consider all permutations. Thus, could you please elaborate a bit? – danbanica Jul 26 '17 at 11:30
  • @qwertyman but here the permutations are symmetric, so we can assume mean = mode. If you are interested, I suggest you verify it yourself for n=3,4. Of course, a proof would be nice to see, but the observation makes the symmetry obvious. – Manohar Jul 26 '17 at 11:38
  • @Manohar I know that the property is true (spent a little time thinking of it, but I don't think that the fact that "the set of permutations is symmetric" is sufficient for proving the respective property on the number of swaps). – danbanica Jul 26 '17 at 11:40
  • @qwertyman it is sufficient. When the data is symmetric, the mean, median and mode coincide. Read it here: http://www.investopedia.com/terms/s/symmetrical-distribution.asp It would be interesting if someone could prove that the frequency of the number of swaps will always be symmetric. Edit: Actually I can think of an easy proof. – Manohar Jul 26 '17 at 16:24
  • @Manohar exactly, the symmetry in the number of swaps is the property that I am interested in. I found a proof, therefore I am sure that this answer is correct. However, my proof uses the observation that the number of swaps = number of inversions, and I was curious whether there is a simpler proof that I'm missing. – danbanica Jul 26 '17 at 18:44

Every swap reduces the number of inversions in the array by exactly 1.

The sorted array has no inversions, thus the number of swaps is equal to the number of inversions in the initial array. Thus we need to compute the average number of inversions in a shuffled array.

Any given pair of indices i, j, with i < j, is an inversion in exactly half of the shuffled arrays. There are n * (n-1) / 2 such pairs, thus, by linearity of expectation, we have n * (n-1) / 4 inversions on average.
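Both claims (swaps = initial inversions, and an average of n * (n-1) / 4) can be verified exhaustively for small n. This is only a sketch under my own naming; `count_inversions`, `bubble_sort_swaps` and `average_swaps` are illustrative helpers, not anything from the problem statement. It uses `Fraction` so the average is compared exactly rather than in floating point.

```python
from fractions import Fraction
from itertools import combinations, permutations


def count_inversions(a):
    """Count pairs of positions i < j with a[i] > a[j]."""
    return sum(1 for i, j in combinations(range(len(a)), 2) if a[i] > a[j])


def bubble_sort_swaps(values):
    """Return the number of adjacent swaps bubble sort performs on values."""
    a = list(values)
    swaps = 0
    for end in range(len(a) - 1, 0, -1):
        for i in range(end):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swaps += 1
    return swaps


def average_swaps(n):
    """Exact average number of swaps over all permutations of 1..n."""
    total = 0
    count = 0
    for p in permutations(range(1, n + 1)):
        swaps = bubble_sort_swaps(p)
        # Each swap removes exactly one inversion, so the totals must match.
        assert swaps == count_inversions(p)
        total += swaps
        count += 1
    return Fraction(total, count)


for n in range(1, 7):
    assert average_swaps(n) == Fraction(n * (n - 1), 4)
```

The loop stops at n = 6 (720 permutations) because the enumeration grows as n!; for the actual problem the closed form n * (n-1) / 4 is what you would print.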

danbanica
  • But bubble sort does not resolve every inversion in exactly one swap. So, is the average number of swaps required not equal to the average number of inversions? I might be wrong here. – Manohar Jul 26 '17 at 09:38
  • No, but that is not relevant. What is relevant is that the number of initial inversions is equal to the number of swaps. Don't look for a one-to-one mapping between the initial inversions and swaps, because as we perform swaps, the inversions will start occurring between other indices (e.g. if we had an inversion between indices 0 and 5, and we swap the elements at positions 4 and 5, we will afterwards have an inversion between indices 0 and 4). But as I said, all that is relevant is that each swap decreases the number of inversions by one, thus we have num. of swaps = num. of initial inversions. – danbanica Jul 26 '17 at 10:22
  • Thanks. Well Explained. – Manohar Jul 26 '17 at 11:36