7

If one computer can hold only 1 million numbers, how can the median of 100 million numbers be found?

starblue
Stephen Hsu

4 Answers

3

Do an external sort and then scan once for the median.

Hopefully, the real problem was "how do I do an external sort"? (If this is homework...I want to help in the right way. :-)
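As a rough illustration of the external-sort idea, here is a minimal Python sketch. It assumes the numbers are integers, that the total count is known in advance, and that temporary text files stand in for the external storage. The names `write_sorted_runs`, `external_median`, and `chunk_size` are made up for this example; `chunk_size` plays the role of the 1 million limit.

```python
import heapq
import os
import tempfile
from contextlib import ExitStack
from itertools import islice


def write_sorted_runs(numbers, chunk_size):
    """Sort at most chunk_size numbers at a time; write each sorted run to a temp file."""
    paths = []
    it = iter(numbers)
    while True:
        chunk = sorted(islice(it, chunk_size))  # only one chunk is in memory
        if not chunk:
            break
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{x}\n" for x in chunk)
        paths.append(path)
    return paths


def external_median(numbers, total, chunk_size):
    """Median of `total` integers, holding about chunk_size of them in memory at once."""
    paths = write_sorted_runs(numbers, chunk_size)
    try:
        with ExitStack() as stack:
            files = [stack.enter_context(open(p)) for p in paths]
            # heapq.merge keeps only one pending value per run in memory.
            merged = heapq.merge(*(map(int, f) for f in files))
            lo, hi = (total - 1) // 2, total // 2  # same index when total is odd
            middle = list(islice(merged, lo, hi + 1))
    finally:
        for p in paths:
            os.remove(p)
    # For an even total this averages the two middle values.
    return (middle[0] + middle[-1]) / 2
```

For example, `external_median([5, 3, 8, 1, 9, 2, 7, 4, 6, 10], 10, 3)` sorts four small runs on disk and merges them until it reaches the two middle values. With 100 million numbers and a 1 million limit this would produce 100 runs.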

DigitalRoss
  • This is what I thought. :) But I'm not sure it's the correct answer, so I asked here. – Stephen Hsu Sep 25 '09 at 02:11
  • 1
    There has got to be a way to do this with the literal constraint that the device can only store 1 million numbers. Using the external sort seems like cheating. Now I'm gonna be up all night thinking about this. – JohnFx Sep 25 '09 at 02:28
  • Heh, I wondered that myself. It's a really good question. – DigitalRoss Sep 25 '09 at 17:47
3

Reduce the problem to a more difficult one: sort the 100 million numbers using merge sort. Then take the 50 millionth element.

Pascal Cuoq
  • But the computer can hold 1 million numbers only, how can I find the 50 millionth one? – Stephen Hsu Sep 25 '09 at 02:12
  • 1
    On the tape (oh, right, this is no longer the 80s. I meant "on the disk"), at the 50 millionth position. You have storage for your 100M elements, right? Because if you don't (the elements are read from a stream), the exercise can be proved impossible by a counting argument. – Pascal Cuoq Sep 25 '09 at 02:15
  • 1
    Taking the 50 millionth element is not correct for 100 million numbers: 100 million is even, so one has to take the mean of the 50 millionth and (50 millionth + 1) elements. – Timofey Feb 06 '12 at 11:03
1

Use 101 computers and a sort-merge, just like a database.

tster
0

Find the middle million numbers and then report the median of them. (Hmmm, now how to find those middle million numbers...)
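One possible way to find those middle numbers without sorting everything is to count values per bucket over repeated passes, narrowing the value range until the candidates fit in memory. This is only a sketch under assumptions not stated in the question: the numbers are integers in a known range `[lo, hi]`, and the data can be re-read from storage as often as needed. `read_numbers`, `kth_smallest`, and `memory_limit` are hypothetical names, and the counters themselves are what occupies the memory budget during a pass.

```python
def kth_smallest(read_numbers, k, lo, hi, memory_limit):
    """Return the k-th smallest (0-based) integer among values in [lo, hi].

    read_numbers() must return a fresh iterator over the data on every call.
    At most memory_limit counters or candidate values are held at a time.
    """
    while True:
        span = hi - lo + 1
        buckets = min(span, memory_limit)
        width = -(-span // buckets)  # ceiling division
        counts = [0] * buckets
        for x in read_numbers():  # one full pass over the data
            if lo <= x <= hi:
                counts[(x - lo) // width] += 1
        # Walk the histogram to find the bucket that contains rank k.
        for i, c in enumerate(counts):
            if k < c:
                break
            k -= c
        else:
            raise IndexError("k is out of range")
        lo = lo + i * width
        hi = min(hi, lo + width - 1)
        if lo == hi:  # the bucket holds a single distinct value
            return lo
        if c <= memory_limit:  # the candidates now fit in memory
            del counts
            candidates = sorted(x for x in read_numbers() if lo <= x <= hi)
            return candidates[k]


def median(read_numbers, total, lo, hi, memory_limit):
    """Median of `total` integers; averages the two middle ranks when total is even."""
    a = kth_smallest(read_numbers, (total - 1) // 2, lo, hi, memory_limit)
    b = kth_smallest(read_numbers, total // 2, lo, hi, memory_limit)
    return (a + b) / 2
```

With `data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 10]`, calling `median(lambda: iter(data), 10, 1, 10, 4)` needs one counting pass and one candidate pass per rank. Each pass shrinks the value range by a factor of up to `memory_limit`, so for 32-bit integers and a limit of 1 million, two counting passes are enough to isolate a single value.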

PaulMcG