
I was recently asked the following interview question: "You have 8 GB of RAM and a 16 GB file. How do you search this file?"

I asked what type of file it was and which language to use. He said any format and any language, which only added to my confusion.

After a while he told me to assume it is a text file.

The answer I gave the interviewer (EDIT): use buffered streams with a custom buffer size, and sort the data in each buffer so that binary search can be applied to it (if that is relevant).

I believe the interviewer was not convinced.

I understand that this question is vague.

I want to know what I should have asked the interviewer, and what a probable solution is. Any guidance or advice is appreciated.

Thanks!

SimoV8
chebus

1 Answer


A good solution could be to search the first half of the file with a linear search; if that fails, do the same on the second half of the file.
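The same idea generalizes from two halves to many fixed-size chunks. Below is a minimal Python sketch, assuming the search is for a byte string and that a match may straddle a chunk boundary; the function name `find_in_file` and the 64 MB default chunk size are illustrative choices, not something from the question.

```python
def find_in_file(path, needle, chunk_size=64 * 1024 * 1024):
    """Return the byte offset of the first occurrence of needle, or -1.

    Only one chunk plus a small overlap is held in memory at a time.
    """
    if not needle:
        return 0
    overlap = len(needle) - 1  # bytes to carry over so boundary matches are not missed
    tail = b""                 # last bytes of the data seen so far
    offset = 0                 # file offset where the current chunk starts
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return -1      # reached end of file without a match
            window = tail + chunk
            pos = window.find(needle)
            if pos != -1:
                # window starts len(tail) bytes before the current chunk
                return offset - len(tail) + pos
            tail = window[-overlap:] if overlap else b""
            offset += len(chunk)
```

Memory use is bounded by `chunk_size` plus the needle length, so the 8 GB limit is never an issue, and the file is read sequentially exactly once.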

Your solution was wrong because it didn't take the memory limit into account: how do you buffer and sort all the content if you can't load it fully into memory? (You can do it, but you have to explain how.) The same applies to the binary search: you can't use the standard algorithm, you have to customize it.
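One common way to "explain how" is an external merge sort: sort pieces that fit in memory, write each sorted piece to disk, then merge the pieces lazily. The sketch below is only an illustration under assumptions that the interviewer did not state: the file is newline-delimited UTF-8 text, whole lines are the sort keys, and `external_sort` and `lines_per_run` are hypothetical names.

```python
import heapq
import os
import tempfile
from itertools import islice


def _key(line):
    # Compare lines without their trailing newline.
    return line.rstrip("\n")


def external_sort(src_path, dst_path, lines_per_run=1_000_000):
    """Sort the lines of src_path into dst_path using bounded memory."""
    run_paths = []
    with open(src_path, "r", encoding="utf-8") as src:
        while True:
            # Phase 1: read a run that fits in memory, sort it, spill it to disk.
            run = list(islice(src, lines_per_run))
            if not run:
                break
            if not run[-1].endswith("\n"):
                run[-1] += "\n"  # keep every line newline-terminated for the merge
            run.sort(key=_key)
            fd, run_path = tempfile.mkstemp(suffix=".run")
            with os.fdopen(fd, "w", encoding="utf-8") as out:
                out.writelines(run)
            run_paths.append(run_path)

    # Phase 2: k-way merge; heapq.merge pulls one line at a time from each run.
    run_files = [open(p, "r", encoding="utf-8") for p in run_paths]
    try:
        with open(dst_path, "w", encoding="utf-8") as dst:
            dst.writelines(heapq.merge(*run_files, key=_key))
    finally:
        for f in run_files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

This reads and writes the whole file twice, which is why sorting only pays off if many searches will follow; for a single search, one sequential scan is cheaper. With very many runs the merge may also hit the operating system's open-file limit, in which case the runs would have to be merged in several passes.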

Moreover, if you're searching for a substring in a text file, your solution is inappropriate (what would you sort?).

SimoV8
  • The intention of using buffered streams with custom sizes is to buffer the data from disk and then search that buffer, right? That way we respect the memory constraint and avoid hitting the disk repeatedly? – chebus Aug 16 '15 at 18:04
  • You can't buffer a 16 GB file into 8 GB of RAM without using the disk, that's the problem! – SimoV8 Aug 16 '15 at 18:17
  • That's strange! When I mentioned streams, he said only half the data can be buffered, that this leads to poor performance, and he was expecting another answer. Anyhow, thanks for your input! – chebus Aug 16 '15 at 18:30
  • Yes, you can buffer half the data, not all of it. But this means that if you want to sort the data (supposing that makes sense) you have to read the file many times (how many depends on the specific sorting algorithm you're using). This is of course more expensive than scanning the file sequentially once. – SimoV8 Aug 16 '15 at 18:39
  • That was exactly my point: when I talk about a buffer, we give it a custom size, use it, and then clear it. There are APIs that let you do binary search on bytes (if that is at all relevant). – chebus Aug 17 '15 at 06:01
  • If that's what you meant then it's right, but you didn't write it in your question. You wrote 'Use buffered streams and sort the data on the buffer', which was too generic and suggested that you wanted to buffer the whole file. – SimoV8 Aug 17 '15 at 06:37