Given a large Whoosh index, how can I efficiently retrieve n random documents from it?
I can do this naively by pulling every document into memory and using random.sample
...
random.sample(list(some_index.searcher().documents()), n)
but that wastes both memory and disk I/O if the index contains a large number of documents.
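To make the goal concrete, one direction that avoids materialising every document is to sample document numbers instead of documents, and only load stored fields for the numbers that were picked. The following is a minimal sketch, not a confirmed solution: it assumes the reader object exposes doc_count_all(), is_deleted(docnum) and stored_fields(docnum), which is what I understand Whoosh's IndexReader to provide (worth checking against the installed version). The helper name sample_documents is hypothetical.

```python
import random


def sample_documents(reader, n, rng=random):
    """Return stored fields for up to n distinct, non-deleted documents.

    Assumption: `reader` provides doc_count_all(), is_deleted(docnum) and
    stored_fields(docnum). Verify these against your Whoosh version.
    """
    # Doc numbers run from 0 to total - 1 and may include deleted documents.
    total = reader.doc_count_all()
    picked = set()  # doc numbers accepted into the sample
    tried = set()   # doc numbers already examined, accepted or not

    # Rejection sampling: draw doc numbers uniformly, skip repeats and
    # deleted documents, stop once n are picked or every number was tried.
    while len(picked) < n and len(tried) < total:
        docnum = rng.randrange(total)
        if docnum in tried:
            continue
        tried.add(docnum)
        if not reader.is_deleted(docnum):
            picked.add(docnum)

    # Only the sampled documents are read from the index.
    return [reader.stored_fields(docnum) for docnum in picked]
```

With a real index this would presumably be called as sample_documents(some_index.searcher().reader(), n). The loop is only cheap when n is much smaller than the number of documents; for n close to the index size, the random.sample approach above is probably no worse.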