Imagine you have an array of integers, but you aren't allowed to access any of the values directly (so no Arr[i] > Arr[i+1] or whatever). The only way to discern the integers from one another is a query() function: it takes a subset of the elements as input and returns the number of unique integers in that subset. The goal is to partition the integers into groups based on their values: integers in the same group should have the same value, while integers in different groups should have different values.
The catch: the code has to be O(n log(n)), or in other words, the query() function can only be called O(n log(n)) times.
I've spent hours optimizing different algorithms in Python, but all of them have been O(n^2). For reference, here's the code I started out with:

    import random

    n = 100
    querycalls = 0
    secretarray = [random.randint(0, n-1) for i in range(n)]

    def query(items):
        # Count every call, then report how many distinct values were passed in
        global querycalls
        querycalls += 1
        return len(set(items))

    groups = []

secretarray is a giant random list of numbers of length n. querycalls keeps track of how many times the function is called. groups is where the results go.
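To make the setup concrete, here is a small sketch of what query() reports and what a finished partition could look like. The list toy and the name expected_groups are made up for illustration, and writing the groups as lists of indices is an assumption about the output format, not a requirement of the problem.

```python
def query(items):
    """Return how many distinct values appear in items."""
    return len(set(items))

toy = [4, 7, 4, 9, 7]  # hypothetical stand-in for secretarray

same = query([toy[0], toy[2]])       # 1: both elements hold the same value
different = query([toy[0], toy[1]])  # 2: the two elements differ
overall = query(toy)                 # 3: three distinct values in total

# One valid partition, written as groups of indices (assumed format)
expected_groups = [[0, 2], [1, 4], [3]]
print(same, different, overall)
```

A query on two elements acts as an equality test, and a query on a larger subset tells you how many groups that subset spans.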
The first thing I did was try to create an algorithm based on merge sort (split the array down and then merge based on the query() value), but I could never get it below O(n^2).
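For comparison, here is a minimal sketch of the kind of quadratic baseline I mean. It is not the merge-based attempt itself, just the most direct approach under the setup above: each element is compared against one representative per existing group using a two-element query. The function name naive_partition is made up for this example, and groups hold indices rather than values.

```python
import random

n = 100
querycalls = 0
secretarray = [random.randint(0, n - 1) for _ in range(n)]

def query(items):
    global querycalls
    querycalls += 1
    return len(set(items))

def naive_partition(arr):
    """Group indices by value using only two-element query() calls."""
    groups = []  # each group is a list of indices that share one value
    for i in range(len(arr)):
        for group in groups:
            # A result of 1 means arr[i] equals this group's representative
            if query([arr[group[0]], arr[i]]) == 1:
                group.append(i)
                break
        else:
            groups.append([i])  # no existing group matched: start a new one
    return groups

groups = naive_partition(secretarray)
print(len(groups), "groups found with", querycalls, "query calls")
```

With k distinct values this makes up to about n*k calls, which is O(n^2) when most values are distinct, so it doesn't meet the O(n log(n)) limit.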