
Given an array A and m queries, where each query is an integer T.

For each query, find indices i and j such that

| (|sum of elements from i to j| - T) |

is minimum,

where |x| is abs(x), and the array can contain negative numbers as well.

I was asked this question in a Directi interview. My solution was to compute all possible subarray sums, store them with their indices, and sort them.

There are O(n * n) possible sums,

so that would take O(n * n * log(n * n)).

Then, for each query, binary search for T. That would be O(m * log(n * n)).
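
The approach described above can be sketched roughly as follows. This is only an illustration of the brute-force idea, not an optimized solution; the function names `build_sorted_sums` and `closest_subarray` are made up for this example.

```python
from bisect import bisect_left
from itertools import accumulate

def build_sorted_sums(A):
    """Return every |subarray sum| with its (i, j) bounds, sorted by value."""
    prefix = [0] + list(accumulate(A))  # prefix[k] = A[0] + ... + A[k-1]
    sums = []
    for i in range(len(A)):
        for j in range(i, len(A)):
            sums.append((abs(prefix[j + 1] - prefix[i]), i, j))
    sums.sort()                         # O(n*n * log(n*n))
    return sums

def closest_subarray(sums, T):
    """Binary-search the sorted sums for the entry whose value is nearest T."""
    pos = bisect_left(sums, (T,))       # first entry with value >= T
    candidates = sums[max(pos - 1, 0):pos + 1]
    return min(candidates, key=lambda s: abs(s[0] - T))

sums = build_sorted_sums([2, -4, 6, -3, 9])
print(closest_subarray(sums, 9))        # a (value, i, j) tuple
```

Each query costs O(log n) after the O(n * n * log n) preprocessing, which is the part the interviewer wanted reduced.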

But he asked me to optimize it. I didn't clear the round.

Can anyone give a hint for this?

  • Can T be negative? – Jim Mischel Jan 11 '19 at 15:15
  • How big can the array be? Maybe you can improve average-case performance by storing the sums in a map to avoid duplicates. This reduces the number of sums a bit. Convert every sum to positive, as we are going to take the absolute value anyway. I thought of segment trees, but that doesn't help unless we find a way to break down T. – nice_dev Jan 11 '19 at 15:19
  • I asked for constraints, but the interviewer asked me to first describe the approach. He said to reduce it below n*n. – Jitendra Kumhar Jan 11 '19 at 16:18
  • As I was not getting any further with my approach, I didn't think of asking for the range of T. Can you assume it is positive for now? Sorry for that. – Jitendra Kumhar Jan 11 '19 at 16:19
  • @JimMischel for negative T, we're tasked with minimizing the absolute value of a subarray sum, which has an O(n log n) solution. – גלעד ברקן Jan 12 '19 at 19:02
  • Note that O(log(n * n)) = O(log(n)), so you can simplify some of your expressions. – ruakh Jan 18 '19 at 16:25

2 Answers


If we sort the partial sums, for example,

A  = [2, -4,  6, -3,  9]
ps = [2, -2,  4,  1, 10]

sorted = [-2, 1, 2, 4, 10]

the smallest difference between two adjacent sorted partial sums gives the minimum absolute value of a subarray sum; in this case, 1 and 2, representing a sum of:

-4 + 6 - 3 = -1
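
As a minimal sketch of this step (an illustration, not code from the answer): sort the partial sums together with their positions and scan adjacent pairs. One assumption beyond the example above is that an empty prefix 0 is prepended, so that subarrays starting at index 0 are also covered; the name `min_abs_subarray_sum` is hypothetical.

```python
from itertools import accumulate

def min_abs_subarray_sum(A):
    """Smallest |sum(A[i..j])| over non-empty subarrays, in O(n log n)."""
    # Assumption: include the empty prefix 0 so subarrays starting at 0 count.
    prefix = [0] + list(accumulate(A))
    # Pair each partial sum with its position, then sort by value.
    order = sorted((p, k) for k, p in enumerate(prefix))
    best = None
    for (p1, k1), (p2, k2) in zip(order, order[1:]):
        lo, hi = sorted((k1, k2))
        cand = (p2 - p1, lo, hi - 1)    # |sum| of elements A[lo..hi-1]
        if best is None or cand[0] < best[0]:
            best = cand
    return best                         # (min |sum|, i, j), or None if A is empty

print(min_abs_subarray_sum([2, -4, 6, -3, 9]))
```

Several subarrays can tie for the minimum, so the indices returned may differ from the pair picked out in the example while giving the same absolute value of 1.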

Since we'd like to minimise yet another absolute value, we want to find the pair of partial sums whose absolute difference is closest to T. I could not find a reference for finding a pair with closest difference to a constant in less than O(n) time per query, so as is, this approach does not seem better than O(n * log n + n * m). Perhaps we could take advantage of hashing, or of sorting the queries first, since queries that are close to each other correspond to nearby ranges during the search, but I'm not sure how.
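
One way to realise the O(n) search per query is a two-pointer walk over the sorted partial sums. The sketch below is an assumption about how that search could look, not a referenced algorithm; `preprocess` and `query` are illustrative names, and the leading 0 prefix is added so that subarrays starting at index 0 are included.

```python
from itertools import accumulate

def preprocess(A):
    """Sort the partial sums once: O(n log n)."""
    prefix = [0] + list(accumulate(A))  # leading 0 = empty prefix (assumption)
    return sorted((p, k) for k, p in enumerate(prefix))

def query(order, T):
    """Return (| |sum| - T |, i, j) for a best subarray A[i..j], in O(n)."""
    best = None
    lo, hi = 0, 1
    while hi < len(order):
        (p1, k1), (p2, k2) = order[lo], order[hi]
        d = p2 - p1                     # |sum| of the subarray between k1 and k2
        i, j = min(k1, k2), max(k1, k2) - 1
        if best is None or abs(d - T) < best[0]:
            best = (abs(d - T), i, j)
        if d < T:
            hi += 1                     # need a larger difference
        else:
            lo += 1                     # need a smaller difference
            if lo == hi:
                hi += 1
    return best

order = preprocess([2, -4, 6, -3, 9])
for T in (5, 9, 100):
    print(T, query(order, T))
```

With m queries this gives the O(n * log n + n * m) total mentioned above.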

גלעד ברקן
  • OK, thanks. Also, will the complexity be O(n * log n + m * log n), where m is the number of queries? – Jitendra Kumhar Jan 11 '19 at 17:16
  • What if T = 9 or greater than 9 in this case? – Jitendra Kumhar Jan 11 '19 at 17:28
  • @JitendraKumhar you're right. Sorting the differences of the partial sums is not the right approach since there are O(n^2) of them. Rather we need a search in those differences. I'll edit. – גלעד ברקן Jan 11 '19 at 17:32
  • I am accepting the answer because I think this is the best that can be achieved. Also, I think the interviewer expected only this. – Jitendra Kumhar Jan 16 '19 at 12:06
  • @JitendraKumhar sounds good. Thanks for letting me know. Also, I was thinking of mentioning, although I generally don't like this kind of solution, that in the case the range is limited enough, I think we could also record all the prefix sum possibilities more efficiently with an FFT, and have them available for order and search that way. – גלעד ברקן Jan 16 '19 at 12:25

EDIT: I suppose that computing all the sums is actually a tremendous amount of wasted work. It is worthwhile only if m >> n. Otherwise, here is my solution.

Imagine a race between the Hare and the Tortoise. I hope you know this story... The Hare "i" lets the Tortoise "j" go first. He knows he is faster and that he can take a nap. He worries only when the Tortoise is out of sight, "T" meters ahead; then he runs very fast until he sees the Tortoise, and sleeps again... And so on.

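Initialization

# valid only when every A[k] >= 0 and T >= 0
n = len(A)
i = 0                     # start of the window (the Hare)
j = 0                     # one past the end of the window (the Tortoise)
bestval = float("inf")
index = None
diff = T                  # T minus the sum of A[i:j]; the window starts empty

Main loop

while True:
    if diff < 0 and i < j:
        # window sum exceeds T: drop A[i] from the left
        diff += A[i]
        i += 1
    elif j == n:
        break
    else:
        # window sum does not exceed T: take A[j] on the right
        diff -= A[j]
        j += 1

    # record best distance (non-empty windows only)
    if i < j and abs(diff) < bestval:
        bestval = abs(diff)
        index = (i, j - 1)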

You cannot miss the optimum because you never extend the search in directions that increase abs(diff). It is pointless to go on summing numbers if you already have too much...

So you only make two passes over A, one with j and one with i, once for every T. This should be O(mn). You can even break out of the loop if diff = 0.
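
Wrapped as a function, the scan might look like the sketch below. It is only a sketch under the assumption that every element of A is non-negative and T >= 0; the names `closest_window` and `answer_queries` are hypothetical.

```python
def closest_window(A, T):
    """Two-pointer scan; assumes every element of A is >= 0 and T >= 0."""
    n = len(A)
    i = j = 0                 # current window is A[i:j], initially empty
    diff = T                  # T minus the window sum
    best = None               # (distance, i, j) with j inclusive
    while True:
        if diff < 0 and i < j:
            diff += A[i]      # window sum too large: drop the left element
            i += 1
        elif j == n:
            break
        else:
            diff -= A[j]      # window sum not above T: take the next element
            j += 1
        if i < j and (best is None or abs(diff) < best[0]):
            best = (abs(diff), i, j - 1)
            if diff == 0:
                break         # exact match, nothing can beat it
    return best

def answer_queries(A, queries):
    """One scan per query: O(m * n) overall."""
    return [closest_window(A, T) for T in queries]

print(answer_queries([1, 4, 2, 7, 3], [9, 6]))
```

The non-negativity assumption matters: for A = [10, -9] and T = 1 this scan reports a distance of 9, while the subarray [10, -9] sums to exactly 1.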

Vince
  • I don't see how we can guarantee the best difference would be found without retreating with `i` (making your idea O(n^2) per query). Can you please explain why `i` only needs to be increased and never decreased as we iterate over `j`? – גלעד ברקן Jan 11 '19 at 19:28
  • This doesn't work if there are negative numbers in the array. – Matt Timmermans Jan 11 '19 at 19:58
  • Oh you are right, I missed that A could be negative. This solution cannot detect a far negative number that would optimally balance the sum. – Vince Jan 12 '19 at 19:35