3

What is a concise and readable way of interpolating a 1D array such that the maximum difference between elements is minimized?

For instance, if I had the array [4 9 13 25] and I was allowed to add 1 more number in order to minimize the maximum difference between elements I would insert a 19 between 13 and 25 (max difference is now 6 rather than 12).

Of course a good ole' for loop will get it done, but for posterity is there a less verbose approach than below?

# current array
nums = np.array([4.0, 9.0, 13.0, 25.0])
# size of new array
N=10

# recursively find max gap (difference) and fill it with a mid point    
for k in range(N-len(nums)):
    inds = range(len(nums))
    # get the maximum difference between two elements
    max_gap = np.argmax(np.diff(nums))
    # put a new number that's equidistant from the two element values
    new_num = np.interp(np.mean([inds[max_gap],inds[max_gap+1]]), inds, nums)
    nums = np.insert(nums, max_gap+1, new_num)
    print nums

This example interpolates the 1D array, filling regions where the greatest difference was:

[  4.   9.  13.  19.  25.]
[  4.   9.  13.  16.  19.  25.]
[  4.   9.  13.  16.  19.  22.  25.]
[  4.    6.5   9.   13.   16.   19.   22.   25. ]
[  4.    6.5   9.   11.   13.   16.   19.   22.   25. ]
[  4.    6.5   9.   11.   13.   14.5  16.   19.   22.   25. ]

edit 1: As comments suggest, there is a trade-off between readability, efficiency, and accuracy. Of these three attributes, for me the most important is readability. I still give +1 for any and all improvements to my above algorithm though since it is a general problem and any answer that improves either of those three attributes is beneficial to someone if not me later on.

ecoe
  • 4,994
  • 7
  • 54
  • 72
  • 1
    Do you need any solution or best? consider following example [0, 3, 12]. Your method adds (3+12)/2 = 7, resulting [0,3,7,12], then 5 and 9 [0,3,5,7,9,12] to achieve max gap <=3, making total of 3 new points. But the same result can be achieved adding two points, 6 and 9, right from the start: [0,3,6,9,12] – alko Oct 26 '13 at 22:06
  • thanks, and that's a good point, but simpler code is what I'm mainly after. – ecoe Oct 26 '13 at 23:01
  • 1
    Your current implementations does not necessarily minimize maximum difference between elements. For example, we have `[2,10]` and we want to add two elements: `-> [2,6,10] -> [2,4,6,10]` and the max difference is 4. However, if we could add two elements at once we would have `[2, 4.66, 7.32, 10]`. Is adding two elements at once not allowed? – Akavall Oct 26 '13 at 23:28
  • thanks and yes I totally with you and alko. There are no rules, just looking for the simplest way of filling the largest gaps. I updated the question because of your comment, thanks. – ecoe Oct 27 '13 at 17:14

3 Answers3

2

And, if you want efficiency for long arrays, even if the code is not as short, I suggest:

nums = np.array([4., 9., 13., 25])
diffs = np.diff(nums)
N = 10

# Number of interpolation points proportional to length of gaps
new_points = diffs/sum(diffs) * (N-len(nums))
while sum(np.floor(new_points)) != N -len(nums): # from continuum to discrete 
    pos = np.argmax(new_points - np.floor(new_points))
    new_points[pos] = np.floor(new_points[pos] + 1)
new_points = np.floor(new_points)

# Now we interpolate by inserting linspace values starting from the end to 
# avoid the loop limits being spoiled when introducing values. 
for ii in range(len(new_points))[::-1]:
    #linspace includes borders
    introduce_these = np.linspace(nums[ii], nums[ii+1], new_points[ii] + 2)[1:-1]
    nums = np.insert(nums, ii+1, introduce_these)

That produces:

In [205]: print nums
[4.   5.66666667   7.33333333   9.   11.   13.   16.   19.   22.   25. ]
Jblasco
  • 3,827
  • 22
  • 25
1

It is possible that you can update multiple values at once. Consider the following:

for k in xrange(N/count):
    max_gaps = np.argsort(np.diff(nums))
    inds = np.sort(max_gaps[-count:])
    new_num = (nums[inds]+nums[inds+1])/2

    nums = np.insert(nums, inds+1, new_num)
    print nums

Count will be the number of gaps simultaneously filled.

First example when count=1 and N=6:

[  4.   9.  13.  19.  25.]
[  4.   9.  13.  19.  22.  25.]
[  4.   9.  13.  16.  19.  22.  25.]
[  4.    6.5   9.   13.   16.   19.   22.   25. ]
[  4.    6.5   9.   11.   13.   16.   19.   22.   25. ]
[  4.    6.5   9.   11.   13.   16.   19.   22.   23.5  25. ]

This example simplifies the linear interpolation and code, but will take the last largest equivalent gap not the first.

Second example when count=2 and N=6:

[  4.    6.5   9.   13.   19.   25. ]
[  4.    6.5   9.   13.   16.   19.   22.   25. ]
[  4.    6.5   9.   11.   13.   16.   19.   22.   23.5  25. ]

Increasing count will change your results, but will help vectorize the code.

Daniel
  • 19,179
  • 7
  • 60
  • 74
  • shorter and more efficient, thanks. ill accept it if nothing else comes along soon. – ecoe Oct 26 '13 at 23:04
  • 1
    Out of curiosity what size arrays will you be applying this to? It is possible to make this much more efficient if `N` is large (`N>50` or so) although the code will be significantly more complex. – Daniel Oct 26 '13 at 23:15
  • N is generally less than 50, but I'm sure someone will appreciate the sol'n for N>50, thanks. – ecoe Oct 27 '13 at 17:10
1

If the array is small and readability is more important than efficiency I suggest:

  nums = [4., 9., 13., 25]
  N = 10
  while len(nums) < N:
      pos = np.argmax(np.diff(nums))   # where maximum difference is
      nums.insert(pos+1, (nums[pos+1] + nums[pos]) / 2.)  #introduce value

Of couse, this suffers from the problems already mentioned that probably this is not the most efficient way of interpolating with the minimum differences between the points at the end of the run.

Jblasco
  • 3,827
  • 22
  • 25
  • nearly a 1-liner, quite efficient and just as accurate as my initial approach (and avoids interp altogether). Can you (or can I?) move this answer as an addendum to your first answer and then I'll mark that as the solution? That way its a solution to both readability and accuracy. – ecoe Oct 27 '13 at 17:39
  • I would like to know the opinion of other, more experienced, users. I honestly see no need to join the answers. There is just three answers, with different approaches and limitations. I would say to chose this one if you think is the best for YOUR problem. The other one is just very different approach, and that is why I wrote both separatedly in the first place. – Jblasco Oct 27 '13 at 19:29