3

Say I have an array:

values = np.array([1.1,2.2,3.3,4.4,2.1,8.4])

I want to round these values to members of an arbitrary array, say:

rounds = np.array([1.,3.5,5.1,6.7,9.2])

ideally returning an array of rounded numbers and an array of the residues:

rounded = np.array([1.,1.,3.5,5.1,1.,9.2])
residues = np.array([-0.1,-1.2,0.2,0.7,-1.1,0.8])

Is there a good pythonic way of doing this?

Alex Riley
Kieran Hunt
  • Can you explain what is the rounds array? – Or Duan Jan 08 '15 at 13:37
  • An array of numbers containing values that I want the elements of the first array to be rounded to. – Kieran Hunt Jan 08 '15 at 13:38
  • What I see here is a subtraction between the two arrays. What do you mean by rounding values from one array to another? – Bestasttung Jan 08 '15 at 13:38
  • For example `2.1` in `values` is closest to `1.` in `rounds`, therefore the corresponding element in `rounded` is `1.`. It is not a subtraction, the arrays have different lengths. – Kieran Hunt Jan 08 '15 at 13:40

5 Answers

6

One option is this:

>>> x = np.subtract.outer(values, rounds)
>>> y = np.argmin(abs(x), axis=1)

And then rounded and residues are, respectively:

>>> rounds[y]
array([ 1. ,  1. ,  3.5,  5.1,  1. ,  9.2])

>>> rounds[y] - values
array([-0.1, -1.2,  0.2,  0.7, -1.1,  0.8])

Essentially x is a 2D array of every value in values minus every value in rounds. y is a 1D array of the index of the minimum absolute value of each row of x. This y is then used to index rounds.

I should caveat this answer by noting that if len(values) * len(rounds) is big (e.g. starting to exceed 10e8), memory usage may become a concern. In that case, consider building up y iteratively so that a large block of memory never has to be allocated for x.
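One way to build y iteratively is to process values in blocks. The following is only a sketch: the helper name nearest_chunked and the chunk_size parameter are made up for illustration and are not part of NumPy.

```python
import numpy as np

def nearest_chunked(values, rounds, chunk_size=1024):
    """Index of the nearest element of `rounds` for each entry of `values`.

    Hypothetical helper: `values` is processed in blocks, so the temporary
    difference array never holds more than chunk_size * len(rounds) entries.
    """
    values = np.asarray(values, dtype=float)
    rounds = np.asarray(rounds, dtype=float)
    y = np.empty(len(values), dtype=np.intp)
    for start in range(0, len(values), chunk_size):
        block = values[start:start + chunk_size]
        # Same computation as in the answer, restricted to one block
        diff = np.subtract.outer(block, rounds)
        y[start:start + chunk_size] = np.argmin(np.abs(diff), axis=1)
    return y

values = np.array([1.1, 2.2, 3.3, 4.4, 2.1, 8.4])
rounds = np.array([1., 3.5, 5.1, 6.7, 9.2])

# chunk_size=4 is deliberately tiny here so that more than one block is used
y = nearest_chunked(values, rounds, chunk_size=4)
rounded = rounds[y]
residues = rounded - values
```

A larger chunk_size means fewer Python-level iterations but a bigger temporary array, so the right value depends on available memory.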

Alex Riley
2

As the items in the rounds array are sorted (or, if they are not, sort them first), we can do this in O(n log n) time using numpy.searchsorted:

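import numpy as np
from functools import partial

def closest(rounds, x):
   # rounds must be sorted; ind is the insertion point of x
   ind = np.searchsorted(rounds, x, side='right')
   length = len(rounds)
   if ind == 0:
      # x is below every element of rounds
      return rounds[ind]
   elif ind == length:
      # x is at or above the last element; rounds[ind] would be out of bounds
      return rounds[ind-1]
   else:
      left, right = rounds[ind-1], rounds[ind]
      val = min((left, right), key=lambda y:abs(x-y))
      return val

f = partial(closest, rounds)
rounded = np.apply_along_axis(f, 1, values[:,None])[:,0]
residues = rounded - values
print(repr(rounded))
print(repr(residues))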

Output:

array([ 1. ,  1. ,  3.5,  5.1,  1. ,  9.2])
array([-0.1, -1.2,  0.2,  0.7, -1.1,  0.8])
Ashwini Chaudhary
2

The same time complexity as the answer by Ashwini Chaudhary, but fully vectorized:

def round_to(rounds, values):
    # The main speed is in this line
    I = np.searchsorted(rounds, values)

    # Pad so that we can index easier
    rounds_p = np.pad(rounds, 1, mode='edge')

    # We have to decide between I and I+1
    rounded = np.vstack([rounds_p[I], rounds_p[I+1]])
    residues = rounded - values
    J = np.argmin(np.abs(residues), axis=0)

    K = np.arange(len(values))
    return rounded[J,K], residues[J,K]
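A related variant, not taken from the answer above: instead of comparing the two neighbours after the search, the midpoints between adjacent members of rounds can serve as bin edges, so a single searchsorted call yields the nearest index directly. The function name round_to_midpoints is hypothetical, and the sketch sorts a copy of rounds so that unsorted input is accepted too.

```python
import numpy as np

def round_to_midpoints(rounds, values):
    """Nearest-member rounding via bin edges (illustrative variant)."""
    rounds = np.sort(np.asarray(rounds, dtype=float))  # sorts a copy
    values = np.asarray(values, dtype=float)
    # Midpoints between neighbouring targets act as bin edges
    edges = (rounds[:-1] + rounds[1:]) / 2.0
    # idx runs from 0 to len(rounds) - 1, so no padding is needed
    idx = np.searchsorted(edges, values)
    rounded = rounds[idx]
    return rounded, rounded - values

values = np.array([1.1, 2.2, 3.3, 4.4, 2.1, 8.4])
rounds = np.array([9.2, 1., 5.1, 3.5, 6.7])  # deliberately unsorted

rounded, residues = round_to_midpoints(rounds, values)
```

With the default side='left', a value that falls exactly on a midpoint is assigned to the lower of its two neighbours.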
1

Find the number in rounds that is closest to x:

def findClosest(x,rounds):
    return rounds[np.argmin(np.absolute(rounds-x))]

Loop over all values:

rounded = [findClosest(x,rounds) for x in values]
residues = np.array(rounded) - values

This is a straightforward method, but you can be more efficient by exploiting the fact that your rounds array is ordered.

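def findClosest(x,rounds):
    # rounds must be sorted in ascending order
    for n in range(len(rounds)):
        # stop at the first element that is larger than x
        if x < rounds[n]:
            if n == 0:
                return rounds[n]
            elif rounds[n]-x > x-rounds[n-1]:
                return rounds[n-1]
            else:
                return rounds[n]

    # x is at least as large as every element of rounds
    return rounds[-1]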

This may or may not be faster than the argmin approach: you lose time in the Python for loop, but you don't have to scan the whole rounds array.
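The ordering can also be exploited without a linear scan by using binary search from the standard library's bisect module. This is a sketch under the assumption that rounds is sorted in ascending order; the name find_closest_bisect is made up for illustration.

```python
from bisect import bisect_left

def find_closest_bisect(x, rounds):
    """Closest member of the sorted sequence `rounds` to `x` (hypothetical helper)."""
    i = bisect_left(rounds, x)  # first index with rounds[i] >= x
    if i == 0:
        return rounds[0]
    if i == len(rounds):
        return rounds[-1]
    left, right = rounds[i - 1], rounds[i]
    # On an exact tie this sketch picks the left neighbour
    return left if x - left <= right - x else right

values = [1.1, 2.2, 3.3, 4.4, 2.1, 8.4]
rounds = [1., 3.5, 5.1, 6.7, 9.2]

rounded = [find_closest_bisect(x, rounds) for x in values]
residues = [r - v for r, v in zip(rounded, values)]
```

Each lookup takes O(log n) comparisons, although the outer loop over values is still plain Python.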

leeladam
0

The selected answer is already great. This one may seem convoluted to those who aren't used to more complex list comprehensions, but otherwise it's actually quite clear (IMO) once you're familiar with them.

(Interestingly enough, this happens to run faster than the selected answer. Why would the NumPy version be slower than this? Hmm...)

values = np.array([1.1,2.2,3.3,4.4,2.1,8.4])
rounds = np.array([1.,3.5,5.1,6.7,9.2])

rounded, residues = zip(*[
    [
        (rounds[cIndex]),
        (dists[cIndex])
    ]
    for v in values
    for dists in [[r-v for r in rounds]]
    for absDists in [[abs(d) for d in dists]]
    for cIndex in [absDists.index(min(absDists))]
])

print(np.array(rounded))
print(np.array(residues))
Eithos