which numbers in list 2 are bigger and smaller than each number in list 1

Question

I am using Python. I have two lists: list 1 holds 7,000 integers and list 2 holds 25,000. For each number in list 1, I want to find the closest number in list 2 that is bigger and the closest number in list 2 that is smaller, and then calculate the difference between those two list 2 numbers. So far I have:

for i in list1:
    for j in list 2:
        if list2[j]<list1[i]:
            a = max(list2)
        elif list2[j]>list1[i]:
            b = min(list2)
            interval = b-a

This doesn't seem to work. For a specific number in list 1, I want to find the numbers in list 2 that are less than it and take their maximum, and then find the smallest number in list 2 that is bigger than it. Does anyone have any ideas? Thanks

For starters, `list2[j] < list1[i]` should be simply `j < i`. — Burhan Khalid, May 28 '13 at 12:09
since you have `if..` and `elif...`, each time you iterate through list2 you will only have one of `a` or `b` defined, not both — , May 28 '13 at 12:10
If your lists have duplicates, better to convert them to a set first with `set(list1)` — Burhan Khalid, May 29 '13 at 03:59

Ashwini Chaudhary · Answer 1 · 2013-05-28T13:25:00.117

You can use the bisect module; each lookup is a binary search, so the worst-case complexity is O(N * logN):

import bisect

lis1 = [4, 20, 26, 27, 30, 53, 57, 76, 89, 101]
lis2 = [17, 21, 40, 49, 53, 53, 53, 53, 70, 80, 81, 95, 99]  # this must be sorted
# use lis2.sort() in case lis2 is not sorted
for x in lis1:
    # returns the index where x can be placed in lis2, keeping lis2 sorted
    ind = bisect.bisect(lis2, x)
    if not (x >= lis2[-1] or x <= lis2[0]):
        sm, bi = lis2[ind - 1], lis2[ind]

        if sm == x:
            # Handle the case where an item from lis1 is repeated
            # multiple times in lis2, e.g. 53 in this case.
            ind -= 1
            while lis2[ind] == x:
                ind -= 1
            sm = lis2[ind]

        print("{} <= {} <= {}".format(sm, x, bi))

output:

17 <= 20 <= 21
21 <= 26 <= 40
21 <= 27 <= 40
21 <= 30 <= 40
49 <= 53 <= 70
53 <= 57 <= 70
70 <= 76 <= 80
81 <= 89 <= 95

This will not output anything for 4 and 101, because 4 is smaller than every element in lis2 and 101 is greater than every element in lis2. That can be fixed if required.
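One possible way to cover the boundary cases is sketched below. It is not the answerer's code: the helper name `neighbours` is hypothetical, and it assumes that returning `None` for a missing neighbour is acceptable. It uses `bisect_left` and `bisect_right` so that duplicates of `x` in the sorted list are skipped without a loop.

```python
import bisect

def neighbours(sorted_values, x):
    """Return (largest value < x, smallest value > x); None where none exists."""
    lo = bisect.bisect_left(sorted_values, x)   # number of elements strictly less than x
    hi = bisect.bisect_right(sorted_values, x)  # index of the first element strictly greater than x
    smaller = sorted_values[lo - 1] if lo > 0 else None
    bigger = sorted_values[hi] if hi < len(sorted_values) else None
    return smaller, bigger

lis2 = [17, 21, 40, 49, 53, 53, 53, 53, 70, 80, 81, 95, 99]  # must be sorted
for x in [4, 53, 101]:
    print(x, neighbours(lis2, x))
```

With this shape the caller decides what a missing neighbour means, for example skipping the item or reporting an open-ended interval.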

score 6 · Answer 2 · answered May 28 '13 at 12:26

Here's a vectorized solution using NumPy. It should be extremely fast, as it has no loops in Python (apart from the printing stage at the end).

import numpy as np

# set up fake data
l1 = np.array([1.9, 2, 2.1]) # or whatever list you have
l2 = np.array([1, 2, 5, 10]) # as above
l2.sort() # remove this line if it's always sorted

# the actual algorithm
indexes = np.searchsorted(l2, l1, side='right')
lower = l2[indexes - 1]
upper = l2[indexes]
diffs = upper - lower

# print results for debugging
for value, diff in zip(l1, diffs):
    print("value", value, "gap", diff)

Here's the output with the hard-coded test data as above:

value 1.9 gap 1
value 2.0 gap 3
value 2.1 gap 3
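The snippet above indexes `l2[indexes]` directly, so a value at or beyond the largest element of `l2` would raise an `IndexError`, and a value below the smallest element would wrap around to the last element. A hedged sketch of one way to guard against both follows. The function name `gaps` and the choice of `NaN` for missing neighbours are assumptions, not part of the answer. It also treats both neighbours as strict, so a value that appears in the reference array (such as 2) gets the gap between the elements on either side of it, which differs from the output shown above.

```python
import numpy as np

def gaps(values, sorted_ref):
    """Gap between the strict neighbours of each value in sorted_ref; NaN if one is missing."""
    values = np.asarray(values)
    sorted_ref = np.asarray(sorted_ref)
    lo = np.searchsorted(sorted_ref, values, side='left') - 1  # index of largest element < value
    hi = np.searchsorted(sorted_ref, values, side='right')     # index of smallest element > value
    valid = (lo >= 0) & (hi < len(sorted_ref))                 # both neighbours exist
    out = np.full(values.shape, np.nan)
    out[valid] = sorted_ref[hi[valid]] - sorted_ref[lo[valid]]
    return out

result = gaps([0.5, 1.9, 2, 2.1, 11], [1, 2, 5, 10])
print(result)
```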

Okay, definitely ordering the numpy book now. +1 – Burhan Khalid May 29 '13 at 03:57

score 4 · Answer 3 · edited May 28 '13 at 12:43

First of all, your example is not valid code, or at least it doesn't do what you want it to do. If you have

for i in list1:

then i is not the index, but an element of list1. So first of all you would compare i and j, not list1[i] and list2[j].

It should be easier to use list comprehensions:

for i in list1:
    a = max([n for n in list2 if n < i])
    b = min([n for n in list2 if n > i])

You might have to add an if or two to make sure a and b exist, but it should work like this.
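As a rough illustration of the "if or two" mentioned above, the sketch below uses the `default=` argument of `max` and `min`, which assumes Python 3.4 or later. The function name `interval` and the sample data are hypothetical.

```python
def interval(x, values):
    """Return b - a for the closest smaller (a) and bigger (b) values, or None if either is missing."""
    a = max((n for n in values if n < x), default=None)  # default= requires Python 3.4+
    b = min((n for n in values if n > x), default=None)
    if a is None or b is None:
        return None
    return b - a

list1 = [1, 4, 8, 30]
list2 = [3, 6, 9, 12]
print([interval(x, list2) for x in list1])  # [None, 3, 3, None]
```

This keeps the linear scan of `list2` for every element of `list1`, so it stays simple to read but does not improve the running time.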

answered May 28 '13 at 12:11 by Harpe · edited May 28 '13 at 12:43 by Vikas

You don't need the second loop `for j in list2:`, do you? – Vikas May 28 '13 at 12:14
+1 for list comprehensions, although generator expressions might be a bit easier to read: `a = max(n for n in list2 if n < i)`. I like how (either of) these read as being almost directly from the spec the OP gave: "a is the maximum of those elements in `list2` that are lower than this particular element of `list1`". Also, `a` and `b` will certainly exist here: if the argument to `max` or `min` amounts to an empty sequence, they will raise `ValueError` (and, in Python 3, values that can't be mutually ordered raise `TypeError`). – lvc May 28 '13 at 12:30
yes that's great! thank you so much. it only worked when i didn't include the second for loop, that's correct. – user2428358 May 28 '13 at 12:35
@wim `max` and `min` are both linear. The loop over `list1` makes it `O(len(list1) * len(list2))`. This isn't *wonderful*, but calling it obnoxious is a bit harsh. – lvc May 28 '13 at 12:50
yeah, the question has been edited - when I voted and commented it had another loop over list2 which did all of nothing, and made it O(n^3) but another user fixed it now. – wim May 28 '13 at 13:00
@wim if you're happy that the problem you downvoted for is fixed, perhaps undoing your vote would be in order? – lvc May 28 '13 at 13:11
You don't need to generate a list inside `max()` and `min()` as they will work on sequences. So the `[ ]` is not required. – Burhan Khalid May 29 '13 at 03:58

score 0 · Answer 4 · answered May 28 '13 at 12:55

Here's a solution that doesn't use NumPy, the bisect module, or list comprehensions! Enjoy.

list1 = [1, 2, 4, 8, 16, 32, 64]
list2 = [3, 6, 9, 12, 15, 18, 21]

# expected differences, used to check the result
correct = {4: 3, 8: 3, 16: 3}

lower = 0
for t in list1:
    print(t)
    difference = 0
    index = lower  # resume from the last match; relies on both lists being sorted
    while difference == 0 and index < len(list2) - 1:
        print("consider %d < %d and %d > %d" % (list2[index], t, list2[index + 1], t))
        if list2[index] < t and list2[index + 1] > t:
            lower = index
            upper = index + 1
            difference = list2[upper] - list2[lower]
            print("%d difference %d" % (t, difference))
            break
        index = index + 1

    if t in correct:
        assert difference == correct[t]
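Resuming the scan from the previous match, as the loop above does, can be written as a two-pointer sweep when both lists are sorted. The sketch below is an illustrative variant rather than the answerer's code. The name `sweep_gaps` is hypothetical, missing neighbours are reported as `None`, and elements equal to `x` are skipped on both sides.

```python
def sweep_gaps(sorted_a, sorted_b):
    """Yield (x, gap) for each x in sorted_a, where gap spans x's strict neighbours in sorted_b."""
    lo = 0  # index of the first element of sorted_b that is >= x
    hi = 0  # index of the first element of sorted_b that is > x
    n = len(sorted_b)
    for x in sorted_a:
        while lo < n and sorted_b[lo] < x:
            lo += 1
        while hi < n and sorted_b[hi] <= x:
            hi += 1
        if lo > 0 and hi < n:
            yield x, sorted_b[hi] - sorted_b[lo - 1]
        else:
            yield x, None  # no smaller or no bigger element exists

list1 = [1, 2, 4, 8, 16, 32, 64]
list2 = [3, 6, 9, 12, 15, 18, 21]
print(list(sweep_gaps(list1, list2)))
```

Because both pointers only ever move forward, the sweep touches each element of either list at most once after sorting.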
