
I'm trying to understand this code:

i = bisect.bisect(self.A, [id + 1]) - 1

Here, the developer passed [id + 1] to bisect() as x (the target value). Neither the Python docs nor the source code for bisect mentions anywhere that x can be a list of length 1 like that.

The goal of the code is to find in A the value of the pair whose id matches the given id or, if there is no match, the pair with the biggest id smaller than the given id. For example:

A = [[0, 0], [2, 4]]
id = 1
Output = 0  # 1 isn't in A, so return value 0 from id 0, which is the biggest id < input id
A = [[0, 0], [2, 4], [3, 12]]
id = 3
Output = 12  # 3 is in A, so return 12

I've tried taking out the ids to try to find the value but my code returns the wrong answer:

A = [[0, 0], [2, 4]]
id = 1
ids = [0, 2]
i = bisect.bisect(ids, id)  # returns 1, which is correct for bisect but not the expected result
i = bisect.bisect(ids, id + 1) - 1  # still returns 1
i = bisect.bisect_left(ids, id + 1) - 1  # still returns 1
i = bisect.bisect_left(ids, id + 1) - 1  # returns 0

However, bisect_left() will return the wrong answer for:

A = [[0, 0], [2, 4], [3, 12]]
id = 3
i = bisect.bisect(self.A, [id + 1]) - 1  # returns 12 which is correct
ids = [0, 2, 3]
i = bisect.bisect_left(ids, id + 1) - 1  # returns 4

So why is there a difference? How does passing [x] work?

Viet
  • Snapshot array anyone? =) Lists as values are no special case whatsoever. The +- kungfu is done to get the correct snap-id, value pair for the given snap-id regardless whether during that id's lifetime a value was set for the key. – user2390182 Jan 19 '21 at 16:33
  • Haha @schwobaseggl you got me. So if I don't want to do that "magic" and just take the `ids` out to find the right index, how should I do that? – Viet Jan 19 '21 at 16:41
  • You do not show how you then go on to retrieve the actual value (`4` or `12`). The index you obtain from the two approaches should be the same. – user2390182 Jan 19 '21 at 16:48
  • Yeah. I figured it out. I was confused between the 2 `i` results. `i` from the other developer's code is the index in A, my `i` is the index in `ids`. Getting the `ids` however, makes the solution run **a lot** slower. My solution that uses dict and a simple while loop is faster than the binary search with list of lists. – Viet Jan 19 '21 at 16:52
  • Ah you extracted the ids from the list of pairs first? Yes, that linear action would negate the whole bisect approach. You could store ids and values in separate lists tho, obtain the index via bisect from one and use the index to retrieve the value from the other. – user2390182 Jan 19 '21 at 16:54

1 Answer


There's no special case for bisecting with a list as the argument. Lists can be compared just like anything else, and the comparison is lexicographic on the elements. So if you search for [id + 1] in a list of two-element lists of ints, you'll find the spot where the inner list has id + 1 as the first element. Because [id + 1] is shorter than the two-element lists, it's always less than anything with an identical first element, so bisect_left will give a consistent placement relative to equal elements, always at the very beginning of the run of elements that share that first value (whereas if you'd passed [id + 1, 0], you'd come after any elements where the second int was negative).
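
As a rough illustration of the comparison rules described above, here is a minimal sketch using the sample data from the question. The helper name `lookup` and the parameter name `snap_id` are made up for this example (the original code uses `id`, which shadows a Python builtin); this is not the original developer's code.

```python
import bisect

# Lists compare lexicographically, element by element.
# A list that is a strict prefix of a longer list sorts before it.
assert [3] < [3, 0]
assert [3] < [3, -5]      # holds whatever the second element is
assert [2, 4] < [3]       # decided by the first element alone

A = [[0, 0], [2, 4], [3, 12]]  # [id, value] pairs, sorted by id


def lookup(pairs, snap_id):
    """Return the value paired with the largest id <= snap_id (hypothetical helper)."""
    # [snap_id + 1] lands just before any pair whose id is snap_id + 1,
    # so the pair immediately to its left has the largest id <= snap_id.
    i = bisect.bisect(pairs, [snap_id + 1]) - 1
    return pairs[i][1]


print(lookup(A, 3))                   # 12
print(lookup([[0, 0], [2, 4]], 1))    # 0
```

Note that `i` here is an index into the list of pairs, and the value is read from `pairs[i][1]` afterwards.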

ShadowRanger
  • Thank you. Your answer explains why passing `[x]` works. I assumed that taking the `ids` out and try `bisect` against them, I should get the same result but I didn't. Can you help me understand that? – Viet Jan 19 '21 at 16:35
  • @Viet: It doesn't work because `int` is not comparable to `list`. You can't test `if 1 < [1, 2]:`, it's nonsensical (at least in Python, I'm sure there are weird languages where the list construction is so deeply embedded in the language it might allow that). Testing `if [1] < [1, 2]:` is fine though. Internally, `bisect` is doing tests of that form; if the value to search for doesn't match the type of the elements in the `list` being searched, you end up with `TypeError`s. – ShadowRanger Jan 19 '21 at 16:37
  • I meant when I used `bisect` against the list of `ids` themselves, not the tuples. Supposedly, it should work as expected, right? Because bisect will find the insert index in the list of the ids. – Viet Jan 19 '21 at 16:38
  • @schwobaseggl: thank you, your comment makes sense. If I were to get the correct result using the lookup-ids, how should I do that? – Viet Jan 19 '21 at 16:40
  • @Viet: In your examples, you keep switching from `bisect.bisect` (equivalent to `bisect.bisect_right`) to `bisect.bisect_left` when switching from `list[list[int]]` to `list[int]`. If you're consistent, [they work identically in your example](https://tio.run/##dYw7DoMwEAV7n@KVoCwI47QpOIe1QuKnWErAgm04vVnkmuoVb2biKd9tdSmFf9x2wRCOeRTT4QPvG0LDBN8S3vc6gm2ZYcKkvzNxD6sUWanz9L95kaJTWJkXLJeoYEs1jjupRY05flYVJGQ3qyld "Python 3 – Try It Online"). `bisect_left` and `bisect(_right)` can still differ in some cases because `0` is equal to `0`, but `[0]` is less than `[0, ANYTHING]` (as I noted in my answer). – ShadowRanger Jan 19 '21 at 16:41
  • Thank you! You're right. I was confused by the 2 `i`. In the other developer's code, `i` is the index inside A. In my code, `i` is the index inside `ids`. – Viet Jan 19 '21 at 16:46
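
The suggestion in the comments to keep ids and values in separate lists could look roughly like the sketch below. It assumes the two lists are maintained in parallel as pairs are appended, so no linear extraction of ids is needed per lookup. The names `ids`, `values`, `lookup_parallel`, and `snap_id` are hypothetical.

```python
import bisect

# Parallel lists, assumed to be kept in sync: ids[k] belongs with values[k].
ids = [0, 2, 3]
values = [0, 4, 12]


def lookup_parallel(ids, values, snap_id):
    """Return the value for the largest id <= snap_id (hypothetical helper)."""
    # With plain ints, bisect_right(ids, snap_id) counts the ids <= snap_id,
    # so subtracting 1 gives the index of the largest such id.
    i = bisect.bisect_right(ids, snap_id) - 1
    return values[i]


print(lookup_parallel(ids, values, 3))          # 12
print(lookup_parallel([0, 2], [0, 4], 1))       # 0
```

For integer ids, `bisect.bisect_left(ids, snap_id + 1)` returns the same position as `bisect.bisect_right(ids, snap_id)`, so either form yields the same index as the list-of-pairs version.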