You cannot do better than O(n), because you must traverse all the points to determine the max and min for x and y.
However, you can reduce the constant factor by traversing the list only once. It is unclear whether that gives a better execution time, and if it does, it would only be for large collections of points.
[EDIT]: In fact it does not; the "naive" approach is the more efficient of the two.
Here is the "naive" approach (the faster of the two):
def bounding_box_naive(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we use min and max four times over the collection of points
    """
    bot_left_x = min(point[0] for point in points)
    bot_left_y = min(point[1] for point in points)
    top_right_x = max(point[0] for point in points)
    top_right_y = max(point[1] for point in points)
    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]
And here is the (maybe?) less naive approach:
def bounding_box(points):
    """returns a list containing the bottom left and the top right
    points in the sequence
    Here, we traverse the collection of points only once,
    to find the min and max for x and y
    """
    bot_left_x, bot_left_y = float('inf'), float('inf')
    top_right_x, top_right_y = float('-inf'), float('-inf')
    for x, y in points:
        bot_left_x = min(bot_left_x, x)
        bot_left_y = min(bot_left_y, y)
        top_right_x = max(top_right_x, x)
        top_right_y = max(top_right_y, y)
    return [(bot_left_x, bot_left_y), (top_right_x, top_right_y)]
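Beyond speed, the two versions differ in how they treat empty input and one-shot iterators. The sketch below restates both functions in compact form so that it runs on its own; the `sample` data is made up for illustration and is not part of the measurements in this answer.

```python
def bounding_box_naive(points):
    # Four separate passes, as in the first version above.
    return [(min(p[0] for p in points), min(p[1] for p in points)),
            (max(p[0] for p in points), max(p[1] for p in points))]


def bounding_box(points):
    # One pass, as in the second version above.
    min_x = min_y = float('inf')
    max_x = max_y = float('-inf')
    for x, y in points:
        min_x, min_y = min(min_x, x), min(min_y, y)
        max_x, max_y = max(max_x, x), max(max_y, y)
    return [(min_x, min_y), (max_x, max_y)]


# Hypothetical sample data, chosen only for this illustration.
sample = [(3, -2), (-1, 4), (0, 0), (5, 1)]

# On a list, both versions agree.
assert bounding_box_naive(sample) == bounding_box(sample) == [(-1, -2), (5, 4)]

# Empty input: the single-pass version returns its initial sentinels ...
print(bounding_box([]))  # [(inf, inf), (-inf, -inf)]

# ... while the naive version raises, because min() gets an empty iterable.
try:
    bounding_box_naive([])
except ValueError as exc:
    print("naive version raised ValueError:", exc)

# A one-shot iterator works with the single pass, but the naive version
# exhausts it during its first min() and then fails on the second.
print(bounding_box(iter(sample)))  # [(-1, -2), (5, 4)]
try:
    bounding_box_naive(iter(sample))
except ValueError:
    print("naive version cannot re-traverse an iterator")
```

So if the points come from a generator rather than a list, the single-pass version may still be the practical choice, whatever the timings say.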
Profiling results (IPython; the size passed to range() is the one varied below):
import random
points = [(random.randrange(-1000, 1000), random.randrange(-1000, 1000)) for _ in range(1000000)]
%timeit bounding_box_naive(points)
%timeit bounding_box(points)
In each pair, the first line is bounding_box_naive and the second is bounding_box.
size = 1,000 points
1000 loops, best of 3: 573 µs per loop
1000 loops, best of 3: 1.46 ms per loop
size = 10,000 points
100 loops, best of 3: 5.7 ms per loop
100 loops, best of 3: 14.7 ms per loop
size = 100,000 points
10 loops, best of 3: 66.8 ms per loop
10 loops, best of 3: 141 ms per loop
size = 1,000,000 points
1 loop, best of 3: 664 ms per loop
1 loop, best of 3: 1.47 s per loop
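Clearly, the first approach (the "naive" one, which turns out to be not so naive) is faster, by a factor of roughly 2 to 2.6 in these measurements. A likely reason is that the explicit loop pays for four Python-level calls to min and max per point, whereas the naive version makes only four such calls in total and lets the built-ins do the iteration.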
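The timings above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard timeit module. The helper name `best_time` and the `repeat`/`number` values are choices made for this sketch, not part of the original measurements, and the absolute numbers will differ by machine and Python version.

```python
import random
import timeit


def bounding_box_naive(points):
    # Four passes: one min/max call per coordinate.
    return [(min(p[0] for p in points), min(p[1] for p in points)),
            (max(p[0] for p in points), max(p[1] for p in points))]


def bounding_box(points):
    # One pass, updating all four extremes per point.
    min_x = min_y = float('inf')
    max_x = max_y = float('-inf')
    for x, y in points:
        min_x, min_y = min(min_x, x), min(min_y, y)
        max_x, max_y = max(max_x, x), max(max_y, y)
    return [(min_x, min_y), (max_x, max_y)]


def best_time(func, points, repeat=3, number=5):
    """Best average time per call, in seconds, over `repeat` runs.

    Hypothetical helper for this sketch; it mimics the "best of 3"
    reporting of %timeit.
    """
    runs = timeit.repeat(lambda: func(points), repeat=repeat, number=number)
    return min(runs) / number


for size in (1_000, 10_000, 100_000):
    points = [(random.randrange(-1000, 1000), random.randrange(-1000, 1000))
              for _ in range(size)]
    naive = best_time(bounding_box_naive, points)
    single = best_time(bounding_box, points)
    print(f"size = {size:>7,}: naive {naive * 1e3:.3f} ms, "
          f"single pass {single * 1e3:.3f} ms")
```

Taking the minimum over several repeats, as %timeit does, reduces the influence of background noise on the comparison.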