I have two functions that compute the same metric. One uses a list comprehension to loop over the calculation; the other uses only NumPy tensor operations. Both take an (N, 3) array, where N is the number of points in 3D space. When N ≲ 3000 the tensor function is faster; when N ≳ 3000 the list comprehension is faster. Both appear to scale linearly in N, i.e. the two time-vs-N curves cross at N ≈ 3000. (A timing harness is sketched after the two functions below.)
import numpy as np

def approximate_area_loop(section, num_area_divisions):
    n_a_d = num_area_divisions
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]         # (N-1, 3) segment start points
    b1 = section[1:]          # (N-1, 3) segment end points
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = lambda u: (1 - u) * a1 + u * a2
    d = lambda u: (1 - u) * b1 + u * b2
    x = lambda u, v: (1 - v) * c(u) + v * d(u)
    # Sum the areas of the parallelograms spanned by the two edge vectors of each sub-quad
    area = np.sum([np.linalg.norm(np.cross(x((i + 1) / n_a_d, j / n_a_d) - x(i / n_a_d, j / n_a_d),
                                           x(i / n_a_d, (j + 1) / n_a_d) - x(i / n_a_d, j / n_a_d)),
                                  axis=1)
                   for i in range(n_a_d) for j in range(n_a_d)])
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
def approximate_area_tensor(section, num_area_divisions):
    # D = num_area_divisions in the shape comments below
    divisors = np.linspace(0, 1, num_area_divisions + 1)
    interp_vectors = get_section_interp_(section)
    a1 = section[:-1]
    b1 = section[1:]
    a2 = interp_vectors[:-1]
    b2 = interp_vectors[1:]
    c = np.multiply.outer(a1, 1 - divisors) + np.multiply.outer(a2, divisors)  # c_areas_vecs_divs, shape (N-1, 3, D+1)
    d = np.multiply.outer(b1, 1 - divisors) + np.multiply.outer(b2, divisors)  # d_areas_vecs_divs, shape (N-1, 3, D+1)
    x = np.multiply.outer(c, 1 - divisors) + np.multiply.outer(d, divisors)    # x_areas_vecs_Divs_divs, shape (N-1, 3, D+1, D+1)
    u = x[:, :, 1:, :-1] - x[:, :, :-1, :-1]  # u_areas_vecs_Divs_divs: differences along the first division axis
    v = x[:, :, :-1, 1:] - x[:, :, :-1, :-1]  # v_areas_vecs_Divs_divs: differences along the second division axis
    sub_area_norm_vecs = np.cross(u, v, axis=1)              # areas_crosses_Divs_divs, shape (N-1, 3, D, D)
    sub_areas = np.linalg.norm(sub_area_norm_vecs, axis=1)   # areas_Divs_divs (values are now sub-areas)
    area = np.sum(sub_areas)
    Dt = section[-1, 0] - section[0, 0]
    return area, Dt
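For completeness, this is the kind of harness I use to time the two versions. get_section_interp_ is part of my own code; the stand-in below (a simple constant offset of the section) is hypothetical and is only there so the snippet runs end to end, since its details shouldn't affect the scaling. The value num_area_divisions = 5 is likewise just an illustrative choice:

import timeit
import numpy as np

def get_section_interp_(section):
    # Hypothetical stand-in for my real helper: offset the section along z.
    # It returns an array of the same (N, 3) shape, which is all that matters here.
    return section + np.array([0.0, 0.0, 1.0])

for N in (1000, 3000, 10000):
    section = np.random.rand(N, 3)
    t_loop = timeit.timeit(lambda: approximate_area_loop(section, 5), number=10)
    t_tens = timeit.timeit(lambda: approximate_area_tensor(section, 5), number=10)
    print(f"N={N:6d}  loop: {t_loop:.3f}s  tensor: {t_tens:.3f}s")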
Why is the list comprehension version faster at large N? Surely the tensor version should win? I'm wondering whether the tensor version's intermediate arrays grow too large to fit in cache. Please ask if I haven't included enough information; I'd really like to get to the bottom of this.
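For what it's worth, here's a back-of-the-envelope check on the cache idea (assuming float64 throughout, and the same illustrative num_area_divisions = 5 as above, which is an assumption rather than a value from my real code):

N, D = 3000, 5                                 # crossover N; D = num_area_divisions (assumed)
x_elems = (N - 1) * 3 * (D + 1) ** 2           # elements in the intermediate array x
print(f"x alone: {x_elems * 8 / 1e6:.1f} MB")  # ~2.6 MB in float64; u, v and the cross
                                               # products add several more arrays of similar size

That's already larger than a typical per-core L2 cache (roughly 256 KB to 1 MB), whereas each temporary in the loop version is only (N-1, 3), about 72 KB at N = 3000, which fits comfortably. So cache traffic from the big temporaries seems at least plausible as the culprit, but I'd like confirmation.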