2

I'm new to numpy and have been tasked with the following situation: I need to create two numpy arrays of random integers (between 0 + 1). One numpy array represents the x-coordinates and the other one represents the y-coordinates. I then need to check to see if the points fall inside a circle of radius one by using squareroot(x^2 + y^2) < 1.

I'm currently just trying to square my arrays and add them together. What is probably a very simple task is giving me no ends of trouble.

import matplotlib.pyplot as plt
import numpy as np

plots = 100

dataOne = np.random.random(size = plots)
dataTwo = np.random.random(size = plots)

circle = plt.Circle((0,0), 1, alpha = 0.1)
plt.gca().add_patch(circle)
plt.xlim(0, 5)
plt.ylim(0, 5)
plt.show()

squareDataOne = dataOne ** 2
squareDataTwo = dataTwo ** 2

if squareDataOne + squareDataTwo < 1:
    print("Now check square-root!")

I keep receiving an error message: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all(). Could anyone explain why Python/Numpy does not like this? I've been told to try and use a Boolean expression to slice the array. Can anyone provide suggestions on the best way to incorporate this into my given code? Any suggestions or tips for a newbie are appreciated.

hpaulj
  • 221,503
  • 14
  • 230
  • 353
Dyland
  • 91
  • 1
  • 11
  • Possible duplicate of [How to filter a numpy array using a condition in python](https://stackoverflow.com/questions/48737447/how-to-filter-a-numpy-array-using-a-condition-in-python) – MFisherKDX Apr 04 '19 at 02:07
  • What output are you looking for? The indices of the (x, y) pairs? Or are you supposed to plot them? – subnivean Apr 04 '19 at 02:44

5 Answers5

2

squareDataOne looks like this:

[7.43871942e-02 2.73007883e-01 5.23115388e-03 6.57541340e-01
 3.08779564e-01 1.24098667e-02 5.08258990e-01 6.52590269e-01
 8.90656103e-02 3.76389212e-02 2.12513661e-01 2.79683875e-01
 7.76233370e-01 6.48353342e-02 8.01663208e-01 8.69331480e-01
 4.34903542e]

squareData2 looks similar. The expression in your if statement:

squareDataOne + squareDataTwo < 1

produces this array:

[ True False  True  True  True  True  True  True False False  True False
  True False False  True  True  True  True  True False  True False False
  True  True False  True  True  True  True  True  True  True  True  True
  True  True False False  True False]

So your if statement is expecting a True or False value, and is getting this array. The error message is telling you that Python doesn't know how to turn this array into a single True or False value.

I don't understand the logic of your code well enough to know what you need to do to fix this. Clearly you have a lot of data, and yet you expect to decide a binary event; if you should print "Now check square-root!" or not. I have no idea how you should do that.

CryptoFool
  • 21,719
  • 5
  • 26
  • 44
0

I then need to check to see if the points fall inside a circle of radius one by using squareroot(x^2 + y^2) < 1.

You could use array filtering

pt_norm = (squareDataOne + squareDataTwo)
r_inside_circle = np.sqrt(pt_norm[pt_norm < 1])

This will give you the radius of all the points that fall inside the circle in r_inside_circle. As you increase the value plots you will see that (4.0*len(r_inside_circle))/len(dataOne) will approach PI.

MFisherKDX
  • 2,840
  • 3
  • 14
  • 25
0

I think you want to replace this:

if squareDataOne + squareDataTwo < 1:
    print("Now check square-root!")

with something like:

# Calculate radii
radii = (squareDataOne + squareDataTwo)**0.5

# Create a boolean array (True/False values) for filtering
incircle = radii < 1

# Print the number of points inside the circle (Python 3.6+ for the f-string)
print(f"There are {np.sum(incircle)} points inside the circle")

# Plot the points within the circle
plt.scatter(dataOne[incircle], dataTwo[incircle])

The dataOne[incircle] and dataTwo[incircle] will extract only those elements (i.e. [x, y] coordinate pairs) from each array where the value of incircle is True.

subnivean
  • 1,132
  • 11
  • 19
  • I typically check if the magnitude is less than the radius squared as doing the square root calc over all the points is much more expensive than calculating the square once – VoteCoffee Jan 23 '21 at 19:26
0
# Change number of random points to 5.
plots = 5
...
inside_circle = squareDataOne + squareDataTwo < 1
print(inside_circle)

output:
[False  True  True False False]

It shows that inside_circle is a numpy array of dtype "bool", indicating whether each point is inside circle, rather than a single "bool" variable, which is why python if statement throws error. You can sum all the "bool"s in inside_circle to obtain number of points inside circle (when summing a bool array, numpy treats True as 1, and False as 0).

plots = 100
...
num_inside = np.sum(inside_circle)
# Check if all points are inside.
if num_inside == plots:
    print('All inside.')

# If what you really want to achieve is to estimate the area of one quarter of a circle
# using Monte Carlo Sampling.
print(num_inside / plots)    # Your estimate.
print(np.pi / 4)             # Actual area.

output:
0.79
0.7853981633974483

The estimate is close, right?

0

Below is an example of using radius**2 and prefiltering with a bounding box. These are optimizations that are helpful for large sets.

Prefiltering with a bounding box and Radius**2 is 10x as fast. This code executes in 0.128 seconds:

import time
import numpy as np

plots = 10000000
radius = 1
center = np.array([0,0])

all_points = np.random.randint(low = -5, high = 5, size = plots*2).reshape(-1, 2)
print('All Points: ' + str(len(all_points)))

start_time = time.time()

# Limit expensive circle search to only points within box bounding circle
bounding_box_lowerleft = center - np.array([radius, radius])
bounding_box_upperright = center + np.array([radius, radius])
in_box_points = all_points[(all_points[:, 0] < bounding_box_upperright[0]) & (all_points[:,0] > bounding_box_lowerleft[0]) & (all_points[:, 1] < bounding_box_upperright[1]) & (all_points[:,1] > bounding_box_lowerleft[1])]
    
print('In Box: ' + str(len(in_box_points)))

# Use squared to avoid cost of sqrt
radius_squared = radius**2
in_circle_points = in_box_points[((in_box_points[:, 0] - center[0])**2 + (in_box_points[:, 1] - center[1])**2) <= radius_squared ]

print('In Circle: ' + str(len(in_circle_points)))
print('Elapsed time: ' + str(time.time() - start_time) + ' seconds')

Radius**2 is 4x as fast. This code executes in 0.349 seconds:

import time
import numpy as np

plots = 10000000
radius = 1
center = np.array([0,0])

all_points = np.random.randint(low = -5, high = 5, size = plots*2).reshape(-1, 2)
print('All Points: ' + str(len(all_points)))

start_time = time.time()

# Use squared to avoid cost of sqrt
radius_squared = radius**2
in_circle_points = all_points[((all_points[:, 0] - center[0])**2 + (all_points[:, 1] - center[1])**2) <= radius_squared ]

print('In Circle: ' + str(len(in_circle_points)))
print('Elapsed time: ' + str(time.time() - start_time) + ' seconds')

Without any optimizations. This code executes in 1.185 seconds:

import time
import numpy as np

plots = 10000000
radius = 1
center = np.array([0,0])

all_points = np.random.randint(low = -5, high = 5, size = plots*2).reshape(-1, 2)
print('All Points: ' + str(len(all_points)))

start_time = time.time()

in_circle_points = all_points[((all_points[:, 0] - center[0])**2 + (all_points[:, 1] - center[1])**2)**0.5 <= radius ]

print('In Circle: ' + str(len(in_circle_points)))
print('Elapsed time: ' + str(time.time() - start_time) + ' seconds')
VoteCoffee
  • 4,692
  • 1
  • 41
  • 44