I'm trying to write my own code for a 'friends-of-friends' algorithm. This algorithm acts on a set of 3d data points and returns the number of 'halos' in the dataset. Each halo is a collection of point whose distance is less than the linking length, b, the only parameter of the program.
Algorithmic Description: The FOF algorithm has a single free parameter called the linking length. Any two particles that are separated by a distance less than or equal to the linking length are called "friends". The FOF group is then defined by the set of particles for which each particle within the set is connected to every other particle in the set through a network of friends.
Set FOF group counter j=1.
For each particle, n, not yet associated with any group:
Assign n to group j, initialize a new member list, mlist, for group j with particle n as first entry,
Recursively, for each new particle p in mlist:
- Find neighbors of p lying within a distance less than or equal to the linking length, add to mlist those not already assigned to group j,
- Record mlist for group j, set j=j+1.
This is my attempt to code the algorithm. The only language I'm comfortable in doing this is Python. However, I need this code to be written in Fortran or make it faster. I really hope someone would help me.
First I generate a set of points that should mimic the presence of 3 halos:
import random
from random import *
import math
from math import *
import numpy
from numpy import *
import time
points = 1000
halos=[0,100.,150.]
x=[]
y=[]
z=[]
id=[]
for i in arange(0,points,1):
x.append(halos[0]+random())
y.append(halos[0]+random())
z.append(halos[0]+random())
id.append(i)
for i in arange(points,points*2,1):
x.append(halos[1]+random())
y.append(halos[1]+random())
z.append(halos[1]+random())
id.append(i)
for i in arange(points*2,points*3,1):
x.append(halos[2]+random())
y.append(halos[2]+random())
z.append(halos[2]+random())
id.append(i)
Then I coded the FOF algorithm:
x=array(x)
y=array(y)
z=array(z)
id=array(id)
t0 = time.time()
id_grp=[]
groups=zeros((len(x),1)).tolist()
particles=id
b=1 # linking length
while len(particles)>0:
index = particles[0]
# remove the particle from the particles list
particles.remove(index)
groups[index]=[index]
print "#N ", index
dx=x-x[index]
dy=y-y[index]
dz=z-z[index]
dr=sqrt(dx**2.+dy**2.+dz**2.)
id_to_look = where(dr<b)[0].tolist()
id_to_look.remove(index)
nlist = id_to_look
# remove all the neighbors from the particles list
for i in nlist:
if (i in particles):
particles.remove(i)
print "--> neighbors", nlist
groups[index]=groups[index]+nlist
new_nlist = nlist
while len(new_nlist)>0:
index_n = new_nlist[0]
new_nlist.remove(index_n)
print "----> neigh", index_n
dx=x-x[index_n]
dy=y-y[index_n]
dz=z-z[index_n]
dr=sqrt(dx**2.+dy**2.+dz**2.)
id_to_look = where(dr<b)[0].tolist()
id_to_look = list(set(id_to_look) & set(particles))
nlist = id_to_look
if (len(nlist)==0):
print "No new neighbors found"
else:
groups[index]=groups[index]+nlist
new_nlist=new_nlist+nlist
print "------> neigh-neigh", new_nlist
for k in nlist:
particles.remove(k)
At the end one ends up with a list of the halos in the list groups
This part of the code is a bit off topic but I thought it would be nice to show it to you. I am basically deleting all the groups with no particles, sorting them according to the number of particles and showing some properties.
def select(test,list):
selected = []
for item in list:
if test(item) == True:
selected.append(item)
return selected
groups=select(lambda x: sum(x)>0.,groups)
# sorting groups
groups.sort(lambda x,y: cmp(len(x),len(y)))
groups.reverse()
print time.time() - t0, "seconds"
mass=x
for i in arange(0,len(groups),1):
total_mass=sum([mass[j] for j in groups[i]])
x_cm = sum([mass[j]*x[j] for j in groups[i]])/total_mass
y_cm = sum([mass[j]*y[j] for j in groups[i]])/total_mass
z_cm = sum([mass[j]*z[j] for j in groups[i]])/total_mass
dummy_x_cm = [x[j]-x_cm for j in groups[i]]
dummy_y_cm = [y[j]-y_cm for j in groups[i]]
dummy_z_cm = [z[j]-z_cm for j in groups[i]]
dummy_x_cm = array(dummy_x_cm)
dummy_y_cm = array(dummy_y_cm)
dummy_z_cm = array(dummy_z_cm)
dr = max(sqrt(dummy_x_cm**2.+dummy_y_cm**2.+dummy_z_cm**2.))
dummy_x_cm = max(dummy_x_cm)
dummy_y_cm = max(dummy_y_cm)
dummy_z_cm = max(dummy_z_cm)
print i, len(groups[i]), x_cm, y_cm, z_cm,dummy_x_cm,dummy_y_cm,dummy_z_cm