How to vectorize a nested for loop in which the second loop is dependent on the first loop in Python?

Question

I am new to NumPy, my knowledge of Python is limited, and I am also new to working with images. I don't understand how to do this.

I need to know how to vectorize a nested for loop in Python in which the second loop is dependent on the first loop.

Example:

for (condition)
{
  if (condition)
  {
    for (condition)
    {
    }
  }
}
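
To illustrate the general pattern above (this is not the tracing code from this question), here is a minimal sketch. It assumes the inner loop's range depends only on the outer index and that each iteration is independent of the others. The function names and the pairwise-sum computation are made up for the example.

```python
import numpy as np

def pair_sums_loop(values, threshold):
    """Nested loops: the inner range starts at i + 1, so it depends on i."""
    n = len(values)
    out = np.zeros((n, n), dtype=values.dtype)
    for i in range(n):
        if values[i] > threshold:          # condition on the outer index
            for j in range(i + 1, n):      # inner bounds depend on i
                out[i, j] = values[i] + values[j]
    return out

def pair_sums_vectorized(values, threshold):
    """Same result, with a boolean mask replacing both loops and the if."""
    n = len(values)
    i_idx, j_idx = np.indices((n, n))
    # j > i encodes the dependent inner range; the second term is the if.
    mask = (j_idx > i_idx) & (values[:, None] > threshold)
    sums = values[:, None] + values[None, :]
    return np.where(mask, sums, 0).astype(values.dtype)

values = np.array([3, 1, 4, 1, 5])
assert np.array_equal(pair_sums_loop(values, 2),
                      pair_sums_vectorized(values, 2))
```

This kind of rewrite only applies when the iterations do not depend on each other. A loop in which each step needs the result of the previous step, such as walking along a branch pixel by pixel, cannot be turned into a mask this way.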

Let me give the context and a snippet of code so that it's easier to explain what I'm looking for.

I have found the end points (pixels with only one neighbouring pixel) and junction points (pixels with three or more neighbouring pixels) of a skeletonized image. The code snippet below tries to find any connections between two junction points or between a junction point and an end point.

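# FOR JUNCTION POINTS TO OTHER JUNCTION POINTS OR END POINTS

for i in xrange(lenj):
    a = junc_points[i]
    # Neighbouring pixels of junction a: one starting pixel per branch
    point_junc = en2(a[0], a[1], skeleton, point)
    point.append(a)
    for c in point_junc:
        a1 = c
        point.append(a1)
        # Walk along the branch one pixel at a time
        while True:
            flag = 0
            a2 = en1(a1[0], a1[1], skeleton, point)
            a1 = a2[1:]
            if a2[0] == 0:
                # en1 returned [0]: stop tracing this branch
                break
            else:
                point.append(a1)
                # Check whether the new pixel is an end point or a junction point
                for j in xrange(lenej):
                    b = end_junc_points[j]
                    if a1 == b:
                        # print(a, " is connected to ", b)
                        flag = 1
                        adj[i][j] = 1
                        break
            if flag == 1:
                break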

point is a list storing the coordinates of all previously visited pixels.

en2() returns all the neighbouring pixels of point a, so that each branch can be traversed to look for end points or junction points.

en1() returns [0] if a pixel has already been visited; otherwise it returns [1, x, y], where (x, y) is the pixel in front of it.

junc_points is a list which stores all the junction points.

end_junc_points is a list which stores both the junction points and the end points.

lenej is the length of end_junc_points.

lenj is the length of junc_points.

point_junc stores the pixel coordinates returned by en2().

adj is an adjacency matrix.
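
The inner `for j in xrange(lenej)` scan in the snippet compares the current pixel against every entry of end_junc_points. One possible alternative is a dictionary that maps each coordinate to its index, sketched below. It assumes the coordinates in end_junc_points are unique; `build_index` and the sample coordinates are hypothetical and only for illustration.

```python
def build_index(points):
    """Map each (row, col) coordinate to its position in the list."""
    return {tuple(p): j for j, p in enumerate(points)}

# Made-up coordinates, only for illustration.
end_junc_points = [[2, 3], [5, 8], [9, 1]]
index_of = build_index(end_junc_points)

a1 = [5, 8]                       # pixel reached while tracing a branch
j = index_of.get(tuple(a1))       # None if a1 is not a key point
if j is not None:
    # In the original snippet this is where adj[i][j] = 1 would go.
    print("reached key point", j)
```

The dictionary is built once before tracing starts, so each check becomes a single hash lookup instead of a scan over the whole list.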

The reason I need this is that I am trying to build a key point graph from the end points and junction points in the skeletonized image. adj stores that graph.

I wrote the code in a very basic way in Python. As you can see, it is inefficient and slow, which is why I need vectorization. Apart from vectorization, please point me to any functions or libraries for working with skeletonized images that would help me optimize my code.

I appreciate that my code looks very vague; I tried to explain everything as well as I can. I didn't give my full code because other sections of it do similar things and I wanted to be able to do those on my own. That is why I asked for help with only one portion: understanding how to vectorize this might help me do the rest as well.
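
The end point and junction point definitions given earlier (one neighbour, three or more neighbours) are one part of this pipeline that can be computed without looping over pixels. The sketch below uses only NumPy and assumes 8-connectivity and a 2D array whose nonzero pixels form the skeleton; neither assumption is stated in the question. The function names `count_neighbours` and `key_points` are hypothetical.

```python
import numpy as np

def count_neighbours(skeleton):
    """Count 8-connected foreground neighbours for every pixel."""
    sk = (np.asarray(skeleton) > 0).astype(np.uint8)
    padded = np.pad(sk, 1, mode="constant")   # zero border avoids edge checks
    h, w = sk.shape
    counts = np.zeros((h, w), dtype=np.uint8)
    # Sum the eight shifted copies of the image (loop over offsets, not pixels).
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            counts += padded[1 + dr:1 + dr + h, 1 + dc:1 + dc + w]
    return counts

def key_points(skeleton):
    """Return (end_points, junction_points) as arrays of [row, col]."""
    sk = np.asarray(skeleton) > 0
    counts = count_neighbours(sk)
    end_points = np.argwhere(sk & (counts == 1))
    junction_points = np.argwhere(sk & (counts >= 3))
    return end_points, junction_points

# Small X-shaped skeleton, made up for illustration.
skeleton = np.array([
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 1],
])
end_points, junction_points = key_points(skeleton)
```

With 8-connectivity, pixels right next to a true junction can also reach a count of three (for example around a T-shaped crossing), so the junction list may need some merging of adjacent points before it is used as graph nodes.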

Edit: Adding input output.

For this image

This was the skeleton image formed

On running a piece of code, I find the end points and junction points, which in this case are

There are no end points. Naturally, these junction points and end points are the input for this snippet of code, which is supposed to form an adjacency matrix that represents the graph. The graph formed for this image (represented by an adjacency matrix) is:

I hope this clears up the doubts about the input and output.

Wow, I guess you've got some bad experience with "silent downvoters"! :) — Cris Luengo, Feb 08 '18 at 03:10
You should be able to parallelize this by simply dividing up the outer loop -- each path traced is independent of the others. Just have each thread start at different junction points. But lines in between two junction points could be duplicated then, you'll have to de-duplicate paths afterwards. But I think you'll get a better speed up by compiling this code, as suggested in a comment above. — Cris Luengo, Feb 08 '18 at 03:18
@CrisLuengo Yes I got banned once for too many downvotes for no reason. And the reason I am using the `point` list is for the de-duplicate purpose. Can you elaborate what you mean by dividing the outer loop? — Tuhin, Feb 08 '18 at 16:13
@Cleb added the input and output for a sample image, and also added the image. Please check :) — Tuhin, Feb 08 '18 at 17:00
@Tuhin: The loop `for i in xrange(lenj)` can be vectorized trivially. All threads need to have their own `point` array, so some lines will be traced twice. After the parallel loop you thus need to find duplicate lines and remove them (this should be trivial). -- But as I said earlier, you'll get a much better speedup by simply compiling the code. — Cris Luengo, Feb 08 '18 at 17:13
@CrisLuengo any idea how can I do JIT compilation in python? As I said Im quite new to this, so any links or pointers would be a great help. Thanks. Yes actually I didnt wish to trace some lines more than once as I thought it would just be more time consuming, so I took this approach so that each line is traced just once. — Tuhin, Feb 08 '18 at 19:01
Did you profile? I bet that looking up if a pixel is visited is the most expensive part of your code. You don't show the function where that happens, but them being in a vector is a bad sign. You'd be better off using a separate image where you mark visited pixels. Then lookup is O(1) instead of O(n). — Cris Luengo, Feb 08 '18 at 19:19
@CrisLuengo `point` list does that. as soon as Im on a pixel, I am adding it to point. So the `e2()` and `e1()` functions both first check if the pixel is in point, then return the value(s) of an unvisited pixel (basically, it moves forward that way and no chance of tracking back) — Tuhin, Feb 08 '18 at 20:10
What I'm saying is that it's a list. Make it into an image `visited[a[0]][a[1]] = 1`. Looking up a coordinate in a list to see if you've visited is extremely expensive! — Cris Luengo, Feb 08 '18 at 20:59
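
The suggestion in the comments above, replacing the `point` list with an image of visited pixels, could look roughly like the sketch below. It assumes pixels are stored as (row, col) pairs and that the visited array has the same shape as the skeleton. The helper names `mark_visited` and `is_visited` are hypothetical, and the 5x5 array is only a placeholder.

```python
import numpy as np

skeleton = np.zeros((5, 5), dtype=bool)          # placeholder image
visited = np.zeros(skeleton.shape, dtype=bool)   # one flag per pixel

def mark_visited(visited, pixel):
    """Record that pixel (row, col) has been traced."""
    visited[pixel[0], pixel[1]] = True

def is_visited(visited, pixel):
    """Constant-time check, independent of how many pixels were visited."""
    return bool(visited[pixel[0], pixel[1]])

mark_visited(visited, (2, 3))
assert is_visited(visited, (2, 3))
assert not is_visited(visited, (0, 0))
```

A membership test such as `a1 in point` scans the list, so it gets slower as more pixels are visited, whereas the array lookup costs the same every time.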

