
I am using the following method to perform operations on every pixel of an image, but it is too slow: it takes roughly 110-120 seconds on my machine.

for i, j in product(xrange(15, width - 15), xrange(15, height - 15)):
        # finding the avg of 15x15 window
        temp = image.crop((i - 7, j - 7, i + 8, j + 8))
        N = numpy.mean(list(temp.getdata()))

        # calling the functions
        avg0_3,avg0_5,avg0_7,avg0_9,avg0_11,avg0_13,avg0_15 = angle_0(7, 7, temp)
        avg15_3,avg15_5,avg15_7,avg15_9,avg15_11,avg15_13,avg15_15 = angle_15(7, 7, temp)
        avg30_3,avg30_5,avg30_7,avg30_9,avg30_11,avg30_13,avg30_15 = angle_30(7, 7, temp)
        avg45_3,avg45_5,avg45_7,avg45_9,avg45_11,avg45_13,avg45_15 = angle_45(7, 7, temp)
        avg60_3,avg60_5,avg60_7,avg60_9,avg60_11,avg60_13,avg60_15 = angle_60(7, 7, temp)
        avg75_3,avg75_5,avg75_7,avg75_9,avg75_11,avg75_13,avg75_15 = angle_75(7, 7, temp)
        avg90_3,avg90_5,avg90_7,avg90_9,avg90_11,avg90_13,avg90_15 = angle_90(7, 7, temp)

        avg105_3,avg105_5,avg105_7,avg105_9,avg105_11,avg105_13,avg105_15 = angle_105(7, 7, temp)
        avg120_3,avg120_5,avg120_7,avg120_9,avg120_11,avg120_13,avg120_15 = angle_120(7, 7, temp)
        avg135_3,avg135_5,avg135_7,avg135_9,avg135_11,avg135_13,avg135_15 = angle_135(7, 7, temp)
        avg150_3,avg150_5,avg150_7,avg150_9,avg150_11,avg150_13,avg150_15 = angle_150(7, 7, temp)
        avg165_3,avg165_5,avg165_7,avg165_9,avg165_11,avg165_13,avg165_15 = angle_165(7, 7, temp)

        # largest grey level lines (L3,L5,L7,L9,L11,L13,L15)
        L3 = max(avg0_3, avg15_3, avg30_3, avg45_3, avg60_3, avg75_3, avg90_3, avg105_3, avg120_3, avg135_3, avg150_3, avg165_3)
        L5 = max(avg0_5, avg15_5, avg30_5, avg45_5, avg60_5, avg75_5, avg90_5, avg105_5, avg120_5, avg135_5, avg150_5,avg165_5)
        L7 = max(avg0_7, avg15_7, avg30_7, avg45_7, avg60_7, avg75_7, avg90_7, avg105_7, avg120_7, avg135_7, avg150_7,avg165_7)
        L9 = max(avg0_9, avg15_9, avg30_9, avg45_9, avg60_9, avg75_9, avg90_9, avg105_9, avg120_9, avg135_9, avg150_9,avg165_9)
        L11 = max(avg0_11, avg15_11, avg30_11, avg45_11, avg60_11, avg75_11, avg90_11, avg105_11, avg120_11, avg135_11, avg150_11,avg165_11)
        L13 = max(avg0_13, avg15_13, avg30_13, avg45_13, avg60_13, avg75_13, avg90_13, avg105_13, avg120_13, avg135_13, avg150_13,avg165_13)
        L15 = max(avg0_15, avg15_15, avg30_15, avg45_15, avg60_15, avg75_15, avg90_15, avg105_15, avg120_15, avg135_15, avg150_15,avg165_15)

        '''
        # largest grey level orthognal line
        L2 = max(avgorth0, avgorth15, avgorth30, avgorth45, avgorth60, avgorth75, avgorth90, avgorth105, avgorth120,
             avgorth135, avgorth150, avgorth165)
        strength2 = L2 - N
        '''
        # line strengths of lines (L3,L5,L7,L9,L11,L13,L15)

        strength3 = L3 - N
        strength5 = L5 - N
        strength7 = L7 - N
        strength9 = L9 - N
        strength11 = L11 - N
        strength13 = L13 - N
        strength15 = L15 - N


        S3.append(strength3)
        S5.append(strength5)
        S7.append(strength7)
        S9.append(strength9)
        S11.append(strength11)
        S13.append(strength13)
        S15.append(strength15)
        #S2.append(strength2)
        p = image.getpixel((i,j))
        I.append(p)
        R = (strength3 + strength5 + strength7 + strength9 + strength11 + strength13 + strength15 + p) * 0.125
        R_comb.append(R)
        result.putpixel((i, j), R)

def angle_0(i, j, image):
    # sums along the horizontal line through (i, j); each longer
    # line reuses the previous sum and adds the two end pixels
    sum3 = image.getpixel((i - 1, j)) + image.getpixel((i, j)) + image.getpixel((i + 1, j))
    sum5 = image.getpixel((i - 2, j)) + sum3 + image.getpixel((i + 2, j))
    sum7 = image.getpixel((i - 3, j)) + sum5 + image.getpixel((i + 3, j))
    sum9 = image.getpixel((i - 4, j)) + sum7 + image.getpixel((i + 4, j))
    sum11 = image.getpixel((i - 5, j)) + sum9 + image.getpixel((i + 5, j))
    sum13 = image.getpixel((i - 6, j)) + sum11 + image.getpixel((i + 6, j))
    sum15 = image.getpixel((i - 7, j)) + sum13 + image.getpixel((i + 7, j))

    avg_sum3 = sum3 / 3
    avg_sum5 = sum5 / 5
    avg_sum7 = sum7 / 7
    avg_sum9 = sum9 / 9
    avg_sum11 = sum11 / 11
    avg_sum13 = sum13 / 13
    avg_sum15 = sum15 / 15

    return avg_sum3, avg_sum5, avg_sum7, avg_sum9, avg_sum11, avg_sum13, avg_sum15

What is a more efficient way to do this? Keep in mind that my operations require the pixel coordinates, because I also need the values of some pixels that are at a position relative to (i, j).
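For context, the angle-0 averages are just centered running means along image rows; if the image were a 2-D NumPy array, I believe something like this would compute them for every pixel at once (the `uniform_filter1d` usage and array names are my assumption, not part of the code above):

```python
import numpy as np
from scipy.ndimage import uniform_filter1d

img = np.arange(25, dtype=float).reshape(5, 5)

# A centered horizontal window of odd size n at every pixel:
# axis=1 runs along rows, i.e. PIL's i / x direction.
avgs_0 = {n: uniform_filter1d(img, size=n, axis=1)
          for n in (3, 5)}  # extend to 7, 9, ..., 15 the same way
```

Border pixels use the filter's boundary mode (reflection by default), so only values at least n//2 columns from each edge match the per-pixel sums exactly.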

Sami Bilal
  • Well, there seems to be a lot of questions on this already: https://stackoverflow.com/questions/13461153/how-can-i-iterate-over-image-pixels-in-a-faster-manner-in-python, https://stackoverflow.com/questions/13003949/faster-way-to-loop-through-every-pixel-of-an-image-in-python, https://stackoverflow.com/questions/36353262/i-need-a-fast-way-to-loop-through-pixels-of-an-image-stack-in-python, https://stackoverflow.com/questions/36118854/python-iterate-through-pixels-of-image, https://stackoverflow.com/questions/45353997/fastest-way-to-iterate-over-all-pixels-of-an-image-in-python – mechalynx Aug 18 '17 at 18:59
  • Use vectorized functions such as those provided by OpenCV or NumPy. Hard to tell you which; it depends on what you want to achieve. – Dan Mašek Aug 18 '17 at 18:59
  • https://stackoverflow.com/questions/26445153/iterations-through-pixels-in-an-image-are-terribly-slow-with-python-opencv - I haven't checked the details, but you're likely to find something. – mechalynx Aug 18 '17 at 18:59
  • tried all of these already. Didn't help at all – Sami Bilal Aug 18 '17 at 21:45
  • If you show more of your code I can see if you can parallelize it. – atru Aug 19 '17 at 17:53
  • `for i, j in product(xrange(15, width - 15), xrange(15, height - 15)): N = avg_sqr_window(i, j, image) # calling the functions avg0, avgorth0 = angle_0(i, j, image) avg15, avgorth15 = angle_15(i, j, image) avg30, avgorth30 = angle_30(i, j, image) avg45, avgorth45 = angle_45(i, j, image)` – Sami Bilal Aug 20 '17 at 17:35
  • So in the innermost loop you call all these functions, and the arguments to them are the indices i and j and the whole image? I'd imagine that would indeed be slow and memory inefficient, because you would be sending a big chunk of data (the `image`) to each of these functions and creating a copy of it. Both sending data and making copies take some time. Can you try sending elements instead, like `image[i,j]`? I see you may not be able to do that. Maybe then a smaller chunk? Also, this looks like something that could be parallelized. What do you do with `image` in those functions? – atru Aug 20 '17 at 23:07
  • Are you doing a convolution operation? What exact function do you need to do inside the for loop? – alkasm Aug 20 '17 at 23:33
  • @atru well if sending an entire image is indeed slow process, I may try to send only that part which is required by me. And since you ask what I am doing inside these functions. Well I sum the values of few required pixels around the pixel i,j and calculate avg. I'm gonna try to measure the time by sending only part of the image and will let you know here in a few moments Also do let me know how can I parallelize it? – Sami Bilal Aug 21 '17 at 21:26
  • @AlexanderReynolds I need to find the avg of some pixel values around the pixel i,j – Sami Bilal Aug 21 '17 at 21:35
  • @SamiBilal selecting these additional pixels may also add an overhead. But you should still try, it's interesting to see. As for parallelization, there's multithreading in python, it worked for me once. I'll check it out now. – atru Aug 21 '17 at 22:17
  • @SamiBilal this sounds like an [XY problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). Please edit the post with what you are *actually* trying to do, exactly. It sounds like you want to do an averaging convolution, which you can do without for loops and with built-in OpenCV methods. Edit the post with exactly which pixels you want to average, how you want to weight the average (if at all), and what you intend to do at the borders of the image. – alkasm Aug 21 '17 at 22:46
  • @AlexanderReynolds is right - I can help you do a loop parallelization in a number of ways, but you need to show more details of your code. And if you can avoid the loop and the parallelization that's perfect. I can only help you paralellizing, that's my thing, unfortunately know very little on image processing. – atru Aug 21 '17 at 23:39
  • @AlexanderReynolds kindly let me know your email ids, I will mail you the code along with the explanation.. The comment section allows a very little to be explained – Sami Bilal Aug 22 '17 at 00:03
  • @atru You should also give me your ID please – Sami Bilal Aug 22 '17 at 00:03
  • Email is not good as this is a public forum and others might have your same question in the future, so they should be able to see the code. Don't paste it into a comment, edit your original post and include it. And include a description along with the code. If you're just trying to average neighboring pixels, you'll just be doing a box filter convolution. Considering your function is doing some calculations with different angles, am I correct in assuming you are weighting the average by the angle of the neighboring pixels? – alkasm Aug 22 '17 at 00:06
  • what I'm trying to do is I wanna pass a line at 12 different angles through pixel i,j and calculate the avg intensity of pixels on each line (The line is supposed to be 15 pixels in length) from these 12 avg intensities, I wanna find the max value for example: I'm finding avg at angle 45 as below – Sami Bilal Aug 22 '17 at 00:34
  • `sum=image.getpixel(((i - 7), (j + 7))) + image.getpixel(((i - 6), (j + 6))) + image.getpixel( ((i - 5), (j + 5))) + image.getpixel( ((i - 4), (j + 4))) + image.getpixel(((i - 3), (j + 3))) + image.getpixel(((i - 2), (j + 2))) + image.getpixel( ((i - 1), (j + 1))) + image.getpixel((i, j)) + image.getpixel(((i + 1), (j - 1))) + image.getpixel( ((i + 2), (j - 2))) + image.getpixel(((i + 3), (j - 3))) + image.getpixel(((i + 4), (j - 4))) + image.getpixel( ((i + 5),(j-5)))+image.getpixel(((i+6),(j-6)))+image.getpixel(((i+7),(j-7))) avg_sum=sum/15` – Sami Bilal Aug 22 '17 at 00:35
  • Ok, I have edited the original post along with one of the functions called in the for loop, the rest of the functions are the same angle_0() – Sami Bilal Aug 22 '17 at 01:53
  • So since you need to use the neighboring pixels for each pixel, simple multithreading doesn't seem to work. Do you have a lot of images to process? Then you could parallelize the processing of individual images. Worked great for me when I needed to process larger amounts of files. If not then why is 120 s a problem? That's actually fast compared to many things out there. – atru Aug 23 '17 at 06:32
  • I do need to process an entire set of images. How do one parallelize that? – Sami Bilal Aug 23 '17 at 16:18
  • and it might be possible that multi threading works, because after all the processing I'm putting the new value of pixel in a new image of the same dimension as the original image meaning for a pixel i+1,j I don't need the new value of i,j. I need the old (original) value of the pixel. – Sami Bilal Aug 23 '17 at 16:36
  • Problem was that in simple multithreading (splitting the matrix chunks to operate on across threads) I started to send that matrix around and that just ruined the performance. Also, I was using multithreading module (not the multiprocessing one) and apparently threads in that module have a considerable overhead. But if you're processing a lot of images then it may be very simple, just process n of them on each thread. I can post an answer based on some matrix example and then if you're having trouble applying it, you can post more details of your code. – atru Aug 23 '17 at 16:58
  • How can I do That? – Sami Bilal Aug 24 '17 at 00:39
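The angle-45 sum quoted in the comments above can also be vectorized: summing 15 diagonally shifted copies of the whole array gives that line average at every pixel at once. A sketch, assuming the image is already a 2-D float NumPy array (the array and the names here are mine, not from the original code):

```python
import numpy as np

img = np.arange(400, dtype=float).reshape(20, 20)
k = 7  # half-length of the 15-pixel line

# For offset d, the double np.roll aligns pixel [row + d, col - d]
# with [row, col] -- in PIL (x, y) terms, pixel (i - d, j + d),
# i.e. the anti-diagonal summed in the comment above.
acc = np.zeros_like(img)
for d in range(-k, k + 1):
    acc += np.roll(np.roll(img, -d, axis=0), d, axis=1)
avg45 = acc / 15.0
```

Because `np.roll` wraps around, `avg45` is only valid at least k pixels away from every border, which matches the 15-pixel margin in the original loop.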

1 Answer


This is a simple solution that you can adapt to your problem. The program loads, processes, and saves the output for a series of files using multithreading: you split the processing of your images across multiple threads that, ideally, work on their batches of files in parallel. Here I split the processing of some .txt files named Matrix_#.txt.

Depending on the details of your problem this may or may not speed up your code - it may even slow it down, but you won't really know until you try it.

#!/usr/bin/python2.7

import time
import math
import numpy as np
import threading

def process_matrix(pathM, nameM, n0, nf):
    """ Operations applied by each thread
        for its range of files """
    # pathM is the path to files ending in /
    # or \ depending on the OS. Files have a
    # common name nameM with .txt extension
    # and numbers that go from 0 to nf
    for i in range(n0,nf+1):
        # Load as numpy arrays
        in_name = pathM + nameM + str(i) + '.txt'
        temp = np.loadtxt(in_name)
        # Process the data
        res = processed(temp)
        # Save the output
        out_name =  pathM + 'Out_' + nameM + str(i) + '.txt'
        np.savetxt(out_name, res)

def processed(M):
    """ Function with example operations
        on the data """
    # Add this to simulate processing time
    time.sleep(20)
    # Actual simple operations 
    res = np.zeros(3)
    res[0] = np.mean(M)
    res[1] = np.amax(M)
    res[2] = np.amin(M)
    return res

# Time it - at least to optimize thread number
start = time.time()

# Array with Thread objects
threads = []
# Number of threads
num_threads = 10
# Number of files to process
num_files = 10
# "Increment" for each thread
dnth = math.floor(num_files/num_threads)

# Distribute the processing across threads
for ith in range(0,num_threads):
    # Lower limit for file names for a given thread
    low_lim = int(ith*dnth+1)
    # Upper limit for file names (last thread gets all until 
    # the end)
    up_lim = int(num_files) if (ith == num_threads-1) else int((ith+1)*dnth)
    # Start a thread and append the resulting Thread object to the
    # threads list
    thread = threading.Thread(target=process_matrix, 
            args=('/Users/atru/Research/stack/multi_matrix/', 'Matrix_', 
            low_lim, up_lim))
    threads.append(thread)
    thread.start()

# This allows the program to wait until all
# threads finish the execution (i.e. reach this point)
for t in threads:
    t.join()

# Measure execution time and print it
end = time.time()
print(end - start)

This strategy worked great for me once: the processing sped up 10 times with 14 threads (the system had 6-8 processors, so that number was reasonable). I double-checked the performance just today: this particular program gets an almost 10x speed-up with 10 threads when run on 10 relatively small files on a 2-core, 4-thread processor. Since the example pauses (time.sleep) instead of doing real processing, this may not be a representative scenario; when running your version you should also check how it performs with a more reasonable number of threads, like 2 or 4.
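Rather than guessing the thread count, it can be sized from the machine; a small sketch (note that `cpu_count` reports hardware threads, not physical cores):

```python
import multiprocessing

num_files = 10
# Cap the thread count at the number of hardware threads, and
# never start more threads than there are files to process.
num_threads = min(multiprocessing.cpu_count(), num_files)
```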

It may also not work as intended due to various overheads and RAM issues (if a single file takes half of your RAM, two threads opening two files will fill it up). But since your processing takes about 2 minutes, your problem seems similar to mine, where this approach was successful.

Let me know if you have questions.

atru