
Again I have a question concerning large loops.

Suppose I have a function `limits`:

def limits(a, b):
    # evaluate the integral with lower limit a and upper limit b
    return result  # a float
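
For concreteness, here is a stand-in version; the actual integrand is not shown in the question, so the Gaussian below is purely illustrative:

import numpy as np
from scipy.integrate import quad

def limits(a, b):
    # Placeholder integrand, purely illustrative; the real one is not shown.
    integrand = lambda x: np.exp(-x**2)
    value, _abserr = quad(integrand, a, b)
    return value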

A and B are plain np.arrays that store my values a and b. Now I want to evaluate the integral 300,000^2/2 times, because A and B each have length 300,000 and the integral is symmetric, so each unordered pair is needed only once.

In Python I tried several approaches, such as itertools.combinations_with_replacement, to create the combinations of A and B and then feed them into the integral, but that takes a huge amount of time and completely overloads the memory. Is there any way, for example by moving the loop into another language, to speed this up?

I would like to run the loop

for i in range(len(A)):
    for j in range(len(B)):
        np.histogram(limits(A[i], B[j]))

I think histogramming the return value of limits is desirable in order not to store additional arrays that grow quadratically.

From what I read, Python is not really the best choice for iterative approaches like this.

So would it be reasonable to evaluate this loop in another language from within Python, and if so, how would I do it? I know there are ways to interface code, but I have never done it so far.

Thanks for your help.

madzone
  • `itertools.combinations` and friends produce iterators. Usually you loop directly over those rather than expanding them in memory (by converting them to a list, for example); you don't need all the intermediate results for random access, do you? – Martijn Pieters Jan 03 '13 at 10:58
  • 300k^2 / 2 = 45 billion. Just porting it to another language won't cut it, you'll need beefy hardware and clever optimizations to even get started. –  Jan 03 '13 at 11:01
  • @MartijnPieters, no I don't need the intermediate results, just the final (histogrammed) one. delnan, there is a piece of code in C that manages this in 4 hours in total. I would like to be similarly efficient. – madzone Jan 03 '13 at 11:09
  • @madzone: Then don't materialize the result of an iterator into a list; just loop over it and use each result as an intermediary, discarding it again at the end of the loop iteration. – Martijn Pieters Jan 03 '13 at 11:13
  • @MartijnPieters, may I ask you for a short snippet? – madzone Jan 03 '13 at 11:18
  • @madzone: just `for result in itertools.combinations(inputs, 2):` then use `result` in that loop, and don't add it to a list. Do not call `list(itertools.combinations(inputs, 2))` or use it in a list comprehension. – Martijn Pieters Jan 03 '13 at 11:19
  • @MartijnPieters, do you know a good way to use `itertools.combinations` if I want to mix two arrays? I only got it running with `product`. – madzone Jan 03 '13 at 12:41
  • @madzone: I know next to nothing about your problem space, that is way too vague a question for me to answer, and certainly not in comments. – Martijn Pieters Jan 03 '13 at 12:43
  • @MartijnPieters, my bad, the question was definitely too vague. Two arrays A and B with stored floats. I want to combine each element of A with each element of B like `itertools.product` does, but get rid of redundant combinations, since the function I use is symmetric, so `A[1]B[3]` is the same as `B[3]A[1]`. – madzone Jan 03 '13 at 12:48
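
A minimal sketch of the lazy, symmetric iteration the comments converge on; it assumes limits(a, b) == limits(b, a), so pairs with b < a can be skipped:

import itertools

# Walk all (a, b) pairs lazily; nothing is materialized in memory.
for a, b in itertools.product(A, B):
    if b < a:
        continue  # symmetric, already covered by the (b, a) pair
    result = limits(a, b)
    # use result immediately (e.g. bin it), then discard it

If A and B are actually the same array, itertools.combinations_with_replacement(A, 2) yields each unordered pair exactly once and makes the explicit skip unnecessary.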

1 Answer


If you're worried about memory footprint, all you need to do is bin the results as you go in the for loop.

import numpy as np

num_bins = 100
bin_upper_limits = np.linspace(-456, 456, num=num_bins - 1)
# (the last bin has no upper limit; it goes from 456 to infinity)
bin_count = np.zeros(num_bins)
for a in A:
    for b in B:
        if b < a:
            # you said the integral is symmetric, so we can skip these, right?
            continue
        new_result = limits(a, b)
        # digitize returns an array of bin indices; take its single entry
        which_bin = np.digitize([new_result], bin_upper_limits)[0]
        bin_count[which_bin] += 1

So nothing large is saved in memory.

As for speed, I imagine that the overwhelming majority of time is spent evaluating limits(a,b). The looping and binning is plenty fast in this case, even in Python. To convince yourself of this, try replacing the line `new_result = limits(a,b)` with `new_result = 234`. You'll find that the loop runs very fast. (A few minutes on my computer, much, much less than the 4-hour figure you quote.) Python does not loop very fast compared to C, but it doesn't matter in this case.
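
To make that experiment concrete, here is a quick timing sketch; the array sizes are deliberately small stand-ins so the measurement finishes quickly, and you can scale the measured time up to your real array sizes:

import time
import numpy as np

A = np.random.rand(1000)  # small stand-ins; the real arrays have 300,000 entries
B = np.random.rand(1000)
bin_upper_limits = np.linspace(-456, 456, num=99)
bin_count = np.zeros(100)

start = time.time()
for a in A:
    for b in B:
        new_result = 234  # stand-in for limits(a, b)
        which_bin = np.digitize([new_result], bin_upper_limits)[0]
        bin_count[which_bin] += 1
print(time.time() - start, "seconds for", len(A) * len(B), "iterations")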

Whatever you do to speed up the limits() call (including implementing it in another language) will speed up the program.

If you change the algorithm, there is vast room for improvement. Let's take an example of what it seems you're doing. Let's say A and B are 0,1,2,3. You're integrating a function over the ranges 0-->0, 0-->1, 1-->1, 1-->2, 0-->2, etc. etc. You're re-doing the same work over and over. If you have integrated 0-->1 and 1-->2, then you can add up those two results to get the integral 0-->2. You don't have to use a fancy integration algorithm, you just have to add two numbers you already know.

Therefore it seems to me that you can compute integrals in all the smallest ranges (0-->1, 1-->2, 2-->3), store the results in an array, and add subsets of the results to get the integral over whatever range you want. If you want this program to run in a few minutes instead of 4 hours, I suggest thinking through an alternative algorithm along those lines.
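
A minimal sketch of that idea; it assumes every limit you will ever ask about appears in A or B, so only the cumulative sums need to be kept:

import numpy as np

# All distinct endpoints, sorted: at most 600,000 values for the
# 300,000-element arrays, instead of ~45 billion pairs.
points = np.union1d(A, B)

# One expensive integration per adjacent pair of endpoints.
pieces = [limits(points[k], points[k + 1]) for k in range(len(points) - 1)]

# cumulative[k] = integral from points[0] to points[k]
cumulative = np.concatenate(([0.0], np.cumsum(pieces)))

def integral_between(a, b):
    # Assemble any requested integral from the precomputed pieces.
    i = np.searchsorted(points, min(a, b))
    j = np.searchsorted(points, max(a, b))
    return cumulative[j] - cumulative[i]

With that, the quadrature routine runs roughly 600,000 times instead of ~45 billion; every further pair costs only a subtraction, which the binning loop above can consume directly.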

(Sorry if I'm misunderstanding the problem you're trying to solve.)

Steve Byrnes
  • Thanks for answering. I use `np.histogram` to directly histogram my output, similar to your suggestion. Still, I think the loop causes the problem, not the simple function it operates on. I have been trying for a month to speed this up, but failed in Python. – madzone Feb 01 '13 at 16:16
  • Is there a way to parallelize this, like MPI for `C`, or to evaluate just the loop in another language? – madzone Feb 01 '13 at 16:17