1

I've got an optimization problem in which I need to minimize the sum product of two arrays of unequal length while keeping both in their original order, say:

A = [1, 2, 3]
B = [4, 9, 5, 3, 2, 10]

Shuffling of values is not allowed, i.e. the relative order of the elements in both arrays must stay the same. In other words, the elements of A are matched, in order, to a subsequence of B so that the sum of products is minimal.

Or: given that len(B) >= len(A), minimize the sum product of the n values of array A with n values chosen from array B, without changing the order of either A or B.

In this case, the minimum would be:

min_sum = 1*4 + 2*3 + 3*2 = 16

A brute force approach to this problem would be:

from itertools import combinations

sums = [sum(a*b for a,b in zip(A,b)) for b in combinations(B,len(A))]
min_sum = min(sums)

However, I need to do this for many sets of arrays. I see a lot of overlap with the knapsack problem, and I have the feeling that it should be solved with dynamic programming. I am stuck, however, on how to write an efficient algorithm for this.

Any help would be greatly appreciated!

AnsFourtyTwo
  • You should be careful with what you call an array/list in python. The variables `A` and `B` in your example are `set`s. Make sure you know the difference! – AnsFourtyTwo Jan 25 '21 at 11:57
  • As @AnsFourtyTwo mentioned, you need to declare an array with [ ]. See https://stackoverflow.com/questions/44002239/how-to-get-the-two-smallest-values-from-a-numpy-array for how to get the n smallest values of the B array. Then you can multiply each of those by the maximum value from the A array until you run out of numbers. – Rob Py Jan 25 '21 at 12:06

6 Answers

1

Having two lists

A = [1, 2, 3]
B = [4, 9, 5, 3, 2, 10]

the optimal sum product can be found using:

min_sum = sum(a*b for a,b in zip(sorted(A), sorted(B)[:len(A)][::-1]))

In case A is always given sorted, this simplified version can be used:

min_sum = sum(a*b for a,b in zip(A, sorted(B)[:len(A)][::-1]))

The important part(s) to note:

  • You need factors of A sorted. sorted(A) will do this job, without modifying the original A (in contrast to A.sort()). In case A is already given sorted, this step can be left out.
  • You need the N lowest values from B, where N is the length of A. This can be done with sorted(B)[:len(A)]
  • In order to evaluate the minimal sum of products, you need to multiply the highest number of A with the lowest of B, the second highest of A with the second lowest of B, and so on. That is why, after getting the N lowest values of B, the order gets reversed with [::-1].

Output

print(min_sum)
# 16
print(A)
# [1, 2, 3]              <- The original list A is not modified
print(B)
# [4, 9, 5, 3, 2, 10]    <- The original list B is not modified
AnsFourtyTwo
  • Thanks for your answer! Unfortunately, in sorting A, the requirement of the ordered array of A is violated. Say for example `A = [1,3,2]`. Then the solution would be: `1*4 + 3*3 + 2*2 = 17`. However, in this solution you would sort A to `A = [1,2,3]` which still results in 16 and is not allowed. Maybe I'm missing something here, but it seems like it's almost there but not fully yet. Sorry if I didn't formulate the problem clearly enough. – Jules van Dijk Jan 25 '21 at 12:49
  • If `A = [1, 3, 2]` is given, what is the expected result? `16` or `17`? Even without sorting of `A`, you can calculate a minimum via `1*4 + 3*2 + 2*3`. – AnsFourtyTwo Jan 25 '21 at 16:26
  • Oh wow, i think i just fully understood your problem. Hard to grasp on first read. – AnsFourtyTwo Jan 25 '21 at 16:29
0

With Python, you can easily sort and reverse lists. The code you are looking for is

A, B = sorted(A), sorted(B)[:len(A)]
min_sum = sum([a*b for a,b in zip(A, B[::-1])])
morhc
0

You may need to get the values one by one from B, and keep the order of the list by having each value assigned to a key.

A = [1, 3, 2]
B = [4, 9, 5, 3, 2, 10]

#create a new dictionary with key value pairs of B array values
new_dict = {}
j=0
for k in B:
    new_dict[j] = k
    j+= 1


#create a new list of the smallest values in B up to length of array A
min_Bmany =[]
for lp in range(0,len(A)):
    #get the smallest remaining value from dictionary new_dict
    rmvky=  min(zip(new_dict.values(), new_dict.keys()))

    #append this item to minimums list
    min_Bmany.append((rmvky[1],rmvky[0]))
    #delete this key from the dictionary new_dict
    del new_dict[rmvky[1]]

#sort the list by the keys(instead of the values)
min_Bmany.sort(key=lambda r: r[0])

#create list of only the values, but still in the same order as they are in original array
min_B =[]
for z in min_Bmany:
    min_B.append(z[1])


print(A)
print(min_B)

ResultStr = ""
Result = 0

#Calculate the result
for s in range(0,len(A)):
    ResultStr = ResultStr + str(A[s]) +"*" +str(min_B[s])+ " + "
    Result = Result + A[s]*min_B[s]

print(ResultStr)
print("Result = ",Result)

The output will be as follows:

[1, 3, 2]
[4, 3, 2]
1*4 + 3*3 + 2*2 + 
Result =  17

Changing A to [1, 2, 3], the output becomes:

[1, 2, 3]
[4, 3, 2]
1*4 + 2*3 + 3*2 + 
Result =  16
Rob Py
0

Not sure if this is helpful, but anyway.

This can be formulated as a mixed-integer programming (MIP) problem. Basically, an assignment problem with some side constraints.

  min sum((i,j),x(i,j)*a(i)*b(j))
      sum(j, x(i,j)) = 1    ∀i        "each a(i) is assigned to exactly one b(j)"
      sum(i, x(i,j)) ≤ 1    ∀j        "each b(j) can be assigned to at most one a(i)"
      v(i) = sum(j, j*x(i,j))         "position of each a(i) in b"
      v(i) ≥ v(i-1)+1       ∀i>1      "maintain ordering"
      x(i,j) ∈ {0,1}                  "binary variable"       
      v(i) ≥ 1                        "continuous (or integer) variable"

Example output:

----     40 VARIABLE z.L                   =       16.000  

----     40 VARIABLE x.L  assign

            j1          j4          j5

i1       1.000
i2                   1.000
i3                               1.000


----     40 VARIABLE v.L  position of a(i) in b

i1 1.000,    i2 4.000,    i3 5.000
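To make the constraints concrete, here is a small sketch in plain Python (no solver involved) that takes a candidate position vector v, rebuilds the binary x(i,j) from it, and checks each constraint of the model above. The function name `check_assignment` and its return shape are my own choices for illustration, not part of the GAMS model.

```python
def check_assignment(A, B, positions):
    """Check a candidate solution against the model's constraints.

    positions[i] is the 1-based index j in B assigned to A[i], i.e. v(i).
    Returns (feasible, objective).
    """
    n, m = len(A), len(B)
    # x[i][j] = 1 if A[i] is assigned to B[j], else 0
    x = [[1 if positions[i] == j + 1 else 0 for j in range(m)] for i in range(n)]

    # each a(i) is assigned to exactly one b(j)
    each_a_once = all(sum(row) == 1 for row in x)
    # each b(j) is assigned to at most one a(i)
    each_b_at_most_once = all(sum(x[i][j] for i in range(n)) <= 1 for j in range(m))
    # v(i) = sum(j, j*x(i,j)) and v(i) >= v(i-1) + 1
    v = [sum((j + 1) * x[i][j] for j in range(m)) for i in range(n)]
    ordered = all(v[i] >= v[i - 1] + 1 for i in range(1, n))

    objective = sum(x[i][j] * A[i] * B[j] for i in range(n) for j in range(m))
    return each_a_once and each_b_at_most_once and ordered, objective


print(check_assignment([1, 2, 3], [4, 9, 5, 3, 2, 10], [1, 4, 5]))  # (True, 16)
```

Swapping the last two positions to `[1, 5, 4]` breaks the ordering constraint, so the check reports the assignment as infeasible.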

Cute little MIP model.

Just as an experiment I generated a random problem with len(a)=50 and len(b)=500. This leads to a MIP with 650 rows and 25k columns. Solved in 50 seconds (to proven global optimality) on my slow laptop.

Erwin Kalvelagen
  • This is very helpful! I expected it to be a MIP problem. I was currently exploring greedy algorithms such as Dijkstra's Algorithm to improve the time complexity. [link](https://www.geeksforgeeks.org/python-program-for-dijkstras-shortest-path-algorithm-greedy-algo-7/). Your model is very helpful, however I do not have a lot of experience in coding MIP problems. I have played around a little with API's such as GUROBI but I'm not quite sure how to implement your code here. Would I require some third-party optimization module? I'm curious to how you implemented the model. Thank you!!! – Jules van Dijk Jan 25 '21 at 20:47
  • I used GAMS+Cplex. However, the model is simple enough that you can use any solver or modeling tool. I don't see anything that would make it particularly difficult with the Gurobi Python API. – Erwin Kalvelagen Jan 25 '21 at 21:03
  • I am currently implementing the MIP through gurobi. However, I'm having some trouble defining the ordering constraint. Here's what I'm working with so far: `#Binary variable x = m.addVars(total_cost, name='assign') # Constraints c1 = m.addConstrs((x.sum('*',a) ==1 for a in A_vals), 'a') #each A(a) assigned to exactly 1 B(b) c2 = m.addConstrs((x.sum(b,'*') <=1 for b in B_keys), 'b') #each B(b) can be assigned to at most one A(a) # Define objective function m.setObjective(x.prod(total_cost), GRB.MINIMIZE)` Any chance you could nudge me in the right direction? – Jules van Dijk Jan 26 '21 at 22:28
  • Just use a loop for that. – Erwin Kalvelagen Jan 26 '21 at 23:29
0

It turns out that using a shortest path algorithm on a directed graph is pretty fast. Erwin wrote a post showing a MIP model. As you can see in the comments section there, a few of us independently tried shortest path approaches, and on examples with 100 for the length of A and 1000 for the length of B we get optimal solutions in the vicinity of 4 seconds.

prubin
0

The graph can look like:

[image: directed graph from src to snk through nodes n(i,j)]

Nodes are labeled n(i,j) indicating that visiting the node means assigning a(i) to b(j). The costs a(i)*b(j) can be associated with any incoming (or any outgoing) arc. After that calculate the shortest path from src to snk.
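Because every arc in such a graph goes from a(i) to a(i+1) and from a smaller j to a larger j, the graph is acyclic, so the shortest path can also be computed with a simple table instead of a general shortest path routine. Below is a minimal sketch of that idea in plain Python. It is my own illustration, not the code used in the experiments mentioned in this thread, and the function name `min_ordered_sum_product` is made up for the example.

```python
def min_ordered_sum_product(A, B):
    """Minimize sum(A[i] * B[j_i]) over index choices j_0 < j_1 < ... in B.

    Returns (minimum, list of chosen 0-based indices into B).
    """
    n, m = len(A), len(B)
    if n > m:
        raise ValueError("B must be at least as long as A")
    INF = float("inf")
    # cost[i][j]: cheapest way to place the first i values of A
    # using only the first j values of B
    cost = [[0] * (m + 1)] + [[INF] * (m + 1) for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(i, m + 1):
            skip = cost[i][j - 1]                              # B[j-1] stays unused
            take = cost[i - 1][j - 1] + A[i - 1] * B[j - 1]    # A[i-1] paired with B[j-1]
            cost[i][j] = min(skip, take)

    # Walk back through the table to recover which B indices were used
    picks, i, j = [], n, m
    while i > 0:
        if cost[i][j] == cost[i][j - 1]:
            j -= 1
        else:
            picks.append(j - 1)
            i -= 1
            j -= 1
    return cost[n][m], picks[::-1]


print(min_ordered_sum_product([1, 2, 3], [4, 9, 5, 3, 2, 10]))  # (16, [0, 3, 4])
print(min_ordered_sum_product([1, 3, 2], [4, 9, 5, 3, 2, 10]))  # (17, [0, 3, 4])
```

The table has (len(A)+1) * (len(B)+1) entries and each one is filled in constant time, so the running time grows with len(A) * len(B).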

BTW can you tell a bit about the background of this problem?

Erwin Kalvelagen