13

I am stuck with a problem and could not find much help online. I need to find the minimum cost combination of numbers from multiple vectors of numbers. The vector size is same for all vectors. For example, consider the following :

row [0]:  a  b  c  d   
row [1]:  e  f  g  h  
row [2]:  i  j  k  l  

Now I need to find the combination of numbers by taking one element from each row i.e vector, eg: aei
After this, i need to find other three combinations that do not intersect with one another, eg: bfj, cgk, dhl. I calculate the cost based on these four combination chosen. The goal is to find the combination that gives the minimum cost. Another possible combination can be: afj, bei, chk, dgl. If the total number of columns is d and rows is k, the total combination possible is d^k. The rows are stored as vectors. I am stuck here, I am finding it hard to write an algorithm for the above process. I would really appreciate if somebody could help.
Thanks.

// I am still working on the algorithm. I just have the vectors and the cost function.  

//Cost Function  , it also depends on the path chosen
float cost(int a, int b, PATH to_a) {  
float costValue;  
...  
...  
return costValue;  
}  

vector< vector < int > > row;  
//populate row  
...   
...
//Suppose  

//    row [0]:  a  b  c  d   
//    row [1]:  e  f  g  h  
//    row [2]:  i  j  k  l   

// If a is chosen from row[0] and e is chosen from row[1] then,  
float subCost1 = cost(a,e, path_to_a);  

// If i is chosen from row[2] ,  
float subCost2 = cost(e,i,path_to_e);  

// Cost for selecting aei combination is  
float cost1 = subCost1 + subCost2;  

//similarly other three costs need to be calculated by selecting other remaining elements  
//The elements should not intersect with each other eg. combinations aei and bej cannot exist on the same set.  

//Suppose the other combinations chosen are bfj with cost cost2, cgk with cost cost3   and dhl with cost cost4  
float totalCost = cost1 + cost2 + cost3 + cost4;   

//This is the cost got from one combination. All the other possible combinations should be enumerated to get the minimum cost combination. 
coolcav
  • 131
  • 1
  • 6
  • 1
    If you're only looking for C++ answers as the title implies why tag it with C too? – Flexo Sep 12 '11 at 17:20
  • Is this a homework problem? If so it should be tagged as such – GWW Sep 12 '11 at 17:21
  • 1
    If they're stored as vectors, the `C` tag is probably not appropriate – Mooing Duck Sep 12 '11 at 17:22
  • This seems to be a costbased graph routing problem. If you edit to show some of the code you already have, we might have a few starts for you. – sehe Sep 12 '11 at 17:28
  • @awoodland Well, if there is an algorithm for c, it can be applied to c++ as well. So I initially tagged C. – coolcav Sep 12 '11 at 17:33
  • @GWW This is not a homework problem. I encountered this problem during my research work. – coolcav Sep 12 '11 at 17:35
  • @ MStodd I meant to say multiple vectors of numbers. I am editing it now. – coolcav Sep 12 '11 at 17:38
  • @coolcav: I see that you made the cost function a dynamic function. Can we assume that the cost between two given nodes is a constant (i.e. is the cost function deterministic)? – sehe Sep 12 '11 at 18:58
  • @coolcav: can the cost between nodes be a negative value? – sehe Sep 12 '11 at 20:15
  • @coolcav. Is there anything we know about the cost function other than that it needs the full path? Being able to create bounds (non-negative link cost being the simplest) would allow some ability to prune the search space. Also, do you absolutely need the true optimimum, or can you live with some probable result? – Keith Sep 14 '11 at 01:26
  • @Keith: as long as no constraints are going to be found, he is going to _have to live_ with an probable result (or no result). See the complexity stats in my answer. – sehe Sep 14 '11 at 15:14
  • @sehe. That's my point - knowing whether a probable result is any better than no result will determine whether its worth exploring such approaches. – Keith Sep 15 '11 at 00:04
  • is the function cost linear? if so, you can probably find an analytic solution. – Alessandro Teruzzi Sep 15 '11 at 14:07

5 Answers5

15

Posting more utility code

see github: https://gist.github.com/1233012#file_new.cpp

This is basically an much better approach to generating all possible permutations based on much simpler approach (therefore I had no real reason to post it before: As it stands right now, it doesn't do anything more than the python code).

I decided to share it anyways, as you might be able to get some profit out of this as the basis for an eventual solution.

Pro:

  • much faster
    • smarter algorithm (leverages STL and maths :))
    • instruction optimization
    • storage optimization
  • generic problem model
  • model and algorithmic ideas can be used as basis for proper algorithm
  • basis for a good OpenMP parallelization (n-way, for n rows) designed-in (but not fleshed out)

Contra:

  • The code is much more efficient at the cost of flexibility: adapting the code to build in logic about the constraints and cost heuristics would be much easier with the more step-by-step Python approach

All in all I feel that my C++ code could be a big win IFF it turns out that Simulated Annealing is appropriate given the cost function(s); The approach taken in the code would give

  • a highly efficient storage model
  • a highly efficient way to generate random / closely related new grid configurations
  • convenient display functions

Mandatory (abritrary...) benchmark data point (comparison to the python version:)

  a  b  c  d e
  f  g  h  i j
  k  l  m  n o
  p  q  r  s t

Result: 207360000

real  0m13.016s
user  0m13.000s
sys   0m0.010s

Here is what we got up till now:

  • From the description I glean the suggestion that you have a basic graph like enter image description here

  • a path has to be constructed that visits all nodes in the grid (Hamiltonian cycle).

  • The extra constraint is that subsequent nodes have to be taken from the next rank (a-d, e-h, i-l being the three ranks; once a node from the last rank was visited, the path has to continue with any unvisited node from the first rank

  • The edges are weighted, in that they have a cost associated. However, the weight function is not traditional for graph algorithms in that the cost depends on the full path, not just the end-points of each edge.

In the light of this I believe we are in the realm of 'Full Cover' problems (requiring A* algorithm, most famous from Knuths Dancing Links paper).

Specifically Without further information (equivalence of paths, specific properties of the cost function) the best known algorithm to get the 'cheapest' hamiltonian path that satisfies the constraints will be to

  • generate all possible such paths
  • calculate the actual cost function for each
  • choose the minimum cost path

Which is why I have set off and wrote a really dumb brute force generator that generates all the unique paths possible in a generic grid of NxM.

The End Of The Universe

Output for the 3×4 sample grid is 4!3 = 13824 possible paths... Extrapolating that to 6×48 columns, leads to 6!48 = 1.4×10137 possibilities. It is very clear that without further optimization the problem is untractible (NP Hard or something -- I never remember quite the subtle definitions).

The explosion of runtime is deafening:

  • 3×4 (measured) takes about 0.175s
  • 4×5 (measured) took about 6m5s (running without output and under PyPy 1.6 on a fast machine)
  • 5×6 would take roughly 10 years and 9+ months...

At 48x6 we would be looking at... what... 8.3x10107 years (read that closely)

See it live: http://ideone.com/YsVRE

Anyways, here is the python code (all preset for 2×3 grid)

#!/usr/bin/python
ROWS = 2
COLS = 3

## different cell representations
def cell(r,c): 
    ## exercise for the reader: _gues_ which of the following is the fastest
    ## ...
    ## then profile it :)
    index = COLS*(r) + c
    # return [ r,c ]
    # return ( r,c )
    # return index
    # return "(%i,%i)" % (r,c)

    def baseN(num,b,numerals="abcdefghijklmnopqrstuvwxyz"):
        return ((num == 0) and numerals[0]) or (baseN(num // b, b, numerals).lstrip(numerals[0]) + numerals[num % b])

    return baseN(index, 26)

ORIGIN = cell(0,0)

def debug(t): pass; #print t
def dump(grid): print("\n".join(map(str, grid)))

def print_path(path):
    ## Note: to 'normalize' to start at (1,1) node:
    # while ORIGIN != path[0]: path = path[1:] + path[:1] 
    print " -> ".join(map(str, path))

def bruteforce_hamiltonians(grid, whenfound):
    def inner(grid, whenfound, partial):

        cols = len(grid[-1]) # number of columns remaining in last rank
        if cols<1:
            # assert 1 == len(set([ len(r) for r in grid ])) # for debug only
            whenfound(partial)                             # disable when benchmarking
            pass
        else:
            #debug(" ------ cols: %i ------- " % cols)

            for i,rank in enumerate(grid):
                if len(rank)<cols: continue
                #debug("debug: %i, %s (partial: %s%s)" % (i,rank, "... " if len(partial)>3 else "", partial[-3:]))
                for ci,cell in enumerate(rank):
                    partial.append(cell)
                    grid[i] = rank[:ci]+rank[ci+1:] # modify grid in-place, keeps rank

                    inner(grid, whenfound, partial)

                    grid[i] = rank # restore in-place
                    partial.pop()
                break
        pass
    # start of recursion
    inner(grid, whenfound, [])

grid = [ [ cell(c,r) for r in range(COLS) ] for c in range(ROWS) ]

dump(grid)

bruteforce_hamiltonians(grid, print_path)
sehe
  • 374,641
  • 47
  • 450
  • 633
  • 1
    nice charts. How did you make them? – Mooing Duck Sep 12 '11 at 18:08
  • @Mooing Duck: added source [here](https://gist.github.com/1211974) -- I use vimdot to draft them – sehe Sep 12 '11 at 18:15
  • @sehe: Yes, the graph you drew corresponds to my problem. But I have not yet created this graph structure, I just have the vector of vectors and the function to calculate the cost as I have written in the code. The goal is to find the set of minimum cost path form a, b, c, d to all the leaves i, j, k, l and the paths should not intersect. In the problem I am dealing with there are maximum 48 columns and 6 rows. – coolcav Sep 12 '11 at 19:10
  • @coolcav: Two questions: **(a)** are the letters of significance or are they meaningless placeholders (could you use matrix coordinates instead)? **(b)** by intersect, do you mean that the connecting edges cross or do you mean the paths overlap in that a single node is visited more than once? -- Oh, PS: **(c)** _and the question about the cost function posted earlier_ – sehe Sep 12 '11 at 19:13
  • @sehe: in my implementation, the letters represent integers which have mapping to a corresponding vector. The corresponding vectors associated with these numbers are used to calculate the cost. For example, "a" may be mapped to [5,2,1,4] and "e" may be mapped to [4,3,2,1]. These two vectors are used to calculate the cost of edge connecting a and b. I hope this answers (a) and (c). – coolcav Sep 12 '11 at 19:24
  • @sehe: By intersect, i mean that for a combination, the single node should not be visited more than once. For example: one combination can be: aej,bfi,chl,dgk with the total cost C1. The other combination can be: afk, bei, cgj, dhl with cost C2. The combination with lower cost is better. I hope this answers (b). – coolcav Sep 12 '11 at 19:31
  • I forgot to mention, the cost also depends on the previous node chosen, i.e, consider the path a-e-i and b-e-i. The cost of the edge e-i in these cases when the parent nodes are different, may be different. This condition makes things more complicated. – coolcav Sep 12 '11 at 20:16
  • @sehe: Ya, that was my error. I tried to generalize the problem and missed some important things out. Consider the path a-e-i. These numbers a, e and i have certain vectors associated with them which is used to calculate the cost. After the cost calculation of the path a-e, the vectors associated with a and e are modified. Hence, the vectors for "e" won't be same for paths a-e-i and b-e-i. So, the cost for e-i path will be different in this case. – coolcav Sep 12 '11 at 20:51
  • @coolcav: I will still try to help out, but all optimization bets are off if the cost function is not deterministic. Perhaps tomorrow :) – sehe Sep 12 '11 at 21:20
  • @Sehe: The cost function is not non-deterministic. Given the previous path and the next node, the cost is always the same. To find the cost for new set of combinations, the original values of the vectors associated with the nodes are used. For example, for a combination :{ aei, bfj, cgk, dhl } , there is a certain cost and applying cost function changes the vectors associated with these numbers. However, to calculate the next set of combination, suppose {afj, bei, cgk, dhl} all the vectors are reset to the original values. The cost for path cgk will be same for both the combination. – coolcav Sep 12 '11 at 23:20
  • Updated answer summarizing problem description as I understood it now. (_removed old discussion_), added algorithm demo – sehe Sep 14 '11 at 01:08
  • Added python version of the algorithm - mainly out of guilt for overcomplicating the C++ version :) See it running: http://ideone.com/YsVRE – sehe Sep 14 '11 at 01:35
  • Ok, got rid of the C++, got the algorithm fixed in python and the complexity stats... Not good news. You need a heuristic on what paths to choose based on knowledge of the cost function and/or reduce the solution space by e.g. knowing which paths are equivalent beforehand – sehe Sep 14 '11 at 14:51
  • Wish I could give you more than 1 point for this answer. You've put some real time into it and I find it very enlightening! – Michael Dorgan Sep 14 '11 at 15:40
  • Thanks sehe, now i am working on generating a heuristic to solve the problem. Well, i initially wanted to see if i could generate optimal solution this way, but all the supercomputers running in parallel would also take a lot of years :) – coolcav Sep 15 '11 at 21:33
  • @thiton: oops/it sure sounded impressive, now didn't it () – sehe Sep 16 '11 at 19:57
  • @coolcav: sharing some more code that I didn't previously upload because it doesn't do anything over the python code (besides being way faster). Decided to share because it _might_ be useful when building a solution as suggested by others. I explain the pro's and contra's in the body edit. – sehe Sep 21 '11 at 19:25
6

First, one observation that helps very slightly.

I think the 4!^3 result does not capture the fact that { aei, bfj, cgk, dhl } and (for example) { bfj, aei, cgk, dhl } have the same cost.

What this means is that we only need to consider sequences of the form

{ a??, b??, c??, d?? }

This equivalence cuts the number of distinct cases by 4!

On the other hand, @sehe has 3x4 gives 4!^3 (I agree), so similarly 6x48 requires 48!^6. Of these “only” 48!^5 are distinct. This is now 2.95 × 10^305.

Using the 3x4 example, here is a start on an algorithm which gives some sort of answer.

Enumerate all the triplets and their costs. 
Pick the lowest cost triplet.
Remove all remaining triplets containing a letter from that triplet.
Now find the lowest cost triplet remaining.
And so on.

Note that is not a full exhaustive search.

I also see from this is that this is still a lot of computation. That first pass still requires 48^6 (12,230, 590, 464) costs to be computed. I guess that can be done, but will take a lot of effort. The subsequent passes will be cheap in comparison.

Keith
  • 6,756
  • 19
  • 23
  • You're thinking in exactly the same direction as I am. However, I was going to _ask_ the OP whether the cost function actually operates on separate 'vericals' only. +1 for posting the idea before I did :) – sehe Sep 15 '11 at 21:43
  • @sehe. I was working from the OP comment "The cost for path cgk will be same for both the combination" – Keith Sep 16 '11 at 00:07
  • +2 This answer would have my vote for the bounty: I haven't found the time to do any leg work on this (I might have been more motivated if I were in fact convinced that the assumptions hold; I'd think a prospective programmer would have to work more closely with the problem owner before embarking on the coding) – sehe Sep 21 '11 at 18:53
4

EDIT: ADDED A COMPLETE SOLUTION

As the other answers have already pointed out you problem is too difficult to be face with brute force. The starting point of this kind of problem is always Simulated annealing. I have create a small application that implement the algorithm.

Another way to see your problem is to minimize a complex function. Also you have an extra constraint on the possible solution. I start with a random valid configuration (that meet your constraints), then I modified that random solution changing an element per time. I force the application to perform just valid transition. The code is pretty clear.

I have created a template function, so you need just to provide the necessary function objects and structure.

#include <iostream>
#include <cmath>
#include <ctime>
#include <vector>
#include <algorithm>
#include <functional>

//row [0]:  c00  c01  c02  c03   
//row [1]:  c10  c11  c12  c13  
//row [2]:  c20  c21  c22  c23 


typedef std::pair<int,int> RowColIndex;
// the deeper vector has size 3 (aei for example)
// the outer vector has size 4
typedef std::vector<std::vector<RowColIndex> > Matrix;

size_t getRandomNumber(size_t up)
{
    return rand() % up;
}

struct Configuration
{
    Configuration(const Matrix& matrix) : matrix_(matrix){}
    Matrix matrix_;
};

std::ostream& operator<<(std::ostream& os,const Configuration& toPrint)
{
    for (size_t row = 0; row < toPrint.matrix_.at(0).size(); row++)
    {
        for (size_t col = 0; col < toPrint.matrix_.size(); col++)
        {
            os << toPrint.matrix_.at(col).at(row).first  << "," 
               << toPrint.matrix_.at(col).at(row).second << '\t';
        }
        os << '\n';
    }   
    return os;
}

struct Energy 
{ 
    double operator()(const Configuration& conf)
    {
        double result = 0;
        for (size_t col = 0; col < conf.matrix_.size(); col++)
        {
            for (size_t row =0; row < conf.matrix_.at(col).size(); row++)
            {
                result += pow(static_cast<double>(row) - static_cast<double>(conf.matrix_.at(col).at(row).first),2) +
                          pow(static_cast<double>(col) - static_cast<double>(conf.matrix_.at(col).at(row).second),2);
            }
        }
        return result;
    }
};

size_t calculateNewColumn(std::vector<int>& isAlreadyUse)
{
    size_t random;
    do
    {
        random = getRandomNumber(isAlreadyUse.size());
    }
    while (isAlreadyUse.at(random) != 0);

    isAlreadyUse.at(random) = 1;
    return random;
}

Configuration createConfiguration(size_t numberOfRow,size_t numberOfColumn)
{
    //create suitable matrix
    Matrix matrix;
    //add empty column vector
    for (size_t col = 0; col < numberOfColumn; col++)
        matrix.push_back(std::vector<RowColIndex>());

    //loop over all the element
    for (size_t row = 0; row < numberOfRow; row++)
    {
        std::vector<int> isAlreadyUse(numberOfColumn);
        for (size_t col = 0; col < numberOfColumn; col++)
        {
            size_t newCol = calculateNewColumn(isAlreadyUse);
            matrix.at(newCol).push_back(std::make_pair(row,col));
        }
    }   

    return Configuration(matrix);
}


struct CreateNewConfiguration
{
    Configuration operator()(const Configuration& conf)
    {
        Configuration result(conf);

        size_t fromRow = getRandomNumber(result.matrix_.at(0).size());

        size_t fromCol = getRandomNumber(result.matrix_.size());
        size_t toCol = getRandomNumber(result.matrix_.size());

        result.matrix_.at(fromCol).at(fromRow) = conf.matrix_.at(toCol).at(fromRow);
        result.matrix_.at(toCol).at(fromRow) = conf.matrix_.at(fromCol).at(fromRow);

        return result;
    }
};

template<typename Conf,typename CalcEnergy,typename CreateRandomConf>
Conf Annealing(const Conf& start,CalcEnergy energy,CreateRandomConf createNewConfiguration,
               int maxIter = 100000,double minimumEnergy = 1.0e-005)
{
    Conf first(start);
    int iter = 0;
    while (iter < maxIter && energy(first) > minimumEnergy )
    {
        Configuration newConf(createNewConfiguration(first));
        if( energy(first) > energy(newConf))
        {
            first = newConf;
        }
        iter++;
    }
    return first;
}

int main(int argc,char* argv[])
{

    size_t nRows = 25;
    size_t nCols = 25;
    std::vector<Configuration> res;
    for (int i =0; i < 10; i++)
    {
        std::cout << "Configuration #" << i << std::endl;
        Configuration c = createConfiguration(nRows,nCols);
        res.push_back(Annealing(c,Energy(),CreateNewConfiguration()));
    }

    std::vector<Configuration>::iterator it = res.begin();


    std::vector<Configuration>::iterator lowest = it;
    while (++it != res.end())
    {
        if (Energy()(*it) < Energy()(*lowest))
            lowest = it;
    }

    std::cout << Energy()(*lowest) << std::endl;

    std::cout << std::endl;

    std::cout << *lowest << std::endl;


    std::cin.get();
    return 0;
}

Of course you have no guarantee that the solution is the best one (it is an heuristic method). However it is a good starting point.

You haven't provide a complete function cost, so I have implement my own one that allows me to simply check the final result. You need just to provide the function cost and the job is done.

You will probably make the program more efficient, there is plenty of room for improvement, but the logic is there and you can implement your function easily enough.

Complexity

The complexity of the algorithm is E*I*C where I = number of iterations C = number of random configuration (to avoid local minimum) E = calculation of energy function (or function cost)

In this case E is actually N*M where N and M are the dimensions of the initial matrix.

If you are not happy with the Simulated annealing result you may try genetic algorithms.

sehe
  • 374,641
  • 47
  • 450
  • 633
Alessandro Teruzzi
  • 3,918
  • 1
  • 27
  • 41
  • +1, Definitely use simulated annealing or genetic algorithms for this problem. – TiansHUo Sep 21 '11 at 04:26
  • As of now I don't think the sample is simple, nor complete. The cost function is _as of yet_ unspecified. I can't assume that it is suitable for SA. **If so, this is a very good bet** (the implementation could be more efficient). However for now see [WP](http://en.wikipedia.org/wiki/Simulated_annealing#Selecting_the_parameters): the problem is that there really isn't any logical `neighbour` generator (the energy function might be discrete or even singular at transitions randomly chosen. Again, applying heuristics from the problem domain could completely bridge those gaps, if they were known – sehe Sep 21 '11 at 19:04
  • I did my best with the amount of information available, I know that the implementation is not efficient however I thought it is better be clear than fast. The purpose of the code was to show an algorithm. – Alessandro Teruzzi Sep 22 '11 at 07:35
3

You can solve the problem recursively.

The input to the method is the index of the first vector to compute, and the vector is shared outside the function.

For the case there are two rows remaining, solution can be computed using backtracking. In this case, you only need to find the pair-isation less expensive.

For the case there are more than two rows, you should call the method with the next index, get the partial result, and then compute the minimum value using backtracking again.

When the flow get back to the first vector, you can consolidate the result as the final result.

robermorales
  • 3,293
  • 2
  • 27
  • 36
1

It's worth noting that for some interesting choices of path cost, there's a poly-time algorithm, e.g., if the path cost is the sum of the edge costs, then the optimal solution can be found by running, for all i, the Hungarian algorithm on rows i and i + 1.