
Let's say that I have a database of foods, each with an amount of Fat, Carbs and Protein. For example, let's say that I had this database:

Item          Fat         Carbs         Protein
================================================
Milk           12           36             8
Chicken         1           12            18
Juice           0           50             2
Bacon           9            1             4

What would be an efficient algorithm to find combinations of these foods that fit a desired range of Fat, Carbs and Protein, where each item can be used multiple times?

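For example, if I wanted a combination in the range of Fat: 20-30, Carbs: 170-190, Protein: 100-110, then 2 Milks, 5 Chickens, 1 Juice, and 0 Bacon would be one possible solution (Fat 29, Carbs 182, Protein 108). Something like 0 Milks, 5 Chickens, 2 Juices and 2 Bacons comes close (Fat 23, Carbs 162, Protein 102) but falls just short of the carbs range.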
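To make the arithmetic concrete, here is a minimal sketch that totals a candidate combination and checks it against the ranges. The names `FOODS`, `RANGES`, `totals` and `fits` are made up for illustration; the numbers are the ones from the table and ranges above.

```python
# Nutrients per item as (fat, carbs, protein), copied from the table above.
FOODS = {
    "Milk": (12, 36, 8),
    "Chicken": (1, 12, 18),
    "Juice": (0, 50, 2),
    "Bacon": (9, 1, 4),
}

# Desired (min, max) for fat, carbs and protein, in that order.
RANGES = ((20, 30), (170, 190), (100, 110))


def totals(counts):
    """Return total (fat, carbs, protein) for a dict of item -> count."""
    fat = carbs = protein = 0
    for name, n in counts.items():
        f, c, p = FOODS[name]
        fat += n * f
        carbs += n * c
        protein += n * p
    return fat, carbs, protein


def fits(counts, ranges=RANGES):
    """True if every nutrient total lies inside its (min, max) range."""
    return all(lo <= t <= hi for t, (lo, hi) in zip(totals(counts), ranges))


combo = {"Milk": 2, "Chicken": 5, "Juice": 1, "Bacon": 0}
print(totals(combo))  # (29, 182, 108)
print(fits(combo))    # True
```

The same two calls work for any other candidate combination.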

It would also be fine if the algorithm stopped as soon as it found one possible solution, but I would like it to be non-deterministic, so that the next time it is run a different solution might be found.

This sounds like an NP-hard problem, similar to the subset sum problem or the knapsack problem. I have looked into algorithms for those, but I don't understand the algorithms for the multiple-constraint variants. Also, knapsack problems optimize an objective, while here there is nothing to optimize.

I suppose this problem would be much more difficult if there were more items in the database, and much easier (to find a single solution that fits the constraints) if the solution was not limited to integers (like 0.2 Milks).

I plan to incorporate something like this in Python so Python solutions would be appreciated, thanks.

Luke
  • Look at possible ways to count change. This is a very common problem; google should have lots of answers. – Boris the Spider Aug 12 '14 at 20:03
  • If you add a linear objective function to select from the possible solutions, then this sounds like a [linear programming](http://en.wikipedia.org/wiki/Linear_programming) problem. The [simplex algorithm](http://en.wikipedia.org/wiki/Simplex_algorithm) would yield good (but not guaranteed optimal) results in that case. – phs Aug 12 '14 at 20:03
  • sounds like there should be plenty of answers at [mathexchange](http://math.stackexchange.com/) – Hedde van der Heide Aug 12 '14 at 20:03
  • How large is your data set? Is the runtime of a brute-force approach critical? – Falko Aug 12 '14 at 22:23
  • If the ranges are relatively small and you only have integers, you can use dynamic programming. For example, if you have 0-1000 fats and 0-1000 carbs, then for each amount of fats and carbs you can find m[f][c] and M[f][c] - the minimum and maximum amount of protein you can get with `f` fats and `c` carbs. You can also find if (f,c,p) is possible to achieve, but this only works for really small ranges, like 200 units each. In general - yes, use linear programming. – Sergey Orshanskiy Aug 12 '14 at 23:10
  • I wonder if you found something which works for you. I am looking for something similar as well. – zeeshan Oct 31 '16 at 15:57

2 Answers


Perhaps start with a stochastic hill climber if mathematical programming is too difficult for the time being.

The problem you describe is reminiscent of the blending problem. Here's a solution in MiniZinc that I found (a toy example).
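As a rough illustration of the stochastic hill climber idea (not the MiniZinc model), here is a minimal Python sketch using the numbers from the question. Everything in it is an assumption made for the example: the names `violation` and `hill_climb`, the penalty (total distance outside the ranges), the single-item ±1 moves, and the restart and step limits. It can get stuck in local minima, so it may return `None` even when a solution exists.

```python
import random

# (name, fat, carbs, protein) rows from the question's table.
FOODS = [("Milk", 12, 36, 8), ("Chicken", 1, 12, 18),
         ("Juice", 0, 50, 2), ("Bacon", 9, 1, 4)]
# (min, max) for fat, carbs, protein.
RANGES = [(20, 30), (170, 190), (100, 110)]


def violation(counts, foods=FOODS, ranges=RANGES):
    """Total distance by which the nutrient sums fall outside their ranges."""
    total = 0
    for k, (lo, hi) in enumerate(ranges):
        amount = sum(n * food[k + 1] for n, food in zip(counts, foods))
        if amount < lo:
            total += lo - amount
        elif amount > hi:
            total += amount - hi
    return total


def hill_climb(foods=FOODS, ranges=RANGES, restarts=200, steps=500,
               max_start=6, rng=None):
    """Random-restart hill climbing; returns a tuple of counts or None."""
    rng = rng or random.Random()
    for _ in range(restarts):
        # Random starting point (max_start is an arbitrary assumption).
        counts = [rng.randint(0, max_start) for _ in foods]
        score = violation(counts, foods, ranges)
        for _ in range(steps):
            if score == 0:
                break
            i = rng.randrange(len(foods))
            delta = rng.choice((-1, 1))
            if counts[i] + delta < 0:
                continue  # counts must stay non-negative
            counts[i] += delta
            new_score = violation(counts, foods, ranges)
            if new_score <= score:
                score = new_score      # accept improving or sideways moves
            else:
                counts[i] -= delta     # undo a worsening move
        if score == 0:
            return tuple(counts)
    return None


print(hill_climb())  # counts in table order (Milk, Chicken, Juice, Bacon), or None
```

Because the starting points and moves are random, repeated runs can return different combinations when more than one fits the ranges.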

orange
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. – pascalhein Aug 12 '14 at 21:17
  • @pascalhein There are 2 links in my answer. The wikipedia link, I consider rather uncritical in being deleted/substantially changed. The MiniZinc link is complementary, and, if deleted, similar solutions can be found by searching for the relevant keywords that I provided (posting the source code or parts of the source code is meaningless as it's only one way of describing the problem. The key to the solution is part of how the solver works and beyond the scope of this question). – orange Aug 12 '14 at 21:24
  • I agree that Wikipedia is rather unlikely to be deleted. However, [answers should not only contain links](https://meta.stackexchange.com/questions/8231/are-answers-that-just-contain-links-elsewhere-really-good-answers). You could improve your post by giving a general idea of how this specific approach works - it doesn't have to be the source code. – pascalhein Aug 12 '14 at 21:55

Even if you only had one nutrient (e.g. protein alone) to worry about, your problem is at least as hard as subset sum with duplicates allowed. Whatever width you want your range to have, a subset sum instance can be scaled to fit it: multiply the target sum by a positive integer and define the range to go from there up to the next multiple minus 1, and similarly multiply all of your set's numbers by the same positive integer and add 1. If you could solve your problem with that particular range, you could solve subset sum.

You can use integer linear programming to solve your problem, by letting a variable Xi denote how many of item i you will include, and then having constraints like

Fmin <= F1*X1 + F2*X2 + ... + Fn*Xn <= Fmax

where Fi is the amount of fat in item i and [Fmin,Fmax] is the range of fat you are allowed. You also need constraints that each Xi >= 0. Integer linear programming finds a valid solution for the Xi such that a linear function

C1*X1 + C2*X2 + ... + Cn*Xn

is minimized or maximized, where the Ci are constants. You can get different valid solutions by changing the Ci. Getting all valid solutions, or even weaker, counting the number of valid solutions, is a much harder problem.
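To make the formulation concrete without depending on a solver library, here is a minimal sketch that swaps the ILP solver for plain brute-force enumeration, which is only workable for tiny instances like the one in the question. The names `feasible_solutions` and `pick` and the random choice of the Ci are assumptions for illustration. The upper bound on each Xi follows from the range maxima, and the sketch assumes every item has at least one positive nutrient.

```python
import itertools
import random

# Nutrients per item as (fat, carbs, protein), from the question.
FOODS = {"Milk": (12, 36, 8), "Chicken": (1, 12, 18),
         "Juice": (0, 50, 2), "Bacon": (9, 1, 4)}
# (min, max) for fat, carbs, protein, from the question.
RANGES = ((20, 30), (170, 190), (100, 110))


def feasible_solutions(foods=FOODS, ranges=RANGES):
    """Enumerate every integer X >= 0 that satisfies all range constraints."""
    names = list(foods)
    # Xi can never exceed max // amount for any nutrient the item contains.
    bounds = [min(hi // a for a, (lo, hi) in zip(foods[n], ranges) if a > 0)
              for n in names]
    found = []
    for x in itertools.product(*(range(b + 1) for b in bounds)):
        sums = [sum(xi * foods[n][k] for xi, n in zip(x, names))
                for k in range(len(ranges))]
        if all(lo <= s <= hi for s, (lo, hi) in zip(sums, ranges)):
            found.append(dict(zip(names, x)))
    return found


def pick(solutions, rng=None):
    """Minimize C1*X1 + ... + Cn*Xn for randomly drawn coefficients Ci."""
    if not solutions:
        return None
    rng = rng or random.Random()
    c = {name: rng.random() for name in solutions[0]}
    return min(solutions, key=lambda s: sum(c[n] * s[n] for n in c))


print(pick(feasible_solutions()))
```

Drawing new coefficients on each call mirrors the idea of changing the Ci: when several combinations are feasible, different objectives can select different ones. A real ILP solver would take the same constraints and objective but avoid enumerating every candidate.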

user2566092