How to find the minimum cost and combination that meets (or exceeds) the goal?

Question

I have an algorithm problem that I created myself, and I am seeking some guidance. The goal is to get at least X gold. Different dealers sell different amounts of gold at different prices. I need to find the combination that minimizes the cost while buying at least X gold.

Rules:

You must buy the gold in full amounts
You can buy any number of the same item

This resembles the unbounded knapsack problem, but here there is no upper limit on the amount of gold to buy. Solving it as a standard knapsack gives me solutions that fall below the required amount, which are wrong answers. I could brute-force it by gradually increasing the knapsack limit, but that would not be optimal in terms of performance.

As mentioned, I am seeking guidance on lesser-known algorithms that would solve this problem. I am not asking for code.

As an example, these are the available prices.

Item ID	Gold	Price
1	120	0.99
2	600	4.99
3	1,960	14.99
4	3,960	29.99
5	4,970	38.89
6	6,560	49.99
7	12,960	99.99
8	14,000	104.99

For this example, my task is to buy at least 12,880 gold. I need to find which items to buy, and how many of each, to get at least 12,880 gold while minimizing the cost.

My attempt so far has been the search for an algorithm itself. Here is a sheet with the different combinations I have tested, but I still cannot find a viable algorithm that finds the optimal combination.

In the image, you can see that buying 2 item_4 and 1 item_5 is currently my best solution, but I am not certain it is optimal.

Edit 1:

Added some more details/rules above.

Edit 2:

I am just asking for some guidance here, not for code. I have solved similar problems with unbounded knapsack, but that solution cannot be applied here directly, because I need at least the amount, not less than or equal to the amount. These are two different problems, and I don't see any reason to post unrelated knapsack code that does not even work as an attempt to solve this problem.

Edit 3:

I realized that I need to scope this question a bit, to make it reasonable. So here are the constraints.

n: Amount of gold in the goal
k: Number of dealers/items
m: Maximum amount of gold that will be sold as a single purchase

n <= 1000000
k <= 1000
m <= 10000000

I am again not looking for any code from you, just some advice for possible solutions, and I will write my own code.

You want the ratio of price/gold though, not gold/price, to get the unit rate. — Amolgorithm, Jul 05 '23 at 02:53
I don't understand, it is basically the same. "One gold is worth x amount of money" vs "For one dollar, you can buy y gold". As for the algorithm, I am not sure the ratio would help much anyways. — Timmy Chan, Jul 05 '23 at 03:05
Do you understand how to solve unbounded knapsack problem? What stops you from applying the same approach here? — maxplus, Jul 05 '23 at 03:42
**1.** You have not specified: can you buy gold in parts or only in full amounts? Example: Can you buy 300 gold from vendor 2 for 2.495 ? Or does it always have to be the full amount 600? **2.** If 'no' to question one, this is a typical variation of the _knapsack_ problem. — aneroid, Jul 05 '23 at 03:47
**3**. StackOverflow is not a free coding service. You're expected to [try to solve the problem first](https://meta.stackoverflow.com/q/261592). Please update your question to **show what you have already tried** in a [**minimal reproducible example**](https://stackoverflow.com/help/minimal-reproducible-example), and then ask a _specific question_ about your algorithm or technique. For further information, see [How to Ask](https://stackoverflow.com/questions/how-to-ask), and take the [tour](https://stackoverflow.com/tour). — aneroid, Jul 05 '23 at 03:47
@aneroid As mentioned in the question, I am seeking some **guidance** to find the **algorithm**, and not asking for code. Please understand this much. 1. I thought it was obvious that you must buy the items in full amounts. I will update the question for it. 2. Thanks for mentioning the name of an algorithm, which is what I needed, as stated. — Timmy Chan, Jul 05 '23 at 03:52
@maxplus I think I understand the unbounded knapsack problem as I have solved some of them before. Now this problem does not have any limits on the amount of goal as opposed to the knapsack problem having limit on the total weight. The approach will be different, I believe. — Timmy Chan, Jul 05 '23 at 04:04
The same approach can be applied here. Obviously calculating least-cost solution to buy exactly 1000000 gold is not useful to find an answer for your example, so there is some upper limit you can impose yourself. — maxplus, Jul 05 '23 at 04:10
@aneroid Yes, I have solved the knapsack version already, and brute-forcing it from 12880 upwards is not what I am looking for, because I think there should be a better solution somehow. — Timmy Chan, Jul 05 '23 at 04:14
@maxplus Are you proposing to set a limit on for instance 13000, and then work downwards possibly using some kinds of binary search for the correct knapsack limit? That will be similar to brute-forcing as mentioned by aneroid, and I am not looking for such solutions. I wanted to know if there are any less known algorithms for a knapsack "over-limit" problem that I don't know about. — Timmy Chan, Jul 05 '23 at 04:19
No, I don't propose any kind of bruteforce. I'm sure there is no well-known algorithm for your version of knapsack because it is trivial to adapt an algorithm solving unbounded knapsack with almost zero overhead (and certainly not changing complexity). — maxplus, Jul 05 '23 at 04:23
@aneroid Also I am not asking for opinions nor to have discussions about it. I seek guidance on how to solve this with an algorithm that I could not think of. Brute-forcing is always what I could do, but we are now talking about optimizations. — Timmy Chan, Jul 05 '23 at 04:23
Since you were interested in solving other knapsack problems, I would consider it a great exercise for you to adapt this algorithm a tiniest bit outside its standard application. Just give yourself an hour, if you indeed understand how unbounded knapsack solution works (as opposed to memorized its implementation), I'm pretty confident you'll answer your question yourself. — maxplus, Jul 05 '23 at 04:27
For such problems, I generally first try backtracking (or DFS in general). The efficiency of backtracking strongly depends on implementation details (at the algorithmic level). For example, I would first sort the items by their unit costs. At a given node of the tree, you can bound the cost of the remaining purchases, and then prune parts of the tree. — Damien, Jul 05 '23 at 09:56
The optimal solution is `4; 4x 3; 9x 1`, which is exactly 12880 gold, with a cost of 98.86. So your first step is to implement the unbounded knapsack algorithm with an upper limit of 12880, because that *will* find the optimal solution. Then we can talk about how to adapt your code to the case where the optimal solution needs more gold than the minimum. — user3386109, Jul 05 '23 at 19:51
@user3386109 yes, I found that solution already :) But what I want is the general solution. And currently I am looking at this post. https://stackoverflow.com/questions/38120363/knapsack-with-at-least-x-value-constraint?rq=3 — Timmy Chan, Jul 05 '23 at 21:36
When the minimum gold you need is `n = 1000000`, you can't declare a `dp` table like `dp[1000000][1000000]` since that takes terabytes of memory. So you need a more memory-efficient solution. Unbounded knapsack can be solved with a single 1D table, i.e. `dp[1000000]`. — user3386109, Jul 05 '23 at 21:58

Timmy Chan · Accepted Answer · 2023-07-11T01:05:24.703

To solve the problem, we need to extend the generic unbounded knapsack algorithm for minimizing cost with additional steps. Here I present a generic solution for all unbounded knapsack problems with at least W weight while minimizing total cost.

Preconditions:

It is assumed that non-integer numbers have a limited number of decimal places. To work with integers, these numbers are multiplied by a factor. For example, 49.99 becomes 4999.
The terms "gold" and "price" are used interchangeably with "weight" and "cost," respectively, to align with the knapsack algorithm terminology.
Let W represent the required weight.
Let N be the number of items.
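
The scaling described in the first precondition can be sketched as follows. This is a minimal illustration, assuming prices arrive as decimal strings; the helper name `to_integer_units` and its `decimals` parameter are hypothetical and not part of the original answer. It uses `decimal.Decimal` because binary floats cannot represent most decimal prices exactly.

```python
from decimal import Decimal

def to_integer_units(value, decimals=2):
    """Scale a decimal string such as "49.99" to an integer such as 4999.

    Hypothetical helper: `decimals` is the assumed maximum number of
    decimal places that any input can have.
    """
    scaled = Decimal(value) * (10 ** decimals)
    # Refuse inputs that carry more decimal places than assumed.
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{value!r} has more than {decimals} decimal places")
    return int(scaled)

# Prices taken from the question's table, passed as strings.
prices = ["0.99", "4.99", "14.99", "29.99"]
print([to_integer_units(p) for p in prices])
```

Passing strings rather than floats matters here: `Decimal` built from a float keeps the float's binary rounding error, which the check above would then reject.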

Solution:

Apply the general algorithm for solving the unbounded knapsack problem (minimizing cost) to construct a dp table. The dp table contains minimized costs and maximized weights for all capacities up to the required weight W. Additionally, maintain an array to store the combination of items for later retrieval.
Assuming the dp table is sorted in ascending order based on capacity, follow these steps:
- Initialize min_cost as infinity.
- Iterate over each item in the items list and calculate the difference W - item_weight. For each iteration, search the dp table to find the first maximized weight that is equal to or greater than the calculated value. Use binary search for this.
- As a result, the sum of found_weight and item_weight represents a total weight of at least W.
- Use the index of the found weight in the dp_costs array to calculate the total cost.
- If the total cost is less than the current min_cost, update min_cost with the new value and remember the item and the found weight.
- After completing the item loop, use the found_weight and the item weights to recover the combination of items needed to achieve min_cost.

Complexity:

Time: O(N * (W + log(W)))

First the knapsack DP in O(N * W), then a binary search inside the item loop in O(N log(W))

Space: O(W)

We are using 1D lists for the "dp table".

Final Code:

# Items with weights and costs
items = [
    (120, 99),
    (600, 499),
    (1960, 1499),
    (3960, 2999),
    (4960, 3889),
    (6560, 4999),
    (12960, 9999),
    (14000, 10499),
]

def find_index_of_first_occurrence_or_higher(lst, value):
    '''Binary-search a sorted list and return the index of the first occurrence of value, or of the next higher value if value is not present.'''
    lo, hi = 0, len(lst) - 1
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if lst[mid] < value:
            lo = mid + 1
        else:
            hi = mid - 1
    return hi + 1

def unbounded_knapsack_minimize_cost_required_capacity(items, capacity):
    dp_costs = [float('inf')] * (capacity + 1)      # Accumulated costs
    dp_wts = [0] * (capacity + 1)                   # Accumulated weights
    dp_items = [0] * (capacity + 1)                 # Item IDs in sequence
    dp_costs[0] = 0
    
    # Iterate over each capacity from 1 to the maximum capacity
    for i in range(1, capacity + 1):
        for j, (wt, cost) in enumerate(items):
            # Check if the weight of the current item is less than or equal to the current capacity
            if wt <= i:
                dp_wt = dp_wts[i - wt] + wt
                # If adding the item gives more weight than previously
                if dp_wt >= dp_wts[i]:
                    dp_cost = dp_costs[i - wt] + cost
                    # If the accumulated cost is less than previously
                    if dp_cost < dp_costs[i]:
                        dp_costs[i] = dp_cost
                        dp_wts[i] = dp_wt
                        dp_items[i] = j
        if dp_wts[i] == 0:
            # Use previous best as current best
            dp_costs[i] = dp_costs[i-1]
            dp_wts[i] = dp_wts[i-1]
            dp_items[i] = dp_items[i-1]
    
    # Some items give more weight for a lower total cost. To handle these cases, loop through all the items once.
    # Reuse the dp tables built above: subtract the item's weight from the required capacity, find the first accumulated weight that covers the remainder, then add the item and check whether the total cost is lower.
    min_cost = float('inf')
    for i, (wt, cost) in enumerate(items):
        weight_to_find = max(capacity - wt, 0)
        
        found_weight = find_index_of_first_occurrence_or_higher(dp_wts, weight_to_find)
        dp_cost = dp_costs[found_weight] + cost
        
        if dp_cost < min_cost:
            min_cost = dp_cost
            min_cost_last_item = i
            total_weight = found_weight
    
    # Find the total combination that gives the min_cost
    min_combination = [min_cost_last_item]
    while total_weight > 0:
        item = dp_items[total_weight]
        min_combination.append(item)
        # Find previous item by subtracting weights
        item_weight = items[item][0]
        total_weight -= item_weight
    return min_cost, min_combination

capacity = int(input("Input capacity of the knapsack: "))
min_cost, min_combination = unbounded_knapsack_minimize_cost_required_capacity(items, capacity)

print("Minimum cost:", min_cost/100)
print("Minimum combination:", min_combination)
print("Final weight:", sum(map(lambda x: items[x][0], min_combination)))
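
As a cross-check on the code above, here is a minimal sketch of a more direct formulation. It is a different technique from the binary-search post-processing in this answer, not a restatement of it: the DP index means "at least w gold is still needed", and the remainder is clamped at zero so that overshooting purchases are handled inside the recurrence itself. The function name `min_cost_at_least` and the per-item `counts` it returns are illustrative choices; `items` is assumed to hold (weight, cost) pairs of positive integers, as in the list above.

```python
import math

def min_cost_at_least(items, target):
    """Minimum total cost to collect at least `target` weight.

    Assumes items is a non-empty list of (weight, cost) pairs with
    positive integer weights. Returns (cost, counts), where counts[j]
    is how many times item j is bought.
    """
    if target > 0 and not items:
        raise ValueError("no items available")
    best = [0] + [math.inf] * target   # best[w]: min cost for >= w weight
    choice = [-1] * (target + 1)       # item used to reach best[w]
    for w in range(1, target + 1):
        for j, (wt, cost) in enumerate(items):
            # Clamp at 0: an item heavier than w covers the rest by itself.
            cand = best[max(w - wt, 0)] + cost
            if cand < best[w]:
                best[w] = cand
                choice[w] = j
    # Walk the choices back to recover how many of each item were used.
    counts = [0] * len(items)
    w = target
    while w > 0:
        j = choice[w]
        counts[j] += 1
        w = max(w - items[j][0], 0)
    return best[target], counts

# Same (weight, cost-in-cents) pairs as the items list above.
items = [(120, 99), (600, 499), (1960, 1499), (3960, 2999),
         (4960, 3889), (6560, 4999), (12960, 9999), (14000, 10499)]
cost, counts = min_cost_at_least(items, 12880)
print("Minimum cost:", cost / 100)
print("Counts per item:", counts)
```

Under these assumptions the sketch runs in O(N * W) time and O(W) space, the same order as the knapsack pass above, and it needs no second pass because the clamp already accounts for combinations that exceed W.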

Danilo · Answer 2 · 2023-07-05T05:46:47.267

tl;dr;

In general, when you have an issue like this, the steps are:

First, model the mathematics/logic of your problem: define the domain, the variable types, and their relationships or limits.
Then analyze your math in terms of change (derivatives/integrals) and try to find independent actors (by following the order of operations, PEMDAS).
Then separate each independent actor into smaller tasks that you can test, execute, and verify as an MRE.
Then join it all together into a cohesive algorithm that implements your solution.

I think you need additional information. To solve a set of equations containing 2 unknowns (i.e. amount of gold and price), you tend to need either one formula that connects the two or two independent formulas, one for each.

Since you have 2 variables (unknowns), you can approach this in 4 ways that might branch out later on:

Forget about one of those variables (GOLD, or PRICE) and just find algorithm that handles one of them - prioritize your data.
Find new variable that ties them together in some sort of function or formula which limits your applicability - and test it by finding differential/derivative or integral of relationship described by new variable.
Use them both, but independently, and then re-evaluate the results with some sort of weighted AI through iteration in order to find some sort of rule or law.
Find a larger problem (which those 2 variables are part of) and find algorithm for that. - superset for GOLD and PRICE

The way I see it, you have a floating-point limit issue here: since there are infinitely many rational numbers between 0 and 1, in any progressive set (where the amount of GOLD or PRICE increases over time) you will always have some issues matching the GOAL. So you can approach it by limiting the search to a specific interval and finding the direction of "movement" with the following logic:

For set A (a in A: a > 0, a in Q+) and set B (b in B: b > 0, b in Q+) we have a variable c (c in C, a_i <= c <= b_i | b_i <= c <= a_i, c in Q+). The direction of mapping (towards either A or B) is defined by m(a,b,c) = (2c/(a+b)) - 1, where m less than 0 means c is closer to the smaller of a and b. However, the issue in this logic is the domain of the set: since Q is defined by p/q of 2 integers for every real number (float, double), you have issues with the range (or step) of the linear mapping function.
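
To make the sign of m concrete, here is a short worked derivation. The values a = 100, b = 200, c = 120 are illustrative assumptions, not taken from the question.

```latex
m(a,b,c) = \frac{2c}{a+b} - 1 < 0
\iff c < \frac{a+b}{2},
\qquad
m(100, 200, 120) = \frac{240}{300} - 1 = -0.2
```

So a negative m places c below the midpoint of a and b, nearer the smaller of the two, and a positive m places it nearer the larger.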

For any value C (in this case QUANTITY of gold) and sets of A and B (GOAL and FUNDS) you can have potentially infinite amount of values c that are small enough to be disregarded if minimizing the cost is secondary goal. So without any "reference" in amount of gold compared to GOAL (that is derivative of time) you might have issues of never reaching a goal by a wide margin - programmatically.

So you need additional information (or function analysis) in order to find a bordering limit at which buying gold for X price is justified. I would approach this problem from a rational-number perspective and represent each price as an integer division of the closest value; then, if the DIFFERENCE between GOAL and FUNDS has the same GCD as the price, I would "buy" it (if the GCD isn't 1 and is naturally decreasing as FUNDS get closer to GOAL). Of course, in this approach prime numbers are a pain in the butt, and you might still have a wide margin to reach your goal, but it should be less than removing chunks out of funds.

Second possible approach is from economic standpoint where you need additional variable of "Required Price" or some sort of estimation of wealth of whomever is buying. But that can change your desire for this algo quite a bit, since you will not have diversity in your portfolio and would (sort of) limit your domain to specific range. If in same case as before:

For set A (a in A: a > 0, a in Q+) and set B (b in B: b > 0, b in Q+) we have a variable c (c in C, a_i <= c <= b_i | b_i <= c <= a_i, c in Q+) then price becomes c/a, c/b and which if c is p/q, a is w/e and b is v/u then we have typical Rational division (which is multiplication of its inverse):

c/a = (p/q)/(w/e)=(pe/wq) | c/b = (p/q)/(v/u) = (pu/qv) -> c/a=c*a^-1, c/b=c*b^-1

Then for some Required Price (RP) and some acceptable "leeway" [lw] we can write it as -lw <= some (a^-1, b^-1) <= lw if RP is also p/q. So if you have a Minimum amount of Gold and Capital, RP = Minimum amount of Gold / Capital (or Ratio in your second image). This approach will ignore any price that is outside of the "leeway" and will tend to have a margin as wide as the inverse of lw.

This is why you need another variable in data set that has 2 elements (GOLD,PRICE) if you are searching for solution for only one variable (GOLD). You can ignore PRICE and focus on GOLD to GOLD mapping, or you can add some additional variables (such as PRICE, FUNDS, CAPITAL...) that are either combination of two or describe their relationship.

Each step of an algorithm is a reduction in information, since we can't handle infinity outside of abstractions. You can think about it, and you can model it, but once you try to apply it the probability rises a lot. So here you can either sit down and go through each and every probability in the mapping and try to find some relationship (or theory), or you can simplify it by introducing another variable or relationship.

Hope this helps.

This is a nice long answer, and I will need some time to consume this. Thanks for now. — Timmy Chan, Jul 05 '23 at 05:50
No problem, if you have trouble understanding what i've said about sets, here is a [free book of proofs](https://www.people.vcu.edu/~rhammack/BookOfProof/Main.pdf). — Danilo, Jul 05 '23 at 06:01