Knowing exactly how many items can be selected from each category is a big constraint. Consider the simplest case that there is one category. You much choose exactly N objects to maximize the value sum[v_i x_i] for a cost sum[w_i x_i] < W, where x_i is equal to 0 or 1 (following Wikipedia's notation). The new constraint is that sum[x_i] = N. This constraint can be included in the problem by adding another dimension in the dynamic programming, but explicitly checking whether or not a solution is valid and has exactly the number of required elements.
Vanilla knapsack problem
Here is a brief demonstration: Take as the starting point this solution to the standard 0/1 knapsack problem via memoization:
#include <cstdio>
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
using uint = unsigned int;
template <typename T>
struct item {
T value;
uint weight;
};
template <typename T>
T knapSack(uint W, const std::vector< item<T> >& items) {
std::map< std::pair<uint, uint>, T> cache;
std::function<T(uint, uint)> recursion;
recursion = [&] (uint n, uint w) {
if (n == 0)
return 0;
auto it = cache.find(std::make_pair(n,w));
if (it != cache.end())
return it->second;
T _v = items[n-1].value;
uint _w = items[n-1].weight;
T nextv;
if (_w <= w)
nextv = std::max(_v + recursion(n-1,w-_w),recursion(n-1,w));
else
nextv = recursion(n-1,w);
cache.insert(std::make_pair(std::make_pair(n,w),nextv));
return nextv;
};
return recursion(items.size(),W);
}
My implementation here (with a recursive lambda function) emphasizes readability over optimality. A choice of objects with indices < N and sum of weights < W is either a choice of objects with indices < N-1 and sum of weights < W or the object at N-1 together with objects with indices < N-1 and sum of weights < W - w[N-1].
Knapsack problem with one category with fixed required number of objects
We can go on to add a new restriction that keeps track of the number of elements selected. We'll do this by noting that the new choice of objects at each recursive step has 0 or 1 more elements than the one before it, in the same way that it has the same or greater sum of weights -- i.e., a choice of K objects with indices < N and sum of weights < W is either a choice of K objects with indices < N-1 and sum of weights < W or the object at N-1 together with K-1 objects with indices < N-1 and sum of weights < W - w[N-1]. However, we'll also want to track violations -- for instance, we can't find K objects with indices < N when K>N. In such a case we should report that the maximum possible value is 0, because the choice is impossible, but we should mark this as "invalid" to differentiate this from the the trivial base case of the recursion. Moreover, any solution higher up the chain that attempts to use this as a sub-solution should also be marked as invalid. For this reason, we change the return type from a simple value to a pair of a value and boolean. As part of the base case, we mark all entries with K>N as having a maximum value of 0 but as invalid:
template <typename T>
std::pair<T,bool> knapSackConstrained(uint W, uint K, const std::vector< item<T> >& items) {
std::map< std::tuple<uint, uint, uint>, std::pair<T,bool> > cache;
std::function<std::pair<T, bool>(uint, uint, uint)> recursion;
recursion = [&] (uint n, uint w, uint k) {
if (k > n)
return std::make_pair(0,false);
if (n == 0 || k == 0)
return std::make_pair(0,true);
auto it = cache.find(std::make_tuple(n,w,k));
if (it != cache.end())
return it->second;
T _v = items[n-1].value;
uint _w = items[n-1].weight;
T nextv;
bool nextvalid = true;
if (_w <= w) {
auto take = recursion(n-1,w-_w,k-1);
auto reject = recursion(n-1,w,k);
if (take.second and reject.second) {
nextv = std::max(_v + take.first,reject.first);
} else if (take.second) {
nextv = _v + take.first;
} else if (reject.second) {
nextv = reject.first;
} else {
nextv = 0;
nextvalid = false;
}
} else {
std::tie(nextv,nextvalid) = recursion(n-1,w,k);
}
std::pair<T,bool> p = std::make_pair(nextv,nextvalid);
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
};
return recursion(items.size(),W,K);
}
Here is a simple main routine running this code and its output:
int main(int argc, char *argv[]) {
std::vector< item<int> > items = {{60,10},{10,6},{10,6}};
int j = 13;
std::cout << "Unconstrained: " << knapSack(j,items) << std::endl;
for (uint k = 1; k <= items.size(); ++k) {
auto p = knapSackConstrained(j,k,items);
std::cout << "K = " << k << ": " << p.first;
if (p.second)
std::cout << std::endl;
else
std::cout << ", no valid solution" << std::endl;
}
return 0;
}
% OUTPUT %
Unconstrained: 60
K = 1: 60
K = 2: 20
K = 3: 0, no valid solution
Since the sum of the 3 weights is already greater than the threshold, the solution requiring all three is impossible.
Knapsack problem with multiple categories with fixed required number of objects
The above only partly addresses your problem, since you have multiple categories rather than simply one. However, I believe that this can be extended to be multi-dimensional without too much extra work. Indeed, I suspect the following code is the right strategy for the multi-dimensional case, modulo bugs -- it would require some good test cases for verification. The single argument K is replaced with a vector of category numbers, and the item struct is given a category field. The base case must consider each possible K>N case (for each category), and in addition the check that the (i-1)st weight is less than W must be extended to check that there is at least 1 item more required of the (i-1)st category.
#include <cstdio>
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
using uint = unsigned int;
template <typename T>
struct item {
T value;
uint weight;
uint category;
};
template <typename T>
std::pair<T,bool> knapSack(uint W, const std::vector<uint>& K, const std::vector< item<T> >& items) {
std::map< std::tuple<uint, uint, std::vector<uint> >, std::pair<T,bool> > cache;
std::function<std::pair<T, bool>(uint, uint, std::vector<uint>)> recursion;
recursion = [&] (uint n, uint w, std::vector<uint> k) {
auto it = cache.find(std::make_tuple(n,w,k));
if (it != cache.end())
return it->second;
std::vector<uint> ccount(K.size(),0);
for (uint c = 0; c < K.size(); ++c) {
for (uint i = 0; i < n; ++i) {
if (items[i].category == c)
++ccount[c];
}
}
for (uint c = 0; c < k.size(); ++c) {
if (k[c] > ccount[c]) {
auto p = std::make_pair(0,false);
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
}
}
uint sumk = 0; for (const auto& _k : k) sumk += _k;
if (n == 0 || sumk == 0) {
auto p = std::make_pair(0,true);
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
}
T _v = items[n-1].value;
uint _w = items[n-1].weight;
uint _c = items[n-1].category;
T nextv;
bool nextvalid = true;
if (_w <= w and k[_c] > 0) {
std::vector<uint> subk = k;
--subk[_c];
auto take = recursion(n-1,w-_w,subk);
auto reject = recursion(n-1,w,k);
if (take.second and reject.second) {
nextv = std::max(_v + take.first,reject.first);
} else if (take.second) {
nextv = _v + take.first;
} else if (reject.second) {
nextv = reject.first;
} else {
nextv = 0;
nextvalid = false;
}
} else {
std::tie(nextv,nextvalid) = recursion(n-1,w,k);
}
std::pair<T,bool> p = std::make_pair(nextv,nextvalid);
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
};
return recursion(items.size(),W,K);
}
int main(int argc, char *argv[]) {
std::vector< item<int> > items = {{60,10,0}, {100,20,1}, {120,30,0}, {140,35,1}, {145,40,0}, {180,45,1}, {160,50,1}, {170,55,0}};
int j = 145;
for (uint k1 = 0; k1 <= items.size(); ++k1) {
for (uint k2 = 0; k2 <= items.size(); ++k2) {
auto p = knapSack(j,std::vector<uint>({k1,k2}),items);
if (p.second)
std::cout << "K0 = " << k1 << ", K1 = " << k2 << ": " << p.first << std::endl;
}
}
return 0;
}
% OUTPUT (with comments) %
K0 = 0, K1 = 0: 0
K0 = 0, K1 = 1: 180 // e.g. {} from 0, {180} from 1
K0 = 0, K1 = 2: 340 // e.g. {} from 0, {160,180} from 1
K0 = 0, K1 = 3: 480 // e.g. {} from 0, {140,160,180} from 1
K0 = 1, K1 = 0: 170 // e.g. {170} from 0, {} from 1
K0 = 1, K1 = 1: 350 // e.g. {170} from 0, {180} from 1
K0 = 1, K1 = 2: 490 // e.g. {170} from 0, {140, 180} from 1
K0 = 1, K1 = 3: 565 // e.g. {145} from 0, {100, 140, 180} from 1
K0 = 2, K1 = 0: 315 // e.g. {145,170} from 0, {} from 1
K0 = 2, K1 = 1: 495 // e.g. {145,170} from 0, {180} from 1
K0 = 2, K1 = 2: 550 // e.g. {60,170} from 0, {140,180} from 1
K0 = 2, K1 = 3: 600 // e.g. {60,120} from 0, {100,140,180} from 1
K0 = 3, K1 = 0: 435 // e.g. {120,145,170} from 0, {} from 1
K0 = 3, K1 = 1: 535 // e.g. {120,145,170} from 0, {100} from 1
K0 = 3, K1 = 2: 605 // e.g. {60,120,145} from 0, {100,180} from 1
K0 = 4, K1 = 0: 495 // e.g. {60,120,145,170} from 0, {} from 1
For a given set of items with two categories, the output appears to be correct, though my manual check might have failed to spot some problems [an earlier version of this answer did indeed have some bugs]. All the cases not printed are those with no solution.
Returning the set of selected objects
If you want the function to return the set of selected objects, this is no obstacle in principle -- the code just gets messier. The most human-understandable thing to do is simply add an std::set<std::size_t>
to the tuple of objects returned by recursion
and by knapSack
, and stored in the cache, representing the collection of indices of the chosen objects. Each time a new object is added, this set can be augmented. The resulting code involves lots of copying of sets of integers, and is probably far from optimal -- a better solution might involve a static boolean vector whose entries are toggled on and off. However, it works and makes sense, so here it is:
#include <cstdio>
#include <iostream>
#include <vector>
#include <map>
#include <set>
#include <algorithm>
using uint = unsigned int;
template <typename T>
struct item {
T value;
uint weight;
uint category;
};
template <typename T>
std::tuple<T,bool,std::set<size_t> > knapSack(uint W, std::vector<uint> K, const std::vector< item<T> >& items) {
std::map< std::tuple<uint, uint, std::vector<uint> >, std::tuple<T,bool,std::set<std::size_t> > > cache;
std::function<std::tuple<T,bool,std::set<std::size_t> >(uint, uint, std::vector<uint>&)> recursion;
recursion = [&] (uint n, uint w, std::vector<uint>& k) {
auto it = cache.find(std::make_tuple(n,w,k));
if (it != cache.end())
return it->second;
std::vector<uint> ccount(K.size(),0);
for (uint i = 0; i < n; ++i) {
++ccount[items[i].category];
}
for (uint c = 0; c < k.size(); ++c) {
if (k[c] > ccount[c]) {
auto p = std::make_tuple(0,false,std::set<std::size_t>{});
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
}
}
uint sumk = 0; for (const auto& _k : k) sumk += _k;
if (n == 0 || sumk == 0) {
auto p = std::make_tuple(0,true,std::set<std::size_t>{});
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
}
T _v = items[n-1].value;
uint _w = items[n-1].weight;
uint _c = items[n-1].category;
T nextv;
bool nextvalid = true;
std::set<std::size_t> nextset;
if (_w <= w and k[_c] > 0) {
--k[_c];
auto take = recursion(n-1,w-_w,k);
++k[_c];
auto reject = recursion(n-1,w,k);
T a = _v + std::get<0>(take);
T b = std::get<0>(reject);
if (std::get<1>(take) and std::get<1>(reject)) {
nextv = std::max(a,b);
if (a > b) {
nextset = std::get<2>(take);
nextset.insert(n-1);
} else {
nextset = std::get<2>(reject);
}
} else if (std::get<1>(take)) {
nextv = a;
nextset = std::get<2>(take);
nextset.insert(n-1);
} else if (std::get<1>(reject)) {
nextv = b;
nextset = std::get<2>(reject);
} else {
nextv = 0;
nextvalid = false;
nextset = {};
}
} else {
std::tie(nextv,nextvalid,nextset) = recursion(n-1,w,k);
}
auto p = std::make_tuple(nextv,nextvalid,nextset);
cache.insert(std::make_pair(std::make_tuple(n,w,k),p));
return p;
};
return recursion(items.size(),W,K);
}
int main(int argc, char *argv[]) {
std::vector< item<int> > items = {{60,10,0}, {100,20,1}, {120,30,0}, {140,35,1}, {145,40,0}, {180,45,1}, {160,50,1}, {170,55,0}};
int j = 145;
for (uint k1 = 0; k1 <= items.size(); ++k1) {
for (uint k2 = 0; k2 <= items.size(); ++k2) {
auto p = knapSack(j,std::vector<uint>({k1,k2}),items);
if (std::get<1>(p)) {
std::cout << "K0 = " << k1 << ", K1 = " << k2 << ": " << std::get<0>(p);
std::cout << "; contents are {";
for (const auto& index : std::get<2>(p))
std::cout << index << ", ";
std::cout << "}" << std::endl;
}
}
}
return 0;
}
The output of this is
K0 = 0, K1 = 0: 0; contents are {}
K0 = 0, K1 = 1: 180; contents are {5, }
K0 = 0, K1 = 2: 340; contents are {5, 6, }
K0 = 0, K1 = 3: 480; contents are {3, 5, 6, }
K0 = 1, K1 = 0: 170; contents are {7, }
K0 = 1, K1 = 1: 350; contents are {5, 7, }
K0 = 1, K1 = 2: 490; contents are {3, 5, 7, }
K0 = 1, K1 = 3: 565; contents are {1, 3, 4, 5, }
K0 = 2, K1 = 0: 315; contents are {4, 7, }
K0 = 2, K1 = 1: 495; contents are {4, 5, 7, }
K0 = 2, K1 = 2: 550; contents are {0, 3, 5, 7, }
K0 = 2, K1 = 3: 600; contents are {0, 1, 2, 3, 5, }
K0 = 3, K1 = 0: 435; contents are {2, 4, 7, }
K0 = 3, K1 = 1: 535; contents are {1, 2, 4, 7, }
K0 = 3, K1 = 2: 605; contents are {0, 1, 2, 4, 5, }
K0 = 4, K1 = 0: 495; contents are {0, 2, 4, 7, }
Algorithmic complexity
This is not my forte, but I believe the runtime complexity is psuedo-polynomial, since the algorithm is very similar to the standard knapsack algorithm.