Extract possible sample combinations from multiple count constraints

Question

I have some input data like this.

unique ID	Q1	Q2	Q3
1	1	1	2
2	1	1	2
3	1	0	3
4	2	0	1
5	3	1	2
6	4	1	3

My goal is to extract a subset of rows that satisfies the following conditions:

total count: 4
Q1=1 count: 2
Q1=2 count: 1
Q2=1 count: 1~3
Q3=1 count: 1

In this case, the data sets with ids [1, 2, 4, 5] and [2, 3, 4, 5] are both acceptable answers.
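To make the conditions concrete, here is a minimal sketch of a checker that tests whether a candidate set of ids meets them. The names `ROWS`, `QUOTAS`, `TOTAL` and `satisfies` are made up for illustration, and the range "1~3" is assumed to mean an inclusive minimum and maximum.

```python
from collections import Counter

# Sample rows from the question, keyed by unique ID.
ROWS = {
    1: {"Q1": 1, "Q2": 1, "Q3": 2},
    2: {"Q1": 1, "Q2": 1, "Q3": 2},
    3: {"Q1": 1, "Q2": 0, "Q3": 3},
    4: {"Q1": 2, "Q2": 0, "Q3": 1},
    5: {"Q1": 3, "Q2": 1, "Q3": 2},
    6: {"Q1": 4, "Q2": 1, "Q3": 3},
}

# (column, value, min count, max count); "1~3" is read as (1, 3) inclusive.
QUOTAS = [
    ("Q1", 1, 2, 2),
    ("Q1", 2, 1, 1),
    ("Q2", 1, 1, 3),
    ("Q3", 1, 1, 1),
]
TOTAL = 4


def satisfies(ids, rows=ROWS, quotas=QUOTAS, total=TOTAL):
    """Return True if the selected ids meet the total and every quota."""
    ids = set(ids)
    if len(ids) != total:
        return False
    # Count how often each (column, value) pair occurs in the selection.
    counts = Counter()
    for i in ids:
        for col, val in rows[i].items():
            counts[(col, val)] += 1
    return all(lo <= counts[(col, val)] <= hi for col, val, lo, hi in quotas)


print(satisfies([1, 2, 4, 5]))  # True
print(satisfies([2, 3, 4, 5]))  # True
print(satisfies([1, 2, 3, 4]))  # False: three rows have Q1=1
```

A checker like this only validates a candidate; it does not search for one.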

In reality, I may have 6000+ rows of data and up to 12 count constraints like the ones above. Each count may vary from 1 to 50. I've written a solution that first groups all ids by condition, then uses depth-first search to exhaustively try all possible combinations between the groups. (I believe this is a brute-force solution...) However, I always run out of memory and time before I get a valid answer.

My questions are:

What is the lowest possible time complexity for this problem? (I believe this is a kind of subset sum problem, but I am not sure.)
How can I solve this problem without brute force? I'm considering dynamic programming or a decision tree. However, I suspect I would run out of memory with either of these. Or can I solve this problem using each data row's probability/entropy? (I would appreciate more details on this.)

The sample code for my brute-force solution is not worth reading, so I'll skip posting it.

You can find one solution by posing a linear programming problem, for example by using `pulp` with Python. You will have `n` binary variables indicating if a row should be included, and the constraints should be possible to define given the values in the dataframe. — hilberts_drinking_problem, Apr 18 '22 at 20:16
@hilberts_drinking_problem Thanks for your recommendation! I found out that this is indeed an integer programming problem. I've already found a way to approximate the result. I will try to answer this question once I have time haha — cindy50633, Apr 25 '22 at 09:32
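The formulation suggested in the comments could look roughly like the sketch below: one binary variable per row, one equality for the total, and a lower and upper bound per count condition. This is only an illustration on the six sample rows, not the asker's actual solution. It assumes `pulp` is installed with its bundled CBC solver, and `ROWS`, `QUOTAS` and `pick_rows` are hypothetical names.

```python
import pulp

# Sample rows from the question: (id, Q1, Q2, Q3).
ROWS = [
    (1, 1, 1, 2),
    (2, 1, 1, 2),
    (3, 1, 0, 3),
    (4, 2, 0, 1),
    (5, 3, 1, 2),
    (6, 4, 1, 3),
]
COL_INDEX = {"Q1": 1, "Q2": 2, "Q3": 3}

# (column, value, min count, max count); "1~3" is read as (1, 3) inclusive.
QUOTAS = [
    ("Q1", 1, 2, 2),
    ("Q1", 2, 1, 1),
    ("Q2", 1, 1, 3),
    ("Q3", 1, 1, 1),
]
TOTAL = 4


def pick_rows(rows, quotas, total):
    """Return a sorted list of ids meeting all quotas, or None if infeasible."""
    prob = pulp.LpProblem("quota_sampling", pulp.LpMinimize)
    # x[i] == 1 means the row with id i is included in the sample.
    x = {r[0]: pulp.LpVariable(f"x_{r[0]}", cat="Binary") for r in rows}

    # Only feasibility matters here. The total is fixed by a constraint,
    # so this objective has the same value for every feasible selection.
    prob += pulp.lpSum(list(x.values()))

    prob += pulp.lpSum(list(x.values())) == total
    for col, val, lo, hi in quotas:
        matching = [x[r[0]] for r in rows if r[COL_INDEX[col]] == val]
        prob += pulp.lpSum(matching) >= lo
        prob += pulp.lpSum(matching) <= hi

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    if pulp.LpStatus[prob.status] != "Optimal":
        return None
    return sorted(i for i, var in x.items() if var.value() > 0.5)


chosen = pick_rows(ROWS, QUOTAS, TOTAL)
print(chosen)  # one valid selection; which one depends on the solver
```

The same structure should carry over to thousands of rows, since the model grows by one variable per row and two constraints per count condition, though solve time on real data is not something this sketch can promise.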
