Suppose I have a sorted list with 2000 elements. The list may contain duplicate values. I want to build 74 sublists of 27 elements each (the number of sublists and the number of elements per sublist may vary according to the user's choice), such that every sublist has the same average value. Say I have a list value[2000]:

# value list having 2000 elements
value = [ 3.3, 3.4, 3.4, ...., 4.1, 4.1, 4.2, ....., 5.1, 5.2, 5.2, .... 6.3, 6.3,...., 6.6]

Now I want to make 74 sublists of 27 elements each (74 * 27 = 1998 of the 2000 elements in value[] will be used).

My expected result is:

# every sub_list having 27 elements
sub_list_1 = [ 3.4, 3.4,...., 4.1,..., 4.2,...., 5.2, ...., 6.3,..., 6.6]
average_value = sum(sub_list_1)/27 = 4.5

............................
............................

sub_list_74 = [ 3.3,...., 4.4,..., 4.9,...., 5.2, ....,]
average_value = sum(sub_list_74)/27 = 4.5

How can I do this in Python? Please explain in detail.

MSI Shafik
  • This is not a duplicate of the post you mentioned. Please check again. – MSI Shafik May 11 '19 at 02:40
  • It's not clear from your question whether the sublists need to be formed from contiguous blocks of the original list or not. If they do, then @AlexandreB.'s linked question is equivalent, since when the sublists have equal sizes, then "all sublists have equal averages" is equivalent to "all sublists have equal sums". – j_random_hacker May 11 '19 at 11:47
  • If OTOH the sublists can contain arbitrary elements from the original list, this problem would be NP-hard even if you only wanted to split the list into just 2 sublists instead of 74: it would then be a variant of the Partition Problem in which the two sets are required to have equal sizes, and this variant is, like the original Partition Problem, NP-hard (see https://en.wikipedia.org/wiki/Partition_problem#Variants_and_generalizations). – j_random_hacker May 11 '19 at 11:53
  • This is not a clustering problem. Clustering is about finding patterns in the data. You are forcing a predefined pattern onto your data. Clearly, the proper way is to treat this as a constrained optimization problem. If you can formulate it as an ILP, you should be able to use solvers. Or do a Lagrangian relaxation. – Has QUIT--Anony-Mousse May 11 '19 at 18:59
  • @Anony-Mousse: I read this Q. I read what you said it was a duplicate of. They don't share a need to split on original order. MSIShafik has pointed this out too. Please remove your miscategorisation. – Paddy3118 May 12 '19 at 05:37
  • The same *strategy* supposedly works for your problem, too. You can't just copy and paste the code, but in essence it's the same. – Has QUIT--Anony-Mousse May 12 '19 at 05:58
  • @Anony-Mousse, it isn't my problem. When thinking of solutions, your linked problem has the restriction "I need to maintain the order of elements in the list." that is not mentioned above. This problem is much less constrained; its best answers will not be answers to your supposed "duplicate". You have stopped those answers from appearing here. (E.g. any answer involving moving an item out of its order in the list). – Paddy3118 May 12 '19 at 11:34
  • Which is essentially the same, just that you can choose any item, not just the first from the next or the last from the previous list. Not much of a difference. A greedy first guess followed by iterative refinement is the easiest approach, but it may fail (and the instance may well be unsolvable anyway). As mentioned above, this is certainly NP-hard, and hence best solved with an existing ILP solver. – Has QUIT--Anony-Mousse May 12 '19 at 12:36
  • You're doing this user as well as S. O. a disservice. Such a heavy hand. – Paddy3118 May 12 '19 at 16:06
  • @Anony-Mousse, please remove your duplicate marking. My question and the one you linked are not the same. Please check my question again. – MSI Shafik May 12 '19 at 16:35
  • @Anony-Mousse, you yourself state that the questions are not the same and will have different answers. Your marking of duplication gives a message stating "This question already has an answer here:". You are contradicting yourself and yet refuse to remedy the situation. Poor show! – Paddy3118 May 12 '19 at 20:16
  • To quote: "the same strategy", "essentially the same". Where am I "contradicting" myself? The existing answers there can solve this Q, too. – Has QUIT--Anony-Mousse May 12 '19 at 22:19
  • But there are even closer duplicates to be found with 1 minute of searching: https://stackoverflow.com/questions/16415169/n-fold-partition-of-an-array-with-equal-sum-in-each-partition – Has QUIT--Anony-Mousse May 12 '19 at 22:28
  • @Anony-Mousse, you need to update your annotation. Whilst this new question you mention should share solutions, your original link would not. – Paddy3118 May 13 '19 at 12:43
  • I have neither an option to reopen nor to update the link here, you know... Just research old questions better in the first place, then this won't happen. I can't fix it. – Has QUIT--Anony-Mousse May 13 '19 at 18:18
  • OK then. Thanks for getting that latest link. – Paddy3118 May 14 '19 at 11:54
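
For readers who want something concrete, the "greedy first guess, iterative refinement" idea mentioned in the comments could be sketched roughly as below. This is only a heuristic illustration, not an answer taken from this thread: the function name split_equal_average and the parameters max_swaps and tol are made up for the example, it assumes the sublists may take arbitrary (non-contiguous) elements, and it simply leaves out the trailing surplus elements. Because all sublists have the same size, equal averages are the same as equal sums, so the code balances sums. Since the problem is NP-hard, an exactly equal split is not guaranteed.

```python
import heapq


def split_equal_average(values, n_lists, size, max_swaps=1000, tol=1e-9):
    """Heuristically split values into n_lists sublists of `size` items
    whose sums (and therefore averages) are as close as this sketch gets."""
    needed = n_lists * size
    if needed > len(values):
        raise ValueError("not enough elements for the requested sublists")
    # Assumption: which surplus elements to drop is a free choice;
    # here the trailing ones are simply left out.
    pool = sorted(values[:needed], reverse=True)

    # Greedy first guess: hand each value (largest first) to the
    # not-yet-full sublist that currently has the smallest sum.
    sublists = [[] for _ in range(n_lists)]
    heap = [(0.0, i) for i in range(n_lists)]
    for v in pool:
        total, i = heapq.heappop(heap)
        sublists[i].append(v)
        if len(sublists[i]) < size:  # full sublists leave the heap
            heapq.heappush(heap, (total + v, i))

    # Iterative refinement: swap one element between the sublist with the
    # largest sum and the one with the smallest sum while that narrows the gap.
    for _ in range(max_swaps):
        sums = [sum(s) for s in sublists]
        hi = max(range(n_lists), key=sums.__getitem__)
        lo = min(range(n_lists), key=sums.__getitem__)
        gap = sums[hi] - sums[lo]
        if gap <= tol:
            break
        best = None
        for ia, a in enumerate(sublists[hi]):
            for ib, b in enumerate(sublists[lo]):
                d = a - b
                # A swap shrinks the gap only if 0 < d < gap;
                # the ideal swap has d equal to gap / 2.
                if tol < d < gap - tol:
                    score = abs(gap - 2 * d)
                    if best is None or score < best[0]:
                        best = (score, ia, ib)
        if best is None:  # no improving swap between these two sublists
            break
        _, ia, ib = best
        sublists[hi][ia], sublists[lo][ib] = sublists[lo][ib], sublists[hi][ia]
    return sublists
```

With the numbers from the question, the call would be split_equal_average(value, 74, 27), and the averages can be inspected with [sum(s) / 27 for s in result]. The refinement step only looks at the two extreme sublists, so it can stop early even when a better arrangement exists; an ILP solver, as suggested above, is the more thorough route.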

0 Answers