How to get maximum sum of a coprime subset of naturals less than n?

Question

I need a function that lists numbers smaller than a given number that are pairwise co-prime. For example, co(11) gives [1,7,9,10], whose sum is 27. But I want the co-prime set that produces the maximum sum. For co(11) it should drop 10 (since 5+8 > 10) and return [1,5,7,8,9], whose sum is 30, the maximum. Here is the function:

import math
def Co(n):
    Mylist = [x for x in range(1, n)]
    removeds  =[]
    for x in Mylist:
        y = Mylist.index(x)
        for z in Mylist[y+1:]:
            if math.gcd(x, z) != 1:
                removed = Mylist.pop(y)
                removeds.append(removed)
                #print(removed)
                Mylist[1:] = Mylist
                #print(Mylist)
                break
    
    Mylist= list(dict.fromkeys(Mylist))
    removeds = list(dict.fromkeys(removeds))
    removeds.sort(reverse = True)
    for a in removeds:
        check = []
        for b in Mylist:
           if math.gcd(a, b) != 1:
               break
           else:
               check.append(a)
        if len(check) == len(Mylist):
           Mylist.append(a)
           
      
    print(Mylist)
    print(sum(Mylist))
Co(11)

and result is:

[1, 7, 9, 10]
27

To get the maximum sum over all possible co-prime sets, it should return

[1, 5, 7, 8, 9]
30

I thought about generating all possible co-prime sets and then comparing them to find the one with the maximum sum. But as N in Co(N) gets bigger, this becomes unmanageable and inefficient. I know this is more of a math problem than a Python one, but any hints would be appreciated.

David Eisenstat · Accepted Answer · 2020-11-29T22:29:34.630

Backtracking works, but to handle large n, you have to be careful about the branching strategy (I used the Bron–Kerbosch algorithm with pivoting for enumerating maximal cliques) and have an effective pruning strategy. The pruning strategy that I used colors the graph at the outset (I used a greedy coloring in reverse degeneracy order). To compute a bound for a particular recursive invocation of Bron–Kerbosch, add up the nodes already chosen (R) and, for each color, the maximum node of that color that may still be chosen (P), since two nodes of the same color definitely do not belong to the same clique. If that bound does not exceed the best sum found so far, the invocation can return immediately.

In Python 3:

import math


def coprime_graph(n):
    return {
        i: {j for j in range(1, n) if j != i and math.gcd(j, i) == 1}
        for i in range(1, n)
    }


def degeneracy_order(g):
    g = {v: g_v.copy() for (v, g_v) in g.items()}
    order = []
    while g:
        v, g_v = min(g.items(), key=lambda item: len(item[1]))
        for w in g_v:
            g[w].remove(v)
        del g[v]
        order.append(v)
    return order


def least_non_element(s):
    s = set(s)
    i = 0
    while i in s:
        i += 1
    return i


def degeneracy_coloring(g):
    coloring = {}
    for v in reversed(degeneracy_order(g)):
        coloring[v] = least_non_element(coloring.get(w) for w in g[v])
    return coloring


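def max_cliques(g, coloring, bound, r, p, x):
    # Bron–Kerbosch: r is the current clique, p the candidates, x the excluded vertices.
    if not p and not x:
        yield r  # r is maximal; the caller may raise bound[0] before we resume

    # Upper bound: sum(r) plus the largest remaining candidate of each color.
    best = {}
    for v in p:
        i = coloring[v]
        if v > best.get(i, 0):
            best[i] = v
    if sum(r) + sum(best.values()) <= bound[0]:
        return  # this branch cannot beat the best sum found so far

    # Pivot on the vertex with the most neighbors in p to minimize branching.
    u_opt = min(p | x, key=lambda u: len(p - g[u]))
    for v in sorted(p - g[u_opt], reverse=True):
        p.remove(v)
        yield from max_cliques(g, coloring, bound, r | {v}, p & g[v], x & g[v])
        x.add(v)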


def max_sum_clique(g):
    coloring = degeneracy_coloring(g)
    bound = [0]
    best_so_far = set()
    for clique in max_cliques(g, coloring, bound, set(), set(g), set()):
        objective = sum(clique)
        if objective > bound[0]:
            bound[0] = objective
            best_so_far = clique
    return best_so_far


def main(n):
    print(max_sum_clique(coprime_graph(n)))


if __name__ == "__main__":
    main(500)
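
To see the pruning bound from `max_cliques` in isolation, here is a minimal sketch. The function name `color_bound` and the hand-written coloring for n = 11 are assumptions for illustration only; the answer computes its coloring with `degeneracy_coloring`, which may assign colors differently.

```python
def color_bound(coloring, r, p):
    """Upper bound on the sum of any clique that extends r using vertices from p.

    Two vertices of the same color are never adjacent, so a clique can take
    at most one vertex per color; the largest one is the optimistic choice.
    """
    best = {}
    for v in p:
        c = coloring[v]
        best[c] = max(best.get(c, 0), v)
    return sum(r) + sum(best.values())


# Hypothetical proper coloring of the coprime graph on 1..10:
# vertices sharing a color share a prime factor, so they are not adjacent.
coloring = {2: 0, 4: 0, 6: 0, 8: 0, 10: 0, 3: 1, 9: 1, 5: 2, 7: 3, 1: 4}

print(color_bound(coloring, set(), set(coloring)))  # 10 + 9 + 5 + 7 + 1 = 32
print(color_bound(coloring, {10}, {1, 3, 7, 9}))    # 10 + 1 + 9 + 7 = 27
```

With no vertices chosen, the bound (32) is above the optimum of 30 for n = 11, so the search continues. Once a clique summing to 30 is known, any branch that starts with 10 is cut off, because its bound is only 27.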

Thanks for your explanation and effort; there are so many things to understand. But when I tried `main(30)`, it returned `{1, 2, 5, 7, 9}` as the best so far, which is not actually the **best** one. Am I missing something about the whole function? — fatih arslan, Nov 29 '20 at 22:21
@fatiharslan fixed a bug -- I was iterating over the wrong set of vertices v in `max_cliques`. — David Eisenstat, Nov 29 '20 at 22:30
Thank you so much. I think I'm going to need at least a week to understand what this code does :) — fatih arslan, Nov 29 '20 at 22:34
Tested up to 450, all the numbers in result sets have at most two prime factors, which could probably lead to a more efficient algorithm. I wonder if there's a way to prove this is the case for larger numbers. — גלעד ברקן, Dec 02 '20 at 15:52
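
The observation in the last comment could be checked on other result sets with a small helper like the one below. This is only a sketch: it assumes "prime factors" means distinct prime factors (the comment does not say), and the function names are hypothetical.

```python
def distinct_prime_factors(m):
    """Return the set of distinct primes dividing m, using trial division."""
    factors = set()
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.add(d)
            m //= d
        d += 1
    if m > 1:
        factors.add(m)  # whatever remains is a prime
    return factors


def max_distinct_prime_factors(numbers):
    """Largest count of distinct prime factors among the given numbers."""
    return max((len(distinct_prime_factors(k)) for k in numbers), default=0)


# The n = 11 result from the question: every element is 1 or a prime power.
print(max_distinct_prime_factors({1, 5, 7, 8, 9}))  # 1
```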

CristiFati · Answer 2 · 2020-11-29T18:51:51.497

I didn't actually go through your code; I assume it doesn't work as expected because once it adds an element that fits to the solution, it doesn't go back (to search for better alternatives), so you get stuck with the 1st solution.
There are a number of methods that solve this kind of problem; I'm going to use backtracking.

A couple of notes about it:

It is simple (from my PoV)
It generates all possible solutions (that can be its strength, but also its flaw)
It is highly inefficient (for our problem); generally, it's the brute-force equivalent. For higher n values, the running time grows exponentially
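
To get a feel for that growth, the following minimal sketch counts the non-empty pairwise-coprime subsets of {1, ..., n-1} without storing them. The function name `count_coprime_subsets` is hypothetical and not part of the answer's code.

```python
from math import gcd


def count_coprime_subsets(n):
    """Count non-empty pairwise-coprime subsets of {1, ..., n-1}."""

    def extend(start, chosen):
        total = 0
        for i in range(start, n):
            # i may join only if it is coprime to everything chosen so far.
            if all(gcd(i, c) == 1 for c in chosen):
                # Count the subset chosen + [i] itself, plus all its extensions.
                total += 1 + extend(i + 1, chosen + [i])
        return total

    return extend(1, [])


for n in (5, 11, 21, 31):
    print(n, count_coprime_subsets(n))
```

Every prime below n can be included or left out independently, so the count at least doubles with each new prime, which is why enumerating every subset stops being practical well before n = 100.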

Also, backtracking can be implemented in a number of forms; I chose the recursive one.

code00.py:

#!/usr/bin/env python

import sys
import time  # needed for the time.time() calls in main()
from math import gcd


def is_valid(item, arr):
    for elem in arr:
        if gcd(elem, item) > 1:
            return False
    return True


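def bt(n, sols, cur_sol):
    # Record every non-empty pairwise-coprime subset reached so far.
    if cur_sol:
        sols.add(tuple(cur_sol))
    # Only try candidates larger than the last element, so each subset is built once.
    for i in range(cur_sol[-1] + 1 if cur_sol else 1, n):
        if is_valid(i, cur_sol):
            cur_sol.append(i)
            bt(n, sols, cur_sol)
            cur_sol.pop()  # undo the choice before trying the next candidate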


def main(*argv):
    solutions = set()
    current_solution = []
    n = 11
    start_time = time.time()
    bt(n, solutions, current_solution)
    end_time = time.time()
    print("Solutions:", sorted(solutions))
    print("\nFor n={0:d}, it took {1:.6f} seconds".format(n, end_time - start_time))
    best_solution = max(solutions, key=sum)
    print("\nBest solution: {0:}\nSum: {1:}".format(best_solution, sum(best_solution)))


if __name__ == "__main__":
    print("Python {0:s} {1:d}bit on {2:s}\n".format(" ".join(elem.strip() for elem in sys.version.split("\n")), 64 if sys.maxsize > 0x100000000 else 32, sys.platform))
    main(*sys.argv[1:])
    print("\nDone.")

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\StackOverflow\q065062781]> "e:\Work\Dev\VEnvs\py_pc064_03.07.06_test0\Scripts\python.exe" code00.py
Python 3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)] 64bit on win32

Solutions: [(1,), (1, 2), (1, 2, 3), (1, 2, 3, 5), (1, 2, 3, 5, 7), (1, 2, 3, 7), (1, 2, 5), (1, 2, 5, 7), (1, 2, 5, 7, 9), (1, 2, 5, 9), (1, 2, 7), (1, 2, 7, 9), (1, 2, 9), (1, 3), (1, 3, 4), (1, 3, 4, 5), (1, 3, 4, 5, 7), (1, 3, 4, 7), (1, 3, 5), (1, 3, 5, 7), (1, 3, 5, 7, 8), (1, 3, 5, 8), (1, 3, 7), (1, 3, 7, 8), (1, 3, 7, 10), (1, 3, 8), (1, 3, 10), (1, 4), (1, 4, 5), (1, 4, 5, 7), (1, 4, 5, 7, 9), (1, 4, 5, 9), (1, 4, 7), (1, 4, 7, 9), (1, 4, 9), (1, 5), (1, 5, 6), (1, 5, 6, 7), (1, 5, 7), (1, 5, 7, 8), (1, 5, 7, 8, 9), (1, 5, 7, 9), (1, 5, 8), (1, 5, 8, 9), (1, 5, 9), (1, 6), (1, 6, 7), (1, 7), (1, 7, 8), (1, 7, 8, 9), (1, 7, 9), (1, 7, 9, 10), (1, 7, 10), (1, 8), (1, 8, 9), (1, 9), (1, 9, 10), (1, 10), (2,), (2, 3), (2, 3, 5), (2, 3, 5, 7), (2, 3, 7), (2, 5), (2, 5, 7), (2, 5, 7, 9), (2, 5, 9), (2, 7), (2, 7, 9), (2, 9), (3,), (3, 4), (3, 4, 5), (3, 4, 5, 7), (3, 4, 7), (3, 5), (3, 5, 7), (3, 5, 7, 8), (3, 5, 8), (3, 7), (3, 7, 8), (3, 7, 10), (3, 8), (3, 10), (4,), (4, 5), (4, 5, 7), (4, 5, 7, 9), (4, 5, 9), (4, 7), (4, 7, 9), (4, 9), (5,), (5, 6), (5, 6, 7), (5, 7), (5, 7, 8), (5, 7, 8, 9), (5, 7, 9), (5, 8), (5, 8, 9), (5, 9), (6,), (6, 7), (7,), (7, 8), (7, 8, 9), (7, 9), (7, 9, 10), (7, 10), (8,), (8, 9), (9,), (9, 10), (10,)]

For n=11, it took 0.000000 seconds

Best solution: (1, 5, 7, 8, 9)
Sum: 30

Done.

Thanks for your answer. This is what I had in mind, but I couldn't figure out how to do it in Python. However, this solution fails when N gets bigger. For example, with n=100 it takes forever to return anything. — fatih arslan, Nov 29 '20 at 18:30
Bear in mind that I don't think there is a way that would work well for any *n*. So what is the expected *max* *n* (and what's the expected time)? — CristiFati, Nov 29 '20 at 19:39
Actually, the _max n_ is **200000**, and time isn't really important. — fatih arslan, Nov 29 '20 at 22:23
