3

I've read Wikipedia on the Smith Set, Schwartz Set, Kosaraju's Algorithm, Tarjan's Algorithm, and the path-based strongly component algorithms; however, my experience with such algorithms is…lacking. Wikipedia also says you can use a version of Kosaraju's algorithm to generate the Schwartz set—and that these algorithms can calculate the Smith set.

Wikpedia also has some pseudo-code for Tarjan's algorithm, but not the others; and it's not specific to this relatively-sensitive application. I'm also not 100% certain which is the simplest to implement—which has the feature of least likelihood of errors in implementation.

I'd like some more-direct pseudocode to cover computing the Smith and Schwartz set from one of these algorithms, given a set of ranked ballots. I find it easier to grasp concepts when I have a practical process I can walk. I'll turn it into actual code myself.

Consider the following data structure:

Type Ballot {
  Array Votes[] {
    Candidate Candidate; // We do this in C#
    int Rank;
  }
}

For a collection of Ballots, each individual Ballot will contain an array Votes such as the following:

Ballot b.Votes[] =
  [ Vote(Alex.ID, 1),
    Vote(Chris.ID, 3),
    Vote(Sam.ID, 2) ];

This corresponds to a voter casting Alex>Sam>Chris, and there may be further candidates who are all equally less-preferred than Chris.

I assume the first step would be to tally the individual votes and graph the victory. For example: if 100 voters rank Alex above Sam (Alex = 1, Sam >= 2) and 50 voters rank Sam above Alex, Alex defeats Sam. Thus I guess there'd be a data structure as such:

Type GraphNode {
  Candidate Candidate;
  GraphNode Array Defeats[];
  GraphNode Array Ties[];
  GraphNode Array DefeatedBy[];
}

Whereby a GraphNode for Alex would have an element in Defeats[] pointing to a GraphNode for Sam, and vice versa.

Given these GraphNodes, what do I do with it to identify the Smith and Schwartz sets?

Thanks in advance.

John Moser
  • 465
  • 5
  • 11
  • 1
    How efficient does this have to be? The Wikipedia articles also list the Floyd-Warshall algorithm that would be slower but much, much easier to implement. – gus Apr 04 '19 at 20:04
  • If it's the same time complexity it's fine. Most likely the graph's maximum size will only ever be 9, and so the upper bound for repeatedly re-calculating either set would be 7 runs. Most likely any of these algorithms is fine, even the O(n^3) one. – John Moser Apr 04 '19 at 21:06
  • With such a small graph you could even try all the subsets, there are only 2^9 = 512 possibilities. Still the O(n^3) Floyd-Warshall approach might even be easier. – gus Apr 04 '19 at 21:22
  • I'm not sure I see how Floyd-Warshall applies. It identifies shortest path, but I'm looking at strongly-connected components. If A defeats B, B defeats C, C defeats A, and each of {A,B,C} defeats all of {D,E,F,G}, then the Smith Set is {A,B,C}. I *believe* if C only *ties* with A, the Smith Set is {A,B,C} and the Schwartz Set is {A}. – John Moser Apr 04 '19 at 21:56

1 Answers1

3

I guess python is close enough to pseudocode.

Let's say we have n candidates numbered from 0 to n - 1.

First you can compute a matrix beats[i][j] equal to True if candidate i beats candidate j and False otherwise.

Now compute the transitive closure of the matrix using the Floyd-Warshall algorithm:

for k in range(n):
    for i in range(n):
        for j in range(n):
            beats[i][j] = beats[i][j] or (beats[i][k] and beats[k][j])

After that the matrix has a slightly different meaning: beats[i][j] means there is a "beating path" i -> c1 -> c2 -> ... -> j such that i beats c1, c1 beats c2 and so on up to j.

The Schwartz components are the ones in which all pairs i, j have beating paths running both ways, and there's no other candidate that beats any of them (see the Wikipedia section on properties mentioning a top cycle).

Basically for each candidate i try building a component around it, like this:

schwartz_components = []

for i in range(n):
    schwartz_component = {i}
    is_schwartz = True
    for j in range(n):
        if beats[j][i]:
            if beats[i][j]:
                schwartz_component.add(j)
            else:
                is_schwartz = False
    if is_schwartz:
         schwartz_components.append(schwartz_component)

schwartz_set = set.union(*schwartz_components)
print(schwartz_set)

For the Smith set it would be a bit different, you'd need to start with cannot_beat[i][j] = not beats[i][j], use Floyd-Warshall on that and then build the set around each i by adding all the candidates with a path to it through the cannot_beat relation.

I guess it would go something like this (after the Floyd-Warshall step):

smith_candidates = []

for i in range(n):
    smith_candidate = {i}
    for j in range(n):
        if cannot_beat[i][j]:
            smith_candidate.add(j)
    smith_candidates.append(smith_candidate)

# Take the smallest candidate
smith_set = min(smith_candidates, key=len)

There's likely a bug somewhere, but that's the idea, roughly.

gus
  • 831
  • 6
  • 9
  • Aha! That helps. So, if I'm reading this right, the first iteration of k only finds things that immediately defeat one another (if A beats B beats C, we don't know that A beats C); but for (n) nodes, it re-walks the graph (n) times, and on pass (k) we've identified all beatpaths (k) long. Yes, ok, this will work, and that's dead-simple. Thanks! – John Moser Apr 04 '19 at 23:25