
Let's say I have the following unweighted, undirected graph where nodes can be connected by two different types of edges: support edges (green) and opposition edges (red).

Here's an example:

(Image: example support-opposition graph)

I want to calculate the "distance" of opposition or support between any given two nodes. For example, if the nodes represented countries at war or political candidates, even though A and D have no edge between them we might conclude that they are likely to be opposed to one another since A is opposed to C and C supports D.

This is a simple example, but given a large graph with many nodes of high degree, how might I determine how likely any two nodes are to oppose or support one another if they are not directly connected but only linked through a chain of opposition/support edges?

I imagine you'd represent each node as a vector whose components indicate whether an edge of a given type exists between that node and each other node. If this is a good way to go, what distance measure would you use (Euclidean, Hamming, etc.)?
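
As a minimal sketch of the vector idea above, each node can be given one row of a signed adjacency matrix (+1 for a support edge, -1 for an opposition edge, 0 for no edge), and two rows can then be compared with either measure. The graph, the `(node, node, sign)` edge format, and the function names are assumptions made for illustration, and the sketch does not settle which measure is the better choice.

```python
import math

def signed_rows(nodes, edges):
    """Build one signed adjacency row per node: +1 support, -1 opposition, 0 no edge."""
    index = {n: i for i, n in enumerate(nodes)}
    rows = {n: [0] * len(nodes) for n in nodes}
    for a, b, sign in edges:
        rows[a][index[b]] = sign  # undirected: fill both directions
        rows[b][index[a]] = sign
    return rows

def hamming(u, v):
    """Number of positions where the two vectors differ."""
    return sum(1 for x, y in zip(u, v) if x != y)

def euclidean(u, v):
    """Straight-line distance between the two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

# Hypothetical graph: A supports B, A opposes C, C supports D.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 1), ("A", "C", -1), ("C", "D", 1)]
rows = signed_rows(nodes, edges)
print(hamming(rows["A"], rows["D"]), euclidean(rows["A"], rows["D"]))
```

Note that both measures only compare direct neighbourhoods, so they say nothing about nodes that are related only through longer chains.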

user1569339
  • Interesting problem. I think it's more of an engineering problem than an algorithmic one. I don't have an idea yet, but could you tell me the likely relation between A & C and between B & D in the following graph: { B oppose A; B oppose C; D support A; D support B } – shole Aug 20 '15 at 06:28
  • I think you should read about `bipartite` graphs. – vish4071 Aug 20 '15 at 06:56
  • Also, in your example, let's say there is one more node F and the edge BC is not present. CF and BF are support edges. In that case, what is the relation between C and B (as CAB shows opposition and CFB shows support)? – vish4071 Aug 20 '15 at 06:59
  • Also, your nodes don't have coordinates, so how do you plan to represent them as vectors? If it is according to edge types, won't there be a collision, as in my example (above comment)? – vish4071 Aug 20 '15 at 07:02
  • @shole Are you talking about something like a mechanical equilibrium problem? I'm not too familiar with the engineering discipline. – user1569339 Aug 20 '15 at 07:35
  • @user1569339 No, I am not familiar with that either :) What I meant is that it sounds like you are asking for a good "definition" of "distance" between nodes, and this definition has to be tuned to your conditions. That is why we ask "how would you handle that manually?" with the examples above, to get a sense of how to define the "distance". – shole Aug 20 '15 at 07:39
  • @vish4071 Can you explain how modeling this as a bipartite graph may help? Yes, different paths between two nodes can yield different results. My idea for the vectors is that they would be adjacency lists. – user1569339 Aug 20 '15 at 07:42
  • I didn't mean how you would implement vectors but what logic would you use to create that (adjacency list) of yours. – vish4071 Aug 20 '15 at 07:46
  • I modeled your question as bipartite because I thought the final result would be two groups such that each member of a group supports the others in it while opposing those in the other group. E.g., if you use this concept in your example, the sets will be (A,B) and (C,D,E). – vish4071 Aug 20 '15 at 07:49
  • And this is precisely why I asked you what to do in case of conflict – vish4071 Aug 20 '15 at 07:49
  • @shole Is the correct picture of your example: https://docs.google.com/drawings/d/10JE4itIytpEZICqtxI8Jz5WrUIhxiy8eei6F9JRjtxE/pub?w=360&h=276 ? If so, I cannot give a definitive answer. I have 2 heuristics: "an enemy of my ally is my enemy" and "an enemy of my enemy is a friend". The first yields A&C=supporter. Second yields A&C=opponent. They seem equally likely without being able to give one more weight. – user1569339 Aug 20 '15 at 08:14
  • @user1569339 The image is a little bit wrong: there are only 4 nodes, and D connects to C, not B... It's my fault, I wrote it wrongly above :( – shole Aug 20 '15 at 08:21
  • @vish4071 Given that you only follow 1 type of path (only support, only oppose), I guess you could partition the graph into a k-partite graph. – user1569339 Aug 20 '15 at 08:22
  • Why would you use k-partite? Are you using some other attribute? k-partite means we have k types of relations, but here we have 2 (oppose and support). – vish4071 Aug 20 '15 at 08:36
  • I can also propose one other solution (but this would be expensive and I'm not sure how I would implement it). You can create an n×n matrix. Then, for each node, find every path to every other node. If x paths show support and y paths show opposition, the entry of that matrix would be (x-y). Now, positive values show support, negative values show opposition, and 0 means neutral. – vish4071 Aug 20 '15 at 08:41
  • I see, now you have changed the example. I can explain the above method on this example. Say we find the matrix entry for A/D. Since A-D has 2 paths (AD and ABD), of which 1 shows support and 1 shows opposition, the entry for A/D will be 0. Similarly, the entry for A/C will be 0. But for B/C, the matrix entry would be -1 (only 1 path, which shows opposition). – vish4071 Aug 20 '15 at 08:46
  • @vish4071 I did not intend to change the example. I have changed it back. Your matrix solution is exactly what I'm looking for, however my question is how to produce that result. – user1569339 Aug 20 '15 at 19:19
  • Changing example is fine, as far as you understand. I'll think over and try and propose an implementation. – vish4071 Aug 20 '15 at 21:51
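
One way the path-counting matrix proposed in the comments above might be produced is a brute-force enumeration of all simple paths. This is only a sketch under assumptions the thread does not confirm: a path is taken to "show support" when the product of its edge signs is +1 (an even number of opposition edges) and opposition otherwise, and the graph, edge format, and function name are hypothetical. The cost grows exponentially with graph size, which matches the "expensive" caveat in the comment.

```python
def path_score_matrix(nodes, edges):
    """For every ordered pair, return (#supporting paths) - (#opposing paths).

    Assumption: a path's sign is the product of its edge signs
    (+1 = support edge, -1 = opposition edge). Only simple paths are counted.
    """
    adj = {n: [] for n in nodes}
    for a, b, sign in edges:
        adj[a].append((b, sign))
        adj[b].append((a, sign))

    def walk(current, target, visited, sign):
        # Reached the target: this path contributes its sign (+1 or -1).
        if current == target:
            return sign
        total = 0
        for nxt, s in adj[current]:
            if nxt not in visited:
                total += walk(nxt, target, visited | {nxt}, sign * s)
        return total

    return {(a, b): walk(a, b, {a}, 1)
            for a in nodes for b in nodes if a != b}

# Hypothetical conflicting triangle: A supports B, B supports C, A opposes C.
scores = path_score_matrix(["A", "B", "C"],
                           [("A", "B", 1), ("B", "C", 1), ("A", "C", -1)])
print(scores[("A", "C")])  # direct path opposes, path via B supports
```

Longer paths count as much as short ones here; weighting each path by its length would be a natural variation.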

2 Answers

0

This problem looks as if it needs numerical optimization. Here is an approach:

Introduce a random variable for every node. This variable will be in the range [-1, 1], where -1 means clear opposition and +1 means clear support. Values in between express how likely either is. Fix the variable of the node you're interested in to 1 (so it is not part of the optimization); every other value is then a support value relative to that node.

Now, define a potential function on the edges. I would suggest the absolute difference:

v1, v2 are the incident nodes' values
for supporting edges:
  P(v1, v2) = abs(v1 - v2)
for opposing edges:
  P(v1, v2) = abs(v1 + v2)

Depending on your optimization method, you might need a differentiable potential function. You could for example make these functions differentiable with:

for supporting edges:
  P(v1, v2) = sqrt((v1 - v2)^2 + epsilon)
for opposing edges:
  P(v1, v2) = sqrt((v1 + v2)^2 + epsilon)

where epsilon is a rather small number.

The sum of all potentials is the energy:

E = Sum P_i

Then, you need to find arg min E with respect to v. There are several optimizers (e.g. gradient descent, simulated annealing, L-BFGS...); just try which one performs best. If you stick to convex potentials (like the absolute difference), simple gradient descent is probably enough.

This gives you a support value for every other node. If edges do not contradict each other, all values will be either +1 or -1. If you have contradicting edges, other values are possible as well.

Your example would result in these support values (with respect to A):

B: -0.333  (probably opposing)
C:  0.333  (probably supporting)
D:  0.333  (probably supporting)
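
A minimal sketch of this procedure, using the smoothed potentials and plain gradient descent with values clamped to [-1, 1]. The graph below is hypothetical (it is not the one from the question), and the function name, the `(node, node, sign)` edge format, and the values of `eps`, `lr`, and `steps` are assumptions chosen for illustration; a larger or denser graph would likely need a smaller step size or a proper optimizer.

```python
import math

def support_values(nodes, edges, reference, eps=1e-4, lr=0.002, steps=5000):
    """Minimize E = sum of edge potentials by gradient descent.

    edges: (a, b, sign) with sign=+1 for support, -1 for opposition.
    Support edge:    P = sqrt((va - vb)^2 + eps)
    Opposition edge: P = sqrt((va + vb)^2 + eps)
    """
    v = {n: 0.0 for n in nodes}
    v[reference] = 1.0  # the reference node is fixed and never updated
    for _ in range(steps):
        grad = {n: 0.0 for n in nodes}
        for a, b, sign in edges:
            d = v[a] - sign * v[b]          # va - vb or va + vb
            g = d / math.sqrt(d * d + eps)  # derivative of P with respect to d
            grad[a] += g
            grad[b] += -sign * g
        for n in nodes:
            if n != reference:
                # Gradient step, then clamp to the allowed range [-1, 1].
                v[n] = max(-1.0, min(1.0, v[n] - lr * grad[n]))
    return v

# Hypothetical graph: A supports B, A opposes C, C supports D.
values = support_values(["A", "B", "C", "D"],
                        [("A", "B", 1), ("A", "C", -1), ("C", "D", 1)],
                        reference="A")
print({n: round(x, 2) for n, x in values.items()})
```

Because these edges do not contradict each other, the values should settle close to +1 for B and close to -1 for C and D.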
Nico Schertler
  • This is very interesting. Do you have a resource I could peruse to get more perspective on this approach? – user1569339 Aug 20 '15 at 19:25
  • Unfortunately not. But I can give you some keywords. The entire procedure is a continuous optimization of a graphical model (Markov Random Field). Do not confuse it with discrete optimization. The algorithms are fundamentally different. – Nico Schertler Aug 20 '15 at 20:16
  • It seems as if the example graph has changed again. Here is the result for the current one: B:1, C:-1, D:-1, E:-1. – Nico Schertler Aug 20 '15 at 20:22
  • Apologies for the changing example. – user1569339 Aug 21 '15 at 01:46
0
  • Find the shortest path(s) between your two nodes
    • If no path exists your result is unknown
  • Find all of the distinct edges in your collection of shortest paths
  • Sum all of the distinct edges (green:+1 and red:-1)
  • The result is your score
    • A positive score is support
    • A negative score is opposition
    • A score of 0 is neutral
  • The magnitude of the score can show greater levels of support or opposition
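
The steps above can be sketched with two breadth-first searches: an edge lies on some shortest path exactly when the distance from the source to one endpoint, plus one, plus the distance from the other endpoint to the target equals the shortest distance. The graph, the `(node, node, sign)` edge format, and the function names are assumptions for illustration, and the edge list is assumed to contain each edge only once.

```python
from collections import deque

def bfs_distances(adj, start):
    """Hop distance from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt, _ in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def shortest_path_score(nodes, edges, source, target):
    """Sum the signs of all distinct edges on any shortest source-target path.

    Returns None when no path exists (result unknown).
    """
    adj = {n: [] for n in nodes}
    for a, b, sign in edges:
        adj[a].append((b, sign))
        adj[b].append((a, sign))
    from_source = bfs_distances(adj, source)
    if target not in from_source:
        return None
    from_target = bfs_distances(adj, target)
    best = from_source[target]
    score = 0
    for a, b, sign in edges:
        if a not in from_source or b not in from_source:
            continue  # edge lies in a different component
        if (from_source[a] + 1 + from_target[b] == best
                or from_source[b] + 1 + from_target[a] == best):
            score += sign  # each edge is visited once, so it is counted once
    return score

# Hypothetical graph: A supports B, A opposes C, C supports D; E is isolated.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 1), ("A", "C", -1), ("C", "D", 1)]
print(shortest_path_score(nodes, edges, "A", "D"))
```

On this hypothetical graph the only shortest path from A to D has one red and one green edge, so the score is 0 (neutral), which illustrates how a plain sum treats "opposes a supporter" as neutral.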
Louis Ricci
  • Wouldn't your solution yield a result of neutral for the example in the question? – user1569339 Aug 20 '15 at 19:22
  • Why only find the shortest path? Shouldn't non-shortest paths also affect the result, naturally with a lower weight? – shole Aug 21 '15 at 01:11
  • @shole - The shorter the path, the more relevant the score will be. If A and B are connected by a green edge, but there are also ACB (green, red) and ADEB (red, green, red), those longer paths are less important, because we already know from the shortest path that A and B support each other. – Louis Ricci Aug 21 '15 at 11:40